Oracle® Database Globalization Support Guide 10g Release 2 (10.2) Part Number B14225-02
This section describes new features of globalization support and provides pointers to additional information.
Unicode 4.0 Support
Unicode support has been enhanced to support the latest version of the Unicode standard.
Character Set Scanner Utilities Enhancements
The Database Character Set Scanner (CSSCAN) introduces two new parameters, QUERY and COLUMN, which offer finer control in performing selective scanning. Support for multilevel varrays and nested tables has also been added.
The Language and Character Set File Scanner (LCSSCAN) now supports the detection of HTML files. The detection quality of shorter text strings has also been enhanced.
Globalization Development Kit
The Globalization Development Kit (GDK) for PL/SQL provides new locale mapping functions, and offers support for Japanese Kana conversion using the new transliteration function in the UTL_I18N package.
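As a rough illustration of what Kana conversion involves, the sketch below maps hiragana to katakana and back by shifting Unicode code points. It is written in Python and is not the UTL_I18N implementation; the helper names are hypothetical, and the sketch assumes only the fact that the katakana block sits 0x60 code points above the hiragana block.

```python
# Illustrative sketch only: this is not how UTL_I18N is implemented.
# Assumption: hiragana U+3041..U+3096 map one-to-one onto katakana
# U+30A1..U+30F6, which lie exactly 0x60 code points higher.
HIRAGANA_FIRST, HIRAGANA_LAST = 0x3041, 0x3096
KATAKANA_FIRST, KATAKANA_LAST = 0x30A1, 0x30F6
KANA_OFFSET = 0x60


def hiragana_to_katakana(text: str) -> str:
    """Shift every hiragana character into the katakana block."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST
        else ch  # leave all other characters untouched
        for ch in text
    )


def katakana_to_hiragana(text: str) -> str:
    """Shift every katakana character into the hiragana block."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


print(hiragana_to_katakana("ひらがな"))
print(katakana_to_hiragana("カタカナ"))
```

Characters outside the two ranges, such as Latin letters or kanji, pass through unchanged.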
NCHAR String Literal Support
SQL NCHAR literals used in insert and update statements no longer rely on the database character set for conversion. This means that multilingual data can be added without restrictions such as having to provide hex Unicode values. The support for this feature is available in SQL, PL/SQL, OCI, and JDBC.
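The restriction being lifted can be pictured with a short Python sketch. It assumes, purely for illustration, that ISO-8859-1 stands in for a Western European database character set and UTF-16 for a Unicode national character set; the helper name is hypothetical and nothing here calls Oracle.

```python
# Illustrative sketch: why routing a literal through a narrow character set
# loses data. The character sets below are stand-ins chosen for this example.

def round_trip(value: str, charset: str) -> str:
    """Encode and decode a literal, replacing characters the set cannot hold."""
    return value.encode(charset, errors="replace").decode(charset)


literal = "日本語"

# Converted through a Western European character set: every character is lost.
lossy = round_trip(literal, "iso-8859-1")

# Carried in a Unicode encoding: the literal survives intact.
lossless = round_trip(literal, "utf-16-be")

# The older workaround: spell out each character as a hex Unicode value.
hex_spelled = "\u65e5\u672c\u8a9e"

print(lossy, lossless, hex_spelled)
```

When the literal no longer passes through the narrower character set, neither the data loss nor the hex spelling is needed.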
Consistent Linguistic Ordering Support
All SQL functions and operators can now honor the NLS_SORT setting when the new NLS_COMP mode, LINGUISTIC, is used. This feature ensures that all SQL string comparisons are consistent and follow the linguistic convention specified in the NLS_SORT parameter.
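The difference between binary and linguistic comparison can be sketched in Python. The key function below is a simplified stand-in for a linguistic sort (it compares base letters first and ignores case); it is an assumption made for this example and does not reproduce any Oracle NLS_SORT definition.

```python
import unicodedata


def linguistic_key(s: str):
    """Simplified linguistic key: base letters, case-folded, then the original."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original string is a tie-breaker so the ordering stays deterministic.
    return (base.casefold(), s)


words = ["zebra", "Äpfel", "apple"]

# Binary comparison orders by code point, so "Ä" (U+00C4) sorts after "z".
binary_order = sorted(words)

# The linguistic key places "Äpfel" among the other words starting with "a".
linguistic_order = sorted(words, key=linguistic_key)

print(binary_order)
print(linguistic_order)
```

The point of a single comparison mode is that sorting, equality tests, and range predicates all use the same key, so they cannot disagree with one another.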
Recommended Database Character Sets and Statement of Direction
A list of character sets has been compiled that Oracle strongly recommends for usage as the database character set. Starting with the next major functional release after Oracle Database 10g Release 2, the choice for the database character set will be limited to this list of recommended character sets for new system deployment.
Accent-Insensitive and Case-Insensitive Linguistic Sorts and Queries
Oracle provides linguistic sorts and queries that use information about base letter, accents, and case to sort character strings. This release enables you to specify a sort or query on the base letters only (accent-insensitive) or on the base letters and the accents (case-insensitive).
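The two levels of matching can be sketched in Python with Unicode normalization. The function names are hypothetical, and the sketch only approximates the idea of comparing on base letters versus base letters plus accents; it does not reproduce Oracle's sort definitions.

```python
import unicodedata


def case_insensitive_key(s: str) -> str:
    """Compare on base letters and accents; ignore case only."""
    return unicodedata.normalize("NFC", s).casefold()


def accent_insensitive_key(s: str) -> str:
    """Compare on base letters only; ignore both accents and case."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


# Case-insensitive: "Résumé" matches "résumé" but not "resume".
print(case_insensitive_key("Résumé") == case_insensitive_key("résumé"))
print(case_insensitive_key("Résumé") == case_insensitive_key("resume"))

# Accent-insensitive: "Résumé" also matches "resume".
print(accent_insensitive_key("Résumé") == accent_insensitive_key("resume"))
```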
See Also:
"Linguistic Sort Features"
Character Set Scanner Utilities Enhancements
The Database Character Set Scanner now supports object types.
The new LCSD parameter enables the Database Character Set Scanner (CSSCAN) to perform language and character set detection on the data cells categorized by the LCSDATA parameter. The Database Character Set Scanner reports have also been enhanced.
Database Character Set Scanner CSALTER Script
The CSALTER script is a database administrator tool for special character set migration.
The Language and Character Set File Scanner Utility
The Language and Character Set File Scanner (LCSSCAN) is a high-performance, statistically based utility for determining the character set and language for unspecified plain file text.
Globalization Development Kit
The Globalization Development Kit (GDK) simplifies the development process and reduces the cost of developing Internet applications that will support a global multilingual market. GDK includes APIs, tools, and documentation that address many of the design, development, and deployment issues encountered in the creation of global applications. GDK lets a single program work with text in any language from anywhere in the world. It enables you to build a complete multilingual server application with little more effort than it takes to build a monolingual server application.
Regular Expressions
This release supports POSIX-compliant regular expressions to enhance search and replace capability in programming environments such as UNIX and Java. In SQL, this new functionality is implemented through new functions that are regular expression extensions to existing SQL functions such as LIKE, REPLACE, and INSTR. This implementation supports multilingual queries and is locale-sensitive.
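To show how regular expressions extend matching, replacing, and position lookup, here is a Python sketch of three analogous helpers. The helper names are hypothetical and are chosen only to echo the LIKE, REPLACE, and INSTR pairing described above; Python's re module is not a POSIX engine, so the sketch sticks to simple patterns and makes no claim about Oracle's exact behavior.

```python
import re


def regexp_like(source: str, pattern: str) -> bool:
    """True if the pattern matches anywhere in the source (cf. LIKE)."""
    return re.search(pattern, source) is not None


def regexp_replace(source: str, pattern: str, replacement: str) -> str:
    """Replace every match of the pattern (cf. REPLACE)."""
    return re.sub(pattern, replacement, source)


def regexp_instr(source: str, pattern: str) -> int:
    """1-based position of the first match, or 0 if none (cf. INSTR)."""
    match = re.search(pattern, source)
    return match.start() + 1 if match else 0


print(regexp_like("München 80331", r"[0-9]{5}"))
print(regexp_replace("a  b   c", r" +", " "))
print(regexp_instr("order 66", r"[0-9]+"))
```

The 1-based position and the 0 result for "no match" follow the SQL convention for string positions rather than Python's 0-based indexing.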
Displaying Code Charts for Unicode Character Sets
Oracle Locale Builder can display code charts for Unicode character sets.
Locale Variants
In previous releases, Oracle defined language and territory definitions separately. This resulted in the definition of a territory being independent of the language setting of the user. In this release, some territories can have different date, time, number, and monetary formats based on the language setting of a user. This type of language-dependent territory definition is called a locale variant.
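The lookup behind a locale variant can be pictured as a table keyed by territory and language, falling back to the territory-wide definition. The Python sketch below uses placeholder territory names, language names, and format strings; all of them are hypothetical and are not taken from Oracle's locale data.

```python
# Hypothetical data for illustration only; not Oracle locale definitions.
TERRITORY_DEFAULTS = {
    "TERRITORY_X": {"date_format": "DD/MM/YYYY"},
}

# A locale variant overrides the territory default for one language.
LOCALE_VARIANTS = {
    ("TERRITORY_X", "LANGUAGE_B"): {"date_format": "YYYY-MM-DD"},
}


def resolve_formats(territory: str, language: str) -> dict:
    """Return the variant for (territory, language) if one exists,
    otherwise the language-independent territory definition."""
    return LOCALE_VARIANTS.get((territory, language), TERRITORY_DEFAULTS[territory])


print(resolve_formats("TERRITORY_X", "LANGUAGE_A"))
print(resolve_formats("TERRITORY_X", "LANGUAGE_B"))
```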
See Also:
"Locale Variants"
Transportable NLB Data
NLB files that are generated on one platform can be transported to another platform by, for example, FTP. The transported NLB files can be used the same way as the NLB files that were generated on the original platform. This is convenient because locale data can be modified on one platform and copied to other platforms.
See Also:
"Transportable NLB Data"
NLS_LENGTH_SEMANTICS
NLS_LENGTH_SEMANTICS is now supported as an environment variable.
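For readers unfamiliar with length semantics, the Python sketch below contrasts byte and character counting for a column limit. The helper name is hypothetical, and UTF-8 is assumed as the character set for the example; the sketch illustrates the concept only and does not read or set the environment variable.

```python
def fits(value: str, limit: int, semantics: str, charset: str = "utf-8") -> bool:
    """Check a value against a length limit under BYTE or CHAR semantics."""
    if semantics == "CHAR":
        return len(value) <= limit                  # count characters
    if semantics == "BYTE":
        return len(value.encode(charset)) <= limit  # count encoded bytes
    raise ValueError(f"unknown length semantics: {semantics}")


# "café" is 4 characters but 5 bytes in UTF-8, because "é" takes 2 bytes.
print(fits("café", 4, "CHAR"))
print(fits("café", 4, "BYTE"))
```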
See Also:
"NLS_LENGTH_SEMANTICS"
Implicit Conversion Between CLOB and NCLOB Datatypes
Implicit conversion between CLOB and NCLOB datatypes is now supported.
See Also:
"Choosing a National Character Set"
Updates to the Oracle Language and Territory Definition Files
Changes have been made to the content in some of the language and territory definition files in Oracle Database 10g Release 1.
See Also:
"Obsolete Locale Data"