What's New in Globalization Support?

This section describes new features of globalization support and provides pointers to additional information.

Oracle Database 10g Release 2 (10.2) New Features in Globalization

Unicode 4.0 Support

Unicode support has been updated to version 4.0 of the Unicode standard.

See Also:
Chapter 6, "Supporting Multilingual Databases with Unicode"

Character Set Scanner Utilities Enhancements

The Database Character Set Scanner (CSSCAN) introduces two new parameters, QUERY and COLUMN, which offer finer control in performing selective scanning. Support for multilevel varrays and nested tables has also been added.

The Language and Character Set File Scanner (LCSSCAN) now supports the detection of HTML files. The detection quality of shorter text strings has also been enhanced.

See Also:
Chapter 12, "Character Set Scanner Utilities"

Globalization Development Kit

The Globalization Development Kit (GDK) for PL/SQL provides new locale mapping functions, and offers support for Japanese Kana conversion using the new transliteration function in the UTL_I18N package.

See Also:
Chapter 8, "Oracle Globalization Development Kit"

NCHAR String Literal Support

SQL NCHAR literals used in INSERT and UPDATE statements no longer rely on the database character set for conversion. Multilingual data can therefore be added without workarounds such as supplying hexadecimal Unicode values. This feature is available in SQL, PL/SQL, OCI, and JDBC.
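
As a rough illustration, the statements below write multilingual text through N'...' literals and, for contrast, show the older UNISTR form that spells out hexadecimal code points. The table product_names and its columns are hypothetical. The sketch assumes that NCHAR literal replacement is enabled on the client, which is typically done by setting the client environment variable ORA_NCHAR_LITERAL_REPLACE to TRUE. Check the section referenced below for the exact settings in each interface.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE product_names (
  id   NUMBER PRIMARY KEY,
  name NVARCHAR2(100)
);

-- Earlier approach: spell out each character as a hex Unicode code point.
-- U+65E5, U+672C, and U+8A9E are the three characters written directly below.
INSERT INTO product_names (id, name)
VALUES (1, UNISTR('\65E5\672C\8A9E'));

-- With NCHAR literal replacement enabled, the text can be written directly,
-- even when the database character set cannot represent it.
INSERT INTO product_names (id, name)
VALUES (2, N'日本語');

UPDATE product_names
SET    name = N'Ελληνικά'
WHERE  id = 2;
```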

See Also:
"NCHAR String Literal Replacement" in Chapter 7, "Programming with Unicode"

Consistent Linguistic Ordering Support

All SQL functions and operators can now honor the NLS_SORT setting when the new NLS_COMP mode, LINGUISTIC, is used. This mode ensures that all SQL string comparisons are consistent and follow the linguistic convention specified by the NLS_SORT parameter.
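
A minimal sketch of the LINGUISTIC mode follows. The employees table and its last_name column are hypothetical, and BINARY_CI is used only as one convenient NLS_SORT value. Any supported linguistic sort could be substituted.

```sql
-- Choose the collation for this session: binary ordering that ignores case.
ALTER SESSION SET NLS_SORT = BINARY_CI;

-- Make SQL comparisons honor NLS_SORT instead of comparing raw bytes.
ALTER SESSION SET NLS_COMP = LINGUISTIC;

-- Hypothetical table. With the settings above, the equality test is
-- evaluated case-insensitively, so rows holding 'Smith' or 'SMITH'
-- are returned as well as rows holding 'smith'.
SELECT last_name
FROM   employees
WHERE  last_name = 'smith'
ORDER  BY last_name;
```

Because both the WHERE clause and the ORDER BY clause follow the same NLS_SORT value, filtering and ordering stay consistent with each other.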

See Also:
Chapter 5, "Linguistic Sorting and String Searching"

Recommended Database Character Sets and Statement of Direction

Oracle has compiled a list of character sets that it strongly recommends for use as the database character set. Starting with the next major functional release after Oracle Database 10g Release 2, the choice of database character set for new system deployments will be limited to this list.

See Also:
Chapter 2, "Choosing a Character Set" and Appendix A, "Locale Data"

Oracle Database 10g Release 1 (10.1) New Features in Globalization

Accent-Insensitive and Case-Insensitive Linguistic Sorts and Queries

Oracle provides linguistic sorts and queries that use information about base letters, accents, and case to sort character strings. This release lets you specify a sort or query that considers the base letters only (accent-insensitive, which also ignores case) or the base letters and accents but not case (case-insensitive).
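
The two modes can be sketched with the NLSSORT function, assuming the convention that a sort name takes an _AI suffix for accent-insensitive behavior and a _CI suffix for case-insensitive behavior. The words table is hypothetical, and GENERIC_M is used only as an example sort name.

```sql
-- Hypothetical table: words(word VARCHAR2(40)).

-- Case-insensitive ordering: accents still distinguish strings,
-- but uppercase and lowercase forms sort together.
SELECT word
FROM   words
ORDER  BY NLSSORT(word, 'NLS_SORT = GENERIC_M_CI');

-- Accent-insensitive query: only base letters are compared, so
-- 'resume', 'Resume', and 'résumé' all satisfy the condition.
SELECT word
FROM   words
WHERE  NLSSORT(word, 'NLS_SORT = GENERIC_M_AI')
     = NLSSORT('resume', 'NLS_SORT = GENERIC_M_AI');
```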

See Also:
"Linguistic Sort Features"

Character Set Scanner Utilities Enhancements

The Database Character Set Scanner now supports object types.

The new LCSD parameter enables the Database Character Set Scanner (CSSCAN) to perform language and character set detection on the data cells categorized by the LCSDATA parameter. The Database Character Set Scanner reports have also been enhanced.

- Database Character Set Scanner CSALTER Script

  The CSALTER script is a database administrator tool for special character set migration.
- The Language and Character Set File Scanner Utility

  The Language and Character Set File Scanner (LCSSCAN) is a high-performance, statistically based utility for determining the character set and language of plain-text files for which they are not specified.

See Also:
Chapter 12, "Character Set Scanner Utilities"

Globalization Development Kit

The Globalization Development Kit (GDK) simplifies the development process and reduces the cost of developing Internet applications that will support a global multilingual market. GDK includes APIs, tools, and documentation that address many of the design, development, and deployment issues encountered in the creation of global applications. GDK lets a single program work with text in any language from anywhere in the world. It enables you to build a complete multilingual server application with little more effort than it takes to build a monolingual server application.

See Also:
Chapter 8, "Oracle Globalization Development Kit"

Regular Expressions

This release supports POSIX-compliant regular expressions to enhance search and replace capability in programming environments such as UNIX and Java. In SQL, this new functionality is implemented through new functions that are regular expression extensions to existing SQL functions such as LIKE, REPLACE, and INSTR. This implementation supports multilingual queries and is locale-sensitive.
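
As a brief sketch, the queries below use the regular expression counterparts of LIKE, REPLACE, and INSTR with POSIX character classes. The contacts table and its columns are hypothetical. Which characters a class such as [[:alpha:]] matches is assumed to depend on the session's character set and locale settings.

```sql
-- Hypothetical table: contacts(name VARCHAR2(80), phone VARCHAR2(40)).

-- REGEXP_LIKE: names made up entirely of alphabetic characters.
SELECT name
FROM   contacts
WHERE  REGEXP_LIKE(name, '^[[:alpha:]]+$');

-- REGEXP_LIKE with the 'i' match option: case-insensitive prefix match.
SELECT name
FROM   contacts
WHERE  REGEXP_LIKE(name, '^mc', 'i');

-- REGEXP_REPLACE: strip every character that is not a digit.
SELECT REGEXP_REPLACE(phone, '[^[:digit:]]', '') AS digits_only
FROM   contacts;

-- REGEXP_INSTR: position of the first whitespace character, or 0 if none.
SELECT REGEXP_INSTR(name, '[[:space:]]') AS first_space
FROM   contacts;
```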

See Also:
"SQL Regular Expressions in a Multilingual Environment"

Displaying Code Charts for Unicode Character Sets

Oracle Locale Builder can display code charts for Unicode character sets.

See Also:
"Displaying a Code Chart with the Oracle Locale Builder"

Locale Variants

In previous releases, Oracle defined language and territory definitions separately. This resulted in the definition of a territory being independent of the language setting of the user. In this release, some territories can have different date, time, number, and monetary formats based on the language setting of a user. This type of language-dependent territory definition is called a locale variant.

See Also:
"Locale Variants"

Transportable NLB Data

NLB files generated on one platform can be transported to another platform, for example by FTP, and used there in the same way as on the original platform. This is convenient because locale data can be modified on one platform and then copied to other platforms.

See Also:
"Transportable NLB Data"

NLS_LENGTH_SEMANTICS

NLS_LENGTH_SEMANTICS is now supported as an environment variable.
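
The environment variable itself is set outside SQL, so the sketch below only illustrates what the parameter controls, using the session-level equivalent. The table length_demo is hypothetical.

```sql
-- Session-level equivalent of the setting: unqualified column lengths
-- are interpreted as character counts rather than byte counts.
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;

-- Hypothetical table showing the three ways to declare a length.
CREATE TABLE length_demo (
  a VARCHAR2(10),       -- follows the current default: 10 characters here
  b VARCHAR2(10 BYTE),  -- always 10 bytes, regardless of the default
  c VARCHAR2(10 CHAR)   -- always 10 characters, regardless of the default
);
```

In a multibyte character set, column b may hold fewer than 10 characters, while columns a and c hold up to 10 characters regardless of how many bytes each character occupies.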

See Also:
"NLS_LENGTH_SEMANTICS"

Implicit Conversion Between CLOB and NCLOB Datatypes

Implicit conversion between CLOB and NCLOB datatypes is now supported.

See Also:
"Choosing a National Character Set"

Updates to the Oracle Language and Territory Definition Files

Changes have been made to the content in some of the language and territory definition files in Oracle Database 10g Release 1.

See Also:
"Obsolete Locale Data"