13 The Oracle Driver: Unicode Support

Unicode Support
The Oracle driver reads the Oracle client's NLS_LANG environment variable to determine how to transmit data to the client.
On Windows, UNIX, and Linux, the driver treats the setting as Unicode if the NLS_LANG environment variable is set to:
LANGUAGE_TERRITORY.CHARSET
where CHARSET is one of UTF8, AL24UTFFSS, or AL32UTF8. For example:
AMERICAN_AMERICA.UTF8
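The check described above can be sketched in a few lines of Python. This is only an illustration of the LANGUAGE_TERRITORY.CHARSET rule, not the driver's actual logic. The function names are hypothetical, and the case-insensitive comparison is an assumption.

```python
import os

# Character sets that the text above lists as Unicode settings.
UNICODE_CHARSETS = {"UTF8", "AL24UTFFSS", "AL32UTF8"}


def nls_lang_charset(nls_lang):
    """Return the CHARSET part of LANGUAGE_TERRITORY.CHARSET, or None."""
    if not nls_lang or "." not in nls_lang:
        return None
    # CHARSET is everything after the last period.
    charset = nls_lang.rsplit(".", 1)[1].strip()
    # Assumption: compare case-insensitively by normalizing to upper case.
    return charset.upper() or None


def is_unicode_setting(nls_lang):
    """True if the NLS_LANG value names one of the Unicode character sets."""
    return nls_lang_charset(nls_lang) in UNICODE_CHARSETS


if __name__ == "__main__":
    # Inspect the current environment; NLS_LANG may be unset.
    value = os.environ.get("NLS_LANG")
    print(value, is_unicode_setting(value))
```

For example, is_unicode_setting("AMERICAN_AMERICA.UTF8") is true, while an unset or malformed value is treated as not Unicode.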
Alternatively, on Windows, instead of the NLS_LANG environment variable, the value of the HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\oracle_home_key registry key can be set to:
LANGUAGE_TERRITORY.CHARSET
where oracle_home_key is HOME0 for Oracle 9i R2 and earlier, and is the Oracle home name used at the time of client installation for Oracle 10g.
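As a rough sketch, the registry setting could be inspected on Windows with Python's standard winreg module. The helper names are hypothetical, and the sketch assumes the setting is stored as a value named NLS_LANG under the key. On other platforms, or if the key or value is missing, it simply returns None.

```python
def oracle_registry_path(oracle_home_key):
    """Build the subkey path under HKEY_LOCAL_MACHINE for a given home key."""
    return "SOFTWARE\\ORACLE\\" + oracle_home_key


def read_registry_nls_lang(oracle_home_key):
    """Return the NLS_LANG registry value, or None if it is unavailable."""
    try:
        import winreg  # Windows-only standard library module
    except ImportError:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                            oracle_registry_path(oracle_home_key)) as key:
            # Assumption: the setting is stored as a value named NLS_LANG.
            value, _value_type = winreg.QueryValueEx(key, "NLS_LANG")
            return value
    except OSError:
        # Key or value not present, or access denied.
        return None


if __name__ == "__main__":
    # HOME0 is the key named in the text for Oracle 9i R2 and earlier.
    print(read_registry_nls_lang("HOME0"))
```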
If the CHARSET is a Unicode setting and a Unicode application is accessing the driver, then no data conversion is necessary. If an ANSI application is accessing the driver, then the driver must convert the application's data from ANSI to Unicode (UTF-8) for the client.
If the CHARSET is ANSI and an ANSI application is accessing the driver, then no data conversion is necessary. If a Unicode application is accessing the driver, then the driver must convert the application's data from Unicode to ANSI for the client.
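The four cases above reduce to a small decision rule, sketched here in Python for illustration only. The function names are hypothetical, and the ANSI code page is assumed to be Windows-1252 (cp1252). The real ANSI code page depends on the system locale.

```python
def required_conversion(client_is_unicode, app_is_unicode):
    """Return the conversion the driver must perform, or None if none."""
    if client_is_unicode == app_is_unicode:
        # Unicode client with Unicode application, or ANSI with ANSI.
        return None
    if client_is_unicode:
        # ANSI application talking to a Unicode client.
        return "ANSI to Unicode (UTF-8)"
    # Unicode application talking to an ANSI client.
    return "Unicode to ANSI"


def ansi_to_utf8(data, ansi_codec="cp1252"):
    """Re-encode ANSI bytes as UTF-8 (cp1252 is an assumed code page)."""
    return data.decode(ansi_codec).encode("utf-8")


if __name__ == "__main__":
    print(required_conversion(True, False))
    print(ansi_to_utf8(b"caf\xe9"))
```

The second helper shows why the conversion has a cost: each non-ASCII character must be decoded and re-encoded, and a single ANSI byte can become several UTF-8 bytes.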
If NLS_LANG specifies a UTF-8 character set, the Oracle driver maps the Oracle data types to Unicode data types as shown in the following table:
The driver also continues to map these Oracle data types to the normal character data types. See “Data Types” for these mappings.
The driver supports the Unicode ODBC W (Wide) function calls, such as SQLConnectW. This allows the Driver Manager to transmit these calls directly to the driver. Otherwise, the Driver Manager would incur the additional overhead of converting the W calls to ANSI function calls, and vice versa.
See “UTF-16 Applications on UNIX and Linux” for related details. Also, refer to Chapter 4 “Internationalization, Localization, and Unicode” in the DataDirect Connect Series for ODBC Reference for a more detailed explanation of Unicode.