9 DataDirect Bulk Load : Export, Validate, and Load Functions

Export, Validate, and Load Functions
The example code in this section shows the DataDirect functions for bulk exporting, verification, and bulk loading.
ExportTableToFile and ExportTableToFileW
Syntax
SQLReturn
ExportTableToFile  (HDBC      hdbc,
                   SQLCHAR*   TableName,
                   SQLCHAR*   FileName,
                   SQLLEN     IANAAppCodePage,
                   SQLLEN     ErrorTolerance,
                   SQLLEN     WarningTolerance,
                   SQLCHAR*   LogFile)
ExportTableToFileW (HDBC      hdbc,
                   SQLWCHAR*  TableName,
                   SQLWCHAR*  FileName,
                   SQLLEN     IANAAppCodePage,
                   SQLLEN     ErrorTolerance,
                   SQLLEN     WarningTolerance,
                   SQLWCHAR*  LogFile)
The standard ODBC return codes are returned: SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_INVALID_HANDLE, and SQL_ERROR.
Description
ExportTableToFile (ANSI application) and ExportTableToFileW (Unicode application) bulk export a table to a physical file. Both a bulk data file and a bulk configuration file are produced by this operation. The configuration file has the same name as the data file, but with an XML extension. The bulk export operation can create a log file and can also export to external files. Refer to “External Overflow Files” in Chapter 3 of the DataDirect Connect Series for ODBC User’s Guide for more information. The export operation can be configured such that if any errors or warnings occur:
Parameters
hdbc is the driver’s connection handle, which is not the handle returned by SQLAllocHandle or SQLAllocConnect. To obtain the driver's connection handle, the application must then use the standard ODBC function SQLGetInfo (ODBC Conn Handle, SQL_DRIVER_HDBC).
TableName is a null-terminated string that specifies the name of the source database table that contains the data to be exported.
FileName is a null-terminated string that specifies the path (relative or absolute) and file name of the bulk load data file to which the data is to be exported. It also specifies the file name of the bulk configuration file. This file must not already exist. If the file already exists, an error is returned.
IANAAppCodePage specifies the code page value to which the driver must convert all data for storage in the bulk data file. Refer to “Character Set Conversions” in Chapter 3 of the DataDirect Connect Series for ODBC User’s Guide for more information.
The default value on Windows is the current code page of the machine. On UNIX/Linux, the default value is 4.
ErrorTolerance specifies the number of errors to tolerate before an operation terminates. A value of 0 indicates that no errors are tolerated; the operation fails when the first error is encountered.
The default of -1 means that an infinite number of errors is tolerated.
WarningTolerance specifies the number of warnings to tolerate before an operation terminates. A value of 0 indicates that no warnings are tolerated; the operation fails when the first warning is encountered.
The default of -1 means that an infinite number of warnings is tolerated.
LogFile is a null-terminated character string that specifies the path (relative or absolute) and file name of the bulk log file. Events logged to this file are:
Information about the load is written to this file, preceded by a header. Information about the next load is appended to the end of the file.
If LogFile is NULL, no log file is created.
Example
HDBC      hdbc;
HENV      henv;
void      *driverHandle;
HMODULE      hmod;
PExportTableToFile exportTableToFile;
 
char      tableName[128];
char      fileName[512];
char      logFile[512];
int       errorTolerance;
int       warningTolerance;
int       codePage;
 
/* Get the driver's connection handle from the DM. This handle must be used when calling directly into the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HDBC, &driverHandle, 0, NULL);
if (rc != SQL_SUCCESS) {
    ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
    EnvClose (henv, hdbc);
    exit (255);
}
 
/* Get the DM's shared library or DLL handle to the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HLIB, &hmod, 0, NULL);
if (rc != SQL_SUCCESS) {
    ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
    EnvClose (henv, hdbc);
    exit (255);
}
 
exportTableToFile = (PExportTableToFile)
    resolveName (hmod, "ExportTableToFile");
if (! exportTableToFile) {
    printf ("Cannot find ExportTableToFile!\n");
    exit (255);
}
 
rc = (*exportTableToFile) (
      driverHandle,
      (const SQLCHAR *) tableName,
      (const SQLCHAR *) fileName,
      codePage,
      errorTolerance, warningTolerance,
      (const SQLCHAR *) logFile);
if (rc == SQL_SUCCESS) {
      printf ("Export succeeded.\n");
}
else {
      driverError (driverHandle, hmod);
}
ValidateTableFromFile and ValidateTableFromFileW
Syntax
SQLReturn
ValidateTableFromFile  (HDBC       hdbc,
                        SQLCHAR*   TableName,
                        SQLCHAR*   ConfigFile,
                        SQLCHAR*   MessageList,
                        SQLULEN    MessageListSize,
                        SQLULEN*   NumMessages)
ValidateTableFromFileW (HDBC       hdbc,
                        SQLCHAR*   TableName,
                        SQLCHAR*   ConfigFile,
                        SQLCHAR*   MessageList,
                        SQLULEN    MessageListSize,
                        SQLULEN*   NumMessages)
The standard ODBC return codes are returned: SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_INVALID_HANDLE, and SQL_ERROR.
Description
ValidateTableFromFile (ANSI application) and ValidateTablefromFileW (Unicode application) verify the metadata in the configuration file against the data structure of the target database table. Refer to “Verification of the Bulk Load Configuration File” in Chapter 3 of the DataDirect Connect Series for ODBC User’s Guide for more detailed information.
NOTE FOR SALESFORCE USERS: The Salesforce driver does not support ValidateTableFromFile and ValidateTableFromFileW.
Parameters
hdbc is the driver’s connection handle, which is not the handle returned by SQLAllocHandle or SQLAllocConnect. To obtain the driver's connection handle, the application must then use the standard ODBC function SQLGetInfo (ODBC Conn Handle, SQL_DRIVER_HDBC).
TableName is a null-terminated character string that specifies the name of the target database table into which the data is to be loaded.
ConfigFile is a null-terminated character string that specifies the path (relative or absolute) and file name of the bulk configuration file.
MessageList specifies a pointer to a buffer used to record any of the errors and warnings. MessageList must not be null.
MessageListSize specifies the maximum number of characters that can be written to the buffer to which MessageList points. If the buffer to which MessageList points is not big enough to hold all of the messages generated by the validation process, the validation is aborted and SQL_ERROR is returned.
NumMessages contains the number of messages that were added to the buffer. This method reports the following criteria:
Check data types - Each column data type is checked to ensure no loss of data occurs. If a data type mismatch is detected, the driver adds an entry to the MessageList in the following format: Risk of data conversion loss: Destination column_number is of type x, and source column_number is of type y.
Check column sizes - Each column is checked for appropriate size. If column sizes are too small in destination tables, the driver adds an entry to the MessageList in the following format: Possible Data Truncation: Destination column_number is of size x while source column_number is of size y.
Check codepages - Each column is checked for appropriate code page alignment between the source and destination. If a mismatch occurs, the driver adds an entry to the MessageList in the following format: Destination column code page for column_number risks data corruption if transposed without correct character conversion from source column_number.
Check Config Col Info - The destination metadata and the column metadata in the configuration file are checked for consistency of items such as length for character and binary data types, the character encoding code page for character types, precision and scale for numeric types, and nullablity for all types. If any inconsistency is found, the driver adds an entry to the MessageList in the following format: Destination column metadata for column_number has column info mismatches from source column_number.
Check Column Names and Mapping - The columns defined in the configuration file are compared to the destination table columns based on the order of the columns. If the number of columns in the configuration file and/or import file does not match the number of columns in the table, the driver adds an entry to the MessageList in the following format: The number of destination columns number does not match the number of source columns number.
The function returns an array of null-terminated strings in the buffer to which MessageList points with an entry for each of these checks. If the driver determines that the information in the bulk load configuration file matches the metadata of the destination table, a return code of SQL_SUCCESS is returned and the MessageList remains empty.
If the driver determines that there are minor differences in the information in the bulk load configuration file and the destination table, then SQL_SUCCESS_WITH_INFO is returned and the MessageList is populated with information on the cause of the potential problems.
If the driver determines that the information in the bulk load information file cannot successfully be loaded into the destination table, then a return code of SQL_ERROR is returned and the MessageList is populated with information on the problems and mismatches between the source and destination.
Example
HDBC      hdbc;
HENV      henv;
void      *driverHandle;
HMODULE   hmod;
PValidateTableFromFile validateTableFromFile;
 
char      tableName[128];
char      configFile[512];
char      messageList[10240];
SQLLEN    numMessages;
 
/* Get the driver's connection handle from the DM. This handle must be used when calling directly into the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HDBC, &driverHandle, 0, NULL);
if (rc != SQL_SUCCESS) {
    ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
    EnvClose (henv, hdbc);
    exit (255);
}
 
/* Get the DM's shared library or DLL handle to the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HLIB, &hmod, 0, NULL);
if (rc != SQL_SUCCESS) {
    ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
    EnvClose (henv, hdbc);
    exit (255);
}
 
validateTableFromFile = (PValidateTableFromFile)
  resolveName (hmod, "ValidateTableFromFile");
if (!validateTableFromFile) {
  printf ("Cannot find ValidateTableFromFile!\n");
  exit (255);
}
 
messageList[0] = 0;
numMessages = 0;
 
rc = (*validateTableFromFile) (
      driverHandle,
      (const SQLCHAR *) tableName,
      (const SQLCHAR *) configFile,
      (SQLCHAR *) messageList,
      sizeof (messageList),
      &numMessages);
printf ("%d message%s%s\n", numMessages,
       (numMessages == 0) ? "s" :
       ((numMessages == 1) ? " : " : "s : "),
       (numMessages > 0) ? messageList : "");
if (rc == SQL_SUCCESS) {
       printf ("Validate succeeded.\n");
}
else {
    driverError (driverHandle, hmod);
}
LoadTableFromFile and LoadTableFromFileW
Syntax
SQLReturn
LoadTableFromFile  (HDBC      hdbc,
                    SQLCHAR*   TableName,
                    SQLCHAR*   FileName,
                    SQLLEN     ErrorTolerance,
                    SQLLEN     WarningTolerance,
                    SQLCHAR*   ConfigFile,
                    SQLCHAR*   LogFile,
                    SQLCHAR*   DiscardFile,
                    SQLULEN    LoadStart,
                    SQLULEN    LoadCount,
                    SQLULEN    ReadBufferSize)
LoadTableFromFileW  (HDBC      hdbc,
                    SQLWCHAR*  TableName,
                    SQLWCHAR*  FileName,
                    SQLLEN     ErrorTolerance,
                    SQLLEN     WarningTolerance,
                    SQLWCHAR*  ConfigFile,
                    SQLWCHAR*  LogFile,
                    SQLWCHAR*  DiscardFile,
                    SQLULEN    LoadStart,
                    SQLULEN    LoadCount,
                    SQLULEN    ReadBufferSize)
The standard ODBC return codes are returned: SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_INVALID_HANDLE, and SQL_ERROR.
Description
LoadTableFromFile (ANSI application) and LoadTablefromFileW (Unicode application) bulk load data from a file to a table. The load operation can create a log file and can also create a discard file that contains rows rejected during the load. The discard file is in the same format as the bulk load data file. After fixing reported issues in the discard file, the bulk load can be reissued using the discard file as the bulk load data file.
The load operation can be configured such that if any errors or warnings occur:
If a load fails, the LoadStart and LoadCount parameters can be used to control which rows are loaded when a load is restarted after a failure.
Parameters
hdbc is the driver’s connection handle, which is not the handle returned by SQLAllocHandle or SQLAllocConnect. To obtain the driver's connection handle, the application must then use the standard ODBC function SQLGetInfo (ODBC Conn Handle, SQL_DRIVER_HDBC).
TableName is a null-terminated character string that specifies the name of the target database table into which the data is to be loaded. For the Salesforce driver, the value of this parameter can vary. See “Using the TableName Parameter with the Salesforce Driver” for more information.
FileName is a null-terminated string that specifies the path (relative or absolute) and file name of the bulk data file from which the data is to be loaded.
ErrorTolerance specifies the number of errors to tolerate before an operation terminates. A value of 0 indicates that no errors are tolerated; the operation fails when the first error is encountered.
The default of -1 means that an infinite number of errors is tolerated.
WarningTolerance specifies the number of warnings to tolerate before an operation terminates. A value of 0 indicates that no warnings are tolerated; the operation fails when the first warning is encountered.
The default of -1 means that an infinite number of warnings is tolerated.
ConfigFile is a null-terminated character string that specifies the path (relative or absolute) and file name of the bulk configuration file.
LogFile is a null-terminated character string that specifies the path (relative or absolute) and file name of the bulk log file. Events logged to this file are:
Message for each row that failed to load.
Information about the load is written to this file, preceded by a header. Information about the next load is appended to the end of the file.
If LogFile is NULL, no log file is created.
DiscardFile is a null-terminated character string that specifies the path (relative or absolute) and file name of the bulk discard file. Any row that cannot be inserted into database as result of bulk load is added to this file, with the last row to be rejected added to the end of the file.
Information about the load is written to this file, preceded by a header. Information about the next load is appended to the end of the file.
If DiscardFile is NULL, no discard file is created.
LoadStart specifies the first row to be loaded from the data file. Rows are numbered starting with 1. For example, when LoadStart=10, the first 9 rows of the file are skipped and the first row loaded is row 10. This parameter can be used to restart a load after a failure.
LoadCount specifies the number of rows to be loaded from the data file. The bulk load operation loads rows up to the value of LoadCount from the file to the database. It is valid for LoadCount to specify more rows than exist in the data file. The bulk load operation completes successfully when either the LoadCount value has been loaded or the end of the data file is reached. This parameter can be used in conjunction with LoadStart to restart a load after a failure.
ReadBufferSize specifies the size, in KB, of the buffer that is used to read the bulk data file for a bulk load operation. The default is 2048.
Example
HDBC      hdbc;
HENV      henv;
void      *driverHandle;
HMODULE   hmod;
PLoadTableFromFile loadTableFromFile;
char      tableName[128];
char      fileName[512];
char      configFile[512];
char      logFile[512];
char      discardFile[512];
int       errorTolerance;
int       warningTolerance;
int       loadStart;
int       loadCount;
int       readBufferSize;
 
/* Get the driver's connection handle from the DM. This handle must be used when calling directly into the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HDBC, &driverHandle, 0, NULL);
if (rc != SQL_SUCCESS) {
  ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
  EnvClose (henv, hdbc);
  exit (255);
}
 
/* Get the DM's shared library or DLL handle to the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HLIB, &hmod, 0, NULL);
if (rc != SQL_SUCCESS) {
  ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
  EnvClose (henv, hdbc);
  exit (255);
}
 
loadTableFromFile = (PLoadTableFromFile)
  resolveName (hmod, "LoadTableFromFile");
if (! loadTableFromFile) {
  printf ("Cannot find LoadTableFromFile!\n");
  exit (255);
}
 
rc = (*loadTableFromFile) (
     driverHandle,
     (const SQLCHAR *) tableName,
     (const SQLCHAR *) fileName,
     errorTolerance, warningTolerance,
     (const SQLCHAR *) configFile,
     (const SQLCHAR *) logFile,
     (const SQLCHAR *) discardFile,
     loadStart, loadCount,
     readBufferSize);
if (rc == SQL_SUCCESS) {
     printf ("Load succeeded.\n");
}
else {
     driverError (driverHandle, hmod);
}
Using the TableName Parameter with the Salesforce Driver
The value required in the TableName parameter varies, depending on the bulk operation specified in the SetBulkOperation function. The following paragraphs describe the TableName value based on whether the Bulk Operation type is set to INSERT, DELETE, or UPSERT.
BULK_OPERATION_INSERT
table_name [(column_list)]
Where
column_list is (columnSpec[, columnSpec]…) and
columnSpec can be columnName or foreignKeyColumnName EXT_ID externalIdColumnName.
The column names define the mapping between columns in the table and columns in the bulk data file. The column names can also indicate which columns are External ID columns. See “Specifying an External ID Column” for more information.
The SQL equivalent of this function is:
INSERT INTO table_name [(column_list)] VALUES (? … ?)
BULK_OPERATION_DELETE
table_name (column_list)
Where column_list is the ID column, which identifies the row to delete.
For DELETE, the ID column is the only valid column in the column list.
The SQL equivalent of this function is:
DELETE FROM table_name WHERE <column> = ? AND <column> = ? …
BULK_OPERATION_UPDATE
table_name (column_list)
Where column_list is ID_column, <update column>[,<update column>]…
ID_column must be one of the columns in the column list. The ID column identifies which row to update; the other columns are the list of columns to be updated.
The SQL equivalent of this function is:
UPDATE table_name SET <update column> = ? … WHERE <ID column> = ? …
BULK_OPERATION_UPSERT
table_name (column_list)
The column list is the same as for INSERT except that at least one of the columns must be identified as an external ID (see BULK_OPERATION_INSERT ).
For UPSERT, column_list can be (columnSpec[, columnSpec]…)
columnSpec can be one of the following:
extIdColumn EXT_ID
where extIdColumn is the column that is checked to determine whether the row already exists in the database.
The SQL equivalent of this function is one of the following:
INSERT INTO table_name [(column_list)] VALUES (? … ?)
UPDATE table_name SET <table column> = ? … WHERE <key column> = ? …
See “Specifying an External ID Column” for more information.
 
SetBulkOperation (Salesforce Driver Only)
Syntax
SQLReturn
SetBulkOperation  (HDBC      hdbc,
                   SQLULEN   Operation)
The standard ODBC return codes are returned: SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_INVALID_HANDLE, SQL_NO_DATA, and SQL_ERROR.
Description
Specifies the bulk operation to be performed when the LoadTableFromFile and LoadTableFromFileW methods are called. The bulk operation remains set until SetBulkOperation is called again. When a connection is established, the initial bulk operation is BULK_OPERATION_INSERT.
Parameters
hdbc is the driver’s connection handle, which is not the handle returned by SQLAllocHandle or SQLAllocConnect. To obtain the driver's connection handle, the application must use SQLGetInfo (ODBC Conn Handle, SQL_DRIVER_HDBC).
Operation is an integer value that specifies the bulk operation to set on the connection. It can have one of the following values:
Example
HDBC      hdbc;
HENV      henv;
void      *driverHandle;
HMODULE   hmod;
PSetBulkOperation setBulkOperation;
 
/* Get the driver's connection handle from the DM. This handle must be used when calling directly into the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HDBC, &driverHandle, 0, NULL);
if (rc != SQL_SUCCESS) {
   ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
   EnvClose (henv, hdbc);
   exit (255);
}
 
/* Get the DM's shared library or DLL handle to the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HLIB, &hmod, 0, NULL);
if (rc != SQL_SUCCESS) {
   ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
   EnvClose (henv, hdbc);
   exit (255);
}
 
/* Set the Bulk Operation type to DELETE. Any subsequent call to LoadTableFromFile(W) will result in a bulk delete of the rows specified. */
 
setBulkOperation = (PSetBulkOperation)
   resolveName (hmod, "SetBulkOperation");
if (! setBulkOperation) {
   printf ("Cannot find SetBulkOperation!\n");
   exit (255);
}
 
rc = (*setBulkOperation) (
      driverHandle,
      BULK_OPERATION_DELETE);
if (rc == SQL_SUCCESS) {
    printf ("Set Bulk operation(DELETE) succeeded.\n");
}
else {
     driverError (driverHandle, hmod);
}
/* */
GetBulkOperation (Salesforce Driver Only)
Syntax
SQLReturn
GetBulkOperation  (HDBC        hdbc,
                   SQLULEN    *pOperation)
The standard ODBC return codes are returned: SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_INVALID_HANDLE, and SQL_ERROR.
Description
Returns the bulk operation currently set on the connection. The bulk operation specifies the operation to be performed when the LoadTableFromFile or LoadTableFromFileW methods are called (see “LoadTableFromFile and LoadTableFromFileW”).
Parameters
hdbc is the driver’s connection handle, which is not the handle returned by SQLAllocHandle or SQLAllocConnect. To obtain the driver's connection handle, the application must then use the standard ODBC function SQLGetInfo (ODBC Conn Handle, SQL_DRIVER_HDBC).
pOperation is a pointer to the location where current bulk operation specified for the connection is returned. The returned value is one of the operation values defined by “SetBulkOperation (Salesforce Driver Only)”.
Example
HDBC      hdbc;
HENV      henv;
void      *driverHandle;
HMODULE  hmod;
PGetBulkOperation getBulkOperation;
SQLULEN bulkOperationType;
 
/* Get the driver's connection handle from the DM. This handle must be used when calling directly into the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HDBC, &driverHandle, 0, NULL);
if (rc != SQL_SUCCESS) {
ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
EnvClose (henv, hdbc);
exit (255);
}
 
/* Get the DM's shared library or DLL handle to the driver. */
 
rc = SQLGetInfo (hdbc, SQL_DRIVER_HLIB, &hmod, 0, NULL);
if (rc != SQL_SUCCESS) {
ODBC_error (henv, hdbc, SQL_NULL_HSTMT);
EnvClose (henv, hdbc);
exit (255);
}
 
/* Get the current value for bulk operation. */
 
getBulkOperation = (PGetBulkOperation)
resolveName (hmod, "GetBulkOperation");
if (! getBulkOperation) {
printf ("Cannot find GetBulkOperation!\n");
exit (255);
}
 
rc = (*getBulkOperation) (
driverHandle,
&bulkOperationType);
if (rc == SQL_SUCCESS) {
printf ("Current bulk operation is: %u.\n", bulkOperationType);
}
else {
driverError (driverHandle, hmod);
}
 
/* */