Using Catalog Functions

5 Designing ODBC Applications for Performance Optimization : Using Catalog Functions

Using Catalog Functions

Because catalog functions, such as those listed here, are slow compared to other ODBC functions, their frequent use can impair system performance:

■

SQLColumns

■

SQLProcedureColumns

■

SQLColumnPrivileges

■

SQLSpecialColumns

■

SQLForeignKeys

■

SQLStatistics

■

SQLGetTypeInfo

■

SQLTables

■

SQLProcedures

SQLGetTypeInfo is included in this list of expensive ODBC functions because many drivers must query the server to obtain accurate information about which types are supported (for example, to find dynamic types such as user defined types).

Caching Information to Minimize the Use of Catalog Functions

To return all result column information mandated by the ODBC specification, a driver may have to perform multiple queries, joins, subqueries, or unions to return the required result set for a single call to a catalog function. These particular elements of the SQL language are performance expensive.

Although it is almost impossible to write an ODBC application without catalog functions, their use should be minimized. By caching information, applications can avoid multiple executions.

For example, call SQLGetTypeInfo once in the application and cache the elements of the result set that your application depends on. It is unlikely that any application uses all elements of the result set generated by a catalog function, so the cached information should not be difficult to maintain.

Avoiding Search Patterns

Passing NULL arguments or search patterns to catalog functions generates time-consuming queries. In addition, network traffic potentially increases because of unwanted results. Always supply as many non-NULL arguments to catalog functions as possible. Because catalog functions are slow, applications should invoke them efficiently. Any information that the application can send the driver when calling catalog functions can result in improved performance and reliability.

For example, consider a call to SQLTables where the application requests information about the table "Customers." Often, this call is coded as shown, using as many NULL arguments as possible:

rc = SQLTables (hstmt, NULL, 0, NULL, 0, "Customers", SQL_NTS, NULL, 0);

A driver processes this SQLTables call into SQL that looks like this:

SELECT ...  FROM SysTables WHERE TableName = ’Customers’

UNION ALL

SELECT ... FROM SysViews WHERE ViewName = ’Customers’

UNION ALL

SELECT ... FROM SysSynonyms WHERE SynName = ’Customers’ ORDER BY ...

In our example, the application provides scant information about the object for which information was requested. Suppose three "Customers" tables were returned in the result set: the first table owned by the user named Beth, the second owned by the sales department, and the third a view created by management.

It may not be obvious to the end user which table to choose. If the application had specified the OwnerName argument in the SQLTables call, only one table would be returned and performance would improve. Less network traffic would be required to return only one result row and unwanted rows would be filtered by the database. In addition, if the TableType argument was supplied, the SQL sent to the server can be optimized from a three-query union into a single Select statement as shown:

SELECT ... FROM SysTables WHERE TableName = 'Customers' AND Owner = 'Beth'

Using a Dummy Query to Determine Table Characteristics

Avoid using SQLColumns to determine characteristics about a table. Instead, use a dummy query with SQLDescribeCol.

Consider an application that allows the user to choose the columns that will be selected. Should the application use SQLColumns to return information about the columns to the user or prepare a dummy query and call SQLDescribeCol?

Case 1: SQLColumns Method

rc = SQLColumns (... "UnknownTable" ...);

// This call to SQLColumns will generate a query to the system catalogs...

// possibly a join which must be prepared, executed, and produce a result set

rc = SQLBindCol (...);

rc = SQLExtendedFetch (...);

// user must retrieve N rows from the server

// N = # result columns of UnknownTable

// result column information has now been obtained

Case 2: SQLDescribeCol Method

// prepare dummy query

rc = SQLPrepare (... "SELECT * FROM UnknownTable WHERE 1 = 0" ...);

// query is never executed on the server - only prepared

rc = SQLNumResultCols (...);

for (irow = 1; irow <= NumColumns; irow++) {

  rc = SQLDescribeCol (...)

  // + optional calls to SQLColAttributes

  }

// result column information has now been obtained

// Note we also know the column ordering within the table!

// This information cannot be assumed from the SQLColumns example.

In both cases, a query is sent to the server, but in Case 1, the query must be evaluated and form a result set that must be sent to the client. Clearly, Case 2 is the better performing model.

To complicate this discussion, let us consider a database server that does not natively support preparing a SQL statement. The performance of Case 1 does not change, but the performance of Case 2 improves slightly because the dummy query is evaluated before being prepared. Because the Where clause of the query always evaluates to FALSE, the query generates no result rows and should execute without accessing table data. Again, for this situation, Case 2 outperforms Case 1.