21 The Salesforce Driver : Client-Side Caches

Client-Side Caches
The Salesforce driver can implement a client-side data cache for improved performance. Data is cached from the remote data source to the local machine on which the driver is located.
The driver caches data on a per-table basis, as opposed to caching the result of a particular query. Caching data on a table level allows the caches to be queried, filtered, and sorted in other queries. Once a cache is created, its use is transparent to the application. For example, if a cache is created on the Account table, then all subsequent queries that reference Account access the Account cache. Disabling or dropping the cache allows references to the Account table to access the remote data again. Because the use of the cache is transparent, no changes to the application are required to take advantage of the cache.
You must specifically create a cache before it can be populated; caches are not created automatically. After you have created a cache on a table, the cache will be populated as a result of the next operation on the table. For example, after creating a cache on Account, data is returned from the Salesforce data source and stored locally in the cache when you first execute the following statement:
SELECT ROWID, SYS_NAME FROM Account
Any subsequent queries against the Account table return data from the cache, which reduces response time. SQL queries can access both cached data and remote data (data stored in Salesforce that has not been assigned to a cache) in the same statement.
The caches maintained by the Salesforce driver are write-through caches. This means that, for any operation that modifies data in a table that is cached, the driver performs the operation on the remote data first and then updates the cache as much as possible.
To create, modify, refresh, or delete client-side data caches, use the following SQL statement extensions:
See the following sections for overviews of each extension. Refer to Chapter 10 “SQL Statements and Extensions for the Salesforce Driver” in the DataDirect Connect Series for ODBC Reference for descriptions of the syntax of these extensions.
Creating a Cache
You create a cache using the Create Cache statement (refer to “Create Cache (EXT)” in the DataDirect Connect Series for ODBC Reference). A cache can be created on a single table or on a set of related tables. When creating a cache on a single table, you specify the name of the table to cache and can optionally specify a filter for the table. The filter determines whether the cache holds all of the data in the remote table or a subset of the data that matches the filter. You can also specify attributes for the Create Cache statement that determine:
Creating a cache for a set of related tables is similar to creating a cache on a single table except that a primary table and one or more referencing tables are specified. This is useful if you want to cache a subset of data for a table and also cache data related to that subset of data. For example, you might have three tables, Account, Contact, and Opportunity, where both a contact and an opportunity belong to a particular account. Using a relational cache, you could specify that accounts that have had activity in the past year be cached, as well as caching the opportunities and contacts for only those cached accounts.
Modifying a Cache Definition
Once a cache has been created, you can modify the definition of the cache or set of related caches with the Alter Cache statement (refer to “Alter Cache (EXT)” in the DataDirect Connect Series for ODBC Reference). Only the attributes of the cache can be modified through the Alter Cache statement; the table or related set of tables cannot be changed and a single table cache cannot be changed to a relational cache.
WARNING: Changing the attributes of a cache may cause the current data in the cache to be discarded and refetched from the remote data source.
Disabling and Enabling a Cache
When a cache is defined on a table, all fetch operations performed on that table access the cache, essentially hiding the remote table from the application. At times, you may want an application to query the remote data instead of the cached data. For example, assume that a cache was created on Account with a filter set to cache accounts that have had activity in the past year. You may want to run a query to get information about an account that has not been active for two years. One alternative would be to drop the Account cache, run the query, and then recreate the cache on Account, but this can be problematic. First, you must recreate the cache and make sure it had the same attributes as before. Second, the data in the cache is discarded and needs to be refetched when the cache is recreated. Depending on the amount of cached data, this could take a significant amount of time. To address this type of issue, the Salesforce driver can temporarily disable a cache. When a cache is disabled, its definition and data are maintained. Any queries that reference a table with a disabled cache access the remote table. When you want to access cached data again, the cache can be enabled.
Refreshing Cache Data
To prevent the data in a cache from becoming out of date, the driver must periodically refresh the cache data with data from the remote data source. To minimize the amount of data that needs to be moved when a cache is refreshed, and therefore the time required to refresh it, the driver checks to see which records in the remote table have been added, modified, or deleted since the last time the cache was refreshed. The driver retrieves only data for added or modified records and removes only deleted records from the cache. You or the application can refresh the cache manually or the driver can refresh the cache automatically.
You can refresh a cache manually at any time by using the Refresh Cache statement (refer to “Refresh Cache (EXT)” in the DataDirect Connect Series for ODBC Reference). The Refresh Cache statement can also be used to perform a Clean (complete) refresh in addition to the standard optimized refresh. A Clean refresh discards all of the data from the cache and repopulates it with data from the remote data source.
The driver can refresh a cache automatically in one of two ways. When you create a cache, one of the attributes that you set is the refresh interval for the cache. During each cache query, the driver checks to see whether the time elapsed since the last refresh exceeds the refresh interval for the cache. If it has, the driver refreshes the cache before satisfying the query.
Update operations to a table that is cached can trigger the driver to refresh the cache automatically. The caches maintained by the Salesforce driver are write-through caches. For any operation that modifies data in a table that is cached, the driver performs the operation on the remote data first and then updates the cache as much as possible. The driver may not be able to update the cache with all of the modifications because some of the modified data may have been generated by the remote data source. For example, if a row is inserted but a value for all columns in the row is not required, any default values generated by the remote data source for columns not specified in the Insert statement would not be set in the cache. Because the driver cannot reflect all of the changes made when a cached table is modified, it sets the cache state to dirty. When a cache state is dirty, the next query that attempts to fetch data from that cache causes the driver to refresh the cache before the fetch operation is performed. This allows the fetch to see the values populated by the remote data source.
Automatically refreshing a dirty cache is not always desirable. For example, if an application alternates fetches and inserts on a table, and the insert does not depend on any remote data source generated values, then the refresh between fetches is unnecessary. The Refresh Dirty Cache connection option (see “Connection Option Descriptions”) controls whether the driver automatically refreshes a cache with a dirty state. The state of a cache can be viewed by selecting the STATUS column of the SYSTEM_CACHES catalog table. See “SYSTEM_CACHES Catalog Table” for more information.
Dropping a Cache
You can drop an existing cache using the Drop Cache statement (refer to “Drop Cache (EXT)” in the DataDirect Connect Series for ODBC Reference). If a cache is a relational cache, the Drop Cache statement drops the cache for the primary table as well as the caches for the related tables.
NOTE: When a cache is dropped, all of the data in that cache is discarded.
Cache MetaData
The Salesforce driver maintains information about the caches that have been created. The driver provides two system tables to expose the cache information, the SYSTEM_CACHES table and the SYSTEM_CACHE_REFERENCES table.
The SYSTEM_CACHES and SYSTEM_CACHE_REFERENCES system tables exist in the INFORMATION_SCHEMA schema. See “Catalog Tables” for a complete description of the contents of these system tables.