VSAM buffering

The physical process of going out to disk (or tape) for data takes time. The point of buffering is to avoid going to disk where possible by holding data in memory, which is faster to access than even the disk cache. VSAM has default values for how many buffers to allocate to the data and index components of a VSAM file. However, these defaults were worked out when all storage was below the 16MB line, all disks were real CKD devices, spinning and slow, and the communication channels were slow too. The defaults have not changed with technology such as flash storage, RAID FBA disks emulating CKD, PAV and FICON channels. Because it can be difficult to work out exactly what the best options are, a number of Rules of Thumb have been developed over the years. However, be aware that any old Rule of Thumb designed to avoid disk seek time is no longer valid. Modern DASD devices have large cache capacity, and read-ahead algorithms will preload up to a cylinder's worth of data into cache. This eliminates I-O delay due to seek times and rotational positioning, but there is still a lot of benefit to be gained from I-O buffer tuning.

JCL buffers

Before adding any extra buffers to a job, consider what impact this might have on the storage allocated to the job, and whether you might get a GETMAIN region abend. The best option is to use REGION=0M, as this allows the job to use memory below the 16MB line, between 16MB and 2GB, and above the 2GB bar.

The VSAM unit of data transfer is the CI, and the CI size often defaults to 4096 bytes. By default, VSAM provides two data buffers and one index buffer, which generally do not provide adequate performance. The following formulae have worked well over the years.

Sequential processing reads through the data component of the file from start to end. The recommendation is to define one index CI buffer, and several data CI buffers for look-ahead. However, be aware that too many data buffers could cause paging and so degrade performance. Sequential processing should also get a lot of assistance from the disk cache. Use:

   BUFND = (2 * number of CIs per track) + 3
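This rule of thumb can be sketched as a small calculation. Note that the number of CIs per track depends on the device geometry and CI size; the figure of 12 used below is for a 4096-byte CI on a 3390, and should be checked against your own device and CI size.

```python
def bufnd_sequential(cis_per_track: int) -> int:
    """Sequential rule of thumb: BUFND = (2 * number of CIs per track) + 3."""
    return 2 * cis_per_track + 3

# A 4096-byte CI on a 3390 gives 12 CIs per track (device-dependent assumption).
print(bufnd_sequential(12))  # 27 data buffers
```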

For random processing, if the index is small enough, define enough index buffers to contain the entire index; otherwise aim to get the sequence set and the top-level index record into buffers. The sequence set is the lowest level of the index. Use:

   BUFNI = (TI - (HURBA / CASZ)) + 1

where

   TI    = total number of index records in the index component
   HURBA = high-used RBA of the data component
   CASZ  = CISZ * number of CIs per CA

Note that HURBA / CASZ is the number of used control areas in the data component, which is also the number of sequence set records, so the formula buffers the index set plus one sequence set record. If you are allocating more than one string, as indicated by the STRNO value, then you need to allocate one more index buffer than the number of strings, just to get the highest-level index record into a buffer. Allocate further buffers to hold the lower index levels.
All these values can be found by examining a LISTCAT of the VSAM cluster. Just issue the TSO command LISTCAT ENT(cluster name) ALL
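The BUFNI arithmetic can be sketched as follows. The function name and the sample LISTCAT figures are made up for illustration; substitute the REC-TOTAL, HI-U-RBA, CISZ and CI/CA values from your own LISTCAT output.

```python
import math

def bufni_random(total_index_records: int, hurba: int,
                 ci_size: int, cis_per_ca: int) -> int:
    """Random-processing rule of thumb: BUFNI = (TI - HURBA/CASZ) + 1.

    total_index_records : index component record total (TI)
    hurba               : high-used RBA of the data component
    ci_size, cis_per_ca : CI size and CIs per CA of the data component
    """
    casz = ci_size * cis_per_ca                      # bytes per control area
    sequence_set_records = math.ceil(hurba / casz)   # one per used CA
    return (total_index_records - sequence_set_records) + 1

# Illustrative (made-up) figures: 130 index records, data HI-U-RBA of
# 94,371,840 bytes, 4 KB CIs, 180 CIs per CA -> 128 used CAs.
print(bufni_random(130, 94_371_840, 4096, 180))  # 3 index buffers
```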

The following JCL can be used to place index data in buffers.

//DD1    DD DSN=aa.bb.cc,DISP=SHR,
//       AMP=('BUFNI=50')

In this case the index was fairly small, and could be contained in 50 buffers. With good buffering, the I-O rate on a heavily accessed index can drop by a factor of ten or more.
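The same AMP technique applies to the sequential case using BUFND. A sketch, with a placeholder data set name and a buffer count taken from the (2 * CIs per track) + 3 rule for a 4 KB CI on a 3390 (12 CIs per track, so 27 buffers; substitute the figures for your own device and CI size):

//DD2    DD DSN=aa.bb.cc,DISP=SHR,
//       AMP=('BUFND=27')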
