VSAM buffering

Going out to disk (or tape) for data takes time. The point of buffering is to avoid physical I-O where possible by holding data in memory, which is faster to access than even the disk cache. VSAM has default values for how many buffers to allocate to the data and index components of a VSAM file. However, these defaults were worked out when all storage was below the 16MB line, all disks were real, slow, spinning CKD devices, and the channels were slow too. The defaults have not changed to keep pace with technology such as flash storage, RAID FBA disks emulating CKD, PAV and FICON channels.

It can be difficult to work out exactly what the best options are, so a number of Rules of Thumb have been developed over the years. Be aware, though, that the old Rules of Thumb designed to avoid disk seek time are no longer valid. Modern DASD devices have large cache capacity, and read-ahead algorithms will preload up to a cylinder's worth of data into cache. This eliminates I-O delay due to seek times and rotational positioning, but there is still a lot of benefit to be gained from I-O buffer tuning.

JCL buffers

Before adding any extra buffers to a job, consider what impact this might have on allocated storage, and whether you might get a GETMAIN region abend. The best option is to use REGION=0M, as this lets the job use storage below the 16MB line, between 16MB and 2GB, and above the 2GB bar.

The VSAM unit of data transfer is the CI, and the CI size often defaults to 4096 bytes. By default, VSAM provides 2 data buffers and 1 index buffer, but these are generally not enough for good performance.

A couple of VSAM definitions that are needed to understand buffering:
Applications using VSAM can allow more than one process to access a file concurrently. Only one write access is possible, but up to 255 concurrent read requests are allowed. These concurrent accesses are called Strings, and the total number of strings allowed is set by the STRNO parameter. You would need to check an RMF report to see how many concurrent accesses might be required, but STRNO=3 is considered a good default.
The lowest level of a VSAM index is called the Sequence Set. The index records above the sequence set are called the Index Set. As well as the hierarchical index structure, each CI in the sequence set points to the next one in sequence, and each index record in the sequence set points to a data component CA. So it is possible to read a file sequentially by following the sequence set.

Do not use the BUFFERSPACE parameter, as this sets the buffer space for the whole VSAM cluster. Data and index components need different numbers of buffers, so use BUFNI for the index buffers and BUFND for the data component buffers. IBM recommends that you optimise buffering for random processing. In this case, you need 1 data buffer for each string, as data buffers are not shared between strings for direct processing. You also need an extra one for system processing like splits.

   BUFND = STRNO + 1

For index processing, you should aim to define enough index buffers to contain the entire index set. The index set buffers are shared between strings, but the sequence set buffers are not. So you need enough buffers to hold the entire index set, and one buffer for each string for the sequence set. To work these out, you also need the file to be loaded with data.
To work out the number of buffers required, first note that each index CI only holds one index record, so every sequence set CI maps to one Data CA. So if you know how many Data CAs exist, you know how many sequence set records there are. A LISTCAT will tell you the CISIZE, the CI/CA ratio, the total number of index records and the HURBA or high used byte address, but not the number of Index Set records. If you multiply the CISIZE by CI/CA you get the CA size. Divide this into the HURBA and round up to the next integer, and that tells you the number of CAs in the file. Subtract that from the number of index records, and you get the number of records in the Index Set.
Finally, the number of index buffers is the number of strings plus the number of Index Set records. To express this as a formula:

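   BUFNI = STRNO + (TI - ROUNDUP(HURBA / (CISIZE * CI/CA)))

where
TI = Total number of index records of the Index component
HURBA = High-used field from the Data component
CISIZE = CI size of the Data component
CI/CA = Number of CIs per CA in the Data component (a single LISTCAT value, not a division)
ROUNDUP = Round up to the next integer
All these parameters can be found by examining a LISTCAT of the VSAM cluster. Just issue the TSO command LISTCAT ENT(cluster name) ALL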
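
As a rough illustration of the two formulas above, here is a minimal Python sketch that works out BUFND and BUFNI from values read off a LISTCAT. The function name `vsam_buffers` and the sample numbers are invented for the example and do not come from a real cluster. It assumes the CISIZE, CI/CA and HURBA are taken from the Data component and the index record count from the Index component.

```python
def vsam_buffers(strno, total_index_records, hurba, cisize, ci_per_ca):
    """Return (BUFND, BUFNI) using the rules of thumb described above.

    strno               - number of strings (STRNO)
    total_index_records - TI, total index records in the Index component
    hurba               - high-used RBA of the Data component, in bytes
    cisize              - CI size of the Data component, in bytes
    ci_per_ca           - the CI/CA value of the Data component
    """
    if min(strno, cisize, ci_per_ca) < 1 or hurba < 0:
        raise ValueError("LISTCAT values must be positive")

    ca_size = cisize * ci_per_ca
    # Round up with integer arithmetic: one sequence set record per Data CA.
    data_cas = -(-hurba // ca_size)
    # Whatever index records are left over belong to the Index Set.
    index_set_records = max(total_index_records - data_cas, 0)

    bufnd = strno + 1                   # one per string, plus one for splits
    bufni = strno + index_set_records   # one per string, plus the Index Set
    return bufnd, bufni


# Made-up example: 4096-byte CIs, 180 CIs per CA, 100 CAs in use,
# 103 index records in total, and STRNO=3.
print(vsam_buffers(3, 103, 73_728_000, 4096, 180))
```

With these sample values the CA size is 737,280 bytes, so the file has 100 CAs, which leaves 3 records in the Index Set.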

The following JCL can be used to place index data in buffers.

//DD1    DD DSN=aa.bb.cc,DISP=SHR,
//       AMP=('BUFNI=50','BUFND=4')
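
If the buffer numbers are calculated by a script, the DD statement above could be generated rather than typed. The sketch below is only an illustration: the helper name `amp_dd` is hypothetical, and it assumes DISP=SHR and the same two-line layout as the JCL above.

```python
def amp_dd(ddname, dsn, bufni, bufnd):
    """Build a two-line DD statement carrying BUFNI and BUFND in AMP."""
    if not 1 <= len(ddname) <= 8:
        raise ValueError("a DD name is 1 to 8 characters long")
    first = f"//{ddname:<8} DD DSN={dsn},DISP=SHR,"
    # The continuation line must start between columns 4 and 16.
    second = f"//{'':<8} AMP=('BUFNI={bufni}','BUFND={bufnd}')"
    for line in (first, second):
        if len(line) > 71:
            raise ValueError("JCL text must not run past column 71")
    return first + "\n" + second


print(amp_dd("DD1", "AA.BB.CC", 50, 4))
```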

With good buffering, I-O rates on a heavily accessed index can come down by a factor of 10 or more.
