DB2 Storage

Mainframe DB2 naming standards

Probably the first thing you notice about DB2 is the naming convention. DB2 names rarely fit in with the rest of the naming conventions in your site! They look like

hlq.DSNDBn.dbname.psname.y0001.znnn

hlq is the catalog alias and usually defines the application. The dbname, or database name, and psname, usually table or index name, were usually picked by DBAs to a standard which makes sense to them. In the y0001 component, the 'y' can be 'I' or 'J'. znnn is the partition name, A001 for a single partition, but it can range up to E4096 for multiple partitions.
Unfortunately, these standards do not always fit in easily to automated storage management using DFSMS. Still, it is worth looking at some specific DB2 components, and how they fit into the grand naming scheme.

DB2 backup file names have the following format:

DB_alias.Type.Inst_name.NODEnnnn.CATNnnnn.timestamp.Seq_num

Where Database alias is a 1-8 character database alias name that was specified when the backup utility was invoked.
Type is the type of backup operation, where: 0 represents a full database-level backup, 3 represents a table space-level backup, and 4 represents a backup image generated by the LOAD...COPY TO command.
Instance name is the 1- to 8-character name of the current db2 instance.
Node number is the database partition number. In single partition database environments, this is always NODE0000. In partitioned database environments, it is NODExxxx and is the current database partition.
Catalog partition number is the database partition number of the catalog partition for the database. In single partition database environments, this is always CATN0000. In partitioned database environments, it is CATNxxxx.
Time stamp is a 14-character representation of the date and time at which the backup operation was performed in the format yyyymmddhhnnss.
Sequence number is a 3-digit number used as a file extension.

DB2 Components

A TABLE is a DB2 storage unit, consisting of rows and columns of data. Tables are implemented internally, within TABLESPACES. Tablespaces are allocated as VSAM LRDS files and the CI size can be 4, 8, 16 or 32K depending on performance needs.

DB2 V11 introduced Universal Table Spaces (UTS), which is a combination of segmented and organisation partitioning (see below). From a Storage viewpoint, a UTS keeps the space and performance advantages of segmented table spaces, and combines this with the advantages of partitioning, mainly realised by DBAs.

Older table spaces, all now deprecated, include
Simple table spaces are not partitioned or segmented, hold one table, and is composed of up to 32 LRDS VSAM files, with a maximum size of 64 GB.
Segmented table spaces store the data from separate tables in different segments, but they cannot be partitioned. They hold several tables, and can also be up to 64 GB. Think of this a being a bit like a PDS, a means of collecting together a number of small files into one container, to make management easier. If a table held in a segmented table space is dropped, most of the segments are immediately available for other data.
A PARTITIONED TABLESPACE works the other way. It takes one large table and splits it over several physical files. It can be between 64GB. and 128TB, depending on which version of DB2 is used. The last qualifier for partitioned databases identifies the partition number, so the fourth partition of a partitioned database has a file name ending in 'A004'.
The advantage of using partitions is that a large database can be split into smaller files. SQL queries will then just run against the partitions which match the WHERE clause, and not the whole database. This reduces IO and speeds up performance. DB2 utilities can also be set to run against sets of partitions, and not the whole database. This speeds up maintenance work. It also means that if a file is lost, then just one partition is affected, not the whole database.

There are two ways to generate partitions:
Partition-by-Growth starts with a single partition and simply adds more are the database grows up to a maximum of 128 TB.
Partition-by-Key is maintained by a key. When you initially partition a database, you will chose key ranges so that all the partitions are the same size. Over time, the database content will change, and some partitions will be bigger than others. If you want to change the key ranges, you need to unload the database, drop it, redefine it with the new key ranges, then load all the data back in again. This is not a trivial task, as all the plans and packages have to be rebound, and all views redefined although you just need to drop and redefine the partitions you are changing, not the whole database. Since DB2 V8, partitions can be added on-line and can take up to 4096 partitions per tablespace.

DB2 Database Structure

Rotating Partitions work a bit like Generation Data Groups and are typically used for time retention dependent data. Say you update data on a monthly basis and need to keep a year's work of data. You would define thirteen rotating partitions, then as you add a new partition, the oldest one will automatically be deleted.

INDEXES contain keys which point to data rows in tables. Indexes are stored in INDEXSPACES, which are also implemented by Linear VSAM datasets. The DBAs usually use naming standards to differentiate between index spaces and table spaces, but these will obviously be site dependent. DB2 takes the index and table space names, and uses these to generate the fourth qualifier name.

Tables and indexes are combined into DATABASES. These usually make sense in terms of an application (the 'customer' database, for example, would be a collection of all the tables and associated indexes which held customer information). You will have local standards for table names, which DB2 uses when generating the names for the VSAM files. The table name is the third qualifier. A Database can hold a maximum of 4096 table spaces.

STOGROUPS map databases to physical disks. The significant difference between DB2 stogroups, and SMS storagegroups, is that the SMS relationship is 1:n, and the DB2 relationship n:n (hey, this stuff is about DB2, so why not use a bit of relational terminology?). Translated, this means that an SMS disk can only be defined to one storagegroup, but a DB2 disk can be assigned to several stogroups. In any case, if you are running DB2 under SMS, then your DBAs should define all their stogroups with volume(*), or just miss the volume parameter off altogether, and let SMS take control.

So that just leaves the high level qualifier. Ideally, this should identify the application, and whether the data is prod, development, or test. If so, you can use the hlq to direct allocations to the correct storage pool. Separating prod and dev by hlq also keeps production and development data in separate catalogs, always a good idea!

DB2 UDB Tablespaces

DB2 UDB on open systems (Windows, Unix, Linux) has two types of tablespaces called SMS (System Managed Space) and DMS (Database Managed Space). Do not confuse this with z/OS SMS. This is explained on the DB2 and SMS page.

Database (UDB) are usually found in home_dir/instance/NODEnnnn/SQLnnnnn/, where home_dir is site specific, chosen when a database is created.
instance is used to ensure that a database name is unique within a db2 instance, and is usually db2inst1, db2inst2 etc.,
partition is NODEnnn, is the partition number starting from 0000 and SQLnnnnn is the directories that contain the databases, with SQL00001 containing all the files for the first database created etc.
Database names must be unique within the location in which they are cataloged. On Linux and UNIX implementations, this location is a directory path, whereas on Windows implementations, it is a logical disk.

The SMS storage model consists of physical files that represent database objects like tables. These files are located in table space containers, basically directories, that can be spread over several disks. The DBMS controls the file names and distributes the data evenly across the table space containers.
The SMS table space type for user defined table spaces is from version 10.1 onwards but is still current for catalog and temporary table spaces.

In the DMS (database managed space) storage model the data creator is responsible for the initial space allocation and the DBMS does the on-going management. The data is stored in operating system files or raw disk.
The DMS table space type is also deprecated from version 10.1 onwards for user-defined permanent table spaces.

The preferred storage model is now Automatic Storage Table Spaces. Now the database manager both creates and extends containers as needed.

Your DBA must define at least one Storage Group for ASTS to work. Now when they create an automatic storage table space, the DBMS will automatically create a container on each of the storage paths in that group and stripe the physical files over those containers. When you need to add more storage, you add it to the storage groups, not the containers. Now when the existing containers fill up, the DBMS will automatically create a new set of striped containers and use them. If you don't want this to happen, but would rather use the new space straight away to take advantage of more I/Os in parallel, your DBA would need to run the ALTER TABLESPACE REBALANCE command
When creating a ASTS, your DBA can also specify things like its initial size, the amount that the table space size will be increased when the table space is full, the maximum size that the table space can grow to, and the storage group it uses.

The different types of tablespace are
A regular tablespace, which stores permanent data and can hold Large Objects (LOBS).
A Catalog tablespace, one per database, which holds system catalog tables and is automatically created when the database is created.
A Large tablespace, which holds permanent data and can hold LOBS. It must be DMS and a default large tablespace called USERSPACE1 is created when the database is created.
A System Temporary tablespace, the default is called TEMPSPACE1 and is used for sort, reorganisation and other temporary data requirements. At least one must exist for every database, but more can be created.

DB2 v.8.2.2 simplified the choice between SMS and DMS. You can set DMS up to auto-resize tablespaces, and automatically extend them as the tablespace becomes full. You can control the growth rate and the maximum size. DB2 v.8.2.2 pretty much automates the tablespace storage management tasks.
DB2 v.8.2.2 introduced a new feature called 'Automatic Storage'. Automatic storage tablespaces use DMS tablespaces, and uses 'storage paths' essentially file systems or directories. You associate one or more storage paths with each database and you don't need to specify a list of containers when allocating a tablespace. DB2 will automatically decide which containers to allocated based on the storage paths associated with the database. You can define a starting size, or you can take a default. Automatic storage will auto-resize databases by default. It will also add new stripe sets of containers to a tablespace