Open Source Databases

An Open Source database is one that is free to use, and anyone can download the code, study it, modify it, and re-use it. The result is often a database that consists of modules that plug in together. Open Source databases were originally used to support data storage for internet applications, and grew and changed rapidly as the internet changed.

A typical OS database has a three-layer logical architecture. The top layer looks after external things like networking and security, including authentication. The middle layer contains the SQL functionality: query parsing, built-in functions and stored procedures. The lower layer is the one we are interested in, the Storage Layer, the layer that contains the storage engines. Most OS databases support a few different storage engines, and communicate with them through a storage engine API. When the middle SQL layer issues a query, it is passed through the storage API to the storage engine, which interprets the request and retrieves or updates data on the storage hardware.

Storage engines evolved independently of the databases, and a single storage engine is often used by several different OS databases. The different engines have their own advantages and disadvantages. OS databases usually support more than one storage engine, often concurrently; in fact, a single transaction might use more than one storage engine. The most popular storage engine is InnoDB, now owned by Oracle.
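
For example, in MySQL or MariaDB you can list the engines a server supports and choose one per table with the ENGINE clause on CREATE TABLE. The table and column names below are just an illustration:

SHOW ENGINES;

CREATE TABLE orders (
    id INT PRIMARY KEY,
    total DECIMAL(10,2)
) ENGINE=InnoDB;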

OS Database Backups

Databases consist of sets of files, so the easiest way to back up a database is to shut down the DBMS and copy all of its files. This is a good, safe backup, provided you can guarantee to copy every file, but large databases and 24*7 operational requirements usually make these cold backups impractical.
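
A cold backup can be as simple as the sketch below, assuming a Linux server where the service is called mysqld and the data directory is /var/lib/mysql; both will vary by site:

systemctl stop mysqld                              # close the database cleanly
cp -a /var/lib/mysql /backup/mysql-$(date +%F)     # copy every data file
systemctl start mysqld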

Snapshots sound like a better idea, as they take a consistent copy of the files at a given moment in time. However, databases tend to do a lot of work in main storage buffers and only flush these buffers out to disk a few moments later, so a snapshot will not capture database updates that only exist in buffers. If the data is non-transactional, those updates are usually lost. If the data is held in a transactional storage engine like InnoDB, committed changes can be recovered from the logs, but the tables will still need crash recovery when the copied database is started.
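
If the data directory sits on an LVM logical volume, a snapshot can be taken with something like the sketch below; the volume group and volume names are only examples, and the caveats about buffered data still apply:

lvcreate --size 10G --snapshot --name mysql-snap /dev/vg0/mysql-data
mount /dev/vg0/mysql-snap /mnt/snap     # copy the files from here, then remove the snapshot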

Hot backups are the best option, using DBMS utilities. They take a backup while the database is open, and keep it consistent by using the transaction logs to capture any changes made while the backup runs. If you combine hot backups with a product like Repostor, the backup data is also moved off-site for secure disaster recovery.

InnoDB

InnoDB was one of the first storage engines developed by a third party. InnoDB stores its data in a series of one or more data files that are collectively known as a tablespace. In other words, a tablespace is simply a set of files containing data. As far as the database layer is concerned, a tablespace is pretty much a black box that is managed by InnoDB.

InnoDB databases are stored in a .../bin/mysql/data directory; the exact path will depend on your site standards. Each database has its own subdirectory under the data directory, and each table has a .frm definition file named after the table. If the operating system file names are case sensitive, then database and table names are case sensitive too.
The system tablespace is stored in the ibdata file. It contains information used by InnoDB internally, like rollback segments, as well as some system tables.

Tablespace files have the extension .ibd. There is no mechanism to reclaim deleted data space within a tablespace file, so deleting data will not 'fix' a full tablespace. Instead, the table must be rebuilt, which copies the remaining data into a new table. This has implications if the operating system filespace is full, as extra room is needed for the copy.
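
Rebuilding a table copies its rows into a fresh tablespace file and releases the old one; either of these statements will do it (the table name is hypothetical):

OPTIMIZE TABLE order_history;
-- or equivalently
ALTER TABLE order_history ENGINE=InnoDB;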

InnoDB keeps transaction logs for crash recovery: the redo log re-applies committed changes that had not yet reached the data files, and the undo log rolls back uncommitted transactions. The redo logs are usually written to two files, called ib_logfile0 and ib_logfile1, unless the default is changed. The undo log is usually written to the system tablespace, ibdata1. The redo logs are circular: when the writer reaches the end of the log, it starts writing at the beginning again, overwriting the old log data.
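
The size and location of the redo log files can be set in the server configuration file, typically my.cnf. The values below are only examples, and the variable names apply to MariaDB and older MySQL releases; recent MySQL versions replace them with innodb_redo_log_capacity:

[mysqld]
innodb_log_group_home_dir = /var/lib/mysql
innodb_log_files_in_group = 2
innodb_log_file_size      = 512M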

As a database can be a single set of files with multiple users accessing it, it is important to make sure that reads and updates from different users do not conflict with each other. If two users are able to update a record simultaneously, either one update will be lost or the record will be corrupted. A DBMS manages this with a locking strategy. This is not exactly a Storage topic, but in brief, InnoDB offers the choice of locking at table level or row level. Row-level locking allows more concurrency of operations, at the expense of more CPU overhead. Row locks are implemented in the storage engine, not the server, which means that if a transaction uses more than one storage engine, the data will not be locked consistently across them. Locks are held for the duration of a transaction, which will typically read and update several rows. This means that two transactions can request locks on the same resources and hang waiting for each other. This is called a deadlock. The InnoDB storage engine will notice these circular dependencies and fail one of the transactions, which is then rolled back. The application should be designed to detect this and retry the transaction.
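
A deadlock can be produced with two sessions that update the same two rows in opposite orders. The table here is hypothetical, but the error number is the one InnoDB returns:

-- session 1
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;

-- session 2
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;
UPDATE accounts SET balance = balance + 10 WHERE id = 1;   -- blocks, waiting for session 1

-- session 1
UPDATE accounts SET balance = balance + 10 WHERE id = 2;   -- circular wait detected
-- one session receives ERROR 1213 (40001): Deadlock found when trying to get lock,
-- its transaction is rolled back, and the application should retry it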

Transaction logging also helps make transactions more efficient, because instead of updating the tables on disk each time a change occurs, the storage engine can change its in-memory copy of the data and write a record of the change to the transaction log, which is safely on disk. This is a relatively fast operation, because appending log events involves sequential I/O in one small area of the disk instead of random I/O in many places. Then, at some later time, a process can update the table on disk. Storage engines that use this technique, known as write-ahead logging, end up writing each change to disk twice, but if there is a crash after the update is written to the transaction log and before the change is made to the data itself, the storage engine can still recover the change upon restart.

MySQL

MySQL is probably the second most popular database in use, after Oracle. This is doubtless because it is used so much in web applications; the 'M' in WAMP and LAMP usually stands for MySQL. MySQL can also power other applications, like data warehouses and OLTP systems, and can be configured to run on a wide range of hardware.

The default engine for MySQL today is InnoDB. Oracle bought InnoDB first and then MySQL, and so can develop both products together, which helps ensure they work well together. Even though it is owned by Oracle, the MySQL Community Edition remains open source. See above for details of the InnoDB storage engine.

MySQL offers the mysqldump command for backups, and its output must be redirected to a file. To dump all databases to a file called dump.sql, use the command

mysqldump --all-databases > dump.sql

To dump only specific databases, just name them on the command line with the --databases option:

mysqldump --databases db1 db2 db3 > dump.sql
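
If the databases only contain InnoDB tables, adding --single-transaction takes the dump inside a consistent snapshot without locking the tables, for example:

mysqldump --single-transaction --databases db1 > dump.sql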

When you use the --databases or --all-databases option, the dump file will contain the CREATE DATABASE and USE statements needed to recreate the databases, so to run a restore, you simply feed the file back into the mysql client. If the dump was taken for a single database without those options, name the target database on the restore command:

mysql < dump.sql
mysql db1 < dump.sql

If you are running the restore from within the mysql client, use the source command:

mysql> source dump.sql

If you run these commands from Windows PowerShell, note that PowerShell treats < and > as special characters and does not support < for input redirection, so run the restore through cmd.exe, for example cmd.exe /c "mysql < dump.sql", or use the source command from within the mysql client.

The problem with this approach is that you are dumping to a local disk, and it would be easy to lose both the database and its backups in a single hardware failure. You need a product like Repostor to interface with the MySQL commands and move the backup data off-site.

MariaDB

MariaDB was developed from the MySQL code base and uses the same default InnoDB storage engine. It also uses the Aria storage engine for system and internal temporary tables. InnoDB is discussed above; Aria is a non-transactional storage engine, but it is protected from crash failures by a write-ahead log.
MariaDB data files can usually be found in /path/mariadb/mariadbrelease/data/ where the path and release will be site specific. A table definition file usually has a .frm extension.

The Binary Log contains a record of all changes to the databases, both data and structure. It consists of a set of binary log files and an index, and it only records statements that actually update the data. As the binary log lives in the middle database layer, it is independent of the storage engines attached. It is used to facilitate replication, flashback and incremental backups.
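
The binary log is enabled in my.cnf; the server id, file name and format below are only example settings, and ROW format is the one needed for flashback and Galera:

[mysqld]
server_id     = 1
log_bin       = mysql-bin
binlog_format = ROW

Once enabled, SHOW BINARY LOGS lists the log files that currently exist.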

Backups

MariaDB has the following types of backups:

Logical Backups or dumps use SQL statements to get a flexible copy of a database that can be restored to different MariaDB versions, different servers, and possibly even to different types of DBMS. Logical backups can use the mysqldump command, with the same syntax as for MySQL above.

The Mariabackup tool can be used to take a 'hot backup' of MariaDB files while MariaDB is active. It works by taking fuzzy copies of the files and then rolling them forward to a clean state by using the InnoDB redo log.

The Mariabackup command supports incremental backups. The first backup must be a full database backup, using a command something like this:

mariabackup --backup --target-dir=/var/mariadb/backup/ --user=mariabackup --password=mypassword

Once you have created a full backup on your system, you can run an incremental backup to a different directory, pointing back to the full backup directory with --incremental-basedir, like this:

mariabackup --backup --target-dir=/var/mariadb/inc1/ --incremental-basedir=/var/mariadb/backup/ --user=mariabackup --password=mypassword

Each subsequent incremental backup must be written to a new directory, with a reference back to the previous incremental:

mariabackup --backup --target-dir=/var/mariadb/inc2/ --incremental-basedir=/var/mariadb/inc1/ --user=mariabackup --password=mypassword

To restore a database from an incremental backup, you need to apply the incrementals to the base backup first, to create an up-to-date backup. These commands apply to MariaDB 10.2 onwards.

First, prepare the base backup by synchronising it with the changes recorded in the InnoDB redo log while that backup was running:

mariabackup --prepare --target-dir=/var/mariadb/backup

Now apply each incremental to the base, starting with the oldest one, like this:

mariabackup --prepare --target-dir=/var/mariadb/backup --incremental-dir=/var/mariadb/inc1

Run this command for each available incremental, and you then have a current backup ready for restore. You have two options for the restore: --copy-back, which preserves the original backup files, and --move-back, which deletes them. To run the restore, stop the MariaDB Server and empty out the datadir, then run the mariabackup command like this (this version preserves the backup files):

mariabackup --copy-back --target-dir=/var/mariadb/backup/
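
The restored files will be owned by the user that ran mariabackup, so before restarting MariaDB the ownership usually needs to be handed back to the mysql user; the datadir path here is just the common default:

chown -R mysql:mysql /var/lib/mysql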

Flashback

Flashback is a feature that allows you to undo transactions that incorrectly updated databases or tables, using data from the binary log. In other words, you can use it to fix user errors without needing a full database restore. The process involves using the mysqlbinlog command to decode the binary log, then searching the log to find the exact start and stop positions of the problem transaction. mysqlbinlog is then run with the --flashback --start-position=xxxx --stop-position=xxxx options and the output piped into an SQL file that can be used to reverse the damage.
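
A minimal sketch of that process, assuming the positions have already been found and using an example binary log file name:

mysqlbinlog --flashback --start-position=xxxx --stop-position=xxxx /var/lib/mysql/mariadb-bin.000123 > flashback.sql
mysql < flashback.sql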

Replication

MariaDB supports the following types of replication:

Asynchronous Replication - The original MariaDB replication system is asynchronous master-slave replication. The master needs to have the binary log enabled; it logs all data changes in the binary log and marks the transaction as successful, and the changes are then sent on to all the slaves. A minimal setup sketch follows this list.

Semi-Synchronous Replication - In semi-synchronous replication, a transaction is not marked as successful until the master and at least one slave report success. If the master crashes, the most up-to-date slave can be promoted to master without losing any data.

Galera Cluster - Galera Clustering will replicate InnoDB tables in a synchronous, master-master replication configuration over a cluster of MariaDB servers. Like asynchronous replication, Galera uses the binary log and requires that data changes are logged in the ROW format.
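
As a sketch of the asynchronous case, the master needs the binary log enabled as shown earlier, and each slave is pointed at it with CHANGE MASTER TO; the host, user and password below are only examples:

CHANGE MASTER TO
    MASTER_HOST='master.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='secret',
    MASTER_USE_GTID=slave_pos;
START SLAVE;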

MongoDB

MongoDB has an architecture that is built on collections and documents, and uses a document storage format called BSON, which is a binary form of JSON. A traditional SQL database stores data in columns and rows, but MongoDB stores data objects as documents inside a collection. These elements can be described as:
A Collection is a group of database documents, the relational database equivalent of a table.
A Document is a set of data pairs, where each piece of data is identified by a key, which is associated with a value.

MongoDB originally used a storage engine called MMAPv1, but the current default engine is WiredTiger.
The WiredTiger storage engine combines features from both B-Tree and LSM tree engines, and this allows the system to take snapshots, or checkpoints, of the database at regular intervals - the default is one minute. If there are any problems with writes between checkpoints, these snapshots can be used to roll the database back to the most recent checkpoint. WiredTiger uses write-ahead logging to automate crash recovery back to the last completed transaction, and it uses the OS page-cache facilities to improve performance.
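
The checkpoint interval and journalling are controlled from the mongod configuration file. The settings below are examples; the newest MongoDB releases always journal and have removed the journal.enabled option:

storage:
  dbPath: /var/lib/mongodb
  syncPeriodSecs: 60        # checkpoint interval, 60 seconds is the default
  journal:
    enabled: true           # WiredTiger write-ahead log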

MongoDB supports replica sets in a Master - Slave configuration. By default, all writes and reads go to the master, or primary, and the data is then replicated out to the slaves. It is optionally possible to read data from the slaves, or secondaries, but that data might not be fully current. The databases can also run over multiple servers, with the data duplicated between them.
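
A replica set is created by starting each mongod with the same --replSet name and then initiating the set from the mongo shell; the set name and host names below are only examples:

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" },
    { _id: 2, host: "db3.example.com:27017" }
  ]
})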

Sharding is used for automatic load balancing. A Shard is a physical partition which consists of a Master with its replicas. The user selects a Shard Key, and this key is used to split the data into ranges and distribute those ranges over the shards.
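
In the mongo shell, sharding is switched on per database and the shard key is declared per collection; the database, collection and key names here are only examples:

sh.enableSharding("presidents")
sh.shardCollection("presidents.terms", { president_id: 1 })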

Backups

The mongodump command will extract data from a MongoDB database and store it in BSON files. The equivalent restore command is mongorestore, which reads those BSON files and populates a MongoDB database. You run these commands from the system command line, not the mongo shell.
If you have a single mongo database, then the backup command can simply be:

mongodump

and the corresponding command to restore that database would be:

mongorestore

The following simple example will just back up the collection called 'roosevelt' from a database called 'presidents':

mongodump --db=presidents --collection=roosevelt
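
The matching restore reads the BSON file that mongodump wrote, which by default lands under a dump/ directory; the path shown here assumes that default:

mongorestore --db=presidents --collection=roosevelt dump/presidents/roosevelt.bson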
