Enterprise Storage Stack

Documentation for Linux


Compression

Compression is activated when you format a volume with compression enabled.

How Compression Works

ESS stores data blocks as a stream, an architecture that lends itself to compression. Instead of storing fixed 4K blocks in the linear stream, ESS stores variable-length blocks packed into the stream. This allows ESS to implement transparent block-level compression with very low overhead.

The overhead of compression in ESS is dominated by the CPU cost of compressing blocks. The algorithms that ESS uses are very fast, and routinely benchmark at over 600MB/sec/core for compression. Decompression is even faster, at over 1GB/sec/core. Because of the linear stream nature of ESS, keeping track of blocks is almost as efficient with compression as without it.

Compression does require more bookkeeping. With linear block storage, only the block address needs to be remembered in order to reference any data. With compressed blocks, both the address of the block in the stream and the compressed size of the block need to be remembered. This additional information increases the size of the lookup tables. For most arrays, this will result in about a 25% increase in memory footprint.
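The bookkeeping described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the ESS on-disk format: the class name CompressedStream, the use of zlib, and the in-memory dictionary are all hypothetical stand-ins chosen for illustration. It shows why each lookup table entry has to carry a compressed size in addition to a stream offset.

```python
import zlib

BLOCK_SIZE = 4096  # logical block size, matching the 4K blocks mentioned above


class CompressedStream:
    """Toy model: variable-length compressed blocks packed end to end.

    Hypothetical illustration only; zlib stands in for whatever
    algorithm ESS actually uses.
    """

    def __init__(self):
        self.stream = bytearray()  # the linear stream of packed blocks
        self.table = {}            # logical block number -> (offset, compressed size)

    def write_block(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("expected one full logical block")
        packed = zlib.compress(data)
        offset = len(self.stream)
        self.stream += packed
        # Without compression the offset alone would be enough.
        # With compression the packed length must be stored as well.
        self.table[block_no] = (offset, len(packed))

    def read_block(self, block_no):
        offset, size = self.table[block_no]
        return zlib.decompress(bytes(self.stream[offset:offset + size]))


stream = CompressedStream()
block_a = b"A" * BLOCK_SIZE
block_b = bytes(range(256)) * 16
stream.write_block(0, block_a)
stream.write_block(1, block_b)
print(stream.table)
```

In this toy model a rewritten block is simply appended and the table entry is redirected to the new copy. Reclaiming the space held by the old copy is outside the scope of the sketch.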

Why Use Compression

Scenario 1: Compression to synthesize space.

Stored data typically takes two forms: compressible and incompressible. Incompressible data includes files that are already compressed and files that are encrypted. Most "media" files are internally compressed. This includes photos, audio, and video files. Compressible files are generally text files, and files that contain ordered storage, like database files.
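The difference between the two forms is easy to see with a general-purpose compressor. The following is a minimal sketch using Python's zlib, which is an assumption made for illustration and not necessarily the algorithm ESS uses. Repetitive text stands in for log or database content, and random bytes stand in for encrypted or already-compressed media.

```python
import os
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive, text-like data: compresses very well.
text_like = b"2024-01-01 INFO request served in 12 ms\n" * 1000

# Random bytes behave like encrypted or already-compressed data.
random_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.3f}")
print(f"random-like: {compression_ratio(random_like):.3f}")
```

Running a check like this against a sample of your own data gives a rough idea of which form dominates before you decide how much space to synthesize.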

If your system has large amounts of text or database files, then activating compression may allow you to store more data than the actual amount of storage you have. This is a form of thin provisioning. The ESS device actually appears to be larger than the backing store.

You create this type of volume by formatting it with the option '--comp=nnn', where nnn is the apparent size of the volume expressed as a percentage of its normal size. If your volume would normally be 10TB, --comp=200 will create a volume that appears to be 20TB.
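The sizing arithmetic can be sketched as follows. The function names are hypothetical, and the reading of nnn as "apparent size as a percentage of normal size" is inferred from the 10TB example above rather than taken from a formal option reference. The second function estimates the average compression ratio your data would need for the apparent volume to be completely usable, ignoring metadata and the near-full throttling described later.

```python
def apparent_size_tb(physical_tb, comp_percent):
    """Apparent volume size for a given --comp value (assumed interpretation)."""
    return physical_tb * comp_percent / 100


def required_ratio(comp_percent):
    """Average compression ratio needed to fill the apparent volume.

    A result of 2.0 means the data must compress 2:1 on average.
    """
    return comp_percent / 100


print(apparent_size_tb(10, 200))  # the 10TB example from the text
print(required_ratio(200))
```

If your data compresses less than the required ratio, the backing store fills before the apparent volume does.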

Scenario 2: Compression to reduce wear.

Storage operations often involve storing additional information. This is usually extra information required by a file system. This extra information is overhead and is in addition to the data that you are actually storing.

EasyCo has tested a number of file systems, and all have overhead. How much overhead depends on how the file system is used. File systems store data in RAM and eventually push this data to "disk". If the file system is allowed to push data to disk "at the convenience of the file system", then the amount of extra overhead is very low. File systems are very good at optimizing updates to keep this overhead reasonable. The downside of this approach is that you will lose some data if the system crashes before all writes have been completed. Even worse, there is no guarantee of the order in which files reach the disk, so you can lose older files and keep newer files. This is called a "serialization error" and can lead to corrupted execution of some applications.

Your application can instruct the file system to update the disk immediately, or in order. In this case, the overhead of file system meta blocks grows higher. For example, with EXT4 writing 14K files and applying an fsync() for every file, the overhead is about 4:1. So for every 14K written, 56K of disk writes will actually happen.

When block-level compression is applied, these control blocks result in fewer physical writes. Testing at EasyCo indicated that these control blocks are more than 90% compressible. In this case, writing 14K of binary data with an fsync() for every file on EXT4 yields a post-compression write ratio of only about 1.1:1, or roughly one quarter of what stock EXT4 writes without compression. In terms of flash, this means your media will last nearly four times as long for this type of workload.
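The figures quoted above can be checked with a few lines of arithmetic. This is a sketch using only the numbers from the text (14K files, 4:1 without compression, about 1.1:1 with compression); the function name is hypothetical.

```python
def physical_writes_kb(payload_kb, write_ratio):
    """Physical KB written for a payload, given a physical:logical write ratio."""
    return payload_kb * write_ratio


uncompressed_kb = physical_writes_kb(14, 4.0)  # EXT4 with fsync(), no compression
compressed_kb = physical_writes_kb(14, 1.1)    # same workload with compression

# Fewer physical writes translates directly into less flash wear.
wear_gain = uncompressed_kb / compressed_kb

print(f"uncompressed: {uncompressed_kb:.1f}K per file")
print(f"compressed:   {compressed_kb:.1f}K per file")
print(f"media life gain: about {wear_gain:.1f}x")
```

The gain works out to a little over 3.6, which is where the "nearly one quarter" and "four times as long" figures come from.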

So, even if you don't synthesize space, compression can add real value by increasing write performance and lowering the wear on your flash media.

What if you run out of physical space when compressing?

When you thin provision a volume, you are creating a storage volume that is potentially bigger than the underlying storage. Whether it is actually bigger depends on how you use it, and how compressible your data is. If the underlying volume were allowed to totally fill, you would render the volume unusable and unmountable. This is a bad thing.

To keep this from happening, ESS changes its behavior as the volume approaches full. At 94% full, writes are slowed to around 1MB/sec. For a large array, it will take hours to write enough to fill the array the rest of the way. At 95% full, writes slow to 100KB/sec. At 96% full, writes start to fail with ENOSPC errors. These three thresholds may be modified in /proc/mft/fbd1/vars/ThrottleLvl1, /proc/mft/fbd1/vars/ThrottleLvl2, and /proc/mft/fbd1/vars/ThrottleLvl3. If you run a volume out of space, you can lower these threshold values to give yourself enough writes to delete some data and get a running volume back.
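The throttling behavior can be modeled as a simple step function. This is a sketch of the policy as described above, not ESS code. The function names are hypothetical, the defaults are the 94/95/96 percentages from the text, and the assumption that ThrottleLvl1, ThrottleLvl2, and ThrottleLvl3 correspond to those three thresholds in that order is not confirmed by the text. The second helper checks the claim that filling the last stretch of a large array takes hours, using a 10TB array (decimal units) as an assumed example.

```python
def write_policy(pct_full, lvl1=94, lvl2=95, lvl3=96):
    """Return the write behavior for a given fill percentage (assumed model)."""
    if pct_full >= lvl3:
        return "ENOSPC"
    if pct_full >= lvl2:
        return "throttle to about 100KB/sec"
    if pct_full >= lvl1:
        return "throttle to about 1MB/sec"
    return "normal"


def hours_to_fill(capacity_tb, pct_span, rate_mb_per_sec):
    """Hours needed to write pct_span percent of the array at a throttled rate."""
    megabytes = capacity_tb * 1_000_000 * pct_span / 100
    return megabytes / rate_mb_per_sec / 3600


for pct in (90, 94, 95, 96):
    print(pct, write_policy(pct))

# Going from 94% to 95% full on a 10TB array at 1MB/sec.
print(f"{hours_to_fill(10, 1, 1):.1f} hours")
```

For the assumed 10TB array, the first throttled percent alone takes more than a day to fill, which leaves time to notice the condition and free space.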

If you get to 96% full, you have a read-only volume, so at least you can back up your data.