Deep Dive: Erasure Coding
Learn about what erasure coding is, how it works, and how it’s different from traditional RAID.

What is Erasure Coding?

Digital data has traditionally been protected in one of two ways. First, it can be copied from one storage medium to another, such as from a hard disk to a RAID array to a backup tape. In this form, the data exists only as a small number of complete copies at any one time. Second, it can be encoded into fragments that are stored in different locations and reassembled with specific instructions when the data is retrieved. This latter method is known as erasure coding.

How Does Erasure Coding Work?

Erasure coding is a method of data protection. It involves breaking data into fragments, expanding and encoding them with redundant data, and dispersing the result to storage objects in diverse locations. A computationally intensive mathematical algorithm calculates the redundant (parity) fragments so that the original data can be rebuilt even when some fragments are lost, or "erased". The value of each parity fragment is calculated from the original data fragments.
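The simplest case of a parity fragment calculated from the data fragments is a single XOR parity. The sketch below is only an illustration under that assumption; the function names are hypothetical, and production systems use stronger codes that tolerate more than one loss.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size fragments, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(fragments: list[bytes]) -> bytes:
    """Compute one parity fragment as the XOR of all given fragments."""
    return reduce(xor_bytes, fragments)


data = b"erasure coding demo"
fragments = split(data, 4)
parity = xor_parity(fragments)

# Simulate losing fragment 2, then rebuild it from the survivors plus parity.
survivors = [f for i, f in enumerate(fragments) if i != 2]
rebuilt = xor_parity(survivors + [parity])
```

Because XOR is its own inverse, combining the parity with the three surviving fragments yields exactly the missing one. A single XOR parity can only repair one lost fragment at a time.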
For example, a group of storage nodes might use a 5+2 encoding scheme to encode and distribute data fragments. This scheme breaks the data into five data fragments, then adds two parity fragments that are calculated from the original five. Reed-Solomon coding is a common algorithm for this kind of scheme. Each of the seven fragments is then distributed to its own node. Erasure coding supports a wide variety of configurations: a scheme can use almost any combination of data fragments and parity fragments, not just the 5+2 layout in the example above.
If a disaster disables one or more nodes, the lost fragments can be rebuilt from the fragments on the remaining nodes. In the 5+2 example, any two of the seven fragments can be disabled or lost and the entire set of data can still be reconstructed and retrieved.
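A 5+2 scheme can be sketched with a toy Reed-Solomon-style code. The sketch assumes arithmetic modulo the prime 257 and treats each fragment as a single symbol, purely to keep the example short. Real implementations typically work over GF(2^8) and encode whole blocks. The names `interpolate`, `encode`, and `recover` are hypothetical.

```python
P = 257  # assumption: a small prime field, so every byte value 0..255 is a field element


def interpolate(points: dict[int, int], x: int) -> int:
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: list[int], parity_count: int) -> list[int]:
    """Return the data symbols followed by parity symbols (a systematic code)."""
    k = len(data)
    points = dict(enumerate(data))  # fragment i holds the polynomial's value at x = i
    parity = [interpolate(points, x) for x in range(k, k + parity_count)]
    return data + parity  # note: parity symbols may be 256, which exceeds one byte


def recover(available: dict[int, int], k: int) -> list[int]:
    """Rebuild the k data symbols from any k surviving fragments."""
    if len(available) < k:
        raise ValueError("not enough fragments to reconstruct the data")
    points = dict(list(available.items())[:k])
    return [interpolate(points, x) for x in range(k)]


data = list(b"hello")          # five data symbols
fragments = encode(data, 2)    # seven fragments: 5 data + 2 parity

# Simulate two failed nodes (fragments 1 and 4) and rebuild from the other five.
available = {i: v for i, v in enumerate(fragments) if i not in {1, 4}}
restored = recover(available, k=5)
```

Any five of the seven values identify the same degree-four polynomial, which is why it does not matter which two fragments are lost.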
On Filebase, the erasure coding scheme differs for each supported network. Objects uploaded to Sia are split into 30 pieces, which are geographically distributed to Sia host servers all around the world. Only 10 of the 30 pieces need to be available to process a download request. In comparison, objects uploaded to Storj are split into 80 pieces and distributed across thousands of diverse nodes. Retrieving an object requires only 29 of these 80 pieces.
If an application tries to retrieve data and all of the data segments are available, the operation proceeds as normal. If a data segment is not available but a parity segment is, the application retrieves the parity segment and combines it with the remaining data segments to reconstruct the missing data.
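The piece counts above translate directly into fault tolerance and storage overhead. The short calculation below assumes a standard k-of-n code in which each piece is 1/k of the object size; the helper name `scheme_stats` is hypothetical.

```python
from fractions import Fraction


def scheme_stats(required: int, total: int) -> tuple[int, Fraction]:
    """Return (pieces that can be lost, storage expansion factor) for a k-of-n scheme."""
    if not 0 < required <= total:
        raise ValueError("required must be between 1 and total")
    return total - required, Fraction(total, required)


for name, required, total in [("Sia", 10, 30), ("Storj", 29, 80)]:
    losable, expansion = scheme_stats(required, total)
    print(
        f"{name}: any {required} of {total} pieces rebuild the object; "
        f"up to {losable} pieces can be lost; "
        f"about {float(expansion):.2f}x the object size is stored"
    )
```

Under that assumption, the 10-of-30 layout tolerates 20 lost pieces and the 29-of-80 layout tolerates 51.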

How is Erasure Coding different from RAID?

RAID, or Redundant Array of Independent Disks, is a method of grouping multiple physical disk drives into a single logical storage unit. When data is written to the array, it's broken up and placed across multiple disks. In this way, if one disk fails, the data can be reconstructed from the remaining disks in the array. RAID has two forms of data protection: data mirroring and data striping with parity. Mirroring is a basic form of data protection that is simply data duplication: a full copy of the data is stored on each drive for redundancy. This form of RAID is referred to as RAID 1. It is easy to set up and maintain, but consumes a large amount of disk space and resources.
RAID 5, or RAID that features striping with parity, stripes data across multiple hard disks and adds parity blocks to protect the data. This process is similar to erasure coding, but has two major differences. The first is that the disks in a RAID array sit in the same physical location, whereas data stored with erasure coding can be spread across a variety of geographic locations.
The second main difference is that RAID 5 can only handle one disk failure at a time. RAID 6, which stripes data with two parity blocks per stripe rather than one, can handle two disk failures, the same level of redundancy that 5+2 erasure coding provides. Erasure coding can far surpass this. For example, an erasure coding scheme that uses a 10+6 layout can survive six simultaneous failures, whereas the most that standard parity RAID can survive is two, with RAID 6.
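The comparison can be made concrete with a small calculation. The disk counts chosen for the RAID rows are assumptions for illustration (a two-disk mirror, a six-disk RAID 5, and a seven-disk RAID 6 with dual parity), and the `Layout` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    data: int        # units that hold original data
    redundancy: int  # units that hold parity or mirrored copies

    @property
    def failures_tolerated(self) -> int:
        """Number of units that can fail without losing data."""
        return self.redundancy

    @property
    def usable_fraction(self) -> float:
        """Share of raw capacity that stores original data."""
        return self.data / (self.data + self.redundancy)


layouts = [
    Layout("RAID 1 (2 disks)", data=1, redundancy=1),
    Layout("RAID 5 (6 disks)", data=5, redundancy=1),
    Layout("RAID 6 (7 disks)", data=5, redundancy=2),
    Layout("Erasure coding 5+2", data=5, redundancy=2),
    Layout("Erasure coding 10+6", data=10, redundancy=6),
]

for layout in layouts:
    print(
        f"{layout.name:<20} tolerates {layout.failures_tolerated} failure(s), "
        f"{layout.usable_fraction:.0%} of raw capacity usable"
    )
```

The parity count is the tunable part: RAID 5 and RAID 6 fix it at one and two, while an erasure coding scheme can choose it freely and trade usable capacity for tolerance.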

What Are The Benefits of Erasure Coding?

A few of the benefits of a data storage method that uses erasure coding are:
  • More efficient resource utilization
  • Low risk of data loss
  • Increased flexibility
  • Increased durability

Why is Erasure Coding Useful?

Erasure coding is useful because it provides a level of data redundancy and integrity that traditional data redundancy mechanisms do not. It is especially useful for protecting object-based storage systems, including cloud storage services like Filebase, and for storing large amounts of data for applications or systems that must tolerate failures. For large data sets, RAID of any form is often neither practical nor cost effective.
To properly utilize erasure coding, infrastructures must have adequate network capabilities to achieve good data retrieval and reconstruction performance. For this reason, data backup and archival are the most commonly recommended use cases, since backup and archival workflows are typically static data stores that do not rely on constant data ingress and egress.
Filebase addresses this limitation with an aggressive caching layer at our edge locations. If an object has been accessed frequently or recently, it is cached in the edge layer for quicker access.
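The general idea of keeping recently accessed objects close to the reader can be sketched as a least-recently-used cache. This is a generic illustration, not a description of how Filebase's edge layer is built. The `EdgeCache` class, its `fetch` callback, and the capacity are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """Tiny least-recently-used cache in front of a slow fetch function."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int) -> None:
        self.fetch = fetch        # slow path, e.g. rebuilding an object from its pieces
        self.capacity = capacity  # maximum number of cached objects
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = self.fetch(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used object
        return value


# Stand-in for the slow retrieval path; a real one would reassemble erasure-coded pieces.
cache = EdgeCache(fetch=lambda key: key.encode(), capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)
```

In this run the second request for "a" is served from the cache, while "b" is evicted when "c" arrives and has to be fetched again.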
If you have any questions, please join our Discord server, or send us an email at [email protected]