Deep Dive: Decentralized Storage
Learn about what decentralized storage is, how it works, and how to transition from centralized storage to decentralized storage.
To understand decentralized storage, you first need to understand how centralized storage works.
Centralized storage is the type of storage most people use every day. Mobile phones, laptops, and tablets are all examples: everything saved to the hard drive or SD card in one of these devices is stored in one place, as one copy.
Data centers are also a form of centralized storage. A file stored on a server in a data center lives in one geographical location, on a single server. Data is not spread across multiple disks or servers unless redundancy is explicitly configured, such as RAID across the drives in a server or replication between servers, and even then each copy of each file is stored in one location.
This means that when data from a phone or laptop is backed up to cloud storage, there may be two copies of the data, but each one lives in a form of centralized storage. If something compromises or destroys the data on your phone, you have to rely on the backup to retrieve it. If that backup is destroyed by a fire or natural disaster, or is affected by an outage, your data is inaccessible even though you were diligent and backed it up for exactly this kind of situation.
The problem with centralized storage is that if something happens to it, the data is gone unless you have a backup plan and an active backup method. If you drop or lose your phone and never backed it up to iCloud, or backed it up only once, 8 months before you lost it, everything added to the phone since that backup is gone, because it was stored in one centralized location. And if you go to iCloud and find that the backup there is inaccessible or corrupt, you’ve lost not just 8 months of data, but everything.
This is a major weakness of centralized storage: no matter how vigilant you are about backing up your data regularly, it can still be lost if the cloud storage provider is hit by an outage or disaster.
The best way to visualize how decentralized storage works is to think about how online orders are processed and shipped.
Say you place an order on the website Chewy, a pet supplies marketplace with warehouses all over the United States. Your order contains three different items: a dog toy, a dog treat, and some dog food. Since there are warehouses all over the country, it's unlikely that every warehouse has all three of these items in stock at the same time. To fulfill your order, each item gets shipped from whichever warehouse has it in stock.
Decentralized storage networks are powered by blockchains, which are peer-to-peer networks of computing resources. Each node on a blockchain is an individual machine, such as a home computer or a dedicated server, that has joined the network by running its software. Every node contributes local resources, such as processing power and unused storage capacity, which the network uses for storage and other blockchain transactions.
Through blockchain networks, decentralized storage can use storage capacity that already exists and sits unused around the world. This means there are no costs for adding new hard drives, renting data center space, or upkeep. This can make decentralized storage much cheaper than centralized storage, which typically includes costs for new hardware, building maintenance, and fees like power bills or on-site technicians.
When you store a file on a decentralized storage network, the file gets broken apart into a number of pieces. That number varies by network; on the Sia network, for example, a file is broken into 30 pieces. This process is called erasure coding. Each piece is then individually encrypted with a special algorithm and stored in a wide variety of locations across the world, which also vary based on the network you store on. When you access or download your file, it's like when your order gets shipped from Chewy: your file is pieced together from each location where it's stored, then sent to you. Unlike a Chewy order that takes a few days to arrive, though, your file is ready to access in just a few seconds.
When a file is erasure-coded into multiple pieces, not all of those pieces are required to reassemble it for access or download. This feature exists specifically to ensure data integrity. For example, on the Sia network, where files are broken into 30 pieces, only 10 of those pieces are needed to access or download the file. That means ⅔ of the pieces can be offline, corrupt, or otherwise inaccessible, and you can still access your file. You won’t even know that pieces are offline or corrupt; it won’t change how you access your file at all. At the same time, no file can be read without the minimum number of pieces, and only you can decrypt them because of the encryption applied to each piece, so there's no concern about the owner of a node that stores one piece being able to access the file.
Going back to our Chewy example, this is like placing an order in which one of the items has been inventoried incorrectly, so when the warehouse staff goes to pull the item, it's actually out of stock there. The staff routes your request to another warehouse that has the item, so it still gets delivered to you. They don’t cancel your entire order over one missing item; they simply send it from another warehouse. It’s the same concept with decentralized storage: if one piece of a file can’t be accessed, the network uses a piece stored in another location to reassemble the file and send it to you, with no interruption on your end.
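The "any subset of pieces is enough" idea can be sketched in a few lines. The example below is a deliberately tiny stand-in, not the scheme Sia or any other network actually uses: it splits data into two halves and adds one XOR parity piece, so any 2 of the 3 pieces can rebuild the original. Production networks use more general codes with parameters like the 10-of-30 split described above, and they also encrypt each piece, which this sketch omits. The function names are hypothetical.

```python
from typing import List, Optional


def encode(data: bytes) -> List[bytes]:
    """Split data into two halves plus one XOR parity piece (a 2-of-3 code)."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length; decode() trims it again
    half = len(data) // 2
    a, b = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]


def decode(pieces: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data from any two of the three pieces.

    A missing piece is passed as None. `length` is the original size,
    needed to strip the padding byte added by encode().
    """
    if sum(p is None for p in pieces) > 1:
        raise ValueError("need at least 2 of the 3 pieces")
    a, b, parity = pieces
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))  # a = b XOR parity
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))  # b = a XOR parity
    return (a + b)[:length]


original = b"dog toy, dog treat, dog food"
pieces = encode(original)

# Simulate one host going offline: drop the first piece and rebuild anyway.
pieces_with_outage = [None, pieces[1], pieces[2]]
recovered = decode(pieces_with_outage, len(original))
```

Here `recovered` equals `original` even though one of the three pieces was never read, which is the warehouse rerouting from the analogy in miniature.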
One of the biggest advantages is the innate security that comes with decentralized storage. Since each file is erasure coded, encrypted, and stored across the globe, and at least ⅓ of the pieces are needed to access it, files are protected not only in integrity and accessibility but also in privacy. No one can access a file’s pieces besides the user who uploaded them. The exception is data stored on IPFS, which is public by default, since IPFS files are accessed by their content identifier through an IPFS gateway address.
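To make the IPFS point concrete, here is a minimal sketch of how a gateway address is typically formed from a content identifier, using the common path-style layout of gateway host, then /ipfs/, then the CID. The gateway hostname and the CID below are placeholders chosen for illustration, not real values, and the helper name is hypothetical.

```python
from urllib.parse import quote

# Placeholder gateway; substitute the gateway you actually use.
EXAMPLE_GATEWAY = "https://gateway.example.com"


def gateway_url(cid: str, path: str = "", gateway: str = EXAMPLE_GATEWAY) -> str:
    """Build a path-style gateway URL: <gateway>/ipfs/<cid>[/<path>]."""
    url = f"{gateway.rstrip('/')}/ipfs/{cid}"
    if path:
        # Percent-encode spaces and similar characters, keep "/" separators.
        url += "/" + quote(path.lstrip("/"))
    return url


# Anyone who knows the CID can build this URL, which is why such data is public.
url = gateway_url("QmPlaceholderCid", "docs/read me.txt")
```

Because the URL is derived entirely from the content identifier, there is no per-user secret involved in fetching the file, which is what "public by default" means in practice.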
Another advantage of decentralized storage is reliability. Because the network is decentralized, it has no single point of failure that could take it down and make the data inaccessible. This greatly reduces the risk of crippling outages that disrupt workflows or result in a loss of business.
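One rough way to put a number on this is to assume that each host is online independently with some probability p. That is a simplification, since real outages can be correlated, and the 90% uptime figure below is purely hypothetical. Under those assumptions, the sketch compares a single centralized copy with a file that needs any 10 of 30 pieces, the Sia figures mentioned earlier.

```python
from math import comb


def availability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n hosts are online.

    Assumes each host is online independently with probability p
    (a binomial model; an illustrative simplification).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p = 0.90  # hypothetical per-host uptime

single_copy = availability(1, 1, p)       # one copy on one host
erasure_coded = availability(10, 30, p)   # any 10 of 30 pieces suffice

print(f"single copy:  {single_copy:.6f}")
print(f"10-of-30:     {erasure_coded:.6f}")
```

With a single copy, availability is simply p. With 10 of 30, the file is unavailable only if 21 or more hosts are down at the same moment, which under the independence assumption is far less likely than one host being down.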
In the past, it has been hard to transition from centralized to decentralized storage. Filebase is intended to be an easy on-ramp for everyone making that transition. Traditionally, users had to manage things like contracts and cryptocurrency to use decentralized storage. At Filebase, we manage all of that for you and give you the ability to store data across different decentralized networks. We currently support IPFS and Sia. Filebase also doesn’t impose the restrictions you’d face when storing directly on these networks, such as minimum file sizes or data retention limits.
Filebase is the first S3-compatible decentralized storage platform, which means that almost any product, tool, or piece of code that works with Amazon S3 can be configured to work with Filebase. That makes the transition straightforward for developers and enterprises, and for the everyday user too. You can use Filebase from our easy-to-use web dashboard, or you can point your favorite backup tool at Filebase. Before Filebase, the transition was hard; today, we aim to be the on-ramp from Web2 centralized storage to Web3 decentralized storage.
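As an illustration of what S3 compatibility means for code, an existing S3 client usually only needs a different endpoint and credentials. The sketch below uses the third-party boto3 library. The endpoint URL is an assumption that should be checked against the current Filebase documentation, and the bucket and file names in the usage comment are hypothetical.

```python
# Assumed endpoint; confirm against the Filebase documentation before use.
ASSUMED_ENDPOINT = "https://s3.filebase.com"


def filebase_client_kwargs(access_key: str, secret_key: str,
                           endpoint_url: str = ASSUMED_ENDPOINT) -> dict:
    """Keyword arguments for boto3.client() that target an S3-compatible endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(access_key: str, secret_key: str):
    """Create the S3 client. Requires boto3 (pip install boto3)."""
    import boto3  # imported here so the helper above works without boto3
    return boto3.client(**filebase_client_kwargs(access_key, secret_key))


# Usage (hypothetical bucket and file names):
#   client = make_client("MY_ACCESS_KEY", "MY_SECRET_KEY")
#   client.upload_file("backup.tar.gz", "my-bucket", "backup.tar.gz")
```

The only difference from a typical Amazon S3 setup is the `endpoint_url` argument; the upload call itself is the same one used against S3.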