Important Rules to Avoid Losing Data
Most S3QL backends store data in distributed storage systems. These systems differ from a traditional, local hard disk in several important ways. To avoid losing data, read this section very carefully.
Rules in a Nutshell
To avoid losing your data, obey the following rules:
1. Know what durability you can expect from your chosen storage provider. Durability describes how likely it is that a stored object becomes damaged over time. Such data corruption can never be prevented completely; techniques like geographic replication and RAID storage only reduce the likelihood that it happens (i.e., increase the durability).
2. When choosing a backend and storage provider, keep in mind that when using S3QL, the effective durability of the file system data will be reduced because of S3QL’s data de-duplication feature.
3. Determine your storage service’s consistency window. The consistency window that is important for S3QL is the smaller of the times for which:
3.1 a newly created object may not yet be included in the list of stored objects
3.2 an attempt to read a newly created object may fail with the storage service reporting that the object does not exist
If one of the above times is zero, we say that as far as S3QL is concerned the storage service has immediate consistency.
If your storage provider claims that neither of the above can ever happen, while at the same time promising high durability, you should choose a respectable provider instead.
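The definition in rule 3 can be written down as a small calculation. The following is a minimal sketch in Python, not part of S3QL: the function names and the two delay parameters (in seconds) are hypothetical and only restate the rule as given above.

```python
def consistency_window(list_delay_s: float, read_delay_s: float) -> float:
    """Return the consistency window as defined in rule 3.

    list_delay_s: time for which a new object may be missing from listings.
    read_delay_s: time for which reading a new object may report "not found".
    Both parameters are hypothetical names; the values come from your provider.
    """
    if list_delay_s < 0 or read_delay_s < 0:
        raise ValueError("delays must be non-negative")
    # Rule 3: the relevant window is the smaller of the two times.
    return min(list_delay_s, read_delay_s)


def has_immediate_consistency(list_delay_s: float, read_delay_s: float) -> bool:
    """True if at least one of the two times is zero."""
    return consistency_window(list_delay_s, read_delay_s) == 0


if __name__ == "__main__":
    print(consistency_window(2.0, 0.5))
    print(has_immediate_consistency(0.0, 3.0))
```

Because the window is the smaller of the two times, it is zero exactly when one of them is zero, which matches the definition of immediate consistency above.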
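One way to see why de-duplication (rule 2) can reduce effective durability is a deliberately simplified model. The sketch below is an illustration under stated assumptions, not S3QL's actual behavior in detail: it assumes each storage object is lost independently with the same probability, and that you keep several identical copies of a file as a safeguard. The function and parameter names are hypothetical.

```python
def loss_probability(p_object: float, copies: int, deduplicated: bool) -> float:
    """Probability that every copy of a file is lost (simplified model).

    p_object: assumed probability that a single stored object is lost.
    copies: number of identical copies kept in the file system.
    deduplicated: if True, all copies share one stored object.
    """
    if not 0.0 <= p_object <= 1.0:
        raise ValueError("p_object must be a probability")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # With de-duplication, identical copies are backed by one object,
    # so a single loss damages all of them at once.
    distinct_objects = 1 if deduplicated else copies
    return p_object ** distinct_objects


if __name__ == "__main__":
    p = 1e-4  # assumed per-object loss probability, for illustration only
    print(loss_probability(p, copies=2, deduplicated=False))  # about 1e-08
    print(loss_probability(p, copies=2, deduplicated=True))   # 0.0001
```

Under these assumptions, keeping a second copy inside the same de-duplicated file system adds no protection, which is why the provider's stated durability should be treated as an upper bound.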