The Google File System

wangfans · posted on 2013-6-14 15:59

Second, files are huge by traditional standards. Multi-GB
files are common. Each file typically contains many application
objects such as web documents. When we are regularly
working with fast growing data sets of many TBs comprising
billions of objects, it is unwieldy to manage billions of approximately
KB-sized files even when the file system could
support it. As a result, design assumptions and parameters
such as I/O operation and block sizes have to be revisited.
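
To make the scale argument concrete, here is a rough back-of-the-envelope sketch. The 10 TB data set, the 10 KB object size, and the 1 GB container file size are illustrative assumptions, not figures from the paper; `files_needed` is a hypothetical helper.

```python
def files_needed(total_bytes: int, bytes_per_file: int) -> int:
    """Number of files required to hold total_bytes, rounding up."""
    return -(-total_bytes // bytes_per_file)  # ceiling division on integers


KB, GB, TB = 10**3, 10**9, 10**12

dataset = 10 * TB  # assumed data set size, for illustration only

# One file per ~10 KB object versus packing objects into 1 GB files.
small = files_needed(dataset, 10 * KB)
large = files_needed(dataset, 1 * GB)

print(f"one file per object: {small:,} files")
print(f"packed into 1 GB files: {large:,} files")
```

Under these assumptions the file count drops by five orders of magnitude, which is the kind of gap that motivates storing many application objects per file.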

wangfans · posted on 2013-6-15 14:02

Third, most files are mutated by appending new data
rather than overwriting existing data. Random writes within
a file are practically non-existent. Once written, the files
are only read, and often only sequentially. A variety of
data share these characteristics. Some may constitute large
repositories that data analysis programs scan through. Some
may be data streams continuously generated by running applications.
Some may be archival data.

wangfans · posted on 2013-6-15 14:02

Some may be intermediate
results produced on one machine and processed
on another, whether simultaneously or later in time. Given
this access pattern on huge files, appending becomes the focus
of performance optimization and atomicity guarantees,
while caching data blocks in the client loses its appeal.

wangfans · posted on 2013-6-15 14:06

Fourth, co-designing the applications and the file system
API benefits the overall system by increasing our flexibility.
For example, we have relaxed GFS’s consistency model to
vastly simplify the file system without imposing an onerous
burden on the applications. We have also introduced an
atomic append operation so that multiple clients can append
concurrently to a file without extra synchronization between
them. These will be discussed in more detail later in the
paper.
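
The idea of an atomic append can be illustrated with a toy, single-process model. This is only a sketch of the semantics described above (the file system, not the client, picks the offset and writes each record as one unit); it is not how GFS implements the operation. `AppendOnlyFile`, `record_append`, and `run_clients` are hypothetical names invented for this example.

```python
import threading


class AppendOnlyFile:
    """Toy in-memory stand-in for a file that supports atomic appends."""

    def __init__(self):
        self._data = bytearray()
        # The serialization lives inside the "file system", not in the clients.
        self._lock = threading.Lock()

    def record_append(self, record: bytes) -> int:
        """Append record as one unit and return the offset chosen for it."""
        with self._lock:
            offset = len(self._data)
            self._data.extend(record)
            return offset

    def read(self, offset: int, length: int) -> bytes:
        with self._lock:
            return bytes(self._data[offset:offset + length])


def run_clients(num_clients: int = 4, records_per_client: int = 100):
    """Run several client threads that append without coordinating."""
    f = AppendOnlyFile()
    per_client = [[] for _ in range(num_clients)]  # each thread owns one list

    def client(cid: int) -> None:
        for i in range(records_per_client):
            rec = f"client{cid}-rec{i};".encode()
            per_client[cid].append((f.record_append(rec), rec))

    threads = [threading.Thread(target=client, args=(c,))
               for c in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    results = [pair for chunk in per_client for pair in chunk]
    return f, results
```

Each client only learns where its record landed from the returned offset; no client ever computes an offset itself, so the clients need no locking among themselves.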

wangfans · posted on 2013-6-15 14:06

Multiple GFS clusters are currently deployed for different
purposes. The largest ones have over 1000 storage nodes,
over 300 TB of disk storage, and are heavily accessed by
hundreds of clients on distinct machines on a continuous
basis.

wangfans · posted on 2013-6-15 14:06

2. DESIGN OVERVIEW
2.1 Assumptions
In designing a file system for our needs, we have been
guided by assumptions that offer both challenges and opportunities.
We alluded to some key observations earlier
and now lay out our assumptions in more detail.

wangfans · posted on 2013-6-16 20:42

The system is built from many inexpensive commodity
components that often fail. It must constantly monitor
itself and detect, tolerate, and recover promptly from
component failures on a routine basis.

wangfans · posted on 2013-6-16 20:42

The system stores a modest number of large files. We
expect a few million files, each typically 100 MB or
larger in size. Multi-GB files are the common case
and should be managed efficiently. Small files must be
supported, but we need not optimize for them.

wangfans · posted on 2013-6-16 20:42

The workloads primarily consist of two kinds of reads:
large streaming reads and small random reads. In
large streaming reads, individual operations typically
read hundreds of KBs, more commonly 1 MB or more.
Successive operations from the same client often read
through a contiguous region of a file. A small random
read typically reads a few KBs at some arbitrary
offset. Performance-conscious applications often batch
and sort their small reads to advance steadily through the file rather than go back and forth.
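
The batching and sorting of small reads mentioned above can be sketched on the application side as follows. This is a minimal illustration, not code from the paper: `plan_reads` and its `max_gap` parameter are hypothetical, and requests are assumed to be `(offset, length)` pairs in bytes.

```python
def plan_reads(requests, max_gap=0):
    """Sort small reads by offset and merge ranges that touch or overlap.

    requests: iterable of (offset, length) pairs in bytes.
    max_gap:  also merge ranges separated by at most this many bytes.
    Returns a list of (offset, length) pairs in ascending offset order.
    """
    ordered = sorted((off, ln) for off, ln in requests if ln > 0)
    plan = []
    for off, ln in ordered:
        if plan and off <= plan[-1][0] + plan[-1][1] + max_gap:
            # Extend the previous range instead of issuing a separate read.
            start, prev_len = plan[-1]
            end = max(start + prev_len, off + ln)
            plan[-1] = (start, end - start)
        else:
            plan.append((off, ln))
    return plan


# Four scattered reads collapse into two forward-moving reads.
batch = [(9000, 100), (0, 4096), (4096, 1024), (2000, 500)]
print(plan_reads(batch))  # [(0, 5120), (9000, 100)]
```

The resulting plan only ever moves forward through the file, so a batch of random reads behaves more like the streaming pattern the system is built for.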

wangfans · posted on 2013-6-16 20:42

The workloads also have many large, sequential writes
that append data to files. Typical operation sizes are
similar to those for reads. Once written, files are seldom
modified again. Small writes at arbitrary positions
in a file are supported but do not have to be
efficient.