dfs.data.dir
While dfs.name.dir specifies the location of the namenode metadata,
dfs.data.dir is used to indicate where datanodes should store HDFS block data.
Its value is also a comma-separated list, but rather than mirroring data to each directory
specified, the datanode round robins blocks between disks in an attempt to allocate blocks
evenly across all drives. The datanode assumes each directory specifies a separate
physical device in a JBOD group. As described earlier, by JBOD, we mean that each
disk is individually addressable by the OS, and is formatted and mounted as a separate
mount point. Loss of a physical disk is not critical since replicas will exist on other
machines in the cluster.
Example value: /data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn. Used
by: DN.
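The round-robin behavior described above can be pictured with a small Python sketch. This is only a toy model, not the datanode's actual implementation: the helper names parse_dir_list and assign_round_robin are hypothetical, the block IDs are made-up integers, and details such as free-space checks or failed disks are ignored.

```python
from itertools import cycle


def parse_dir_list(value):
    """Split a comma-separated property value into a list of directories."""
    return [d.strip() for d in value.split(",") if d.strip()]


def assign_round_robin(block_ids, data_dirs):
    """Hand each new block to the next directory in turn (toy model only)."""
    placement = {d: [] for d in data_dirs}
    # cycle() repeats the directory list forever; zip() stops when blocks run out.
    for block_id, directory in zip(block_ids, cycle(data_dirs)):
        placement[directory].append(block_id)
    return placement


# Example value taken from the text; the block IDs 0..9 are illustrative.
dirs = parse_dir_list("/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn")
placement = assign_round_robin(range(10), dirs)

for directory, blocks in placement.items():
    print(directory, blocks)
```

Each block lands in exactly one directory, so adding a disk adds capacity rather than redundancy. Redundancy comes from replicas on other machines, as noted above.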
fs.checkpoint.dir
The fs.checkpoint.dir parameter specifies the comma-separated list of directories
used by the secondary namenode in which to store filesystem metadata during a
checkpoint operation. If multiple directories are provided, the secondary namenode
mirrors the data to each directory the same way the namenode does. It is rare,
however, that multiple directories are given because the checkpoint data is transient
and, if lost, is simply copied again during the next checkpoint operation. Some
administrators treat the contents of this directory as a worst-case-scenario location
from which they can recover the namenode’s metadata. It is, after all, a valid copy
of the data required to restore a completely failed namenode.
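To contrast mirroring with the round-robin placement used for block data, the following Python sketch writes the same bytes to every directory in a list. It is an illustration under stated assumptions, not the secondary namenode's real code: the function mirror_write, the temporary directory layout, and the payload are all hypothetical, and the file name is used only as a placeholder.

```python
import os
import tempfile


def mirror_write(directories, filename, payload):
    """Write identical bytes to `filename` in every directory; return the paths."""
    written = []
    for directory in directories:
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, filename)
        with open(path, "wb") as handle:
            handle.write(payload)
        written.append(path)
    return written


# Hypothetical checkpoint directories created under a temporary location.
base = tempfile.mkdtemp()
checkpoint_dirs = [
    os.path.join(base, "1", "dfs", "snn"),
    os.path.join(base, "2", "dfs", "snn"),
]
paths = mirror_write(checkpoint_dirs, "fsimage", b"example metadata")

for path in paths:
    print(path)
```

Because every directory holds a full copy, losing one directory loses nothing, which is why any single surviving copy can serve as a last-resort recovery source.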