Original poster: jieforest

HBase component details in depth

11#
OP | Posted 2015-5-9 22:13
A scale-out system gives us redundancy and high availability. It is cost-effective: there is no need to invest in high-end machines, there is no application migration overhead, and servers can be located in many places. It suits massively parallel computing, where a number of machines share the workload evenly. The following figure shows the HBase scaling method:

12#
OP | Posted 2015-5-9 22:13
In HBase, we can add new RegionServers on the fly: new DataNodes are added, the RegionServer daemon is started on them, and the cluster scales out. In short, we first add a node to the cluster, and then start the DataNode and RegionServer daemons on the newly added node.

Let's talk about how the HBase daemons (nodes) communicate. The daemons and nodes talk to each other using Remote Procedure Call (RPC), which lets one HBase component call functions that another component exposes and treat those calls as if they were local. The procedure or subroutine actually executes in a different address space, such as on another computer system. With this kind of intercommunication, the details of the remote interaction do not have to be coded explicitly into the server architecture.
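The idea of calling a remote procedure as if it were local can be sketched in a few lines of Python. This is only an illustration of the pattern, not HBase code: `FakeRegionServer`, `RpcProxy`, `handle_request`, and the JSON wire format are all hypothetical names and choices made for this example, and the "network" is a plain function call.

```python
import json


class FakeRegionServer:
    """Hypothetical server-side object; method names are illustrative only."""

    def __init__(self):
        self.rows = {}

    def put(self, row, value):
        self.rows[row] = value
        return True

    def get(self, row):
        return self.rows.get(row)


def handle_request(server, payload):
    """Server side: unmarshal the request, run the procedure, marshal the result."""
    request = json.loads(payload)
    method = getattr(server, request["method"])
    return json.dumps({"result": method(*request["args"])})


class RpcProxy:
    """Client side: turns any method call into a marshalled request."""

    def __init__(self, transport):
        # transport is any callable that takes a request string
        # and returns a response string.
        self._transport = transport

    def __getattr__(self, name):
        # Only reached for attributes that do not exist on the proxy,
        # i.e. for the remote method names.
        def remote_call(*args):
            payload = json.dumps({"method": name, "args": list(args)})
            response = self._transport(payload)
            return json.loads(response)["result"]

        return remote_call


server = FakeRegionServer()
# A real system would send the payload over a socket instead.
proxy = RpcProxy(lambda payload: handle_request(server, payload))

proxy.put("row1", "value1")   # looks like a local call
print(proxy.get("row1"))      # value1
```

The caller never builds a request by hand; the proxy intercepts the method name and arguments, which is the role a dynamic proxy plays on the client side of an RPC layer.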

13#
OP | Posted 2015-5-9 22:14
The following figure shows the RPC flow:

14#
OP | Posted 2015-5-9 22:15
In HBase, HBaseRPC is the class that lets the components communicate over RPC. It is based on the Java dynamic proxy pattern: an invoker class that implements InvocationHandler intercepts client-side method calls and marshals the method name and arguments through HBaseClient. The communication between client and server using RPC works as follows:

1. The client contacts ZooKeeper to find out which HMaster is active and where the RegionServer that hosts the root region is located.

2. The client then communicates with the RegionServer through HRegionInterface to read from or write to the table.

3. Client applications talk to HMaster through HMasterInterface to create a table dynamically, add a column family, and perform other such operations.

4. HMaster communicates with RegionServers through HRegionInterface to open, close, move, split, or flush regions.

5. HMaster stores the active HMaster data and the root region location in ZooKeeper.

6. RegionServers read data from ZooKeeper to pick up log-splitting tasks, and update it to report the status of each task.

7. RegionServers communicate with HMaster through HMasterRegionInterface to convey information such as RegionServer load, RegionServer errors, and RegionServer startup.

15#
OP | Posted 2015-5-9 22:15
Sometimes, a RegionServer also communicates with the root region or the meta region through HRegionInterface, either to check the current status of a region or to create a new daughter region during a region split.

8. This communication is repeated at a tick-time or threshold-time interval to keep everything up to date.

16#
OP | Posted 2015-5-10 16:36
Reading and writing cycle

Now, let's see diagrammatically how read and write operations take place in HBase:

17#
OP | Posted 2015-5-10 16:36
Let's look at how read and write operations work on HBase tables. In HBase, the client does not write data to an HFile directly; the data is first written to the WAL and then to the MemStore, which each HStore keeps in main memory, and it is flushed to an HFile later. Refer to the following figure:

18#
OP | Posted 2015-5-10 16:36
Write-Ahead Logs

Write-Ahead Logs (WALs) provide data reliability and reside on HDFS; each RegionServer hosts a single WAL. If a RegionServer crashes before its MemStore is flushed, the WAL is used to restore the unflushed data on a new RegionServer. A write operation is therefore considered successful only once the data has been written to both the WAL and the MemStore.
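The log-first ordering and the recovery step can be sketched as follows. This is a toy model under loud assumptions: `RegionServerSketch` is a hypothetical class, a Python list stands in for the WAL file on HDFS, and a dict stands in for the MemStore.

```python
class RegionServerSketch:
    """Toy model: a durable log plus a volatile in-memory buffer."""

    def __init__(self, wal):
        self.wal = wal        # survives a crash (stands in for a file on HDFS)
        self.memstore = {}    # lost on a crash

    def put(self, row, value):
        self.wal.append((row, value))   # 1. log the edit first
        self.memstore[row] = value      # 2. then apply it in memory
        return True                     # only now is the write acknowledged

    @classmethod
    def recover(cls, wal):
        """Rebuild the in-memory state on a new server by replaying the log."""
        server = cls(wal)
        for row, value in wal:          # replay edits in their original order
            server.memstore[row] = value
        return server


wal = []
old_server = RegionServerSketch(wal)
old_server.put("row1", "a")
old_server.put("row2", "b")
old_server.put("row1", "c")

del old_server                          # simulate a crash: the MemStore is gone

new_server = RegionServerSketch.recover(wal)
print(new_server.memstore)              # {'row1': 'c', 'row2': 'b'}
```

Replaying in order matters: the later edit to `row1` overwrites the earlier one, so the recovered state matches what the crashed server held.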

MemStore

MemStore acts as an in-memory write buffer with a default size of 64 MB. Once the data in MemStore reaches the threshold (by default, 64 MB for a MemStore or 40 percent of the heap size), it is flushed to a new HFile on HDFS for persistence. The 64 MB figure is unrelated to the HDFS block size; Hadoop manages block allocation and storage internally, and HBase plays no part in block replication or in dividing HFiles into blocks. Each column family may have many HFiles, but each HFile belongs to exactly one column family.
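A minimal sketch of the flush behaviour, assuming a tiny byte threshold so the effect is visible. `StoreSketch` and `flush_threshold_bytes` are hypothetical names, sizes are approximated by string lengths, and each "HFile" is just an immutable tuple sorted by row key.

```python
class StoreSketch:
    """Toy model of one store: a write buffer plus the files it has flushed."""

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.memstore = {}
        self.memstore_bytes = 0
        self.hfiles = []      # each entry is an immutable, row-sorted tuple

    def put(self, row, value):
        previous = self.memstore.get(row)
        if previous is not None:
            # Overwriting a buffered row: drop the old size first.
            self.memstore_bytes -= len(row) + len(previous)
        self.memstore[row] = value
        self.memstore_bytes += len(row) + len(value)
        if self.memstore_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # Write the buffer out as a new sorted file, then start empty.
        self.hfiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}
        self.memstore_bytes = 0


store = StoreSketch(flush_threshold_bytes=10)
for row, value in [("r3", "ccc"), ("r1", "aaa"), ("r2", "bbb"),
                   ("r4", "ddd"), ("r5", "eee")]:
    store.put(row, value)

print(len(store.hfiles))   # 2
print(store.hfiles[0])     # (('r1', 'aaa'), ('r3', 'ccc'))
print(store.memstore)      # {'r5': 'eee'}
```

Each flush produces a new file rather than updating an old one, which is why a column family accumulates several HFiles over time.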

19#
OP | Posted 2015-5-10 16:37
Now, let's take a look at the process flow of reading from HBase. The reading process starts when the client initiates a read request; the client obtains the RegionServer and region information and sends the request to that RegionServer. There, the read is first tried against MemStore; on a hit, the read completes; on a miss, it moves on to the block cache. Finally, it goes to the HFile to read the required row. If the record is still missing from memory, the part of the HFile that contains the required row is loaded into memory. So, MemStore and block cache provide real-time access to data for performance, and HFile provides persistent, on-demand data.
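The lookup order described above can be sketched as a single function. This is a simplification with hypothetical names (`read_row` and its arguments): plain dicts stand in for MemStore, block cache, and HFiles, and the cache holds whole rows rather than file blocks.

```python
def read_row(row, memstore, block_cache, hfiles):
    """Return (value, source) following MemStore -> block cache -> HFile."""
    if row in memstore:
        return memstore[row], "memstore"
    if row in block_cache:
        return block_cache[row], "block cache"
    # Check the newest file first so the latest flushed value wins.
    for hfile in reversed(hfiles):
        if row in hfile:
            block_cache[row] = hfile[row]   # keep it in memory for next time
            return hfile[row], "hfile"
    return None, "not found"


memstore = {"r9": "new"}
block_cache = {}
hfiles = [{"r1": "a", "r2": "b"}, {"r2": "b2"}]   # oldest first

print(read_row("r9", memstore, block_cache, hfiles))   # ('new', 'memstore')
print(read_row("r2", memstore, block_cache, hfiles))   # ('b2', 'hfile')
print(read_row("r2", memstore, block_cache, hfiles))   # ('b2', 'block cache')
```

The second read of `r2` is served from memory, which is the performance benefit the paragraph attributes to the block cache.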

Block cache follows the least recently used (LRU) algorithm. Every RegionServer has a single block cache that keeps the most frequently accessed data from HFiles in main memory, which reduces disk seeks and therefore data access time.
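For readers unfamiliar with LRU eviction, here is a plain LRU cache built on `collections.OrderedDict`. It is a generic sketch, not HBase's actual block cache implementation; `LruCacheSketch` and the block keys are hypothetical.

```python
from collections import OrderedDict


class LruCacheSketch:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._blocks = OrderedDict()   # oldest use first, newest use last

    def get(self, key):
        if key not in self._blocks:
            return None
        self._blocks.move_to_end(key)  # mark as most recently used
        return self._blocks[key]

    def put(self, key, block):
        self._blocks[key] = block
        self._blocks.move_to_end(key)
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)   # drop the least recently used

    def keys(self):
        return list(self._blocks)


cache = LruCacheSketch(capacity=2)
cache.put("block1", "data1")
cache.put("block2", "data2")
cache.get("block1")            # block1 is now the most recently used
cache.put("block3", "data3")   # evicts block2

print(cache.keys())            # ['block1', 'block3']
```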

20#
OP | Posted 2015-5-10 16:37
HBase housekeeping

As data is added to HBase, it is written to the store as immutable files. Each store corresponds to a column family, and a region consists of these row-key-ordered files. Because the files are immutable, new data goes into new files instead of updating an existing one, so a store ends up with many files rather than one. With many files, I/O becomes slower, which causes lag in reading and writing and makes operations slower overall. To overcome these problems, HBase uses compaction; let's look into it now. Refer to the following figure for a better understanding:
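The core idea of merging several row-sorted immutable files into one can be sketched with `heapq.merge`. The posts above do not describe the details of compaction, so this is a generic merge under stated assumptions: `compact` is a hypothetical function, files are lists of `(row, value)` pairs ordered oldest file first, the newest value for a row wins, and deletes, versions, and timestamps are ignored.

```python
import heapq


def compact(hfiles):
    """Merge row-sorted files (oldest first) into one row-sorted file."""
    # Tag entries with the negated file age so that, for equal rows,
    # the entry from the newest file sorts first.
    tagged = [
        [(row, -age, value) for row, value in hfile]
        for age, hfile in enumerate(hfiles)
    ]
    merged = []
    for row, _neg_age, value in heapq.merge(*tagged):
        if merged and merged[-1][0] == row:
            continue            # an newer value for this row is already kept
        merged.append((row, value))
    return merged


hfiles = [
    [("r1", "a"), ("r3", "c")],       # older file
    [("r2", "b"), ("r3", "c2")],      # newer file
]
print(compact(hfiles))   # [('r1', 'a'), ('r2', 'b'), ('r3', 'c2')]
```

After the merge, a read has one file to consult instead of two, which is the I/O saving that motivates compaction.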
