Original poster: jieforest

Distributed database Hypertable 0.9.7.6 released

31#
OP | Posted 2013-6-6 09:42
ADAPTIVE MEMORY ALLOCATION

The following diagram illustrates how the RangeServer adapts its memory usage based on changes in workload.



Under write-heavy workload, the RangeServer will give more memory to the CellCaches so that they can grow as large as possible, which minimizes the amount of spilling and merging work required. Under read-heavy workload, the system gives most of the memory to the block cache, which significantly improves query throughput and latency.
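
To make the idea concrete, here is a rough Python sketch of such a policy. It is not Hypertable's actual algorithm, which the post does not spell out: the function name, the linear split by read/write ratio, and the 10% floor are all assumptions made for illustration.

```python
def split_memory(total_bytes, reads, writes, floor=0.10):
    """Hypothetical policy: divide total_bytes between the CellCaches
    (write path) and the block cache (read path) by workload mix.

    floor is the minimum share either side keeps (an assumed value).
    """
    ops = reads + writes
    # With no observed traffic, fall back to an even split.
    read_share = 0.5 if ops == 0 else reads / ops
    # Clamp so that neither cache is starved completely.
    read_share = min(max(read_share, floor), 1.0 - floor)
    block_cache = int(total_bytes * read_share)
    cell_caches = total_bytes - block_cache
    return cell_caches, block_cache


if __name__ == "__main__":
    budget = 8 * 1024 ** 3  # 8 GiB, an arbitrary example budget
    print("write-heavy:", split_memory(budget, reads=100, writes=9900))
    print("read-heavy: ", split_memory(budget, reads=9900, writes=100))
```

A write-heavy mix leaves most of the budget with the CellCaches, and a read-heavy mix shifts it to the block cache, which is the behaviour the paragraph above describes.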

32#
Posted 2013-6-6 15:54
Great write-up from the OP. Could it be translated into Chinese? Reading the English is a bit of a struggle.

33#
OP | Posted 2013-6-7 08:59
PREREQUISITES

Before you start the installation, a few general system requirements need to be satisfied.  They are described in the following list.

admin machine - You should designate one of the machines in your Hypertable cluster as the admin machine (admin1 in examples below).  This is the machine from which you will be administering the cluster.  It can be the same machine as the master or any machine of your choosing.  There are no special hardware requirements for this machine, but it needs to have Internet access (at least temporarily) to get the recommended cluster management tool, Capistrano, installed on it.  It is possible to install Capistrano without Internet access, but it's challenging and could take you half a day to get it working.

password-less ssh - For ease of administration, we recommend using Capistrano, which requires password-less ssh login access from the admin machine to all other machines in the cluster (masters, hyperspace replicas, range servers, etc.).  See Password-less SSH Login for details on how to set this up.

ssh MaxStartups - sshd on the admin machine needs to be configured to allow simultaneous connections from all of the machines in the Hypertable cluster.  The simultaneous connection limit, MaxStartups, defaults to 10.  See SSH Connection Limit for details on how to increase this limit.

firewall - The Hypertable processes use TCP and UDP to communicate with one another and with client applications.  Firewalls can block this traffic and prevent Hypertable from operating properly.  Any firewall that blocks traffic between the Hypertable machines should be disabled or the appropriate ports should be opened up to allow Hypertable communication.  See Hypertable Firewall Requirements for instructions on how to do this.

open file limit - Most operating systems have a limit on the total number of files that a process can have open at any one time.  This limit is usually set too low for Hypertable, since it can create a very large number of files.  See Open File Limit for details on how to increase this limit.
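
For the open file limit item, a small Python check can report what the current account is actually allowed. This is only a sketch: the 65536 target is an assumed value (the post gives no number), and the `resource` module is available on Unix-like systems only.

```python
import resource


def nofile_limit_ok(required=65536):
    """Return True if this process may open at least `required` files.

    `required` is an assumed target; the post does not give a number.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={nofile_limit_ok()}")
```

Run it as the user that will start the Hypertable processes, since the limit is per account and per session.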

34#
OP | Posted 2013-6-7 08:59
STEP 1 - INSTALL HDFS

The first step in getting Hypertable up and running on top of Hadoop is to install HDFS.  Hypertable currently builds against Cloudera's CDH3 distribution of Hadoop (see CDH3 Installation for installation instructions).  Each RangeServer process should run on a machine that is also running an HDFS DataNode.  It's best not to run the HDFS NameNode on the same machine as a RangeServer since both of those processes tend to consume a lot of RAM.

To accommodate Bigtable-style workloads, HDFS needs to be specially configured.  The dfs.datanode.max.xcievers property, which controls the number of files that a DataNode can service concurrently, should be increased to at least 4096, and the dfs.namenode.handler.count property, which controls the number of NameNode threads available to handle RPCs, should be increased to at least 20.  This can be accomplished by adding the following lines to the conf/hdfs-site.xml file.


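<property>
  <name>dfs.namenode.handler.count</name>
  <value>20</value>
</property>

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>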
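
One way to confirm that the two settings are in place is to parse the file and compare the values with the minimums given above. The sketch below assumes the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children inside a `<configuration>` root. The function name and the sample document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Minimum values recommended in the text above.
MINIMUMS = {
    "dfs.namenode.handler.count": 20,
    "dfs.datanode.max.xcievers": 4096,
}


def check_hdfs_site(xml_text):
    """Return a list of problems found in an hdfs-site.xml document."""
    root = ET.fromstring(xml_text)
    found = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    problems = []
    for name, minimum in MINIMUMS.items():
        if name not in found:
            problems.append(f"{name} is not set")
        elif int(found[name]) < minimum:
            problems.append(f"{name}={found[name]} is below {minimum}")
    return problems


# Illustrative input with one value deliberately set too low.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>20</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>1024</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_hdfs_site(SAMPLE):
        print(problem)
```

To check a real installation, read conf/hdfs-site.xml from disk and pass its contents to `check_hdfs_site`. An empty list means both properties meet the minimums.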

35#
OP | Posted 2013-6-7 08:59
Once the filesystem is installed, create a /hypertable directory that is readable and writable by the user account under which Hypertable will run.  For example:

sudo -u hdfs hadoop fs -mkdir /hypertable
sudo -u hdfs hadoop fs -chmod 777 /hypertable

36#
OP | Posted 2013-6-7 08:59
STEP 2 - INSTALL CAPISTRANO

The Hypertable distribution comes with a number of scripts to start and stop the various servers that make up a Hypertable cluster. You can use your own cluster management tool to launch these scripts and deploy new binaries. However, if you're not already using a cluster management tool, we recommend Capistrano. The distribution comes with a Capistrano config file (conf/Capfile.cluster) that makes deploying and launching Hypertable a breeze.

Capistrano is a simple tool for automating the remote execution of tasks. It uses ssh to do the remote execution. To ease deployment, you should have password-less ssh access (i.e. public key) to all of the machines in your cluster. Installing Capistrano is pretty simple. On most systems you just need to execute the following commands (Internet access required):

$ sudo gem update
$ sudo gem install capistrano
After this installation step you should now have the cap program in your path:

$ cap --version
Capistrano v2.9.0

37#
OP | Posted 2013-6-7 09:00
STEP 3 - EDIT CAPISTRANO CAPFILE

Once you have Capistrano installed, copy the conf/Capfile.cluster that comes with the Hypertable distribution to your working directory (e.g. home directory) on admin1, rename it to Capfile, and tailor it for your environment. The cap command reads the file Capfile in the current working directory by default. There are some variables that are set at the top that you need to modify for your particular environment. The following shows the variables at the top of the Capfile that need modification:

set :source_machine,     "admin1"
set :install_dir,        "/opt/hypertable"
set :hypertable_version, "0.9.7.0"
set :default_pkg,        "/tmp/hypertable-0.9.7.0-linux-x86_64.rpm"
set :default_dfs,        "hadoop"
set :default_distro,     "cdh3"
set :default_config,     "/root/hypertable.cfg"
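
After editing these variables it is easy to leave the version and the package path out of step. The Python sketch below reads `set :name, "value"` lines from Capfile text and checks that the package file name mentions the configured version. The rule that the two should agree is an assumption, not something the post states, and the helper names are hypothetical.

```python
import re

# Matches lines of the form:  set :name, "value"
SET_RE = re.compile(r'^\s*set\s+:(\w+),\s*"([^"]*)"')


def capfile_vars(text):
    """Collect `set :name, "value"` assignments from Capfile text."""
    matches = (SET_RE.match(line) for line in text.splitlines())
    return {m.group(1): m.group(2) for m in matches if m}


def pkg_matches_version(variables):
    """Assumed sanity check: the package file name mentions the version."""
    return variables["hypertable_version"] in variables["default_pkg"]


# Two of the lines shown above, used as sample input.
SAMPLE = '''
set :hypertable_version, "0.9.7.0"
set :default_pkg,        "/tmp/hypertable-0.9.7.0-linux-x86_64.rpm"
'''

if __name__ == "__main__":
    variables = capfile_vars(SAMPLE)
    print(variables["hypertable_version"], pkg_matches_version(variables))
```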

38#
OP | Posted 2013-6-8 12:57
Here's a brief description of each variable:

Table 2. Hypertable Capistrano Variables

39#
OP | Posted 2013-6-8 12:57
In addition to the above variables, you also need to define three roles, one for the machine that will run the master processes, one for the machines that will run the Hyperspace replicas, and one for the machines that will run the RangeServers. Edit the following lines:
role :source, "admin1"
role :master, "master"
role :hyperspace, "hyperspace001", "hyperspace002", "hyperspace003"
role :slave,  "slave001", "slave002", "slave003", "slave004", "slave005", "slave006", "slave007", "slave008"
role :localhost, "admin1"
role :thriftbroker_additional
role :spare
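
The role list also gives a quick way to revisit the MaxStartups prerequisite. The Python sketch below counts the distinct hosts named in `role` lines and compares that count with the default limit of 10 quoted earlier. Treating the host count as the number of simultaneous connections is a simplifying assumption, and the function names are hypothetical.

```python
import re

# A role line starts with:  role :name
ROLE_RE = re.compile(r'^\s*role\s+:(\w+)')


def cluster_hosts(capfile_text):
    """Return the set of distinct hosts named in `role` lines."""
    hosts = set()
    for line in capfile_text.splitlines():
        if ROLE_RE.match(line):
            # Every quoted string on a role line is a host name.
            hosts.update(re.findall(r'"([^"]+)"', line))
    return hosts


def needs_higher_max_startups(capfile_text, max_startups=10):
    """True if the cluster has more hosts than MaxStartups allows at once."""
    return len(cluster_hosts(capfile_text)) > max_startups


# The role lines from the post, used as sample input.
SAMPLE = '''
role :source, "admin1"
role :master, "master"
role :hyperspace, "hyperspace001", "hyperspace002", "hyperspace003"
role :slave,  "slave001", "slave002", "slave003", "slave004", "slave005", "slave006", "slave007", "slave008"
role :localhost, "admin1"
role :thriftbroker_additional
role :spare
'''

if __name__ == "__main__":
    print(len(cluster_hosts(SAMPLE)), needs_higher_max_startups(SAMPLE))
```

With the example roles this finds 13 distinct hosts, so the default limit of 10 would be too low for this cluster.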
40#
OP | Posted 2013-6-8 12:57
The following table describes each role.

Table 3. Hypertable Capistrano Roles
