Original poster: jieforest

Hypertable HQL Guide

#51 | Original poster | Posted 2013-7-2 01:16
If we were to load this file with LOAD DATA INFILE into the counts table, a subsequent select would yield the following output:
  hypertable> select * from counts;
  org.hypertable.www/                     url:2010-10-26_09       3
  org.hypertable.www/                     url:2010-10-26_10       3
  org.hypertable.www/about.html           url:2010-10-26_09       1
  org.hypertable.www/about.html           url:2010-10-26_10       1
  org.hypertable.www/documentation.html   url:2010-10-26_09       1
  org.hypertable.www/documentation.html   url:2010-10-26_10       1
  org.hypertable.www/download.html        url:2010-10-26_09       2
  org.hypertable.www/download.html        url:2010-10-26_10       2

#52 | Original poster | Posted 2013-7-2 01:17
GROUP COMMIT

Updates are carried out by the RangeServers through the following steps:

1. Write the update to the commit log (in the DFS)
2. Sync the commit log (in the DFS)
3. Populate the in-memory data structure with the update

Under high concurrency, step #2 can become a bottleneck. Distributed filesystems such as HDFS can typically handle only a small number of sync operations per second. The Group Commit feature solves this problem by delaying updates, grouping them together, and carrying them out as a batch at a regular interval.
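The effect of batching on the number of sync operations can be sketched in a few lines of Python. This is a toy model, not Hypertable code: the function name `count_syncs` and the workload of one update per millisecond are assumptions made for illustration. It only counts how many commit-log syncs would be needed with and without grouping.

```python
def count_syncs(arrival_ms, interval_ms=None):
    """Count commit-log syncs for updates arriving at the given times (ms).

    interval_ms=None models the plain path: every update gets its own sync.
    Otherwise all updates that fall into the same interval window share one sync.
    """
    if interval_ms is None:
        return len(arrival_ms)
    # Each distinct window index corresponds to one batched sync.
    windows = {t // interval_ms for t in arrival_ms}
    return len(windows)


# Hypothetical workload: 1,000 updates, one per millisecond.
arrivals = list(range(1000))
print(count_syncs(arrivals))                   # one sync per update
print(count_syncs(arrivals, interval_ms=100))  # one sync per 100 ms window
```

The trade-off is latency: an update may wait up to one interval before it is synced and acknowledged.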

#53 | Original poster | Posted 2013-7-2 01:17
A table can be configured to use group commit by supplying the GROUP_COMMIT_INTERVAL option in the CREATE TABLE statement. The option tells the system to carry out updates to this table with group commit and specifies the commit interval in milliseconds. The interval is constrained by the config property Hypertable.RangeServer.CommitInterval, which acts as a lower bound (default 50 ms). The value given for GROUP_COMMIT_INTERVAL is rounded up to the nearest multiple of this property value. The following example CREATE TABLE statement creates a table named counts that is set up for group commit.

Example
  hypertable> CREATE TABLE counts (
    url,
    domain
  ) GROUP_COMMIT_INTERVAL=100;
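The rounding rule for the interval can be worked through with a small Python sketch. The helper name `effective_interval` is hypothetical, and the code only restates the rule given in the text (round up to the nearest multiple of Hypertable.RangeServer.CommitInterval, default 50 ms). It is not taken from the Hypertable source.

```python
def effective_interval(requested_ms, commit_interval_ms=50):
    """Round requested_ms up to the nearest multiple of commit_interval_ms."""
    if requested_ms <= 0 or commit_interval_ms <= 0:
        raise ValueError("intervals must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    multiples = (requested_ms + commit_interval_ms - 1) // commit_interval_ms
    return multiples * commit_interval_ms


print(effective_interval(100))  # already a multiple of 50
print(effective_interval(120))  # rounded up to the next multiple
print(effective_interval(10))   # below the lower bound
```

Under this rule, the GROUP_COMMIT_INTERVAL=100 in the example is used unchanged with the default property value, while a request of 120 would become 150 and a request of 10 would become 50.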

#54 | Original poster | Posted 2013-7-2 01:18
UNIQUE CELLS

Unique cells can be used whenever an application wants to make sure that there can never be more than one cell value in a column family. Unique cells are useful, for example, for assigning product IDs, user IDs, etc. Traditional SQL databases offer auto-incrementing columns, but an auto-incrementing column would be relatively slow in a distributed database. Hypertable's support for unique cells is therefore a bit different.

First, the column family needs to be set up to store only the oldest value:

  CREATE TABLE profile ('guid' MAX_VERSIONS 1 TIME_ORDER DESC, ...)

To insert values, create a mutator and write the unique cell to the database. Then create a scanner, fetch the cell and verify that it was written correctly. If the scanner returns the same value then the update was fine. Otherwise the cell already existed with a different value.
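The write-then-verify pattern described above can be sketched in Python against an in-memory stand-in. The class `OldestValueStore` and the function `try_claim_cell` are hypothetical names. They model a column family that keeps only the oldest value and are not part of any Hypertable client API.

```python
class OldestValueStore:
    """In-memory stand-in for a column family that keeps only the oldest value."""

    def __init__(self):
        self._cells = {}

    def write(self, key, value):
        # setdefault leaves an existing value untouched, so the first write wins.
        self._cells.setdefault(key, value)

    def read(self, key):
        return self._cells.get(key)


def try_claim_cell(store, key, value):
    """Write the value, read it back, and report whether our value was kept."""
    store.write(key, value)
    return store.read(key) == value


store = OldestValueStore()
print(try_claim_cell(store, "user:alice", "id-1"))  # first writer
print(try_claim_cell(store, "user:alice", "id-2"))  # cell already taken
```

The second call reports failure because the read-back returns the older value, which is the signal that the cell already existed with a different value.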

Since this process is a bit cumbersome, we introduced the HyperAppHelper library. It exports the following C++ function, which requires a TablePtr and a KeySpec as parameters. If the guid parameter is empty, Hypertable will fill it with a 128-bit GUID:

void create_cell_unique(const TablePtr &table, const KeySpec &key, String &guid);

#55 | Original poster | Posted 2013-7-2 01:18
This function can also be used through the Thrift interface. Here's a PHP snippet from the microblogging example. If the last parameter, $guid, is an empty string, a new GUID is created and returned.

self::$_client->create_cell_unique(self::$_namespace, $table, $key, $guid);

The newly created GUID will look similar to this one:

d7f8350a-777b-42b8-9967-e0cdc0dd1545
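The sample value has the textual layout of an RFC 4122 UUID (128 bits, written as 32 hex digits in five dash-separated groups). A short Python sketch using only the standard `uuid` module parses it and generates a value of the same shape. How Hypertable generates its GUIDs internally is not stated here, so the version check below describes only this one sample.

```python
import uuid

# Parse the sample GUID from the post.
sample = uuid.UUID("d7f8350a-777b-42b8-9967-e0cdc0dd1545")
print(len(sample.bytes) * 8)  # size in bits
print(sample.version)         # UUID version encoded in the sample

# Generate a fresh random UUID with the same textual layout.
fresh = str(uuid.uuid4())
print(len(fresh), fresh.count("-"))
```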

#56 | Original poster | Posted 2013-7-3 13:05
SECONDARY INDICES

Tables can have one or more indices, each indexing a single column family. Two types of indices exist: a cell value index, which optimizes scans on a single column family that do an exact match or prefix match of the cell value, and a qualifier index, which optimizes scans on a single column family that do an exact match or prefix match of the column qualifier. The use of indices is optional.

The indices are stored in an index table which is created in the same namespace as the primary table and has the same name with one (cell value index) or two (qualifier index) caret signs (`^`) as a prefix.
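The naming rule for index tables can be written out as a tiny Python helper. The function name `index_table_names` and the dictionary keys are hypothetical. Only the caret prefixes come from the description above.

```python
def index_table_names(table):
    """Return the index table names implied by the caret-prefix rule."""
    return {
        "cell_value_index": "^" + table,   # one caret
        "qualifier_index": "^^" + table,   # two carets
    }


print(index_table_names("foo"))
```

For a primary table named foo, this gives ^foo for the cell value index and ^^foo for the qualifier index, both in the same namespace as foo.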

A column family can have both types of indices (cell value index and qualifier index) at the same time.  The following HQL command creates a table with three column families (a, b and c).  Column family a has a cell value index, column family b has a qualifier index and c has both.
  CREATE TABLE foo (
      a,
      b,
      c,
      INDEX a,
      QUALIFIER INDEX b,
      INDEX c,
      QUALIFIER INDEX c
  );

#57 | Original poster | Posted 2013-7-3 13:06
Indices speed up some queries that match on column families. Accessing indexed columns is nearly as fast as accessing them by row key. On the downside, indices require additional disk storage and add a very small performance cost when inserting data into an indexed column.

Cell value indices are used when selecting cells by value (SELECT a FROM TABLE t WHERE a = "cell-value" ...) or by a value prefix (SELECT a FROM TABLE t WHERE a =^ "cell-prefix" ...).

Qualifier indices are used when selecting cells from a qualified column (SELECT a:foo FROM TABLE t ...) or selecting all cells with a qualifier prefix (SELECT a:^prefix FROM TABLE t ...).
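Why exact and prefix matches are the cases an index helps with can be illustrated with a sorted structure in Python. This is a conceptual sketch only: the class `ValueIndex` and its methods are hypothetical and do not reflect how Hypertable stores its index tables. It shows that in a sorted index all entries sharing a value or a prefix sit next to each other, so a lookup is one binary search plus a short scan.

```python
import bisect


class ValueIndex:
    """Toy cell value index: a sorted list of (value, row_key) pairs."""

    def __init__(self):
        self._entries = []

    def add(self, value, row_key):
        bisect.insort(self._entries, (value, row_key))

    def _scan(self, start_value, matches):
        # (start_value,) sorts before every (start_value, row_key) pair,
        # so bisect_left lands on the first candidate entry.
        start = bisect.bisect_left(self._entries, (start_value,))
        rows = []
        for value, row_key in self._entries[start:]:
            if not matches(value):
                break
            rows.append(row_key)
        return rows

    def exact(self, value):
        return self._scan(value, lambda v: v == value)

    def prefix(self, prefix):
        return self._scan(prefix, lambda v: v.startswith(prefix))


index = ValueIndex()
for value, row in [("apple", "r1"), ("apricot", "r2"), ("banana", "r3"), ("apple", "r4")]:
    index.add(value, row)

print(index.exact("apple"))
print(index.prefix("ap"))
```

A qualifier index can be pictured the same way, with column qualifiers in place of cell values.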

For more information see the documentation of CREATE TABLE and a blog post about secondary indices.

#58 | Original poster | Posted 2013-7-3 13:06
over.
