Thread starter: jieforest

NoSQL Database Performance Evaluation

#11 | Thread starter | Posted 2014-8-26 22:50
Load phase

During the first stage of the test, the load phase, we loaded 100,000,000 records of 1 KB each into every data store. YCSB measured the average throughput in operations per second and the average operation latency in milliseconds. The next diagram displays the results of the load phase:


Figure 2: The results of the load phase. Source: Altoros
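
As a rough illustration of the two metrics reported for the load phase, the sketch below derives average throughput and average latency from per-operation timings. The function name `summarize` and the sample numbers are made up for illustration; this is not YCSB's own code.

```python
from statistics import mean


def summarize(latencies_ms, elapsed_s):
    """Return (throughput in ops/sec, average latency in ms).

    latencies_ms: one latency sample per completed operation (hypothetical data)
    elapsed_s:    wall-clock duration of the whole run in seconds
    """
    if not latencies_ms or elapsed_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    throughput = len(latencies_ms) / elapsed_s
    return throughput, mean(latencies_ms)


# Made-up sample: 8 operations completed within 0.004 s of wall-clock time.
samples_ms = [0.5, 0.7, 0.6, 0.9, 0.4, 0.8, 0.6, 0.5]
ops_per_sec, avg_ms = summarize(samples_ms, elapsed_s=0.004)
print(f"{ops_per_sec:.0f} ops/sec, {avg_ms:.3f} ms average latency")
```

The two numbers are measured independently. When several client threads issue requests in parallel, throughput is not simply the inverse of latency, which is why both are reported.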

#12 | Thread starter | Posted 2014-8-27 22:48
HBase demonstrated the lowest throughput, probably because we turned on autoflush mode. In this mode, each operation that creates a record is sent from the client to the server immediately and then persisted to the database. HBase also supports an alternative mode that uses an additional cache (a write buffer) on the client side. When the client-side cache is full, the client sends the cached data to the server. In this alternative mode, HBase saves data to disk in batches.
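
To make the difference between the two modes concrete, here is a minimal simulation that counts client-to-server round trips. `FakeServer` and `BufferedWriter` are hypothetical stand-ins written for this sketch and are not the HBase client API. The buffer size of 4 rows is an arbitrary assumption.

```python
class FakeServer:
    """Stand-in for a database server; counts client-to-server round trips."""

    def __init__(self):
        self.round_trips = 0
        self.rows = []

    def put_batch(self, rows):
        self.round_trips += 1
        self.rows.extend(rows)


class BufferedWriter:
    """Hypothetical client-side writer, not the HBase client API.

    autoflush=True  -> every put is sent immediately (one round trip each)
    autoflush=False -> puts collect in a client buffer and are sent
                       once the buffer holds buffer_size rows
    """

    def __init__(self, server, autoflush=True, buffer_size=4):
        self.server = server
        self.autoflush = autoflush
        self.buffer_size = buffer_size
        self.buffer = []

    def put(self, row):
        self.buffer.append(row)
        if self.autoflush or len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.server.put_batch(self.buffer)
            self.buffer = []


def round_trips_for(autoflush, n_rows=10):
    server = FakeServer()
    writer = BufferedWriter(server, autoflush=autoflush)
    for i in range(n_rows):
        writer.put(("user%d" % i, "x" * 8))
    writer.flush()  # send whatever is still buffered
    assert len(server.rows) == n_rows
    return server.round_trips


print(round_trips_for(autoflush=True))   # 10
print(round_trips_for(autoflush=False))  # 3
```

In this toy model, buffering cuts ten round trips to three. The trade-off is that rows still sitting in the client buffer have not reached the server yet.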

As we expected, Cassandra demonstrated excellent results, with almost 18,000 operations per second. This is due to Cassandra’s architecture: it updates data in memory and, at the same time, writes it to the transaction journal (the commit log) on disk. This ensures that the data survives a node crash.
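
The idea of pairing an in-memory update with a journal on disk can be sketched with a toy key-value store. This is only an illustration of the general write-ahead pattern, not Cassandra's implementation: the class `TinyStore`, the JSON log format, the log-then-memory ordering, and the `fsync` on every write are all assumptions of this sketch.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store: append to a log on disk, then update memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # After a restart, rebuild the in-memory state from the log.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                key, value = json.loads(line)
                self.memtable[key] = value

    def put(self, key, value):
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # force the entry to disk (sketch's choice)
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "commit.log")
    store = TinyStore(path)
    store.put("user1", "alice")
    store.put("user1", "bob")
    store.close()  # simulate a crash: the in-memory contents are gone

    recovered = TinyStore(path)  # a fresh instance replays the log
    recovered_value = recovered.get("user1")
    recovered.close()

print(recovered_value)  # bob
```

Appending to a log is a sequential write, which is one reason this pattern tends to handle write-heavy loads well.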

MongoDB’s throughput was close to that of HBase: about 13,000 operations per second, with an average latency of around seven milliseconds.

In this particular test, all data was loaded in a single iteration, but insert, update, read, and scan operations in the transaction phase of the test were performed in five iterations for each workload (for every database).
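
For a sense of scale, a back-of-the-envelope calculation shows how long a single load iteration takes at the throughputs quoted above. The rates are rounded from the text, the record size is assumed to be 1,000 bytes, and replication and storage overhead are ignored.

```python
RECORDS = 100_000_000
RECORD_BYTES = 1_000  # assumption: "1 KB" taken as 1,000 bytes


def load_hours(ops_per_sec, records=RECORDS):
    """Hours needed to insert `records` rows at a steady average throughput."""
    return records / ops_per_sec / 3600


raw_gb = RECORDS * RECORD_BYTES / 1e9
print(f"Raw data set: about {raw_gb:.0f} GB")

for name, rate in [("Cassandra", 18_000), ("MongoDB", 13_000)]:
    print(f"{name}: about {load_hours(rate):.1f} hours at {rate:,} ops/sec")
```

At these rates the load phase alone runs for roughly one and a half to two hours per database.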

#13 | Thread starter | Posted 2014-8-27 22:49
Note that many of the diagrams with test results show that database performance is limited and starts to decline at a certain throughput level. We also ran the tests on Amazon AWS with network storage, which could have influenced the results.
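
One textbook way to picture why latency climbs as the offered load approaches a system's limit is the single-server M/M/1 queue, where the average time in the system is W = 1 / (mu - lambda). The sketch below uses that formula purely as an illustration. It is not a model of any of the tested databases, and the capacity of 2,000 operations per second is a hypothetical number.

```python
def mm1_latency_ms(offered_ops_per_sec, capacity_ops_per_sec):
    """Average time in system for an M/M/1 queue, in milliseconds.

    W = 1 / (mu - lambda), valid only while lambda < mu.
    """
    if offered_ops_per_sec >= capacity_ops_per_sec:
        return float("inf")  # the queue grows without bound
    return 1000.0 / (capacity_ops_per_sec - offered_ops_per_sec)


CAPACITY = 2_000  # hypothetical server capacity in ops/sec
for load in (100, 1_000, 1_500, 1_900, 1_990):
    print(f"{load:>5} ops/sec -> {mm1_latency_ms(load, CAPACITY):6.2f} ms")
```

In this model latency stays low over most of the range and then rises sharply close to capacity, so a small increase in target throughput near the limit can produce a large jump in latency.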

Workload A

Workload A consists of read and update operations in a 50/50 ratio. It simulates an e-commerce application. The next diagram shows the results of update operations.


Figure 3: The results of update operations in Workload A. Source: Altoros
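
A 50/50 mix like the one in Workload A can be pictured as a random stream of operations. The generator below is a minimal sketch under that assumption. The name `operation_stream` and the seed are invented for illustration, and YCSB's own operation chooser may work differently.

```python
import random
from collections import Counter


def operation_stream(read_proportion, n_ops, seed=0):
    """Yield 'read' or 'update' so that reads make up about read_proportion."""
    rng = random.Random(seed)
    for _ in range(n_ops):
        yield "read" if rng.random() < read_proportion else "update"


counts = Counter(operation_stream(read_proportion=0.5, n_ops=10_000))
print(counts["read"], counts["update"])
```

Because each operation is drawn at random, the observed split is close to the configured ratio but not exact. Changing `read_proportion` models other read/update mixes.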

#14 | Thread starter | Posted 2014-8-27 22:49
Cassandra and HBase demonstrated good performance, with latency below 20 milliseconds.

MongoDB’s latency increased substantially as we increased the workload. At 100 operations per second, all four databases had similar performance. But when the workload reached 1,500 operations per second, MongoDB’s latency increased to 100 milliseconds.

The next diagram shows results of read operations in Workload A.

#15 | Thread starter | Posted 2014-8-27 22:50
Figure 4: The results of read operations in Workload A. Source: Altoros

The graphs look very different from one another because reads and updates were randomly distributed. Even so, the results for read operations in Workload A were broadly similar across all the tested solutions: the difference in latencies was insignificant, within a range of 15-30 milliseconds.

Workload B

Workload B is read-mostly, with 95% reads and only 5% updates. It simulates content tagging, where adding a tag is an update but most other transactions are reads. Here are the results for update operations in Workload B.

#16 | Thread starter | Posted 2014-8-27 22:51
Figure 5: The results of update operations in Workload B. Source: Altoros

Cassandra demonstrates very low latency, but its throughput is limited to 1,200 operations per second. With HBase, the latency increases evenly as the workload grows. MongoDB behaves much as it did in the previous test, where the latency increased together with the throughput.

The next diagram shows the results of read operations, which make up 95% of Workload B.
