Original poster: jieforest

Evaluating Apache Cassandra as a Cloud Database

#11 | Original poster | Posted 2012-7-19 12:25
The Cloud Promises Lower Cost

Many IT professionals who begin looking at the cloud as an alternative to typical on-premise architectures assume that software will cost less in a cloud implementation.

However, they often get a rude awakening when managing data in the cloud fails to deliver those savings. The cold reality is that some traditional RDBMS providers are every bit as expensive in a cloud implementation as they are in a standard on-premise implementation.

When looking to implement a database in the cloud, IT professionals should seek a cost structure that is friendly to scaling out horizontally, regardless of machine size or the data volume being managed. Otherwise, there is a risk of unpleasant cost increases when the underlying business becomes very successful and more nodes are needed to manage ballooning data volumes and growing numbers of concurrent users.

#12 | Original poster | Posted 2012-7-20 15:08
Evaluating Apache Cassandra as a Cloud Database

So what is Apache Cassandra, and how does it stack up against the criteria for cloud databases discussed previously? What follows is an overview of Cassandra, including a description of the key technology differentiators that make it a stand-out cloud database. Also discussed is how DataStax Enterprise, a smart data platform powered by Cassandra, provides the best possible cloud database option for those needing to manage both real-time and analytic data in a cloud environment.

#13 | Original poster | Posted 2012-7-20 16:32
Why Cassandra?  

Key technical differentiators that make Cassandra a winning choice in a cloud computing environment include the following:  

•  A built-for-scale architecture that can handle petabytes of information and thousands of concurrent users/operations per second as easily as it can manage much smaller amounts of data and user traffic

•  Peer-to-peer design that offers no single point of failure for any database process or function; every node is the same, so there is no concept of a master node or anything similar

•  Online capacity additions that deliver linear performance gains for both read and write operations  

•  Read/write anywhere capabilities that equate to a true network-independent method of storing and accessing data  

•  Guaranteed data safety that ensures no loss of data, no matter which node in a cluster is written to

•  Tunable data consistency that allows Cassandra to offer data durability and protection comparable to an RDBMS, with the flexibility to relax consistency when the application use case allows

•  Flexible/dynamic schema design that accommodates all formats of big data applications, including structured, semi-structured, and unstructured data; data is represented in Cassandra via column families that are dynamic in nature and accommodate all modifications online

•  Simplified replication that provides data redundancy and can span multiple data centers and cloud environments

•  Data compression that reduces the footprint of raw big data by over 80 percent in some use cases  

•  A SQL-like language (CQL – Cassandra Query Language) that lessens the learning curve for developers and administrators coming from the RDBMS world  

•  Support for key developer languages (e.g., Java, Python) and operating systems  

•  No requirement for any special equipment; Cassandra runs on commodity hardware  

•  Very easy installations in cloud environments including Amazon Machine Images (AMIs) that enable a user to be up and running with a multiple-node cluster in minutes  
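
The tunable-consistency point in the list above can be made concrete with a little arithmetic. The sketch below is not Cassandra code; it only models the commonly cited rule that a read is guaranteed to overlap the most recent acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (RF). The function names are hypothetical and chosen purely for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Majority reads plus majority writes always overlap on at least one replica.
    print(q, reads_see_latest_write(q, q, rf))
    # Reading one replica and writing one replica trades that guarantee for speed.
    print(reads_see_latest_write(1, 1, rf))
```

Relaxing consistency, in these terms, means choosing smaller R or W for operations where a slightly stale read is acceptable.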

#14 | Original poster | Posted 2012-7-20 16:32
Cassandra is built with the assumption that failures can and will occur in a data center or cloud infrastructure. Therefore, data redundancy to protect against hardware failure and other data loss scenarios is built into and managed transparently by Cassandra. Furthermore, this capability can be configured so that big data applications can use a single large database distributed across multiple, geographically dispersed data centers, between different physical racks in a data center, and between public cloud providers and on-premise managed data centers.  
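
As a rough illustration of how per-data-center redundancy is typically declared, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy`, the replication strategy Cassandra provides for multi-data-center clusters. It assumes the CQL 3 syntax of later Cassandra releases rather than the tooling current when this was posted, and the keyspace name, data center names, and replica counts are hypothetical. The helper only produces a string; nothing is sent to a cluster.

```python
from typing import Dict


def create_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement with one replica count per data center."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, count in replicas_per_dc.items():
        if count < 1:
            raise ValueError("replica count for " + dc_name + " must be >= 1")
        options.append("'" + dc_name + "': " + str(count))
    body = ", ".join(options)
    return "CREATE KEYSPACE " + keyspace + " WITH replication = {" + body + "};"


if __name__ == "__main__":
    # Hypothetical layout: three replicas in one data center, two in another.
    print(create_keyspace_cql("demo", {"us_east": 3, "eu_west": 2}))
```

The data center names must match whatever names the cluster's snitch reports, which is why they are passed in rather than hard-coded.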

#15 | Original poster | Posted 2012-7-20 16:33
Figure 1: Cassandra multi-data center capabilities

#16 | Original poster | Posted 2012-7-22 10:17
These and other capabilities make Cassandra and DataStax Enterprise the smart choice for modern businesses with big data management needs that have outgrown traditional RDBMS software.

Netflix – An Example of Succeeding in the Cloud with Cassandra

With more than 25 million members worldwide, Netflix, Inc. (Nasdaq: NFLX) is the world's leading Internet subscription service for enjoying movies and TV shows. Netflix allows its members to instantly watch unlimited movies and TV episodes streaming over the Internet to computers and TVs.

Figure 2: Performance results from Netflix’s benchmark of Cassandra in the cloud

#17 | Original poster | Posted 2012-7-22 10:18
Cassandra and DataStax are a key part of Netflix’s database infrastructure, with everything being hosted in the cloud. Netflix gave a presentation at the 2011 High Performance Transaction Systems (HPTS) workshop that demonstrated both the ease of use and the linear performance of Cassandra in the cloud. The following is an excerpt from a Netflix blog post summarizing the presentation:

“The automated tooling that Netflix has developed lets us quickly deploy large scale Cassandra clusters, in this case a few clicks on a web page and about an hour to go from nothing to a very large Cassandra cluster consisting of 288 medium sized instances, with 96 instances in each of three EC2 availability zones in the US-East region. Using an additional 60 instances as clients running the stress program we ran a workload of 1.1 million client writes per second. Data was automatically replicated across all three zones making a total of 3.3 million writes per second across the cluster.”

The Netflix benchmark illustrates Cassandra's linear performance well, with the cluster sustaining a very impressive 1.1 million client writes per second. The ease with which Cassandra nodes can be configured and deployed in the cloud is also clear.
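
The figures in the quoted passage can be checked with simple arithmetic. The sketch below takes its inputs from the quote (288 instances, 1.1 million client writes per second, one copy of each write in each of three zones) and derives the cluster-wide and per-instance write rates. The per-instance value is a derived average, not a number reported in the text, and the function names are hypothetical.

```python
def replicated_write_rate(client_writes_per_sec: int, copies: int) -> int:
    """Total writes landing on the cluster once each write is copied."""
    return client_writes_per_sec * copies


def average_per_node(total_writes_per_sec: int, node_count: int) -> float:
    """Mean write rate each node absorbs, assuming an even spread."""
    return total_writes_per_sec / node_count


if __name__ == "__main__":
    nodes = 288                       # from the quote
    client_writes = 1_100_000         # from the quote
    zones = 3                         # one replica per availability zone

    total = replicated_write_rate(client_writes, zones)
    print(total)                              # cluster-wide replica writes per second
    print(nodes // zones)                     # instances per zone
    print(round(average_per_node(total, nodes)))  # average writes per second per instance
```

With these inputs the total comes to 3.3 million writes per second, matching the quote, and the average load works out to roughly 11,500 writes per second per instance.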

#18 | Original poster | Posted 2012-7-22 10:19
DataStax Enterprise – Certified Cassandra for Production Applications

Cassandra is a top-level open source project of the Apache Software Foundation and enjoys strong community support and developer involvement. New community releases and patches are produced very quickly, with the understanding that community builds are not put through an enterprise-style quality assurance process and often contain a mixture of enhancements and bug fixes.

By contrast, DataStax Enterprise contains only selected Cassandra releases chosen by the expert staff and committers at DataStax. Each selected release is then put through a rigorous certification process designed by DataStax engineers and QA staff to ensure it is stable and ready for enterprise production systems. Any issues found are immediately fixed, and the fixes are applied to the DataStax Enterprise server.

In addition, DataStax provides enterprises with predictable, certified quarterly service pack updates as well as other software benefits such as emergency hot fixes (for production outages) and bug escalation privileges that prioritize customers’ issues over community-submitted bugs.

#19 | Original poster | Posted 2012-7-23 00:45
DataStax Enterprise – Real-Time, Analytics, and Search in the Cloud

DataStax is the leading provider of enterprise NoSQL software products and services based on Apache Cassandra.

Through its offerings, DataStax supports businesses that need a progressive data management system that can serve as a real-time datastore for critical production applications while delivering built-in analytic and search capabilities for that data once it is in Cassandra.

DataStax Enterprise inherits Cassandra’s entire feature set for servicing modern real-time applications and builds on it a fault-tolerant analytics and enterprise search platform, providing Hadoop MapReduce, Hive, and Pig support for analytics and using Apache Solr for fast enterprise search.

#20 | Original poster | Posted 2012-7-23 00:46
Solving the Cloud Mixed-Workload Problem

A primary benefit that DataStax Enterprise provides to enterprises needing smart big data management capabilities is its ability to service real-time, analytic, and enterprise search operations in the same database cluster without any one workload impacting the others. The key to making this possible is the underlying architecture of Cassandra.
