Original poster: jieforest

Managing Schema Evolution in NoSQL Data Stores

#41 | Original poster | Posted 2013-9-27 12:53
7. Related Work

We define a NoSQL database programming language as an abstract interface for programming against NoSQL data stores. In recent work, the authors of [5] present a calculus for NoSQL systems together with its formal semantics. They introduce a Turing-complete language and its type system, whereas we present a much more restricted language with a focus on updates and schema evolution.

For relational databases, the importance of designing database programming languages for strong programmability, concerning both performance and usability, has been emphasized in [19].

The language presented there can express database operators, query plans, and also capture operations in the application logic. However, the work there is targeted at query execution in relational databases, while we cover aspects of data definition and data manipulation in NoSQL data stores. Moreover, we treat the data store itself as a black box, assuming that developers use a cloud-based database-as-a-service offering that they cannot manipulate.

All successful applications age with time [29], and eventually require maintenance or evolution. Typically, there are two alternatives for handling this problem on the level of the schema: schema versioning and schema evolution. Relational databases have an established language for schema evolution (“ALTER TABLE”). This schema definition language is part of the SQL standard, and is implemented by all available relational database systems.
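For readers less familiar with the relational side, here is a minimal sketch of an “ALTER TABLE” schema evolution step, run through Python's built-in sqlite3 module. It is only an illustration and not taken from the paper; the table `player` and the columns `name` and `score` are made-up names.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO player (name) VALUES ('Lisa')")

# Schema evolution step: add a column; existing rows read the default value.
conn.execute("ALTER TABLE player ADD COLUMN score INTEGER DEFAULT 0")

# Inspect the evolved schema and the existing row.
columns = [row[1] for row in conn.execute("PRAGMA table_info(player)")]
rows = conn.execute("SELECT id, name, score FROM player").fetchall()
print(columns)
print(rows)
```

The point of the sketch is that the database itself knows the schema and applies the change to every row at once, which is exactly what a schema-less data store does not offer out of the box.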

For evolving XML-based applications, research prototypes have been built that concentrate on the co-evolution of XML schemas and the associated XML documents [18]. The authors of [25] have developed a model-driven approach for XML schema design, and support co-evolution between different abstraction levels. A dedicated language for XML evolution is introduced in [26] that formalizes XML schema change operations and describes the corresponding updates of associated XML documents.

#42 | Original poster | Posted 2013-9-27 13:09
JSONiq is a fairly new query language for JSON documents; the first version was published in April 2013 [32]. Future versions of JSONiq will contain an update facility and will offer operations to add, delete, insert, rename, and replace properties and values.

Our schema evolution language can be translated into corresponding update expressions. If JSONiq establishes itself as a standard for querying and updating NoSQL data stores, we can also base our schema evolution method on this language.

The question whether an evolution is safe corresponds to the existence of (universal) solutions in data exchange. In particular, established practices from XML data exchange, using regular tree grammars to specify the source and the target schema [2], are highly relevant to our work. The use of object mappers translating objects from the application space into persisted entities can be seen as a form of schema specification.

This raises an interesting question: Provided that all entities conform to the class hierarchy specified by an object mapper, if we evolve entities, will they still work with our object mapper? This boils down to checking for absolute consistency in XML data exchange [2], and is a current topic in database theory (e.g. [6]). It is therefore part of our plans to see how we can leverage the latest research on XML data exchange for evolving data in schema-less data stores.

There are various object-relational mapping (ORM) frameworks fulfilling well-established standards such as the Java Persistence API (JPA), and supporting almost all relational database systems. Some ORM frameworks are even supported by NoSQL data stores, though without implementing all features, since joins and foreign keys are not supported by the backend (e.g. see the JPA and JDO implementations for Google Datastore [16, 17]).

#43 | Original poster | Posted 2013-9-27 13:10
So far, there are only a few dedicated mappers for persisting objects in NoSQL data stores (sometimes called object-data-store mappers (ODM)). Most of today’s ODMs are proprietary, supporting a particular NoSQL data store (e.g. Morphia [24] for MongoDB, or Objectify [28] for Google Datastore). Few systems support more than one NoSQL data store (e.g. Hibernate OGM [20]).

Today, these objects-to-NoSQL mapping tools have at best rudimentary support for schema evolution. To the best of our knowledge, Objectify and Morphia go the furthest by allowing developers to specify lazy migration in the form of object annotations. However, we could not yet find any solutions for systematically managing and expressing schema changes. At this point, the ecosystem of tools for maintaining NoSQL databases is still in its infancy.

8. Summary and Future Work

This work investigates the maintainability of feature-rich, interactive web applications, from the viewpoint of schema evolution. In particular, we target applications that are backed by schema-less document stores or extensible record stores.

This is an increasingly popular software stack, now that database-as-a-service offerings are readily available: The programming APIs are easy to use, there is next to no setup time required, and pricing is reasonable.

Another sweet spot of these systems is that the data’s schema does not have to be specified in advance. Developers may freely adapt the data’s structure as the application evolves. Despite utter freedom, the data nevertheless displays an implicit structure: The application class hierarchy is typically reflected in the persisted data, since object mappers perform the mundane task of marshalling data between the application and the data store.

#44 | Original poster | Posted 2013-9-27 13:11
As an application evolves, so does its schema. Yet schema-free NoSQL data stores do not yet come with convenient schema management tools. As of today, virtually all data migration tasks require custom programming (with the exception of very basic data inspection tools for manipulating single entities).

It is up to the developers to code the migration of their production data by hand, getting the data ready for the next software release. Worse yet, with weekly releases, the schema evolves just as frequently.

In this paper, we lay the foundation for systematically managing schema evolution in this setting. We define a declarative NoSQL schema evolution language, to be used in a NoSQL data store administration console.

Using our evolution language, developers can specify common operations, such as adding, deleting, or renaming properties in batch. Moreover, properties can be moved or copied, since data duplication and denormalization are fundamental in NoSQL data stores. We emphasize that we do not mean to enforce a relational schema onto NoSQL data stores. Rather, we want to ease the pain of schema evolution for application developers.
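To make the batch operations more concrete, here is a rough Python sketch over an in-memory list of entities. This is not the paper's language or its implementation: the function names, the `kind` field, and the exact semantics (for example, that `add_property` skips entities that already have the property) are assumptions made for illustration. The copy operation refuses to run when several source entities match one target, as one possible way to avoid a non-deterministic result.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]  # hypothetical entity layout: a dict with a "kind" field


def add_property(entities: List[Entity], kind: str, name: str, default: Any) -> None:
    """Add `name` with a default value to all entities of `kind` that lack it."""
    for e in entities:
        if e["kind"] == kind and name not in e:
            e[name] = default


def delete_property(entities: List[Entity], kind: str, name: str) -> None:
    """Remove `name` from all entities of `kind`."""
    for e in entities:
        if e["kind"] == kind:
            e.pop(name, None)


def rename_property(entities: List[Entity], kind: str, old: str, new: str) -> None:
    """Rename `old` to `new` (overwriting any existing `new`) for entities of `kind`."""
    for e in entities:
        if e["kind"] == kind and old in e:
            e[new] = e.pop(old)


def copy_property(entities: List[Entity], src_kind: str, name: str, dst_kind: str,
                  match: Callable[[Entity, Entity], bool]) -> None:
    """Copy `name` from matching source entities to target entities.

    Assumption: a copy is rejected as ambiguous when more than one source
    matches a target. All targets are checked before anything is written.
    """
    sources = [e for e in entities if e["kind"] == src_kind and name in e]
    plan = []
    for target in entities:
        if target["kind"] != dst_kind:
            continue
        candidates = [s for s in sources if match(s, target)]
        if len(candidates) > 1:
            raise ValueError("ambiguous copy: several sources match one target")
        if candidates:
            plan.append((target, candidates[0][name]))
    for target, value in plan:
        target[name] = value


# Usage with made-up data.
store = [
    {"kind": "Player", "id": 1, "name": "Lisa", "pts": 10},
    {"kind": "Player", "id": 2, "name": "Bart", "pts": 7},
    {"kind": "Mission", "id": 100, "pid": 1, "title": "go to work"},
]
add_property(store, "Player", "level", 1)
rename_property(store, "Player", "pts", "score")
delete_property(store, "Mission", "title")
copy_property(store, "Player", "name", "Mission", lambda p, m: p["id"] == m["pid"])
print(store)
```

Copying the player's name into the mission entity is the kind of denormalization the paragraph above refers to: the value is duplicated so that it can be read without a join.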

We regard it as one of our key contributions that our operations can be implemented for a large class of NoSQL data stores. We show this by an implementation in a generic NoSQL database programming language. We also discuss which operations can be applied safely, since non-deterministic migrations are unacceptable.

Future work. Our NoSQL schema evolution language specifies operations that are executed eagerly, on all qualifying entities. An alternative approach is to migrate entities lazily, the next time they are fetched into the application space. Some object mappers already provide such functionality.

We believe that lazy evolution is still little understood, and at the same time poses great risks when applied erroneously. We will investigate how our NoSQL schema evolution language may be implemented both safely and lazily. Ideally, a dedicated schema evolution management tool would allow developers to migrate data eagerly for leaps in schema evolution, and to patch things up lazily for minor changes.
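As a rough illustration of the lazy alternative, the following Python sketch upgrades an entity only when it is fetched. It is not the API of Objectify, Morphia, or any other mapper: the `LazyStore` class, the `_v` version field, and the migration functions are all hypothetical, and a real data store would need to write the upgraded entity back over the network rather than into a local dict.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]


def v1_to_v2(e: Entity) -> None:
    # Hypothetical step: rename "pts" to "score".
    e["score"] = e.pop("pts", 0)


def v2_to_v3(e: Entity) -> None:
    # Hypothetical step: add "level" with a default value.
    e.setdefault("level", 1)


# MIGRATIONS[i] upgrades an entity from version i+1 to version i+2.
MIGRATIONS: List[Callable[[Entity], None]] = [v1_to_v2, v2_to_v3]
LATEST = len(MIGRATIONS) + 1


class LazyStore:
    """Toy in-memory store that migrates entities on read."""

    def __init__(self) -> None:
        self._data: Dict[str, Entity] = {}

    def put(self, key: str, entity: Entity) -> None:
        self._data[key] = dict(entity)

    def raw(self, key: str) -> Entity:
        """Return a copy of the stored entity without migrating it."""
        return dict(self._data[key])

    def get(self, key: str) -> Entity:
        """Fetch an entity, applying all pending migration steps first."""
        entity = self._data[key]
        version = entity.get("_v", 1)  # entities without "_v" count as version 1
        while version < LATEST:
            MIGRATIONS[version - 1](entity)
            version += 1
        entity["_v"] = version  # the stored entity is upgraded in place
        return dict(entity)


lazy = LazyStore()
lazy.put("p1", {"name": "Lisa", "pts": 10})
lazy.put("p2", {"name": "Bart", "pts": 7})
loaded = lazy.get("p1")
print(loaded)
print(lazy.raw("p2"))
```

In this sketch the entity `p2` stays in its old shape until somebody reads it, so old and new versions coexist in the store. That coexistence is one reason why lazy evolution is harder to reason about than the eager variant.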

#45 | Original poster | Posted 2013-9-27 13:12
References

[1] Apache Cassandra, 2013. http://cassandra.apache.org/.
[2] M. Arenas, P. Barceló, L. Libkin, and F. Murlak. Relational and XML Data Exchange. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2010.
[3] J. Baker, C. Bond, J. C. Corbett, J. Furman, et al. “Megastore: Providing Scalable, Highly Available Storage for Interactive Services”. In Proc. CIDR, pages 223–234, 2011.
[4] Basho Technologies. riak/docs, 2013. http://docs.basho.com/riak/latest/.
[5] V. Benzaken, G. Castagna, K. Nguyen, and J. Siméon. “Static and dynamic semantics of NoSQL languages”. In Proc. POPL, pages 101–114, 2013.
[6] M. Bojańczyk, L. A. Kołodziejczyk, and F. Murlak. “Solutions in XML data exchange”. In Proc. ICDT, pages 102–113, 2011.
[7] M. Brown. Developing with Couchbase Server. O’Reilly, 2013.
[8] R. Cattell. “Scalable SQL and NoSQL data stores”. SIGMOD Record, 39(4):12–27, 2010.
[9] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, et al. “Bigtable: A Distributed Storage System for Structured Data”. In Proc. OSDI, pages 205–218, 2006.
[10] K. Chodorow. MongoDB: The Definitive Guide. O’Reilly, 2013.
[11] Couch Potato, 2010. https://github.com/langalex/couchpotato/issues/14.
[12] J. Dean and S. Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”. In Proc. OSDI, pages 137–150, 2004.
[13] L. George. HBase: The Definitive Guide. O’Reilly, 2011.
[14] Google Inc.

#46 | Original poster | Posted 2013-9-27 13:12
over.
