Original poster: jieforest

The Evolution of Schema Management in NoSQL Data Stores

#21 | Original poster | Posted 2013-9-22 09:19
3. A NoSQL Schema Evolution Language

In schema-less NoSQL data stores, there is no explicit, global schema. Yet when we are building feature-rich, interactive web applications on top of NoSQL data stores, entities actually do display an implicit structure (or schema); this structure manifests in the entity kind and entity property names.

This especially holds when object mappers take over the mundane task of marshalling objects from the application space into persisted entities, and back. These object mappers commonly map class names to entity kinds, and class members to entity properties. (We discuss object mappers further in the context of related work in Section 7.)
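To make the mapping concrete, here is a minimal Python sketch of what such an object mapper might do. The `Blogpost` class, its fields, and the helpers `to_entity` and `from_entity` are hypothetical names chosen for illustration; they do not belong to any particular mapper library.

```python
from dataclasses import dataclass, asdict


@dataclass
class Blogpost:
    # Hypothetical application class; the field names are illustrative only.
    title: str
    text: str


def to_entity(obj):
    """Marshal a dataclass instance into a (kind, properties) pair."""
    kind = type(obj).__name__.lower()  # class name -> entity kind
    properties = asdict(obj)           # class members -> entity properties
    return kind, properties


def from_entity(cls, properties):
    """Rebuild an application object from persisted properties."""
    return cls(**properties)


kind, props = to_entity(Blogpost(title="NoSQL", text="Hello"))
restored = from_entity(Blogpost, props)
```

The implicit schema is visible here: the entity kind and the property names are derived entirely from the class definition, so changing the class changes the structure of newly persisted entities.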

Thus, there is a large class of applications that use NoSQL data stores, where the data is somewhat consistently structured, but has no fixed schema in the relational sense. Moreover, in an agile setting, these are applications that evolve rapidly, both in their features and their data.

Under these assumptions, we now define a compact set of declarative schema migration operations that have been inspired by schema evolution in relational databases and by update operations for semi-structured data [31]. While we can only argue empirically, having read through discussions in various developer forums, we are confident that these operations cover a large share of the common schema migration tasks.


#22 | Original poster | Posted 2013-9-23 15:31
Figure 2 shows the syntax of our NoSQL schema evolution language in Extended Backus-Naur Form (EBNF). An evolution operation adds, deletes, or renames properties. Properties can also be moved or copied. Operations may contain conditionals, even joins.

The property kinds (derived from kname) and the property names (pname) are the terminals in this grammar. We will formally specify the semantics for our operations in Section 5. For now, we discuss some examples to develop an intuition for this language.
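Figure 2 itself is not reproduced in this thread. As a rough illustration only, the following Python sketch recognizes the statement forms that appear in the examples of this section (add, delete, rename, move, copy, each with an optional where clause). The regular expression and the `parse` helper are assumptions made for this sketch, and they are deliberately looser than a real EBNF grammar would be.

```python
import re

# Loose pattern for "<op> <kind>.<property>" followed by an optional
# "= <default>" (add) or "to <target>" (rename, move, copy).
HEAD = re.compile(
    r"^(?P<op>add|delete|rename|move|copy)\s+"
    r"(?P<kind>\w+)\.(?P<prop>\w+)"
    r"(?:\s*=\s*(?P<default>\S+)|\s+to\s+(?P<target>\w+))?$"
)


def parse(statement):
    """Split a statement into its operation, operands, and where clause."""
    head, _, condition = statement.partition(" where ")
    match = HEAD.match(head.strip())
    if match is None:
        raise ValueError(f"cannot parse: {statement!r}")
    parsed = {k: v for k, v in match.groupdict().items() if v is not None}
    if condition:
        parsed["where"] = condition.strip()  # kept as raw text in this sketch
    return parsed


parsed_add = parse("add blogpost.likes = 0")
parsed_move = parse("move user.url to blogpost where user.name = blogpost.author")
```

The where clause is kept as an unparsed string here; evaluating conditionals and joins would need a proper grammar.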

We introduce a special-purpose numeric property “version” for all entities. The version is incremented each time an entity is processed by an evolution operator. This allows us to manage heterogeneous entities of the same kind. This is an established development practice in entity evolution.

We begin with operations that affect all entities of one kind:

1) The add operation adds a property to all entities of a given kind. A default value may be specified (see Example 2).

2) The delete operation removes a property from all entities of a given kind (see Example 3).

#23 | Original poster | Posted 2013-9-23 15:32
3) The rename operation changes the name of a property for all entities of a given kind (see Example 4).

Example 2. Below, we show an entity from our blogpost example before and after applying operation add blogpost.likes = 0. This adds a likes-counter to all blogposts, initialized to zero. We chose a compact tabular representation of entities and their properties.


Figure 3. Moving property “url” (cf. Example 5).


#24 | Original poster | Posted 2013-9-23 15:34
Example 3. The operation delete blogpost.url deletes the property “url” from all blogposts.

#25 | Original poster | Posted 2013-9-23 15:35
We can also specify a selection predicate. For instance, delete blogpost.url where blogpost.version = 1 deletes the url-property only from those blogposts that are at schema version 1.

Example 4. rename blogpost.text to content renames the property “text” to “content” for all blogpost entities.
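The three single-kind operations and the version counter can be sketched in Python as follows. This is only an illustration under stated assumptions: entities are plain dictionaries that carry their kind and version as ordinary keys, the helper names `add_property`, `delete_property`, and `rename_property` are made up for this sketch, and the sample property values are hypothetical. The formal semantics are the subject of Section 5.

```python
def _matching(entities, kind, where):
    """Yield entities of the given kind that satisfy the optional predicate."""
    for entity in entities:
        if entity["kind"] == kind and (where is None or where(entity)):
            yield entity


def add_property(entities, kind, prop, default=None, where=None):
    for entity in _matching(entities, kind, where):
        entity[prop] = default      # assumption: an existing value is overwritten
        entity["version"] += 1      # every processed entity gets a new version


def delete_property(entities, kind, prop, where=None):
    for entity in _matching(entities, kind, where):
        entity.pop(prop, None)
        entity["version"] += 1


def rename_property(entities, kind, old, new, where=None):
    for entity in _matching(entities, kind, where):
        if old in entity:
            entity[new] = entity.pop(old)
        entity["version"] += 1


# Hypothetical sample data: two blogposts at different schema versions.
posts = [
    {"kind": "blogpost", "version": 1, "text": "Hi", "url": "http://example.org/1"},
    {"kind": "blogpost", "version": 2, "text": "Ho", "url": "http://example.org/2"},
]

# delete blogpost.url where blogpost.version = 1
delete_property(posts, "blogpost", "url", where=lambda e: e["version"] == 1)
# rename blogpost.text to content
rename_property(posts, "blogpost", "text", "content")
# add blogpost.likes = 0
add_property(posts, "blogpost", "likes", default=0)
```

Note that the order of operations matters once predicates refer to the version: the selective delete runs first here, because every later operation increments the version of the entities it touches.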

#26 | Original poster | Posted 2013-9-23 15:35
We define further operations that affect two kinds of entities. Such migration operations are not available in schema definition languages for relational databases. Yet since NoSQL data stores typically do not support joins, denormalization is a technique heavily relied upon.

When building interactive web applications, responsiveness is key, which usually forbids programmatic joins in the application. Instead, one would reorganize the data so that joins become unnecessary. Thus, duplication and denormalization are first-class citizens when building applications on top of NoSQL data stores. Accordingly, we introduce dedicated operations for supporting these schema refactorings.

1) The move operation moves a property from one entity-kind to another entity-kind (see Example 5).

2) The copy operation copies a property from one entity-kind to another entity-kind (see Example 6).

Of course, in moving and copying we also compute joins. Yet this is done in offline batch processing, and not during time-critical interactions with users.

#27 | Original poster | Posted 2013-9-24 12:20
Example 5. To move the property “url” from users to all their blogposts, we specify the operation move user.url to blogpost where user.name = blogpost.author. Figure 3 shows its application to a blog by user Gerhard.

Example 6. The next example shows the copy operation: The property “email” is copied from users to all their blogposts: copy user.email to blogpost where user.name = blogpost.author. Figure 4 shows its application to a blog by user Gerhard. The copy operation does not change the user entities.  
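Since Figures 3 and 4 are not reproduced in this thread, the following Python sketch may help to picture the two operations. It is an illustration under assumptions, not the formal semantics: entities are plain dictionaries, the join is a naive nested loop, the helper names are made up, the property values for Gerhard are hypothetical, and move is assumed to remove the property only from source entities that found at least one join partner.

```python
def _transfer(entities, src_kind, prop, dst_kind, join, remove_from_source):
    sources = [e for e in entities if e["kind"] == src_kind and prop in e]
    targets = [e for e in entities if e["kind"] == dst_kind]
    for src in sources:
        # Naive nested-loop join; acceptable for offline batch processing.
        matched = [dst for dst in targets if join(src, dst)]
        for dst in matched:
            dst[prop] = src[prop]
            dst["version"] += 1
        if matched and remove_from_source:
            del src[prop]
            src["version"] += 1


def copy_property(entities, src_kind, prop, dst_kind, join):
    """copy src_kind.prop to dst_kind where join(src, dst) holds."""
    _transfer(entities, src_kind, prop, dst_kind, join, remove_from_source=False)


def move_property(entities, src_kind, prop, dst_kind, join):
    """move src_kind.prop to dst_kind where join(src, dst) holds."""
    _transfer(entities, src_kind, prop, dst_kind, join, remove_from_source=True)


def by_author(user, post):
    # where user.name = blogpost.author
    return user["name"] == post["author"]


# Hypothetical sample data.
entities = [
    {"kind": "user", "version": 1, "name": "Gerhard",
     "url": "http://example.org/gerhard", "email": "gerhard@example.org"},
    {"kind": "blogpost", "version": 1, "author": "Gerhard", "text": "First post"},
    {"kind": "blogpost", "version": 1, "author": "Someone else", "text": "Other post"},
]

copy_property(entities, "user", "email", "blogpost", by_author)
move_property(entities, "user", "url", "blogpost", by_author)
```

After both calls, Gerhard's blogpost carries “email” and “url”, the user entity still has “email” but no longer “url”, and the unrelated blogpost is untouched.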

Section 5 formalizes the semantics and investigates the effort of our migration operations. As a prerequisite, we next introduce a generic NoSQL database programming language.

#28 | Original poster | Posted 2013-9-24 12:21
4. A NoSQL Database Programming Language

Relational databases come with a query language capable of joins, as well as dedicated data definition and data manipulation languages. Yet in programming against NoSQL data stores, the application logic needs to take over some of these responsibilities.

We now define the typical operations on entities in NoSQL data stores, building a purposeful NoSQL database programming language. Our language is particularly modeled after the interfaces to Google Datastore [15], and is applicable to document stores (e.g. [7]) as well as schema-less extensible record stores (e.g. [13]).

We consider system architectures such as shown in Figure 1. Each user interacts with an instance of the application, e.g. a servlet. Typically, the application fetches entities from the data store into the application space, modifies them, and writes them back to the data store.

We introduce a common abstraction from the current state of the data store and the objects available in the application space. We refer to this abstraction as the memory state.

The memory state. We model a memory state as a set of mappings from entity keys to entity values. Let us assume that an entity has key κ and value ϑ. Then the memory contains the mapping from this key to this value: κ → ϑ. Keys in a mapping are unique, so a memory state does not contain any mappings κ → ϑ1 and κ → ϑ2 with ϑ1 ≠ ϑ2.

The entity value itself is structured as a mapping from property names to property values. A property value may be from an atomic domain Dom, either single-valued (Dom) or multi-valued (Dom+), or it may consist of the properties of a nested entity.

#30 | Original poster | Posted 2013-9-24 12:23
Example 7. We model a memory state with a single entity managing user data. The key is a tuple of kind user and the id 42. The entity value contains the user’s login “hhiker” and password “galaxy”:

{(“user”, 42) → {login → “hhiker”, pwd → “galaxy”}}. □
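In Python terms, a memory state of this shape could be sketched as a dictionary keyed by (kind, id) tuples. The first entry mirrors Example 7; the blogpost entry and its property names are hypothetical and only illustrate the three forms of property values described above.

```python
# A memory state: entity keys (kind, id) map to entity values,
# and each entity value maps property names to property values.
memory_state = {
    ("user", 42): {"login": "hhiker", "pwd": "galaxy"},
}

# Hypothetical second entity showing the three forms of property values.
memory_state[("blogpost", 1)] = {
    "title": "NoSQL",                # single-valued (Dom)
    "tags": ["nosql", "schema"],     # multi-valued (Dom+)
    "author": {"name": "Gerhard"},   # properties of a nested entity
}

login = memory_state[("user", 42)]["login"]
```

A dictionary enforces the uniqueness requirement by construction: assigning to an existing key replaces the old value, so one key can never map to two different values.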

Substitutions. We describe manipulations of a memory state by substitutions. A substitution σ is a mapping from a set K (e.g. the entity keys) to a set V (e.g. the entity values) and the special symbol ⊥. To access ϑi in a substitution {κ1 → ϑ1, . . . , κn → ϑn}, we write σ(κi). If σ(κi) = ⊥, then this explicitly means that this mapping is not defined.

Let ms be the memory state, and let σ be a substitution. In updating the memory state ms by substitution σ, we follow a create-or-replace philosophy for each mapping in the substitution. We denote the updated memory by ms[σ]:
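The formal definition is not included in this excerpt. As one plausible reading of the create-or-replace philosophy, the update might be sketched in Python as below. The `BOTTOM` sentinel standing in for ⊥, the `update` helper, and the sample values are assumptions made for this sketch; in particular, a key that σ maps to ⊥ is assumed to leave the memory state unchanged.

```python
BOTTOM = object()  # stands in for the special symbol ⊥ ("not defined")


def update(ms, sigma):
    """Return a new memory state corresponding to ms[σ] (create-or-replace)."""
    updated = dict(ms)
    for key, value in sigma.items():
        if value is not BOTTOM:   # assumption: undefined mappings are skipped
            updated[key] = value  # creates the key or replaces its value
    return updated


ms = {("user", 42): {"login": "hhiker", "pwd": "galaxy"}}
sigma = {
    ("user", 42): {"login": "hhiker", "pwd": "towel"},   # replace
    ("user", 43): {"login": "zaphod", "pwd": "heart"},   # create
    ("user", 44): BOTTOM,                                # not defined
}
new_ms = update(ms, sigma)
```

The original memory state is left untouched in this sketch, so ms and ms[σ] can be compared side by side.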
