楼主: jieforest

嵌入式NoSQL数据库RaptorDB 3.1.0发布

[复制链接]
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
11#
 楼主| 发表于 2013-12-22 23:03 | 只看该作者
View Storage :

1)You don't need to specify column widths for string columns and RaptorDB can handle any size string (indexing is however limited to max 255 bytes for normal string columns).

2)Views (output from map functions) save data as binary not JSON ( much faster for indexing).

3)Primary list/View on object type definition for immediate access to the object saved (immediate call to a primary map function)

Document Storage :

Ability to save byte[] data as well so you can save files etc. ( these are saved in a separate storage file).

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
12#
 楼主| 发表于 2013-12-22 23:04 | 只看该作者
Indexing Engine :

1)Support for special hybrid bitmap index for views using Word Aligned Hybrid (WAH) compression.

2)Automatic indexing of views (self maintained no administration required).

3)All columns are indexed with the MGIndex.

4)Fast full text search support on string columns (you choose normal or full text search of strings in your view) .

LINQ Query Processor :

Query Filter parser with nested parenthesis, AND and OR expressions.

Map Function Engine :

1)Map functions are compiled .net code not JavaScript like competitors ( order of magnitude faster, see the performance tests).

2)Map functions have access to an API for doing queries and document fetches (you can do complex business logic in them).

3)Map functions type check the data saved, which makes reading the data easier and everything more consistent.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
13#
 楼主| 发表于 2013-12-22 23:04 | 只看该作者
Limitations

The following limitations are in this release:

Requires at least .net 4 (uses the Task library).
Document and View Item count is limited to 4 billion items or Int32.
This release is not code backward compatible with the key/value store version.
Aggregation on views.
Standalone service version is not available at the moment.
Revision checking on documents is not supported at the moment.
Sharding / Replication is not supported at the moment.
Query filters only work with literal right hand sides (e.g. Name='bob' not StatusColumn > LastStatusColumn referencing another column)
In transaction mode, currently the active thread does not see it's own data changes.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
14#
 楼主| 发表于 2013-12-22 23:05 | 只看该作者
Rules

To help you work with RaptorDB better, below are some rules which will help you:

1)You must have a Primary View for each type you are going to save.

2)If a view has ConsistentView = True it will act like a Primary View.

3)If a Primary View has TransactionMode = True then all operations on it and all views associated with it will be in a transaction and will Rollback if any view fails.

4)BackgroudIndexing is turned off in transaction mode.

5)Queries in a transaction will only see their own updates.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
15#
 楼主| 发表于 2013-12-22 23:06 | 只看该作者
The Competition  

There are a number competing storage systems, a few of which I have looked at, are below:

1)MongoDB (c++): Great database which I love and is the main inspiration behind RaptorDB, although I have issues with its 32bit 4Gb database size limit and the memory map file design which could potentially corrupt easily. (Polymorphism has a workaround in mongodb if anyone wants to know).

2)CouchDB / CouchBase (erlang): Another standard in document databases. The design is elegant.

3)RavenDB (.net): Work done by Ayende which is a .net document database built on the Windows ESENT storage system.

4)OrientDB (java) : The performance specs are impressive.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
16#
 楼主| 发表于 2013-12-23 15:38 | 只看该作者
Performance Tests

The test were all done on my notebook which is a AMD K625 1.5Ghz CPU, 4Gb DDR2 Ram, WD 5400rpm HDD, Win7 Home 64bit , Windows rating of 3.9.

Javascript vs .NET compiled map functions

Most of the document databases today use the javascript language as the language to write map functions, in RaptorDB we are using compiled .net code. A simple test of the performance benefits of using compiled code is the following example from the http://www.silverlight.net/conte ... ss/run/default.html site which is a chess application, this test is certainly not comprehensive but it does show a reasonable computation intensive comparison :



As you can see even in Google Chrome (V8 javascript engine) which arguably is the fastest javascript processor currently, is beaten by the .net code (in Silverlight, full .net version could be faster still) by a factor of about 8x.

With this non scientific test it's unreasonable to use anything but compiled code for map functions.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
17#
 楼主| 发表于 2013-12-23 15:43 | 只看该作者
Insert object performance

Depending on the complexity of you documents and the views defined you can expect around 10,000 docs/sec throughput from RaptorDB (based on my test system).

Query Performance

The real test to validate the use of a NoSql database is the query test, you have put the data in, how fast is it getting it out. RaptorDB will output query processing times to the log file, as you can see from the samples below most of the query time is taken by the actual fetching of the data and the query plan is executed in the milliseconds range (the power of the bitmap index).
  1. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query bitmap done (ms) : 40.0023
  2. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows fetched (ms) : 33.0019
  3. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows count : 300

  4. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query bitmap done (ms) : 1.0001
  5. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows fetched (ms) : 469.0268
  6. 2012-04-29 12:38:40|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows count : 25875

  7. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query bitmap done (ms) : 4.0002
  8. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows fetched (ms) : 6.0003
  9. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows count : 500

  10. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query bitmap done (ms) : 0
  11. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows fetched (ms) : 677.0387
  12. 2012-04-29 12:38:45|DEBUG|1|RaptorDB.Views.ViewHandler|| query rows count : 50000
复制代码

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
18#
 楼主| 发表于 2013-12-23 15:44 | 只看该作者
If you want to torture someone, make them write a LINQ Provider!

It took me around a month of intense research, debugging and tinkering to get my head around the LINQ provider interface and how it works, while the title of this section is a bit harsh but I hope it conveys the frustration I felt at the time.

To be fair what emerged is very clean, concise and elegant.  Admittedly it is only the Expression evaluation part of LINQ and is a fraction of what you have to go through for a full LINQ provider. This was all I needed for RaptorDB so I will try to explain how it was done here for anybody wanting to continue as resources are very rare on this subject.

For RaptorDB we want a "where" clause parser in LINQ which will essentially filter the view data and give us the rows, this is done with the following command :
  1. int j = 1000;
  2. var result = db.Query(typeof(SalesInvoice),
  3.                  (SalesInvoice s) => (s.Serial > j &&  s.CustomerName == "aaa")
  4.              );
复制代码

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
19#
 楼主| 发表于 2013-12-23 15:44 | 只看该作者
The main part we are focusing on is the line :
  1. (SalesInvoice s) => (s.Serial > j && s.CustomerName == "aaa")
复制代码
From this we want to parse the expression which reads : given the SalesInvoice type (used for denoting the property/column names, and serves no other purpose) filter where the [serial number is greater than j and the customer name is "aaa"] or the count is greater than zero.  From this the query engine must determine the "column names" used and fetch them from the index file and get the associated values from that index and apply logical arithmetic on the results to get what we want.

使用道具 举报

回复
论坛徽章:
277
马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11版主9段
日期:2012-11-25 02:21:03ITPUB年度最佳版主
日期:2014-02-19 10:05:27现任管理团队成员
日期:2011-05-07 01:45:08
20#
 楼主| 发表于 2013-12-23 15:44 | 只看该作者
The quirks of LINQ parsing

There are two quirks in parsing LINQ queries :

1)Variables (the j in the above example) are replaced with a compiler placeholder and not the value.

2)All the work is done in the VisitBinary method, for logical expression parsing and clause evaluation, so you have to be able to distinguish and handle them both.

使用道具 举报

回复

您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

TOP技术积分榜 社区积分榜 徽章 团队 统计 知识索引树 积分竞拍 文本模式 帮助
  ITPUB首页 | ITPUB论坛 | 数据库技术 | 企业信息化 | 开发技术 | 微软技术 | 软件工程与项目管理 | IBM技术园地 | 行业纵向讨论 | IT招聘 | IT文档
  ChinaUnix | ChinaUnix博客 | ChinaUnix论坛
CopyRight 1999-2011 itpub.net All Right Reserved. 北京盛拓优讯信息技术有限公司版权所有 联系我们 未成年人举报专区 
京ICP备16024965号-8  北京市公安局海淀分局网监中心备案编号:11010802021510 广播电视节目制作经营许可证:编号(京)字第1149号
  
快速回复 返回顶部 返回列表