MongoDB and Scale Out? No, says MongoHQ

Sky-Tiger · posted on 2014-5-18 19:54

Solution
Each row in the table describes a single entity, all of the same type. That type is given by the name of the table itself, Product. We know certain information about each of these items, based on the columns in the table itself, such as the model number, the division, and so on. We want to represent these data in RDF.
Since each row represents a distinct entity, each row will have a distinct URI. Fortunately, the need for unique identifiers is just as present in the database as it is in the Semantic Web, so there is a (locally) unique identifier available, namely the primary table key, in this case the column called ID. For the Semantic Web, we need a globally unique identifier. The simplest way to form such an identifier is by having a single URI for the database itself (perhaps even a URL if the database is on the Web). Use that URI as the namespace for all the identifiers in the database. Since this is a database for a manufacturing company, let’s call that namespace mfg:.

Sky-Tiger · posted on 2014-5-18 20:04

It is not unusual for someone who is building a model in RDF for the first time to feel a bit limited by the simple subject/predicate/object form of the RDF triple. They don’t want to just say that Shakespeare wrote Hamlet, but they want to qualify this statement and say that Shakespeare wrote Hamlet in 1604 or that Wikipedia states that Shakespeare wrote Hamlet in 1604. In general, these are cases in which it is, or at least seems, desirable to make a statement about another statement. This process is called reification. Reification is not a problem specific to Semantic Web modeling; the same issue arises in other data modeling contexts like relational databases and object systems. In fact, one approach to reification in the Semantic Web is to simply borrow the standard solution that is commonly used in relational database schemas, using the conventional mapping from relational tables to RDF given in the preceding challenge. In a relational database table, it is possible to simply create a table with more columns to add additional information about a triple. So the statement Shakespeare wrote Hamlet is expressed (as in Table 3.1) in a single row of a table, where there is a column for the author of a work and another column for its title. Any further information about this event is done with another column (again, just as in Table 3.1). When this is converted to RDF according to the example in the Challenge, the row is represented by a number of triples, one triple per column in the database. The subject of all of these triples is the same: a single resource that corresponds to the row in the table.
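The row-style approach described above can be sketched in a few lines of Python. This is only an illustration: triples are modeled as plain tuples, and the resource name `bio:n1` and the property names `bio:author`, `bio:title`, and `bio:publicationDate` are hypothetical, chosen here to show one subject shared by one triple per column.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, Any]


def row_as_resource(subject: str, columns: Dict[str, Any]) -> List[Triple]:
    """Return one triple per column, all sharing the same subject resource."""
    return [(subject, predicate, value) for predicate, value in columns.items()]


# Hypothetical names: bio:n1 stands for the table row (the "writing" event).
statement = row_as_resource(
    "bio:n1",
    {
        "bio:author": "lit:Shakespeare",
        "bio:title": "Hamlet",
        "bio:publicationDate": 1604,
    },
)

for triple in statement:
    print(triple)
```

Adding another qualifier, such as a place, only means adding another entry to the dictionary, which mirrors adding a column to the table.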

Sky-Tiger · posted on 2014-5-18 20:04

Then we can create an identifier for each line by concatenating the table name “Product” with the unique key and expressing this identifier in the mfg: namespace, resulting in identifiers mfg:Product1, mfg:Product2, and so on.
Each row in the table says several things about that item, namely its model number, its division, and so on. To represent this in RDF, each of these will be a property that describes the Products. But just as is the case for the unique identifiers for the rows, we need globally unique identifiers for these properties. We can use the same namespace as we did for the individuals, but since two tables could have the same column name (but they aren’t the same properties!), we need to combine the table name and the column name. This results in properties like mfg:Product_ModelNo, mfg:Product_Division, and so on.
With these conventions in place, we can now express all the information in the table as triples. There will be one triple per cell in the table; that is, for n rows and c columns, there will be n × c triples. The data shown in Table 3.12 have 7 columns and 9 rows, so there are 63 triples, as shown in Table 3.13.
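As a rough sketch of this convention, the Python below turns a list of rows into one triple per cell. The sample rows are made up (they are not the contents of Table 3.12), and joining the table name and column name with an underscore is an assumption about how the property names are written.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, Any]


def table_to_triples(prefix: str, table: str, key: str,
                     rows: List[Dict[str, Any]]) -> List[Triple]:
    """Emit one (subject, predicate, object) triple per table cell."""
    triples: List[Triple] = []
    for row in rows:
        # Subject: table name + primary key value, e.g. mfg:Product1
        subject = f"{prefix}:{table}{row[key]}"
        for column, value in row.items():
            # Predicate: table name + column name, e.g. mfg:Product_ModelNo
            predicate = f"{prefix}:{table}_{column}"
            triples.append((subject, predicate, value))
    return triples


# Hypothetical sample data: 2 rows and 3 columns, so 2 x 3 = 6 triples.
rows = [
    {"ID": 1, "ModelNo": "A-100", "Division": "Widgets"},
    {"ID": 2, "ModelNo": "B-200", "Division": "Gadgets"},
]

triples = table_to_triples("mfg", "Product", "ID", rows)
for triple in triples:
    print(triple)
```

Note that the key column itself also produces a triple per row, which is why the count is rows times columns rather than rows times (columns minus one).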
The triples in the table are a bit different from the triples we have seen so far. Although the subject and predicate of these triples are RDF resources (complete with qname namespaces!), the objects are not resources but literal data, that is, strings, integers, and so forth. This should come as no surprise, since, after all, RDF is a data representation system. RDF borrows from XML all the literal data types as possible values for the object of a triple; in this case, the types of all data are strings or integers.
The usual interpretation of a table is that each row in the table corresponds to one individual and that the type of these individuals corresponds to the name of the table. In Table 3.12, each row corresponds to a Product. We can represent this in RDF by adding one triple per row that specifies the type of the individual described by each row, as shown in Table 3.14.
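The per-row type triples can be sketched the same way. This is a minimal illustration that assumes the class is named after the table in the mfg: namespace; the key values passed in are hypothetical.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def type_triples(prefix: str, table: str, keys: Iterable[int]) -> List[Triple]:
    """Emit one rdf:type triple per row, typing each individual by its table."""
    class_name = f"{prefix}:{table}"
    return [(f"{prefix}:{table}{key}", "rdf:type", class_name) for key in keys]


# Hypothetical primary key values for three rows.
types = type_triples("mfg", "Product", [1, 2, 3])
for triple in types:
    print(triple)
```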
The full complement of triples from the translation of the information in Table 3.12 is shown in Figure 3.7. The types (i.e., where the predicate is rdf:type, and the object is the class mfg:Product) are shown as links in the graph; triples in which the object is a literal datum are shown (for sake of compactness in the figure) within a box labeled by their common subject.

Sky-Tiger · posted on 2014-5-18 20:16

This approach works well for examples like Shakespeare wrote Hamlet in 1601, in which we want to express more information about some event or statement. It doesn’t work so well in cases like Wikipedia says Shakespeare wrote Hamlet, in which we are expressing information about the statement itself, Shakespeare wrote Hamlet. This kind of metadata about statements often takes the form of provenance (information about the source of a statement, as in this example), likelihood (expressed in some quantitative form like probability, such as It is 90 percent probable that Shakespeare wrote Hamlet), context (specific information about a project setting in which a statement holds, such as Kenneth Branagh played Hamlet in the movie), or time frame (Hamlet plays on Broadway January 11 through March 12). In such cases, it is useful to explicitly make a statement about a statement. This process, called explicit reification, is supported by the W3C RDF standard with three resources called rdf:subject, rdf:predicate, and rdf:object.
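A small sketch of explicit reification, again with triples as plain Python tuples. Only rdf:subject, rdf:predicate, and rdf:object come from the RDF standard; the statement identifier `q:n1`, the `lit:` names, and the `m:says` property are hypothetical names used for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def reify(statement_id: str, triple: Triple) -> List[Triple]:
    """Describe a triple as a resource using rdf:subject/predicate/object."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# The statement we want to talk about (hypothetical names).
claim = ("lit:Shakespeare", "lit:wrote", "lit:Hamlet")

# Reify it as q:n1, then attach provenance to the reified statement.
graph = reify("q:n1", claim)
graph.append(("web:Wikipedia", "m:says", "q:n1"))

for triple in graph:
    print(triple)
```

In this sketch the graph records that Wikipedia says the statement, but it does not contain the claim triple itself, so the graph does not assert that Shakespeare wrote Hamlet.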

Sky-Tiger · posted on 2014-6-5 23:54

Plug and Play OpenStack
However, the enticing part of OpenStack might be to build your own private cloud, and there are several ways to accomplish this goal. Perhaps the simplest of all is an appliance-style solution. You purchase an appliance, unpack it, plug in the power and the network, and watch it transform into an OpenStack cloud with minimal additional configuration. Few, if any, other open source cloud products have such turnkey options. If a turnkey solution is interesting to you, take a look at Nebula One.
However, hardware choice is important for many applications, so if that applies to you, consider that there are several software distributions available that you can run on servers, storage, and network products of your choosing. Canonical (where OpenStack replaced Eucalyptus as the default cloud option in 2011), Red Hat, and SUSE offer enterprise OpenStack solutions and support. You may also want to take a look at some of the specialized distributions, such as those from Rackspace, Piston, SwiftStack, or Cloudscaling. Also, a hat tip to Apache CloudStack, which Citrix donated to the Apache Foundation after its $200 million purchase of Cloud.com. While not currently packaged in any distributions, like Eucalyptus, it is an example of an alternative private cloud software developed in an open source–like manner.
Alternatively, if you want someone to help guide you through the decisions about the underlying hardware or your applications, perhaps adding in a few features or integrating components along the way, consider contacting one of the system integrators with OpenStack experience, such as Mirantis or Metacloud.

Sky-Tiger · posted on 2014-6-5 23:54

Roll Your Own OpenStack
However, this guide has a different audience: those seeking to derive the most flexibility from the OpenStack framework by conducting do-it-yourself solutions.
OpenStack is designed for scalability, so you can easily add new compute, network, and storage resources to grow your cloud over time. In addition to several massive OpenStack public clouds, a considerable number of organizations (such as PayPal, Intel, and Comcast) have built large-scale private clouds. OpenStack offers much more than a typical software package because it lets you integrate a number of different technologies to construct a cloud. This approach provides great flexibility, but the number of options might be bewildering at first.

Sky-Tiger · posted on 2014-6-5 23:55

To understand the possibilities OpenStack offers, it’s best to start with basic architectures that are tried-and-true and have been tested in production environments. We offer two such examples with basic pivots on the base operating system (Ubuntu and Red Hat Enterprise Linux) and the networking architectures. There are other differences between these two examples, but you should find the considerations made for the choices in each as well as a rationale for why it worked well in a given environment.
Because OpenStack is highly configurable, with many different backends and network configuration options, it is difficult to write documentation that covers all possible OpenStack deployments. Therefore, this guide defines example architectures to simplify the task of documenting, as well as to provide the scope for this guide. Both of the offered architecture examples are currently running in production and serving users.

Sky-Tiger · posted on 2014-6-5 23:55

This example architecture has been selected based on the current default feature set of OpenStack Havana, with an emphasis on stability. We believe that many clouds that currently run OpenStack in production have made similar choices.
You must first choose the operating system that runs on all of the physical nodes. While OpenStack is supported on several distributions of Linux, we used Ubuntu 12.04 LTS (Long Term Support), which is used by the majority of the development community, is feature-complete compared with other distributions, and has clear future support plans.
We recommend that you do not use the default Ubuntu OpenStack install packages and instead use the Ubuntu Cloud Archive. The Cloud Archive is a package repository supported by Canonical that allows you to upgrade to future OpenStack releases while remaining on Ubuntu 12.04.
