JPA implementation patterns

Sky-Tiger · posted on 2009-7-16 22:41

Using blobs

The first thing to note is that inheritance is a very large component of the object-relational impedance mismatch. And the question we should ask ourselves is: why are we even mapping all those often-changing concrete classes to database tables? If object databases had really broken through, we might be better off storing those classes in such a database. As it is, relational databases have inherited the earth, so that is out of the question. It might also be that for a part of your object model the relational model actually makes sense, because you want to perform queries and have the database manage the (foreign key) relations. But for some parts you are actually only interested in simple persistence of objects.

A nice example is the "persisted command framework" I mentioned above. The framework needs to store generic information about each command such as a reference to the "change plan" (a kind of execution context) it belongs to, start and end times, log output, etc. But it also needs to store a command object that represents the actual work to be done (an invocation of wsadmin or wlst or something similar in our case).

For the first part the relational model is best suited. For the second part simple serialization will do. So we first define a simple interface that is implemented by the different command objects in our system:

public interface Command {
    void execute();
}

And then we create the entity that stores both the metadata (the data we want to store in a relational model) and the serialized command object:

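import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Lob;
import javax.persistence.ManyToOne;
import javax.persistence.Transient;

@Entity
public class CommandMetaData {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private int id;

    @ManyToOne
    private ChangePlan changePlan;

    private Date startOfExecution;

    private Date endOfExecution;

    @Lob
    private String log;

    // Written once on insert; never included in UPDATE statements.
    @Lob
    @Column(name = "COMMAND", updatable = false)
    private byte[] serializedCommand;

    // Cached, deserialized form of serializedCommand; not persisted.
    @Transient
    private Command command;

    public CommandMetaData(Command details) {
        serializedCommand = serializeCommand(details);
    }

    public Command getCommand() {
        // Deserialize lazily, on first access only.
        if (command == null) {
            command = deserializeCommand(serializedCommand);
        }
        return command;
    }

    [... rest omitted ...]
}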

The serializedCommand field is a byte array that is stored as a blob in the database because of the @Lob annotation. The column name is explicitly set to "COMMAND" to prevent the default column name of "SERIALIZEDCOMMAND" from appearing in the database schema.

The command field is marked as @Transient to prevent it from being stored in the database.

Sky-Tiger · posted on 2009-7-16 22:41

When a CommandMetaData object is created, a Command object is passed in. The constructor serializes the command object and stores the results in the serializedCommand field. After that the command cannot be changed (there is no setCommand() method), so the serializedCommand can be marked as not updatable. This prevents that pretty big blob field from being written to the database every time another field of the CommandMetaData (such as the log field) is updated.

Every time the getCommand method is invoked, the command is deserialized if that has not happened yet, and then it is returned. The getCommand method could be marked synchronized if this object were used in multiple concurrent threads.
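The serializeCommand and deserializeCommand helpers are not shown in the listing above. A minimal sketch of what they might look like with standard Java serialization follows. The CommandSerializer class and the EchoCommand example are hypothetical names introduced here, and the sketch assumes that Command extends Serializable, which the original interface does not state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Assumption: commands are Serializable so that standard Java serialization applies.
interface Command extends Serializable {
    void execute();
}

// Hypothetical example command, used only to exercise the helpers.
final class EchoCommand implements Command {
    private static final long serialVersionUID = 1L;
    private final String message;

    EchoCommand(String message) {
        this.message = message;
    }

    String getMessage() {
        return message;
    }

    @Override
    public void execute() {
        System.out.println(message);
    }
}

// Hypothetical helper class; the original only names the two methods.
final class CommandSerializer {
    private CommandSerializer() {
    }

    // Turns a command into the byte array that would go into the blob column.
    static byte[] serializeCommand(Command command) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(command);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize command", e);
        }
    }

    // Rebuilds a command from the bytes read back from the blob column.
    static Command deserializeCommand(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Command) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize command", e);
        }
    }
}
```

Wrapping the checked exceptions in an unchecked one is a design choice of this sketch; it keeps the entity's constructor and getter free of throws clauses.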

Some things to note about this approach are:

* The serialization method used influences the flexibility of this approach. Standard Java serialization is simple but does not handle changing classes well. XML can be an alternative but that brings its own versioning problems. Picking the right serialization mechanism is left as an exercise for the reader.
* Although blobs have been around for a while, some databases still struggle with them. For example, using blobs with Hibernate and Oracle can be tricky.
* In the approach presented above, any changes made to the Command object after it has been serialized will not be stored. Clever use of the @PrePersist and @PreUpdate lifecycle hooks could solve this problem.
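As a rough illustration of the last point, the sketch below shows what the body of such a lifecycle callback might do. It is plain Java so that it runs without a JPA provider: CommandHolder is a hypothetical stand-in for CommandMetaData, CountingCommand is a hypothetical command, and the JPA annotations appear only in comments. It also assumes Command extends Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Assumption: commands are Serializable.
interface Command extends Serializable {
    void execute();
}

// Hypothetical command with mutable state, for demonstration only.
final class CountingCommand implements Command {
    private static final long serialVersionUID = 1L;
    private int executions;

    @Override
    public void execute() {
        executions++;
    }

    int getExecutions() {
        return executions;
    }
}

// Hypothetical stand-in for the CommandMetaData entity, reduced to the two command fields.
final class CommandHolder {
    private byte[] serializedCommand; // the @Lob column; it would have to be updatable here
    private Command command;          // the @Transient field

    CommandHolder(Command command) {
        this.command = command;
        this.serializedCommand = serialize(command);
    }

    Command getCommand() {
        if (command == null) {
            command = deserialize(serializedCommand);
        }
        return command;
    }

    byte[] getSerializedCommand() {
        return serializedCommand.clone();
    }

    // In the real entity this method would carry @PrePersist and @PreUpdate.
    void refreshSerializedCommand() {
        if (command == null) {
            return; // never deserialized, so it cannot have changed
        }
        byte[] current = serialize(command);
        if (!Arrays.equals(current, serializedCommand)) {
            serializedCommand = current; // replace the blob only when the state differs
        }
    }

    static byte[] serialize(Command command) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(command);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize command", e);
        }
    }

    static Command deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Command) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize command", e);
        }
    }
}
```

Two caveats apply when carrying this over to a real entity. The COMMAND column could no longer be marked updatable = false, which gives up the write saving described earlier. And a provider may not invoke @PreUpdate at all when only a transient field has changed, so this would need to be verified against the JPA provider in use.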

This semi-object database/semi-relational database approach to persistence worked out quite well for us. I am interested to hear whether other people have tried the same approach and how they fared. Or did you think of another solution to these problems?

Sky-Tiger · posted on 2009-7-16 22:41

What to test?

The first question to ask is: what code do we want to test? Two kinds of objects are involved when we talk about JPA: domain objects and data access objects (DAOs). In theory your domain objects are not tied to JPA (they're POJOs, right?), so you can test their functionality without a JPA provider. Nothing interesting to discuss about that here. But in practice your domain objects will at least be annotated with JPA annotations and might also include some code to manage bidirectional associations (lazily), primary keys, or serialized objects. Now things are becoming more interesting...

(Even though such JPA-specific code violates the POJO-ness of the domain objects, it needs to be there to make the domain objects always function the same way, whether inside or outside of a JPA container. The managing of bidirectional associations and using UUIDs as primary keys are nice examples of this. In any case, this is code you most certainly need to test.)

Of course, we'd also need to test the DAOs, right? An interesting question pops up here: why do we want to test the DAOs? Most of them just delegate to the JPA provider, and testing the JPA provider makes no sense unless we are writing "learning tests" (see also Robert Martin's Clean Code) or developing our own JPA provider. But the combination of the DAOs and the JPA-specific part of the domain objects is testworthy.

Sky-Tiger · posted on 2009-7-16 22:42

What to test against?

Now that we know what to test, we can decide what to test against. Since we are testing database code, we want our test fixture to include a database. That database can be an embedded in-memory database such as HSQLDB (in memory-only mode) or a "real" database such as MySQL or Oracle. Using an embedded database has the big advantage of being easy to set up; there is no need for everyone running the tests to have a running MySQL or Oracle instance. But if your production code runs against another database, you might not catch all database issues this way. So an integration test against a real database is also needed, but more on that later.

For most tests we need more than just the database. We need to set it up correctly before a test, and after the test we need to leave it in a usable state for the next test to run. Setting up the schema and filling the database with the right data before running the test are not that hard to do (a.k.a. left as an exercise for the reader ;-) ), but returning the database to a usable state after the test is a more difficult problem. I've found a number of approaches to this problem:

* The Spring Framework includes a test framework that uses transactions to manage the state of your test fixture. If you annotate your test to be @Transactional, the SpringJUnit4ClassRunner will start a transaction before each test starts and roll back that transaction at the end of the test to return to a known state. If you are still using JUnit 3.8 you can extend the AbstractTransactionalSpringContextTests base class for the same effect. This might seem nice, but in practice I've found this method to be unsatisfactory for a number of reasons:
      1. By default the JPA context is not flushed until the transaction is committed or a query is executed. So unless your test includes a query, any modifications are not actually propagated to the database, which can hide problems with invalid mappings and such. You could try to explicitly invoke EntityManager.flush before the end of the test, but then the tests don't represent real scenarios anymore.
      2. Also, saving an entity and then retrieving it in the same session does not uncover those nasty lazy loading issues. You're probably not even hitting the database as the JPA provider will return a reference to the object that you just saved!
      3. Finally, in a test you might like to first store some data in the database, then run the tests, and finally check that the right data was written out to the database. To test this properly you need three separate transactions without the first two transactions being rolled back.
* If you use an embedded in-memory database that database will be clean when you run the first test and you won't need to worry about leaving it in a good state after all the tests are run. This means you will not have to roll back any transactions and can have multiple transactions within one test. But you might have to do something special between each test. For example, when using the Spring TestContext framework you can use the @DirtiesContext annotation to reinitialize the in-memory database between tests.
* If you cannot use an in-memory database or re-initializing it after every test is too expensive, you can try to clear all the tables after every test (or before every test). For example, DbUnit can be used to delete all data from your test tables or truncate all test tables. Foreign key constraints may get in the way though, so you will want to temporarily disable referential integrity before performing these operations.
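The last approach can be sketched with plain JDBC instead of DbUnit. The TableCleaner class below is a hypothetical helper, not part of any library. The statements that switch referential integrity off and on differ per database, so they are passed in as parameters rather than hard-coded.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper that empties test tables between tests.
final class TableCleaner {
    private final String disableConstraintsSql;
    private final String enableConstraintsSql;

    // Both statements are database-specific and must be supplied by the caller.
    TableCleaner(String disableConstraintsSql, String enableConstraintsSql) {
        this.disableConstraintsSql = disableConstraintsSql;
        this.enableConstraintsSql = enableConstraintsSql;
    }

    // Deletes all rows from the given tables with referential integrity switched off.
    // Table names are concatenated into SQL, so they must come from trusted test code.
    void deleteAll(Connection connection, List<String> tables) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(disableConstraintsSql);
            try {
                for (String table : tables) {
                    statement.executeUpdate("DELETE FROM " + table);
                }
            } finally {
                // Re-enable the checks even if one of the deletes failed.
                statement.execute(enableConstraintsSql);
            }
        }
    }
}
```

With MySQL, for instance, the two statements would be SET FOREIGN_KEY_CHECKS = 0 and SET FOREIGN_KEY_CHECKS = 1; for other databases, check the vendor documentation. The helper could be called from a JUnit set-up or tear-down method with the list of tables the tests touch.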

Sky-Tiger · posted on 2009-7-16 22:42

At what scope to test?

The next thing is to decide on the scope of the tests. Will you write small unit tests, larger component tests, or full-scale integration tests? Because of the way JPA works (the leakiness of its abstraction, if you will), some problems might only surface in a larger context. While testing the persist method on a DAO in isolation is useful if you want to know whether the basics are correct, you will need to test in a larger scope to shake out those lazy loading or transaction handling bugs. To really test your system you will need to combine small unit tests with larger component tests where you wire your service facades with the DAOs under test. You can use an in-memory database for both kinds of test. And to complete your test coverage you will need an integration test against the database to be used in production, using a tool such as FitNesse. Because we are not specifically testing the JPA code in that case, having unit tests on a smaller scale will help you pinpoint DAO bugs more quickly.

Sky-Tiger · posted on 2009-7-16 22:42

What to assert?

One final thing to tackle is what to assert in the tests. It might be that your domain objects are mapped to an existing schema in which case you want to make sure that the mapping is correct. In this case you would like to use raw JDBC access to the underlying database to assert that the right modifications were made to the right tables. But if the schema is automatically generated from the JPA mapping you probably will not care about the actual schema. You'll want to assert that persisted objects can be correctly retrieved in a new session. Direct access to the underlying schema with JDBC is not necessary and would only make such test code brittle.

justforregister · posted on 2009-7-16 23:46

That's really long!

jieforest · posted on 2009-7-18 11:22

These JPA implementation patterns really are a distillation of experience.

justforregister · posted on 2009-7-18 23:58

I need to read this carefully.

meng_0311 · posted on 2009-7-29 17:08

Has anyone successfully injected an EntityManager with @PersistenceContext under Spring DM and EclipseLink? If so, could you provide a demo?

Thanks