May 3, 2016
How to Improve Performance of Your JPA Applications with NoSQL
Many years ago, when I was working on an ecommerce website project for one of the largest automobile companies in the world, I stumbled onto a concept that sounded like science fiction to me: automatic data persistence through entity classes.
Yes, I’m talking about the now well-known Enterprise JavaBeans (EJB). Announced in 1998, and later incorporated into the specifications of Java EE, EJB introduced the concept of Entity Beans. The idea was to provide a development framework that allowed developers to map their object entities automatically to relational tables, so that the framework would take care of persisting the application data in the database. This was called ORM: object-relational mapping.
That was the early 2000s, and we were used to expecting big innovations from the then-super-cool Sun Microsystems (sort of an Apple of the IT world of its time), but this was really a paradigm change. It came on the trail of object-oriented programming, by itself a huge paradigm shift for the mainstream application development world. At that point, the concept of persisting data in a centralized database was well established and relational databases were everywhere. Server-side web applications were becoming the main trend, and of course, you had to select a database to store your data. Although relational databases were not the only alternative available, they were the obvious choice for what we used to call “desktop applications”. This all implied that the only way your application could store and retrieve data was by executing SQL queries…in many cases, very complex ones.
Java, on the other hand, is completely object-oriented, an approach that does not translate naturally into tables and relationships. Procedural languages, by contrast, adopted relational databases easily through SQL. At that time, Java was also struggling with the fallout of a major court fight between Microsoft and Sun Microsystems over compatibility between Java and Internet Explorer. Every programmer was discussing which platform, or framework, was going to survive: Java or Microsoft’s new announcement, .NET.
In this context, the automatic persistence offered by EJB was an exciting and hugely innovative concept. However, the hardware reality at that time imposed a challenge: although it was a nice concept, the truth is that the processing hardware wasn’t ready for it yet. Java had trouble enough proving that running interpreted code (something considered “old school”) was not going to slow everything down. Having such code executed under the several layers of extra management required by EJB was unthinkable. And remember, we’re talking about the 32-bit single-core processor era, when a typical high-end server had something between 256 MB and 512 MB of slow RAM! (see topdesignmag.com)
Fast forward to 2016: Hibernate is on version 5 and, according to recent research, more than 73% of Java development happens under some sort of Java EE framework.
Since 2009, with the specification of JPA 2.0, more and more applications have been taking advantage of this abstraction. It was boosted by the widespread adoption of Hibernate ORM, developed by Gavin King in 2001 as an easier alternative to the persistence capabilities provided by the former EJB2-style entity beans. With its certification as an implementation of the JPA 2.0 specification in 2010, Hibernate became a popular, widely adopted technology among application developers.
And yet, 15 years since its start, a lot of the discussions you can find in programmer forums are still about the original theme: how to make JPA perform better. The same old problem seems to persist, in spite of huge advances in hardware speed. It is certainly more important now that JPA is considered mainstream, affecting hundreds of thousands of systems around the world. The inherent problem of ORM architecture has not changed: mapping an object-oriented world to a relational world is not a simple task, and it requires an immense extra processing effort to accomplish seamlessly.
Many years ago Ted Neward called ORM “The Vietnam of Computer Science”, associating it with the Law of Diminishing Returns: it all looks good in the beginning, but the more you use it, the harder it is to see additional benefits. At some point it’s hard to “drop the bait and run” due to all the investment and time already spent. He even goes as far as to suggest using a combination of ORM solutions and direct SQL (or JDBC), “to carry them past those areas where an ORM would create problems.” This has a lot to do with performance.
The people at jhades.org make a very good point in their blog when they say that the main problem is that ORM sets itself the challenge of synchronizing, in real time, two completely different types of data structures. There is little affinity between tables and relationships on one side and object-oriented data structures on the other. As a result, traditional relational DBMSs weigh down any ORM implementation, simply because of the lack of affinity between SQL and the applications that could take most advantage of ORM, the so-called Domain-Driven Designs.
But the whole database industry is itself going through a transformation nowadays. During the last 15 years you would have had to be a brave explorer to dare to pick any alternative to a pure RDBMS to persist your data (if you were able to find one), not to mention the effort to justify why you did it. Today the myriad of NoSQL databases expands the possibilities for many new paradigms in computer science. There is no reason to think JPA would not benefit from it; I’d argue that it definitely does. From the data structure point of view, many NoSQL approaches make much more sense as a way to persist data under a JPA implementation than a table/relationship DBMS.
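To make the data structure argument concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: the Order and OrderLine classes, the in-memory lists standing in for tables, and the map standing in for a key-value store are all hypothetical and use no JPA, Hibernate, or c-treeACE API. It shows the same object being split into rows and re-joined on one side, and stored whole under a single key on the other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative domain objects; all names here are hypothetical.
final class OrderLine {
    final String sku;
    final int quantity;
    OrderLine(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
}

final class Order {
    final long id;
    final List<OrderLine> lines;
    Order(long id, List<OrderLine> lines) { this.id = id; this.lines = lines; }
}

// Relational shape: one row per order plus one row per line, re-joined on load.
final class RelationalSketch {
    final List<Object[]> orderRows = new ArrayList<>(); // columns: order_id
    final List<Object[]> lineRows = new ArrayList<>();  // columns: order_id, sku, quantity

    void persist(Order order) {
        orderRows.add(new Object[] { order.id });
        for (OrderLine line : order.lines) {
            lineRows.add(new Object[] { order.id, line.sku, line.quantity });
        }
    }

    Order load(long id) {
        boolean found = false;
        for (Object[] row : orderRows) {
            if ((Long) row[0] == id) { found = true; }
        }
        if (!found) { return null; }
        // The "join": collect every line row that points back at this order.
        List<OrderLine> lines = new ArrayList<>();
        for (Object[] row : lineRows) {
            if ((Long) row[0] == id) {
                lines.add(new OrderLine((String) row[1], (Integer) row[2]));
            }
        }
        return new Order(id, lines);
    }
}

// Key-value shape: the whole object graph lives under its primary key.
final class KeyValueSketch {
    final Map<Long, Order> store = new HashMap<>();

    void persist(Order order) { store.put(order.id, order); }

    Order load(long id) { return store.get(id); }
}
```

An order with two lines produces three rows across two "tables" in RelationalSketch and a single entry in KeyValueSketch. A real ORM does far more than this (identity, caching, lazy loading), but the extra splitting and re-joining is the part that this sketch is meant to expose.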
Our research appears to indicate that this is true. We recently announced a new implementation of JPA based on our key-value store (KVS) database engine, c-treeACE V11. Initial tests have indicated a performance gain of around 30% when c-treeACE is used in place of a SQL database.
This has been accomplished by taking full advantage of an intelligent mapping that recognizes queries that can be better executed under a low-level KVS approach rather than through the heavy load of unnecessary SQL. Because c-treeACE is a multimodel database, the layer that interacts with the database (the Java Persistence Layer, JPL) can seamlessly switch between SQL and NoSQL to optimally execute each query.
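The routing idea described above can be sketched in a few lines of plain Java. This is not the c-treeACE Java Persistence Layer or any of its API; Customer, Query, and RoutingStore are hypothetical names, the HashMap stands in for a key-value index, and the full scan stands in for handing a query to a SQL engine. The only point is the decision itself: a primary-key lookup takes the direct key path, and everything else falls back to the general query path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical entity used only for this sketch.
final class Customer {
    final long id;
    final String name;
    final String city;
    Customer(long id, String name, String city) {
        this.id = id;
        this.name = name;
        this.city = city;
    }
}

// A query is either a primary-key lookup or an arbitrary filter.
final class Query {
    final Long id;                    // non-null means "lookup by primary key"
    final Predicate<Customer> filter; // used only when id is null

    private Query(Long id, Predicate<Customer> filter) {
        this.id = id;
        this.filter = filter;
    }

    static Query byId(long id) { return new Query(id, null); }

    static Query where(Predicate<Customer> filter) { return new Query(null, filter); }
}

final class RoutingStore {
    // Stand-in for a key-value index keyed by primary key.
    private final Map<Long, Customer> byId = new HashMap<>();

    // Counters that record which path served each request.
    int keyLookups = 0;
    int scans = 0;

    void save(Customer customer) { byId.put(customer.id, customer); }

    List<Customer> execute(Query query) {
        List<Customer> result = new ArrayList<>();
        if (query.id != null) {
            // Key path: one direct lookup, no query parsing or planning.
            keyLookups++;
            Customer hit = byId.get(query.id);
            if (hit != null) { result.add(hit); }
            return result;
        }
        // General path: stands in for a query delegated to a SQL engine.
        scans++;
        for (Customer candidate : byId.values()) {
            if (query.filter.test(candidate)) { result.add(candidate); }
        }
        return result;
    }
}
```

Calling execute(Query.byId(2L)) increments keyLookups, while execute(Query.where(c -> c.city.equals("Lisbon"))) increments scans. In a JPA setting, the first case corresponds to something like a find-by-primary-key call and the second to a query with arbitrary conditions; how a real persistence layer recognizes each case is not covered here.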
In conclusion, it is important to keep an eye on all the development that is going on with NoSQL. The benefits of NoSQL may not be limited to new application development. In some cases, you may see positive results by revisiting existing, traditional frameworks, such as your JPA implementation. Whether you are using Hibernate, or any other ORM framework, database replacement should be a low-risk, small-effort project. You may find you are a few steps away from saving thousands of dollars.