IV Final thoughts

So here we are, after a ride through the highs and lows of JPA with a few bits of common-sense advice. Perhaps the story wasn’t that good because – at least for me – it culminated quite early, around the troubles with to-one relationships, not to mention that my proposed solution is not pure JPA.

JPA/ORM – yes or no?

Would I use JPA or ORM again? Voluntarily? Yes – especially for simple cases. I’d use it even for bigger projects, of course, but I’d tone it down, skipping some of its features and avoiding the more problematic mappings.

This is a paradox – and a problem too. ORM ideas were developed mostly to help us with rich domain models. Using ORM for simple systems and domains, you pay the cost of a technology intended for complex ones. But in my experience ORM gets unwieldy for larger domains, especially if we use more mapping features. All the programmers must understand those features, unless we make the data layer fixed and isolated, but then some developers can’t develop features top-to-bottom.

I can’t speak much about how DDD helps with ORM problems. I recently saw a presentation by Eric Evans, the DDD guy. He admitted that proper DDD may be easier with systems where context boundaries are not only logical, but also physical – like microservices. I agree, because implicit is never as solid as explicit, as I have often stressed in this book. I have seen this over and over again in many software problems.

On the other hand, we must also agree that many implicit and automagic solutions make our lives easier. But they must be supported by knowledge. When the people maintaining the system change and the new ones don’t understand those implicit parts, all the benefits are gone.

If we take the ORM shortcut for a simple case, it may still happen that the simple problem gets more complex, the application grows a lot, and we suddenly have to live with a big ORM investment and a very difficult retreat path (the business will not be able to pay for it, for sure). So if I know that I would rather use some other solution for a big system, I’d think a couple of times about the possibility of this small project/product becoming big.

There is another way to treat JPA and ORM – especially for rich domains. Understand it fully – really, really fully – tune the caches, use reliable lazy loading where appropriate and reap the benefits. Personally, I doubt this is a good solution (or a realistic one, for that matter), but it can be OK for the command part of a CQRS application. ORM simply falls short in the area of queries beyond the most primitive levels of sophistication.

JPA or concrete ORM provider?

Do I prefer JPA or a concrete ORM provider? For many years the answer was pretty clear – I always strove to stay within the JPA realm. Accidentally, though, I went down the path with EclipseLink and its capability to JOIN with entity roots¹ – which I believe would be an incredible addition to JPA 2.2. The more I work with JPA, the less I believe in an easy transition from one provider to another – except for simple cases. The more complex the queries, the more I’d try to find some alternative. Maybe the first step would be staying with ORM and utilizing the full power of a single provider – whatever the provider is. That means checking their issue tracker when you encounter a bug, reporting new ones and so on.

Sticking with EclipseLink for a while – and diving deep – leaves you enough time to encounter bugs, some of them quite painful. I decided to collect all the bugs we found (not necessarily being the first ones to report them) and you can check them in the appendix Bugs discovered while writing this book. Would I find more Hibernate bugs had I worked with it? I’m pretty sure I would, as the few Hibernate bugs listed there were found either during our attempt to switch JPA providers quite early in the project (before we left the JPA path with root entities after JOINs) or during experiments with JPA 2.1 features for the sake of this book. That cannot be compared to what EclipseLink went through on our project.

What bothers me are some very serious bugs that are completely unattended by project maintainers, often not even answered. You create a test case for a concurrency problem with CASE (read “don’t use CASE at all if the query may be executed in parallel”), the issue tracker tells you it sent emails to half a dozen people – and there is no response at all. To be honest, I lost much of my faith in EclipseLink’s quality. Before Hibernate 5.1 we were stuck with EclipseLink, but even now, with ad hoc joins generally available, it’s still mainly about trading one set of bugs for another (less known) set of bugs. Or perhaps we can try some other JPA provider – like DataNucleus, which also allows JOIN to another root entity.

I actually tried DataNucleus briefly with the demos for this book and it didn’t grow on me at all. Any relevant page (like the Maven dependency example or the persistence.xml example) can be found easily via Google, but their “Getting started” will not send you to these. Also, DataNucleus requires class enhancement, so there is no chance to try all ORMs side by side just by choosing the persistence unit name from a persistence.xml containing all three setups – which is possible with EclipseLink and Hibernate. I dug deeper, but ran into a strange runtime error complaining about a column that wasn’t there, although it was. I admit that the mapping of the Person.dogs collection in the basic example project is tricky, because it recreates one problem from our project caused by legacy reasons, but I believe it is valid. DataNucleus is more than just a JPA 2.1 provider, but in my case this only added complexity that didn’t help. It is still the most invasive of all three mentioned ORMs. This does not mean it is bad, I just didn’t want to disrupt the examples or make separate ones just because of it.

This all means you have to know your provider one way or the other. I always try to use JPA capabilities (mapping annotations, JPQL) as much as possible, but when the time comes I go beyond them. It’s very easy with a feature like ad hoc joins, where all the providers understand its usefulness (although I doubt they did it to promote my raw FK mapping style). With something more specific it’s up to you. Will you ever switch? This depends. If you’re pure JPA, using whatever some application server provides, you have to be super compliant and super compatible – and that includes avoiding provider bugs. That sounds rather unlikely to me.

JPA, Querydsl and beyond

So, will I use ORM on bigger systems again? Probably – as I don’t design every system I work on. Would I try something else if I could? Definitely. I have much more faith in Querydsl itself, for instance. You can use it over SQL directly, bypassing JPA altogether. The main Querydsl contributor, Timo Westkämper, answers virtually any question anywhere² (groups, issues, Stack Overflow)… the only risk here is that it seems to be predominantly a one-man show, even though it is backed by a company. You can also check jOOQ, though here you can’t avoid commercial licences if you want to use it with commercial databases. Both these projects – and their popularity – show that JPA is far from being the last word in the Java persistence landscape and is not the only way to do it.

In any case, I always put Querydsl as another layer over JPA. I recently tried a JPA 2.0 project without it and, having got used to Querydsl before, I found the experience terrible. JPA 2.0 itself was nothing to be happy about either, and the missing ad hoc joins proved to be an incredible pain: some tables were already mapped with raw FKs, and without ad hoc joins you’re out of luck – you have to remap them properly with all the consequences.

JPA 2.2 coming

JPA itself is now progressing slowly. JPA 2.1 was a massive improvement over JPA 2.0, but that was already 4 years ago. JPA 2.2, coming with Java EE 8, will be merely a maintenance release. Even so, it should bring a couple of good things:

Support for Java 8 Date and Time types.
Ability to stream a query result.

Now the first one is sweet – especially when EclipseLink currently has a bug when COALESCE is used on converted types (not that they should not fix it, of course).

The second one is much more interesting but I don’t know what to expect. Will it be true streaming of results from a cursor? Will the entities be created lazily as we consume them? Will it allow us to avoid OutOfMemoryError for longer queries (e.g. hundreds of thousands of lines exported to CSV)? Or is it just stream() called on the list returned from getResultList()? I hope for true streaming, as that would be a massive improvement for some less typical cases.
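
To make the difference between the two possibilities concrete, here is a minimal sketch in plain Java, with no JPA involved at all. FakeCursor is a hypothetical stand-in for a database cursor and all the names are made up for illustration; it says nothing about how any provider will actually implement streaming. It only shows why a stream built on top of a fully loaded list does not help with memory, while a cursor-backed one can.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class StreamingSketch {

    /** Hypothetical stand-in for a database cursor; counts the rows it has actually produced. */
    static final class FakeCursor implements Iterator<String> {
        private final int total;
        private int position = 0;
        int materialized = 0;

        FakeCursor(int total) {
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            return position < total;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            materialized++; // imagine an entity being created here
            return "row-" + position++;
        }
    }

    /** "List first" variant: every row is created before the stream even starts. */
    static Stream<String> listThenStream(FakeCursor cursor) {
        List<String> all = new ArrayList<>();
        while (cursor.hasNext()) {
            all.add(cursor.next());
        }
        return all.stream();
    }

    /** Cursor-backed variant: rows are created only as the consumer pulls them. */
    static Stream<String> cursorStream(FakeCursor cursor) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(cursor, Spliterator.ORDERED), false);
    }

    /** Consumes only the first {@code consumed} rows and reports how many rows were created. */
    static int rowsMaterialized(boolean lazy, int total, int consumed) {
        FakeCursor cursor = new FakeCursor(total);
        Stream<String> rows = lazy ? cursorStream(cursor) : listThenStream(cursor);
        rows.limit(consumed).forEach(row -> { });
        return cursor.materialized;
    }

    public static void main(String[] args) {
        System.out.println("list first: " + rowsMaterialized(false, 100_000, 10));
        System.out.println("cursor:     " + rowsMaterialized(true, 100_000, 10));
    }
}
```

In the list-first variant all 100,000 rows exist in memory even though only ten are consumed, whereas the cursor-backed variant creates rows only on demand. A real implementation would also have to keep the underlying result set open and close it together with the stream, which this sketch ignores.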

This book

Finally, about this book. It was a much bigger story for me in the end, especially as I discovered I’d been working outside the JPA realm for more than a year. That obviously ruined all my plans: my wannabe climax was suddenly dull and I wasn’t sure whether to go on at all. But I also learnt a lot about JPA (and I really had thought I was quite solid already!) and about the two most popular providers. I hope it was helpful for you too, even though I might have raised more questions than I answered. At least I tried to keep it reasonably short.

I’m considering another edition discussing other typical problems (why to map the many-to-many association entity explicitly, why to avoid Open Session in View, etc.) but right now I need a break. :-) With the next version after JPA 2.2 probably far away, I have plenty of time for it.