How does JPA respond to scalability - java

Does anyone know if JPA is a good approach for a scalable environment (i.e. a web application in a cluster, or several clusters)? If not, what is a good approach?
Thanks
edit: I changed JTA for JPA, I think the question makes more sense now.

You'll find implementations of JPA built into many world-class Java EE application servers: for example, JBoss ships Hibernate and WebSphere ships OpenJPA. They're scalable and capable of running in clusters. This fact alone should let you sleep well, or at least not be concerned with it on a general level.

Related

JPA implementations suitable for the case

I am building a web application using JSF 2.0, Tomcat 7.0.20 and a MySQL DB. My application is for a small company that is just starting its business.
Now I see that there are many implementations of JPA 2.0. In my case, which one would be the most suitable? Or should I use pure JPA 2.0 and create my own implementation?
I want something that would work best for the company now and wouldn't cause trouble when it grows in the future. I was considering Hibernate, or do you have another suggestion?
The following earlier questions will give you some insight into the pros and cons of each JPA implementation and help you choose the one that is best for your company.
https://stackoverflow.com/questions/2569522/hibernate-or-eclipselink
JPA 2.0 Implementations comparison : Hibernate 3.5 vs EclipseLink 2 vs OpenJPA 2
I believe that Hibernate is actually more popular nowadays, and many industries are using Hibernate.
Depends how you want to judge "most suitable" ... license? most users? fewest known bugs? Most used doesn't necessarily mean most reliable, but it can mean there are more people who understand its problem areas and how to avoid them. All would likely do the job for many applications, though some have long-standing issues. DataNucleus JPA is another choice for you to consider: it provides all that you need there, but also allows much more flexibility than the others in terms of input and in terms of which datastores you can persist to, should these be factors in your application.

Java Web Application for 5000~ Users

For the first time (hopefully not the last) in my life I will be developing an application that will have to handle a high number of users (around 5000) and manage lots of data. I have developed an application that manages lots of data (around 100~ GB of data, not so much by many of your standards), but the user count was pretty low (around 50).
Here is the list of tools / frameworks I think I will be using:
Vaadin UI framework
Hibernate
PostgreSQL
Apache Tomcat
Memcached (for session handling)
The application will mainly be run inside a company network. It might be run on a cluster of servers or not, depends on how much money the company wants to spend to make its life easier.
So what do you think of my choices, and what should I be cautious about?
Cheers
The answer, as with all performance/scaling related issues is: it depends.
There is nothing in your frameworks of choice that would lead me to think they wouldn't be able to handle a large number of users. But without knowing what exactly you want to do or what your budget is, it's impossible to pick a technology.
To ensure that your application will scale/perform, I would consider the following:
Keep the memory footprint of each session low. For example, caching stuff in the HttpSession may work when you have 50 sessions, but it is not a good idea when you have 5000.
Do as much work as you can in the database to reduce the amount of data that is being moved around. For example, when looking at tables with lots of rows, ensure that paging is done in the database, rather than getting 10,000 rows back to Tomcat and then picking the first 10.
Try to minimise the state that has to be kept inside the HttpSession; this makes it easier to cluster.
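The database-side paging point above can be sketched roughly as follows. This is only an illustration under assumptions: the `PageRequest` class and the `orders` table are hypothetical, and the `LIMIT ... OFFSET` syntax is the one PostgreSQL and MySQL accept. With JPA or Hibernate, the equivalent is calling `setFirstResult` and `setMaxResults` on the query so that only one page of rows leaves the database.

```java
// Hypothetical helper: turns a 1-based page number into LIMIT/OFFSET values
// so that only one page of rows is ever sent from the database to Tomcat.
final class PageRequest {
    final int page; // 1-based page number
    final int size; // rows per page

    PageRequest(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        this.page = page;
        this.size = size;
    }

    // Number of rows the database skips before the first returned row.
    long offset() {
        return (long) (page - 1) * size;
    }

    // Builds the paged SQL; "orders" is an assumed table name.
    // ORDER BY keeps the pages stable between requests.
    String toSql() {
        return "SELECT id, customer, total FROM orders ORDER BY id LIMIT "
                + size + " OFFSET " + offset();
    }
}

class PagingSketch {
    public static void main(String[] args) {
        // Third page of ten rows: the database skips 20 rows and returns 10.
        System.out.println(new PageRequest(3, 10).toSql());
    }
}
```

The values here are plain ints, so string concatenation is safe in this sketch; real code would normally use bind parameters.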
Probably the most important recommendations:
Use load testing tools to simulate your peak load and beyond and test. JMeter is the tool I use for performance/load-testing.
When load testing, ensure:
That you actually use 5000 users (so that 5000 HttpSessions are created) and use a wide range of data (to avoid always hitting the cache).
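To see why a wide range of test data matters, here is a small simulation, not a load test. It assumes a 100-entry LRU cache and uniformly random keys; all names and numbers are made up for illustration. A narrow data set is served almost entirely from the cache, which hides the database cost that a realistic, wide data set would expose.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

class CacheHitSketch {
    // Small LRU cache built on LinkedHashMap's access-order mode.
    static <K, V> Map<K, V> lruCache(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Fraction of requests served from the cache when keys are drawn
    // uniformly at random from [0, keyRange).
    static double hitRate(int keyRange, int requests, int cacheCapacity, long seed) {
        Map<Integer, String> cache = lruCache(cacheCapacity);
        Random random = new Random(seed);
        int hits = 0;
        for (int i = 0; i < requests; i++) {
            int key = random.nextInt(keyRange);
            if (cache.get(key) != null) {
                hits++;
            } else {
                cache.put(key, "row-" + key); // stands in for a database read
            }
        }
        return (double) hits / requests;
    }

    public static void main(String[] args) {
        System.out.printf("narrow data set: %.3f%n", hitRate(50, 10_000, 100, 42L));
        System.out.printf("wide data set:   %.3f%n", hitRate(100_000, 10_000, 100, 42L));
    }
}
```

With only 50 distinct keys, everything fits in the cache after warm-up. With 100,000 distinct keys, nearly every request misses, which is closer to what production traffic may look like.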
EDIT:
I don't think 5000 users is that much, and you may find that (performance-wise) you only need a single server to handle the load. That depends on the size of the server and the results of the load testing, of course, and you may consider a clustered solution for failover anyway. Not every one of your 5000 users will be hitting a button concurrently; you'll find the load going up in the morning, when everyone logs in.
You might want to consider an Apache HTTP server in front of your Tomcat servers. Apache will provide: compression, static caching, load-balancing and SSL.
Any reason for not using Spring? It has really become a de facto standard in enterprise Java applications.
Spring provides an incredibly powerful and flexible collection of technologies to improve your enterprise Java application development that is used by millions of developers.
Spring is lightweight and can sit as a middle layer connecting Vaadin and Hibernate, thereby creating a clean separation of layers. Spring's transaction management is also superior to Hibernate's. I suggest you go for it unless you have a strong reason not to.
Since you asked people to weigh in, I won't hold back my opinion. ORMs in general, and Hibernate in particular, are an anti-pattern. I know, I've worked in shops that use Hibernate over the past 9 years. Knowing what I know now, I will never use it again.
I highly recommend this blog post, as it puts it more succinctly than I can:
ORM is an anti-pattern
But forgive me if I quote the bit from that blog about ORMs and anti-patterns:
The reason I call ORM an anti-pattern is because it matches the two
criteria the author of AntiPatterns used to distinguish anti-patterns
from mere bad habits, specifically:
It initially appears to be beneficial, but in the long term has more
bad consequences than good ones
An alternative solution exists that is proven and repeatable
Your other technology choices seem fine. Personally, I lean more toward Jetty than Tomcat. There's a reason that Google embeds it in a lot of their projects (think GWT and PlayN); it's a younger codebase and I think more actively developed now that Eclipse has taken it over. Just my humble opinion.
[UPDATE] One more link, very long read but when making architectural decisions, reading is good.
Object/Relational Mapping: The Vietnam of Computer Science
I recommend GlassFish as the application server, because Apache Tomcat can only serve simple content, while GlassFish has a full implementation of the Java EE specification.
Depending on your specification and future goals, I would perhaps leave the normal version of Tomcat and go for Apache TomEE or my personal preference, JBoss. As I understand it, EJBs are not very well supported in the normal Tomcat version, and that is probably something sweet to have when you want to create a couple of services, some clustered singleton service and other stuff. But this is just my personal preference, of course, and if your specification will not allow a more advanced EE server then you should stick with the slick Tomcat.

Session Beans and EJB3 vs Spring

I was curious about the capabilities of Session Beans in EJB 3 and whether they can be replaced in a typical mid-scale enterprise application with Spring.
I found this article:
http://drag0sd0g.blogspot.com/2010/01/session-bean-alternative-spring.html
that states the following: "Because of heavy use of annotations, you can pretty much avoid “XML Hell” using EJB 3; the same cannot be said of Spring. Moreover, because it is an integral part of the Java EE standard, the EJB container is natively integrated with components such as JSF, JSP, servlets, the JTA transaction manager, JMS providers, and JAAS security providers of your application server. With Spring, you have to worry whether your application server fully supports the framework with these native components and other high-performance features like clustering, load balancing, and failover. If you aren’t worried about such things, then Spring is not a bad choice at all."
Do you agree with this statement? Stateless Session Beans used to be considered a very powerful enterprise technology because of their pooling and management capabilities. My question is: when is it really necessary to use EJB 3 instead of or in addition to Spring (assuming a mission-critical enterprise application in a large company)?
Looks like yet another Java EE vs. Spring post...
EJB/Java EE and Spring are now two mature, competitive Java-based technology stacks. Often there's no reason to complicate things and mix them up. EJB actually learned and used many ideas from Spring et al.
Neither of them drives you into the XML/configuration hell. Both are fairly easy to get started with, at least with the very basic stuff.
Spring is more than just IoC/SOA/transactions. It's more like a toolbox - it's ready to integrate with, or directly provides, frameworks for ORM and transactions, web/MVC, security, timers/scheduling etc. You can pick exactly the pieces you need. You're not forced to use a container (you can use it in your standalone "desktop" app).
EJB is part of the Java EE stack. It is, well, the standard. It's not as broad or flexible as Spring, but it's by definition supported by all Java EE containers.
I prefer Spring for the freedom and being one step ahead.
I don't think there are many cases when the use of EJB 3 instead of Spring is absolutely necessary, but there are cases when using EJB 3 would be considerably easier. As the article states, the main advantage of EJB is the integration with the various other JEE technologies and, as of EJB 3, Enterprise Beans are much simpler to write than they were in previous versions of the spec.
The classic reason for using EJB over POJOs or other middleware technologies is transactions. If your business logic needs to be transactional, then EJB provides simple, declarative transactional demarcation and seamless integration with JTA via the container. While the article suggests that support for clustering, load balancing and performance management is an advantage, this is very much dependent on your choice of JEE application server.
I'd say the key factor in deciding whether to use Spring or EJB 3 is your container. If your target container is a fully JEE 5+ compliant application server and you need support for services such as transactions or messaging then EJB 3 is the obvious choice. If, however, you don't need to integrate with other JEE technologies or are deploying to a light-weight app server then using EJB would simply add unnecessary overhead.
How anyone can think that EJB3's way of defining a data model, using a series of Java annotations spread out over several classes, is superior to Hibernate's simple model definition syntax is beyond me.
It's a maintainability nightmare. Why have you got an intersection table? It may be defined almost anywhere in the code base. Some junior programmer plays with the annotations and now your Java classes are out of sync with the actual database.
Got a performance issue (and you will)? Not only have you got the classic Hibernate "I don't know what SQL it's using" problem, you also have the "I don't know why the table was built like that" problem.

JavaEE Application Server or Lightweight Container?

Let me preface this by saying this is not an actual situation of mine but I'm asking this question more for my own knowledge and to get other people's inputs here.
I've used both Spring and EJB3/JBoss, and for the smaller types of applications I've built, Spring (+Tomcat when needed) has been much simpler to use. However, when scaling up to larger applications that require things like load balancing and clustering, is Spring still a viable solution? Or is it time to turn to a solution like EJB3/JBoss when you start to get big enough to need that? I'm not sure if I've scoped the problem well enough to get a good answer, so please let me know.
Thanks,
Jeff
Tomcat can be clustered.
Load balancing is usually a hardware solution (e.g., a BigIP or Cisco ACE) that's independent of app server.
Spring can be enterprise, just like EJB. There's no dividing line that says Spring can't handle it.
I can say that in our project, which is quite large (~500K LOC), we got rid of JBoss in favour of Spring/Tomcat for the sake of performance.
One of the key features of a J2EE application container (and of JBoss, as an implementation) is transparent distributed transactions across different kinds of transactional resources. That is a great idea, and it greatly simplifies coordinating, say, JMS messaging and database operations. But when high throughput is needed it becomes a problem: distributed transactions are notoriously slow.
It isn't an easy task to migrate from JBoss to Spring; however, it's possible, and Spring/Tomcat can be considered, with quite rare exceptions, a fully functional replacement for JBoss.

Should I use EJB3 or Spring for my business layer?

My team is developing a new service-oriented product with a web front-end. In discussions about what technologies we will use, we have settled on a JBoss application server, a Flex frontend (with possible desktop deployment using Adobe AIR), and web services to interface the client and server.
We've reached an impasse when it comes to which server technology to use for our business logic. The big argument is between EJB3 and Spring, with our biggest concerns being scalability and performance, and also maintainability of the code base.
Here are my questions:
What are the arguments for or against EJB3 vs Spring?
What pitfalls can I expect with each?
Where can I find good benchmark information?
There won't be much difference between EJB3 and Spring in terms of performance. We chose Spring for the following reasons (not mentioned in the question):
Spring drives the architecture in a direction that more readily supports unit testing. For example, inject a mock DAO object to unit test your business layer, or use Spring's MockHttpServletRequest object to unit test a servlet. We maintain a separate Spring config for unit tests that allows us to isolate tests to the specific layers.
An overriding driver was compatibility. If you need to support more than one App Server (or eventually want the option to move from JBoss to Glassfish, etc.), you will essentially be carrying your container (Spring) with you, rather than relying on compatibility between different implementations of the EJB3 specification.
Spring allows for technology choices for Persistence, object remoting, etc. For example, we are also using a Flex front end, and are using the Hessian protocol for communications between Flex and Spring.
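The unit-testing point in the list above (injecting a mock DAO into the business layer) might look something like this in plain Java. It is a minimal sketch, not the poster's code: `AccountDao`, `TransferService` and `InMemoryAccountDao` are hypothetical names, and the wiring is done by hand through the constructor rather than by a Spring config.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DAO contract; in production a JPA- or JDBC-backed class would implement it.
interface AccountDao {
    long balanceOf(String accountId);
    void updateBalance(String accountId, long newBalance);
}

// Business-layer POJO: it depends only on the interface, so any implementation can be injected.
class TransferService {
    private final AccountDao dao;

    TransferService(AccountDao dao) {
        this.dao = dao;
    }

    void transfer(String from, String to, long amount) {
        long available = dao.balanceOf(from);
        if (amount <= 0 || amount > available) {
            throw new IllegalArgumentException("invalid transfer amount");
        }
        dao.updateBalance(from, available - amount);
        dao.updateBalance(to, dao.balanceOf(to) + amount);
    }
}

// Hand-written in-memory stand-in used only in tests; no database is involved.
class InMemoryAccountDao implements AccountDao {
    private final Map<String, Long> balances = new HashMap<>();

    @Override
    public long balanceOf(String accountId) {
        return balances.getOrDefault(accountId, 0L);
    }

    @Override
    public void updateBalance(String accountId, long newBalance) {
        balances.put(accountId, newBalance);
    }
}

class TransferServiceSketch {
    public static void main(String[] args) {
        InMemoryAccountDao dao = new InMemoryAccountDao();
        dao.updateBalance("alice", 100L);
        new TransferService(dao).transfer("alice", "bob", 30L);
        System.out.println(dao.balanceOf("alice") + " / " + dao.balanceOf("bob"));
    }
}
```

Because the service is a POJO with constructor injection, the same class can be wired by Spring, by an EJB3 container, or by hand in a test, which is the isolation the answer describes.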
The gap between EJB3 and Spring is much smaller than it was, clearly. That said, one of the downsides to EJB3 now is that you can only inject into a bean, so you can end up turning components into beans that don't need to be.
The argument about unit testing is fairly irrelevant now - EJB3 is clearly designed to be more easily unit testable.
The compatibility argument above is also kind of irrelevant: whether you use EJB3 or Spring, you're still reliant on 3rd party-provided implementations of transaction managers, JMS, etc.
What would swing it for me, however, is support by the community. Working on an EJB3 project last year, there just weren't a lot of people out there using it and talking about their problems. Spring, rightly or wrongly, is extremely pervasive, particularly in the enterprise, and that makes it easier to find someone who's got the same problem you're trying to solve.
What are the arguments for or against EJB3 vs Spring?
Spring is always innovating and recognizes real-world constraints. Spring offered simplicity and elegance for the Java 1.4 application servers and didn't require a version of the J2EE specification that no one had access to in 2004 - 2006. At this point it is almost a religious debate that you can get sucked into - Spring + abstraction + open-source versus Java Enterprise Edition (Java EE) 5.0 specifications.
I think Spring complements more than competes with the Java EE specifications. As the features that were once unique to Spring continue to get rolled into the specification, many will argue that EJB 3 offers a 'good enough' feature set for most internal business applications.
What pitfalls can I expect with each?
If you're treating this as a persistence issue (Spring + JPA versus EJB3), you're really not making that big of a choice.
Where can I find good benchmark information?
I haven't followed the specj benchmark results for some time, but they were popular for a while. It seems that each vendor (IBM, JBoss, Oracle, and Sun) gets less and less interested in having a compliant server. The lists of certified vendors get shorter and shorter as you go from Java Enterprise Edition 1.3 to 1.4 to 1.5. I think the days of a giant server that is fully compliant with all the specifications are over.
I would definitely recommend EJB3 over Spring. We find that it's more streamlined, nicer to code in, and better supported. I have in the past used Spring and found it to be very confusing, and not as well documented as EJB3 (or JPA, I guess, at the end of the day).
As of EJB3 you no longer have to deal with external config files, and there's only one POJO that you annotate per database table. This POJO can be passed to your web tier without any problems. IDEs like Netbeans can even auto-generate these POJOs for you. We've used EJB3 now as the back end for quite a few large scale applications, and haven't noticed any performance problems.
Your Session Beans can be easily exposed as web services which you could expose to your Flex frontend.
Session beans are easy to lock down at either a method or class level to assign roles and things like that if you need to.
I can't speak that much about Spring, as I only tried it out for a few weeks. But my overall impression of it was very poor. That doesn't mean it's a bad framework, but our team here has found EJB3 to be the best for the persistence/business layer.
I tend to prefer Spring over EJB3, but my recommendation would be that, whichever approach you take, you try to stick to writing POJOs and use the standard annotations where possible, like the JSR annotations @PostConstruct, @PreDestroy and @Resource. These work with both EJB3 and Spring, so you can pick whichever framework you prefer.
e.g. you could decide on some project to use Guice instead for IoC.
If you want to use per-request injection, such as in a web application, you might find Guice is quite a bit faster for dependency injection than Spring.
Session beans mostly boil down to dependency injection and transactions, so EJB3 and Spring are kinda similar really for that. Where Spring has the edge is on better dependency injection and nicer abstractions for things like JMS.
I have used a very similar architecture in the past. Spring + Java 1.5 + ActionScript 2/3, when combined with Flex Data Services, made it all very easy (and fun!) to code.
Note, though, that a Flex front end means you need adequately powerful client machines.
Regarding your question:
What are the arguments for or against EJB3 vs Spring?
I suggest reading the response from the experts: A RESPONSE TO: EJB 3 AND SPRING COMPARATIVE ANALYSIS by Mark Fisher. Read the comments to find Reza Rahman's remarks (EJB 3.0).
Another thing in favor of Spring is that most of the other tools and frameworks out there have better support for integration with Spring, and most of them use Spring internally as well (e.g. ActiveMQ, Camel, CXF etc).
It is also more mature, and there are a lot more resources (books, articles, best practices etc) and experienced developers available than for EJB3.
I think EJB is a good component technology but not a good framework. Spring is the best framework available as of today. So I would consider Spring the best implementation of JEE in the sense of a framework, and my recommendation is to use Spring in every project, since it gives us the flexibility to integrate with any component technology easily.
