I have a stateful class that requires a utility-like stateless class to perform an operation on it. Instances of the stateful class are kept in a list in a (stateful) container class. This is what I would do in plain Java:
class Container {
    List<Stateful> sfList = new ArrayList<>();
}
class Stateful {
    List<String> list = new ArrayList<>();
    List<String> getList() { return list; }
    void someMethod() {
        list.add("SF");
        Stateless.foo(this);
    }
}
class Stateless {
    public static void foo(Stateful sf) {
        sf.getList().add("SL");
    }
}
and inside main this is the procedure:
Stateful sf = new Stateful();
sfList.add(sf);
sf.someMethod();
Now with JavaEE CDI I would do:
class StatefulBean {
    List<String> list;
    @Inject
    StatelessBean slsb;
    void someMethod() {
        list.add("SF");
        slsb.add(this);
    }
}
@Stateless
class StatelessBean {
    public void add(StatefulBean sfsb) {
        sfsb.list.add("SL");
    }
}
In this case every StatefulBean could access a shared pool of StatelessBean instances with no concurrency issues, and it would scale properly with requests.
However, since Stateful is not a managed bean I can't inject into it, so I used the utility class instead. I also create Stateful with a constructor, so the stateless bean can't be injected into it (I get an NPE when I use it).
My questions are:
Are there concurrency and scalability differences between the stateless injection approach (provided it would work) and the utility class approach?
How can I make the EE injection approach work?
Are there concurrency and scalability differences between the stateless injection approach (provided it would work) and the utility class approach?
Yes there are, primarily around the loss of management. The way you're instantiating Stateful on every invocation of the method, there's no pooling involved. This is going to lead to you creating more instances than you probably need.
Another loss is on the scalability side. Where the container will manage the passivation and activation of your stateful bean in a distributed environment, the manual approach will see to it that you manage your own activation and passivation.
Since Stateful is not a managed bean..
Incorrect. According to the CDI spec, any Java class that meets the listed criteria (in your case, a default no-arg constructor) is a managed bean. That means that you could @Inject StatelessBean into Stateful and the container would oblige. To allow this level of management, you'll need to set bean-discovery-mode="all" in your beans.xml.
Even with the (apparently needless) circular reference, normal Java concurrency rules apply: as long as the state that you're manipulating is not static or in a static class, you're threadsafe. Each threaded call to that static method still operates on a separate stack, so no problems there.
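A minimal plain-Java sketch of that point, assuming the Stateful/Stateless shapes from the question (ThreadSafetyDemo and its run method are hypothetical names added for illustration). Each thread works on its own Stateful instance, so the static helper never touches shared state:

```java
import java.util.ArrayList;
import java.util.List;

class Stateful {
    private final List<String> list = new ArrayList<>();

    List<String> getList() { return list; }

    void someMethod() {
        list.add("SF");
        Stateless.foo(this); // static call, but all state lives in 'this'
    }
}

class Stateless {
    // No fields: the only data touched belongs to the argument.
    static void foo(Stateful sf) {
        sf.getList().add("SL");
    }
}

class ThreadSafetyDemo {
    // Runs someMethod() on 'threads' separate instances, one per thread.
    static List<Stateful> run(int threads) {
        List<Stateful> instances = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Stateful sf = new Stateful();
            instances.add(sf);
            Thread t = new Thread(sf::someMethod);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // join() also makes each thread's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return instances;
    }
}
```

If two threads shared a single Stateful instance, its ArrayList would need synchronization, but that would be a property of the shared instance rather than of the static method.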
How can I make the EE injection approach work?
If you need on-demand instantiation of Stateful (or any other bean, really), use the CDI Instance to programmatically obtain a managed instance of any class you want. You can now add something like this to Container:
@Inject @Dependent Instance<Stateful> stateful;
@PostConstruct
public void createStateless() {
    sfList = new ArrayList<>(); // instantiate sfList
    sfList.add(stateful.get()); // execute as many times as you need
}
Related
I have recently noticed that Spring successfully intercepts intra-class function calls in a @Configuration class but not in a regular bean.
A call like this
@Repository
public class CustomerDAO {
    @Transactional(value=TxType.REQUIRED)
    public void saveCustomer() {
        // some DB stuff here...
        saveCustomer2();
    }
    @Transactional(value=TxType.REQUIRES_NEW)
    public void saveCustomer2() {
        // more DB stuff here
    }
}
fails to start a new transaction because while the code of saveCustomer() executes in the CustomerDAO proxy, the code of saveCustomer2() gets executed in the unwrapped CustomerDAO class, as I can see by looking at 'this' in the debugger, and so Spring has no chance to intercept the call to saveCustomer2.
However, in the following example, when transactionManager() calls createDataSource() it is correctly intercepted and calls createDataSource() of the proxy, not of the unwrapped class, as evidenced by looking at 'this' in the debugger.
@Configuration
public class PersistenceJPAConfig {
    @Bean
    public DriverManagerDataSource createDataSource() {
        DriverManagerDataSource dataSource = new DriverManagerDataSource();
        //dataSource.set ... DB stuff here
        return dataSource;
    }
    @Bean
    public PlatformTransactionManager transactionManager() {
        DataSourceTransactionManager transactionManager = new DataSourceTransactionManager(createDataSource());
        return transactionManager;
    }
}
So my question is, why can Spring correctly intercept the intra class function calls in the second example, but not in the first. Is it using different types of dynamic proxies?
Edit:
From the answers here and other sources I now understand the following:
@Transactional is implemented using Spring AOP, where the proxy pattern is carried out by wrapping/composition of the user class. The AOP proxy is generic enough so that many Aspects can be chained together, and may be a CGLib proxy or a Java Dynamic Proxy.
In the @Configuration class, Spring also uses CGLib to create an enhanced class which inherits from the user @Configuration class, and overrides the user's @Bean functions with ones that do some extra work before calling the user's/super function, such as checking whether this is the first invocation of the function or not. Is this class a proxy? It depends on the definition. You may say that it is a proxy which uses inheritance from the real object instead of wrapping it using composition.
To sum up, from the answers given here I understand these are two entirely different mechanisms. Why these design choices were made is another, open question.
Is it using different types of dynamic proxies?
Almost exactly
Let's figure out the difference between @Configuration classes and AOP proxies by answering the following questions:
Why does a self-invoked @Transactional method have no transactional semantics even though Spring is capable of intercepting self-invoked methods?
How are @Configuration and AOP related?
Why does a self-invoked @Transactional method have no transactional semantics?
Short answer:
This is how AOP is made.
Long answer:
Declarative transaction management relies on AOP (for the majority of Spring applications on Spring AOP)
The Spring Framework’s declarative transaction management is made possible with Spring aspect-oriented programming (AOP)
It is proxy-based (§5.8.1. Understanding AOP Proxies)
Spring AOP is proxy-based.
From the same paragraph SimplePojo.java:
public class SimplePojo implements Pojo {
public void foo() {
// this next method invocation is a direct call on the 'this' reference
this.bar();
}
public void bar() {
// some logic...
}
}
And a snippet proxying it:
public class Main {
public static void main(String[] args) {
ProxyFactory factory = new ProxyFactory(new SimplePojo());
factory.addInterface(Pojo.class);
factory.addAdvice(new RetryAdvice());
Pojo pojo = (Pojo) factory.getProxy();
// this is a method call on the proxy!
pojo.foo();
}
}
The key thing to understand here is that the client code inside the main(..) method of the Main class has a reference to the proxy.
This means that method calls on that object reference are calls on the proxy.
As a result, the proxy can delegate to all of the interceptors (advice) that are relevant to that particular method call.
However, once the call has finally reached the target object (the SimplePojo reference in this case), any method calls that it may make on itself, such as this.bar() or this.foo(), are going to be invoked against the this reference, and not the proxy.
This has important implications. It means that self-invocation is not going to result in the advice associated with a method invocation getting a chance to execute.
(Key parts are emphasized.)
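The quoted behaviour can be reproduced without Spring. The following is only a sketch that uses a JDK dynamic proxy in place of Spring's ProxyFactory; SelfInvocationDemo and interceptedCalls are hypothetical names, and the recording handler stands in for the advice:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface Pojo {
    void foo();
    void bar();
}

class SimplePojo implements Pojo {
    @Override
    public void foo() {
        this.bar(); // direct call on 'this', never reaches the proxy
    }

    @Override
    public void bar() {
        // some logic...
    }
}

class SelfInvocationDemo {
    // Returns the names of the methods the proxy saw for one pojo.foo() call.
    static List<String> interceptedCalls() {
        List<String> seen = new ArrayList<>();
        SimplePojo target = new SimplePojo();
        InvocationHandler handler = (proxy, method, args) -> {
            seen.add(method.getName());         // stands in for the advice
            return method.invoke(target, args); // delegate to the real object
        };
        Pojo pojo = (Pojo) Proxy.newProxyInstance(
                Pojo.class.getClassLoader(), new Class<?>[] {Pojo.class}, handler);
        pojo.foo(); // a call on the proxy
        return seen;
    }
}
```

Only foo is recorded: the nested bar() call runs on the target itself, so the handler never sees it.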
You may think that aop works as follows:
Imagine we have a Foo class which we want to proxy:
Foo.java:
public class Foo {
public int getInt() {
return 42;
}
}
There is nothing special here, just a getInt method returning 42.
An interceptor:
Interceptor.java:
public interface Interceptor {
Object invoke(InterceptingFoo interceptingFoo);
}
LogInterceptor.java (for demonstration):
public class LogInterceptor implements Interceptor {
@Override
public Object invoke(InterceptingFoo interceptingFoo) {
System.out.println("log. before");
try {
return interceptingFoo.getInt();
} finally {
System.out.println("log. after");
}
}
}
InvokeTargetInterceptor.java:
public class InvokeTargetInterceptor implements Interceptor {
@Override
public Object invoke(InterceptingFoo interceptingFoo) {
try {
System.out.println("Invoking target");
Object targetRetVal = interceptingFoo.method.invoke(interceptingFoo.target);
System.out.println("Target returned " + targetRetVal);
return targetRetVal;
} catch (Throwable t) {
throw new RuntimeException(t);
} finally {
System.out.println("Invoked target");
}
}
}
Finally InterceptingFoo.java:
public class InterceptingFoo extends Foo {
public Foo target;
public List<Interceptor> interceptors = new ArrayList<>();
public int index = 0;
public Method method;
@Override
public int getInt() {
try {
Interceptor interceptor = interceptors.get(index++);
return (Integer) interceptor.invoke(this);
} finally {
index--;
}
}
}
Wiring everything together:
public static void main(String[] args) throws Throwable {
Foo target = new Foo();
InterceptingFoo interceptingFoo = new InterceptingFoo();
interceptingFoo.method = Foo.class.getDeclaredMethod("getInt");
interceptingFoo.target = target;
interceptingFoo.interceptors.add(new LogInterceptor());
interceptingFoo.interceptors.add(new InvokeTargetInterceptor());
interceptingFoo.getInt();
interceptingFoo.getInt();
}
Will print:
log. before
Invoking target
Target returned 42
Invoked target
log. after
log. before
Invoking target
Target returned 42
Invoked target
log. after
Now let's take a look at ReflectiveMethodInvocation.
Here is a part of its proceed method:
Object interceptorOrInterceptionAdvice = this.interceptorsAndDynamicMethodMatchers.get(++this.currentInterceptorIndex);
++this.currentInterceptorIndex should look familiar now. ReflectiveMethodInvocation also holds the same pieces as the example above: the target, the interceptors, the method, and the index.
You may try introducing several aspects into your application and watch the stack grow at the proceed method when an advised method is invoked.
Finally everything ends up at MethodProxy.
From its invoke method javadoc:
Invoke the original method, on a different object of the same type.
And, as the documentation quoted previously says:
once the call has finally reached the target object any method calls that it may make on itself are going to be invoked against the this reference, and not the proxy
I hope now, more or less, it's clear why.
How are @Configuration and AOP related?
The answer is they are not related.
So Spring here is free to do whatever it wants. Here it is not tied to the proxy AOP semantics.
It enhances such classes using ConfigurationClassEnhancer.
Take a look at:
CALLBACKS
BeanMethodInterceptor
BeanFactoryAwareMethodInterceptor
Returning to the question
If Spring can successfully intercept intra class function calls in a @Configuration class, why does it not support it in a regular bean?
I hope that, from a technical point of view, it is clear why.
Now my thoughts from the non-technical side:
I think it is not done because Spring AOP has been around long enough...
Spring Framework 5 introduced the Spring WebFlux framework.
Currently the Spring team is working hard on enhancing the reactive programming model.
See some notable recent blog posts:
Reactive Transactions with Spring
Spring Data R2DBC 1.0 M2 and Spring Boot starter released
Going Reactive with Spring, Coroutines and Kotlin Flow
More and more features that move towards a less proxy-heavy approach to building Spring applications are being introduced (see this commit, for example).
So I think that even though it might be possible to do what you've described, it is far from the Spring team's #1 priority for now.
Because AOP proxies and @Configuration classes serve different purposes and are implemented in significantly different ways (even though both involve using proxies).
Basically, AOP uses composition while @Configuration uses inheritance.
AOP proxies
The way these work is basically that they create proxies that do the relevant advice logic before/after delegating the call to the original (proxied) object. The container registers this proxy instead of the proxied object itself, so all dependencies are set to this proxy and all calls from one bean to another go through this proxy. However, the proxied object itself has no pointer to the proxy (it doesn't know it's proxied, only the proxy has a pointer to the target object). So any calls within that object to other methods don't go through the proxy.
(I'm only adding this here for contrast with @Configuration, since you seem to have a correct understanding of this part.)
@Configuration
Now while the objects that you usually apply the AOP proxy to are a standard part of your application, the @Configuration class is different. For one, you probably never intend to create any instances of that class directly yourself. This class truly is just a way to write configuration of the bean container. It has no meaning outside Spring, and you know that it will be used by Spring in a special way and that it has some special semantics beyond plain Java code, e.g. that @Bean-annotated methods actually define Spring beans.
Because of this, Spring can do much more radical things to this class without worrying that it will break something in your code (remember, you know that you only provide this class for Spring, and you aren't going to ever create or use its instance directly).
What it actually does is create a proxy that's a subclass of the @Configuration class. This way, it can intercept invocation of every (non-final non-private) method of the @Configuration class, even within the same object (because the methods are effectively all overridden by the proxy, and all Java methods are virtual). The proxy does exactly this to redirect any method calls that it recognizes to be (semantically) references to Spring beans to the actual bean instances instead of invoking the superclass method.
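To make the inheritance idea concrete, here is a hand-written sketch of what such a subclass could look like. Spring generates its subclass at runtime, so this is an illustration only; AppConfig, EnhancedAppConfig and the two stub types are hypothetical names, and the map stands in for the bean container:

```java
import java.util.HashMap;
import java.util.Map;

class DataSourceStub { }   // hypothetical stand-in for a real DataSource

class TxManagerStub {      // hypothetical stand-in for a transaction manager
    final DataSourceStub dataSource;

    TxManagerStub(DataSourceStub dataSource) { this.dataSource = dataSource; }
}

class AppConfig {
    public DataSourceStub createDataSource() {
        return new DataSourceStub();
    }

    public TxManagerStub transactionManager() {
        // Intra-class call: virtual dispatch sends it to the subclass override.
        return new TxManagerStub(createDataSource());
    }
}

// Hand-written version of what a generated subclass could look like.
class EnhancedAppConfig extends AppConfig {
    private final Map<String, Object> beans = new HashMap<>();

    @Override
    public DataSourceStub createDataSource() {
        // Call the user's method only once, then keep returning the same bean.
        return (DataSourceStub) beans.computeIfAbsent(
                "createDataSource", name -> super.createDataSource());
    }
}
```

Because transactionManager() runs on an EnhancedAppConfig instance, its call to createDataSource() lands in the override, which is why no "pointer back to the proxy" is needed here.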
I read a bit of the Spring source code, so I'll try to answer.
The point is how Spring deals with @Configuration and @Bean.
ConfigurationClassPostProcessor, which is a BeanFactoryPostProcessor, enhances every configuration class by using an Enhancer to create a subclass of it.
This Enhancer registers two CALLBACKS (BeanMethodInterceptor and BeanFactoryAwareMethodInterceptor).
Any call to a PersistenceJPAConfig method goes through these CALLBACKS. In BeanMethodInterceptor, the bean is obtained from the Spring container.
If this is not clear, see the source code: BeanMethodInterceptor in ConfigurationClassEnhancer.java, and enhanceConfigurationClasses in ConfigurationClassPostProcessor.java.
You can't call a @Transactional method from within the same class
It's a limitation of Spring AOP (dynamic proxies and CGLIB).
If you configure Spring to use AspectJ to handle the transactions, your code will work.
The simple and probably best alternative is to refactor your code. For example, have one class that handles users and one that processes each user. Then default transaction handling with Spring AOP will work.
Also, @Transactional should be on the service layer and not on a @Repository:
Transactions belong on the service layer. It's the one that knows about units of work and use cases. It's the right answer if you have several DAOs injected into a service that need to work together in a single transaction.
So you need to rethink your transaction approach, so that your methods can be reused in a flow that includes several other DAO operations that can be rolled back.
Spring uses proxying for method invocation, and when you use this... it bypasses that proxy. For @Bean annotations, Spring uses reflection to find them.
I know I can inject, as an Instance, all the beans that match the interface and then choose between them programmatically:
@Inject @Any Instance<PaymentProcessor> paymentProcessorSource;
That means I have to put the selecting logic into the client.
Can I, as an alternative, cache the value of the EJB using lexical scoping with a lambda expression? Will the container be able to correctly manage the lifecycle of the EJB in that case, or is this a practice to avoid?
For example, having PaymentProcessorImpl1 and PaymentProcessorImpl2 as two strategies of PaymentProcessor, something like this:
public class PaymentProcessorProducer {
    @Inject
    private PaymentProcessorImpl1 paymentProcessorImpl1;
    @Inject
    private PaymentProcessorImpl2 paymentProcessorImpl2;
    @Produces
    private Function<String, PaymentProcessor> produce() {
        return (strategyValue) -> {
            if ("strategy1".equals(strategyValue)) {
                return paymentProcessorImpl1;
            } else if ("strategy2".equals(strategyValue)) {
                return paymentProcessorImpl2;
            } else {
                throw new IllegalStateException("Unhandled type: "
                        + strategyValue);
            }
        };
    }
}
and then in the client do something like this:
@Inject
Function<String, PaymentProcessor> paymentProcessor;
...
paymentProcessor.apply("strategy1")
Can I, as an alternative, cache the value of the ejb using lexical scoping with lambda expression?
Theoretically, you could do this. Whether it works is easy to try on your own.
Will the container be able to correctly manage the lifecycle of the ejb in that case or is this practice to avoid?
What exactly is an EJB here? The implementation of PaymentProcessor? Note that EJB beans are different from CDI beans: the CDI container does not control the lifecycle of EJB beans, it "only provides a wrapper for you to use them as if they were CDI beans".
That being said, the lifecycle is still the same. In your case the producer is creating a @Dependent bean, meaning that every time you inject Function<String, PaymentProcessor>, the producer will be invoked.
What poses a certain problem is that you are assuming two or more contexts are active at any given time. The moment you decide to actually apply() the function, the scope within which your implementation(s) exist may or may not be active. If they are both ApplicationScoped, for instance, you should be alright. If, however, they are SessionScoped and you happen to time out or invalidate the session before applying the function, then you get into a very weird state.
This is probably why I would rather avoid this approach and go with qualifiers. Or you can introduce a new bean which has both strategies in it and have a method with an argument which decides which strategy to use.
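The second alternative, a bean that holds both strategies and takes the selector as a method argument, could look roughly like this. It is a plain-Java sketch: in a CDI application the two implementations would be injected rather than passed to a constructor, and PaymentProcessorSelector as well as the process(String) method are hypothetical names, since the question does not show the PaymentProcessor interface:

```java
import java.util.HashMap;
import java.util.Map;

interface PaymentProcessor {
    String process(String payment); // hypothetical business method
}

class PaymentProcessorImpl1 implements PaymentProcessor {
    @Override
    public String process(String payment) { return "impl1:" + payment; }
}

class PaymentProcessorImpl2 implements PaymentProcessor {
    @Override
    public String process(String payment) { return "impl2:" + payment; }
}

// In a CDI application the two implementations would be injected fields.
class PaymentProcessorSelector {
    private final Map<String, PaymentProcessor> strategies = new HashMap<>();

    PaymentProcessorSelector(PaymentProcessor impl1, PaymentProcessor impl2) {
        strategies.put("strategy1", impl1);
        strategies.put("strategy2", impl2);
    }

    // Looks the strategy up at call time and delegates immediately.
    String process(String strategyValue, String payment) {
        PaymentProcessor processor = strategies.get(strategyValue);
        if (processor == null) {
            throw new IllegalStateException("Unhandled type: " + strategyValue);
        }
        return processor.process(payment);
    }
}
```

Since selection and invocation happen in the same call, the client never holds on to a function that might be applied after the underlying scope has ended.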
Similar questions have been asked, but don't quite address what I'm trying to do. We have an older Seam 2.x-based application with a batch job framework that we are converting to CDI. The job framework uses the Seam Contexts object to initiate a conversation. The job framework also loads a job-specific data holder (basically a Map) that can then be accessed, via the Seam Contexts object, by any service down the chain, including from SLSBs. Some of these services can update the Map, so that job state can change and be detected from record to record.
It looks like in CDI, the job will @Inject a CDI Conversation object, and manually begin/end the conversation. We would also define a new ConversationScoped bean that holds the Map (MapBean). What's not clear to me are two things:
First, the job needs to also @Inject the MapBean so that it can be loaded with job-specific data before the Conversation.begin() method is called. Would the container know to pass this instance to services down the call chain?
Related to that, according to this question, Is it possible to @Inject a @RequestScoped bean into a @Stateless EJB?, it should be possible to inject a ConversationScoped bean into a SLSB, but it seems a bit magical. If the SLSB is used by a different process (job, UI call, etc.), does it get a separate instance for each call?
Edits for clarification and a simplified class structure:
MapBean would need to be a ConversationScoped object, containing data for a specific instance/run of a job.
@ConversationScoped
public class MapBean implements Serializable {
private Map<String, Object> data;
// accessors
public Object getData(String key) {
return data.get(key);
}
public void setData(String key, Object value) {
data.put(key, value);
}
}
The job would be ConversationScoped:
@ConversationScoped
public class BatchJob {
@Inject private MapBean mapBean;
@Inject private Conversation conversation;
@Inject private JobProcessingBean jobProcessingBean;
public void runJob() {
try {
conversation.begin();
mapBean.setData("key", "value"); // is this MapBean instance now bound to the conversation?
jobProcessingBean.doWork();
} catch (Exception e) {
// catch something
} finally {
conversation.end();
}
}
}
The job might call a SLSB, and the current conversation-scoped instance of MapBean needs to be available:
@Stateless
public class JobProcessingBean {
@Inject private MapBean mapBean;
public void doWork() {
// when this is called, is "mapBean" the current conversation instance?
Object value = mapBean.getData("key");
}
}
Our job and SLSB framework is quite complex, the SLSB can call numerous other services or locally instantiated business logic classes, and each of these would need access to the conversation-scoped MapBean.
First, the job needs to also @Inject the MapBean so that it can be loaded with job-specific data before the Conversation.begin() method is called. Would the container know to pass this instance to services down the call chain?
Yes, since MapBean is @ConversationScoped it is tied to the call chain for the duration starting from conversation.begin() until conversation.end(). You can think of @ConversationScoped (and @RequestScoped and @SessionScoped) beans as instances in a ThreadLocal: while there exists an instance of them for every thread, each instance is tied to that single thread.
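The ThreadLocal analogy can be written out in plain Java. This is only the analogy, not how a CDI container resolves contextual instances; ConversationContext, JobProcessing and BatchJobSketch are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the container's "current conversation" lookup.
class ConversationContext {
    private static final ThreadLocal<Map<String, Object>> CURRENT = new ThreadLocal<>();

    static void begin() { CURRENT.set(new HashMap<>()); }

    static void end() { CURRENT.remove(); }

    static Map<String, Object> current() {
        Map<String, Object> data = CURRENT.get();
        if (data == null) {
            throw new IllegalStateException("no active conversation");
        }
        return data;
    }
}

class JobProcessing {
    // Never receives the map as an argument, yet sees the job's instance.
    static Object doWork() {
        return ConversationContext.current().get("key");
    }
}

class BatchJobSketch {
    static Object runJob(String value) {
        ConversationContext.begin();
        try {
            ConversationContext.current().put("key", value);
            return JobProcessing.doWork();
        } finally {
            ConversationContext.end();
        }
    }
}
```

A second job running on another thread would get its own map, which mirrors the "separate instance per call chain" behaviour described above.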
Related to that, according to this question, Is it possible to @Inject a @RequestScoped bean into a @Stateless EJB?, it should be possible to inject a @ConversationScoped bean into a SLSB, but it seems a bit magical. If the SLSB is used by a different process (job, UI call, etc.), does it get a separate instance for each call?
It's not as magical as you think if you see that this pattern is the same as the one I explained above. The SLSB indeed gets a separate instance, but not just any instance, the one which belongs to the scope from which the SLSB was called.
In addition to the link you posted, see also this answer.
I've tested code similar to what you posted and it works as expected: the MapBean is the same one injected throughout the call. Just be careful with 2 things:
BatchJob is also @ConversationScoped but does not implement Serializable, which will not allow the bean to passivate.
data is not initialized, so you will get an NPE in runJob().
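A sketch of MapBean with the second point addressed, assuming an eagerly created HashMap is acceptable for the job data. The scope annotation is left as a comment so the snippet compiles without the CDI API on the classpath:

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// @ConversationScoped would go here in the real bean.
class MapBean implements Serializable {
    private static final long serialVersionUID = 1L;

    // Initialized eagerly so setData() cannot hit a null map.
    private final Map<String, Object> data = new HashMap<>();

    public Object getData(String key) {
        return data.get(key);
    }

    public void setData(String key, Object value) {
        data.put(key, value);
    }
}
```

For the first point, BatchJob would likewise need to declare implements Serializable.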
Without any code samples, I'll have to do some guessing, so let's see if I got you right.
Would the container know to pass this instance to services down the call chain?
If you mean to use the same instance elsewhere in the call, then this can be easily achieved by making the MapBean an @ApplicationScoped bean (or, alternatively, an EJB @Singleton).
it should be possible to inject a ConversationScoped bean into a SLSB, but it seems a bit magical.
Here I suppose that the reason why it seems magical is that, in terms of CDI, a SLSB is a @Dependent bean. And as you probably know, CDI always creates a new instance for a dependent bean injection point. I.e., yes, you get a different SLSB/dependent bean instance for each call.
Perhaps some other scope would fit you better here, like @RequestScoped or @SessionScoped? Hard to tell without more details.
I'm developing a JEE application where each request done to "facade" beans should run a single transaction.
Basically, in each method, I could do it like this:
@Override
public void updateSalaries(float factor)
{
initializeTransaction();
// Actual business code...
commitTransaction();
}
Here updateSalaries() is a method invoked by the client, and initializeTransaction() and commitTransaction() respectively take care of getting/starting and committing/rolling back (if necessary) the transaction.
Unfortunately, the transaction management should be more transparent: something developers should not care about when writing business methods.
Therefore, some way to "decorate" these business methods would be nice, but I can't think of a possible way to do that.
Another possible solution I thought of would be to handle it in a central DataAccessBean class, where I would start the transaction on @PostConstruct and commit it on @PreDestroy:
@Stateless
public class DataAccessBean implements IDataAccessBean
{
    @PostConstruct
    public void initializeTransaction() { /* ... */ }
    @PreDestroy
    public void endTransaction() { /* ... */ }
    @Override
    public <T extends Serializable> T getObjectById(
            Class<T> objectType, Object key) { /* ... */ }
    @Override
    public void saveObject(Serializable object) { /* ... */ }
}
I'm not sure though, if I can rely on that mechanism. An important question would also be, what type of bean I'd need: I doubt a stateful bean would be suitable as the transaction is per-request and not per-session. Maybe a stateless bean would be a good option, but AFAIK a stateless bean might not be destroyed when a request completes (if it resides in a stateless bean pool).
Two little constraints:
The solution should not depend on a particular non-standard framework or JEE-server
The solution should be compatible with JEE 6 and JEE 7
Thanks for the suggestions.
What you need is addressed by Java Transaction API (JTA). From the JEE6 Tutorial (Part VIII - Chapter 42):
The Java Transaction API (JTA) allows applications to access transactions in a manner that is
independent of specific implementations. JTA specifies standard Java interfaces between a
transaction manager and the parties involved in a distributed transaction system: the
transactional application, the Java EE server, and the manager that controls access to the shared
resources affected by the transactions.
You want to use container-managed transactions. With this strategy, you just need to decorate the beans/methods with the appropriate transaction attribute, e.g.:
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void myMethod() {
...
}
The design of your service layer must carefully address the transaction life cycle in case you have services calling services (nested service calls).
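To picture what "decorating" a business method means, here is a plain-Java sketch that wraps a facade in a JDK dynamic proxy and brackets each call with begin/commit/rollback. It is not how a Java EE container is implemented, and with container-managed transactions you would not write it yourself; TransactionalDecorator, SalaryFacade and the LOG list (which stands in for a real transaction manager) are hypothetical names:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface SalaryFacade {
    void updateSalaries(float factor);
}

class SalaryFacadeImpl implements SalaryFacade {
    @Override
    public void updateSalaries(float factor) {
        if (factor < 0) {
            throw new IllegalArgumentException("factor must not be negative");
        }
        // Actual business code...
    }
}

class TransactionalDecorator {
    // Records what a real transaction manager would be asked to do.
    static final List<String> LOG = new ArrayList<>();

    static <T> T wrap(T target, Class<T> type) {
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type},
                (self, method, args) -> {
                    LOG.add("begin");
                    try {
                        Object result = method.invoke(target, args);
                        LOG.add("commit");
                        return result;
                    } catch (InvocationTargetException e) {
                        LOG.add("rollback");
                        throw e.getCause(); // rethrow the business exception
                    }
                });
        return type.cast(proxy);
    }
}
```

The business method itself contains no transaction code, which is the transparency the question asks for; the @TransactionAttribute annotation gives the same effect declaratively.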
We have some JavaEE5 stateless EJB bean that passes the injected EntityManager to its helpers.
Is this safe? It has worked well until now, but I found out some Oracle document that states its implementation of EntityManager is thread-safe. Now I wonder whether the reason we did not have issues until now, was only because the implementation we were using happened to be thread-safe (we use Oracle).
@Stateless
class SomeBean {
    @PersistenceContext
    private EntityManager em;
    private SomeHelper helper;
    @PostConstruct
    public void init(){
        helper = new SomeHelper(em);
    }
    @Override
    public void business(){
        helper.doSomethingWithEm();
    }
}
Actually, it makes sense. If EntityManager were not thread-safe, a container would have to do
intercept business()
this.em = newEntityManager();
business();
which will not propagate to its helper classes.
If so, what is the best practice in this kind of a situation? Passing EntityManagerFactory instead of EntityManager?
EDIT: This question is very interesting so if you are interested in this question, you probably want to check out this one, too:
EDIT: More info.
ejb3.0 spec
4.7.11 Non-reentrant Instances
The container must ensure that only one
thread can be executing an instance at
any time. If a client request arrives
for an instance while the instance is
executing another request, the
container may throw the
javax.ejb.ConcurrentAccessException to
the second client[24]. If the EJB 2.1
client view is used, the container may
throw the java.rmi.RemoteException to
the second request if the client is a
remote client, or the
javax.ejb.EJBException if the client
is a local client.[25] Note that a
session object is intended to support
only a single client. Therefore, it
would be an application error if two
clients attempted to invoke the same
session object. One implication of
this rule is that an application
cannot make loopback calls to a
session bean instance.
And,
4.3.2 Dependency Injection
A session bean may use dependency injection
mechanisms to acquire references to
resources or other objects in its
environment (see Chapter 16,
“Enterprise Bean Environment”). If a
session bean makes use of dependency
injection, the container injects these
references after the bean instance is
created, and before any business
methods are invoked on the bean
instance. If a dependency on the
SessionContext is declared, or if the
bean class implements the optional
SessionBean interface (see Section
4.3.5), the SessionContext is also injected at this time. If dependency
injection fails, the bean instance is
discarded. Under the EJB 3.0 API, the
bean class may acquire the
SessionContext interface through
dependency injection without having to
implement the SessionBean interface.
In this case, the Resource annotation
(or resource-env-ref deployment
descriptor element) is used to denote
the bean’s dependency on the
SessionContext. See Chapter 16,
“Enterprise Bean Environment”.
I used a similar pattern, but the helper was created in @PostConstruct and the injected entity manager was passed in the constructor as a parameter. Each EJB instance had its own helper, so thread-safety was guaranteed.
I also had a variant where the entity manager was not injected (because the EJB wasn't using it at all), so the helper had to look it up with InitialContext. In this case, the persistence context must still be "imported" in the parent EJB with @PersistenceContext:
@Stateless
@PersistenceContext(name="OrderEM")
public class MySessionBean implements MyInterface {
@Resource SessionContext ctx;
public void doSomething() {
EntityManager em = (EntityManager)ctx.lookup("OrderEM");
...
}
}
But it's actually easier to inject it (even if the EJB doesn't use it) than to look it up, especially for testability.
But to come back to your main question, I think that the entity manager that is injected or looked up is a wrapper that forwards to the underlying active entity manager that is bound to the transaction.
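A plain-Java sketch of that idea, under the stated belief that the injected object is a stateless wrapper. EntityManagerLike is a hypothetical, tiny stand-in for EntityManager, and a ThreadLocal stands in for "bound to the current transaction"; no real container is claimed to work exactly like this:

```java
// Hypothetical, tiny stand-in for javax.persistence.EntityManager.
interface EntityManagerLike {
    String describe();
}

class ActiveEntityManagers {
    // One underlying instance per thread, standing in for "per transaction".
    private static final ThreadLocal<EntityManagerLike> ACTIVE = new ThreadLocal<>();

    static void bind(String name) { ACTIVE.set(() -> name); }

    static void unbind() { ACTIVE.remove(); }

    static EntityManagerLike active() {
        EntityManagerLike em = ACTIVE.get();
        if (em == null) {
            throw new IllegalStateException("no active transaction");
        }
        return em;
    }
}

// The object handed out at injection time: it holds no state of its own.
class ForwardingEntityManager implements EntityManagerLike {
    @Override
    public String describe() {
        return ActiveEntityManagers.active().describe();
    }
}

class SomeHelper {
    private final EntityManagerLike em; // captured once, e.g. in @PostConstruct

    SomeHelper(EntityManagerLike em) { this.em = em; }

    String doSomethingWithEm() { return em.describe(); }
}
```

Under this assumption, a helper that keeps the wrapper reference still ends up talking to whichever underlying instance is active when each call is made.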
Hope it helps.
EDIT
Sections § 3.3 and § 5.6 of the spec cover the topic a bit.
I've been using helper methods and passed the EntityManager there, and it is perfectly OK.
So I'd recommend either passing it to methods whenever needed, or making the helper a bean itself, injecting it (using @EJB) and injecting the EntityManager there as well.
Well, personally, I wouldn't like to have to pass the Entity Manager to all my POJOs in my constructors or methods. Especially for non-trivial programs where the number of POJOs is large.
I would try to create POJOs/HelperClasses that deal with the Entities returned by the EntityManager, instead of using the entitymanager directly.
If not possible, I guess I'd create a New EJB Bean.