architecture - should I combine many ad-hoc applications to a single application? - java

My main goal is to provide a search application written in jQuery and backed by Solr. (For those unfamiliar with Solr, just assume it's a REST API that returns search results.)
To that end I wrote many small applications and servlets, each of which does one ad-hoc task.
For example:
SearchApp - a jquery app in which an end user can perform searches.
SolrProxy - A java servlet that plays a proxy role between the SearchApp and solr. One of the things it does is logging the user request for later analysis.
StatsApp- a servlet that performs analysis of the user activity and returns a json with the data.
Indexer - a Java application that indexes data into Solr according to my requirements. In the process it also reads from a SQL Server DB and then runs some update commands against that DB.
IndexerServlet - an asynchronous servlet that uses Indexer so that indexing can be triggered by an HTTP request.
Nutch - an open source project that indexes data to solr for other requirements that are not accomplished in Indexer(3).
(MAYBE) - some service that will perform scheduled Nutch running.
And more components might be added.
It seems a bit wrong to have multiple Java projects that each do a single task, instead of one project that handles most of the components.
Any ideas and insights on this?
Should I combine all the Java apps into a single project? Should I use some kind of framework for this? Or should I leave it as it is now?

I don't think it's a bad idea that you have all these separate applications. They all seem to be doing one thing, and doing it well. What you can do, is expose them via a unified interface. So essentially you have a facade that sits in front of all these disparate services that presents an abstract and uniform interface. The consumers of this service will have no idea what sits behind that facade. This is just as well, because now you can discretely update and replace individual components without affecting others. If you had combined all of them into one, you would have to push a new release every time you modified one of the components.
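A minimal sketch of the facade idea, assuming hypothetical interfaces (`SearchService`, `StatsService`, `IndexService`) that stand in for the separate components from the question. None of these names come from the original projects; in practice each implementation would call the corresponding servlet or application.

```java
import java.util.List;

// Hypothetical interfaces standing in for the separate components.
interface SearchService { List<String> search(String query); }
interface StatsService { String usageReportJson(); }
interface IndexService { void reindex(); }

// The facade is the only type that consumers depend on.
final class SearchPlatformFacade {
    private final SearchService search;
    private final StatsService stats;
    private final IndexService index;

    SearchPlatformFacade(SearchService search, StatsService stats, IndexService index) {
        this.search = search;
        this.stats = stats;
        this.index = index;
    }

    List<String> search(String query) { return search.search(query); }
    String usageReport() { return stats.usageReportJson(); }
    void reindex() { index.reindex(); }
}

final class FacadeDemo {
    public static void main(String[] args) {
        // Stub implementations; any one of them can be swapped without touching callers.
        SearchPlatformFacade facade = new SearchPlatformFacade(
                q -> List.of("result for " + q),
                () -> "{\"searches\":0}",
                () -> System.out.println("reindex requested"));
        System.out.println(facade.search("solr"));
    }
}
```

Because callers only see `SearchPlatformFacade`, replacing one component means passing a different implementation to the constructor.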


Java user defined flows workflow engine

I am trying to provide the users a way to generate their own workflow as part of the system.
These workflows will be custom paths an order will take depending on customer requirements.
For example: If a customer requires us to sign a set of terms and conditions, the order should not be able to be approved unless a T&C document has been uploaded.
I have been looking at using bpmn-js for the frontend and execute the output BPMN2.0 file every time something changes that is related to the workflow (i.e. hooks on when documents are uploaded in this case) but it doesn't look like the users will be able to select actual system functionality with that library out-of-the-box. Should I try to extend that library or is there something else I could use instead?
I have been looking into using Camunda as well, but it would be nice to not expect the users to use a second application.
Design your application so that the business process is driven by the process engine. When a process is initiated, it starts a process instance. From there the process engine (embedded in your existing Java application or running as a standalone back-end service) determines whether business rules (DMN) need to be evaluated (are all required data and approvals present?), whether services need to be called (by invoking your Java code directly) or automated (external task pattern), or whether tasks need to be completed by technical workers (in any language) or by human process participants (based on the assignment determined by the engine).
If humans are involved, the UI/client queries the process engine for pending user tasks and updates those tasks once the humans have completed them.
The process engine determines the next steps by interpreting the BPMN 2.0 standard-based process model (no code is generated). Versioning for these models is provided out of the box. Newly started processes automatically run on the latest version, while running process instances continue their life cycle on the version they were started on (unless they are migrated).
It is usually not an issue if users need to access a dedicated modelling environment at design time. They just should not have to work with two applications at runtime. Anyway, it is also easily possible to integrate the modelling part into the same application via the BPMN-js library you mentioned.
The "selection of system functionality" is done by selecting the implementation type and setting the corresponding attributes. The bpmn-js library is generic in this regard. Have a look at how this is done in the Camunda Modeler (https://camunda.com/products/camunda-bpm/modeler/):
Java: https://docs.camunda.org/get-started/java-process-app/service-task/
Spring: https://docs.camunda.org/get-started/spring/service-task/
External: https://docs.camunda.org/manual/latest/user-guide/process-engine/external-tasks/
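The T&C rule from the question is the kind of condition a process engine would evaluate before letting an order move on. As a rough illustration of the rule itself, here is a plain-Java sketch; it is not Camunda or bpmn-js API, and the `Order` and `DocumentType` names are hypothetical.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical document types; only the T&C requirement comes from the question.
enum DocumentType { TERMS_AND_CONDITIONS, OTHER }

final class Order {
    private final Set<DocumentType> required;
    private final Set<DocumentType> uploaded = EnumSet.noneOf(DocumentType.class);
    private boolean approved;

    // The required set would come from the customer-specific workflow definition.
    Order(Set<DocumentType> required) { this.required = new HashSet<>(required); }

    void upload(DocumentType document) { uploaded.add(document); }

    // The gate: every required document must have been uploaded.
    boolean canApprove() { return uploaded.containsAll(required); }

    void approve() {
        if (!canApprove()) {
            throw new IllegalStateException("Required documents are missing");
        }
        approved = true;
    }

    boolean isApproved() { return approved; }
}
```

With a process engine, the same check would typically live in the process model (for example as a gateway condition or a DMN decision) rather than in the order class.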

What is a good practice to deploy webservices? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Is it good practice to deploy web services separately, or should they be part of the web application? For instance, I am developing a Spring REST-based web service. The function of this service is, let's say, to get user data.
Each web application that queries this web service has its user data in a different schema. So the web service needs to know who is calling it: is it Application A or Application B? If it's AppA, it should get data from Schema A; if it's AppB, from another schema. Note that AppA and AppB are the same code packed into two different WARs, and the schema they are supposed to query is supplied from a properties file.
In a situation like this, does it make sense to pack the web service with the webapp code and deploy it under different contexts, so it becomes a duplicate service running in a different context? Or should it be deployed separately, with AppA and AppB somehow identifying themselves to the web service?
I prefer the approach below, which is in use for 50K concurrent users.
Make sure that each web service encapsulates both UI and schema independently and executes the required business use case. Each web service will have all three layers (Model, View and Controller) for that business service. That means your App-A is one web service and App-B is another.
All web services register and unregister with a Master web service. The Master web service is responsible for redirecting user requests to the appropriate web service, such as App-A or App-B.
You should have a cluster of Master web services and a cluster of the individual web services, App-A and App-B.
In this approach, your schemas can reside on different databases instead of a single database.
Advantages of this approach:
Each web service can scale horizontally. Just add additional VM nodes if you want to increase the scale.
If you have different schemas on different databases in different locations, you avoid network performance bottlenecks in OLTP (online transaction processing) queries.
Disadvantages:
I see only one disadvantage: the Master web service acts as a facade, so it has to know the internals of the individual web services. Considering the advantages it offers, I think that trade-off is acceptable.
I have no idea about your business requirement to maintain different schemas for user data and going with webservice.
But instead of maintaining multiple WARs with the same code, I would suggest configuring multiple datasources within the application and switching between them as required.
This link may help you to configure multiple DS
If you follow the aforementioned logic, you may end up with a single deployable context.
If you still want to stick with multiple WARs as web services, I would suggest having a look at Spring Boot: simple, container-less, deployable and scalable.
It is a matter of opinion, both choices are okay. You should take into account the usage of the service, scaling concerns etc.
You could look at Microservices as an idea, but it has to make sense from your standpoint.
About the two different apps: if the differences are only in configuration, try externalizing it (23. Externalized Configuration). This way you can have a single artifact being deployed twice.
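To illustrate what externalizing the configuration could look like, here is a minimal plain-Java sketch using `java.util.Properties`. It is not Spring Boot's mechanism; the `AppConfig` class and the `db.schema` key are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class AppConfig {
    private final Properties props;

    AppConfig(Properties props) { this.props = props; }

    // Loads settings from a file that lives outside the WAR,
    // so the same artifact can be deployed with different settings.
    static AppConfig load(Path file) throws IOException {
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            p.load(in);
        }
        return new AppConfig(p);
    }

    // "db.schema" is a hypothetical key naming the schema this deployment queries.
    String schema() {
        String schema = props.getProperty("db.schema");
        if (schema == null) {
            throw new IllegalStateException("db.schema is not configured");
        }
        return schema;
    }
}
```

Each deployment would then point at its own file, for instance through a system property or environment variable chosen by the deployer, while the WAR itself stays identical.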
Given that scenario, it is good practice to have only one web service. This improves the maintainability of the system because you don't have the same code twice. If another similar app comes along in the future, you don't have to implement a new service.
Approach 1:- (Preferred)
You should have a single web application that contains the entire code for the application UI and the repo/data interaction.
Based on the type of request, dynamically switch the data source as needed. You can have a look at Spring Dynamic datasource routing here
Approach 2:-
In case your UI has completely different types of interaction managed by different teams, it makes sense to have separate UI components, with the backend web services maintained in one place.
Again based on the type of request you can dynamically route the datasource.
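The routing idea can be sketched in plain Java as a per-request lookup key plus a map from caller to schema. Spring's `AbstractRoutingDataSource` follows a similar pattern through its `determineCurrentLookupKey()` method; the `ClientContext` and `SchemaRouter` classes below are hypothetical and only show the mechanism.

```java
import java.util.Map;

// Holds the identifier of the calling application for the current request thread.
final class ClientContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String clientId) { CURRENT.set(clientId); }
    static String get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

// Resolves the schema (or data source name) for whoever is calling right now.
final class SchemaRouter {
    private final Map<String, String> schemaByClient;

    SchemaRouter(Map<String, String> schemaByClient) {
        this.schemaByClient = Map.copyOf(schemaByClient);
    }

    String currentSchema() {
        String client = ClientContext.get();
        // Immutable maps reject null keys, so check for a missing client first.
        String schema = client == null ? null : schemaByClient.get(client);
        if (schema == null) {
            throw new IllegalStateException("No schema mapped for client: " + client);
        }
        return schema;
    }
}
```

A request filter would call `ClientContext.set(...)` at the start of each request and `ClientContext.clear()` in a `finally` block, so that the identifier does not leak between pooled threads.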
Hope this helps :)
my inputs:
1) Are there any specific reasons to build two different WARs from the same code? Is it only because each of them has a different data source?
Why can't you deploy a single application, with a parameter in each request that identifies which schema to get the data from?
2) Why do you need a web service in the first place? Why not have the application connect directly to the database it needs?
3) Is the underlying database a transactional DB or historical data? How about merging both schemas into one as a one-time effort, or using some sort of virtualized view that picks data from the two schemas based on input parameters?
***** edited after Jay's inputs:
My suggestion is to deploy the web service separately from the two web apps, because that gives you a single place to manage the code in the long run. I have the following additional suggestions:
Define your own headers in the SOAP XML schema that carry both an appContext (the application making the call) and a userContext (the user). Give this aspect careful thought, with a long-term view.
Keep the SOAP request-response stateless, which will give you scalability. Don't maintain any state for a SOAP request on the server side.
In the past I have used a data virtualization solution (CISCO Composite). The benefit it provides: if there are two (or more) data sources containing similar types of data (entities), it can join, cleanse and merge them virtually and expose the result as a REST/SOAP-based web service. Try evaluating this option as well.
It can also help if, in the future, other consumers need to access your information using plain SQL/JDBC calls; they will be able to do so. Data virtualization solutions also support many other consumer interfaces, such as Hadoop and OData. Again, it depends on the budget and other constraints of the project. I am not sure whether an effective open source data virtualization solution is available.
Personally, in my experience, it's a lot better to have them separated, although it usually depends on how big and how critical your main project is.
But even if your project isn't that big at the beginning and only one person is working on it, it will keep growing. If you have microservices for everything your main project does, it will be a lot easier to maintain than having many people work on the same code and juggle many versions of a single project. Handling many small projects is less confusing, and errors are easier to find.
Plus, if something fails, you can have one microservice down while your main application keeps running without interruption. It will only be denied that one service, instead of everything being down while you fix it.
High availability is very important in production, and having them separated helps with this.
Given your situation, I'd advise going with ONE webapp (one "project"), with some caveats, and then considering one of these two solutions:
1) Given that you are using Spring, I'll assume (hope) you are using Maven as well.
Define a separate build goal for each target so that, depending on the goal invoked to produce the WAR, a different properties file is packaged.
This way you have ONE webapp, and based on the build (or rather on the properties file tied to that specific build) you obtain a WAR tied to a specific environment and schema. You deploy an individual WAR for each web service with a clean separation, although the root code is the very same and it is only one application. [CLEANER SOLUTION]
2) Make it so that you receive not only the JSON request but also the HTTPS certificate of the sender (so you identify a specific "webapp" by the certificate it presents), and based on the certificate AND the source of the request, you verify that the source is "qualified" to receive data from schema X rather than schema Y. You deploy ONE WAR only, which applies its own logic to route your "user data fetch query" to one database or the other. [I DISCOURAGE THIS PRACTICE]
Of course there are other approaches as well, but I think these two are the most feasible.
It really depends on what you want to achieve.
If you want to encapsulate the database/schema/table, then it should really be one service for each application. The main advantage of doing this is that you could swap the database later on if there is some problem with the current one, it also simplifies caching and invalidation, etc etc.
If the database/schema/table is not encapsulated anyway, then a single service is much easier and better. Each web application just has to identify itself, and each of them will get exactly what it needs. This could be achieved by putting the query/schema information in a properties file, or by creating DB views with the same name as the client, etc.
If we were to go for this approach, a question will pop up. Why bother having this layer at all? Couldn't each web application just query the db directly? If the answer is yes, then just remove the whole layer completely.
You are trying to implement a Data Provider, or DAO as a service.
To make it -
Simple
Scalable
Maintenance-friendly
Adaptable
You can simply have a single web service, deployed outside the webapp(s) and driven off configuration. The configuration itself can be stored in a properties file or in a DB. The identifier for the client should be passed in the web service request.
This is actually a pretty standard approach, used to enable optimizations at the data tier outside of the DB, like caching (again driven off configuration), expiry, pooling, etc.
The other option, including it as a shared jar within the webapp, does have the advantage of code reuse (which you also get with an externally deployed service), but the following disadvantages outweigh it:
Coupling
Optimizations are difficult to apply
Release management (this also depends upon how your code is organized)
Versioning.
Hope it helps.
I would deploy to one instance, no matter what. Of course, there are circumstances where it may be necessary to deploy separately. As a coding best practice, one instance should be used to allow for "write once, use many".
Then...
Define different XSDs for each of AppA, AppB, etc., and marshal accordingly.
Or use Groovy to marshal the appropriate objects as JSON or XML.

Usecase for Workflow Engine

We have an issue where a database table has to be updated with the status of a particular entity. Presently, it's all Java code with a lot of if conditions and an update to the status. I was thinking along the lines of using a workflow engine, since there can be multiple flows in the future. Is it overkill to use a workflow engine here? Where do you draw the line?
It depends on the complexity of your use case.
In a simple use case, we have a database column updated by multiple consumers for each stage in an Order lifecycle. This is done by a web service calling into the database.
The simple lifecycle goes from ACKNOWLEDGED > ACCEPTED/REJECTED > FULFILLED > CLOSED. All of these are in the same table on the same column. This is executed in java classes with no workflow.
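A lifecycle this small can be captured as an enum with an explicit transition table, with no workflow engine involved. The sketch below is illustrative only; treating REJECTED as a terminal state is an assumption, since the description above does not say what follows a rejection.

```java
import java.util.EnumSet;
import java.util.Set;

enum OrderStatus {
    ACKNOWLEDGED, ACCEPTED, REJECTED, FULFILLED, CLOSED;

    // Allowed next states for the current state.
    Set<OrderStatus> next() {
        switch (this) {
            case ACKNOWLEDGED: return EnumSet.of(ACCEPTED, REJECTED);
            case ACCEPTED:     return EnumSet.of(FULFILLED);
            case FULFILLED:    return EnumSet.of(CLOSED);
            default:           return EnumSet.noneOf(OrderStatus.class); // REJECTED (assumed) and CLOSED are terminal
        }
    }

    boolean canMoveTo(OrderStatus target) { return next().contains(target); }
}
```

The web service would check `canMoveTo(...)` before writing the new value to the status column and reject the update otherwise.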
A workflow engine is suited in a more complex use case which involves actions on multiple data providers eg: database or Content Mgmt or Document Mgmt or search engine, multiple parallel processes, forking based on the success/failure of a previous step, sending an email at a certain step, offline error alerting.
You can look at Apache ODE to implement this.
We have an issue where a database table has to be updated with the status of a particular entity. Presently, it's all Java code with a lot of if conditions and an update to the status.
Sounds like a simple, localized update, with no need to orchestrate actions among workflow participants.
Maybe a rule engine is better suited for this. Drools could be a good candidate. When X then Y.
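To show the "when X then Y" shape without committing to a rule engine, here is a plain-Java sketch. It is not Drools syntax or API; `Shipment`, `StatusRules` and the status names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical entity; the fields are illustrative only.
final class Shipment {
    final boolean paid;
    final boolean delivered;

    Shipment(boolean paid, boolean delivered) {
        this.paid = paid;
        this.delivered = delivered;
    }
}

final class StatusRules {
    // One rule: when the condition holds, the entity gets this status.
    private static final class Rule {
        final Predicate<Shipment> when;
        final String then;

        Rule(Predicate<Shipment> when, String then) {
            this.when = when;
            this.then = then;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    StatusRules when(Predicate<Shipment> condition, String status) {
        rules.add(new Rule(condition, status));
        return this;
    }

    // The first matching rule wins; the fallback applies when nothing matches.
    String statusFor(Shipment shipment, String fallback) {
        for (Rule rule : rules) {
            if (rule.when.test(shipment)) {
                return rule.then;
            }
        }
        return fallback;
    }
}
```

Rules are then declared in one place, for example `new StatusRules().when(s -> s.paid && s.delivered, "COMPLETED").when(s -> s.paid, "PAID")`, instead of being scattered across nested if conditions.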
If you're using Spring, this is a good article on how to implement your requirement
http://www.javaworld.com/javaworld/jw-04-2005/jw-0411-spring.html
I think you should consider a workflow engine. Workflow should be separated from application logic.
Reasons:
Maintainable: Easier to modify, add new flows and even easier to replace by another workflow engine.
Business process management: Workflows are mostly software representations of BPM, so they are usually designed by process designers (non-technical people). It is therefore not a good idea to code them inside the application. Instead, BPM products such as ALBPM or jBPM, which support graphical workflow design, should be used.
Monitoring business flows: They are often monitored by the Top level managers and used to make strategic decisions.
Easier for Data mining/Reports/Statistics.
ALBPM (now Oracle BPM) is a commercial tool from Oracle, suitable for large-scope projects.
My recommendation is jBPM, an open source tool from JBoss. Unlike ALBPM, which requires a separate DB and application server, it can be packaged with your application and run as another module inside it. I think it is suitable for your project.

Client side caching in GWT

We have a GWT client which receives quite a lot of data from our servers. Logically, I want to cache the data on the client side, sparing the server from unnecessary requests.
As of today I have left it up to my models to handle the caching of data, which doesn't scale very well. It has also become a problem because different developers in our team develop their own "caching" functionality, which floods the project with duplication.
I'm thinking about how one could implement a "single point of entry", that handles all the caching, leaving the models clueless about how the caching is handled.
Does anyone have any experience with client side caching in GWT? Is there a standard approach that can be implemented?
I suggest you look into gwt-presenter and the CachingDispatchAsync . It provides a single point of entry for executing remote commands and therefore a perfect opportunity for caching.
A recent blog post outlines a possible approach.
You might want to take a look at the Command Pattern; Ray Ryan held a talk at Google IO about best practices in GWT, here is a transcript: http://extgwt-mvp4g-gae.blogspot.com/2009/10/gwt-app-architecture-best-practices.html
He proposes the use of the Command Pattern using Action and Response/Result objects which are thrown in and out the service proxy. These are excellent objects to encapsulate any caching that you want to perform on the client.
Here's an excerpt: "I've got a nice unit of currency for implementing caching policies. May be whenever I see the same GET request twice, I'll cache away the response I got last time and just return that to myself immediately. Not bother with a server-side trip."
In a fairly large project, I took another direction. I developed a DtoCache object which essentially held a reference to each AsyncCallback that was expecting a response from a service call in a waiting queue. Once the DtoCache received the objects from the server, they were cached inside the DtoCache. The cached result was henceforth returned to all queued and newly created AsyncCallbacks for the same service call.
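The queue-and-cache behaviour described above could look roughly like the following plain-Java sketch. It uses `java.util.function` types instead of GWT's `AsyncCallback`, omits failure handling, and assumes single-threaded use as in a browser; the `loader` parameter is a hypothetical stand-in for the real service call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

final class DtoCache<K, V> {
    private final Map<K, V> cached = new HashMap<>();
    private final Map<K, List<Consumer<V>>> waiting = new HashMap<>();
    // Performs the actual asynchronous service call and reports the result.
    private final BiConsumer<K, Consumer<V>> loader;

    DtoCache(BiConsumer<K, Consumer<V>> loader) { this.loader = loader; }

    void get(K key, Consumer<V> callback) {
        if (cached.containsKey(key)) {
            callback.accept(cached.get(key)); // served from cache, no server trip
            return;
        }
        List<Consumer<V>> queue = waiting.get(key);
        if (queue != null) {
            queue.add(callback); // a request is already in flight, just wait for it
            return;
        }
        queue = new ArrayList<>();
        queue.add(callback);
        waiting.put(key, queue);
        loader.accept(key, value -> {
            cached.put(key, value);
            // Answer everyone who asked while the request was pending.
            for (Consumer<V> waiter : waiting.remove(key)) {
                waiter.accept(value);
            }
        });
    }
}
```

Two callers asking for the same key while the first request is pending result in a single service call, and later callers are answered straight from the cache.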
For an already-fully-built, very sophisticated caching engine for CRUD operations, consider Smart GWT. This example demonstrates the ability to do client-side operations adaptively (when the cache allows it) while still supporting paging for large datasets:
http://www.smartclient.com/smartgwt/showcase/#grid_adaptive_filter_featured_category
This behavior is exposed via the ResultSet class if you need to put your own widgets on top of it:
http://www.smartclient.com/smartgwtee/javadoc/com/smartgwt/client/data/ResultSet.html
There are two levels of caching:
Caching during one browser session.
Caching cross browser sessions, e.g the cached data should be available after browser restarted.
What to cache: depend on your application, you may want to cache
Protected data for particular user
Public static (or semi-static, e.g rarely to change) data
How to cache:
For the first caching level, we can use GWT code as suggested in the other answers, or write our own.
For the second one, we must use the browser's caching features. The standard approach is to put your data inside HTML (whether static HTML files or pages generated dynamically by a JSP/servlet, for example). Your application then uses the techniques described at http://code.google.com/webtoolkit/doc/latest/DevGuideCodingBasicsOverlay.html to get the data.
I thought Itemscript was kind of neat. It's a RESTful JSON database that works on both the client (GWT) and server.
Check it out!
-JP

Architecture - Multiple web apps operating on the same data

I'm asking for a suitable architecture for the following Java web application:
The goal is to build several web applications which all operate on the same data. Suppose a banking system in which account data can be accessed by different web applications; it can be accessed by customers (online banking), by service personnel (mostly read) and by the account administration department (admin tool). These applications run as separate web applications on different machines, but they use the same data and a set of common data manipulation and search queries.
A possible approach is to build a core application which fits the common needs of the clients, namely data storage, manipulation and search facilities. The clients can then call this core application to fulfil their requests. The requirement is that the applications are built on top of a Wicket/Spring/Hibernate stack as WARs.
To get a picture, here are some of the possible approaches we thought of:
A The monolithic approach. Build one huge web application that fits all needs (this is not really an option)
B The API approach. Build a core database access API (JAR) for data access/manipulation. Each web application is built as a separate WAR which uses the API to access a database. There is no separate core application.
C RMI approach. The core application runs as a standalone application (possibly a WAR) and offers services via RMI (or HttpInvoker).
D WS approach. Just like C but replace RMI with Web Services
E OSGi approach. Build all the components as OSGi modules which run in an OSGi container. Possibly use SpringSource dm Server or ModuleFusion. This approach was not an option for us for some reasons ...
I hope I could make the problem clear. We are currently going with option B, but I'm not very confident about it. What are your opinions? Any other solutions? What are the drawbacks of each solution?
I think you have to go in the opposite direction: from the bottom up. Of course, you have to go back and forth to verify that everything fits together, but here is the general direction:
Think about your data: the DB schema, how important transactions are (in banking systems, for example, everything is about transactions), etc.
Then define a common access method, anything from a set of stored procedures to a distributed transaction engine...
The next step is the business logic/presentation: what can be generalized and what is subject to customization.
And the final stage is the interfaces, visualisation and reports.
B, C, and D are all just different ways to accomplish the same thing.
My first thought would be to simply have all consumer code connecting to a common database. This is certainly doable, and would eliminate the code you don't want to place in the middle. The drawback, of course, is that if the schema changes, all consumers need to be updated.
Another solution you may want to consider is giving each consumer its own database, using some sort of replication to keep them in sync.
It looks like A and E are out of the picture as you have stated in your question for various reasons. Option A would be one huge application which would make maintenance difficult in the future.
B, C and D are essentially the same architecturally since they involve remote access to common libraries from the various web applications, the only difference is the transport mechanism. I would recommend implementing this in EJB 3 or Spring if possible instead of with your own RMI libraries since either of these provide a good framework over RMI / Webservices.
So I think this problem basically boils down to the following two options:
1) Include the business and DAO layer classes as a common jar included in the deployment of all web applications.
Advantages:
Deployment is easier.
Applications will perform better initially since there is no remote access to other servers.
Disadvantages:
You cannot add more hardware to the middle tier specifically (service and DAO layers) since it is included in each web application.
Other business teams in the organisation will not have access to your business services since there is no remote interface.
2) Deploy the business service and DAO layer classes in a separate application server and expose business methods remotely.
Advantages:
You can scale up the business service and DAO layer as needed depending on load from the various web applications calling it.
Other applications in the organisation can make use of your interfaces if needed.
More scalable
You get all the advantages of Java EE.
Disadvantages:
More complex deployment.
Another server to maintain and monitor.
Could be slower since calls will be made over the network although this shouldn't be too much of a problem.
In both cases if the interfaces change the client code will need to change so this isn't a factor in the decision. Transactions should be handled on the business service method level so this shouldn't be a factor either.
I think it depends on the size of the applications as well and how scalable the solution needs to be to warrant the extra complexity of option 2 above.
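Whichever option is chosen, the web applications can be written against a service interface so that the packaging decision stays reversible. The sketch below assumes hypothetical names (`AccountService`, `InProcessAccountService`, `OnlineBankingPage`); a remote variant for option 2 would implement the same interface and forward the call over RMI or a web service.

```java
import java.math.BigDecimal;
import java.util.Map;

// The contract shared by all web applications.
interface AccountService {
    BigDecimal balance(String accountId);
}

// Option 1: business/DAO code packaged as a jar and called in-process.
final class InProcessAccountService implements AccountService {
    private final Map<String, BigDecimal> balances; // stands in for the real DAO layer

    InProcessAccountService(Map<String, BigDecimal> balances) {
        this.balances = Map.copyOf(balances);
    }

    @Override
    public BigDecimal balance(String accountId) {
        BigDecimal balance = balances.get(accountId);
        if (balance == null) {
            throw new IllegalArgumentException("Unknown account: " + accountId);
        }
        return balance;
    }
}

// A web-tier class that does not know whether the service is local or remote.
final class OnlineBankingPage {
    private final AccountService accounts;

    OnlineBankingPage(AccountService accounts) { this.accounts = accounts; }

    String balanceLine(String accountId) {
        return accountId + ": " + accounts.balance(accountId);
    }
}
```

Moving from option 1 to option 2 then means changing which `AccountService` implementation is wired in, not the code in the web tier.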
I think you need to have a separate application that all the client applications will use as their data layer. The reason for this is that you want to ensure they're all accessing the database in the same way. There are also some race conditions you can get into that database transactions may not be able to prevent. The other reason is that using the database as a form of RPC is a known antipattern. If all your apps access the database directly, you will almost inevitably end up with some "event" table that the various applications poll periodically... don't do that.
Apart from the provided responses, if you are considering having multiple applications working with the database at the same time, consider a distributed cache as part of your solution, as well. The beauty of the distributed cache is that it can be accessed by multiple applications at the same time, apart from being distributed. I am not sure if this holds true for all of the Java variations, such as Ehcache, etc, as I do not come from a Java background.
What we are currently doing is abstracting the data a level further than before. We now have a DAL that can be accessed directly, but we have put a "Model Factory" in front of the DAL. The purpose of the Model Factory is to broker both the cache and the data layer, acting as a passthrough. So, the caller always calls the Model Factory and not the DAL or caching code directly. This abstraction layer will basically retrieve data from the DAL on a cache miss without adding the complexity to the API.
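A minimal sketch of such a pass-through, assuming a hypothetical `ModelFactory` class in which an in-process map stands in for the distributed cache and a `Function` stands in for the DAL lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class ModelFactory<K, V> {
    // Stand-in for the distributed cache; a real one would be a client to a cache cluster.
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Stand-in for the DAL call that loads a model by key.
    private final Function<K, V> dal;

    ModelFactory(Function<K, V> dal) { this.dal = dal; }

    // Callers always come here; the DAL is only consulted on a cache miss.
    V get(K key) {
        return cache.computeIfAbsent(key, dal);
    }

    // Drop an entry after the underlying data has changed.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Callers never see whether a value came from the cache or from the DAL, which keeps the caching policy in one place.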
