I started working with REST services recently. I have several tools joined into a framework of integrated tools. The tools communicate over a common component (CC) which handles their requests (using REST services) and acts as an interface between all the tools. For every POST request a new resource is created and stored in memory. Every time the CC goes down, all of that data is lost. To handle that case I created an Apache Derby database to store all the resources: with every resource creation an entry is written to the database, every time the CC starts up it fetches all the data from the database, and the data is regularly synced. The problem is that multiple tools can POST at almost the same time. How does REST handle these requests? I hoped they would be processed one after another in a queue-like way, but from what I see they are handled concurrently, each in its own thread, and my database goes down almost instantly. Am I on the right track, or could something else be wrong?
I have a giant monolithic system which has around a million entities. I want to sync data to a microservice so that it always has the same replica of the entities, with some of their fields, as the monolithic system. There are two ways to do so:
1. Write an API for the microservice and fetch the data through REST calls in batches.
2. Write an ETL service that connects directly to the database of the monolith and the database of the microservice to load the data.
The drawback of the first approach is that it would involve a large number of REST calls and would be slow, since I could have a million records. The second approach breaks the microservices principle (correct me if that is not actually the principle), because apart from the microservice, the ETL service would also be accessing its database.
Note: I only want to sync some fields from each record, not all of them. Say a record has 200 fields and my service only uses 3 of them; then I need all the records, but with only those 3 fields. The set of fields being used can also change dynamically: if after some time the service needs a 4th field instead of 3, I need to bring that 4th field into the database of my microservice as well.
So can anyone suggest which approach is better?
The first approach is better in terms of low coupling and high cohesion, since you have a clear interface (the REST API) between what you expose from the monolith and the data inside the monolith. In the long run, it makes both the microservice and the monolith easier to maintain.
But there's a third approach that's especially suitable for data synchronisation: asynchronous integration. Basically, your monolith would need to send out a stream of change data messages, e.g. to a message queue or something like Kafka. These messages are the interface, so you get the same low-coupling advantage as with the REST API. But you also get additional advantages:
You don't have the overhead of REST calls, just an asynchronous message listener.
If the monolith is down or responding slowly, your microservice is not affected.
There is a bootstrapping problem, however: do you need to retroactively generate events for everything that happened in the past, or can you start from some point in time and keep everything in sync from that point onwards?
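If you go the asynchronous route, here is a minimal sketch of what the producer side on the monolith could look like, assuming plain Kafka; the `entity-changes` topic name and the JSON payload are illustrative, not anything your monolith already has:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EntityChangePublisher {

    private final KafkaProducer<String, String> producer;

    public EntityChangePublisher(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    /** Publish a change event after the monolith commits a change to an entity. */
    public void publishChange(String entityId, String changeJson) {
        // Keying by entity id keeps all changes for one entity on the same partition,
        // so the microservice sees them in order.
        producer.send(new ProducerRecord<>("entity-changes", entityId, changeJson));
    }

    public void close() {
        producer.close();
    }
}
```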
What is your end goal here?
Is it to slowly migrate from the monolith to microservices by distributing traffic between the two systems,
or
to completely cut over to the new microservices on a given day?
If it's the second, I would do an ETL for the data migration.
If it's the first:
Implement CDC (or just add hooks in the monolithic service) to publish the persistence operations to a messaging system (Kafka, RabbitMQ).
Implement the subscriber on the microservice side and update its DB (a sketch of such a subscriber follows after this answer).
Once you are confident in the pub/sub implementation, redirect all reads to the microservice system.
Then slowly divert some percentage of the write calls to the microservices, which will do a REST call to the old system to keep the old DB updated.
Once you are confident in the new services, the data quality, and the other requirements (performance), completely cut over to the new microservices.
Note: you need to do a historic sync before starting the async messaging process.
This is one way to cut over smoothly from old systems.
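As a rough illustration of the subscriber step, here is a minimal sketch of a Kafka consumer on the microservice side, assuming the same illustrative `entity-changes` topic as in the other answer; the upsert method is only a placeholder:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EntityChangeSubscriber {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "microservice-sync");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("entity-changes"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                // Upsert only the fields the microservice actually uses.
                upsertIntoLocalDb(record.key(), record.value());
            }
        }
    }

    private static void upsertIntoLocalDb(String entityId, String changeJson) {
        // Placeholder: parse the JSON and write the relevant fields to the microservice DB.
    }
}
```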
Why do you want to synchronise data between the monolith and the microservice?
Are you rewriting the monolith as microservices? If that is the case, I would prefer using an ETL service for the data synchronisation, as that is more standardised for data synchronisation than REST calls.
In my Java/Spring application a database record is fetched at server init and stored in a static field. Currently we do an MBean refresh to refresh the database values across all instances. Is there any other way to programmatically refresh the database value across all instances of the server? I am reading about EntityManager refresh. Will that work across all instances? Any help would be greatly appreciated.
You could schedule a reload every 5 minutes, for example (a sketch of that option follows below).
Or you could send events and have all instances react to that event.
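A minimal sketch of the scheduled-reload option, assuming Spring's @Scheduled support (you need @EnableScheduling somewhere in your configuration) and an illustrative app_config table rather than your actual schema:

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Each instance refreshes its own cached copy on a fixed schedule.
@Component
public class ConfigCache {

    private final JdbcTemplate jdbcTemplate;
    private volatile String cachedValue;

    public ConfigCache(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Scheduled(fixedDelay = 5 * 60 * 1000) // reload every 5 minutes
    public void reload() {
        // Table and column names are illustrative.
        cachedValue = jdbcTemplate.queryForObject(
                "SELECT config_value FROM app_config WHERE config_key = ?",
                String.class, "my.setting");
    }

    public String getCachedValue() {
        return cachedValue;
    }
}
```

The rest of the application reads getCachedValue() instead of the static field, so no instance holds a stale copy for more than one refresh interval.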
Until now, communication between databases and servers has been one-sided, i.e. the app server requests data from the database. This generally results in the problem you mention: the application servers cannot know about a database change when the application is run in cluster mode.
The usual solution is to refresh the fields from time to time (a poll-based technique).
To make this a push-based model, we can create wrapper APIs over the database and let those wrapper APIs pass the change on to all the application servers.
By this I mean: do not update database values directly from one application server; instead, send the update request to another application which keeps track of your application servers and pushes an event (via an API call or a queue) to refresh the affected database table.
Luckily, some newer databases (like MongoDB) now provide this kind of update push to app servers out of the box.
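For the out-of-the-box case, here is a minimal sketch of a MongoDB change stream listener that each app server instance could run; the database and collection names are illustrative, the cache-refresh method is a placeholder, and change streams require MongoDB to run as a replica set:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import com.mongodb.client.model.changestream.FullDocument;
import org.bson.Document;

public class ConfigChangeListener {

    public static void main(String[] args) {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> config =
                client.getDatabase("app").getCollection("config");

        // Blocks and yields one event per insert/update/delete on the collection,
        // so every app server instance can refresh its cached copy immediately.
        for (ChangeStreamDocument<Document> change :
                config.watch().fullDocument(FullDocument.UPDATE_LOOKUP)) {
            refreshLocalCache(change.getFullDocument());
        }
    }

    private static void refreshLocalCache(Document updated) {
        // Placeholder: replace the in-memory copy the application reads from.
        System.out.println("config changed: " + updated);
    }
}
```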
I am using a 3rd party (address validation) web service in our application. The license for the WS includes only a certain number of calls; if we exceed that, we pay more. I am trying to keep track of the usage within our application, so we can warn the users before they exceed it. Sort of like a hit counter for web services.
I am currently using a Static variable in the (controller) class to track it. This works, but only until the server gets restarted, at which time it resets to 0 again.
Is there a way to keep the counter running across restarts? I saw some suggestions about serializing static variable. Is this the right approach? Or should I read/write to a file/DB table every time I make the request (sounds costly).
My web service client will be running in an old Sybase EAServer (built around Apache Tomcat), so I can only use Java 1.4.
Thanks for any comments or suggestions.
I would go with: on every WS call, update a counter in the database.
If you have many users accessing the WS client and all of them write to a simple text file, you are going to have trouble with concurrent access to the file.
If you use some in-memory approach and your app crashes, you are going to lose the count information.
So use a database.
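A minimal sketch of that idea in plain JDBC (no generics or try-with-resources, so it stays Java 1.4 compatible); the ws_usage table and the service key are assumptions, not part of your schema:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class WsCallCounter {

    private final DataSource dataSource;

    public WsCallCounter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Increment the persistent counter; call this once per web service request. */
    public void increment() throws SQLException {
        Connection con = dataSource.getConnection();
        try {
            PreparedStatement ps = con.prepareStatement(
                    "UPDATE ws_usage SET call_count = call_count + 1 WHERE service = ?");
            ps.setString(1, "address-validation");
            ps.executeUpdate();
            ps.close();
        } finally {
            con.close();
        }
    }

    /** Read the current count, e.g. to warn users approaching the license limit. */
    public int current() throws SQLException {
        Connection con = dataSource.getConnection();
        try {
            PreparedStatement ps = con.prepareStatement(
                    "SELECT call_count FROM ws_usage WHERE service = ?");
            ps.setString(1, "address-validation");
            ResultSet rs = ps.executeQuery();
            int count = rs.next() ? rs.getInt(1) : 0;
            rs.close();
            ps.close();
            return count;
        } finally {
            con.close();
        }
    }
}
```

One UPDATE per call is cheap compared to the remote web service call itself, and the count survives server restarts.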
We have to develop a local server which will load itself with real-time data from an industrial plant (particularly time-stamped data points like boiler temperatures, pressure values etc.) that are stored on an industrial server. We want to fetch that data and populate our server with it. The data is not streamed at the server end, so how do we fetch it continuously and populate our server?
We would like to store only the past 2-3 days of history data as time advances. Any recommendations about the server and the back-end process to be used to fetch the data are welcome; we don't have any idea where to start.
Please help.
As others have stated, you need to provide more information on how you intend to populate your server. What API do you have for the "real time server"?
I worked on a management system for solar energy devices
(i.e. devices that produce electricity from solar energy - photovoltaic cells, if I remember correctly).
In my case these devices had FTP access, which provided me with files containing time-based information.
I constructed a Java server that used the following technologies:
A. Apache Tomcat web container - this web container allowed me, on the one hand, to host the Java logic and, on the other hand, to expose an HTTP-based interface to the customer.
The Java logic was located in a servlet, which exposes methods to handle HTTP requests (and allows writing returned data using response objects).
B. The servlet has an init method; I used it to perform some initialization, such as starting a Quartz periodic task to probe the FTP servers of the devices (a sketch of this is at the end of this answer).
C. I used a database (PostgreSQL, an open source database) to store configuration for the application, and also to store the results.
D. I used another periodic task to archive old data into an archiving table, so that the main data table holds relatively new data.
I ran the archiving task once every few days; it simply checked for records that were too old, inserted them into the archiving table, and deleted them from the main data table. To do this efficiently, I decided to use a function that I coded in the database.
E. To access the database from the application, I used the Hibernate object-relational mapping technology.
This technology allowed me to define mappings between tables and their relations to Java objects, and gave me generated create, read (by id), update and delete SQL statements.
Using the HQL query language, I wrote some more complex queries.
F. For the presentation/client side, I used plain JSP.
You may choose other alternatives, such as GWT, Apache Wicket, or JSF.
You may also consider using an MVC framework to get some separation between the logic and the presentation. Such frameworks include Spring MVC, Struts, and many others.
To conclude, you must understand that Java offers you a variety of technologies; you must define your requirements well, and then start investigating which technology can meet your needs.
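As mentioned in point B, here is a rough sketch of a servlet that starts a Quartz job in its init method; it assumes Quartz 2.x, and the FTP-polling job body is only a placeholder:

```java
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class DataCollectorServlet extends HttpServlet {

    private Scheduler scheduler;

    public void init() throws ServletException {
        try {
            scheduler = StdSchedulerFactory.getDefaultScheduler();
            JobDetail job = JobBuilder.newJob(FtpPollJob.class)
                    .withIdentity("ftpPoll").build();
            Trigger trigger = TriggerBuilder.newTrigger()
                    .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                            .withIntervalInMinutes(5)
                            .repeatForever())
                    .startNow()
                    .build();
            scheduler.scheduleJob(job, trigger);
            scheduler.start();
        } catch (Exception e) {
            throw new ServletException("Could not start polling job", e);
        }
    }

    public void destroy() {
        try {
            if (scheduler != null) {
                scheduler.shutdown();
            }
        } catch (Exception e) {
            // log and ignore on shutdown
        }
    }

    /** Quartz job that polls the device FTP servers and stores new readings. */
    public static class FtpPollJob implements Job {
        public void execute(JobExecutionContext context) {
            // Placeholder: fetch new files over FTP, parse readings, insert into the DB.
        }
    }
}
```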
In one of our applications we need to call the Yahoo SOAP web service to get weather and other related info.
I used the wsdl2java tool from Axis 1.4, generated the required stubs, and wrote a client. I use JSP's useBean to include the client bean and call methods defined in the client, which in turn call the Yahoo web service.
Now the problem: when users make calls to the JSP, the response time of the web service differs greatly; for one user it took less than 10 seconds, while another on the same network took more than a minute.
I was just wondering if Axis 1.4 queues the requests even though the JSPs are multithreaded.
And finally, is there an efficient way of calling the web service (Yahoo weather)? Typically I get around 200 simultaneous requests from my users.
Why don't you schedule one thread to get the weather every minute or so, and expose that to the JSPs, instead of letting each JSP get its own weather report?
That's a lot more efficient for both you and Yahoo, and the JSPs only need to look up a local object (almost instantaneous) instead of connecting to a web service.
EDIT
Some new requirements in the comments of this answer suggest a different way of choosing solutions.
It seems that the web services are requested not only for weather, which doesn't change that often and is the same for every user, but also for other data, such as flight data.
The requirements for flight data retrieval are very different from those for weather data, so I think you should define a few types of (remote) data and choose a different solution for each category.
As a basis for the requirements I'd use something simple:
Users like their information promptly, they do not like waiting
The amount of data stored on the web server is finite
Remote web services have an EULA of sorts and are probably not happy with 200 concurrent requests of the same data by the same source (you)
Fast data access to users is best achieved by having the data locally, be it transient (kept in a bean) or persistent (a local database). That can be done by periodically requesting data from the remote source, and using the cached data in the JSP. That would also keep you in the clear with the third point.
A finite amount of data stored on the web server means that not everything can be cached. Data which differs per user, or large data sets which can vary over small periods of time, cannot readily be cached. It's not really a good idea to load data on all flights of all airports in the US every minute or so; that kind of request would be better served by running a specific web service query when necessary.
The trick is now to identify when caching data is feasible. If it is feasible, do that; otherwise run the web service query in the background. That can be done by presenting the JSP immediately and starting the web service query in the background. The JSP can have an AJAX script which asks your web server whether the data is ready, and inserts that data into the page when ready.
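A minimal sketch of the cached, periodically refreshed variant, with a hypothetical wrapper around the Axis-generated Yahoo stub (the stub call and the value object are placeholders, not the real generated classes):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WeatherCache {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private volatile WeatherReport latest;

    public void start() {
        // One background call per minute instead of one per JSP request.
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    latest = callYahooWeatherService();
                } catch (Exception e) {
                    // keep the previous report if the remote call fails
                }
            }
        }, 0, 1, TimeUnit.MINUTES);
    }

    /** JSPs read this local copy; no remote call on the request path. */
    public WeatherReport getLatest() {
        return latest;
    }

    public void stop() {
        scheduler.shutdown();
    }

    private WeatherReport callYahooWeatherService() throws Exception {
        // Placeholder for the call through the Axis-generated stub.
        return new WeatherReport();
    }

    /** Placeholder value object holding whatever the JSPs need to render. */
    public static class WeatherReport { }
}
```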
I'd use Google tools to monitor how long the call to the web service is taking.
There are several things going on here:
Map the Java beans to an XML request.
Send the XML request to the web service.
Unmarshal the XML request on the web service side.
The web service processes the request.
The web service marshals the XML response.
The web service sends the XML response to the Java client.
Unmarshal the XML response and display it on the client.
You can't see inside the Yahoo web service, but do break out what you can see on the client side to see where the time is spent.
Check memory as well. If Axis is generating .class files, maybe your perm gen space is being consumed. VisualVM is available with the JDK; attach it to your client's PID to see what's going on in memory on your app server.
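One simple way to break out the client-side timing is to wrap the stub call and measure it separately from the surrounding JSP work; the stub interface and request/response types below are stand-ins for the Axis-generated artifacts:

```java
public class TimedWeatherClient {

    private final WeatherPort stub; // stand-in for the Axis-generated stub

    public TimedWeatherClient(WeatherPort stub) {
        this.stub = stub;
    }

    public WeatherResponse fetch(WeatherRequest request) throws Exception {
        long start = System.currentTimeMillis();
        // Remote call, including marshalling/unmarshalling on the client side.
        WeatherResponse response = stub.getWeather(request);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Yahoo call took " + elapsed + " ms");
        return response;
    }

    // Placeholder types standing in for the generated Axis artifacts.
    public interface WeatherPort {
        WeatherResponse getWeather(WeatherRequest request) throws Exception;
    }
    public static class WeatherRequest { }
    public static class WeatherResponse { }
}
```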
Maybe this would be a good place for an AJAX call. This will be a good solution if you can get the weather in the background while users are doing other things.
I would recommend local caching and data pooling. Instead of sending out 200 separate requests for the same or similar locations, run a background thread which pulls the weather only for the locations your users are interested in and caches it locally; this cache updates every minute or so. When users request their personal preferences, the requests hit the cache, and the data is refetched only if the location is new or the cached entry is stale. This way the user gets a more seamless experience and you will not hit Yahoo's throttles and get denied service.
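A rough sketch of such a per-location cache with a staleness check; the fetch method is a placeholder for the actual web service call, and the one-minute threshold is just an example:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LocationWeatherCache {

    private static final long STALE_AFTER_MS = 60 * 1000; // refresh entries older than a minute

    private final ConcurrentMap<String, Entry> cache = new ConcurrentHashMap<String, Entry>();

    public String getWeather(String location) {
        Entry entry = cache.get(location);
        long now = System.currentTimeMillis();
        if (entry == null || now - entry.fetchedAt > STALE_AFTER_MS) {
            // Only new or stale locations hit the remote service.
            String fresh = fetchFromRemoteService(location);
            entry = new Entry(fresh, now);
            cache.put(location, entry);
        }
        return entry.report;
    }

    private String fetchFromRemoteService(String location) {
        // Placeholder for the call to the weather web service for one location.
        return "weather for " + location;
    }

    private static final class Entry {
        final String report;
        final long fetchedAt;
        Entry(String report, long fetchedAt) {
            this.report = report;
            this.fetchedAt = fetchedAt;
        }
    }
}
```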