Global State in Java/Spring - java

I have a basic Java/Spring MVC CRUD application in production on my company's intranet. I am still a beginner; this application is what I've used to learn Java and web applications. Basically, it has a table that uses AJAX to refresh its data at regular intervals, and an HTML form whose input is written to the database. The refresh is important because the data is viewed on multiple computers that need to see the input from the others.
The problem is that, due to network issues outside of my control, the database transactions on certain computers can be very slow.
I have been playing around with React/Redux JavaScript client applications in the past few weeks and the concept of state. Now, as best I can tell, global state or variables are pretty reviled by the Java community. Bugs, difficulty in testing, etc.
But Redux gave me an idea that, when a user hits "submit" instead of inserting a row into SQL, it stores that object in memory on the server. Then at regular intervals that memory is inserted into the database - so the user does not have to wait for database transactions, only communication with the server. Table refreshes don't look at the database - they look at this memory.
But, again as a beginner, I don't see people do this. Why is it a bad idea?

In general, it isn't done, for two reasons:
First, the state is not guaranteed, because it has not actually been written.
If you restart the application before the data is flushed to the database, it is silently dropped. That is generally a bad thing, although how bad depends on your situation. If you don't care much about losing a few rows, it might be OK. You could remedy this by persisting the pending data somewhere locally.
Second, the state is not guaranteed because the later write may fail, for example by violating some database constraint.
So, in general, it is frowned upon because you are lying to the client: you say you wrote the data, but nothing ensures that the write actually happened.
Then again, if the data is less important, it might be OK.
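To make the trade-off concrete, here is a minimal sketch of such a write-behind buffer. The class name WriteBehindBuffer and the batchWriter callback are assumptions made for illustration; the callback stands in for whatever code performs the real database insert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write-behind buffer: accepts rows in memory, flushes them in batches. */
final class WriteBehindBuffer<T> {
    private final ConcurrentLinkedQueue<T> pending = new ConcurrentLinkedQueue<>();
    private final Consumer<List<T>> batchWriter; // assumed to perform the real database insert

    WriteBehindBuffer(Consumer<List<T>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    /** Returns immediately; the row is NOT durable until a later flush succeeds. */
    void submit(T row) {
        pending.add(row);
    }

    /** Rows accepted but not yet written; these are lost if the JVM stops now. */
    int pendingCount() {
        return pending.size();
    }

    /** Drains the queue and writes one batch; on failure the batch is put back in the queue. */
    synchronized int flush() {
        List<T> batch = new ArrayList<>();
        for (T row; (row = pending.poll()) != null; ) {
            batch.add(row);
        }
        if (batch.isEmpty()) {
            return 0;
        }
        try {
            batchWriter.accept(batch);
            return batch.size();
        } catch (RuntimeException e) {
            pending.addAll(batch); // the client was already told "ok" for these rows
            return 0;
        }
    }

    /** Schedules flush() at a fixed delay on a single background thread. */
    ScheduledExecutorService startPeriodicFlush(long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Both problems described above are visible here: anything counted by pendingCount() disappears on a restart, and a batch that violates a constraint will keep failing on every flush unless you add handling for it.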

Related

Logging Features/Functions Used In an Application

Here's the scenario: I am trying to keep myself and fellow employees from wasting time on programming fixes for features that are never used by users. I work on an application that has been around for 10 years and has a lot of features that may never be used by customers, even though at some point they were. I find myself going down rabbit holes fixing problems that were never reported because QA and customers never encountered them. I don't want to leave in code that has significant problems, but I also can't just remove it because parts of that class may be used elsewhere.
I proposed some sort of a logging architecture that keeps track of every function call or Object creation within the application while a user uses the application. This way we can ask customers for the logging file stored somewhere in a folder we create. Once we obtain all the logs from all our users after a certain amount of time (1-2 months or so) we can analyze the logs and see exactly which features and functions are being used. I believe this data will be super useful in discovering how our customers are using our application. It can also point to features in our application that simply never get used and which features are used the most.
What would be a good solution/architecture or design pattern to implement this logging feature in an existing large code base? Or is this simply a bad idea altogether?
My initial thought would be something like this but with an extremely large application this would take a while. Plus it doesn't look great to have something like this in every single method but perhaps it's the only way?
private void createData()
{
    write.toLog("ClassName.createData");
    // the magic of the createData function
}
Aspect Oriented Programming is a way to implement the idea.
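An AOP framework such as AspectJ would weave this kind of call recording into existing classes. As a rough, dependency-free illustration of the same interception idea (not AspectJ itself), the sketch below uses the JDK's java.lang.reflect.Proxy, which only works for calls made through an interface. CallRecorder and ReportService are hypothetical names invented for this example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical service interface, standing in for a real one in the application. */
interface ReportService {
    String createData();
}

/** Hypothetical helper: wraps an interface implementation and counts every method call. */
final class CallRecorder {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Returns a proxy that records "Interface.method" and then delegates to the target. */
    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                CallRecorder.class.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    String key = iface.getSimpleName() + "." + method.getName();
                    counts.computeIfAbsent(key, k -> new LongAdder()).increment();
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the target's own exception
                    }
                });
    }

    /** Number of recorded calls for a key such as "ReportService.createData". */
    long count(String key) {
        LongAdder adder = counts.get(key);
        return adder == null ? 0 : adder.sum();
    }
}
```

Usage would look like: ReportService real = () -> "data"; ReportService tracked = recorder.wrap(ReportService.class, real); after which every call on tracked is counted without touching the method bodies.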
You are not the first one to have such an idea. Analytics are common on web pages and in web apps, probably driven mainly by business considerations.
There are also ideas around in the DevOps community. I think there is a good chance that there is already an implementation of what you want. This podcast episode is about the topic. It is not in English, but the show notes on the linked page contain some links.
There might be privacy related questions, especially if your software is not used in only a single company.
I think it is a bad idea to log every method call for a large system. Some methods will be called a lot of times and the log files will be overwhelmed with method call log statements.
Every time I have seen every method call logged in a large system that logging was at a DEBUG level or finer and could be selectively turned on if needed (and rarely ever was).
It's better to be more strategic with this kind of logging and at a coarser level, such as logging high level requests, user initiated activities, job initiation, and so forth.

WebSocket data consistency vs. latency

Let's assume that I have a JavaScript front-end (Angular.js, for example), a Java-based back-end (Spring, running on Tomcat, for example) and a database management system (SAP HANA In-Memory, in my case). For example, I have graphs that can change relatively quickly.
I am wondering what an efficient and fast architecture could look like. Do you usually send a whole collection of objects to the UI or do you just send deltas?
In my case, data consistency on the UI is very important in order for the application to work properly, but low-latency as well, especially when it comes to data merges.
When it comes to consistency, I often tend to do a SELECT from the database on an insert and read the whole object collection again, but my concerns are that this does not scale.
Is there a generic approach to that problem or even existing frameworks?
Edit:
Currently, it is around 300 objects with a couple of integer attributes and cross-references that can change and rearrange in a millisecond time, but could go up to 10000 in the future. My challenge here is the communication between front-end and back-end, so the front-end always has a consistent data set in real-time.
How close is the client to the server? Is it a mile/km away or hundreds/thousands of miles away? Is the client on the internet or is it on a high-performance VPN? Are you close to the backbone or dozens of hops away? You're not normally going to consistently get 1 millisecond latency on the web if you're trusting the general internet.
If you are on an internal company network and the client is physically close to the server, e.g., same machine, same local network, you can get single digit ms latency with WebSocket (I personally have gotten 3-4 ms across internal data centers at a big investment bank).
Don't optimize too early. That's usually a bad thing.
That said, with any high-performance UI, it's always good to send just the deltas.
You may want to consider some sort of event mechanism to reduce your polling the data source. Then you would only update the data when it actually changed.
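As a minimal sketch of what "sending just the deltas" could mean, assume the server keeps the last snapshot it sent as a map of object id to value. The SnapshotDelta class and the integer id/value model are assumptions for illustration; the real objects would have more attributes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical delta between two snapshots of id -> value; only this is sent to the UI. */
final class SnapshotDelta {
    final Map<Integer, Integer> upserts = new HashMap<>(); // new or changed entries
    final Set<Integer> removals = new HashSet<>();         // ids that disappeared

    /** Computes what changed between the previously sent snapshot and the current one. */
    static SnapshotDelta between(Map<Integer, Integer> before, Map<Integer, Integer> after) {
        SnapshotDelta delta = new SnapshotDelta();
        after.forEach((id, value) -> {
            if (!value.equals(before.get(id))) { // values are assumed to be non-null
                delta.upserts.put(id, value);
            }
        });
        for (Integer id : before.keySet()) {
            if (!after.containsKey(id)) {
                delta.removals.add(id);
            }
        }
        return delta;
    }

    /** Applies the delta to a client-side copy, bringing it in line with the server state. */
    void applyTo(Map<Integer, Integer> clientCopy) {
        clientCopy.putAll(upserts);
        clientCopy.keySet().removeAll(removals);
    }

    /** True when nothing changed, in which case no message needs to be sent at all. */
    boolean isEmpty() {
        return upserts.isEmpty() && removals.isEmpty();
    }
}
```

Consistency then depends on the client applying every delta in order; a common safeguard is to fall back to a full snapshot whenever a message is missed.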

iOS remote MySQL database, technology recommendation

There is a journalism-related web application that uses MySQL databases and presents a web-based interface to users.
I want to build an iOS app that provides a mobile interface as well. The UI is pretty easy and I have experience with that.
The problem is with the database, which I have no experience with.
I will be learning about databases and probably take the Coursera course on it. I am not asking you to teach me that. I just wanna know which technologies I should invest my time in over the next couple months.
My understanding so far is that the app should not talk to the database directly,
but rather there should be something on the server talking to the database on behalf of the app.
This is the question and the part I want to understand clearly, so correct me if I am wrong.
Will I have to write some sort of Unix program that runs on the server, talks to the DB and then communicates back to the app? How? Using a web view? Using Unix sockets to talk to the app? SSH? Which one is cool with Apple?
My preference for writing something like that on the server would be: Python (have experience), Java (have experience), and maybe Ruby (no experience). I'd prefer to avoid scripting languages.
Are they OK? Which one is best suited? Also, does this middle dude have to be on the same server as the database, or can it be another machine on the internet? (I'd prefer this, so I can put it on my own VPS and not have to mess with the server machine.)
This is similar to another question from tonight, but you're coming at it from a different angle.
In general terms, an iOS application that needs to be able to run in offline mode will need to have its own database. This means creating Core Data models to store all of the data required by the application. Internally this is stored in a SQLite database.
If you want to make an application that's online-only, it's somewhat easier since you won't need to worry about the Core Data part and can instead focus on building your service API. If you're familiar with Python then your best bet is Django to provide that layer. You'll need to implement a number of endpoints that can receive requests, translate that into the appropriate database calls, then render the result in a machine readable format.
Scripting languages are what power most back-ends even for massive scale systems. In most cases the database will be the bottleneck and not the language used to interface with it. Even Twitter stuck with Ruby until they hit tens of millions of active users, so unless you're at that level, don't worry about it.
For most applications, using HTTP as your transport mechanism and JSON as your encoding method is the way to go. It's very simple to construct, easy to consume, and fairly easy to read. There are probably a number of ways you might go about reading and writing this, but that's another question.
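To show how small the HTTP-plus-JSON layer can be, here is a minimal sketch in Java (one of the languages the question lists; the Django route suggested above would look different). It uses the HTTP server bundled with the JDK. The ArticleApi class, the /articles/latest path and the placeholder article are assumptions for illustration; a real service would run a database query where the placeholder is.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical read-only endpoint: GET /articles/latest returns one article as JSON. */
final class ArticleApi {

    /** Quotes a string for JSON, escaping the characters that must not appear raw. */
    static String jsonString(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.append('"').toString();
    }

    /** Builds the JSON document the iOS app would parse. */
    static String articleJson(int id, String title) {
        return "{\"id\":" + id + ",\"title\":" + jsonString(title) + "}";
    }

    /** Starts the server; the hard-coded article stands in for a database lookup. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/articles/latest", exchange -> {
            byte[] body = articleJson(1, "Placeholder headline").getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

In practice you would use a JSON library rather than hand-built strings; the point is only that the app talks plain HTTP to this layer and never sees the database.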
For small-scale applications where the number of users is measured in the hundreds then you can host the application and database on the same server. Even a modest VPS with 512MB of memory might do the job, though for heavier loads you might want to invest in a 1GB instance. It really depends on how often people are accessing your application and what the peak loads are like.

Complex data-driven web application in Java - Decision on technologies

Dear Stack Overflow Community,
I am a Java programmer facing the task of building a complex, data-driven web application (SaaS), and I'm searching for technologies to use. I have already made some decisions and I believe I'm proficient enough to build the application using just the technologies I have decided on (I'm definitely not saying it would be perfect, just that it would work). However, I'd like to make my task easier, and that's why I need your help.
Brief description of the project
Back-end
The application will be heavily data-driven, meaning that everything will be stored in a self-describing database. This means the database itself will be entirely described with metadata and the application will not know what data it reads and writes. There probably won't be any regular entities (in terms of JPA @Entity) because the application won't know the structure of the data; it will obtain it from the metadata. Only the metadata will have a pre-determined structure. To put it simply, the metadata is the alpha and omega of the application because it will tell the application WHEN and WHAT to display and HOW to display it.
The application will probably utilize stored procedures to perform some low-level tasks on the data, such as automatic auditing, logging and translating into the user's language, which most likely rules out ORM frameworks because there won't be just simple CRUD operations. Therefore, JDBC seems like my only option (doesn't it?).
Front-end
The UI will be "dumb" in terms that it will not know what data it is displaying (to some extent, of course). It will just know how to display it based on the metadata which it will obtain from the database. All UI controls (like menu items, buttons, etc) will be created based on current application's state and the UI will NOT know what the controls do. This means that clicking a menu item or a button will just send an identifier of associated action to the back-end and the server will decide what to do.
My goals
My main goal is to have the application as lightweight as possible, with as few dependencies as possible. Because the application will be very complex, I'd like to avoid any heavy framework(s) because there is a very high probability that I'd need to customize a lot of its functionality.
What I have already decided for
Please object to the following decisions only if you think they're absolutely non-viable for my application, as I have already implemented some core functionality using these technologies:
Servlets on Tomcat, Guice DI, AOP (AspectJ)
I believe all of these technologies are lightweight enough and I don't need to learn J2EE.
GWT with GIN-jection on the front-end
Seems like the best option for me because I'm very familiar with Java and Swing and don't want to write any Javascript, PHP or learn a new language. GIN is a little brother of Guice and I will be using the same syntax and principles on both the client and server.
MSSQL RDBMS
This is actually a requirement from company management as I'd much rather like to go with an open-source solution. Too bad for me..
Maven 2
I think no-one can object to this :)
What I need help with
DB communication
I think that ORM is ruled out (is it?) so I need to use JDBC. Do you think Spring JDBC is lightweight and flexible enough for my use? I would often need to "blindly" read data from database, mapping it to some generic entity (because I won't assume any pre-determined structure), and then send the data using some generic DTO to the client along with the metadata telling it what data it is and how to display it. Or do you know any alternatives? Or should I do this myself?
Client/Server communication
GWT and its GWT-RPC mechanism seems not very suited for sending the generic data I need. Although I'm convinced that it's doable using GWT-RPC, are there any alternatives? But I definitely want to use GWT.
Security
Do you know any security libraries / frameworks that would help me? I'm aware of the existence of Spring-security; do you think it's flexible enough for my use or I'd be better off implementing that myself? Also, is Spring's IoC an integral part of the Spring framework, or would I be able to continue to use Guice?
Anything else that you think might be useful?
I really appreciate any advice and suggestions because I wouldn't dare to try to make such decisions myself. Please ask me if you need more information.
Thank you in advance!
eQui
I think you are over-engineering the solution. Take a look at
http://thedailywtf.com/Articles/Programming-Sucks!-Or-At-Least,-It-Ought-To-.aspx
If everything is driven by the DB, you are going to have immense difficulty making things happen in the UI, and you aren't going to be able to use many of the tools that make UI development easier.
I also suggest you take a look at Spring Roo, if your application is mainly just updating data in a database.
UI framework and implications for client/server communication
You say that any UI action will trigger the back-end (and potentially the DB). This means that UI interaction will be somewhat slow anyway, because every action requires a round trip to the server.
GWT is especially suited to avoiding round trips to the server as much as possible and doing all UI work on the client side. In that model, the only information that travels between client and server is real data, not UI metadata. GWT will do the job, but you'll be using a fairly low-level tool whose main benefit is advanced optimisation that you won't be able to perform anyway.
A framework like ZK or Vaadin seems better suited to what you want to do. The client side has nice widgets with a rich UI, but you manipulate the UI from the server side. The framework manages client/server communication for you (no need for REST, RPC or JavaScript). The main limitation of these frameworks is scalability, with all these chatty round trips. But because your requirements impose that chatty behaviour anyway, you could really benefit from the abstraction they provide, as it comes at no extra cost in your case.
I have tried both GWT and ZK in proofs of concept for my company. We ended up choosing GWT because of its ability to be embedded nicely into any existing UI and to fine-tune what you do, in particular avoiding round trips to the server as much as possible. But ZK is really easier and faster in terms of development hours.
As a side effect, that would completely solve your client/server communication concern, letting the framework perform it in an optimized way (ZK is able to intelligently group several UI events before sending them to the server).
DB and ORM
For DB design, I tend to think that storing things at a fine granularity in the DB will make it very slow. If each widget is one or several rows in the database, you'll have to perform many lookups to do the simplest thing.
The problem is that if your UI is even a little complex, with a few dozen elements (a few buttons, checkboxes, labels and widgets), composing a screen will require a lot of requests to the DB. Rendering just one page might be very slow, and scalability would be very bad.
I know this because I worked on a somewhat generic bug tracking system with similar (but simpler) requirements to yours, and we had exactly this problem.
So I would try to describe the UI in some templating or XML format. Maybe you won't show this data to the user and will give them a nice abstraction over it, but instead of performing many queries for just one screen, you'll save the whole screen as one blob.
A really dumb and basic implementation of this would be to store HTML/CSS/PNG files in your DB and load them as needed, with the user being responsible for making these HTML files by hand. Of course this would be terrible for the user. That's why you need a nice, fancy UI editor that works on an intermediate format of your own. Another dumb implementation would be some sort of wiki templating. This is not what you need; you need more. But you get the idea, and I would look in that direction.
For maintenance and debugging too, it would be far easier to reduce the whole UI description to a few files to understand what is really implemented than to read lots of tabular data in your preferred SQL editor. Users would also have an export/import format to easily version, back up or experiment.
Security
I would say by hand. Because you have a generic UI generated by the user, it seems likely that the security will be generic too and dependent on the database content.
Hope it helps.
For the back-end, I implemented a program that had a similar interaction with the database. The code was oblivious to the database structure; instead, it read a config file describing the DB and could construct complex SQL queries based on that information. Most of the code is proprietary, but one bit of it got pushed into an open source project called sqlbuilder, which may be useful to you on the back-end.
I think you're on the right track with your tool selection. Your 100% data-driven model is going to be hard to maintain, but I understand that's a requirement, not an option. Normal source control is going to fail you because the UI and application logic all live in the metadata. You'll need some good test databases and some way to maintain them, such as regularly dumping them (with mysqldump or your database's equivalent) and checking the dumps into source control to track the differences.
You're wise to stay away from various ORM solutions and just use JDBC for this type of app.
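For the "blind" reads the question describes, plain JDBC already exposes the column names at runtime through ResultSetMetaData. A minimal sketch, where GenericRows and readAll are hypothetical names chosen for this example:

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reads any ResultSet into generic rows without knowing its columns. */
final class GenericRows {

    /** Returns one map per row, keyed by column label, in the column order of the query. */
    static List<Map<String, Object>> readAll(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columnCount = meta.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>(); // keeps column order
            for (int i = 1; i <= columnCount; i++) {          // JDBC columns are 1-based
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

The resulting list of maps can serve as the generic DTO sent to the client alongside the metadata. Spring's JdbcTemplate offers a comparable convenience with queryForList, if you would rather not write the loop yourself.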
Let me give you some warnings about GWT. On the surface it will abstract away all the ugliness of HTML and JavaScript and give you clean hierarchies, BUT...
1) If the abstraction fails you, how do you easily debug?
2) Do you want any of your site to be visible to Google or other search engines? If yes, GWT is not for you.
3) Do you want to use any HTML5 technologies, or do you want to be stuck in IE 5 compatibility mode?
So...
I think you'll be much better off implementing the UI as simple HTML controls with a small set of jQuery AJAX interactions with the server. You can define an input type in your database, and your servlet can generate an input tag. Then you have two options. 1) You can have some standard event bindings in jQuery that tell your server that button1 was clicked, or that select2 has changed, etc.
Your server can send back JavaScript to change the state of the UI; simply load the JavaScript inside a div so it runs on the client. Or 2) you let the input submit the data to the server and do an old-school page refresh, with the servlet building the next UI screen based on the database.
Building an interface dynamically in HTML from a database is easy and straightforward compared to doing the same in Swing or Windows Forms. You just have to write out a big text string; I've been doing that since 1999.
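A minimal sketch of that "big text string" approach, assuming each control is described by a type, a name and a value read from the metadata tables. ControlRenderer, Control and the field names are hypothetical; the one thing worth copying is that every value is escaped before it goes into the markup.

```java
import java.util.List;

/** Hypothetical renderer: turns control metadata (as it might come from the database) into HTML. */
final class ControlRenderer {

    /** Minimal description of one control; the field names are assumptions for this sketch. */
    static final class Control {
        final String type;  // e.g. "text", "checkbox", "submit"
        final String name;
        final String value;

        Control(String type, String name, String value) {
            this.type = type;
            this.name = name;
            this.value = value;
        }
    }

    /** Escapes text for use inside a double-quoted HTML attribute. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders a single input tag from one control description. */
    static String render(Control c) {
        return "<input type=\"" + escape(c.type) + "\" name=\"" + escape(c.name)
                + "\" value=\"" + escape(c.value) + "\">";
    }

    /** Renders a whole form; the servlet would write this string to the response. */
    static String renderForm(String action, List<Control> controls) {
        StringBuilder html = new StringBuilder("<form method=\"post\" action=\"" + escape(action) + "\">\n");
        for (Control c : controls) {
            html.append("  ").append(render(c)).append('\n');
        }
        return html.append("</form>").toString();
    }
}
```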
That approach is going to be much more lightweight - simpler to debug, understand and modify in the long run than going with the "GWT automatically compiles to unreadable javascript that doesn't run in my browser for some unknown reason" solution.

How to Implement caching for a web application

What are the different ways to cache data in a web application built with Java and a NoSQL database? Databases also provide caching; are they the only option, and always the best one, for caching?
How else can I cache my users' data in the application? The application contains very user-specific data, as in a social network. Are there some simple rules of thumb for what kinds of things should be cached?
Can I also cache my data on the application server using Java?
If you want a rule of thumb, here's what Michael Jackson (not that Michael Jackson) said:
The First Rule of Program Optimization: Don't do it.
The Second Rule of Program Optimization (for experts only!): Don't do it yet.
The ancient tradition is that you don't optimise until you've profiled - that is, until you have hard evidence as to what actually needs to be optimised. Cacheing is a kind of optimisation; it is very likely to be important for your app, but until you are able to put your app under load and look at what objects are taking a long time to obtain (loading from the database or whatever), you won't know what needs cacheing. It really doesn't matter how smart you are, or what advice you get here - until you do that, you will not know what needs to be cached.
As for things you can cache, it's anything, but I suppose you can classify it into three groups:
Things that have come fresh from the database. These are easy to cache, because at the point at which you go to the database, you have the identifying information you'd need for a cache key (primary key, query parameters, etc). By cacheing them, you save the time taken to get them from the database - this involves IO, so it is likely to be quite large.
Things that have been produced by computation in the domain model (news feeds in a social app, perhaps). These may be trickier to cache, because more contextual information goes into producing them; you might have to refactor your code to create a single point where the required information is all to hand, so you can apply cacheing to it. Or you might find that this exists already. Cacheing these will save all the database access needed to obtain the information that goes into making them, as well as all the computation; the time taken for computation may or may not be a significant addition to the time taken for IO. Invalidating cached things of this kind is likely to be much harder than pure database objects.
Things that are being sent to the browser - pages, or fragments of pages. These can be quite easy to cache, because in a properly-designed application, they're uniquely identified by either the URL, or the combination of URL and user. Cacheing these will save all the computation in your app; it can even avoid servicing requests, because it can be done by a reverse proxy sitting in front of your app server. Two problems. Firstly, it uses a huge amount of memory: the page rendered from a few kilobytes of objects could be tens or hundreds of kilobytes in size (my Facebook homepage is 50 kB). That means you have to save a vast amount of computation to make it a better deal than cacheing at the database or domain model layers, and there just isn't that much computation between the domain model and the HTML in a sensibly-designed application. Secondly, invalidation is even harder than in the domain model, and is likely to happen prohibitively often - anything which changes the page or the fragment needs to invalidate the cache.
Finally, the actual mechanism: start with something simple and in-process, like a map with limited size and a least-recently-used eviction policy. That's simple but effective. Something out-of-process like EHCache is more complicated, but has two advantages: you can share caches between multiple processes (helpful if you have a cluster, which you probably will at some point), and you can store data where the garbage collector won't see it, which might save some CPU time (might - this is too big a subject to get into here).
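For the simple in-process option, the JDK's LinkedHashMap already supports this pattern. A minimal sketch (LruCache is a name chosen for this example):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Size-limited map that evicts the least-recently-used entry; not thread-safe on its own. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: get() moves an entry to the "recent" end
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // checked after each insertion; drops the eldest when over capacity
    }
}
```

In a web application several request threads will hit the cache at once, so it would need to be wrapped, for example with Collections.synchronizedMap(new LruCache<>(1000)).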
But I reiterate my first point: don't cache until you know what needs to be cached, and once you do, be mindful of the limitations on the benefits of cacheing, and try to keep your cacheing strategy as simple as possible (but no simpler, of course).
I'll assume you're building a relatively typical web application that:
has a single server used for persistence
multiple web servers
ties authenticated users to a single server via sticky sessions through a load balancer
Now, with that stated, to answer some of your questions: most persistence layers, relational or NoSQL, likely have some sort of caching built in, such that if you execute the same simple query repeatedly (e.g. retrieval by primary key) they are able to cache the result. However, the more complex the query, the less likely the persistence layer can cache it. In addition, if there's only one server for persistence (i.e. no sharding, or write master/read slaves) it quickly becomes the bottleneck. So the application-level caching you want to do usually should occur on the web servers to reduce load on the database.
As far as what should be cached, the heuristic is items frequently accessed and/or expensive to generate (in terms of database/web server processing/memory). Typical candidates are the home page and any other landing page of a site - often the best approach for these is generating a static file and serving that. The next pieces depend on your application, but typically the most effective strategy is caching as close to the final result as possible - often the HTML being served. For your social network this might be a list of featured updates or some such.
As far as user sessions are concerned, these are definitely a good candidate for caching. In this case you can probably get a lot of mileage out of judicious use of the web server's session scope (assuming a JSP server). This data lives in memory and is a good place to keep user-specific information that is shown on every page once a user authenticates (e.g. first and last name).
Now the final thing to consider is cache invalidation, which really is the hard part of all this (naming things is the other hard thing in computer science). In this case using something like memcached or ehcache, as others have mentioned, is the right approach. ehcache can easily run in process with your Java application and does a good job of expiring things, with policies for least recently used and least frequently used, and allowing you to use both memory and disk for caching. What you'll need to think about is the situations where you need to expire something from the cache ahead of this schedule because the data has changed. In this case you need to work through those dependencies in your application's architecture so that it reads from and writes to the cache as appropriate.
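To illustrate the two kinds of expiry mentioned here (scheduled, and early because the data changed), below is a minimal hand-rolled sketch. It is not how ehcache or memcached are used; ExpiringCache and its methods are hypothetical names, and the clock is passed in only so that expiry can be exercised without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical cache-aside helper with a time-to-live plus explicit invalidation on writes. */
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable, e.g. System::currentTimeMillis in production

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, or loads and caches it if absent or past its time-to-live. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }

    /** Call from the code path that changes the underlying data, ahead of the scheduled expiry. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

The design work is in finding every write path that must call invalidate; the time-to-live only bounds how long a missed invalidation can serve stale data.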
