Clustering java standalone application [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
I am working on a standalone Java app which reads big files (500 MB), deserializes them (protobuf messages, Google API) and inserts the data into an Oracle 11 database.
It is important to note that there is one main table in the database and several small tables (comparable to dictionaries).
For all the dictionaries, I have a Google Guava cache.
There is no cache for the main table. The main table only receives insertions: no updates, no deletes.
At the moment, this application runs on a single JVM.
(Potentially, I can add multithreading.)
I would like to make it work on several JVMs.
My problem is knowing what to do in order to get higher performance and to make it work properly.
I identified two problems: if clustering the app allows me to read several files at the same time, how to make the insertion into the main table faster, and how to update the cache?
Does anyone have an idea about that?

how to make the insertion into the main table faster
Jackpot! You must identify your bottlenecks and most likely it's either reading files or the database. Files are simple, just split them and put on different machines. Of course running several JVMs on the same machine won't help since they will all compete for I/O. So you must split the files and distribute them over several machines, together with JVMs.
I assume deserializing protobuf is not a bottleneck, it requires some CPU, but not that much.
And finally you have a database. It's possible that a single, single-threaded JVM already saturates the database, but adding threads is still worth trying. First make your app multi-threaded and see whether it helps.
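A minimal sketch of the "make it multi-threaded first" step, assuming the deserialized rows are already in memory. BatchSink and ParallelBatchLoader are hypothetical names used only for illustration; with JDBC, a sink would typically call PreparedStatement.addBatch() for each row and executeBatch() once per batch, on a connection owned by the worker thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sink for one batch of rows (an assumption, not part of the
// original app). A JDBC version would use addBatch()/executeBatch().
interface BatchSink<T> {
    void write(List<T> batch);
}

final class ParallelBatchLoader {

    // Splits rows into batches of batchSize and writes them on a fixed pool
    // of worker threads. Returns the number of rows handed to the sink.
    static <T> int load(List<T> rows, int batchSize, int threads, BatchSink<T> sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int start = 0; start < rows.size(); start += batchSize) {
                List<T> batch = rows.subList(start, Math.min(start + batchSize, rows.size()));
                pending.add(pool.submit(() -> {
                    sink.write(batch);
                    return batch.size();
                }));
            }
            int written = 0;
            for (Future<Integer> result : pending) {
                written += result.get(); // surfaces any failure from a worker
            }
            return written;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing the same load with threads set to 1, 2, 4 and so on is a cheap way to see whether the database or the client is the limit: if throughput stops improving, more JVMs are unlikely to help either.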
how to update the cache?
Jackpot again. You'll have to distribute/cluster your cache as well. Guava cache is not enough, you'll need something more sophisticated like RMI clustered EhCache, Terracotta or Hazelcast. Basically they provide a cache API, but they also notify the other members of the cluster when an entry changes so that those members can invalidate their copies.
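To make the invalidation idea concrete, here is a toy sketch that simulates it inside one JVM. InvalidationBus and NodeCache are made-up names, and the bus is only a stand-in for the network transport that a product such as Ehcache or Hazelcast would provide; none of this is their actual API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

// Stand-in for the cluster transport (assumption: a real product delivers
// these notifications over the network).
final class InvalidationBus {
    private final List<NodeCache> members = new CopyOnWriteArrayList<>();

    void join(NodeCache node) {
        members.add(node);
    }

    // Tells every member except the sender to drop its copy of the key.
    void publish(NodeCache source, String key) {
        for (NodeCache node : members) {
            if (node != source) {
                node.evictLocal(key);
            }
        }
    }
}

// One JVM's local dictionary cache.
final class NodeCache {
    private final Map<String, Long> local = new ConcurrentHashMap<>();
    private final InvalidationBus bus;
    private final Function<String, Long> loader; // e.g. a lookup in a dictionary table

    NodeCache(InvalidationBus bus, Function<String, Long> loader) {
        this.bus = bus;
        this.loader = loader;
        bus.join(this);
    }

    // Read-through: load from the backing store on a miss.
    Long get(String key) {
        return local.computeIfAbsent(key, loader);
    }

    // Local write, then notify the other members.
    void update(String key, Long value) {
        local.put(key, value);
        bus.publish(this, key);
    }

    void evictLocal(String key) {
        local.remove(key);
    }

    boolean holds(String key) {
        return local.containsKey(key);
    }
}
```

After one node calls update, the other nodes no longer hold the key and reload it from the database on their next get, which is the behaviour the clustered caches above give you across machines.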
BTW 500 MiB is not really that much, how long does it take to process? Again, you must profile to find what's slowing you down.

Related

Deploy a basic Java application to users in a network [closed]

Closed 10 years ago.
I need some tips on creating a Java application. A few words about the application: people in the company will log in with a password; there will be customised access types; and it should be possible to perform changes in a remote database from a host within the network.
This is probably a very simple program. At first I don't care for security matters, that can be optimised in time. I just don't know how to start.
So far I've played with some algorithms (I like algorithms), connected an applet to a database, done a few selects/updates, and built a couple of Swing UIs.
Even something that sounds like detailed chapter titles that I can investigate can prove useful, as I'm not sure what to look for when I want to create an application that can be distributed in a network.
Based on the previous:
What do I need to create? An applet, a Swing application, etc.? Should it be an .exe, and roughly how is that done?
Thanks for any tips.
I hear the application should come with an installation kit? What will that install - probably the JRE after checking whether it's installed on the client PC? What else?
Other tips that I should know for starters?
I'm not worried about algorithms or class names in the context of a user action (a select/update,etc)- I can find those in libraries. I'm interested in how to actually create a basic application that can be sent to users all over the company (methodology/practices/trainings), that they can run and see on their screen the result of a simple select, let's say. Any pointers/good references - as there are a lot of sites out there and not all are good. Thank you!

For desktop applications with a GUI, database interaction etc., you should consider using a rich client framework like the NetBeans Platform or Eclipse RCP. This will certainly make a few things way easier, for example deploying the application, creating installers, or supporting multiple windows with docking capabilities.
By the way
At first I don't care for security matters
is generally a very, very bad idea...

Are you trying to launch the application from a web link? Then what you're looking for is "JNLP" or "Java Web Start".
https://en.wikipedia.org/wiki/Java_Web_Start
There are ways to set up user perms as part of the launch, and it will provision and deploy code and updates if necessary.
Good luck!

How can a DAO benefit the scalability of a system? [closed]

Closed 10 years ago.
Maybe this is not a technical question, but I bet there are many experienced developers here who can help me answer it.
Thanks.

A DAO layer is essentially an abstraction, like Sajit says. However, I disagree with his interpretation. The point of abstracting something is to achieve a goal, usually the simplification of some more complex use case.
You could easily create a DAO layer that also provides more functionality than simply mapping application entities to data entities. It could provide caching, optimisation, translation, resiliency etc. So there is no reason why it could not offer the ability to better scale your application.
Ultimately it depends on terms: what does scaling mean to your application? More/faster/etc.?
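As a rough illustration of a DAO that does more than mapping, the sketch below puts a read-through cache behind the DAO interface. All of the names (CustomerDao, SlowCustomerDao, CachingCustomerDao) are invented for this example, and the "slow" DAO is an in-memory stand-in for a real database-backed one.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical DAO contract: callers never see how or where data is stored.
interface CustomerDao {
    Optional<String> findName(long id);
}

// Stand-in for a database-backed implementation; counts round trips.
final class SlowCustomerDao implements CustomerDao {
    final AtomicInteger roundTrips = new AtomicInteger();
    private final Map<Long, String> table = new ConcurrentHashMap<>();

    SlowCustomerDao(Map<Long, String> rows) {
        table.putAll(rows);
    }

    @Override
    public Optional<String> findName(long id) {
        roundTrips.incrementAndGet();
        return Optional.ofNullable(table.get(id));
    }
}

// Decorator that adds a read-through cache behind the same interface.
// Note that misses (empty results) are cached as well.
final class CachingCustomerDao implements CustomerDao {
    private final CustomerDao delegate;
    private final Map<Long, Optional<String>> cache = new ConcurrentHashMap<>();

    CachingCustomerDao(CustomerDao delegate) {
        this.delegate = delegate;
    }

    @Override
    public Optional<String> findName(long id) {
        return cache.computeIfAbsent(id, delegate::findName);
    }
}
```

Because callers only depend on CustomerDao, the cache can be added, tuned or removed without touching them, which is the sense in which the abstraction can help an application scale.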

A DAO is typically used to abstract away the implementation details of the database in an application and has nothing to do with scalability.

The DAO (Data Access Object) is used to provide a layer of abstraction over the database. It tends to have methods which eventually open connections and execute queries and/or stored procedures.
I think that when it comes to scalability, you need to watch out for one major thing in a DAO: connection management. If you are using a third-party library, something along the lines of Hibernate, you will most likely have to worry less about connections since these are managed by the library itself.
On the other hand, if you implement everything yourself, you will need to make sure that you open the connection at the last possible moment and release it at the first possible moment. A DAO which hogs connections will eventually limit how well your application scales.
Lastly, in some cases, the DAO passes direct queries to the database. You will need to keep an eye on how you build these queries to make sure that they do not involve any unneeded processing.
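The "open late, release early" advice can be sketched with try-with-resources. ConnectionPool and ReportDao below are hypothetical, and a semaphore permit stands in for a database connection; with plain JDBC the same pattern applies to a java.sql.Connection obtained from a javax.sql.DataSource.

```java
import java.util.concurrent.Semaphore;

// Hypothetical fixed-size pool; a permit stands in for a database connection.
final class ConnectionPool {
    private final Semaphore permits;

    ConnectionPool(int size) {
        this.permits = new Semaphore(size);
    }

    // A lease gives its permit back to the pool when closed.
    final class Lease implements AutoCloseable {
        String query(String sql) {
            return "result of " + sql; // placeholder for real database I/O
        }

        @Override
        public void close() {
            permits.release();
        }
    }

    Lease acquire() {
        permits.acquireUninterruptibly();
        return new Lease();
    }

    int idle() {
        return permits.availablePermits();
    }
}

final class ReportDao {
    private final ConnectionPool pool;

    ReportDao(ConnectionPool pool) {
        this.pool = pool;
    }

    String buildReport(String customer) {
        // Work that needs no connection happens before acquiring one.
        String sql = "SELECT total FROM orders WHERE customer = ?";
        String raw;
        try (ConnectionPool.Lease lease = pool.acquire()) { // as late as possible
            raw = lease.query(sql);
        } // released here, before any post-processing
        return customer + ": " + raw.trim();
    }
}
```

The point of the structure is that formatting and other post-processing run after the lease is closed, so a slow caller never holds a connection longer than the query itself.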

In really simple words, you can't make a scalable application if everything is tied together. A DAO is just another layer which helps you deal with the data access logic. This way you know where to find your dynamic SQL and the like, and you can enhance and maintain it.

Make a website for running code [closed]

Closed 10 years ago.
I'm currently developing a programming contest website, and want to implement support for submitting code and running it on the website. After quite a bit of googling, I still haven't found any "guides" for this.
Does anyone know of a website(or other sources) that contains some basic guidelines or ground rules for this?
Appreciate all replies.
PS: If anyone wonders about all the programming language tags, I'm planning on supporting at least these languages.

Careful -- if you're finding it difficult to break this project down into some smaller, more tactical problems, I'd strongly suggest that you make no attempt whatsoever to actually run anyone else's code on your site. In terms of creating the site itself, I'd suggest leveraging pre-built components or services where possible -- Wordpress, GitHub, etc.
Once you've got the submissions, you'll want to have a way to run them safely. For all practical purposes, this means that you should assume that any machine you run someone else's code on might spontaneously burst into flames. While it's true that some of these languages have features you should be able to use to run code in a "sandbox", you're probably not going to be expert enough in all these languages to be able to properly secure all of them.
It seems that something like Amazon's EC2 might be helpful -- spin up a VM when you need to run a submission, and throw it away when you're done. They've got some pre-configured images that would probably be well-suited to running this code, and if something gets buggered up because of buggy or malicious code, you don't mind too much because you're just going to throw it away when you're done.

There is a site that already does this, albeit for a particular purpose: scraping data.
https://scraperwiki.com/ - Unlike jsfiddle, scraperwiki executes server-side code. As far as I can gather, they likely sandbox the environment via amazon instances. Not sure that their code can be entirely audited and sanitized, given the variety of languages and scraping libraries they support.
I think most people are baffled as to how scraperwiki keeps hackers and spammers at bay from misusing their resources. They've been rather mum about it; either they've manually audited every bit of executed code, or hackers/spammers haven't caught onto them yet. Since the site has a specific function, they probably check data utilization to determine suspicious activity. ...but, one man's site scraping is another man's harassment and injection by get/post.
My hunch is that they'll never publicly spell out what their security audit process is like.
If you really had to do it, the simplest mechanical way of doing this without virtualization is to use a variant of eval(). But not all languages have that. Which brings you to option B, which is virtualization. Better people than I can explain how to regiment virtual machines to this effect, and will caution you properly on letting strangers abuse your resources. Instead, I'll share my PHP experience.
Some years back I made a project that does code execution on the fly (on a local machine). As you type, it takes the code via ajax and executes it after each keystroke. Here's a video of its behavior: http://www.youtube.com/watch?v=Yfxrt2pc3pg.
Half a decade and 3 improvement prototypes later, I'm still not sure how I would responsibly lock this down as a common resource.

For Java it is quite simple:
Create a servlet for uploading source code to the server (for example, via a POST request).
Use the Java Compiler API to compile the source code to bytecode.
Load the compiled bytecode dynamically via a ClassLoader and launch it (you might also configure a SecurityManager).
And don't forget about MVC architecture :)
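The compile-and-load steps above can be sketched roughly as follows, using javax.tools.ToolProvider and URLClassLoader from the JDK. SubmissionRunner is a made-up name, and the sketch assumes the submitted class is public with a public static no-argument method. It does no sandboxing at all, so it should not be exposed to untrusted code as it stands; note also that SecurityManager has been deprecated for removal since Java 17.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class SubmissionRunner {

    // Compiles one source file into a temp directory, loads the class with a
    // fresh class loader and invokes a public static no-arg method on it.
    static Object compileAndRun(String className, String source, String methodName) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No compiler available; run on a JDK, not a JRE");
        }
        try {
            Path dir = Files.createTempDirectory("submission");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // null streams mean System.in / System.out / System.err
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalArgumentException("Submission does not compile");
            }

            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                Method entry = loader.loadClass(className).getMethod(methodName);
                return entry.invoke(null);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not run submission", e);
        }
    }
}
```

A servlet would pass the uploaded text to compileAndRun. For a contest site, running the result in a separate, disposable process or VM, as other answers suggest, is the safer design.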

Hiding java code [closed]

Closed 11 years ago.
I would like to give an executable jar file to my client, and I have code in it that makes it expire after a certain time. But if the client uses a decompiler and reads the class files, he can modify the code and make it keep working. Is there any way I can stop this from happening?
Can I use deleteOnExit() or some other technique?

If you are so worried someone will crack your software, you'll need to use some kind of client/server architecture where your client can only log in to a webpage on your servers.
Any code can be cracked if there's someone who really wants to. Of course, most of the time it's simply not worth it.

The best solution for this is PaaS (platform as a service).
Put your logic on a server as a server app and let the client app communicate with it through a web service or any other way. This is the best real solution.
BTW: obfuscation cannot protect your code.

Sorry, not possible in any kind of environment currently in use. If you are so worried about the client stealing your code, it might be good to reevaluate your relationship with him.
On the other hand, you could provide him with a gated VNC view of your software, whereby he can use it but where you remain in control of the environment.

In the past, I've used Zelix KlassMaster to perform obfuscation. It does a very good job, however, you have to spend time to configure it such that it works properly for your needs. If you use reflection, then you have to ensure that it doesn't obfuscate those class/method/property names, etc. One of its strong suits is that it will obfuscate strings as well.
All that being said, the end result is that your client will still have your code, albeit in a very difficult-to-understand format. However, if he truly has the time and puts in the effort, he can reverse engineer it.
A lot also depends on exactly what it is that you are trying to protect. If you are trying to protect the actual IP, then obfuscation will help you out. If you are trying to protect licensing, then obfuscation just makes it a little more challenging for someone to figure out where your licensing module(s) are and how to circumvent them. In the latter case, I would suggest that you use something like AspectJ to weave licensing checks into several different classes, just to make it more difficult to break. However, that too is not fail-proof.
As others have already said, the only fool-proof system is to not give the client the code in the first place and to change to a SaaS (Software as a Service) solution.

What about using a modified classloader that is able to load your classes from encrypted storage? As you do not directly expose the jar and the classes inside, it might help, but as all the others said above, it won't be bulletproof.
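A minimal sketch of that idea, assuming the class files were encrypted ahead of time with AES-GCM and shipped in a map keyed by class name. EncryptedClassLoader and its storage format (12-byte IV followed by ciphertext) are assumptions made for this example only.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class EncryptedClassLoader extends ClassLoader {
    private static final int IV_LENGTH = 12;
    private static final int TAG_BITS = 128;

    private final Map<String, byte[]> encryptedClasses; // class name -> IV + ciphertext
    private final byte[] rawKey;

    EncryptedClassLoader(ClassLoader parent, Map<String, byte[]> encryptedClasses, byte[] rawKey) {
        super(parent);
        this.encryptedClasses = encryptedClasses;
        this.rawKey = rawKey.clone();
    }

    // Build-time helper: encrypts class bytes, prefixing a random IV.
    static byte[] encrypt(byte[] plain, byte[] rawKey) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(rawKey, "AES"),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    static byte[] decrypt(byte[] data, byte[] rawKey) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(rawKey, "AES"),
                    new GCMParameterSpec(TAG_BITS, data, 0, IV_LENGTH));
            return cipher.doFinal(data, IV_LENGTH, data.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }

    // Called only after the parent loader fails to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] encrypted = encryptedClasses.get(name);
        if (encrypted == null) {
            throw new ClassNotFoundException(name);
        }
        try {
            byte[] bytes = decrypt(encrypted, rawKey);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IllegalStateException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

The weakness is visible in the sketch itself: the loader and the key have to ship with the application, so anyone who decompiles the loader can decrypt the rest.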

java Saas Product is very slow [closed]

Closed 11 years ago.
I am developing a SaaS-based Java product using the Spring 3 framework and Hibernate with annotations.
The site is very slow. I use Apache Tomcat 6.0.26.
Can someone tell me what changes I should make?
Thanks in advance!

It could be any number of reasons, though I doubt Apache has anything to do with it.
Possible reasons include:
unoptimized queries
database tables don't have indexes where they are needed
your database server hardware is crap
your algorithms and choice of data structures are bad
you are calling database too many times (requesting data even though it was loaded on previous screen etc..)
your database configuration and caching might be wrong
your frontend technology might be problematic (JSF is slow compared to JSP)
The first thing you should do is figure out how much time per page render is spent querying the database and how much on the application server. You should also record how many queries are executed per render (and which ones). Then subtract these two times from the time it takes to load the page on the client side, and you get the third time, which is the time to send the page to the client and render it in the browser.
If the time spent in the database on a single query is large, then use a database profiler to see where the DB is performing long table scans and set up indexes there. If the query returns a lot of data but you are using just a bit of it, try writing a more specific query. If you spend a lot of time in the DB because of the number of queries, try to reduce the number of queries by caching or reusing data on the application server.
If the time spent on the application server seems to be the problem, you might need to rethink your algorithms and design choices.
If a lot of time is spent in the third part, transferring the page and rendering it on the client, try optimizing JavaScript, using expiration headers on your static content, CDNs, etc.
Download and install the YSlow plugin, use it to test your page, and follow its suggestions.
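The measurement step described above (database time, application time and query count per render) could be instrumented along these lines. RequestProfile is a hypothetical helper meant to be created once per request and used from that request's thread only; it is not part of Spring or Hibernate.

```java
import java.util.function.Supplier;

// Per-request bookkeeping: how many queries ran and how long they took.
// Not thread-safe; intended for use on a single request thread.
final class RequestProfile {
    private final long startNanos = System.nanoTime();
    private long dbNanos;
    private int queryCount;

    // Runs one database call and charges its duration to the DB bucket.
    <T> T timeQuery(Supplier<T> query) {
        long begin = System.nanoTime();
        try {
            return query.get();
        } finally {
            dbNanos += System.nanoTime() - begin;
            queryCount++;
        }
    }

    int queryCount() {
        return queryCount;
    }

    long dbNanos() {
        return dbNanos;
    }

    // Time elapsed since the profile was created.
    long totalNanos() {
        return System.nanoTime() - startNanos;
    }

    // Server-side time not spent waiting on the database.
    long appNanos() {
        return totalNanos() - dbNanos;
    }

    String summary() {
        long total = totalNanos();
        return String.format("queries=%d db=%dms app=%dms",
                queryCount, dbNanos / 1_000_000, (total - dbNanos) / 1_000_000);
    }
}
```

Logging summary() at the end of each request shows quickly whether a slow page is dominated by a few long queries, by a large number of small ones, or by application code.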
