I am trying to find the best way to update my database.
I have two solutions, but I don't know which is less demanding.
The first is to write a Java program that calls Thread.sleep inside an infinite loop.
The second is also a Java program, but without the Thread.sleep and the infinite loop: it only performs the update, and I run it with cron.
Thanks for your help
"No thread sleep" + cron is simpler and easier to maintain:
Your program does just one main thing - updates the database.
Scheduling is delegated to cron, so the configuration is more flexible and you do not need to write extra code to support scheduling.
I want my servlet to wait for the result of a database update before moving to the next line.
I have the following code snippet:
//wait for this to finish and get the status
status=profiledao.updateProfile(profile);
//then execute this statement
httpresponse.getWriter().print(fileName);
What is the best way to do this?
Thanks.
Unless you are running this line asynchronously with additional unshown code, the code you have posted will do exactly what you are intending to do in a normal Java environment.
If you are running this asynchronously and want to keep doing so while still accomplishing the task you describe, you want what is called a promise, which in Java is provided by CompletableFuture:
CompletableFuture.supplyAsync(this::sendMsg)
.thenAccept(this::notify);
This is a very simple way of doing it in an asynchronous setting. However, if the code you have shown is synchronous, it is not needed.
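For the asynchronous case, a minimal sketch of the chaining might look like the following. The class name, the `updateProfile` stand-in and the "update failed" message are assumptions made for illustration; they are not the asker's DAO or servlet code.

```java
import java.util.concurrent.CompletableFuture;

class UpdateThenRespond {
    // Hypothetical stand-in for profiledao.updateProfile(profile).
    static boolean updateProfile(String profile) {
        return !profile.isEmpty();
    }

    // Runs the update on another thread, then builds the response text
    // only after the update has produced its status.
    static CompletableFuture<String> updateThenRespond(String profile, String fileName) {
        return CompletableFuture
                .supplyAsync(() -> updateProfile(profile))
                .thenApply(status -> status ? fileName : "update failed");
    }

    public static void main(String[] args) {
        // join() blocks until both stages have finished.
        System.out.println(updateThenRespond("alice", "report.pdf").join());
    }
}
```

The second stage never starts before the first one has completed, which is the ordering the question asks for.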
I'm trying to execute a Spark program with spark-submit (in particular GATK Spark tools, so the command is not spark-submit, but something similar). The program accepts only a single input, so I'm trying to write some Java code that lets it handle several inputs.
In particular I'm trying to execute a spark-submit for each input, through the pipe function of JavaRDD:
JavaRDD<String> bashExec = ubams.map(ubam -> par1 + "|" + par2)
.pipe("/path/script.sh");
where par1 and par2 are parameters passed to script.sh, which splits them on "|" and uses them to execute something similar to spark-submit.
Now, I don't expect a speedup compared to the execution of a single input, because I'm calling other Spark functions. I just want to distribute the workload of several inputs across different nodes and get an execution time that is linear in the number of inputs.
For example, the GATK Spark tool took about 108 minutes with a single input; with my code I would expect two similar inputs to take about 216 minutes.
I noticed that the code "works", or rather I get the usual output on my terminal. But after at least 15 hours the task had not completed and was still executing.
So I'm asking whether this approach (executing spark-submit with the pipe function) is a bad idea, or whether there are probably other errors.
I hope I have explained my issue clearly.
P.S. I'm using a VM on Azure with 28GB of Memory and 4 execution threads.
Is it possible
Yes, it is technically possible. With a bit of caution it is even possible to create a new SparkContext in the worker thread, but
Is it (...) wise
No. You should never do something like this. There is a good reason why Spark disallows nested parallelization in the first place. Anything that happens inside a task is a black box, so it cannot be accounted for during DAG computation and resource allocation. In the worst case the job will simply deadlock, with the main job waiting for the tasks to finish and the tasks waiting for the main job to release the resources they need.
How to solve this? The problem is only roughly outlined, so it is hard to give precise advice, but you can:
Use a driver-local loop to submit multiple jobs sequentially from a single application.
Use threading and in-application scheduling to submit multiple jobs concurrently from a single application.
Use an independent orchestration tool to submit multiple independent applications, each handling one set of parameters.
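The threading option can be sketched roughly as follows. This is plain Java with no Spark dependency: `runAll` and its `job` parameter are hypothetical names, and `job` stands in for whatever driver-side code triggers a Spark action for one input. Spark's scheduler accepts jobs submitted from separate driver threads, so the same pattern would apply there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ConcurrentJobs {
    // Submits one job per input from the driver and waits for all of them.
    // `job` is a placeholder for code that runs a Spark action on one input.
    static <T, R> List<R> runAll(List<T> inputs, Function<T, R> job, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T input : inputs) {
                Callable<R> task = () -> job.apply(input);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // waits for this job to finish
            }
            return results; // same order as the inputs
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("a job failed or was interrupted", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With `maxConcurrent` set to 1 this degenerates into the sequential driver-local loop from the first option.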
I'm fairly new to Java and I was creating a program that would run indefinitely. Currently, the program is set up so that one method performs a task and then calls another method in the same class, which performs its own task and then calls the initial method. This process repeats indefinitely until I stop the program.
My problem is that when I try to create a GUI to make my program more user friendly, once I press the initial start button this infinite loop will not allow me to perform any other actions, including stopping the program.
There has to be another way to do this?
I apologize if this method is extremely sloppy. I sort of taught myself Java from videos and from looking at other programs, and I don't entirely understand it yet.
You'll need to run your task in one thread and keep your GUI code in another.
Actually, if you keep working on this problem, you'll eventually invent event-driven programming. Lots of GUI-based software, such as Android apps, uses this paradigm.
There are several solutions. The first that comes to mind is that you could put whatever method needs to run forever in its own thread, and have a different thread listen for user input. This might introduce difficulties in getting the threads to interact with each other, but it would allow you to do this.
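A minimal sketch of the separate-thread idea, assuming the start and stop buttons simply call `start()` and `stop()`. The class name, the iteration counter and the 10 ms pause are placeholders for illustration, not part of the original program.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class BackgroundLoop {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicLong iterations = new AtomicLong();
    private Thread worker;

    // Called from the start button handler; returns immediately.
    synchronized void start() {
        if (!running.compareAndSet(false, true)) {
            return; // already running
        }
        worker = new Thread(() -> {
            while (running.get()) {
                iterations.incrementAndGet(); // placeholder for the real task
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return; // stop() interrupted the pause
                }
            }
        }, "background-loop");
        worker.setDaemon(true);
        worker.start();
    }

    // Called from the stop button handler; waits for the worker to exit.
    synchronized void stop() {
        running.set(false);
        if (worker == null) {
            return;
        }
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    long iterations() {
        return iterations.get();
    }
}
```

Because the loop runs on its own thread, the GUI thread stays free to handle clicks. In Swing, any UI update made from the worker would still have to be handed back to the event dispatch thread, for example with SwingUtilities.invokeLater.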
Alternatively, add a method that checks for user input and handles it, and call it inside the infinite loop of your program, something like this:
while(true){
//do stuff
checkForUserInput();
//do other stuff
}
To solve this problem, you need to run your UI in another thread.
Many programs are based on an infinite loop (servers that keep waiting for a new user to connect for example) and your problem isn't there.
Managing the CPU time (or the core) allocated to your infinite loop and the time allocated to your UI interactions is the job of the operating system, not yours: that's why your UI should run in a separate thread from your actual code.
Depending on the GUI library (Swing, ...) you're using, there may be different ways to do it, and how to implement it is well answered on Stack Overflow.
I have a cleaner thread in my code that is created if the DB capacity is exceeded; the capacity is checked on every insertion into the DB. I would like to add more functionality to this cleaner so that it also cleans when the number of files exceeds, let's say, 10000. The new functionality should run on a schedule.
I want to be able to clean the DB in 2 ways:
1. On demand.
2. Scheduled, every day at hour X.
Which concurrent Java class should I use?
How can I make sure that the same thread is used for both ways above?
The code that performs the DB cleanup should be completely separated from the scheduling (single responsibility principle), so that you can execute it at any time from other code.
As for scheduling, I would suggest looking at the Quartz scheduler and getting familiar with cron expressions, so that you can move the expression to a properties file and change the scheduling trigger without modifying your code.
You should synchronize your code so that no more than one cleanup runs at a time; this should be easy with the standard synchronized keyword.
If you want to keep it very simple and don't want to add new dependencies, you can go with the standard Java solution: Timer. Timer#scheduleAtFixedRate provides fixed-rate execution. The trade-off is that you'll have to add extra code whenever new requirements show up (e.g., don't run at weekends).
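As a dependency-free alternative to Timer, here is a rough sketch using ScheduledExecutorService from java.util.concurrent. It is a swapped-in technique, not what the answers above propose, and the class and method names are made up for illustration. A single-threaded executor queues the scheduled and on-demand runs on one worker thread, so they never overlap.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DbCleaner {
    // One worker thread: scheduled and on-demand runs are queued on it.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable cleanup; // the cleanup logic, kept separate from scheduling

    DbCleaner(Runnable cleanup) {
        this.cleanup = cleanup;
    }

    // Time from `now` until the next occurrence of `runAt`.
    static Duration delayUntil(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Scheduled: first run at `runAt`, then every 24 hours.
    void scheduleDaily(LocalTime runAt) {
        long initialDelay = delayUntil(LocalDateTime.now(), runAt).toMillis();
        executor.scheduleAtFixedRate(cleanup, initialDelay,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
    }

    // On demand: queued behind any run already in progress.
    Future<?> cleanNow() {
        return executor.submit(cleanup);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Two caveats with this sketch: a fixed 24-hour period drifts by an hour across daylight-saving changes, and if the cleanup throws, scheduleAtFixedRate suppresses all later runs, so the Runnable should catch its own exceptions.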
I am creating a Java service that will run continuously in the background; its job is to create a copy of a table on a particular date. To be exact, I read data from a table and, if record_date in the table matches the current date, I need to create the table copy. Then the service should sleep until the next run date, which is also determined by looking at a record in the table.
Currently, I do this by creating a thread that runs in a while(true) loop. When the thread has finished its task, i.e. creating the table copy, I put it to sleep with Thread.sleep() until the next time it needs to run. I calculate the number of milliseconds to sleep as the difference between the current date (the date on which the thread performs the task) and the next run date.
Is this the right approach? Is using Thread.sleep() the right thing for this particular scenario? I ask because the next run date for the thread could be three months or even a year away. Also, please let me know if I am not being clear.
What about separating the two operations?
Write a Java job which, when invoked, checks the date in the table and creates the copy.
Schedule the Java job to run the way you want it to run.
If you are on UNIX, cron helps a lot with such tasks.
Have a look at the Lock interface and its Condition objects. Together they are an abstraction over wait() and notify(), which is what you should use instead of sleep().
There is an answer here which illustrates why.
Check out the Java Timer API or the Quartz library
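If the service has to stay inside one long-running JVM, the sleep loop could be replaced with a one-shot scheduled task that reschedules itself. This is only a sketch built on the JDK's ScheduledExecutorService rather than Timer or Quartz: the class name and the `copyTable` and `nextRunDate` parameters are hypothetical, and `nextRunDate` is assumed to return a later date, or null to stop.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class TableCopyScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // Delay from `now` until the start of `runDate`; zero if that moment has passed.
    static Duration delayUntil(LocalDateTime now, LocalDate runDate) {
        Duration delay = Duration.between(now, runDate.atStartOfDay());
        return delay.isNegative() ? Duration.ZERO : delay;
    }

    // Runs `copyTable` on `runDate`, then asks `nextRunDate` (e.g. a table lookup)
    // for the following date and schedules the next run.
    void scheduleNext(LocalDate runDate, Runnable copyTable, Supplier<LocalDate> nextRunDate) {
        long delayMs = delayUntil(LocalDateTime.now(), runDate).toMillis();
        executor.schedule(() -> {
            copyTable.run();
            LocalDate next = nextRunDate.get();
            if (next != null) {
                scheduleNext(next, copyTable, nextRunDate);
            }
        }, delayMs, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

The pending schedule lives only in memory, so a restart loses it. With run dates that can be months apart, that is one reason the cron approach suggested above may be the more robust choice.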