Why do my map tasks take at least 3 seconds? - java

I previously asked Why are the durations of hadoop mapper tasks always a multiple of 3 seconds? and by setting the value of Task.PROGRESS_INTERVAL to a small number (>1000) I was able to get fast running tasks to finish in 3 seconds rather than the 6 they were taking. These tasks are literally processing no data and I'd like to get them to finish more quickly if possible.
What are the parameters that are causing the tasks to take 3 seconds?

Related

How to implement scheduling inside a running thread group?

I am testing a certain "functionality" that happens after log in.
The test case is 500 users exercising that functionality within 5 minutes.
I can add a synchronising timer after the log in to ensure all 500 threads have logged in, but then it will do all 500 "functionality" tasks at once rather than over 5 minutes, which will crash the app (it thinks there's a DDoS attack and shuts down).
Right now, I am handling this by adding some think time after login, to slow login down to a stable figure that I can predict, and then starting "functionality" at each thread's turn, as scheduled by: the main scheduler + the log in response time + the think time...
But that's a bit fuzzy.
Is there a way to "ramp up" tasks once already running?
I can think of two options.
The first is to use random times. Pick a delay in the range [0, 300) seconds, or [0, 300000) in milliseconds, and sleep the thread for that random time before it starts.
This approach can be a little more realistic: in one particular second of the interval no threads may start, while in another 2 or 3 do. Overall the load should still be well balanced, since the requests are not all fired at the start.
The second is to start the threads uniformly. During setup (login, before firing the threads) you can use something like an AtomicInteger, initialized with new AtomicInteger(0), and call getAndIncrement() to assign each thread a position in the range [0, 500). When you fire the threads, each one sleeps 300000.0 * id / 500.0 milliseconds (that is, 300.0 * id / 500.0 seconds) before executing its task or request.
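Both options can be sketched in plain Java. This is only an illustration of the delay arithmetic, not JMeter-specific code: the class name, method names, and constants are hypothetical, and the 500 threads and 5-minute window are taken from the question.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

public class StaggeredStart {
    static final long WINDOW_MILLIS = 300_000; // 5 minutes, from the question
    static final int THREADS = 500;            // number of users, from the question

    // Option 1: a random delay in [0, windowMillis).
    static long randomDelayMillis(long windowMillis) {
        return ThreadLocalRandom.current().nextLong(windowMillis);
    }

    // Option 2: an evenly spaced delay for the thread at position id in [0, threads).
    static long uniformDelayMillis(int id, int threads, long windowMillis) {
        return windowMillis * id / threads;
    }

    public static void main(String[] args) {
        AtomicInteger nextId = new AtomicInteger(0);
        // Show the delays the first few threads would sleep before starting.
        for (int i = 0; i < 5; i++) {
            int id = nextId.getAndIncrement();
            System.out.println("thread " + id
                    + ": uniform delay " + uniformDelayMillis(id, THREADS, WINDOW_MILLIS) + " ms"
                    + ", random delay " + randomDelayMillis(WINDOW_MILLIS) + " ms");
        }
    }
}
```

With the uniform option, consecutive threads start 600 ms apart, so the last of the 500 starts 299.4 seconds into the window.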
By default JMeter executes requests as fast as it can. You can "throttle" the execution to a desired throughput (requests per minute) using the Constant Throughput Timer.
An example Test Plan would look like:
Thread Group
Login
Synchronizing Timer
Functionality
Constant Throughput Timer
The Constant Throughput Timer follows JMeter Scoping Rules, so you can apply it either to a single sampler or to a group of samplers.

AppEngine Scheduled Task syntax for a task to run after X years

I have a task that needs to run every 4 years in my application. How can I configure that in cron.xml? I know that <schedule>1 of Jan</schedule> would run it yearly, but is there a syntax for running it after X years, something like <schedule>every 4 years 1 of Jan</schedule>?
-Srikanth
The documented schedule format doesn't seem to have support for what you desire.
You could convert such very long intervals to hours and use this format instead (it might be tricky to get exact dates/times in some cases: leap years, etc.):
every N hours
I just tried a number of hours equivalent to over 4 years, and the development server doesn't seem unhappy:
Another possible approach (which also allows precise date/time specs) is to invoke the task yearly as you mentioned, keep track inside your app of the last execution date or the number of invocations, and only execute the actual task's job in 1 out of 4 invocations, simply exiting without doing anything in the other 3.
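A minimal sketch of the "run yearly, act every fourth year" check, in plain Java. The class and method names are hypothetical, and it assumes the year of the last real execution is stored somewhere by the app (the storage itself is not shown).

```java
import java.time.LocalDate;

public class EveryFourYears {
    static final int INTERVAL_YEARS = 4;

    // lastRunYear is the year the real job last ran, or null if it never ran.
    // Where this value is persisted is an assumption left to the application.
    static boolean shouldRun(Integer lastRunYear, LocalDate today) {
        return lastRunYear == null || today.getYear() - lastRunYear >= INTERVAL_YEARS;
    }

    public static void main(String[] args) {
        // The yearly cron invocation would call this and exit early when it returns false.
        Integer lastRunYear = null; // hypothetical: loaded from the app's storage
        if (shouldRun(lastRunYear, LocalDate.now())) {
            System.out.println("Running the 4-yearly job");
        } else {
            System.out.println("Skipping this year");
        }
    }
}
```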

How to simulate CPU bound job

I want to simulate CPU bound jobs in my simulator, and I need a calculation or code that runs for 1 second on the CPU. How can I do it?
I am using the following code:
long Time1 = System.currentTimeMillis();
///calculation or loop that spends 1 second in cpu
long Time2 = System.currentTimeMillis();
System.out.println(Time2-Time1);
Now I need a calculation that takes 1 second. I also need to simulate 2, 3 and 4 seconds.
What code should I put in line 2?
If you really meant binding a job to the CPU for 1 second, I don't think it is possible with pure Java alone. The reason is that the OS will still remove the process from the CPU, schedule it again, and so on, a number of times within 1 second. So you would need to make a special kind of request to the OS that affects its process scheduling algorithm. But this is not how we want our applications to consume the CPU: we want the CPU to honor interrupts and so on. So maybe your intention is not clearly stated in the question, or you might be trying to test something special and uncommon.
If you are just simulating something, maybe you could just use a sleep() call as suggested in the comments. It would not actually consume the CPU for 1 second, but it allows you to assume so for the purposes of the simulation.
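If the simulation does need the thread to stay busy rather than sleep, one common alternative is a busy loop that spins until a deadline. The sketch below is an assumption about what the question is after, with hypothetical names; it measures wall-clock time, so as noted above the OS may still preempt the thread and the CPU time actually consumed can be less than the requested duration.

```java
public class BusyWork {
    // Spins doing arithmetic until about `millis` milliseconds of wall-clock
    // time have passed. Returns the iteration count.
    static long burn(long millis) {
        long start = System.nanoTime();
        long durationNanos = millis * 1_000_000L;
        long iterations = 0;
        double sink = 0;
        while (System.nanoTime() - start < durationNanos) {
            sink += Math.sqrt(iterations); // some floating-point work per iteration
            iterations++;
        }
        // Use sink so the loop body has an observable effect.
        if (sink < 0) {
            throw new IllegalStateException("unreachable");
        }
        return iterations;
    }

    public static void main(String[] args) {
        long time1 = System.currentTimeMillis();
        burn(1000); // use 2000, 3000 or 4000 for the longer jobs
        long time2 = System.currentTimeMillis();
        System.out.println(time2 - time1);
    }
}
```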

Scheduling tasks to prevent overlap of processing and bandwidth usage in Java

I am creating a daemon that will run certain scheduled tasks for logging, but I am worried about bottlenecking at certain points.
Effectively I have some logging tasks that I want to execute every 15 minutes and some I only want to execute every 30 minutes and so on up to tasks that only need running once a month. Basically I have a list of checks to make at each time interval. These are put into a queue and processed by a thread pool.
At the moment I see the tasks as running something like this
15
15, 30
15, 30, 60
15, 30, 60, 120
15, 30, 60, 120, 240...
This means that if the daemon starts at 00:00 hours then by 04:00 hours there will be five processes running simultaneously and this is not the end of it. At present this has led the next task scheduled for 15 minutes to run slowly and have access to a restricted amount of bandwidth.
It is not necessary for the tasks to run on the hour, however. So if the 15 minute task runs on the hour, the 30 minute task may start at 5 minutes past the hour so as to minimise overlap. It would even be possible to split the two 30 minute tasks (e.g. 00:00 and 00:30) across the four 15 minute processes to avoid being hit by an 'all at once' type problem, but this really gets my head swimming.
Are there any well known methodologies for managing this type of issue?
You should definitely have a look at Quartz, especially the cron style triggers.
Cheers,
Well, for the long run (as I see from your question), you will have to look at something like Quartz.
Apart from this, I have a couple more concerns and suggestions here:
Use ScheduledExecutorService for managing these threads. Even with ScheduledExecutorService, it looks like you want to run them at separate intervals irrespective of their execution time, so use scheduleAtFixedRate rather than scheduleWithFixedDelay.
Even if you implement anything like this, your logic will fail if your threads start to run on multiple hosts. 2 hosts running hourly threads will effectively run every 30 minutes.
I would prefer to have a centralized management in terms of Database to keep track of last run and all. This along with ExecutorService would be scalable and accurate.
Suppose you have 1000 schedules: create 1000 rows in the DB.
The columns would be something like this:
id: primary key
ScheduleName: any identifier for the daemon to run or the task to do
lastRunTime: the last time it was run
granularity: 15 mins, 30 mins, etc.
You can keep CreationTime and ModificationTime as a best practice.
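As a rough sketch of the scheduleAtFixedRate suggestion combined with the offset idea from the question, each interval can be given its own initial delay so the tasks never start in the same minute. The class name, pool size, 3-minute stagger, and the printed placeholder task are all hypothetical; the periods are the ones listed in the question.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StaggeredSchedule {
    // Start offset for the task at position index in the list of periods.
    static long initialDelayMinutes(int index, long staggerMinutes) {
        return index * staggerMinutes;
    }

    // Schedules one fixed-rate task per period, each shifted by its own offset.
    static ScheduledExecutorService start(long[] periodsMinutes, long staggerMinutes) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        for (int i = 0; i < periodsMinutes.length; i++) {
            long period = periodsMinutes[i];
            pool.scheduleAtFixedRate(
                    () -> System.out.println("running " + period + "-minute checks"), // placeholder task
                    initialDelayMinutes(i, staggerMinutes),
                    period,
                    TimeUnit.MINUTES);
        }
        return pool;
    }

    public static void main(String[] args) {
        // Offsets are 0, 3, 6, 9 and 12 minutes. The pool keeps the JVM alive, as a daemon would.
        start(new long[] {15, 30, 60, 120, 240}, 3);
    }
}
```

Because every period here is a multiple of 15 minutes and the offsets are distinct and below 15, no two tasks are ever due to start in the same minute. Long-running tasks can still overlap in execution, which is where the pool size matters.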

Jmeter - simulating more complex load scenarios?

Been experimenting with Jmeter, and I'd like to know the best way to accomplish:
20 users logging onto an application, over 20 minutes, and performing some actions for another 20 minutes, before logging off over a period of 20 minutes. I.e. have 200 users logging on, and then once ALL of them are logged on, begin 20 minute timer. Once 20 minutes are up, start to log the ones who logged on earliest off.
I realise this MAY or MAY NOT BE a realistic testing scenario, but I'd like to see if it's possible.
At the moment I have a test plan whereby a user logs on, performs some actions, and then logs off. I can't see how I can ramp up and ramp down.
There's an option in Test Plan "Run thread groups consecutively". Set it to checked.
Then add 3 thread groups to your test plan. I'd suggest using Thread Group for first (20 threads, loop count 1, ramp up time 1), Ultimate Thread Group (20 threads starting immediately and holding load for 20min) for second and Thread Group again for third (20 threads, loop count 1, ramp up time 1).
Place appropriate samplers inside each TG - first just logs in, second does actions, third logs off.
That's it. If you have any troubles - let me know.
You'll need several thread groups in JMeter starting off and running at different intervals, in that way you could ensure that the users who start first will end first.
Also see a related question on this.
You can set number of users = 20, ramp-up time = 1200 sec (1 per min), and the difference between test start time and test end time = 20 min to achieve that.
I think I had a similar problem in the past.
Here is what I did:
First, set your thread group to have 20 threads with a ramp-up period of 60 seconds.
After the login, put a "Test Action" (in the sampler menu) with
target = current thread, action = pause, and a duration of 20 minutes (1 200 000 ms), or more if you want to be safe.
After this Test Action, put all your navigation requests.
Once your navigation is done, put another "Test Action" with the same settings as the previous one
(target = current thread, action = pause, duration of 20 minutes (1 200 000 ms)).
Put the logout request after that sampler.
This should cover your case.
Note that the sampler just pauses your thread, so the first thread to start should be the first thread to end.
If you want to scale it to 200, you just need to change your thread group ramp-up period to 6 or 5 seconds.
Hope this helps.
