Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I need to run some calculations on a distributed map. But I cannot decide which approach to take.
My calculations will result in a map data structure, where the results are mapped to their keys. Think of it as a word count example, where the word is the key and the occurrence count is the value.
I have looked into both solutions and, as I understand it, MapReduce fits best in this scenario, but I want to keep things simple, and I also cannot see why this would not be possible with a distributed executor.
Both options are possible. Before we had the generic MapReduce framework, people built solutions like this using the ExecutorService implementation.
At the moment (this will change in the near future) the MapReduce solution doesn't offer a way to write to an IMap directly, so all results are sent to the caller first, and the caller then has to store them.
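To illustrate the executor-based approach for the word count case, here is a minimal sketch. It uses the plain `java.util.concurrent.ExecutorService` on a single JVM as a stand-in for a distributed executor, and the class name `ExecutorWordCount`, the partitioning into lists of lines, and the pool size are all assumptions made for the example. Each task counts the words in its own partition, and the caller merges the partial maps, much like the caller-side storing step described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorWordCount {

    // The task each worker runs: count the words in one partition of the data.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Submits one task per partition, then merges the partial results on the caller side.
    static Map<String, Integer> countAll(List<List<String>> partitions) {
        // Local pool standing in for a distributed executor (assumption of this sketch).
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (List<String> partition : partitions) {
                Callable<Map<String, Integer>> task = () -> countWords(partition);
                futures.add(executor.submit(task));
            }
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> future : futures) {
                future.get().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> partitions = List.of(
                List.of("the quick brown fox"),
                List.of("the lazy dog", "the end"));
        System.out.println(countAll(partitions));
    }
}
```

With a real distributed executor the tasks would run on the members that own the data, but the shape (submit tasks, collect futures, merge on the caller) stays the same.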
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I need a very lightweight, persistent key/value store in Java.
The amount of data is very small and the store should be very simple (a getter and a setter, and everything can operate on strings).
So I am thinking of using a small NoSQL database, or even giving a built-in collection a serializer/deserializer that writes to the filesystem.
But I think NoSQL is overkill, and I hope a persister already exists for such a simple requirement.
What's the best approach here? Any ideas?
You can implement your own thing if it is a simple string key/value store. (Have a look at Java's Properties class too, in case it suits your requirements.)
If your requirements are slightly more complex, have a look at the lightweight embedded databases you can use. Berkeley DB might work for you. There are quite a number of others if you search a bit.
Also think about what you actually need to do with the data. Do you need to query it (so that it needs to be indexed), or do you just want to load it all back into memory? (In that case a simple JSON or YAML text format would also suffice.)
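As a rough sketch of the Properties suggestion: the wrapper below loads a `java.util.Properties` file on construction and rewrites it on every `set`. The class name `PropertiesStore` and the write-on-every-set policy are assumptions for the example, chosen because the data set is described as very small.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class PropertiesStore {
    private final Path file;
    private final Properties props = new Properties();

    // Loads existing entries if the file is already there; otherwise starts empty.
    public PropertiesStore(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Returns the stored value, or null if the key is unknown.
    public String get(String key) {
        return props.getProperty(key);
    }

    // Updates the in-memory value and rewrites the whole file.
    public void set(String key, String value) {
        props.setProperty(key, value);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "key/value store");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Usage would look like `new PropertiesStore(Paths.get("settings.properties")).set("name", "value")`. Rewriting the file on each change is simple but only sensible for small amounts of data.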
Most Map<String,String> implementations can be serialized. For example, look at https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html
There you will find Serializable among the implemented interfaces. The information under that entry should help you solve the problem yourself.
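A minimal sketch of that idea, assuming standard Java object serialization is acceptable for the use case: the hypothetical helper `MapFileStore` writes a `HashMap<String, String>` to a file with `ObjectOutputStream` and reads it back with `ObjectInputStream`.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

public class MapFileStore {

    // Writes the whole map to the given file, replacing any previous content.
    static void save(HashMap<String, String> map, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(map);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads back a map previously written by save().
    @SuppressWarnings("unchecked")
    static HashMap<String, String> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting file is binary and not human-readable, which is the main trade-off compared with a text format.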
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I want to know if there is any efficient method to run a math expression from a string in Java, given some example inputs and results of that function.
Starting from simple linear functions such as a*x+b, up to more complex ones.
Or is there any good source I can start reading?
I take your task as: take observed input-output pairs and learn some representation that is able to apply the same transformation to new inputs.
(Some) neural networks can learn an approximating function (see the universal approximation theorem), and there are probably other approaches, but there is something important to remark:
Without assumptions about your function (e.g. smoothness), there can't be an algorithm that achieves what you want to do! Without assumptions there are infinitely many approximating functions that are all equally good on your examples but behave arbitrarily differently on new data!
(I'm also ignoring special cases such as random data or cryptographic random generators, where this mapping can't be learned either: the former in theory, the latter at least in practice.)
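For the simplest case mentioned in the question, a*x+b, the assumption is that the function really is linear. Under that assumption the two coefficients can be recovered from the examples with ordinary least squares. The sketch below is one way to do that; the class name `LinearFit` is hypothetical, and this is a swapped-in classical technique, not a neural network.

```java
public class LinearFit {

    // Returns {a, b} minimising the squared error of a*x + b over the given examples.
    static double[] fit(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs");
        }
        int n = xs.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += xs[i];
            sumY += ys[i];
            sumXY += xs[i] * ys[i];
            sumXX += xs[i] * xs[i];
        }
        double denom = n * sumXX - sumX * sumX;
        if (denom == 0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double a = (n * sumXY - sumX * sumY) / denom;
        double b = (sumY - a * sumX) / n;
        return new double[] {a, b};
    }

    // Applies the learned function to a new input.
    static double predict(double[] coefficients, double x) {
        return coefficients[0] * x + coefficients[1];
    }
}
```

If the true function is not linear, the fit still returns some line, which is exactly the point made above: the result is only as good as the assumption behind it.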
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I'm new to using a database, specifically MySQL. I'm creating a web application for class in which you can look up the name of a book and it'll display the summary of the book. My question is: should I send a query to the database on initialization that collects all of the books' data and put it into a HashMap inside a manager class for lookup, or should I run a query each time to look up a specific book's information?
It depends on the data transport time, I would say. If your average query time multiplied by the number of requests is less than the time a script needs to put everything into a HashMap, use queries. Otherwise, use a script that collects everything and puts it into a HashMap.
But if you have thousands of rows, you should use queries, because otherwise you will use too much RAM.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a requirement to write a small utility to test APIs (of course there are existing tools, but it has been decided to write one). I need to bombard the API with the same call from, say, 100 threads, around 100,000 times in total.
I am using 'PoolingHttpClientConnectionManager' to make the calls, following something like the example at the link below:
https://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html
My question is:
(1) How can I run the above code for 100,000 iterations? Using that many threads is obviously a bad idea. I initially thought of using ExecutorService to control the thread count and the number of jobs submitted, but it felt redundant.
(2) I read about 'setMaxTotal' (max connections) and 'setDefaultMaxPerRoute' (concurrent connections), but I don't think they will help achieve (1), though I will obviously need to increase the values.
Please advise. Thanks in advance.
You could use a thread pool and submit the worker function the required number of times. You could then even vary the number of worker threads executing the function to simulate different load situations.
Thread pool tutorial:
https://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
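The thread pool idea could be sketched roughly as follows. The class `LoadRunner` and its `run` method are hypothetical names for this example, and the actual HTTP call is passed in as a `Runnable` so that the sketch does not assume anything about the HttpClient setup. The thread count and the iteration count are independent: 100 workers can work through 100,000 queued jobs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadRunner {

    // Runs `call` `iterations` times on a pool of `threads` workers.
    // Returns how many calls completed without throwing.
    static int run(int threads, int iterations, Runnable call) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger succeeded = new AtomicInteger();
        for (int i = 0; i < iterations; i++) {
            pool.execute(() -> {
                try {
                    call.run();
                    succeeded.incrementAndGet();
                } catch (RuntimeException e) {
                    // A failed call is simply not counted as a success.
                }
            });
        }
        pool.shutdown(); // accept no new jobs, but finish the queued ones
        try {
            // The one-hour cap is an arbitrary choice for this sketch.
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return succeeded.get();
    }
}
```

A call such as `LoadRunner.run(100, 100_000, () -> callApi())` would match the numbers in the question, where `callApi()` stands for whatever method performs the request. Changing the first argument simulates different load levels, as suggested above.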
Why don't you use JMeter for such performance/load testing?
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am putting many images on HDFS. However, each one is taking a 64 MB block there. Since the number of images is very high, I want to put all the image data into one big file, which will then be fed to the mapper so that it can be processed faster. Which InputFormat can I use? Or do I need to use the SequenceFile concept? I am not sure how to proceed further; could someone please suggest a better way to deal with this?
Just throw them all in a Zip.
Really, though, you would be better off using a database (for example MongoDB) and storing them all in there.
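For the zip suggestion, a minimal sketch using only `java.util.zip` from the JDK could look like this. The class name `ImageZipper` is hypothetical, the files are assumed to be on the local filesystem before upload, and each entry is named after the file name alone, so two inputs with the same file name would collide.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ImageZipper {

    // Packs the given files into a single zip archive, one entry per file.
    static void zip(List<Path> files, Path archive) {
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(archive))) {
            for (Path file : files) {
                // Entry name is the bare file name; duplicates would raise a ZipException.
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting single archive can then be uploaded in place of the many small files.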