I have a .JSON file that has content as:
"Name":"something"
"A":10
"B": 12
"Name":"something else"
"A":5
"B":9
....
I want to read this file and find which of these objects has the largest number of parts (the sum A+B). What would be a good approach? I thought about reading the JSON data into a Map, going through each object in the Map to compute its A+B total, and storing the results in a linked list, something like LinkedList<String,Integer> (where the String would be the name and the Integer would be the sum A+B). After that I would sort the linked list and find whichever entry has the largest A+B. Would this be a good solution?
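A minimal sketch of the approach described above, assuming the JSON objects have already been parsed into plain Java objects (the parsing step is left out because the question does not name a JSON library). The Part record and the findMax method are hypothetical names used only for illustration. Finding a maximum does not require sorting, so the sketch keeps a running best in a single pass.

```java
import java.util.List;

class MostParts {
    // Hypothetical type for one parsed JSON object with its Name, A and B fields.
    record Part(String name, int a, int b) {
        int total() {
            return a + b;
        }
    }

    // Returns the entry with the largest A+B, or null if the list is empty.
    // On a tie, the first entry encountered wins.
    static Part findMax(List<Part> parts) {
        Part best = null;
        for (Part p : parts) {
            if (best == null || p.total() > best.total()) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Part> parts = List.of(
                new Part("something", 10, 12),
                new Part("something else", 5, 9));
        Part best = findMax(parts);
        System.out.println(best.name() + " has " + best.total() + " parts");
    }
}
```

If the full ranking is needed rather than just the top entry, the same list could be sorted by total() instead.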
Related
I'm completely new to programming and to java in particular and I am trying to determine which data structure to use for a specific situation. Since I'm not familiar with Data Structures in general, I have no idea what structure does what and what the limitations are with each.
So I have a CSV file with a bunch of items in it, let's say Characters and matching Numbers. So my list looks like this:
A,1,B,2,B,3,C,4,D,5,E,6,E,7,E,8,E,9,F,10......etc.
I need to be able to read this in, and then:
1)display just the letters or just the numbers sorted alphabetically or numerically
2)search to see if an element is contained in either list.
3)search to see if an element pair (for example A - 1 or B-10) is contained in the matching list.
Think of it as an excel spreadsheet with two columns. I need to be able to sort by either column while maintaining the relationship and I need to be able to do an IF column A = some variable AND the corresponding column B contains some other variable, then do such and such.
I need to also be able to insert a pair into the original list at any location. So insert A into list 1 and insert 10 into list 2 but make sure they retain the relationship A-10.
I hope this makes sense and thank you for any help! I am working on purchasing a Data Structures in Java book to work through and trying to sign up for the class at our local college, but it's only offered every spring...
You could use two sorted Maps such as TreeMap.
One would map Characters to numbers (Map<Character,Number> or something similar). The other would perform the reverse mapping (Map<Number, Character>)
Let's look at your requirements:
1)display just the letters or just the numbers sorted alphabetically or numerically
Just iterate over one of the maps. The iteration will be ordered.
2)search to see if an element is contained in either list.
Just check the corresponding map. Looking for a number? Check the Map whose keys are numbers.
3)search to see if an element pair (for example A - 1 or B-10) is contained in the matching list.
Just get() the value for A from the Character map, and check whether that value is 10. If so, then A-10 exists. If there's no value, or the value is not 10, then A-10 doesn't exist.
When adding or removing elements you'd need to take care to modify both maps to keep them in sync.
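A rough sketch of the two-map idea, assuming each letter and each number occurs at most once. The sample data in the question repeats letters (B,2,B,3), which a one-to-one map cannot hold; that case would need a set of values per key. The class name PairIndex and its methods are made up for illustration.

```java
import java.util.Set;
import java.util.TreeMap;

class PairIndex {
    private final TreeMap<Character, Integer> byLetter = new TreeMap<>();
    private final TreeMap<Integer, Character> byNumber = new TreeMap<>();

    // Adds a pair to both maps so they stay in sync.
    // Assumption: pairs are one-to-one, so a repeated letter or number replaces the old pair.
    void put(char letter, int number) {
        Integer oldNumber = byLetter.put(letter, number);
        if (oldNumber != null) {
            byNumber.remove(oldNumber);
        }
        Character oldLetter = byNumber.put(number, letter);
        if (oldLetter != null && oldLetter != letter) {
            byLetter.remove(oldLetter);
        }
    }

    boolean containsLetter(char letter) {
        return byLetter.containsKey(letter);
    }

    boolean containsNumber(int number) {
        return byNumber.containsKey(number);
    }

    // A pair exists only if the letter is present and maps to exactly this number.
    boolean containsPair(char letter, int number) {
        Integer value = byLetter.get(letter);
        return value != null && value == number;
    }

    // TreeMap key sets iterate in sorted order.
    Set<Character> letters() {
        return byLetter.keySet();
    }

    Set<Integer> numbers() {
        return byNumber.keySet();
    }
}
```

Keeping both maps behind a single put method is one way to make sure they cannot drift out of sync.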
I have a huge JSON file (400MB) that I want to sort by TimeStamp. Do you have any idea how I can do this?
I wrote a program with a loop that sorts small files, but my file is too big and the program ends up in an infinite loop.
Use the library big-sorter as below.
Sorter
    .serializer(Serializer.jsonArray())
    .comparator((x, y) ->
        x.get("time").asText().compareTo(y.get("time").asText()))
    .input(new File("input.json"))
    .output(new File("sorted.json"))
    .sort();
With this method I sorted 10 million records in a 440MB file in 54s with max heap set at 64MB (-Xmx64m).
I would create a class T implementing Comparable, where T represents one entry in the JSON file.
I would then load the JSON into a list of T objects using Gson.
Then you can call Collections.sort on your list.
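A small sketch of the Comparable idea, with the Gson loading step left out. The Entry class and its field names are assumptions, not taken from the actual file, and the timestamps are assumed to be ISO-8601 strings in one format and time zone, so that string order matches time order. Note that this holds the whole list in memory, so a 400MB file may need a larger heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TimestampSort {
    // Hypothetical entry type standing in for "T" from the answer above.
    static class Entry implements Comparable<Entry> {
        final String timeStamp; // assumed ISO-8601, e.g. "2012-10-14T12:17:48.077"
        final String payload;   // placeholder for the rest of the JSON entry

        Entry(String timeStamp, String payload) {
            this.timeStamp = timeStamp;
            this.payload = payload;
        }

        @Override
        public int compareTo(Entry other) {
            return timeStamp.compareTo(other.timeStamp);
        }
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<Entry> sorted(List<Entry> entries) {
        List<Entry> copy = new ArrayList<>(entries);
        Collections.sort(copy);
        return copy;
    }
}
```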
I want to sort some number+string combination but the sorting will be based on the number from that combination. Can you suggest an optimal solution?
Say my strings are:
12 Masdf
4 Oasd
44 Twer
and so on. The sorting will be based on the numbers like 12, 4, 44 and after the sorting I have to show the full alphanumeric strings.
As the program will run on thousands of data I don't want to split the string and compare the number on each iteration. My plan is to extract the numbers and take those in an array and then sort the array. After sorting done, I want to put back the numbers with associated strings and keep those in a string array to show.
It should be done in C++. Algorithms should be applied - Insertion sort, Quick sort, Merge sort, etc.
Create a class to store the full string and the number. Make the class Comparable. Convert your list of strings to a list of that class. Sort the list using whichever sort method is relevant. Iterate over the list and print the string fields.
Sorry, that was an answer for Java, since you tagged it Java. Replace Comparable with whatever is appropriate for C++.
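In Java, the idea above might look roughly like this. It assumes every line starts with an integer followed by a single space; the class name NumberedLine is made up for the example. The number is parsed once, in the constructor, so comparisons during the sort do not split the string again.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NumberedLine implements Comparable<NumberedLine> {
    final int number;  // parsed once, up front
    final String full; // the original string, kept for display

    NumberedLine(String full) {
        this.full = full;
        // Assumption: the text before the first space is a valid integer.
        this.number = Integer.parseInt(full.substring(0, full.indexOf(' ')));
    }

    @Override
    public int compareTo(NumberedLine other) {
        return Integer.compare(number, other.number);
    }

    // Wraps each string, sorts by the parsed number, and returns the full strings in order.
    static List<String> sortByNumber(List<String> lines) {
        List<NumberedLine> wrapped = new ArrayList<>();
        for (String line : lines) {
            wrapped.add(new NumberedLine(line));
        }
        Collections.sort(wrapped);
        List<String> result = new ArrayList<>();
        for (NumberedLine n : wrapped) {
            result.add(n.full);
        }
        return result;
    }
}
```

Collections.sort is used here only for brevity; any of the sorts named in the question could be applied to the wrapped list instead.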
I am going to assume these two parts are in separate variables and are not together as one string (if they were you could just store them in a list).
First consider a Map. Each 'bucket' of the map is keyed by a number, and each bucket holds a list of strings. (Note that this could also be solved with an array, especially if the integer part is always below some fixed value.) The Java equivalent would look like:
Map<Integer, List<String>> map = new HashMap<>();
To search this custom collection, first look up the integer part of the value in the map, which returns a list. Every item in that list has the same starting number, so we then search the list for the string part of the value (I am assuming the list is sorted, using whatever sort you want, e.g. selection sort or quicksort).
The advantage of this approach is that if the number is not found in the HashMap, you know immediately that there is no string part for it.
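A short sketch of the bucket idea, assuming the number and the string arrive as separate values. The class name NumberBuckets and its methods are hypothetical. For simplicity the bucket is searched linearly rather than kept sorted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NumberBuckets {
    // One list of strings per leading number.
    private final Map<Integer, List<String>> buckets = new HashMap<>();

    // Creates the bucket on first use, then appends the string to it.
    void add(int number, String text) {
        buckets.computeIfAbsent(number, k -> new ArrayList<>()).add(text);
    }

    // A missing bucket means no string exists for that number, so the list is never scanned.
    boolean contains(int number, String text) {
        List<String> bucket = buckets.get(number);
        return bucket != null && bucket.contains(text);
    }
}
```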
I want to implement a method to merge two huge files (each row of each file contains a JsonObject) on a common value.
The first file is like this:
{
"Age": "34",
"EmailHash": "2dfa19bf5dc5826c1fe54c2c049a1ff1",
"Id": 3,
...
}
and the second:
{
"LastActivityDate": "2012-10-14T12:17:48.077",
"ParentId": 34,
"OwnerUserId": 3,
}
I have implemented a method that reads the first file and takes the first JsonObject. It then takes the Id, and if the second file has a row with the same Id (OwnerUserId == Id), it appends the second JsonObject to the first file; otherwise it writes the row to another file that contains only the rows that don't match the first file. This way, if the first JsonObject has 10 matches, the second row of the first file doesn't search those rows again.
The method works fine, but it is too slow.
I have already tried loading the data into MongoDB and querying the DB, but that is slow too.
Is there another way to process the two files?
What you're doing simply must be damn slow. If you don't have the memory for all the JSON objects, then try storing the data as plain Java objects, as this way you surely need much less.
And there's a simple way needing even less memory and only n passes, where n is the ratio of required memory to available memory.
On the ith pass, consider only objects with id % n == i and ignore all the others. This way the memory consumption is reduced by nearly a factor of n, assuming the ids are nicely distributed modulo n.
If this assumption doesn't hold, use f(id) % n instead, where f is some hash function (feel free to ask if you need it).
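A sketch of the n-pass idea, using in-memory lists in place of the two files to keep it short. With real files, each pass would re-read both files and only the bucket map would stay in memory. The User and Post records, the join method and the output format are all assumptions for illustration. Unmatched rows are simply skipped here, whereas the question writes them to a separate file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PartitionedJoin {
    // Hypothetical row types for the first and second file.
    record User(int id, String age) {}
    record Post(int ownerUserId, String lastActivityDate) {}

    // Joins posts to users in n passes; pass i only holds users with id % n == i.
    static List<String> join(List<User> users, List<Post> posts, int n) {
        List<String> out = new ArrayList<>();
        for (int pass = 0; pass < n; pass++) {
            // Only this pass's share of the users is kept in memory.
            Map<Integer, User> bucket = new HashMap<>();
            for (User u : users) {
                if (Math.floorMod(u.id(), n) == pass) {
                    bucket.put(u.id(), u);
                }
            }
            for (Post p : posts) {
                // Posts belonging to another pass are ignored for now.
                if (Math.floorMod(p.ownerUserId(), n) != pass) {
                    continue;
                }
                User u = bucket.get(p.ownerUserId());
                if (u != null) {
                    out.add(u.id() + "," + u.age() + "," + p.lastActivityDate());
                }
            }
        }
        return out;
    }
}
```

Math.floorMod is used instead of % so that negative ids, if any occur, still land in a bucket between 0 and n-1.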
I solved it using a temporary DB.
I created an index on the key I want to merge on; this way I can query the DB and the response is very fast.
I'm implementing this in Java.
Symbol file    Store data file
1\item1        10\storename1
10\item20      15\storename6
11\item6       15\storename9
15\item14      1\storename250
5\item5        1\storename15
The user will search store names using wildcards like storename?
My job is to search the store names and produce a full string using symbol data. For example:
item20-storename1
item14-storename6
item14-storename9
My approach is:
1. read the store data file line by line
2. if a line contains a matching search string (like storename?), push that line to an intermediate store result file
3. also copy the itemno of the matching storename into an arraylist (like 10,15)
4. when this arraylist size%100==0, remove duplicate item numbers using a hashset, reducing the arraylist size significantly
5. when arraylist size >1000:
   - sort that list using Collections.sort(itemno_arraylist)
   - open the symbol file and start reading line by line
   - for each line, call Collections.binarySearch(itemno_arraylist,itemno)
   - if it matches, push the result to an intermediate symbol result file
6. continue with step 1 until EOF of the store data file
...
After all of this I would combine the two result files (symbol result file & store result file) to present the actual list of strings.
This approach works, but it consumes a lot of CPU time and main memory.
I want to know a better solution with reduced CPU time (currently 2 min) & memory (currently 80MB). There are many collection classes available in Java. Which one would give a more efficient solution for this kind of huge string processing problem?
Any thoughts on this kind of string processing problem, particularly in Java, would be great and helpful.
Note: Both files would be nearly a million lines long.
Replace the two flat files with an embedded database (there are plenty of them; I have used SQLite and Db4O in the past): problem solved.
So you need to replace 10\storename1 with item20-storename1 because the symbol file contains 10\item20. The obvious solution is to load the symbol file into a Map:
String[] tokens = symbolFile.readLine().split("\\\\");
map.put(tokens[0], tokens[1]);
Then read the store file line by line and replace:
String[] tokens = storeFile.readLine().split("\\\\");
output.println(map.get(tokens[0]) + '-' + tokens[1]);
This is the fastest method, though it still uses a lot of memory for the map. You can reduce the memory use by storing the map in a database, but this would increase the time significantly.
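Put together, the map approach could look roughly like the sketch below. It takes the lines of both files as lists to stay self-contained; the class name SymbolJoin and the join method are made up for the example, and the wildcard filtering of store names is left out. Note that split takes a regular expression, so a single literal backslash has to be written as "\\\\" in Java source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SymbolJoin {
    // Builds itemno -> item name from the symbol lines, then rewrites each store line
    // as "item-storename". Store lines whose itemno has no symbol entry are skipped.
    static List<String> join(List<String> symbolLines, List<String> storeLines) {
        Map<String, String> items = new HashMap<>();
        for (String line : symbolLines) {
            // Limit 2 keeps any further backslashes inside the name intact.
            String[] tokens = line.split("\\\\", 2);
            items.put(tokens[0], tokens[1]);
        }
        List<String> out = new ArrayList<>();
        for (String line : storeLines) {
            String[] tokens = line.split("\\\\", 2);
            String item = items.get(tokens[0]);
            if (item != null) {
                out.add(item + "-" + tokens[1]);
            }
        }
        return out;
    }
}
```

With real files, the store file would be streamed line by line through the second loop, so only the symbol map needs to fit in memory.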
If your input data file does not change frequently, then parse the file once and put the data into a List of a custom class, e.g. FileStoreRecord, that maps to a record in the file. Define an equals method on your custom class. Perform all subsequent steps over the List; e.g. to search, you can call the contains method, passing the search string wrapped in a FileStoreRecord object.
If the file changes from time to time, you may want to refresh the List at a certain interval, or keep track of the list creation time and compare it against the file's update timestamp before using it. If they differ, recreate the list. Another way to manage the file check could be to have a Thread continuously polling the file for updates; the moment the file is updated, the thread signals that the list should be refreshed.
Is there any restriction against using a Map?
You can add the Items to a Map and then search it easily.
1 million records means 1M * recordSize of memory, so it should not be a problem.
Map<Integer,Item> itemMap = new HashMap<>();
...
Item item= itemMap.get(store.getItemNo());
But the best solution would be to use a database.