I have code in which I am given a large JSON string (could be anywhere from 50MB to 250MB) that is an array of JSON objects to be parsed and sanitized then serialized to a file. Everything was going fine with 50MB JSON strings but when the string gets over a hundred or so MB my app crashes with OutOfMemoryError. I know I can increase the size of the heap but would like to avoid doing so if possible. I have included some thoughts I have been having recently. I tried moving try blocks around a little bit to no avail.
1) I suspect there is some way to do this with streams but I have no idea how to stream the result String (which is a json array string of json objects) one JSON object at a time.
2) Since result is a Java string, it is immutable. How can we consume this string and get it out of memory ASAP?
3) Would cleanedResult be better to instantiate a new object each time rather than just assign the same object something different each time?
4) At the end of the for loop shouldn't there only be roughly 2x memory used as before the loop as now json stringbuilder variable contains the same memory as the result string which should be the two largest variables in memory?
I have included the code below.
String result = getLargeJSONString(...); // function that gives me a large JSON string which is an array of JSON objects
StringBuilder json = new StringBuilder(); // to hold final JSON values to write to file
// try to parse said large JSON String
JSONArray results = new JSONArray();
try {
results = new JSONArray(result);
} catch (JSONException j) {
j.printStackTrace();
}
// do json sanitation on each object and then append to stringbuilder
// note the final result should be a string with a JSON object on each newline
JSONObject cleanedResult = new JSONObject();
for (int i = 0; i < results.length(); i++) {
try {
cleanedResult = JSONSanitizer.sanitize((JSONObject) results.get(i));
} catch (JSONException j) {
cleanedResult = new JSONObject();
}
json.append(cleanedResult.toString());
json.append('\n');
}
// write built string to file
try {
Files.write(Paths.get("../file.json"), json.toString().getBytes());
} catch (IOException i) {
System.out.println(i);
}
Of course, you should prefer streaming over contiguous memory allocation (String, StringBuilder, arrays and so on) when processing large amounts of data. So your best option is to use a streaming JSON parser/serializer.
However, you should first try to optimize your code through several easy-gain fixes:
One: If you really need to hold the whole result in memory before writing it to a file, pre-size the StringBuilder to its estimated final size, so that its internal buffer does not have to be reallocated and copied repeatedly as it grows. For example:
StringBuilder json = new StringBuilder(result.length());
Better still, allow some extra room for the newline characters, for example by oversizing by 5%:
StringBuilder json = new StringBuilder((int)(1.05d*result.length()));
Two: If you just need to write the result out to a file, do not even store it into a StringBuilder:
String result = getLargeJSONString(...);
JSONArray results = new JSONArray(result);
try (Writer output = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outputFile), StandardCharsets.UTF_8))) {
for (int i = 0; i < results.length(); i++) {
JSONObject cleanedResult = JSONSanitizer.sanitize((JSONObject) results.get(i));
output.write(cleanedResult.toString());
output.write('\n');
}
}
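To go one step further and avoid building the full JSONArray, the input can be consumed one object at a time. The following is only a rough sketch using the standard library, not a replacement for a real streaming parser: it assumes the input is an array whose elements are all JSON objects, and it tracks brace depth and string state to find where each object ends. The class name JsonArrayLineStreamer and the sanitizer parameter are made up for illustration (the sanitizer stands in for JSONSanitizer.sanitize and works on the object's text).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.UnaryOperator;

/** Copies each top-level object of a JSON array to its own output line. */
final class JsonArrayLineStreamer {

    /**
     * Reads a JSON array of objects from {@code in} and writes one object per
     * line to {@code out}. Only the object currently being read is buffered.
     * {@code sanitizer} is a hypothetical per-object hook that receives and
     * returns the JSON text of one object. Returns the number of objects written.
     */
    static int stream(Reader in, Writer out, UnaryOperator<String> sanitizer)
            throws IOException {
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting depth of { } inside the array
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash in a string
        int count = 0;
        int c;
        while ((c = in.read()) != -1) {
            char ch = (char) c;
            if (depth == 0 && ch != '{') {
                continue; // skip '[', ']', commas and whitespace between objects
            }
            current.append(ch);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (ch == '\\') {
                    escaped = true;
                } else if (ch == '"') {
                    inString = false;
                }
            } else if (ch == '"') {
                inString = true;
            } else if (ch == '{') {
                depth++;
            } else if (ch == '}') {
                depth--;
                if (depth == 0) { // a complete top-level object has been read
                    out.write(sanitizer.apply(current.toString()));
                    out.write('\n');
                    current.setLength(0);
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"id\":1,\"note\":\"a } inside a string\"}, {\"id\":2}]";
        StringWriter out = new StringWriter();
        int written = stream(new StringReader(json), out, UnaryOperator.identity());
        System.out.print(out); // one object per line
        System.out.println(written + " objects written");
    }
}
```

This only pays off if the data can be read from its source through a Reader (ideally a BufferedReader) instead of first being materialized as one huge String. In real code, a streaming parser from a JSON library would be the more robust choice.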
Related
I am trying to store multiple JsonObjects in an ArrayList so I can parse them and display them in a TableView at a later stage.
For some odd reason I can't add objects to the list.
I am using javax.json.
Here is my try statement:
try {
JsonReader jsonReader = Json.createReader(new StringReader(test));
JsonObject obj = jsonReader.readObject();
jsonReader.close();
key = obj.getString("key"); // this works with no issue
ArrayList<JsonObject> jsonList = new ArrayList<>();
jsonList.add(obj); // everything hangs with no errors when I try to do this
}
The debugger shows jsonList with size 0 and obj with size 3 at this line:
jsonList.add(obj);
If I comment out these lines, everything works as expected:
ArrayList<JsonObject> jsonList = new ArrayList<>();
jsonList.add(obj);
When they are not commented out, I end up in a "catch final throwable" handler ("be sure to set the state after the cause of failure").
I have the following type of JSON array (I actually receive it as a string, so I'm trying to convert it to a JSON array):
[{"Message":{"AccountId":"0","CreationDate":"02-DEC-16","Sbu":null,"ProfileId":"28261723","messageSeqId":69},"Offset":6},
{"Message":{"AccountId":"0","CreationDate":"02-DEC-16","Sbu":null,"ProfileId":"28261271","messageSeqId":76},"Offset":7},
{"Message":{"AccountId":"0","CreationDate":"06-DEC-16","Sbu":null,"ProfileId":"28261871","messageSeqId":99},"Offset":8},
{"Message":{"AccountId":"0","CreationDate":"06-DEC-16","Sbu":null,"ProfileId":"28261921","messageSeqId":101},"Offset":9},
{"Message":{"AccountId":"0","CreationDate":"07-DEC-16","Sbu":null,"ProfileId":"28260905","messageSeqId":105},"Offset":10}]
Sometimes parsing this JSON array fails because one of the JSON objects fails to parse (I'm using JSON.simple for the JSON parsing). Is there a way to identify the erroneous JSON object?
Here is the relevant code (ResponseJson is the string above that I want to convert to a JSON array):
JSONParser jsonParser = new JSONParser();
try{
JSONArray jsonArray = (JSONArray) jsonParser.parse(ResponseJson);
int jsonArrayLength = jsonArray.size();
System.out.println("jsonArray length: " + jsonArrayLength);
if (jsonArrayLength > 0) {
subscribeMessageEvent(topic,qStart,jsonArrayLength,jsonArray);
}
}catch (Exception e){
e.printStackTrace();
}
No, you can't identify which JSON Object is not properly formed with your current implementation.
Anyways, if you're receiving your input as a String, you could split it into the different messages and then parse them separately. That way you're in control and you can decide what to do with them individually.
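As a rough illustration of that splitting idea, here is a sketch that uses only the standard library. It assumes every element of the array is an object or a nested array, and that the damage is inside an element (a bad value, for instance) and does not unbalance the braces. The names JsonArrayElementChecker and ElementParser are made up for this example; the parser is passed in so that any JSON library can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a JSON array into its elements so each one can be parsed on its own. */
final class JsonArrayElementChecker {

    /** Hypothetical hook: wraps whatever JSON library does the real parsing. */
    interface ElementParser {
        void parse(String text) throws Exception;
    }

    /** Returns the raw text of each top-level object or array element. */
    static List<String> splitTopLevel(String arrayText) {
        List<String> elements = new ArrayList<>();
        int depth = 0;            // 1 = inside the outer array, 2+ = inside an element
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash in a string
        int start = -1;           // index where the current element began
        for (int i = 0; i < arrayText.length(); i++) {
            char ch = arrayText.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (ch == '\\') {
                    escaped = true;
                } else if (ch == '"') {
                    inString = false;
                }
            } else if (ch == '"') {
                inString = true;
            } else if (ch == '{' || ch == '[') {
                depth++;
                if (depth == 2) {
                    start = i; // an element of the outer array starts here
                }
            } else if (ch == '}' || ch == ']') {
                depth--;
                if (depth == 1) {
                    elements.add(arrayText.substring(start, i + 1));
                }
            }
        }
        return elements;
    }

    /** Returns the indices of the elements that the given parser rejects. */
    static List<Integer> findBadElements(String arrayText, ElementParser parser) {
        List<String> elements = splitTopLevel(arrayText);
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < elements.size(); i++) {
            try {
                parser.parse(elements.get(i));
            } catch (Exception e) {
                bad.add(i); // remember which element failed and carry on
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String response = "[{\"Offset\":6},{\"Offset\":nul},{\"Offset\":8}]";
        // Stand-in parser for the demo only: rejects the truncated literal "nul".
        ElementParser demoParser = text -> {
            if (text.contains(":nul}")) {
                throw new IllegalArgumentException("unexpected token in " + text);
            }
        };
        System.out.println("Bad element indices: " + findBadElements(response, demoParser));
    }
}
```

With JSON.simple, the parser argument could be something like text -> new JSONParser().parse(text). It may also be worth catching JSON.simple's ParseException specifically instead of Exception, since it carries the position at which parsing failed.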
I want to read a CSV file, create an object from every row, and then save these objects to a database.
When I read all the lines from my file and store all the objects in an ArrayList, I get a Java heap space error.
I tried saving every record immediately after reading it, but then saving the records with Hibernate's save() method takes a lot of time.
I also tried checking the size of my ArrayList and saving the data whenever it reaches a multiple of 100k (the commented-out part of the code).
Question: Is there a way to read the file in parts, or a better way to store the data in Java?
String[] colNames;
String[] values;
String line;
Map<Object1, Object1> newObject1Objects = new HashMap<Object1, Object1>();
Map<Object1, Integer> objIdMap = objDao.createObjIdMap();
StringBuilder raportBuilder = new StringBuilder();
Long lineCounter = 1L;
BufferedReader reader = new BufferedReader(new InputStreamReader(
new FileInputStream(filename), "UTF-8"));
colNames = reader.readLine().split(";");
int columnLength = colNames.length;
while ((line = reader.readLine()) != null) {
lineCounter++;
line = line.replace("\"", "").replace("=", "");
values = line.split(";", columnLength);
// Object1
Object1 object1 = createObject1Object(values);
if (objIdMap.containsKey(object1)) {
object1.setObjId(objIdMap.get(object1));
} else if (newObject1Objects.containsKey(object1)) {
object1 = newObject1Objects.get(object1);
} else {
newObject1Objects.put(object1, object1);
}
// ==============================================
// Object2
Object2 object2 = createObject2Object(values, object1,
lineCounter, raportBuilder);
listOfObject2.add(object2);
/*
logger.error("listOfObject2.size():"+listOfObject2.size());
if(listOfObject2.size() % 100000 == 0){
object2Dao.performImportOperation(listOfObject2);
listOfObject2.clear();
}
*/
}
object2Dao.performImportOperation(listOfObject2);
Increasing the maximum heap size won't help if you want to process really large files. Batching is your friend.
Hibernate doesn’t implicitly employ JDBC batching and each INSERT and UPDATE statement is executed separately. Read "How do you enable batch inserts in hibernate?" to get information on how to enable it.
Pay attention to IDENTITY generators, as they disable JDBC batching of inserts.
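On the reading side, batching could look roughly like the sketch below: lines are collected into fixed-size chunks and each chunk is handed to a callback, so that only one chunk is held in memory at a time. The class name BatchedLineImporter and the batchHandler parameter are made up for this example. In the question's code, the handler would build the objects for the chunk, save them, and then flush and clear the Hibernate session before the next chunk arrives.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Feeds the lines of a file to a handler in fixed-size chunks. */
final class BatchedLineImporter {

    /**
     * Reads lines from {@code reader} and passes them to {@code batchHandler}
     * in lists of at most {@code batchSize} lines. Returns the number of lines read.
     */
    static long importInBatches(BufferedReader reader, int batchSize,
                                Consumer<List<String>> batchHandler) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        long total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            batch.add(line);
            total++;
            if (batch.size() == batchSize) {
                // Hypothetical place to save the chunk, then call
                // session.flush() and session.clear() in Hibernate code.
                batchHandler.accept(batch);
                batch = new ArrayList<>(batchSize); // drop the reference to the old chunk
            }
        }
        if (!batch.isEmpty()) {
            batchHandler.accept(batch); // last, partially filled chunk
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("r1\nr2\nr3\nr4\nr5"));
        long total = importInBatches(reader, 2,
                batch -> System.out.println("saving " + batch.size() + " rows"));
        System.out.println(total + " rows read");
    }
}
```

The chunk size would normally be chosen to match the hibernate.jdbc.batch_size setting. Note that lookup maps that keep growing across the whole file, such as newObject1Objects in the question, are not helped by this and may still need attention.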
I am trying to parse a JSON schema and I need to get all the image links from the JSONArray and store it in a java array. The JSONArray looks like this:
How can I get only the number of strings in the image array? In this case, for example, it should be 4. I know how to get the full length of the array, but how can I count only the strings?
UPDATE:
I am simply parsing it using the standard JSON parser for android. The length of JSONArray can be calculated using:
JSONArray imageArray = hist.getJSONArray("image");
int len = imageArray.length();
len will be equal to 9 in this case.
I'm not sure if there's a better way (there probably is), but here's one option:
According to the Android docs, getJSONObject will throw a JSONException if the element at the specified index is not a JSON object. So, you can try to get the element at each index using getJSONObject. If it throws a JSONException, then you know it's not a JSON object. You can then try and get the element using getString. Here's a crude example:
JSONArray imageArray = hist.getJSONArray("image");
int len = imageArray.length();
ArrayList<String> imageLinks = new ArrayList<String>();
for (int i = 0; i < len; i++) {
boolean isObject = false;
try {
JSONObject obj = imageArray.getJSONObject(i);
// obj is a JSON object
isObject = true;
} catch (JSONException ex) {
// ignore
}
if (!isObject ) {
// Element at index i was not a JSON object, might be a String
try {
String strVal = imageArray.getString(i);
imageLinks.add(strVal);
} catch (JSONException ex) {
// ignore
}
}
}
int numImageLinks = imageLinks.size();
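A possible alternative that avoids using exceptions for control flow is to fetch each element as a plain Object (JSONArray.get(int) and JSONArray.opt(int) both return Object) and test it with instanceof. The sketch below uses an ordinary List as a stand-in for the JSONArray so that it runs without the Android library; the class name StringElementFilter is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

/** Keeps only the String elements of a mixed list. */
final class StringElementFilter {

    /** Returns the elements of {@code elements} that are Strings, in order. */
    static List<String> stringsOnly(List<?> elements) {
        List<String> strings = new ArrayList<>();
        for (Object element : elements) {
            if (element instanceof String) { // objects, numbers and nulls are skipped
                strings.add((String) element);
            }
        }
        return strings;
    }

    public static void main(String[] args) {
        // Stand-in for the mixed "image" array: links mixed with non-string entries.
        List<Object> mixed = Arrays.<Object>asList(
                "http://example.com/a.png", new HashMap<String, String>(),
                "http://example.com/b.png", 42);
        List<String> imageLinks = stringsOnly(mixed);
        System.out.println(imageLinks.size() + " image links");
    }
}
```

With the real array, the loop body would read the element with imageArray.opt(i) and apply the same instanceof check.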
So I understand that you can convert JSON strings to strings and handle JSON objects in general through the org.json bundle in Android, but here's my current situation:
I need to take a JSON string from a certain URL (I'm already able to successfully do this) and make it into an array. Well actually two arrays. The framework I'm using runs on Python and returns a dict that contains lists (arrays in Python). However, it is displayed as a JSON object. Here's an example of what I would be getting from the URL to my Java code:
{"keywords": ["middle east", "syria"], "link": [["middle east", "http://www.google.com/#q=middle east"], ["syria", "http://www.google.com/#q=syria"]]}
As you can see, it's a dict of two indices. The first one is "keywords" that has a list and the second one is "link" that contains a list of lists. The two lists (the first one and the second multidimensional one) are what I want to be able to manipulate in Java. I'm aware that you can use JSONArray, but the problem is that the arrays are stored in a Python dict, and my Android application does not properly make a JSONArray. Do you guys have any ideas of how I can handle this? I'm pretty lost. Here is my code for getting the actual JSON string (the URL in the code is not accessible to everyone, it's being served by paste on my machine):
static public void refreshFeed(){
try{
String url = "http://192.17.178.116:8080/getkw?nextline="+line;
line++;
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet(url);
HttpResponse response;
response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
InputStream in = entity.getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder sb = new StringBuilder();
String input = null;
try {
while ((input = reader.readLine()) != null) {
sb.append(input + "\n");
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
}
String enter = sb.toString();
feedEntry add = new feedEntry(enter);
addNewEntry(add);
} catch(MalformedURLException e){
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
Please also note that this is without the JSONString being made into a JSONArray. It simply translates the JSON object into a regular String that is added to a "feedEntry" object.
Mapping a Python dict to a JSON array is ... more work than you'd expect. It would be better either to treat it as a JSON object or to start with a list, which can be mapped straight to a JSON array. Info on serializing between Python and Java.
Here's a code example where I create a list structure in Python, and then grab it in an Android application:
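#!/usr/bin/env python3
import json

# Emit the CGI header followed by a blank line; the body is JSON.
print("Content-type: application/json\n")

mystuff = list()
mystuff.append( ('1', 'b', 'c', 'd') )
mystuff.append( ('2', 'f', 'g', 'h') )
stufflist = list()
# Turn each tuple into a dict so that it serializes to a JSON object.
for s in mystuff:
    d = {}
    d['a'] = s[0]
    d['b'] = s[1]
    d['c'] = s[2]
    d['d'] = s[3]
    stufflist.append(d)
# json.dumps serializes the list of dicts to a JSON array string.
print(json.dumps(stufflist))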
And in Android:
// Convert the string (sb is a StringBuilder holding the HTTP response) to a JSON array.
JSONArray jArray = new JSONArray(sb.toString());
for(int i = 0; i < jArray.length(); i++){
// Get each item as a JSON object.
JSONObject json_data = jArray.getJSONObject(i);
// Get data from object ...
int a = json_data.getInt("a");
String b = json_data.getString("b");
String c = json_data.getString("c");
String d = json_data.getString("d");
// Do whatever with the data ...
}