I have a huge JSON file (1 GB) that is basically an array of objects in the following format:
[{"x":"y", "p":"q"}, {"x1":"y1", "p1":"q1"},....]
I want to parse this file so that not all of the data is loaded into memory at once.
Basically, I want to read, for example, the first 1000 objects of the array into memory and process them, then read the next 1000 objects and process them, and so on until all the data has been read.
Is there any JSON library that supports this use case? I currently use Gson. However, it loads all the data into memory when I call gson.fromJson().
Thanks in advance for the help.
It looks like Gson has a streaming API, which is what you want: https://sites.google.com/site/gson/streaming
With Jackson you can use a SAX-like (streaming) approach with a JsonParser object. In your case it would be something like this:
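import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.File;
import java.util.HashMap;
import java.util.Map;

JsonFactory jsonFactory = new JsonFactory();
// try-with-resources closes the parser even if processing fails
try (JsonParser parser = jsonFactory.createParser(new File("/path/to/my/jsonFile"))) {
    // Map where to store your field-value pairs per object
    Map<String, String> fields = new HashMap<String, String>();
    JsonToken token;
    // Stop at the end of the array, or at end of input (nextToken() returns null)
    while ((token = parser.nextToken()) != null && token != JsonToken.END_ARRAY) {
        switch (token) {
            // Starts a new object, clear the map
            case START_OBJECT:
                fields.clear();
                break;
            // For each field-value pair, store it in the map 'fields'
            case FIELD_NAME:
                String field = parser.getCurrentName();
                token = parser.nextToken();
                String value = parser.getValueAsString();
                fields.put(field, value);
                break;
            // Do something with the field-value pairs
            case END_OBJECT:
                doSomethingWithTheObject(fields);
                break;
            default:
                break;
        }
    }
}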
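Whichever streaming parser supplies the objects, the "process 1000 at a time" part of the question can be kept separate from the parsing. The following is only a sketch of that batching step using the JDK alone: BatchProcessor and forEachBatch are hypothetical names, and the Iterator stands in for whatever yields one parsed object at a time (for example, a loop around a streaming parser).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

final class BatchProcessor {

    /**
     * Pulls items from the iterator and hands them to the handler in
     * batches of at most batchSize, so only one batch is held here at a time.
     * Returns the number of batches handed over.
     */
    static <T> int forEachBatch(Iterator<T> source, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                handler.accept(batch);
                batches++;
                // Start a fresh list in case the handler kept a reference to the old one
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) { // last, partially filled batch
            handler.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for a streaming parser: 2500 fake "objects"
        Iterator<Integer> fakeObjects = IntStream.range(0, 2500).boxed().iterator();
        forEachBatch(fakeObjects, 1000,
                batch -> System.out.println("processing " + batch.size() + " objects"));
    }
}
```

With 2500 items and a batch size of 1000, the handler is called three times, with 1000, 1000 and 500 items.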
Background
I have a list of strings (records) that are dynamically created by a class. Each record may have different keys (e.g. favorite_pizza on first, favorite_candy on second).
// Note: These records are dynamically created and not stored
// in this way. This is simply for display purposes.
List<String> records =
Arrays.asList(
"{\"name\":\"Bob\",\"age\":40,\"favorite_pizza\":\"Cheese\"}",
"{\"name\":\"Jill\",\"age\":22,\"favorite_candy\":\"Swedish Fish\"}");
The list of records is then passed to a separate HTTP request class.
public Response addRecords(List<String> records) {
...
}
Inside the HTTP request service, I want to build a JSON request body:
{
"records": [
{
"name": "Bob",
"age": 40,
"favorite_pizza": "Cheese"
},
{
"name": "Jill",
"age": 22,
"favorite_candy": "Swedish Fish"
}
]
}
I'm using org.json.JSONObject to add the records key and create the request body:
JSONObject body = new JSONObject();
// Add the "records" key
body.put("records", records);
// Create the request body
body.toString();
Issues
When I run my junit test in IntelliJ, the request body contains a backslash before each quote:
org.junit.ComparisonFailure:
Expected :"{"records":["{"name":"Bob","age":40,"favorite_pizza":"Cheese"}","{"name":"Jill","age":22,"favorite_candy":"Swedish Fish"}"]}"
Actual :"{"records":["{\"name\":\"Bob\",\"age\":40,\"favorite_pizza\":\"Cheese\"}","{\"name\":\"Jill\",\"age\":22,\"favorite_candy\":\"Swedish Fish\"}"]}"
And when I make the request it fails because the body is not formatted correctly:
{
"records": [
"{\"name\":\"Bob\",\"age\":40,\"favorite_pizza\":\"Cheese\"}",
"{\"name\":\"Jill\",\"age\":22,\"favorite_candy\":\"Swedish Fish\"}"
]
}
Questions
Why is JSONObject including the backslashes before each quote?
How do I remove the backslashes?
You are creating a list of strings, which is not what you want.
You should instead create a list of objects (Maps):
Map<String, Object> m1 = new LinkedHashMap<>();
m1.put("name", "Bob");
m1.put("age", 40);
m1.put("favorite_pizza", "Cheese");
Map<String, Object> m2 = new LinkedHashMap<>();
m2.put("name", "Jill");
m2.put("age", 22);
m2.put("favorite_candy", "Swedish Fish");
List<Map<String, Object>> records = Arrays.asList(m1, m2);
JSONObject body = new JSONObject();
// Add the "records" key
body.put("records", records);
This seems to be a common mistake: serializing strings that are formatted like JSON objects and expecting the same result as passing the objects themselves. JSONObject treats each string as a plain string value, so it escapes the quotes inside it.
UPDATE:
Or, if you already have a list of JSON-serialized objects, parse each string into a JSONObject first:
List<String> recordSource =
Arrays.asList(
"{\"name\":\"Bob\",\"age\":40,\"favorite_pizza\":\"Cheese\"}",
"{\"name\":\"Jill\",\"age\":22,\"favorite_candy\":\"Swedish Fish\"}");
List<JSONObject> records =
recordSource.stream().map(JSONObject::new).collect(Collectors.toList());
JSONObject body = new JSONObject();
// Add the "records" key
body.put("records", records);
System.out.println(body.toString());
If your record strings are already valid JSON, you can do either of the following:
Iterate over them, converting them one at a time into a JSONObject (see here) and then add the result to a JSONArray which you can manipulate if needed.
Create the array entirely by hand since it's just comma separated record strings inside square brackets.
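The second option can be sketched with plain string concatenation. RecordsBody and buildBody are hypothetical names used only for illustration, and the sketch assumes every record is already a valid JSON object, because nothing here validates or re-escapes the input.

```java
import java.util.Arrays;
import java.util.List;

final class RecordsBody {

    /**
     * Wraps already-serialized JSON objects in {"records":[...]} by plain
     * string concatenation. The records are NOT validated or re-escaped,
     * so every element must already be valid JSON.
     */
    static String buildBody(List<String> records) {
        return "{\"records\":[" + String.join(",", records) + "]}";
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "{\"name\":\"Bob\",\"age\":40,\"favorite_pizza\":\"Cheese\"}",
                "{\"name\":\"Jill\",\"age\":22,\"favorite_candy\":\"Swedish Fish\"}");
        System.out.println(buildBody(records));
    }
}
```

This avoids parsing each record, at the cost of sending whatever the strings contain; if a record might be malformed, converting through JSONObject is the safer route.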
After deserializing my string and converting it to JSON using the code below:
JSONObject returnValue = new JSONObject();
String toJson = null;
try
{
Object otherObjectValue = SerializationUtils
.deserialize(myBytesArray);
Gson gson = new Gson();
toJson = gson.toJson(otherObjectValue);
returnValue.put(key, toJson);
}
some part of the JSON still has something like:
{ "key":"ATTRIBUTE_LIST", "value":"{\"attributeContract\":[{\"scope\":\"sso\",\"name\":\"SAML_SUBJECT\",\"description\":\"Click to Edit\",\"required\":true}]}"}
which means everything in:
"{\"attributeContract\":[{\"scope\":\"sso\",\"name\":\"SAML_SUBJECT\",\"description\":\"Click to Edit\",\"required\":true}]}"
is one string instead of being another object with fields. Is there something I can do to sanitize my JSONObject to make it proper JSON?
The key part is OK, which means the string as a whole is valid JSON.
For the value part, the \" escapes show that the value of value had already been serialized to JSON before it was put into the outer object.
So you can "deserialize" the value of value a second time to get an object, or you can ask whoever creates the original JSON to serialize the original object only once.
Because of a project requirement, I have to use the com.fasterxml.jackson.databind library to parse JSON data and cannot use any other JSON library.
I am new to JSON parsing, so not sure if there are better options here?
I would like to know how can I update a string value in an Array node in the JSON file.
Following is a sample JSON. Please note this is not the entire file content, it's a simplified version.
{
"call": "SimpleAnswer",
"environment": "prod",
"question": {
"assertions": [
{
"assertionType": "regex",
"expectedString": "(.*)world cup(.*)"
}
],
"questionVariations": [
{
"questionList": [
"when is the next world cup"
]
}
]
}
}
Following is the code to read JSON into java object.
byte[] jsonData = Files.readAllBytes(Paths.get(PATH_TO_JSON));
JsonNode jsonNodeFromFile = mapper.readValue(jsonData, JsonNode.class);
To update a root-level node value in the JSON file, e.g. environment, I found the following approach in some SO threads.
ObjectNode objectNode = (ObjectNode)jsonNodeFromFile;
objectNode.remove("environment");
objectNode.put("environment", "test");
jsonNodeFromFile = (JsonNode)objectNode;
FileWriter file = new FileWriter(PATH_TO_JSON);
file.write(jsonNodeFromFile.toString());
file.flush();
file.close();
QUESTION 1: Is this the only way to update a value in a JSON file, and is it the best way? I'm concerned about the double casting and the file I/O here.
QUESTION 2: I could not find a way to update the value of a nested array node, e.g. questionList. I want to change the question from "when is the next world cup" to "when is the next soccer world cup".
You can use ObjectMapper to parse that JSON. Parsing and updating JSON is very easy with POJO classes.
Use this link to convert your JSON to Java classes: paste your JSON there and download the generated class structure.
You can then access or update nested JSON fields with the . (dot) operator through the generated getters and setters:
ObjectMapper mapper = new ObjectMapper();
String jsonString="{\"call\":\"SimpleAnswer\",\"environment\":\"prod\",\"question\":{\"assertions\":[{\"assertionType\":\"regex\",\"expectedString\":\"(.*)world cup(.*)\"}],\"questionVariations\":[{\"questionList\":[\"when is the next world cup\"]}]}}";
TestClass sc=mapper.readValue(jsonString,TestClass.class);
// to update environment
sc.setEnvironment("new Environment");
System.out.println(sc);
// to update assertionType on every assertion
Question que = sc.getQuestion();
List<Assertion> assertions = que.getAssertions();
for (Assertion assertion : assertions) {
    assertion.setAssertionType("New Type");
}
In my Android app, I use Gson to save and load an ArrayList of objects in SharedPreferences. Here is my code using Gson.
public static ArrayList<RequestModal> getModalList(Context ctx) {
Gson gson = new Gson();
String json = getSharedPreferences(ctx).getString("ModalList", new Gson().toJson(new ArrayList<>()));
Type type = new TypeToken<ArrayList<RequestModal>>() {}.getType();
return gson.fromJson(json, type);
}
Here, RequestModal is a simple object that holds a few strings and integers.
It works well when the device is online. But when there is no internet connection, execution hangs forever on the line below.
Type type = new TypeToken<ArrayList<RequestModal>>() {}.getType();
How can I solve this? How can I implement this feature, with or without Gson? Any good ideas would be appreciated.
Thank you in advance.
You can implement this without Gson:
public static EpisodeDetails parseEpisodeDetails(String content) {
EpisodeDetails episodeDetails = new EpisodeDetails();
try {
JSONObject jsonObject = new JSONObject(content);
episodeDetails.title = jsonObject.getString("title");
episodeDetails.subTitle = jsonObject.getString("subtitle");
episodeDetails.synopsis = jsonObject.getString("synopsis");
episodeDetails.ends_on = jsonObject.getString("ends_on");
JSONArray images = jsonObject.getJSONArray("image_urls");
if (images.length() > 0) {
episodeDetails.image_url = images.getString(0);
}
} catch (JSONException e) {
e.printStackTrace();
return null;
}
return episodeDetails;
}
What I'm doing is taking the string (in your case, the one saved in SharedPreferences under ModalList) and copying its values into my own structure. In my code the structure is called EpisodeDetails; in your code the corresponding class is RequestModal. If you don't want to do this by hand and would like to try another library, I recommend Jackson.
Another thing, on this line:
String json = getSharedPreferences(ctx).getString("ModalList", new Gson().toJson(new ArrayList<>()));
You don't need to build the second argument with Gson. getString takes the key to load as its first parameter and a default value as its second parameter (returned when nothing is stored under the key). You could change this to "" or null, as long as the calling code handles that case.
Another solution to your problem could be TinyDB. It uses Gson to save ArrayLists of objects in SharedPreferences, and its usage is as simple as:
Person person = new Person("john", 24);
tinydb.putObject("user1", person);
ArrayList<Person> usersWhoWon = new ArrayList<Person>();
tinydb.putListObject("allWinners", usersWhoWon);
And that's it. See the TinyDB link above for usage details.
HttpGet getRequest=new HttpGet("/rest/auth/1/session/");
getRequest.setHeaders(headers);
httpResponse = httpclient.execute(target,getRequest);
entity = httpResponse.getEntity();
System.out.println(EntityUtils.toString(entity));
The output is in JSON format, as follows:
----------------------------------------
{"session":{"name":"JSESSIONID","value":"5F736EF0A08ACFD7020E482B89910589"},"loginInfo":{"loginCount":50,"previousLoginTime":"2014-11-29T14:54:10.424+0530"}}
----------------------------------------
What I want to know is how I can manipulate this data in Java without writing it to a file.
I want to print name and value in my code.
The Jackson library is preferred, but any would do.
Thanks in advance.
You can use this JSON library to parse your JSON string into a JSONObject and read values from that object, as shown below:
JSONObject json = new JSONObject(EntityUtils.toString(entity));
JSONObject sessionObj = json.getJSONObject("session");
System.out.println(sessionObj.getString("name"));
You need to navigate down to the object that holds the value you want to read. Here you want the value of name, which is inside the session object, so you first get session as a JSONObject using getJSONObject(KeyString) and then read name from that object using getString(KeyString), as shown above.
I hope this helps.
Here are two ways to do it without a library.
NEW (better) Answer:
findInLine might work even better. (scannerName.findInLine(pattern);)
Maybe something like:
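s.findInLine("\\{\"session\":\\{\"name\":\"(\\w+)\",\"value\":\"(\\w+)\"\\},\"loginInfo\":\\{\"loginCount\":(\\d+),\"previousLoginTime\":\"([^\"]+)\"\\}\\}");
\w matches word characters (letters, digits, and underscore), \d matches digits, and the + makes it match one or more times (so it doesn't stop after just one character). [^"]+ matches everything up to the next quote, which the timestamp needs because it contains -, :, . and +. The quotes inside the Java string literal are escaped with a backslash, and the braces are escaped because an unescaped { is a regex metacharacter. After a successful call, s.match().group(1) returns the first captured group.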
Read about patterns here https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
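As a rough sketch of the findInLine idea, the snippet below pulls only name and value out of the response shown in the question. SessionRegex and extractSession are hypothetical names, and the pattern assumes the response arrives on a single line with session as its first member, exactly as in the sample. It is brittle by design and no substitute for a JSON parser.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

final class SessionRegex {

    // Braces are escaped because an unescaped '{' is a regex metacharacter.
    // \s* tolerates optional whitespace around the punctuation.
    private static final Pattern SESSION = Pattern.compile(
            "\\{\\s*\"session\"\\s*:\\s*\\{\\s*\"name\"\\s*:\\s*\"(\\w+)\"\\s*,"
            + "\\s*\"value\"\\s*:\\s*\"(\\w+)\"");

    /** Returns {name, value}, or null if the first line does not match. */
    static String[] extractSession(String json) {
        try (Scanner s = new Scanner(json)) {
            if (s.findInLine(SESSION) == null) {
                return null;
            }
            // match() exposes the capture groups of the last scanning operation
            MatchResult m = s.match();
            return new String[] { m.group(1), m.group(2) };
        }
    }

    public static void main(String[] args) {
        String json = "{\"session\":{\"name\":\"JSESSIONID\","
                + "\"value\":\"5F736EF0A08ACFD7020E482B89910589\"},"
                + "\"loginInfo\":{\"loginCount\":50,"
                + "\"previousLoginTime\":\"2014-11-29T14:54:10.424+0530\"}}";
        String[] session = extractSession(json);
        System.out.println(session[0] + " = " + session[1]);
    }
}
```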
OLD Answer:
I'm pretty sure you could use a scanner with a custom delimiter here.
Scanner s = new Scanner(input).useDelimiter("\"");
Should return something like:
{
session
:{
name
:
JSESSIONID
,
value
:
5F736EF0A08ACFD7020E482B89910589
And so on. Then just sort through that list/use a smarter delimiter/remove the unnecessary bits.
Getting rid of every other item is a pretty decent start.
https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html has info on this.
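The delimiter approach can be sketched as follows. QuoteScanner and stringValueAfter are hypothetical names. The sketch assumes the value is a quoted string, so that the tokens arrive as key, then the colon between the quotes, then the value. It would not work for numbers such as loginCount, and it can be fooled by a value that happens to equal the key.

```java
import java.util.Scanner;

final class QuoteScanner {

    /**
     * Splits the text on double quotes and returns the token that follows
     * the given key and the separator after it, or null if the key is absent.
     * Only suitable for quoted string values.
     */
    static String stringValueAfter(String json, String key) {
        try (Scanner s = new Scanner(json).useDelimiter("\"")) {
            while (s.hasNext()) {
                if (s.next().equals(key) && s.hasNext()) {
                    s.next(); // skip the ":" between key and value
                    return s.hasNext() ? s.next() : null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String json = "{\"session\":{\"name\":\"JSESSIONID\","
                + "\"value\":\"5F736EF0A08ACFD7020E482B89910589\"},"
                + "\"loginInfo\":{\"loginCount\":50,"
                + "\"previousLoginTime\":\"2014-11-29T14:54:10.424+0530\"}}";
        System.out.println(stringValueAfter(json, "name"));
        System.out.println(stringValueAfter(json, "value"));
    }
}
```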
I highly recommend http-request, which is built on the Apache HTTP API.
private static final HttpRequest<Map<String, Map<String, String>>> HTTP_REQUEST = HttpRequestBuilder.createGet(yourUri, new TypeReference<Map<String, Map<String, String>>>(){})
.addDefaultHeaders(headers)
.build();
public void send(){
ResponseHandler<Map<String, Map<String, String>>> responseHandler = HTTP_REQUEST.execute();
Map<String, Map<String, String>> data = responseHandler.get();
}
If you want to use Jackson, you can do this:
entity = httpResponse.getEntity();
ObjectMapper mapper = new ObjectMapper();
Map<String, Map<String, String>> data = mapper.readValue(entity.getContent(), new TypeReference<Map<String, Map<String, String>>>(){});