Jackson JSON Streaming API: Read an entire object directly to String - java

I'm trying to stream in an array of JSON objects, object by object, and I need each one as a raw JSON String.
Given an array of input like so:
[
{"object":1},
{"object":2},
...
{"object":n}
]
I am trying to iterate through the Strings:
{"object":1}
{"object":2}
...
{"object":n}
I can navigate the structure using the streaming API to validate that I have encountered an object, and all that, but I don't think the way I'm getting my String back is ideal.
Currently:
//[...]
//we have read a START_OBJECT token
JsonNode node = parser.readValueAsTree();
String jsonString = anObjectMapper.writeValueAsString(node);
//as opposed to String jsonString = node.toString() ;
//[...]
I imagine the building of the whole JsonNode structure involves a bunch of overhead, which is pointless if I'm just reserializing, so I'm looking for a better solution. Something along the lines of this would be ideal:
//[...]
//we have read a START_OBJECT token
String jsonString = parser.readValueAsString()
//or parser.skipChildrenAsString()
//[...]
The objects are obviously not as simple as
{"object":1}
which is why I'm looking to not waste time doing pointless node building. There may be some ideal way, involving mapping the content to objects and working with that, but I am not in a position where I am able to do that. I need the raw JSON string, one object at a time, to work with existing code.
Any suggestions or comments are appreciated. Thanks!
Edit: parser.getText() returns the current token as text (e.g. START_OBJECT -> "{"), but not the rest of the object.
Edit 2: The motivation for using the streaming API is to buffer the objects in one by one. The actual JSON files can be quite large, and each object can be discarded after use, so I simply need to iterate through.

There is no way to avoid JSON tokenization (otherwise the parser wouldn't know where objects start and end, etc.), so it will always involve some level of parsing and generation.
But you can reduce the overhead slightly by reading values as a TokenBuffer -- it is Jackson's internal type with the lowest memory/performance overhead (and is used internally whenever things need to be buffered):
TokenBuffer buf = parser.readValueAs(TokenBuffer.class);
// write straight from buffer if you have JsonGenerator
jgen.writeObject(buf);
// or, if you must, convert to byte[] or String
byte[] stuff = mapper.writeValueAsBytes(buf);
We can do a bit better, however: if you can create a JsonGenerator for output, just use JsonGenerator.copyCurrentStructure(JsonParser):
jgen.copyCurrentStructure(jp); // points to END_OBJECT after copy
This will avoid all object allocation; and although it will need to decode the JSON and encode it back as JSON, it will be rather efficient.
And you can in fact use this even for transcoding -- read JSON, write XML/Smile/CSV/YAML/Avro -- between any formats Jackson supports.
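Putting the two ideas together for the original goal (one raw JSON String per array element), a minimal sketch could look like the following; the class name, input literal, and helper method are illustrative, and Jackson 2.x (jackson-core) is assumed on the classpath:

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

public class ObjectByObject {
    // Returns each element of a top-level JSON array as its own raw JSON string.
    // No JsonNode tree is built; copyCurrentStructure streams tokens straight
    // from the parser into a per-object generator.
    static List<String> splitArray(String json) throws IOException {
        JsonFactory factory = new JsonFactory();
        List<String> out = new ArrayList<>();
        try (JsonParser parser = factory.createParser(json)) {
            parser.nextToken(); // consume START_ARRAY
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                StringWriter w = new StringWriter();
                try (JsonGenerator gen = factory.createGenerator(w)) {
                    gen.copyCurrentStructure(parser); // copies the whole object
                } // parser now points at the object's END_OBJECT
                out.add(w.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        for (String s : splitArray("[{\"object\":1},{\"object\":2}]")) {
            System.out.println(s); // one raw JSON object per line
        }
    }
}
```

Each object can be handed off and discarded before the next one is read, which matches the buffering requirement from the question.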

Related

How to parse a multidimensional JSON String in Java

I have a JSON-formatted String that has a singular key-value pair and a Map consisting of various String-typed keys and values within it, as follows:
"{"Key":"value","Map":{"key1":"val1","key2":"val2",...}}"
What I want to do is convert this String into a JSONObject (because I have other code that can easily interpret a JSONObject). My first instinct was to use a parser (JSONParser) like the code snippet below...
JSONParser parser = new JSONParser();
Object o = new JSONParser();
o = (JSONObject) parser.parse(jsonStr);
JSONObject j = (JSONObject) o;
…but I got a ParseException instead of the convenient JSONObject. Why is that? Should I be treating the String differently, since it has a Map inside of it? Or am I doing something beyond the capabilities of a JSONParser?
... but I got a ParseException instead of the convenient JSONObject. Why is that?
If you got a ParseException, that means that what you think is JSON is (in fact) not valid JSON. It is not a problem with your parsing code or the JSONObject parser. It is either a problem with the way the (supposed) JSON was produced in the first place, or with the "channel" by which it reached the code that was supposed to parse it.
Should I be treating the string differently, since it has a map inside of it?
Nope.
I note that your example code snippets are not sufficiently clear / complete to be able to tell exactly what you are doing. (In future, please provide a real MCVE rather than code snippets that don't make a lot of sense¹ ... and certainly can't be compiled and run.) But there is nothing to indicate that that code is the cause of the ParseException.
Or am I doing something beyond the capabilities of a JSONParser?
Nope. A JSON parser can cope with any JSON provided that it is well-formed.
To fix this, you are going to need to work out why the parser thinks your JSON is bad, and work back to the root cause of the badness.
¹ For example, why are you assigning a JSONParser object to a variable of type Object?
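For reference, once the input is actually valid JSON (no literal outer quotes, inner quotes properly escaped in the Java source), the json-simple approach from the question works as-is; a minimal sketch, assuming org.json.simple is on the classpath:

```java
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public class ParseNested {
    // Parses a JSON object string; the nested "Map" is just another JSONObject.
    static JSONObject parse(String jsonStr) throws ParseException {
        JSONParser parser = new JSONParser();
        return (JSONObject) parser.parse(jsonStr);
    }

    public static void main(String[] args) throws ParseException {
        // In a Java literal the inner quotes must be escaped; the raw text
        // must not carry the extra outer quotes shown in the question.
        String jsonStr = "{\"Key\":\"value\",\"Map\":{\"key1\":\"val1\",\"key2\":\"val2\"}}";
        JSONObject j = parse(jsonStr);
        System.out.println(j.get("Key"));                            // value
        System.out.println(((JSONObject) j.get("Map")).get("key1")); // val1
    }
}
```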

Compress an object to a string or a hash and vice versa?

I am trying to do this...
I have a big object that I need to send from the backend to the view, and from that view to another backend as a parameter in a GET request. Since I cannot send the whole object, I thought about turning it into a very small string or a number, and to that end I tried the following.
First I converted the object to JSON using JSONObject, but because of the special characters I cannot use it as a normal string (it has many ", /, {}, ...), so I thought about turning it into bytes, but I am not sure whether that is a correct approach. Any idea how to do this?
Or something similar that turns my object into an acceptable GET parameter for the URL.
So far I only have this code.
byte[] foo = new JSONObject(myObject).toString().getBytes();
String bar = foo.toString();
But I don't know how to parse it back to bytes, and I have not even tried to parse it back to JSON. Any ideas?
I am using Spring, and I don't have Gson or Jackson to convert an object to JSON.
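In case it helps, one common JDK-only approach is to deflate the JSON text and then URL-safe Base64-encode the result, which yields a string containing only characters that are legal in a URL; this is a sketch, and the class and method names are illustrative:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class UrlSafeJson {
    // JSON text -> deflated bytes -> URL-safe Base64 string (no padding).
    static String pack(String json) {
        Deflater deflater = new Deflater();
        deflater.setInput(json.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(out.toByteArray());
    }

    // Reverse: Base64 string -> inflated bytes -> original JSON text.
    static String unpack(String packed) throws DataFormatException {
        byte[] compressed = Base64.getUrlDecoder().decode(packed);
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!inflater.finished()) {
            out.write(buf, 0, inflater.inflate(buf));
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws DataFormatException {
        String json = "{\"name\":\"x\",\"values\":[1,2,3]}";
        String packed = pack(json);
        System.out.println(packed);         // only URL-safe characters
        System.out.println(unpack(packed)); // round-trips to the original
    }
}
```

Note that for small objects the deflate overhead can make the result longer than the input, and GET URLs still have practical length limits, so a POST body (or storing the object server-side and passing an id) may be the better design.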

Parsing a JSON array with unnamed fields with GSON

So I want to parse flightradar24.com data into a Java program. The data seems to be JSON-ish, and thus I am trying to read it using GSON's JsonParser.
Here is a short excerpt of what I try to parse (you can see it in full e.g. in http://arn.data.fr24.com/zones/italy_all.js?callback=pd_callback):
pd_callback({"45da286":["4CAAB6",43.5609,12.6837,211,34975,420,"7311","F-LIBP2","B772","EI-ISA",1411143824,"NRT","FCO","AZ785",0,0,"AZA785",0],
"45dae4b":["7809B9",47.5892,15.8256,204,36000,442,"7303","F-LKTB2","A332","B-5921",1411143824,"PVG","FCO","MU787",0,0,"CES787",0],
"45dae73":["76CEF7",47.6238,17.2440,291,36000,428,"0342","F-LKTB1","B77W","9V-SWW",1411143826,"SIN","LHR","SQ318",0,0,"SIA318",0],
"45db3e2":["71BC61",46.0211,10.0549,242,22800,344,"7313","F-LIPE1","B744","HL7461",1411143824,"ICN","MXP","KE927",0,-2176,"KAL927",0],
"full_count":10394,"version":4});
Each line in the example above represents the data of a certain flight at that moment: id, lat, lon, and so on. I have removed pd_callback(...); and, just to try it, I removed "full_count":10394,"version":4 as well. When I try to parse this string into an iterable array of JSON objects, e.g. like this:
com.google.gson.JsonParser jsonParser = new JsonParser();
com.google.gson.JsonArray jsa = jsonParser.parse(line).getAsJsonArray();
I always get parsing errors. I think one of the problems is that the field names are not given and the syntax might be wrong, but that's actually how flightradar serves the data. In the end, I just want to be able to iterate through each flight and read each attribute as a primitive to then feed it into my own objects.
Anyone with an idea where the problem is or how I can parse this kind of data conveniently? Any comments appreciated.
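The payload above is actually a JSON object whose values are arrays (plus the two scalar fields), not a top-level array, which is why getAsJsonArray() on the root fails. A sketch of iterating it with Gson, skipping the non-array entries; JsonParser.parseString assumes Gson 2.8.6+ (older versions use new JsonParser().parse(line)), and the shortened sample values are illustrative:

```java
import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

import java.util.LinkedHashMap;
import java.util.Map;

public class FlightParser {
    // Keeps only the entries whose value is an array (the flights),
    // skipping scalar fields like full_count and version.
    static Map<String, JsonArray> parseFlights(String line) {
        JsonObject root = JsonParser.parseString(line).getAsJsonObject();
        Map<String, JsonArray> flights = new LinkedHashMap<>();
        for (Map.Entry<String, JsonElement> entry : root.entrySet()) {
            if (entry.getValue().isJsonArray()) {
                flights.put(entry.getKey(), entry.getValue().getAsJsonArray());
            }
        }
        return flights;
    }

    public static void main(String[] args) {
        // pd_callback(...) wrapper already stripped; values shortened for the example.
        String line = "{\"45da286\":[\"4CAAB6\",43.5609,12.6837],"
                    + "\"full_count\":10394,\"version\":4}";
        for (Map.Entry<String, JsonArray> e : parseFlights(line).entrySet()) {
            JsonArray f = e.getValue();
            // Positional access: index 0 is the hex id, 1 is lat, 2 is lon, etc.
            System.out.println(e.getKey() + " " + f.get(0).getAsString()
                    + " " + f.get(1).getAsDouble() + "," + f.get(2).getAsDouble());
        }
    }
}
```

Since the fields are unnamed, the mapping from array index to attribute has to be maintained by hand.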

Efficient transcoding of Jackson parsed JSON

I'm using the Jackson streaming API to deserialise a quite large JSON document (on the order of megabytes) into POJOs. It's working fine, but I'd like to optimize it (both memory- and processing-wise; the code runs on Android).
The main problem I'd like to optimize away is converting a large number of strings from UTF-8 to ISO-8859-1. Currently I use:
String result = new String(parser.getText().getBytes("ISO-8859-1"));
As I understand it, the parser first copies the token content into a String (getText()), then creates a byte array from it (getBytes()), which is then used to create the final String in the desired encoding. That is way too much allocation and copying.
The ideal solution would be if getText() accepted an encoding parameter and just gave me the final string, but that's not the case.
Any other ideas, or flaws in my thinking?
You can use:
parser.getBinaryValue() (available since version 2.4 of Jackson),
or you can implement an ObjectCodec (with a readValue(...) method that knows how to convert bytes to a String in ISO-8859-1) and set it using parser.setCodec().
If you have control over the JSON generation, avoid using a charset other than UTF-8.
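Another angle, if the goal is really ISO-8859-1 bytes rather than a String (a Java String has no charset of its own): encode straight from a char buffer, which is what parser.getTextCharacters()/getTextOffset()/getTextLength() expose without building an intermediate String. A stdlib-only sketch of just the conversion step; the char[] here stands in for the parser's internal buffer:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class Latin1Encode {
    // Encodes a slice of a char buffer to ISO-8859-1 bytes without an
    // intermediate String; in real code the slice would come from
    // parser.getTextCharacters()/getTextOffset()/getTextLength().
    static byte[] toLatin1(char[] buf, int offset, int len) {
        ByteBuffer bytes = StandardCharsets.ISO_8859_1.encode(CharBuffer.wrap(buf, offset, len));
        byte[] out = new byte[bytes.remaining()];
        bytes.get(out);
        return out;
    }

    public static void main(String[] args) {
        char[] tokenBuffer = "xxhéllo worldxx".toCharArray(); // pretend parser buffer
        byte[] b = toLatin1(tokenBuffer, 2, 11);
        System.out.println(b.length); // one byte per char in Latin-1
    }
}
```

Characters outside Latin-1 are replaced with the charset's replacement byte, so this only makes sense if the data is known to fit in ISO-8859-1.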

Is a JSON array absolutely necessary for a list of JSON objects stored as carriage-delimited lines?

Suppose I have an extremely large data set, where each data item is serialized as a JSON object. When I place these into a file, I have two choices.
(1) I can write them out as one JSON object per line, delimited by carriage returns / newlines:
{ jsonobject }
{ jsonobject }
...
(2) However, the official spec says the above formatting isn't legal. I have to use a JSON array:
[
{ jsonobject },
{ jsonobject }
]
I think the first approach is easier to parse in code because you can read one text line at a time (I'm using Java and will use BufferedReader.readLine()) and then parse that line into a JSON object.
The second approach may run out of memory depending on what parser you're using, right? The parser may need to read the whole file into memory to construct the array.
What is the best practice for this problem?
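For what it's worth, option (1) is commonly known as JSON Lines / NDJSON, and reading it line by line keeps memory bounded exactly as described. A sketch using BufferedReader plus Jackson's readTree per line; the library choice and class name are assumptions, and the results are collected into a list here only so they are visible (in the streaming case you would process and discard each node inside the loop):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class JsonLines {
    // Reads one JSON object per line; blank lines are skipped.
    static List<JsonNode> readAll(Reader source) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        List<JsonNode> nodes = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    nodes.add(mapper.readTree(line)); // parses a single line
                }
            }
        }
        return nodes;
    }

    public static void main(String[] args) throws IOException {
        // Stands in for a FileReader over the large file.
        String data = "{\"id\":1}\n{\"id\":2}\n";
        for (JsonNode node : readAll(new StringReader(data))) {
            System.out.println(node.get("id").asInt());
        }
    }
}
```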
