Spring & MongoCursor & Jackson JsonNode (Database conversion) - java

I have a problem parsing a String I got from a MongoCursor so that it works with JsonNode. I'm trying to map the JSON returned by the MongoCursor onto my Spring SQL POJO so I can insert it into my SQL database. Basically this is a database conversion, and the SQL end is just for history storage. I didn't use Spring's Mongo support because the fields are somewhat different from the POJO's (MongoDB and SQL have slightly different schemas).
Currently, it works by using a pattern matcher, String split and replace, and then putting the results into a HashMap so that I get a key/value pair for each field, which I then insert into my Spring POJO. I know I could also use Jackson's POJO binding, but I was told that JsonNode is a better solution. There must be something I'm missing.
In the Jackson docs, the format of a "json" string is:
{ \"color\" : \"Black\", \"type\" : \"BMW\" }
However, that is not what MongoCursor returns. With the cursor, I get something like:
Document{{_id=G8HQW9123, User=test}}
which I reduced, using a String pattern matcher and replaceAll, to:
{_id:G8HQW9123, User:test}
However, Jackson's backslashes and double quotes are throwing me off, and I am unable to parse that. Am I missing something, or do I actually have to add those backslashes and quotes in my code to make things work? I'm currently getting a parse error that asks for double quotes.

I think you're missing something here.
MongoCursor is returning a Document object, not a String.
Are you calling Document.toString() and working with the String result?
There should be no need to do any String parsing at all. You should be able to take the Document object from Mongo and call its getter methods to pull out the fields you need, which also preserves their data types (strings, numbers, booleans and dates). For example, see the methods listed in the BsonDocument class Javadocs: https://mongodb.github.io/mongo-java-driver/3.4/javadoc/org/bson/BsonDocument.html
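As a rough illustration of the getter approach: org.bson.Document implements Map<String, Object>, so the sketch below accepts a plain Map and therefore runs without the driver on the classpath. The HistoryRecord class is a hypothetical stand-in for the SQL POJO; the keys _id and User are taken from the sample output in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocumentMapping {

    // Hypothetical stand-in for the Spring/SQL POJO from the question.
    public static final class HistoryRecord {
        public final String id;
        public final String user;

        public HistoryRecord(String id, String user) {
            this.id = id;
            this.user = user;
        }
    }

    // Accepts any Map<String, Object>; an org.bson.Document can be passed
    // directly because Document implements that interface.
    public static HistoryRecord toRecord(Map<String, Object> doc) {
        Object id = doc.get("_id");
        Object user = doc.get("User");
        // toString() is used because _id may be an ObjectId rather than a String;
        // missing fields are kept as null instead of the text "null".
        return new HistoryRecord(
                id == null ? null : id.toString(),
                user == null ? null : user.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", "G8HQW9123");
        doc.put("User", "test");

        HistoryRecord record = toRecord(doc);
        System.out.println(record.id + " " + record.user);
    }
}
```

With the real driver, the same values are also available through typed getters such as document.getString("User"), so no intermediate String or regular expression is needed.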

Related

Parsing a large json file using gson

I have a really big JSON file to read and store into a database. I am using Gson in a mixed mode of streaming and object binding. If the file format is correct it works like a charm, but if the format is wrong within one object, the whole file is skipped with an exception (reader.hasNext() throws an exception).
Is there a way to skip a particular bad record and continue to read with rest of file?
Sample json file structure -
[{
"A":1,
"B":2,
"C":3
}]
and let's say a comma or colon is missing in this object.
Another example: there are multiple objects and the comma between two of them is missing, i.e. }{ with no comma in between.
"let's say a comma or colon is missing in this object"
Unfortunately, if you're missing a comma or a colon, then it's impossible to parse the JSON data.
But it's actually a good thing that the parser doesn't accept this data, because it protects you from accidentally reading garbage. Since you are putting this data into a database, it keeps you from potentially filling your database with garbage.
I believe the best solution is to fix the producer of this JSON data and implement the necessary safeguards to prevent bad JSON data in the future.

How to check if a SPARQL query is a query or an update in Java?

I need to parse SPARQL and SPARQL Update queries in a Java application. I tried to do this with the rdf4j library. This library provides ways to parse queries (e.g. QueryParserUtil.parseQuery(...) or SyntaxTreeBuilder.parseQuery(...)) and ways to parse updates (e.g. QueryParserUtil.parseUpdate(...) or SyntaxTreeBuilder.parseUpdateSequence(...)), but there is no method that can parse both. Therefore I need to figure out whether the query string represents a query or an update.
When an update string is passed to a parseQuery() method, a ParseException is thrown. The same happens the other way round. Of course it would be possible to always try the other method if an exception is thrown, but that would be bad programming style.
Is there a method in the rdf4j library that can be used to check whether the queryString represents an update or a simple query?
And if not, are there other solutions for parsing both updates and queries?
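You can use QueryParserUtil.parseOperation() for this purpose. It parses the String and gives you back a ParsedOperation object, which has two subtypes: ParsedUpdate and ParsedQuery. Then you can do a simple instanceof check:
import org.eclipse.rdf4j.query.QueryLanguage;
import org.eclipse.rdf4j.query.parser.ParsedOperation;
import org.eclipse.rdf4j.query.parser.ParsedQuery;
import org.eclipse.rdf4j.query.parser.QueryParserUtil;

String str = "....";
// The third argument is the base URI; null means none is supplied.
ParsedOperation operation = QueryParserUtil.parseOperation(QueryLanguage.SPARQL, str, null);
if (operation instanceof ParsedQuery) {
    // it's a query
} else {
    // it's an update
}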

Parse a JSON string of the format below to compare the values with another map

I have a JSON string that looks like:
"{\"info\":{\"length\":{\"value\":18},\"name\":{\"value\":\"ABC\"}}}"
Say length and name are attribute names.
I have another map (say attributeMap), created from the results I retrieve from the database, that stores the association between attribute names and attribute values.
I need to be able to parse the string and compare the value an attribute has in the above string with the value returned from the attributeMap. Based on those comparisons, I will need to make some decisions.
In order to do this, I should convert the above string to a format that makes the comparison easier and more efficient. I don't think I should be writing my own parser to do this. What would be the right way to do this?
You should use a JSON parser such as Gson (Google; recommended for simplicity), Jackson, the simple org.json, or any other.
Then you will get a JSONObject/JsonNode to navigate and do the comparison.
You can find a parsing example here: How to parse JSON in Java

Parsing Apache CSV like string into objects

I'm trying to parse data obtained via Apache HTTPClient in the fastest and most efficient way possible.
The data returned by the response is a String in a CSV-like format, e.g.:
date, price, status, ...
2014-02-05, 102.22, OK,...
2014-02-05, NULL, OK
I thought about taking the string and manually parsing it, but this may be too slow as I have to do this for multiple requests.
Also the data returned is about 23,000 lines from one source and I may have to parse potentially several sources.
I'm also storing the data in a hash map of type:
Map<String, Map<String, MyObject>>
where the key is the source name, and value is a map with the parsed objects as a key.
So I have two questions: what is the best way to parse a 23,000-line file into objects, and what is the best way to store it?
I tried a CSV parser; however, doubles that are not present come back as NULL rather than 0, so I would need to parse them manually.
Thanks
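For what it's worth, a manual parse of this shape is usually cheap at 23,000 lines. Below is a minimal sketch, assuming plain comma-separated fields with no quoting or embedded commas, a single header row, and that a NULL price should become 0.0 as described above. The PriceRow class and its field names are hypothetical and only mirror the sample header.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvRows {

    // Hypothetical row type; field names follow the sample header.
    public static final class PriceRow {
        public final String date;
        public final double price;
        public final String status;

        public PriceRow(String date, double price, String status) {
            this.date = date;
            this.price = price;
            this.status = status;
        }
    }

    // Parses the response body line by line, skipping the header row.
    public static List<PriceRow> parse(String body) {
        List<PriceRow> rows = new ArrayList<>();
        String[] lines = body.split("\r?\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the header
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split(",");
            if (parts.length < 3) {
                continue; // not enough columns, skip the row
            }
            String rawPrice = parts[1].trim();
            // Map the literal NULL to 0.0; any other non-numeric value
            // makes Double.parseDouble throw NumberFormatException.
            double price = "NULL".equals(rawPrice) ? 0.0 : Double.parseDouble(rawPrice);
            rows.add(new PriceRow(parts[0].trim(), price, parts[2].trim()));
        }
        return rows;
    }

    public static void main(String[] args) {
        String body = "date, price, status\n"
                + "2014-02-05, 102.22, OK\n"
                + "2014-02-05, NULL, OK\n";
        for (PriceRow row : parse(body)) {
            System.out.println(row.date + " " + row.price + " " + row.status);
        }
    }
}
```

Whether 0.0 is the right replacement for a missing price depends on how the values are used later; a nullable Double would keep "missing" distinct from "zero".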

On Google App Engine, is a JDOQL Date query possible without using a parameterised query?

Is it possible to do a date query in JDOQL without using a parameterised query on Google App Engine?
I am trying to write some generic code that looks something like this, where criteria is just a string and I would like to be able to specify anything, without this piece of code needing to know much about the underlying data.
Query query = pm.newQuery(tClass);
if (criteria != null) {
    query.setFilter(criteria);
}
criteria could be "startdate = 'someproperlyformatteddatetime'"
Thanks for your suggestions.
Of course, GAE JDO queries support JDOQL. You could simply do something like this: q.setFilter("height <= 200") or q.setFilter("name == 'Smith'"), where you would programmatically assemble the JDOQL filter string. The only downside is that you need to know the type of parameters (as saved in Datastore), as strings need to be enclosed in single or double quotes.
Note that all restrictions on queries still apply.
Also, if you want to query on multiple properties where you also use inequality operator, then you need to define compound indexes beforehand.
Update: JDOQL literal parameter specification works with string and numeric values; all other value types must use parameter substitution. You could still do that programmatically.
Another workaround would be to use long instead of Date and convert dates to UNIX timestamps (which are of type long).
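A minimal sketch of that timestamp workaround, assuming the entity stores the date as a long holding epoch milliseconds (what Date.getTime() returns). The helper name equalsDate is hypothetical; the property name startdate comes from the question.

```java
import java.util.Date;

public class DateFilter {

    // Builds a JDOQL filter that compares a long-typed property with a
    // date's epoch-millisecond value. Numeric literals need no quoting.
    public static String equalsDate(String property, Date date) {
        return property + " == " + date.getTime();
    }

    public static void main(String[] args) {
        Date when = new Date(1391558400000L); // 2014-02-05T00:00:00Z
        System.out.println(equalsDate("startdate", when));
    }
}
```

The resulting string can then be handed to query.setFilter(...) like any other criteria string, so the generic code does not need to know that the value started out as a date.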
