Load Dictionary Text File Into Java - java

I need to load a text file of information into Java. The text file looks like this:
"reproduce": {
"VB": 7
},
"drill": {
"VB": 8,
"NN": 16
},
"subgross": {
"JJ": 2
},
"campsites": {
"NNS-HL": 1,
"NNS": 1
},
"streamed": {
"VBN": 1,
"VBD": 2
}
It is basically a huge collection of words with some tags included. I need to save this information in some sort of Java data-structure so that the program can search and retrieve tag statistics for a given word.
From what I've read, using a type of HashMap would be the best idea? Something like:
Map<KeyType, List<ValueType>>
Is that a good idea? How would I go about scanning this data from the text file? I could probably find a way to print the dictionary to the text file that would be easier to scan into Java.

While your input is not quite valid JSON as it stands, you can probably preprocess it in a simple way to make it valid. That is worth doing, because JSON is much more widespread and therefore better supported than a custom format.
If your problem then becomes JSON deserialization, take a look at Jackson or Gson, which will convert your input string into objects.
Simple example in Jackson:
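// Needs com.fasterxml.jackson.databind.ObjectMapper and com.fasterxml.jackson.core.type.TypeReference
ObjectMapper mapper = new ObjectMapper(); // can reuse, share globally
// word -> (tag -> count); the TypeReference keeps the generic types that Map.class would lose
Map<String, Map<String, Integer>> data = mapper.readValue(new File("file.json"),
        new TypeReference<Map<String, Map<String, Integer>>>() {});
// process data further here ...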
Both Jackson and Gson have a lot of options and can handle complex inputs in various ways: they can serialize and deserialize to and from Maps and custom objects, handle polymorphism (mapping different inputs to objects of different classes), and more.
Given the input currently in your question, you can simply prepend an opening curly bracket and append a closing one to get valid JSON:
{
"reproduce": {
"VB": 7
},
"drill": {
"VB": 8,
"NN": 16
},
"subgross": {
"JJ": 2
},
"campsites": {
"NNS-HL": 1,
"NNS": 1
},
"streamed": {
"VBN": 1,
"VBD": 2
}
}
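If you would rather not pull in a JSON library at all, the structure you are after is a map from each word to a map of tag counts, and the file can be scanned line by line. Below is a minimal sketch of that, assuming the file keeps exactly the one-item-per-line layout shown in the question. The class name TagDictionary and the method load are made up for illustration, and a real JSON parser will be more robust if the layout ever changes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagDictionary {
    // Matches a line such as:  "drill": {
    private static final Pattern WORD =
            Pattern.compile("^\\s*\"([^\"]+)\"\\s*:\\s*\\{\\s*$");
    // Matches a line such as:  "NN": 16   (with or without a trailing comma)
    private static final Pattern TAG =
            Pattern.compile("^\\s*\"([^\"]+)\"\\s*:\\s*(\\d+)\\s*,?\\s*$");

    /** Reads word -> (tag -> count) from the line-oriented format in the question. */
    public static Map<String, Map<String, Integer>> load(BufferedReader in) throws IOException {
        Map<String, Map<String, Integer>> dict = new HashMap<>();
        Map<String, Integer> current = null; // tag counts of the word being read
        String line;
        while ((line = in.readLine()) != null) {
            Matcher word = WORD.matcher(line);
            Matcher tag = TAG.matcher(line);
            if (word.matches()) {
                current = new HashMap<>();
                dict.put(word.group(1), current);
            } else if (tag.matches() && current != null) {
                current.put(tag.group(1), Integer.parseInt(tag.group(2)));
            }
            // Lines like "}," or a lone "{" carry no data and are skipped.
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        String sample = "\"drill\": {\n\"VB\": 8,\n\"NN\": 16\n},\n\"subgross\": {\n\"JJ\": 2\n}\n";
        Map<String, Map<String, Integer>> dict =
                load(new BufferedReader(new StringReader(sample)));
        // Count for a known word and tag
        System.out.println(dict.get("drill").get("NN"));
        // Unknown words fall back to an empty map, unknown tags to 0
        Map<String, Integer> none = Collections.emptyMap();
        System.out.println(dict.getOrDefault("missing", none).getOrDefault("VB", 0));
    }
}
```

For a real file you would pass something like Files.newBufferedReader(path) instead of the StringReader. Note that the value type is a nested Map<String, Integer> rather than the List suggested in the question, since each tag carries a count.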

Related

Best way to change JSON keys with JACKSON or other java lib

I have a Json file or string for example:
{
"my-key0": "ke0",
"key-Arr": [
{
"nested-key1": {
"value": "val",
"seqno": 12
},
"nested2": 1
},
{
"dns-sss-qqq": [
{
"some": "aaaaa"
}
]
}
],
"recsize": 459,
"my-obj": {
"my-key1": {
"my-key2": "key2"
}
}
}
My goal is to replace the "-" character with "_" in keys only, in Scala/Java.
At first I thought it could be done with a regex, but the keys can be unquoted, and a regex could also affect values.
What is the most efficient way to do it? (Performance matters.)
I have to process GBs of such records.
Thank you
Try jsoniter-scala: it has supported kebab-case since v0.17.0, and it is also more efficient at parsing and serialization than jackson-module-scala.
The project publishes benchmark results that compare the parsing and serialization performance of jsoniter-scala with jackson-module-scala, circe and play-json on JDK 8.
It can also parse streaming JSON values and JSON arrays from a java.io.InputStream without holding all parsed values in memory.
Extracting a few selected fields or substructures, instead of parsing the whole message or document, is where jsoniter-scala shines.
So try using it directly instead of converting all your data.

Replace json / xml key or values based on other source

I want to encode a JSON/XML payload based on another property/JSON file.
master file:
{
   "keys": {
      "Name": "abcd",
      "age": "trst",
      "USA": "bcd",
      "country": "wert"
   }
}
Source payload:
{
   "Name": "John",
   "age": 23,
   "Address": {
      "state": "Texas",
      "country": "USA"
   }
}
expected encoded payload:
{
   "abcd": "John",
   "trst": 23,
   "Address": {
      "state": "Texas",
      "wert": "bcd"
   }
}
NOTE:
The source payload can be an XML file if needed (if that allows a faster solution than JSON; in that case the expected encoded payload can also be XML).
I have a couple of ideas:
1. Keep the master file in a map and traverse the JSON object / XML file, reading each key and value. While traversing, look each one up in the map and replace it.
2. Treat the source payload as a string and do a string replace using a regex (build the regex dynamically from the master file, like ("Name"|"age"|"USA"|"country"), then match and replace).
The objective is to find the most accurate solution that also performs well. I would appreciate it if you could share your ideas and a small sample if possible. Or is there a library that can do this kind of thing?
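The first idea in the question (keep the master file in a map and replace while traversing) can be sketched roughly as follows. This is only an illustration under some assumptions: the payload has already been parsed into plain java.util Map and List values (which is what, for example, Jackson produces when asked to bind to Map.class), the "keys" object of the master file has been loaded into a Map<String, String>, and both keys and string values are looked up, as the expected output suggests. The names PayloadEncoder and encode are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PayloadEncoder {

    /** Returns a copy of the tree with keys and string values replaced via the master map. */
    @SuppressWarnings("unchecked")
    public static Object encode(Object node, Map<String, String> master) {
        if (node instanceof Map) {
            // LinkedHashMap keeps the original key order of the payload
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : ((Map<String, Object>) node).entrySet()) {
                String key = master.getOrDefault(e.getKey(), e.getKey());
                out.put(key, encode(e.getValue(), master));
            }
            return out;
        }
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<Object>) node) {
                out.add(encode(item, master));
            }
            return out;
        }
        if (node instanceof String) {
            String value = (String) node;
            return master.getOrDefault(value, value);
        }
        return node; // numbers, booleans and null are left untouched
    }

    public static void main(String[] args) {
        Map<String, String> master = new HashMap<>();
        master.put("Name", "abcd");
        master.put("age", "trst");
        master.put("USA", "bcd");
        master.put("country", "wert");

        Map<String, Object> address = new LinkedHashMap<>();
        address.put("state", "Texas");
        address.put("country", "USA");
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("Name", "John");
        source.put("age", 23);
        source.put("Address", address);

        // prints {abcd=John, trst=23, Address={state=Texas, wert=bcd}}
        System.out.println(encode(source, master));
    }
}
```

Compared with the regex idea, a traversal like this cannot accidentally rewrite part of a longer string, because it only ever replaces whole keys and whole string values. One thing it does not handle is two different keys of the same object mapping to the same replacement; in that case the later entry would overwrite the earlier one.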

GSON parsing without a lot of classes

I have the following JSON and I'm only interested in getting the elements "status", "lat" and "lng".
Using Gson, is it possible to parse this JSON to get those values without creating the whole classes structure representing the JSON content?
JSON:
{
"result": {
"geometry": {
"location": {
"lat": 45.80355369999999,
"lng": 15.9363229
}
}
},
"status": "OK"
}
You don't need to define any new classes; you can simply use the JSON objects that come with the Gson library. Here's a simple example:
// Gson 2.8.6+: the static JsonParser.parseString replaces the deprecated new JsonParser().parse(...)
JsonObject rootObj = JsonParser.parseString(json).getAsJsonObject();
// Walk down result -> geometry -> location
JsonObject locObj = rootObj.getAsJsonObject("result")
        .getAsJsonObject("geometry").getAsJsonObject("location");
String status = rootObj.get("status").getAsString();
String lat = locObj.get("lat").getAsString();
String lng = locObj.get("lng").getAsString();
System.out.printf("Status: %s, Latitude: %s, Longitude: %s%n", status, lat, lng);
Plain and simple. If you find yourself repeating the same code over and over, then you can create classes to simplify the mapping and eliminate repetition.
It is indeed possible, but you have to create a custom deserializer. See the Gson documentation and the Gson API Javadoc for further info, and also take a look at my other responses on the topic. If you still have doubts, comment.
That said, in my opinion it is much easier to parse it by creating the corresponding classes, especially given how simple your JSON response is. With the usual approach you only have to write a few very simple classes. Writing a custom deserializer is not that complex, but it will probably take you longer, and it will be harder to adapt if you later need other data from your JSON.
Gson has a way of operating that has been designed for developers to use it, not for trying to find workarounds!
Anyway, why do you not want to use classes? If you don't like to have many classes in your project, you can just use nested classes and your project will look cleaner...

ANDROID usage of Jackson library: How to load object with indexes - range from to

I have a really big JSON file to parse and manage. My JSON file contains a structure like this:
[
{"id": "11040548","key1":"keyValue1","key2":"keyValue2","key3":"keyValue3","key4":"keyValue4","key5":"keyValue5","key6":"keyValue6","key7":"keyValue7","key8":"keyValue8","key9":"keyValue9","key10":"keyValue10","key11":"keyValue11","key12":"keyValue12","key13":"keyValue13","key14":"keyValue14","key15":"keyValue15"
},
{"id": "11040549","key1":"keyValue1","key2":"keyValue2","key3":"keyValue3","key4":"keyValue4","key5":"keyValue5","key6":"keyValue6","key7":"keyValue7","key8":"keyValue8","key9":"keyValue9","key10":"keyValue10","key11":"keyValue11","key12":"keyValue12","key13":"keyValue13","key14":"keyValue14","key15":"keyValue15"
},
....
{"id": "11040548","key1":"keyValue1","key2":"keyValue2","key3":"keyValue3","key4":"keyValue4","key5":"keyValue5","key6":"keyValue6","key7":"keyValue7","key8":"keyValue8","key9":"keyValue9","key10":"keyValue10","key11":"keyValue11","key12":"keyValue12","key13":"keyValue13","key14":"keyValue14","key15":"keyValue15"
}
]
My JSON file contains data about topics from a news website, and it will grow dramatically practically every day.
To parse that file I use:
URL urlLinkSource = new URL(OUTBOX_URL);
urlLinkSourceReader = new BufferedReader(new InputStreamReader(
urlLinkSource.openStream(), "UTF-8"));
ObjectMapper mapper = new ObjectMapper();
List<DataContainerList> DataContainerListData = mapper.readValue(urlLinkSourceReader,new TypeReference<List<DataContainerList>>() { }); //DataContainerList contains id, key1, key2, key3..key15
My problem is that in this line
List<DataContainerList> DataContainerListData = mapper.readValue(urlLinkSourceReader,new TypeReference<List<DataContainerList>>() { });
I want to load only a range of JSON objects (just the first ten, or just the second ten), because my app displays only 10 news items at a time in paging mode, and I always know the indexes of the 10 I need to display. It is totally stupid to load 10 000 objects and iterate over just the first 10 of them. So my question is: how can I, in a way similar to this one:
List<DataContainerList> DataContainerListData = mapper.readValue(urlLinkSourceReader,new TypeReference<List<DataContainerList>>() { });
load only the objects with indexes FROM-TO (for example from 30 to 40) without loading all the objects in the entire JSON file?
Regards
It depends on what you mean by "load objects with indexes from/to". There are two options:
Read everything but bind only a sublist
The solution in that case is to read the full stream and bind only the values within those indexes.
You can use Jackson's streaming API and do it yourself: parse the stream, use a counter to keep track of the current index, and bind to POJOs only what you need.
However, this is not a good solution if your file is large and the work is done in real time.
Read only the data between those indexes
You should do that if your file is big and performance matters. Instead of having a single big file, do the pagination by splitting your json array into multiple files matching your ranges, and then just deserialize the specific file content into your array.
Hope this helps...

How do I modify a large json string?

Dead silence! Not often you experience that on Stackoverflow... I've added a small bounty to get things going!
I've built a json document containing information about the location of various countries. I have added some custom keys. This is the beginning of the json-file:
{
"type": "FeatureCollection",
"features": [
{ "type": "Feature", "properties": {
"NAME": "Antigua and Barbuda",
"banned/censored": "AG",
"Bombed": 29,
"LON": -61.783000, "LAT": 17.078000 },
"geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ -61.686668,...
All the custom keys (like bombed, banned/censored etc.) have values, but they are just old (bogus, if you like) values. The real values are kept in a .csv file extracted from an Excel document.
For example, I have this:
                     banned/censored  bombed
Antigua and Barbuda  2                120
...
Now I want to match these values with the proper keys in the JSON file. Are there any programs out there that I can use? Another option would be a JSON library for Java that somehow supports what I want. I haven't been able to find an easy solution yet. The document is pretty large (~10 MB), if that makes any difference!
EDIT: I've used QGIS to manipulate the .shp file, so some kind of extension could be of use too.
Just convert both the JSON and the CSV to full-fledged Java objects. This way you can write whatever Java logic you like to alter one set of objects based on the other. Finally, convert the modified Java object representing the JSON data back to a JSON string.
There is, however, one wrinkle in your JSON. The / in banned/censored is perfectly legal in a JSON key, but it is not legal in a Java identifier, so a Java field cannot carry that name directly. Either rename the key, or keep it and map it to a differently named field with Gson's @SerializedName annotation.
I can recommend using Google Gson for the converting between JSON and Java. Here's a kickoff example based on your JSON structure (with banned/censored renamed to bannedOrCensored):
class Data {
private String type;
private List<Feature> features;
}
class Feature {
private String type;
private Properties properties;
private Geometry geometry;
}
class Properties {
private String NAME;
private String bannedOrCensored;
private Integer Bombed;
private Double LON;
private Double LAT;
}
class Geometry {
private String type;
private Double[][][][] coordinates;
}
Gson reads and writes the fields directly, so getters and setters are only needed for your own code that changes the values; add or generate them as required. Then you'll be able to convert from JSON to Java as follows:
Data data = new Gson().fromJson(jsonString, Data.class);
To convert between CSV and a Java object, just pick one of the many CSV parsers, like OpenCSV. You can even homegrow your own with help of BufferedReader.
Finally, after altering the Java object representing the JSON data, you can convert it back to JSON string with help of Gson as follows:
String json = new Gson().toJson(data);
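For the CSV side, a homegrown reader based on BufferedReader could look roughly like the sketch below. It rests on assumptions the question does not confirm: that the file is comma-separated, that it has one header row, that the columns are name, banned/censored and bombed in that order, and that no cell contains a quoted comma. The class name CsvValues and the method read are made up for illustration. For anything less regular, a proper CSV parser is the safer choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class CsvValues {

    /**
     * Reads rows of the assumed form "name,bannedOrCensored,bombed" into
     * name -> {bannedOrCensored, bombed}. The first line is treated as a header.
     */
    public static Map<String, int[]> read(BufferedReader in) throws IOException {
        Map<String, int[]> rows = new LinkedHashMap<>();
        String line = in.readLine(); // header row, ignored
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",");
            rows.put(cells[0].trim(), new int[] {
                Integer.parseInt(cells[1].trim()),
                Integer.parseInt(cells[2].trim())
            });
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String sample = "NAME,banned/censored,bombed\nAntigua and Barbuda,2,120\n";
        Map<String, int[]> rows = read(new BufferedReader(new StringReader(sample)));
        int[] values = rows.get("Antigua and Barbuda");
        System.out.println(values[0] + " " + values[1]); // prints 2 120
    }
}
```

With that map in hand, the update step is a loop over the features of the parsed Data object: look up each feature's NAME in the map and, if it is present, overwrite the old values before converting back to JSON.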
While BalusC's answer tells you how to do it in your current setup, I have a more radical suggestion: get rid of the JSON.
By design, JSON is not meant for storing data; it is meant to be used as a "lightweight text-based open standard designed for human-readable data interchange". That is:
low-traffic (as little non-meaningful data as possible)
human-readable
easy to handle with dynamic languages
Data stores, on the other hand, have many more requirements than this. That's why databases exist. So move your storage to a database. If you don't want a full-featured database, use something like HSQLDB or JavaDB.
