Jackson JSON API: Find all occurrences of key in a string - java

I've used Jackson off and on for a while, so I'm familiar with it, but it has been a while and I find myself facing a little issue that I feel should be simpler than what I'm seeing, but I could be wrong.
I have a JSON string in Java. I need to find all occurrences of a particular key within that string. This key could pop up in a number of places, either in the root object, subobject, or an array of objects with who knows how much nesting. Basically, I need to find all instances of this key (along with the value), rename the key, and change the value. It's also possible that this key will not exist at all, depending on the string.
Basically, I need to find key "a", but I could have any of the following:
{b: 3, c: [{a: 0},{a: 7}]}
{a: 5, c: [], f: {a: 12}}
You get the idea. The structure has some variability to it, but in all cases I need to find all (if any) occurrences of this key and make the needed changes. Is there a simple way to do this? In a nutshell, I know that I could do something like this:
Map<String, Object> map = objectMapper.readValue(...);
And iterate through the Map, typechecking, recursing, etc as needed, but that seems to be an excessive amount of code for this. I know I could do something similar with ObjectMapper.readTree(), but then I'd be doing a pretty similar operation (typecheck, iterate, recurse), perhaps with a little less code, but still bulky. I feel like there should be a simpler way, but maybe I'm mistaken.

I am guessing that what you are looking for is one of these methods in JsonNode:
findValues()
findValuesAsText()
findParents(...)
These are all slight variations on finding fields with a given field name. The first returns the matching values as a List of JsonNodes, the second returns them as a List of Strings, and the third returns a List of the JsonNodes (objects) that contain the field.
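As a rough sketch of how findParents(...) could cover the rename-and-replace case from the question: the class name, method name, and replacement value below are made up for illustration, and the input must be valid JSON with quoted keys. Note that findParents does not descend into the value of a matching field, so a key nested inside its own value would be missed.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.IOException;

class RenameKeyExample {
    // Hypothetical helper: renames every field called oldKey to newKey
    // and replaces its value with newValue.
    static String renameAll(String json, String oldKey, String newKey, String newValue) {
        try {
            JsonNode root = new ObjectMapper().readTree(json);
            // findParents returns the object nodes that directly contain oldKey;
            // the list is empty when the key does not occur at all.
            for (JsonNode parent : root.findParents(oldKey)) {
                ObjectNode object = (ObjectNode) parent;
                object.remove(oldKey);
                object.put(newKey, newValue);
            }
            return root.toString();
        } catch (IOException e) {
            throw new IllegalArgumentException("input is not valid JSON", e);
        }
    }
}
```

For example, renameAll("{\"a\":5,\"c\":[],\"f\":{\"a\":12}}", "a", "renamed", "x") would touch both the root object and the nested object under "f".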


Java: the efficient way to read logs from a file

I'm looking for the most efficient way to get all the elements of a List<String> that contain some String value ("value1", for example).
My first thought was simple iteration, adding the elements that contain "value1" to another List<String>. But this task must be done very often and by many users.
I thought about list.removeAll(), but how do I remove all elements that don't contain "value1"?
So, what is the most efficient way to do this?
UPDATE:
The whole picture: I need to read the logs from a file very often and for multiple users simultaneously. The logs must be filtered by username. Each line in the file contains a username.
In terms of time efficiency, you cannot do better than linear time (O(n)) if you have to iterate through the whole list.
Deciding between LinkedList and ArrayList etc. is most likely irrelevant as the differences are small.
If you want a better time than linear to list size, you need to build on some assumptions and prerequisites:
if you know beforehand what string you'll search for, you can build another list along with your original list containing only relevant records
if you know you're going to query one list multiple times, you could build an index
If you just have a list on input that someone gave you, and you need to read through this one input once and find the relevant strings, then you're stuck with linear time since you cannot avoid reading the list at least once.
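A minimal sketch of the index idea mentioned above, assuming (this is not stated in the question) that the username is the first whitespace-separated token of each log line. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LogIndex {
    // username -> all log lines for that user, in insertion order
    private final Map<String, List<String>> linesByUser = new HashMap<>();

    // Assumption: the username is the first whitespace-separated token.
    static String userOf(String line) {
        return line.trim().split("\\s+", 2)[0];
    }

    // Building the index costs one pass over the input (linear time).
    void add(String line) {
        linesByUser.computeIfAbsent(userOf(line), u -> new ArrayList<>()).add(line);
    }

    // Later lookups cost one hash lookup instead of a scan of all lines.
    List<String> linesFor(String user) {
        return linesByUser.getOrDefault(user, Collections.emptyList());
    }
}
```

This version is not thread-safe. With many simultaneous users it would need synchronization or a concurrent map, which is left out here.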
From your comments it seems like your list is a couple of log statements that should be grouped by user id (which would be your "value1"). If you really need to read the logs very often and for multiple users simultaneously you might consider some caching, possibly with grouping by user id.
As an example, you could maintain an additional log file per user and just display it when needed. Alternatively, you could keep the latest log statements in memory by employing some FIFO buffer which is grouped by user id (could be a buffer per user and maybe another LIFO layer on top of that).
However, depending on your use case it might not be worth the effort, and you might simply filter the list whenever the user requests it. In that case I'd recommend reading the file line by line and adding only the matching lines to the list. If you first read everything into a single list and then remove the non-matching elements, it will be less efficient (you'd have to iterate more often, shift elements, etc.) and temporarily use more memory (as opposed to discarding every non-matching line right after checking it).
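Filtering while reading could look roughly like this. It is a sketch in which the class and method names are invented, and it takes a Reader so the same code works for a file or for an in-memory string.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.List;
import java.util.stream.Collectors;

class LogFilter {
    // Reads the source line by line and keeps only the lines containing needle.
    // Non-matching lines are discarded immediately and never stored.
    static List<String> linesContaining(Reader source, String needle) {
        BufferedReader reader = new BufferedReader(source);
        return reader.lines()
                .filter(line -> line.contains(needle))
                .collect(Collectors.toList());
    }
}
```

For a real file, the Reader could come from Files.newBufferedReader(...) inside a try-with-resources block so that the file is closed afterwards.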
Instead of a List, use a TreeSet with a Comparator that places all Strings containing "value1" at the beginning. When iterating, as soon as you reach a string that does not contain "value1", none of the remaining ones contain it either, and you can stop iterating. (Note that a Set discards duplicate lines.)
The iteration is likely the only way, but you can allow Java to optimize it as much as possible (and use an elegant, non imperative syntax) by employing Java 8's streams:
// test list
List<String> original = Arrays.asList("value1", "foo", "foovalue1", "value1foo");
List<String> trimmed = original
.stream()
.filter((s) -> s.contains("value1"))
.collect(Collectors.toList());
System.out.println(trimmed);
Output
[value1, foovalue1, value1foo]
Notes
One part of your question that may require more information is "performed often, by many users" - this may call for some concurrency-handling mechanism.
The actual functionality is not very clear. You may still have room to optimize your code early by fetching and collecting the "value1"-containing Strings prior to building your List.
OK, here I can suggest the simplest approach, one I have used myself.
Using an Iterator makes it easier. If you instead call list.remove(val) while looping over the list, you may get a ConcurrentModificationException, and on a fixed-size list (such as one from Arrays.asList) any removal throws UnsupportedOperationException.
List<String> list = yourList; /* contains "value1" */
for (Iterator<String> itr = list.iterator(); itr.hasNext();){
String val = itr.next();
if(!val.contains("value1")){
itr.remove();
}
}
Try this one and let me know. :)

Storing a dictionary in a hashtable

I have an assignment that I am working on, and I can't get a hold of the professor to get clarity on something. The idea is that we are writing an anagram solver, using a given set of words, that we store in 3 different dictionary classes: Linear, Binary, and Hash.
So we read in the words from a textfile, and for the first 2 dictionary objects(linear and binary), we store the words as an ArrayList...easy enough.
But for the HashDictionary, he wants us to store the words in a Hashtable. I'm just not sure what the values are going to be for the Hashtable, or why you would do that. The instructions say we store the words in a Hashtable for quick retrieval, but I just don't get what the point of that is. It makes sense to store words in an ArrayList, but I'm just not sure how key/value pairing helps with a dictionary.
Maybe I'm not giving enough details, but I figured maybe someone would have seen something like this and it's obvious to them.
Each of our classes has a contains method, that returns a boolean representing whether or not a word passed in is in the dictionary, so the linear does a linear search of the arraylist, the binary does a binary search of the arraylist, and I'm not sure about the hash....
The difference is speed. Both methods work, but the hash table is fast.
When you use an ArrayList, or any sort of List, to find an element, you must inspect each list item, one by one, until you find the desired word. If the word isn't there, you've looped through the entire list.
When you use a HashTable, you perform some "magic" on the word you are looking up known as calculating the word's hash. Using that hash value, instead of looping through a list of values, you can immediately deduce where to find your word - or, if your word doesn't exist in the hash, that your word isn't there.
I've oversimplified here, but that's the general idea. You can find another question here with a variety of explanations on how a hash table works.
Here is a small code snippet utilizing a HashMap.
// We will map our words to their definitions; word is the key, definition is the value
Map<String, String> dictionary = new HashMap<String, String>();
dictionary.put("hello", "A common salutation");
dictionary.put("chicken", "A delightful vessel for protein");
// Later ...
dictionary.get("chicken"); // Returns "A delightful vessel for protein"
The problem you describe asks that you use a HashMap as the basis for a dictionary that fulfills three requirements:
Adding a word to the dictionary
Removing a word from the dictionary
Checking if a word is in the dictionary
It seems counter-intuitive to use a map, which stores a key and a value, since all you really want to do is store just a key (or just a value). However, as I described above, a HashMap makes it extremely quick to find the value associated with a key. Similarly, it makes it extremely quick to see if the HashMap knows about a key at all. We can leverage this quality by storing each of the dictionary words as a key in the HashMap, and associating it with a garbage value (since we don't care about it), such as null.
You can see how to fulfill the three requirements, as follows.
Map<String, Object> map = new HashMap<String, Object>();
// Add a word
map.put("word", null);
// Remove a word
map.remove("word");
// Check for the presence of a word
map.containsKey("word");
I don't want to overload you with information, but the requirements we have here align with a data structure known as a Set. In Java, a commonly used Set is the HashSet, which is almost exactly what you are implementing with this bit of your homework assignment. (In fact, if this weren't a homework assignment explicitly instructing you to use a HashMap, I'd recommend you instead use a HashSet.)
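For comparison, here is a sketch of the same three operations on top of a HashSet. The class name HashDictionary follows the question; the method names other than contains are guesses at what the assignment might want.

```java
import java.util.HashSet;
import java.util.Set;

class HashDictionary {
    // A HashSet stores only keys, so no dummy value is needed.
    private final Set<String> words = new HashSet<>();

    void add(String word) {
        words.add(word);
    }

    void remove(String word) {
        words.remove(word);
    }

    // Expected constant-time lookup, independent of dictionary size.
    boolean contains(String word) {
        return words.contains(word);
    }
}
```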
Arrays are hard to find stuff in. If I gave you array[0] = "cat"; array[1] = "dog"; array[2] = "pikachu";, you'd have to check each element just to know if jigglypuff is a word. If I gave you hash["cat"] = 1; hash["dog"] = 1; hash["pikachu"] = 1;, the check is instant: you just look the word up directly. The value 1 doesn't matter in this particular case, although you can put useful information there, such as how many times you've looked up a word, or maybe 1 will mean a real word and 2 will mean the name of a Pokemon, or for a real dictionary it could contain a sentence-long definition. That is less relevant here.
It sounds like you don't really understand hash tables then. Even Wikipedia has a good explanation of this data structure.
Your hash table is just going to be a large array of strings (initially all empty). You compute a hash value using the characters in your word, and then insert the word at that position in the table.
There are issues when the hash value for two words is the same. And there are a few solutions. One is to store a list at each array position and just shove the word onto that list. Another is to step through the table by a known amount until you find a free position. Another is to compute a secondary hash using a different algorithm.
The point of this is that hash lookup is fast. It's very quick to compute a hash value, and then all you have to do is check that the word at that array position exists (and matches the search word). You follow the same rules for hash value collisions (in this case, mismatches) that you used for the insertion.
You want your table size to be a prime number that is larger than the number of elements you intend to store. You also need a hash function that diverges quickly so that your data is more likely to be dispersed widely through your hash table (rather than being clustered heavily in one region).
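To make the description above concrete, here is a toy hash table of strings that resolves collisions with a list at each array position (the first of the strategies mentioned). It is only a sketch: the class name is invented, it borrows String.hashCode() as the hash function rather than implementing one, and it never resizes.

```java
import java.util.LinkedList;
import java.util.List;

class ChainedStringTable {
    // One list ("bucket") per array position; colliding words share a bucket.
    private final List<String>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedStringTable(int size) {
        buckets = new List[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Maps a word to an array position; floorMod keeps the index non-negative.
    private int indexOf(String word) {
        return Math.floorMod(word.hashCode(), buckets.length);
    }

    void insert(String word) {
        List<String> bucket = buckets[indexOf(word)];
        if (!bucket.contains(word)) {
            bucket.add(word);
        }
    }

    // Only the one bucket selected by the hash is searched, not the whole table.
    boolean contains(String word) {
        return buckets[indexOf(word)].contains(word);
    }
}
```

Constructed with a prime size such as new ChainedStringTable(101), most buckets hold zero or one word, so contains usually inspects very few entries.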
Hope this is a help and points you in the right direction.

Help need in creating a hashset from a hashmap

I've been able to read a four column text file into a hashmap and get it to write to a output file. However, I need to get the second column(distinct values) into a hashset and write to the output file. I've been able to create the hashset, but it is grabbing everything and not sorting. By the way I'm new, so please take this into consideration when you answer. Thanks
Neither HashSet nor HashMap are meant to sort. They're fundamentally unsorted data structures. You should use an implementation of SortedSet, such as TreeSet.
Some guesses, related to Mr. Skeet's answer and your apparent confusion...
Are you sure you are not inserting the whole line into the TreeSet? If you are going to use ONLY the second column, you will need to split() the strings (representing the lines) into columns. That is not done automatically.
Also, if you are actually trying to sort the whole file using the second column as the key, you will need a TreeMap instead, with the second column as the key and the whole line as the data. But that won't solve the splitting; it only keeps the relation between the line and the key.
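A small sketch of the splitting step, assuming the columns are separated by whitespace (the question does not say which delimiter the file uses). The class and method names are invented.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class SecondColumn {
    // Collects the distinct values of the second column in sorted order.
    // Assumption: columns are separated by one or more whitespace characters.
    static SortedSet<String> distinctSorted(List<String> lines) {
        SortedSet<String> values = new TreeSet<>();
        for (String line : lines) {
            String[] columns = line.trim().split("\\s+");
            if (columns.length >= 2) {
                values.add(columns[1]); // index 1 is the second column
            }
        }
        return values;
    }
}
```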
Edit: Here is some terminology for you, you might need it.
You have a Set. It's a collection of other objects, like Strings. You add other objects to it, and then you can fetch all objects in it by iterating through the set. Adding is done through the method add(), and iterating can be done using the enhanced for loop syntax or using the iterator() method.
The set doesn't "grab" or "take" stuff. You add something to the set, in this case a String, not an array of Strings, which is written as String[].
(It's apparently possible to add an array to a HashSet, since arrays are objects too, but the order is not related to the contents of the Strings. Maybe that's what you are doing.)
String key = splittedLine[1]; // 2:nd element
"The second element of the keys" doesn't make sense at all. And what are the duplicates you're talking about? (note the correct use of apostrophes... :-)

Get a value from hashtable by a part of its key

Say I have a Hashtable<String, Object> with such keys and values:
apple => 1
orange => 2
mossberg => 3
I can use the standard get method to get 1 by "apple", but what I want is getting the same value (or a list of values) by a part of the key, for example "ppl". Of course it may yield several results, in this case I want to be able to process each key-value pair. So basically similar to the LIKE '%ppl%' SQL statement, but I don't want to use a (in-memory) database just because I don't want to add unnecessary complexity. What would you recommend?
Update:
Storing data in a Hashtable isn't a requirement. I'm looking for a general approach to solve this.
The obvious brute-force approach would be to iterate through the keys in the map and match them against the char sequence. That could be fine for a small map, but of course it does not scale.
This could be improved by using a second map to cache search results. Whenever you collect a list of keys matching a given char sequence, you can store these in the second map so that next time the lookup is fast. Of course, if the original map is changed often, it may get complicated to update the cache. As always with caches, it works best if the map is read much more often than changed.
Alternatively, if you know the possible char sequences in advance, you could pre-generate the lists of matching strings and pre-fill your cache map.
Update: Hashtable is not recommended anyway. It is synchronized, and thus much slower than it should be. You are better off using HashMap if no concurrency is involved, or ConcurrentHashMap otherwise. The latter outperforms Hashtable by far.
Apart from that, off the top of my head I can't think of a better collection for this task than maps. Of course, you may experiment with different map implementations to find the one that best suits your specific circumstances and usage patterns. In general, it would thus be
Map<String, Object> fruits;
Map<String, List<String>> matchingKeys;
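Put together, the brute-force scan plus the cache could be sketched as follows. The field names fruits and matchingKeys come from the declarations above; the class name, the method names, and the clear-on-write invalidation are assumptions chosen for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SubstringLookup {
    private final Map<String, Object> fruits = new HashMap<>();
    // Cache: searched char sequence -> keys that contain it
    private final Map<String, List<String>> matchingKeys = new HashMap<>();

    void put(String key, Object value) {
        fruits.put(key, value);
        // Simplest possible invalidation: drop the whole cache on every change.
        matchingKeys.clear();
    }

    List<Object> valuesMatching(String part) {
        // The full scan over all keys only runs the first time a part is searched.
        List<String> keys = matchingKeys.computeIfAbsent(part, p -> {
            List<String> found = new ArrayList<>();
            for (String key : fruits.keySet()) {
                if (key.contains(p)) {
                    found.add(key);
                }
            }
            return found;
        });
        List<Object> values = new ArrayList<>();
        for (String key : keys) {
            values.add(fruits.get(key));
        }
        return values;
    }
}
```

Clearing the cache on every write is only reasonable when the map is read far more often than it changes, as noted above.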
Not without iterating through explicitly. Hashtable is designed to go (exact) key->value in O(1), nothing more, nothing less. If you will be doing query operations with large amounts of data, I recommend you do consider a database. You can use an embedded system like SQLite (see SQLiteJDBC) so no separate process or installation is required. You then have the option of database indexes.
I know of no standard Java collection that can do this type of operation efficiently.
Sounds like you need a trie with references to your data. A trie stores strings and lets you search for strings by prefix. I don't know the Java standard library too well and I have no idea whether it provides an implementation, but one is available here:
http://www.cs.duke.edu/~ola/courses/cps108/fall96/joggle/trie/Trie.java
Unfortunately, a trie only lets you search by prefixes. You can work around this by storing every possible suffix of each of your keys:
For 'apple', you'd store the strings
'apple'
'pple'
'ple'
'le'
'e'
Which would allow you to search for every prefix of every suffix of your keys.
Admittedly, this is the kind of "solution" that would prompt me to continue looking for other options.
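The suffix idea can also be tried without a trie. The sketch below swaps in a sorted map (java.util.TreeMap) and does the prefix search with subMap, so it is a stand-in for the trie rather than the structure described above. The class and method names are invented, and the range trick assumes keys do not contain the character '\uffff'.

```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

class SuffixIndex {
    // Maps every suffix of every key to the keys that suffix came from.
    private final TreeMap<String, Set<String>> suffixes = new TreeMap<>();

    void add(String key) {
        for (int i = 0; i < key.length(); i++) {
            suffixes.computeIfAbsent(key.substring(i), s -> new TreeSet<>()).add(key);
        }
    }

    // A key contains part exactly when one of its suffixes starts with part,
    // so a prefix range query over the suffixes finds all such keys.
    Set<String> keysContaining(String part) {
        Set<String> result = new TreeSet<>();
        SortedMap<String, Set<String>> range =
                suffixes.subMap(part, part + Character.MAX_VALUE);
        for (Set<String> keys : range.values()) {
            result.addAll(keys);
        }
        return result;
    }
}
```

The cost is memory: a key of length n contributes n suffixes, which is part of why this remains a workaround.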
First of all, use HashMap, not Hashtable.
Then you can filter the map with a predicate, using the utilities in Google Guava:
public Collection<Object> getValues(){
Map<String,Object> filtered = Maps.filterKeys(map,new Predicate<String>(){
//predicate methods
});
return filtered.values();
}
Can't be done in a single operation
You may want to try to iterate the keys and use the ones that contain your desired string.
The only solution I can see (I'm not a Java expert) is to iterate over the keys and match each one against a regular expression. If a key matches, you put the matching key-value pair into the hashtable that will be returned.
If you can somehow reduce the problem to searching by prefix, you might find a NavigableMap helpful.
It may be interesting for you to look through this question: Fuzzy string search library in Java
Also take a look at Lucene (answer number two).

Properties in java - can we have comma-separated keys with single value?

I want to have multiple keys (>1) for a single value in a properties file in my Java application. One simple way of doing this is to define each key on a separate line in the properties file and assign the same value to all these keys. This approach increases the maintenance burden of the properties file. The other way (which I think could be a smart way) is to define comma-separated keys with the value on a single line, e.g.
key1,key2,key3=value
java.util.Properties doesn't support this out of the box. Has anybody done a similar thing before? I did google but didn't find anything.
--manish
I'm not aware of an existing solution, but it should be quite straightforward to implement:
String key = "key1,key2,key3", val = "value";
Map<String, String> map = new HashMap<String, String>();
for(String k : key.split(",")) map.put(k, val);
System.out.println(map);
One of the nice things about properties files is that they are simple. No complex syntax to learn, and they are easy on the eye.
Want to know what the value of the property foo is? Quickly scan the left column until you see "foo".
Personally, I would find it confusing if I saw a properties file like that.
If that's what you really want, it should be simple to implement. A quick first stab might look like this:
Open file
For each line:
trim() whitespace
If the line is empty or starts with a #, move on
Split on "=" (with limit set to 2), leaving you with key and value
Split key on ","
For each key, trim() it and add it to the map, along with the trim()'d value
That's it.
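The steps above translate fairly directly into Java. This is a first-stab sketch with invented names; it takes the lines as a List (for a real file they could come from Files.readAllLines), and it silently skips lines without an "=", which is a choice the steps above do not cover.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultiKeyParser {
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            // Limit 2: only the first "=" separates keys from value.
            String[] parts = line.split("=", 2);
            if (parts.length < 2) {
                continue; // assumption: ignore lines that have no "="
            }
            String value = parts[1].trim();
            for (String key : parts[0].split(",")) {
                map.put(key.trim(), value);
            }
        }
        return map;
    }
}
```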
Since java.util.Properties extends java.util.Hashtable, you could use Properties to load the data, then post-process the data.
The advantage to using java.util.Properties to load the data instead of rolling your own is that the syntax for properties is actually fairly robust, already supporting many of the useful features you might end up having to re-implement (such as splitting values across multiple lines, escapes, etc.).
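A sketch of that load-then-post-process approach, with invented class and method names. One caveat to be aware of: Properties ends a key at the first unescaped whitespace, "=" or ":", so the comma-separated keys must be written without spaces (key1,key2,key3=value) for this to work.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class MultiKeyProperties {
    static Map<String, String> load(Reader source) {
        Properties props = new Properties();
        try {
            // Properties handles comments, escapes and line continuations.
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Post-process: split each loaded key on "," and copy the value.
        Map<String, String> expanded = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            String value = props.getProperty(name);
            for (String key : name.split(",")) {
                expanded.put(key.trim(), value);
            }
        }
        return expanded;
    }
}
```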
