Java String array parsing and getting data - java

My input string is:
{phone=333-333-3333, pr_specialist_email=null, sic_code=2391, status=ACTIVE, address1=E.BALL Drive, fax=333-888-3315, naics_code=325220, client_id=862222, bus_name=ENTERPRISES, address2=null, contact=BYRON BUEGE}
The number of keys and values in the string will grow over time.
I want to get the value for each key, i.e. myString.get("phone") should return 333-333-3333.
I am using Java 1.7. Are there any tools I can use to parse the data and get the values?
Some of my input has values like:
{phone=000000002,Desc="Light PROPERTITES, LCC", Address1="C/O ABC RICHARD",Address2="6508 THOUSAND OAKS BLVD.,",Adress3="SUITE 1120",city=MEMPHIS,state=TN,name=,dob=,DNE=,}
Splitting on the comma separator doesn't work here, because some quoted values contain commas.

Here is a simple function that will do exactly what you want. It takes your string as input and returns a HashMap containing all the keys and values.
private HashMap<String, String> getKeyValueMap(String str) {
// Trim the curly ({}) brackets
// (trim first, so that the length used below is that of the trimmed string)
str = str.trim();
str = str.substring(1, str.length() - 1);
// Split all the key-values tuples
String[] split = str.split(",");
String[] keyValue;
HashMap<String, String> map = new HashMap<String, String>();
for (String tuple : split) {
// Separate the key from the value (limit 2 keeps any '=' inside the value) and put them in the HashMap
keyValue = tuple.split("=", 2);
map.put(keyValue[0].trim(), keyValue[1].trim());
}
// Return the HashMap with all the key-value combinations
return map;
}
Note: This will not work if there is ever a ',' character in a key or value, or a '=' character in a key.
To get any value, all you have to do is:
HashMap<String, String> map = getKeyValueMap(...);
String value = map.get(key);
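For input like the second sample in the question, where quoted values contain commas, a plain split(",") is not enough. The following is a minimal sketch of a quote-aware scanner, not a full parser. The class name QuotedKeyValueParser and its methods are hypothetical, and it assumes that double quotes are never escaped inside a value and that keys contain no '='. It sticks to Java 7 syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class QuotedKeyValueParser {

    // Parses "{k1=v1, k2="v, 2", k3=}" into an insertion-ordered map.
    static Map<String, String> parse(String input) {
        String body = input.trim();
        // Strip the surrounding curly braces if both are present
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<String, String> map = new LinkedHashMap<String, String>();
        StringBuilder pair = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '"') {
                // Assumption: quotes are never escaped, so every quote toggles the state
                inQuotes = !inQuotes;
                pair.append(c);
            } else if (c == ',' && !inQuotes) {
                // A comma outside quotes ends the current key=value pair
                addPair(pair.toString(), map);
                pair.setLength(0);
            } else {
                pair.append(c);
            }
        }
        addPair(pair.toString(), map);
        return map;
    }

    private static void addPair(String pair, Map<String, String> map) {
        int eq = pair.indexOf('=');
        if (eq < 0) {
            return; // skip fragments without '=', e.g. the empty one after a trailing comma
        }
        String key = pair.substring(0, eq).trim();
        String value = pair.substring(eq + 1).trim();
        // Drop one pair of surrounding double quotes, if present
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        map.put(key, value);
    }
}
```

With this sketch, QuotedKeyValueParser.parse(input).get("Desc") on the second sample would give Light PROPERTITES, LCC, and keys with nothing after the '=' (such as name) map to an empty string. A literal null in the input stays the four-character string "null".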

You can write a simple parser yourself. I'll exclude error checking in this code for brevity.
You should first remove the { and } characters, then split by ', ' and split each resulting string by =. Finally, add the results to a map.
String input = ...;
Map<String, String> map = new HashMap<>();
input = input.substring(1, input.length() - 1);
String elements[] = input.split(", ");
for(String elem : elements)
{
String values[] = elem.split("=");
map.put(values[0].trim(), values[1].trim());
}
Then, to retrieve a value, just do
String value = map.get("YOURKEY");

You can use the MapSplitter from Guava (Google Core Libraries for Java) to do the job.
First remove the curly braces using the substring method, then use the code below.
Map<String, String> splitKeyValues = Splitter.on(",")
.omitEmptyStrings()
.trimResults()
.withKeyValueSeparator("=")
.split(stringToSplit);

Related

Java1.8 Split string into key-value pairs

I have a string like this, but much longer:
String data = "created:2022-03-16T07:10:26.135Z,timestamp:2022-03-16T07:10:26.087Z,city:Bangalore,Country:Ind";
Here : separates a key from its value, while , separates the pairs. I want to add the key-value pairs to a HashMap. I am expecting this output:
{created=2022-03-16T07:10:26.135Z,timestamp=2022-03-16T07:10:26.087Z,city=Bangalore,Country=Ind}
I tried multiple ways, but I keep getting this:
{timestamp=2022-03-16T07, created=2022-03-16T07}
Based on the information provided, here is one way to do it. It requires both splitting into sections and limiting the number of pieces the second split produces.
String data = "created:2022-03-16T07:10:26.135Z,timestamp:2022-03-16T07:10:26.087Z,city:Bangalore,Country:Ind";
Map<String, String> map =
Arrays.stream(data.split(","))
.map(str -> str.split(":", 2))
.collect(Collectors.toMap(a -> a[0], a -> a[1]));
map.entrySet().forEach(System.out::println);
See this code run live at IdeOne.com.
city=Bangalore
created=2022-03-16T07:10:26.135Z
Country=Ind
timestamp=2022-03-16T07:10:26.087Z
As I said in the comments, you can't use a single map because of the duplicate keys. You may want to consider a class as follows to hold the information
class CityData {
private String created; // or a ZonedDateTime instance
private String timeStamp;// or a ZonedDateTime instance
private String city;
private String country;
// getters and setters
}
You could then group all the cities of a given country for which you have data in a map as follows:
Map<String, List<CityData>> where the key is the country.
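As a rough illustration of that grouping, here is a minimal sketch using Collectors.groupingBy (Java 8+). The wrapper class CityGrouping and the method byCountry are hypothetical names, and the CityData below is a cut-down stand-in that keeps only the city and country fields.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class, for illustration only.
final class CityGrouping {

    // Cut-down stand-in for the CityData class above (timestamps omitted)
    static final class CityData {
        private final String city;
        private final String country;

        CityData(String city, String country) {
            this.city = city;
            this.country = country;
        }

        String getCity() { return city; }
        String getCountry() { return country; }
    }

    // Groups the records by country; each map value lists that country's records
    static Map<String, List<CityData>> byCountry(List<CityData> records) {
        return records.stream()
                .collect(Collectors.groupingBy(CityData::getCountry));
    }
}
```

Calling CityGrouping.byCountry(records).get("Ind") would then return every record whose country is Ind, so repeated keys across records no longer overwrite each other.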
var data="created:2022-03-16T07:10:26.135Z,timestamp:2022-03-16T07:10:26.087Z";
var split = data.split(","); // splitting created and timestamp
var created = split[0].substring(8); // 8 is size of 'created:'
var timestamp = split[1].substring(10); // 10 is size of 'timestamp:'
Map<String, String> result = new HashMap<>();
result.put("created", created);
result.put("timestamp", timestamp);
output:
{created=2022-03-16T07:10:26.135Z, timestamp=2022-03-16T07:10:26.087Z}
You need to split the data on commas and iterate over the pieces, splitting each piece on the colon with a limit of 2 (so that only the first colon separates the key from the value), and store the result in a Map. If you want to preserve the insertion order, use a LinkedHashMap.
Map<String, String> map = new LinkedHashMap<>();
String data = "created:2022-03-16T07:10:26.135Z,timestamp:2022-03-16T07:10:26.087Z,city:Bangalore,Country:Ind";
String[] split = data.split(",");
for (String str: split) {
String[] pair = str.split(":", 2);
map.put(pair[0],pair[1]);
}
System.out.println(map);
Output: {created=2022-03-16T07:10:26.135Z, timestamp=2022-03-16T07:10:26.087Z, city=Bangalore, Country=Ind}

Replace a map of values in string

Let's say I have a String text = "abc" and I want to replace a map of values, eg:
a->b
b->c
c->a
How would you go for it?
Because obviously:
map.entrySet().forEach(el -> text = text.replaceAll(el.getKey(), el.getValue()))
won't work, since the second replacement will overwrite also the first replacement (and at the end you won't get bca)
So how would you avoid this "replacement of the previous replacement"?
I saw this answer, but I am hoping for a more concise and simpler solution (ideally without external Apache packages).
By the way, the strings to replace can also be longer than one character.
I came up with this solution using Java streams. Because it splits the text into single characters, it only handles one-character keys.
String text = "abc";
Map<String, String> replaceMap = new HashMap<>();
replaceMap.put("a", "b");
replaceMap.put("b", "c");
replaceMap.put("c", "a");
System.out.println("Text = " + text);
text = Arrays.stream(text.split("")).map(x -> {
String replacement = replaceMap.get(x);
if (replacement != null) {
return replacement;
} else {
return x;
}
}).collect(Collectors.joining(""));
System.out.println("Processed Text = " + text);
Output
Text = abc
Processed Text = bca
This is a problem I'd normally handle with regex replacement. The code for that in Java is a bit verbose, but this should work:
String text = "abc";
Map<String, String> map = new HashMap<>();
map.put("a", "b");
map.put("b", "c");
map.put("c", "a");
String regex = map.keySet()
.stream()
.map(s -> Pattern.quote(s))
.collect(Collectors.joining("|"));
String output = Pattern.compile(regex)
.matcher(text)
.replaceAll((m) -> {
String s = m.group();
String r = map.get(s);
return r != null ? r : s;
});
System.out.println(output);
// bca
It's relatively straightforward, if a little verbose because Java. First, create a regex expression that will accept any of the keys in the map (using Pattern.quote() to sanitize them), and then use lambda replacement to pluck the appropriate replacement from the map whenever an instance is found.
The performance-intensive part is just compiling the regex in the first place; the replacement itself should make only one pass through the string.
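This requires Java 9+, because the Matcher.replaceAll overload that takes a function was added in Java 9. Note that the string returned by the function is still interpreted as a replacement pattern, so wrap it in Matcher.quoteReplacement() if the values may contain $ or \.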
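On Java 8, where the function-based replaceAll is not available, the same single-pass idea can be sketched with appendReplacement and appendTail. This is an illustrative variant rather than the answer's own code: the class name MapReplacer is hypothetical, and it assumes that the map has no empty keys and no null values. Keys are sorted longest first so that a longer key wins over a shorter key that is its prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class MapReplacer {

    // Replaces every occurrence of a map key in text with its value, in one pass.
    // Assumes non-empty keys and non-null values.
    static String replaceAll(String text, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return text;
        }
        List<String> keys = new ArrayList<>(replacements.keySet());
        // Longest keys first, so "ab" is preferred over "a" at the same position
        keys.sort((x, y) -> Integer.compare(y.length(), x.length()));

        StringBuilder regex = new StringBuilder();
        for (String key : keys) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(key)); // treat each key as a literal
        }

        Matcher m = Pattern.compile(regex.toString()).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because already replaced text is never scanned again, the a->b, b->c, c->a example turns "abc" into "bca" rather than collapsing the replacements into each other.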
From Java 8 onwards, String has a chars() method that returns an IntStream of the string's char values. You can cast each value to char and look it up in your map. This only works when every key is a single character.
If your map is a String to String map, you could use the following (getOrDefault keeps characters that have no entry, instead of producing "null"):
text = text.chars().mapToObj(el -> String.valueOf((char) el))
.map(s -> map.getOrDefault(s, s))
.collect(Collectors.joining(""));
If your map is Character to Character, look up the char directly and convert the result back to a String so that it can be joined:
text = text.chars().mapToObj(el -> String.valueOf(map.getOrDefault((char) el, (char) el))).collect(Collectors.joining(""));

Fast way to create a word-occurrence counting vector

I have a HashMap<String, Integer> vocabulary, containing words and their weight (not important, only the string is important here):
vocabulary = ["this movie"=5, "great"=2, "bad"=2, ...]
and a tokenized string as a list:
String str = "this movie is great";
List<String> tokens = tokenize(str) // tokens = ["this", "movie", "is", "great", "this movie", "is great", ...]
Now I need a fast way to create a vector for this tokenized string, that counts for every entry of the vocabulary, the number of occurrences of this word within the tokenized string
HashMap<String, Integer> vec = new HashMap();
Iterator it = vocabulary.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pair = (Map.Entry) it.next();
String word = (String) pair.getKey();
int count = 0;
for (String w : tokens) {
if (w.equals(word)) {
count += 1;
}
}
vec.put(word, count);
}
So, vec should be ["this movie"=1, "great"=1, "bad"=0]
Is there a better performing way to do this? I'm having performance issues in a larger context and assumed that the issue must be here, since vocabulary has approximately 300'000 entries. A normal tokenized text contains around 100 words.
Is it a problem that vocabulary is a hashMap?
Count the number of occurrences of each element of tokens:
Map<String, Long> tokensCount = tokens.stream().collect(
Collectors.groupingBy(Function.identity(), Collectors.counting()));
Then just look up from this map instead of your inner loop:
count = tokensCount.getOrDefault(word, 0L).intValue();
This is faster because the lookup in the map is O(1), whereas iterating the tokens looking for equal elements is O(# tokens).
Also note that you aren't using pair other than to get its key, so you can iterate vocabulary.keySet(), rather than vocabulary.entrySet().
Additionally, if you weren't using a raw iterator, you wouldn't need the explicit casts:
Iterator<Map.Entry<String, Integer>> it = ...
Edit, now that you've added the relative sizes of the two collections:
You can simply iterate tokens, and see if vocabulary contains that:
Map<String, Integer> vec = new HashMap<>();
for (String token : tokens) {
if (vocabulary.containsKey(token)) {
vec.merge(token, 1, Integer::sum);
}
}
If vocabulary is already a HashMap, there is no need to iterate over it. Simply use the containsKey method, which for a HashMap runs in constant time (O(1)) on average, so you only have to iterate over the token list.
for(String w : tokens) {
if(vocabulary.containsKey(w)) {
// merge also handles the first occurrence, where vec.get(w) would return null
vec.merge(w, 1, Integer::sum);
}
}
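Putting the pieces together, a self-contained version of this token-driven approach might look like the sketch below. The class name OccurrenceVector and the method count are hypothetical. It produces a sparse vector: vocabulary words that never occur in the tokens are simply absent, and can be read as 0 with getOrDefault.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class OccurrenceVector {

    // Counts how often each vocabulary word occurs in tokens.
    // Runs in O(#tokens) instead of O(#vocabulary * #tokens).
    static Map<String, Integer> count(Map<String, Integer> vocabulary, List<String> tokens) {
        Map<String, Integer> vec = new HashMap<>();
        for (String token : tokens) {
            if (vocabulary.containsKey(token)) {
                // Inserts 1 on the first occurrence, otherwise adds 1 to the old count
                vec.merge(token, 1, Integer::sum);
            }
        }
        return vec;
    }
}
```

A lookup such as vec.getOrDefault("bad", 0) then yields 0 for words that were not seen, without storing 300,000 zero entries per text.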

Java: Replace in TreeMap

I have a Treemap:
TreeMap<String, Integer> map = new TreeMap<String, Integer>();
It counts words that are put in, for example if I insert:
"Hi hello hi"
It prints:
{Hi=2, Hello=1}
I want to replace that "," with a "\n", but I did not understand the methods in Java library. Is this possible to do? And is it possible to convert the whole TreeMap to a String?
When you print the map to System.out, the map's toString method is used to produce the text that appears on the console.
You could either string replace the comma with a newline like this:
String stringRepresentation = map.toString().replace(", ", "\n");
This might, however, cause problems when a key in the map contains a comma.
Or you could create a function to produce the desired string format:
public String mapToMyString(Map<String, Integer> map) {
StringBuilder builder = new StringBuilder("{");
for (Map.Entry<String, Integer> entry : map.entrySet()) {
builder.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
}
builder.append('}');
return builder.toString();
}
String stringRepresentation = mapToMyString(map);
Guava has a lot of useful methods. Look at Joiner.MapJoiner
Joiner.MapJoiner joiner = Joiner.on('\n').withKeyValueSeparator("=");
System.out.println(joiner.join(map));
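Without Guava, a similar result can be sketched on Java 8+ with streams and Collectors.joining, which accepts a separator plus a prefix and suffix. The class name MapLines and the method format are hypothetical.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class MapLines {

    // Renders the map as "{k1=v1\nk2=v2}", one entry per line, in the map's iteration order
    static String format(Map<String, Integer> map) {
        return map.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("\n", "{", "}"));
    }
}
```

Unlike replacing ", " in the toString output, this does not break when a key itself contains a comma, and for a TreeMap the lines come out in key order.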

Java String not replacing with hashmap values

for (Entry<String, String> entry : map.entrySet()) {
String delimiter = "**";
result = result.replace(delimiter + entry.getKey() + delimiter, entry.getValue());
}
result is my string, whose placeholders should be replaced by the HashMap values.
However, the string (the result variable) comes back unchanged; no value is replaced.
Does anyone have suggestions?
From comment
My hashmap contains,
HashMap<String, String> map = new HashMap<String, String>();
map.put("Rid", serviceBooking.getId().toString());
map.put("Rname", customer.getName());
map.put("Rnic", "");
The algorithm is OK. Here's a working, executable example:
// sample input
String input = "abcd **Rid** efgh";
// a small map
Map<String, String> map = new HashMap<String, String>();
map.put("Rid","VALUE");
// the loop that replaces the **Rid** substring
for (Map.Entry<String,String> entry:map.entrySet()){
input = input.replace("**"+entry.getKey()+"**", entry.getValue());
System.out.println(input);
}
It prints
abcd VALUE efgh
Either your HashMap is empty, or the original string doesn't contain anything corresponding to the keys in the map, or you wrote two asterisks where you meant one, or you didn't escape it/them when you needed to, or ...
Impossible to improve on that without seeing the original string and the contents of the map.
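To illustrate the escaping point: String.replace treats "**" literally, whereas String.replaceAll interprets its first argument as a regular expression, in which a leading * is a syntax error. The sketch below shows the escaped regex form. The class name PlaceholderFiller and the method fill are hypothetical, and the map values are assumed to be non-null.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class PlaceholderFiller {

    // Replaces every **key** placeholder in template with the matching map value.
    // Assumes the map contains no null values.
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            String placeholder = "**" + entry.getKey() + "**";
            // Strings are immutable, so the result must be reassigned on each pass.
            // Pattern.quote makes the asterisks literal; quoteReplacement does the
            // same for '$' and '\' in the value.
            result = result.replaceAll(Pattern.quote(placeholder),
                    Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }
}
```

For plain literal placeholders like these, result.replace(placeholder, value) as in the question is the simpler choice; the quoting only matters once replaceAll is involved.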
