Convert delimited String to delimited Long as per String to Long mapping - java

I have a string :
My name is amit
and a mapping :
My -> 1121
name -> 1122
is -> 1123
amit -> 1124
I want to get back :
1121 1122 1123 1124
where every word is mapped to its corresponding long value which is held in a HashMap.
Here is my version:
public String delimtedStringToLong(String input, String delimiter, Map<String, Long> map) {
String[] arr = input.split(delimiter);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < arr.length; i++) {
sb.append(String.valueOf(map.get(arr[i])) + delimiter);
}
return sb.toString();
}
I am doing this in Java 8. Is there a better approach? Thanks!

I'd split the input string, stream it through a mapping function that takes the value from the map and then collect it back:
String input = "My name is amit";
Map<String, Long> map = new HashMap<>();
map.put("My", 1121L);
map.put("name", 1122L);
map.put("is", 1123L);
map.put("amit", 1124L);
String output =
    Arrays.stream(input.split(" "))
        .map(s -> String.valueOf(map.get(s)))
        // join with the delimiter so the result is "1121 1122 1123 1124"
        .collect(Collectors.joining(" "));
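The question's method takes the delimiter as a parameter, so here is a minimal sketch of the same stream approach wrapped in a method. The class name `DelimitedMapper` and method name `mapTokens` are made up for illustration. One assumption worth noting: `String.split` treats its argument as a regular expression, so the sketch quotes the delimiter to make characters such as `|` behave literally.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class DelimitedMapper {

    // Hypothetical helper: maps each token to its Long value and rejoins with the same delimiter.
    static String mapTokens(String input, String delimiter, Map<String, Long> map) {
        // Quote the delimiter because String.split interprets its argument as a regex.
        return Arrays.stream(input.split(Pattern.quote(delimiter)))
                // A token without a mapping becomes the text "null", as in the question's code.
                .map(token -> String.valueOf(map.get(token)))
                // joining(delimiter) puts the delimiter between items only, so there is no trailing one.
                .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        Map<String, Long> map = new HashMap<>();
        map.put("My", 1121L);
        map.put("name", 1122L);
        map.put("is", 1123L);
        map.put("amit", 1124L);
        System.out.println(mapTokens("My name is amit", " ", map));
    }
}
```

Unlike the loop in the question, this does not leave a trailing delimiter at the end of the result.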

Related

Get,Put key and values from nested hashmap

I want to create a nested HashMap which returns the frequency of terms among multiple files. Like,
Map<String, Map<String, Integer>> wordToDocumentMap=new HashMap<>();
I have been able to return the number of times a term appears in a file.
// Suppose str holds the contents of a file, e.g. a.java
String str = "Wikipedia is a free online encyclopedia, created and edited by "
    + "volunteers around the world.";
// The query string
String query = "edited Wikipedia volunteers";
// Split the given string and the query string on space
String[] strArr = str.split("\\s+");
String[] queryArr = query.split("\\s+");
// Map to hold the frequency of each word of query in the string
Map<String, Integer> map = new HashMap<>();
for (String q : queryArr) {
for (String s : strArr) {
if (q.equals(s)) {
map.put(q, map.getOrDefault(q, 0) + 1);
}
}
}
// Display the map
System.out.println(map);
My code counts the frequency of each query term for a single input only. What I want is to map each query term and its frequency to the names of the files it appears in. I have searched the web but have not found a solution that fits my case. Any help would be appreciated!
I hope I'm understanding you correctly.
What you want is to be able to read in a list of files and map the file name to the map you create in the code above. So let's start with your code and let's turn it into a function:
public Map<String, Integer> createFreqMap(String str, String query) {
    // Split the given string and the query string on whitespace
    String[] strArr = str.split("\\s+");
    String[] queryArr = query.split("\\s+");
    // Map to hold the frequency of each word of query in the string
    Map<String, Integer> map = new HashMap<>();
    for (String q : queryArr) {
        for (String s : strArr) {
            if (q.equals(s)) {
                map.put(q, map.getOrDefault(q, 0) + 1);
            }
        }
    }
    // Display the map
    System.out.println(map);
    return map;
}
OK, so now you have a nifty function that builds a map from a string and a query.
Next you'll want a way to read a file into a string.
There are a bunch of ways to do this. You can look here for some ways that work for different Java versions: https://stackoverflow.com/a/326440/9789673
Let's go with this one (assuming Java 11 or later, where Files.readString was added):
String content = Files.readString(path, StandardCharsets.US_ASCII);
Where path is the path to the file you want.
Now we can put it all together:
String[] paths = {"this.txt", "that.txt"};
Map<String, Map<String, Integer>> output = new HashMap<>();
String query = "edited Wikipedia volunteers"; //String query = "hello";
for (int i = 0; i < paths.length; i++) {
    // Files.readString takes a Path (not a String) and throws IOException
    String content = Files.readString(Paths.get(paths[i]), StandardCharsets.US_ASCII);
    output.put(paths[i], createFreqMap(content, query));
}
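The loop above produces a map keyed by file name. The question's `wordToDocumentMap` reads as if the outer key should be the term, so here is a rough sketch of that inverted shape. It is not the answer's code: the class name `TermFrequencyIndex`, the method `index`, and the in-memory `documents` map (file name to contents, standing in for real file reads) are all assumptions made to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class TermFrequencyIndex {

    // Builds term -> (document name -> count) for the whitespace-separated query terms.
    static Map<String, Map<String, Integer>> index(Map<String, String> documents, String query) {
        Set<String> terms = new HashSet<>(Arrays.asList(query.split("\\s+")));
        Map<String, Map<String, Integer>> wordToDocumentMap = new HashMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (String word : doc.getValue().split("\\s+")) {
                if (terms.contains(word)) {
                    // Create the inner map on first sight of the term, then bump this document's count.
                    wordToDocumentMap
                            .computeIfAbsent(word, k -> new HashMap<>())
                            .merge(doc.getKey(), 1, Integer::sum);
                }
            }
        }
        return wordToDocumentMap;
    }

    public static void main(String[] args) {
        // Hypothetical documents; in practice the contents would come from Files.readString.
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("a.txt", "Wikipedia is a free online encyclopedia, created and edited by volunteers around the world.");
        documents.put("b.txt", "edited edited text");
        System.out.println(index(documents, "edited Wikipedia volunteers"));
    }
}
```

Matching is exact and case-sensitive, as in the original loop, so a token such as "world." with trailing punctuation will not match the term "world".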

Find duplicates in first column and take average based on third column

I need to compute the average time for each ID.
Sample data
T1,2020-01-16,11:16pm,start
T2,2020-01-16,11:18pm,start
T1,2020-01-16,11:20pm,end
T2,2020-01-16,11:23pm,end
I have written code that keeps the first and third columns in a map, something like
T1, 11:16pm
but I was not able to compute anything once the values were in the map. I also tried keeping them in a string array and splitting line by line, but I ran into the same problem with that approach.
public class AverageTimeGenerate {
public static void main(String[] args) throws IOException {
File file = new File("/abc.txt");
try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
while (true) {
String line = reader.readLine();
if (line == null) {
break;
}
ArrayList<String> list = new ArrayList<>();
String[] tokens = line.split(",");
for (String s: tokens) {
list.add(s);
}
Map<String, String> map = new HashMap<>();
String[] data = line.split(",");
String ids= data[0];
String dates = data[1];
String transactionTime = data[2];
String transactionStartAndEndTime = data[3];
String[] transactionIds = ids.split("/n");
String[] timeOfEachTransaction = transactionTime.split("/n");
for(String id : transactionIds) {
for(String time : timeOfEachTransaction) {
map.put(id, time);
}
}
}
}
}
}
Can anyone suggest whether it is possible to find duplicates in a map and compute values from them, or whether there is another way to do this, so that the output looks like
T1 2:00
T2 5:00
I don't know what your logic for computing the average time is, but you can collect the data for each transaction in a map. The structure can look like this: the transaction ID is the key and all of its times go into a list.
Map<String,List<String>> map = new HashMap<String,List<String>>();
You can do it like this:
Map<String, String> result = Files.lines(Paths.get("abc.txt"))
.map(line -> line.split(","))
.map(arr -> {
try {
return new AbstractMap.SimpleEntry<>(arr[0],
new SimpleDateFormat("HH:mm").parse(arr[2]));
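} catch (ParseException e) {
// Fail fast: a null entry would cause a NullPointerException in groupingBy
throw new IllegalArgumentException("Unparseable time: " + arr[2], e);
}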
}).collect(Collectors.groupingBy(Map.Entry::getKey,
Collectors.collectingAndThen(Collectors
.mapping(Map.Entry::getValue, Collectors.toList()),
list -> toStringTime.apply(convert.apply(list)))));
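For simplicity, I've declared two functions; they must be defined before the stream pipeline that uses them.
Function<List<Date>, Long> convert = list -> (list.get(1).getTime() - list.get(0).getTime()) / 2;
// %02d pads the seconds so two minutes prints as "2:00" rather than "2:0"
Function<Long, String> toStringTime = l -> String.format("%d:%02d", l / 60000, l % 60000 / 1000);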
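As an alternative, here is a minimal sketch using `java.time` instead of `SimpleDateFormat`. It is not the answer's code and it rests on several assumptions: each ID has matching `start` and `end` rows in that order, no transaction crosses midnight, and "average" means the mean of the completed start-to-end durations for an ID. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class AverageTransactionTime {

    // Parses times such as "11:16pm"; case-insensitive so a lowercase "pm" is accepted.
    private static final DateTimeFormatter TIME = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("h:mma")
            .toFormatter(Locale.ENGLISH);

    // Returns id -> mean duration of its completed start/end pairs.
    static Map<String, Duration> averageById(List<String> lines) {
        Map<String, LocalTime> openStarts = new HashMap<>();
        Map<String, long[]> totals = new TreeMap<>(); // id -> {total seconds, completed pairs}
        for (String line : lines) {
            String[] f = line.split(",");
            String id = f[0].trim();
            LocalTime time = LocalTime.parse(f[2].trim(), TIME);
            if ("start".equals(f[3].trim())) {
                openStarts.put(id, time);
            } else {
                LocalTime start = openStarts.remove(id);
                if (start != null) { // ignore an "end" row that has no matching "start"
                    long[] t = totals.computeIfAbsent(id, k -> new long[2]);
                    t[0] += Duration.between(start, time).getSeconds();
                    t[1]++;
                }
            }
        }
        Map<String, Duration> averages = new TreeMap<>();
        totals.forEach((id, t) -> averages.put(id, Duration.ofSeconds(t[0] / t[1])));
        return averages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "T1,2020-01-16,11:16pm,start",
                "T2,2020-01-16,11:18pm,start",
                "T1,2020-01-16,11:20pm,end",
                "T2,2020-01-16,11:23pm,end");
        averageById(lines).forEach((id, d) ->
                System.out.printf("%s %d:%02d%n", id, d.toMinutes(), d.getSeconds() % 60));
    }
}
```

With the sample rows from the question this yields four minutes for T1 and five minutes for T2, since each ID has a single start/end pair. If a different definition of "average" is intended, only the final division needs to change.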

Processing the particular key value pair in set of key value pairs in java

The string can be
"accountno=18&username=abc&password=1236" or "username=abc&accountno=18&password=1236" or the accountno can be present anywhere in the string.
I need to get the accountno value from this string as a key-value pair. I used split on "&" but I'm unable to get the result.
import java.util.regex.*;
public class RegexStrings {
public static void main(String[] args) {
String input = "accountno=18&username=abc&password=1236";
String exten = null;
Matcher m = Pattern.compile("^accountno: (.&?)$", Pattern.MULTILINE).matcher(input);
if (m.find()) {
exten = m.group(1);
}
System.out.println("AccountNo: "+exten);
}
}
How can I get the accountno value from the string above as a key-value pair in Java?
You may handle this by first splitting on & to isolate each key/value pair, then iterate that collection and populate a map:
String input = "accountno=18&username=abc&password=1236";
String[] parts = input.split("&");
Map<String, String> map = new HashMap<>();
for (String part : parts) {
map.put(part.split("=")[0], part.split("=")[1]);
}
System.out.println("account number is: " + map.get("accountno"));
This prints:
account number is: 18
Using some simple tools, like String.split and Map, you can easily do that:
Map<String, String> parse(String frase){
    Map<String, String> map = new TreeMap<>();
    String[] words = frase.split("&");
    for(String word : words){
        // split returns an array: [key, value]
        String[] keyValuePair = word.split("=");
        String key = keyValuePair[0];
        String value = keyValuePair[1];
        map.put(key, value);
    }
    return map;
}
To get a specific value, like "accountno", just retrieve that key: map.get("accountno")
The same approach mentioned by @vinicius can be written with Java 8 streams:
Map<String, String> map = Arrays.stream(input.split("&"))
.map(str -> str.split("="))
.collect(Collectors.toMap(s -> s[0],s -> s[1]));
//To retrieve accountno from map
System.out.println(map.get("accountno"));
As you said, "the accountno can be present anywhere in the string", so a regex can also pull the value out directly:
String input = "accountno=18&username=abc&password=1236";
//input = "username=abc&accountno=19&password=1236";
Matcher m = Pattern.compile("&?accountno\\s*=\\s*(\\w+)&?").matcher(input);
if (m.find()) {
System.out.println("Account no " + m.group(1));
}
This would work even when accountno is somewhere in the middle of the string
Output:
Account no 18
You can try out regex here:
https://regex101.com/r/nOHmzc/2
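The split-based answers above index `[1]` after splitting on `=`, which throws `ArrayIndexOutOfBoundsException` for a pair with no `=`, and they truncate a value that itself contains `=`. A slightly more defensive sketch is shown below. The class name `QueryStringParser` is made up, and skipping malformed pairs is an assumption about the desired behaviour, not something the question specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringParser {

    // Parses "k1=v1&k2=v2" into a map, preserving the order in which keys appear.
    static Map<String, String> parse(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String pair : input.split("&")) {
            // Limit 2 keeps any further '=' characters inside the value.
            String[] kv = pair.split("=", 2);
            // Assumption: pairs without '=' or with an empty key are skipped rather than rejected.
            if (kv.length == 2 && !kv[0].isEmpty()) {
                map.put(kv[0], kv[1]);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = parse("username=abc&accountno=18&password=1236");
        System.out.println("account number is: " + map.get("accountno"));
    }
}
```

Because the lookup goes through the map, the position of `accountno` in the string does not matter.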

unique number of words in a propertyfile

What is the best way to count the number of unique words in a properties file (values only) in Java 1.8?
for example entries may be:
key1=This is my value for error {0}
key2=This is success message.Great.
Output should be 10 (including {0})
What I tried
property.load(in);
Enumeration em = property.keys();
while (em.hasMoreElements()) {
String str = (String) em.nextElement();
completeString =completeString+property.get(str);
}
Set<String> myset=new HashSet<>();
String s[]=completeString.split("[ .]");
for(int i=1;i<s.length;i++){
myset.add(s[i]);
}
for (String sss: myset){
System.out.println(sss);
}
System.out.println(myset.size());
Is there a simpler way in Java 1.8?
Data used :
I used a dummy Properties
Properties prop = new Properties();
prop.put("A", "This is my value for error {0}");
prop.put("B", "This is success message.Great.");
Good old Java:
Using the same logic you used, you can simply split the value of each property during the iteration:
Set<String> set = new HashSet<>();
Enumeration<?> em = property.keys();
while (em.hasMoreElements()) {
    String key = (String) em.nextElement();
    // Split the property's value, not its key
    for (String s : property.getProperty(key).split("[ .]")) {
        set.add(s);
    }
}
In Java 8 - Stream API :
Define the pattern to split each "word".
Pattern pattern = Pattern.compile("[ .]");
Now, first let's get our Stream<String> for our values.
You can either take a List<Object> :
Stream<String> stream =
//Create a `List<Object>` from the enumeration and stream it
Collections.list(prop.elements()).stream()
//Convert in String
.map(o -> (String)o);
Or Stream the Map.Entry of the Properties :
Stream<String> stream =
prop.entrySet().stream() //Iterate the Map.Entry<Object,Object>
.map(e -> (String)e.getValue());
(Not sure which is more efficient)
Then, all you have to do is to flatMap the Stream to split each String into new Stream<String>.
stream.flatMap(pattern::splitAsStream) //split based on the pattern define and return a new `Stream<String>`
Then collect the Stream into a Set
.collect(Collectors.toSet()); //collect in a `Set<String>`
The result would be a nice Set printed like:
[Great, success, for, This, {0}, is, my, error, message, value]
Summary :
Set<String> set =
prop.entrySet().stream()
.map(e -> (String)e.getValue())
.flatMap(pattern::splitAsStream) // pattern is the compiled Pattern defined above
.collect(Collectors.toSet());
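Putting the pieces together, a self-contained version might look like the sketch below. The class name `UniqueWords` is made up. The sketch also makes two small changes that are assumptions rather than part of the answer: it splits on runs of separators (`[ .]+`) and filters out empty strings, so inputs like "a. b" do not add an empty "word" to the set.

```java
import java.util.Properties;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class UniqueWords {

    // One or more spaces or dots separate words.
    private static final Pattern SEPARATOR = Pattern.compile("[ .]+");

    // Collects the distinct words found in the values of the given properties.
    static Set<String> uniqueWords(Properties props) {
        return props.stringPropertyNames().stream()
                .map(props::getProperty)            // key -> value
                .flatMap(SEPARATOR::splitAsStream)  // value -> words
                .filter(word -> !word.isEmpty())    // a leading separator yields an empty first token
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Properties prop = new Properties();
        prop.put("A", "This is my value for error {0}");
        prop.put("B", "This is success message.Great.");
        Set<String> words = uniqueWords(prop);
        System.out.println(words);
        System.out.println(words.size());
    }
}
```

`stringPropertyNames()` only returns entries whose key and value are both strings, which avoids the casts used in the snippets above.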

Reading and matching contents of two big files

I have two files each having the same format with approximately 100,000 lines. For each line in file one I am extracting the second component or column and if I find a match in the second column of second file, I extract their third components and combine them, store or output it.
My implementation works, but the program runs extremely slowly: it takes more than an hour to iterate over the files, compare, and output all the results.
I read and store the data of both files in ArrayLists, then iterate over those lists and do the comparison. Below is my code. Is there a performance problem in it, or is this just normal for such an operation?
Note: I was using String.split(), but I understand from another post that StringTokenizer is faster.
public ArrayList<String> match(String file1, String file2) throws IOException{
ArrayList<String> finalOut = new ArrayList<>();
try {
ArrayList<String> data = readGenreDataIntoMemory(file1);
ArrayList<String> data1 = readGenreDataIntoMemory(file2);
StringTokenizer st = null;
for(String line : data){
HashSet<String> genres = new HashSet<>();
boolean sameMovie = false;
String movie2 = "";
st = new StringTokenizer(line, "|");
//String line[] = fline.split("\\|");
String ratingInfo = st.nextToken();
String movie1 = st.nextToken();
String genreInfo = st.nextToken();
if(!genreInfo.equals("null")){
for(String s : genreInfo.split(",")){
genres.add(s);
}
}
StringTokenizer st1 = null;
for(String line1 : data1){
st1 = new StringTokenizer(line1, "|");
st1.nextToken();
movie2 = st1.nextToken();
String genreInfo2= st1.nextToken();
//If the movie name are similar then they should have the same genre
//Update their genres to be the same
if(!genreInfo2.equals("null") && movie1.equals(movie2)){
for(String s : genreInfo2.split(",")){
genres.add(s);
}
sameMovie = true;
break;
}
}
if(sameMovie){
finalOut.add(ratingInfo+""+movie1+""+genres.toString()+"\n");
}else if(sameMovie == false){
finalOut.add(line);
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
return finalOut;
}
I would use the Streams API
String file1 = "files1.txt";
String file2 = "files2.txt";
// get all the lines by movie name for each file.
Map<String, List<String[]>> map = Stream.of(Files.lines(Paths.get(file1)),
Files.lines(Paths.get(file2)))
.flatMap(p -> p)
.parallel()
.map(s -> s.split("[|]", 3))
.collect(Collectors.groupingByConcurrent(sa -> sa[1], Collectors.toList()));
// merge all the genres for each movie.
map.forEach((movie, lines) -> {
Set<String> genres = lines.stream()
.flatMap(l -> Stream.of(l[2].split(",")))
.collect(Collectors.toSet());
System.out.println("movie: " + movie + " genres: " + genres);
});
This has the advantage of being O(n) instead of O(n^2) and it's multi-threaded.
Do a hash join.
Right now you are doing a nested loop join, which is O(n^2); a hash join is amortized O(n).
Put the contents of each file in a hash map, keyed by the field you want to match on (the second field).
Map<String,String> map1 = new HashMap<>();
Map<String,String> map2 = new HashMap<>();
// build map1 from file1 and map2 from file2
Then do the hash join
for(String key1 : map1.keySet()){
if(map2.containsKey(key1)){
// do your thing you found the match
}
}
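Here is a fuller sketch of the hash join, under stated assumptions: each line has the form `rating|movie|genres`, genres are comma-separated or the literal text `null`, and the lines have already been read into lists (for example with `Files.readAllLines`). The class name `GenreHashJoin`, the method `join`, and the `|`-separated output format are illustrative choices, not taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class GenreHashJoin {

    // Merges genres from file2 into file1's lines, matching on the movie name (second field).
    static List<String> join(List<String> file1Lines, List<String> file2Lines) {
        // Build phase: index file2 by movie name, one pass.
        Map<String, String> genresByMovie = new HashMap<>();
        for (String line : file2Lines) {
            String[] f = line.split("\\|", 3);
            if (!"null".equals(f[2])) {
                genresByMovie.putIfAbsent(f[1], f[2]); // keep the first match, like the original break
            }
        }
        // Probe phase: one constant-time lookup per line of file1.
        List<String> out = new ArrayList<>();
        for (String line : file1Lines) {
            String[] f = line.split("\\|", 3);
            String otherGenres = genresByMovie.get(f[1]);
            if (otherGenres == null) {
                out.add(line); // no match: keep the line unchanged
                continue;
            }
            Set<String> genres = new TreeSet<>(); // sorted and free of duplicates
            if (!"null".equals(f[2])) {
                genres.addAll(Arrays.asList(f[2].split(",")));
            }
            genres.addAll(Arrays.asList(otherGenres.split(",")));
            out.add(f[0] + "|" + f[1] + "|" + String.join(",", genres));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("5|Alien|Horror", "3|Up|null", "4|Heat|Crime");
        List<String> file2 = Arrays.asList("1|Alien|SciFi,Horror", "2|Up|Animation", "2|Solo|null");
        join(file1, file2).forEach(System.out::println);
    }
}
```

Only one of the two files needs to go into a map; the other can be streamed line by line, which also halves the memory needed.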
