I currently have a TreeMap of the form TreeMap<String, List<List<String>>>.
I'm trying to write my tree map to an output file so that the values inside each inner list are separated by colons.
Do I need a second for loop to loop through each inner list and format it with String.join(":", elements)?
Or is there a more concise way that keeps it all in a single for loop?
I've tried a few things and my current code is:
File dir = new File(outFolder);
dir.mkdir();
// get the file we're writing to
File outFile = new File(dir, "javaoutput.txt");
// create a writer
try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outFile), "utf-8"))) {
    for (Map.Entry<String, List<List<String>>> entry : allResults.entrySet()) {
        writer.write(entry.getKey() + ", " + entry.getValue().toString().replace("null", ""));
        writer.newLine();
    }
}
Current output:
ANY, [[469, 470], [206, 1013, 1014], [2607, 2608]]
Desired output:
ANY, 469:470, 206:1013:1014, 2607:2608
Any suggestions would be greatly appreciated.
String.join(":", arr) can be used to take the String array and return a colon-separated String. This can then be used with Streams with a Collector to join these strings with a comma-separator, so :
TreeMap<String, String[]> allResults = new TreeMap<>();
allResults.put("a", new String[]{"469", "470"});
allResults.put("b", new String[]{"206", "1013", "1014"});
allResults.put("c", new String[]{"2607", "2608"});
String result = allResults.entrySet().stream()
.map(e -> String.join(":", e.getValue()))
.collect(Collectors.joining(", "));
System.out.println(result);
produces:
469:470, 206:1013:1014, 2607:2608
With a List<List<String>>, you need a stream within a stream, so:
TreeMap<String, List<List<String>>> allResults = new TreeMap<>();
allResults.put("a", Arrays.asList(Arrays.asList("469", "470"), Arrays.asList("206", "1013", "1014"), Arrays.asList("2607", "2608")));
allResults.put("b", Arrays.asList(Arrays.asList("169", "470")));
allResults.put("c", Arrays.asList(Arrays.asList("269", "470")));
String result = allResults.entrySet().stream()
.map(i -> i.getKey() + "," + i.getValue().stream().map(elements -> String.join(":", elements))
.collect(Collectors.joining(", "))
)
.collect(Collectors.joining("\n"));
System.out.println(result);
which produces:
a,469:470, 206:1013:1014, 2607:2608
b,169:470
c,269:470
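To tie this back to the file-writing code in the question, here is a minimal sketch (assuming the same allResults map of type TreeMap<String, List<List<String>>> and the same outFile as above):
try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outFile), "utf-8"))) {
    for (Map.Entry<String, List<List<String>>> entry : allResults.entrySet()) {
        // join each inner list with ':' and the joined lists with ', '
        String joined = entry.getValue().stream()
                .map(inner -> String.join(":", inner))
                .collect(Collectors.joining(", "));
        writer.write(entry.getKey() + ", " + joined);
        writer.newLine();
    }
}
This writes one line per entry in the desired "ANY, 469:470, 206:1013:1014, 2607:2608" format.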
Related
My issue here is that I need to compute the average time for each ID.
Sample data
T1,2020-01-16,11:16pm,start
T2,2020-01-16,11:18pm,start
T1,2020-01-16,11:20pm,end
T2,2020-01-16,11:23pm,end
I have written code that keeps the first column and the third column in a map, something like
T1, 11:16pm
but I was not able to compute values after putting them in the map. I also tried keeping them in a String array and splitting it line by line, but I'm facing the same issue with that approach.
public class AverageTimeGenerate {
    public static void main(String[] args) throws IOException {
        File file = new File("/abc.txt");
        Map<String, String> map = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            while (true) {
                String line = reader.readLine();
                if (line == null) {
                    break;
                }
                String[] data = line.split(",");
                String id = data[0];
                String date = data[1];
                String transactionTime = data[2];
                String transactionStartOrEnd = data[3];
                // this overwrites the start time with the end time for the same id,
                // which is why the values could not be computed afterwards
                map.put(id, transactionTime);
            }
        }
    }
}
Can anyone suggest whether it is possible to find duplicate keys in a map and compute values from them, or is there any other way I can do this, so that the output looks like
T1 2:00
T2 5:00
I don't know what your logic is for computing the average time, but you can save the data for one particular transaction in a map. The map structure can be like this: the transaction id is the key, and all of its times go into an ArrayList.
Map<String,List<String>> map = new HashMap<String,List<String>>();
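A minimal sketch of filling that structure, assuming the comma-separated format of the sample data above:
Map<String, List<String>> map = new HashMap<>();
try (BufferedReader reader = new BufferedReader(new FileReader("/abc.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        String[] data = line.split(",");
        // data[0] is the transaction id, data[2] is the time
        map.computeIfAbsent(data[0], k -> new ArrayList<>()).add(data[2]);
    }
}
// map now looks like {T1=[11:16pm, 11:20pm], T2=[11:18pm, 11:23pm]}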
You can do it like this:
Map<String, String> result = Files.lines(Paths.get("abc.txt"))
        .map(line -> line.split(","))
        .map(arr -> {
            try {
                // "hh:mma" matches times like "11:16pm" in the sample data
                return new AbstractMap.SimpleEntry<>(arr[0],
                        new SimpleDateFormat("hh:mma").parse(arr[2]));
            } catch (ParseException e) {
                return null;
            }
        }).collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.collectingAndThen(Collectors
                        .mapping(Map.Entry::getValue, Collectors.toList()),
                        list -> toStringTime.apply(convert.apply(list)))));
To simplify, I've declared two functions: convert takes the two timestamps collected for an id and halves their difference, and toStringTime formats the result as minutes:seconds.
Function<List<Date>, Long> convert = list -> (list.get(1).getTime() - list.get(0).getTime()) / 2;
Function<Long, String> toStringTime = l -> l / 60000 + ":" + l % 60000 / 1000;
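For example, the two T1 timestamps above are four minutes apart, so convert yields 120000 ms, and toStringTime.apply(120000L) formats that as "2:0".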
What is the optimum way to count the number of unique words in the values of a property file in Java (Java 1.8)?
for example entries may be:
key1=This is my value for error {0}
key2=This is success message.Great.
Output should be 10 (including {0})
What I tried
property.load(in);
Enumeration em = property.keys();
String completeString = "";
while (em.hasMoreElements()) {
    String str = (String) em.nextElement();
    // append a space so words from adjacent values don't run together
    completeString = completeString + " " + property.get(str);
}
Set<String> myset = new HashSet<>();
// the "+" in the pattern avoids empty tokens from adjacent separators
String s[] = completeString.trim().split("[ .]+");
for (int i = 0; i < s.length; i++) {
    myset.add(s[i]);
}
for (String sss : myset) {
    System.out.println(sss);
}
System.out.println(myset.size());
Do we have a simpler way to do this in Java 1.8?
Data used:
I used a dummy Properties object:
Properties prop = new Properties();
prop.put("A", "This is my value for error {0}");
prop.put("B", "This is success message.Great.");
Good old Java:
Using the same logic you used, you can simply split the value of each property in the iteration:
Set<String> set = new HashSet<>();
Enumeration em = property.keys();
while (em.hasMoreElements()) {
    String key = (String) em.nextElement();
    // split the value, not the key
    for (String s : ((String) property.get(key)).split("[ .]")) {
        set.add(s);
    }
}
In Java 8 - Stream API:
Define the pattern to split each "word".
Pattern pattern = Pattern.compile("[ .]");
Now, first let's get our Stream<String> for our values.
You can either take a List<Object>:
Stream<String> stream =
        // create a List<Object> from the enumeration and stream it
        Collections.list(prop.elements()).stream()
        // cast each element to String
        .map(o -> (String) o);
Or stream the Map.Entry set of the Properties:
Stream<String> stream =
        prop.entrySet().stream() // iterate the Map.Entry<Object, Object>
        .map(e -> (String) e.getValue());
(Not sure which is more efficient)
Then, all you have to do is flatMap the Stream, splitting each String into a new Stream<String>.
stream.flatMap(pattern::splitAsStream) // split based on the pattern defined above, returning a new Stream<String>
Then collect the Stream into a Set:
.collect(Collectors.toSet()); //collect in a `Set<String>`
The result would be a nice Set printed like:
[Great, success, for, This, {0}, is, my, error, message, value]
Summary:
Set<String> set =
        prop.entrySet().stream()
        .map(e -> (String) e.getValue())
        .flatMap(pattern::splitAsStream)
        .collect(Collectors.toSet());
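For the dummy Properties above, printing the size then gives the expected count:
System.out.println(set.size()); // 10 unique words, {0} included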
I have the below file:
name = David
city = sydney
COuntry = Australia
I am trying to create a hash map using Groovy: split each line at = and store the parts in an array, such that part[0] contains the text before the equals sign and part[1] the text after it. I am then trying to build a map from those parts.
Desired output:
def mapedData = [name :david , city : sydney , country :australia ]
My try:
String s=""
def myfile = new File("C:/Users/.............")
BufferedReader br = new BufferedReader(new FileReader(myfile));
Map<String, String> map = new HashMap<String, String>();
while((s = br.readLine()) != null) {
if(!s.startsWith("#")) {
StringTokenizer st=new StringTokenizer(s, "=")
while(st.hasMoreElements()) {
String line=st.nextElement().toString().trim()
print line
}
}
}
}
If you want to create a map from a file in Groovy, you can use java.util.Properties for that. Here is an example:
def file = new File("C:\\stackoverflow\\props.properties")
def props = new Properties()
file.withInputStream { stream ->
props.load(stream)
}
println(props)
This prints out:
[key1:value1, key2:value2]
The props.properties file contains this:
# Stackoverflow test
key1 = value1
key2 = value2
Try with this code:
def map = [:]
new File("file.txt").eachLine { line ->
    if (line.contains('=') && (!line.startsWith("#"))) {
        // trim() removes the spaces around the equals sign
        map[line.split('=')[0].trim()] = line.split('=')[1].trim()
    }
}
println map
Here is a one-liner that does what you want:
new File(/C:\Users\.............\input.txt/).readLines().collectEntries { it.trim().split(/\s*=\s*/) as List }
I have two files, each with the same format and approximately 100,000 lines. For each line in file one, I extract the second component (column), and if I find a match in the second column of the second file, I extract their third components, combine them, and store or output the result.
Though my implementation works, the program runs extremely slowly; it takes more than an hour to iterate over the files, compare them, and output all the results.
I am reading and storing the data of both files in ArrayLists, then iterating over those lists and doing the comparison. Below is my code; is there any performance-related glitch, or is this just normal for such an operation?
Note: I was using String.split(), but I understand from other posts that StringTokenizer is faster.
public ArrayList<String> match(String file1, String file2) throws IOException {
    ArrayList<String> finalOut = new ArrayList<>();
    try {
        ArrayList<String> data = readGenreDataIntoMemory(file1);
        ArrayList<String> data1 = readGenreDataIntoMemory(file2);
        for (String line : data) {
            HashSet<String> genres = new HashSet<>();
            boolean sameMovie = false;
            StringTokenizer st = new StringTokenizer(line, "|");
            String ratingInfo = st.nextToken();
            String movie1 = st.nextToken();
            String genreInfo = st.nextToken();
            if (!genreInfo.equals("null")) {
                for (String s : genreInfo.split(",")) {
                    genres.add(s);
                }
            }
            for (String line1 : data1) {
                StringTokenizer st1 = new StringTokenizer(line1, "|");
                st1.nextToken();
                String movie2 = st1.nextToken();
                String genreInfo2 = st1.nextToken();
                // if the movie names are the same, they should have the same genres,
                // so merge the two genre sets
                if (!genreInfo2.equals("null") && movie1.equals(movie2)) {
                    for (String s : genreInfo2.split(",")) {
                        genres.add(s);
                    }
                    sameMovie = true;
                    break;
                }
            }
            if (sameMovie) {
                finalOut.add(ratingInfo + "" + movie1 + "" + genres.toString() + "\n");
            } else {
                finalOut.add(line);
            }
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    return finalOut;
}
I would use the Streams API
String file1 = "files1.txt";
String file2 = "files2.txt";
// get all the lines by movie name for each file.
Map<String, List<String[]>> map = Stream.of(Files.lines(Paths.get(file1)),
Files.lines(Paths.get(file2)))
.flatMap(p -> p)
.parallel()
.map(s -> s.split("[|]", 3))
.collect(Collectors.groupingByConcurrent(sa -> sa[1], Collectors.toList()));
// merge all the genres for each movie.
map.forEach((movie, lines) -> {
Set<String> genres = lines.stream()
.flatMap(l -> Stream.of(l[2].split(",")))
.collect(Collectors.toSet());
System.out.println("movie: " + movie + " genres: " + genres);
});
This has the advantage of being O(n) instead of O(n^2) and it's multi-threaded.
Do a hash join.
As of now you are doing an outer-loop join, which is O(n^2); the hash join will be amortized O(n).
Put the contents of each file in a hash map, with the field you want (the second field) as the key.
Map<String, String> map1 = new HashMap<>();
Map<String, String> map2 = new HashMap<>();
// build map1 from file1 and map2 from file2
Then do the hash join
for (String key1 : map1.keySet()) {
    if (map2.containsKey(key1)) {
        // do your thing, you found the match
    }
}
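A minimal end-to-end sketch of that hash join, assuming the same pipe-delimited rating|movie|genre format as in the question:
Map<String, String> map1 = new HashMap<>();
Map<String, String> map2 = new HashMap<>();
for (String line : Files.readAllLines(Paths.get(file1)))
    map1.put(line.split("\\|")[1], line); // key on the second field, the movie name
for (String line : Files.readAllLines(Paths.get(file2)))
    map2.put(line.split("\\|")[1], line);
for (Map.Entry<String, String> e : map1.entrySet()) {
    String match = map2.get(e.getKey()); // O(1) lookup replaces the inner loop
    if (match != null) {
        // the same movie appears in both files: combine the genre fields of
        // e.getValue() and match here
    }
}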
I have a big CSV file, thousands of rows, and I want to aggregate some columns using Java code.
The file is in the form:
1,2012,T1
2,2015,T2
3,2013,T1
4,2012,T1
The results should be:
T, Year, Count
T1,2012, 2
T1,2013, 1
T2,2015, 1
Put your data into a Map-like structure, and each time a key (in your case T + year) is found, add 1 to the stored value.
You can use a map like
Map<String, Integer> rowMap = new HashMap<>();
rowMap.put("T1,2012", 2);
rowMap.put("T1,2013", 1);
rowMap.put("T2,2015", 1);
or you can define your own class with T and year fields, overriding the hashCode and equals methods. Then you can use
Map<YourClass, Integer> map = new HashMap<>();
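A minimal sketch of such a key class (the name TYearKey is just for illustration):
// hypothetical key class; equals and hashCode make it usable as a HashMap key
final class TYearKey {
    final String t;
    final String year;
    TYearKey(String t, String year) { this.t = t; this.year = year; }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TYearKey)) return false;
        TYearKey other = (TYearKey) o;
        return t.equals(other.t) && year.equals(other.year);
    }
    @Override
    public int hashCode() {
        return java.util.Objects.hash(t, year);
    }
}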
String csv =
        "1,2012,T1\n"
        + "2,2015,T2\n"
        + "3,2013,T1\n"
        + "4,2012,T1\n";
Map<String, Integer> map = new TreeMap<>();
BufferedReader reader = new BufferedReader(new StringReader(csv));
String line;
while ((line = reader.readLine()) != null) {
    String[] fields = line.split(",");
    String key = fields[2] + "," + fields[1];
    Integer value = map.get(key);
    if (value == null)
        value = 0;
    map.put(key, value + 1);
}
System.out.println(map);
// -> {T1,2012=2, T1,2013=1, T2,2015=1}
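On Java 8, the get/null-check/put sequence inside the loop can be collapsed into a single call to Map.merge:
map.merge(key, 1, Integer::sum); // inserts 1, or adds 1 to the existing count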
Use uniVocity-parsers for the best performance. It should take 1 second to process 1 million rows.
CsvParserSettings settings = new CsvParserSettings();
settings.selectIndexes(1, 2); //select the columns we are going to read
final Map<List<String>, Integer> results = new LinkedHashMap<List<String>, Integer>(); //stores the results here
//Use a custom implementation of RowProcessor
settings.setRowProcessor(new AbstractRowProcessor() {
@Override
public void rowProcessed(String[] row, ParsingContext context) {
List<String> key = Arrays.asList(row); // converts the input array to a List - lists implement hashCode and equals based on their values so they can be used as keys on your map.
Integer count = results.get(key);
if (count == null) {
count = 0;
}
results.put(key, count + 1);
}
});
//creates a parser with the above configuration and RowProcessor
CsvParser parser = new CsvParser(settings);
String input = "1,2012,T1"
+ "\n2,2015,T2"
+ "\n3,2013,T1"
+ "\n4,2012,T1";
//the parse() method will parse and submit all rows to your RowProcessor - use a FileReader to read a file instead the String I'm using as example.
parser.parse(new StringReader(input));
//Here are the results:
for (Map.Entry<List<String>, Integer> entry : results.entrySet()) {
System.out.println(entry.getKey() + " -> " + entry.getValue());
}
Output:
[2012, T1] -> 2
[2015, T2] -> 1
[2013, T1] -> 1
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).