Read data as arrays from a Text File in Java

I have a text file containing a bunch of arrays, each paired with a specific number I have to find in that array. The text file looks like this:
(8) {1, 4, 6, 8, 12, 22}
(50) {2, 5, 6, 7, 10, 11, 24, 50, 65}
(1) {1}
(33) {1, 2, 5, 6, 11, 12, 13, 21, 25, 26, 30, 33, 60, 88, 99}
(1) {1, 2, 3, 4, 8, 9, 100}
(1) {2, 3, 5, 6, 11, 12, 13, 21, 25, 26, 30, 33, 60, 88, 99}
where the number inside the parentheses is the number I have to find using binary search, and the rest is the actual array.
I do not know how to get this array from the text file and read it as an actual array.
[This is a question from a previous coding competition I took, and I am going over the problems.]
I already have a method to do the binary search, and I have used a Scanner to read the file like this:
Scanner sc = new Scanner(new File("search_race.dat"));
and used a while loop to loop through the file and read it.
But I am stuck on how to make Java understand that the stuff in the curly braces is an array and the number in the parentheses is what it must find in that array using binary search.

You could simply parse each line (the number to find and the array) as follows:
while (sc.hasNext()) {
    int numberToFind = Integer.parseInt(sc.next("\\(\\d+\\)").replaceAll("[()]", ""));
    int[] arrayToFindIn = Arrays.stream(sc.nextLine().split("[ ,{}]"))
            .filter(x -> !x.isEmpty())
            .mapToInt(Integer::parseInt)
            .toArray();
    // Apply your binary search! Craft it yourself or use a standard one like below:
    // int positionInArray = Arrays.binarySearch(arrayToFindIn, numberToFind);
}
If you don't like the replaceAll, you could replace the first line in the loop with the two below:
String toFindGroup = sc.next("\\(\\d+\\)");
int numberToFind = Integer.parseInt(toFindGroup.substring(1, toFindGroup.length()-1));
Cheers!

TL;DR: Check the line character by character and see whether each character is a curly brace, a parenthesis, a comma, or a digit.
Long Answer:
First, create a POJO (let's call this AlgoContainer, but use whatever name you like) with the fields int numberToFind and ArrayList<Integer> listOfNumbers.
Then, read the file as @ManojBanik has mentioned in the comments.
Now create an ArrayList<AlgoContainer> (its size should end up the same as that of the ArrayList<String> obtained while reading the file line by line).
Then loop through the ArrayList<String> from the above step and perform the following operations on each line:
1. Create and instantiate an AlgoContainer object (let's call this tempAlgoContainer).
2. Check if the first character is an opening parenthesis -> yes? Create an empty temp String -> check if the next character is a digit -> yes? Append it to the temp String and repeat this until you find the closing parenthesis.
3. Found the closing parenthesis? Parse the temp String to int and set the numberToFind field of tempAlgoContainer to that number.
4. Next up is the curly bracket stuff: found an opening curly brace? Create a new empty temp String -> check if the next character is a digit -> yes? Append it to the temp String just like in step #2 until you find a comma or a closing curly brace.
5. Found a comma? Parse the temp String to int and add it to the listOfNumbers field of tempAlgoContainer -> make the temp String empty again.
6. Found a closing curly brace? Repeat the above step and break out of the loop. Your data is now ready for whatever processing you want to do.
Also, it's a good idea to give AlgoContainer an instance method (call it whatever you want) that performs the binary search, so that you can simply loop through the ArrayList<AlgoContainer> and call that BS function on each element (no pun intended).
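The steps above could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the parse and search method names are made up for the example, Collections.binarySearch stands in for a hand-written binary search, and the input lines are assumed to be well formed (a non-empty number in the parentheses).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical container following the description in the answer.
final class AlgoContainer {
    int numberToFind;
    final List<Integer> listOfNumbers = new ArrayList<>();

    // Walks the line one character at a time (steps 2 to 6).
    static AlgoContainer parse(String line) {
        AlgoContainer result = new AlgoContainer();
        StringBuilder digits = new StringBuilder();
        boolean insideBraces = false;
        for (char c : line.toCharArray()) {
            if (Character.isDigit(c)) {
                digits.append(c); // collect the digits of the current number
            } else if (c == ')') {
                // assumption: the parentheses always contain at least one digit
                result.numberToFind = Integer.parseInt(digits.toString());
                digits.setLength(0);
            } else if (c == '{') {
                insideBraces = true;
            } else if (insideBraces && (c == ',' || c == '}')) {
                if (digits.length() > 0) { // tolerates an empty "{}"
                    result.listOfNumbers.add(Integer.parseInt(digits.toString()));
                    digits.setLength(0);
                }
                if (c == '}') {
                    break;
                }
            }
        }
        return result;
    }

    // Returns the index of numberToFind, or a negative value if it is absent.
    int search() {
        return Collections.binarySearch(listOfNumbers, numberToFind);
    }

    public static void main(String[] args) {
        AlgoContainer c = AlgoContainer.parse("(8) {1, 4, 6, 8, 12, 22}");
        System.out.println(c.numberToFind + " in " + c.listOfNumbers + " -> index " + c.search());
    }
}
```

Keeping the search inside the container means the calling code only has to loop over the parsed containers and call search() on each one.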

To read the file, you can use Files.readAllLines()
To actually parse each line, you can use something like this.
First, to make things easier, remove any whitespace from the line.
String trimmed = line.replaceAll("\\s+", "");
This will essentially transform (8) {1, 4, 6, 8, 12, 22} into (8){1,4,6,8,12,22}.
Next, use a regular expression to validate the line. If the line does not match, no further action is required.
Expression: \([0-9]+\)\{[0-9]+(,[0-9]+)*}
\([0-9]+\) relates to (8) (above example)
\{[0-9]+(,[0-9]+)*} relates to {1,4,6,8,12,22}
If you don't understand the expression, head over here.
Finally, we can parse the string into its two components: The number to search for and the int[] with the actual values.
// start from index one to skip the opening parenthesis
int targetEnd = trimmed.indexOf(')', 1);
String searchString = trimmed.substring(1, targetEnd);
// parsing won't throw an exception, since the regex check confirmed it is a number
int numberToFind = Integer.parseInt(searchString);
// skip ')' and '{', align to the first value, skip the last '}'
String valuesString = trimmed.substring(targetEnd + 2, trimmed.length() - 1);
// split the array at ',' to get each value as a string
int[] values = Arrays.stream(valuesString.split(","))
        .mapToInt(Integer::parseInt).toArray();
With both of these components parsed, you can do the binary search yourself.
Example code as Gist on GitHub

You could read the file line by line and then use a regex on each line to separate the string into groups.
The regex below should match each line:
\((\d+)\) \{([\d, ]+)\}
Then group(1) will give the digits inside the parentheses (as a String) and group(2) will give the String inside the curly braces, which you can split on a comma followed by a space (assuming every comma is followed by a space) to get an array of numbers (as Strings again).
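A minimal sketch of this approach might look like the following. The class and method names are made up for the example, and Arrays.binarySearch stands in for your own binary search; lines that do not match the expression are rejected with an exception, which is just one possible choice.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLineParser {
    // Same expression as above, written with Java string escaping.
    private static final Pattern LINE = Pattern.compile("\\((\\d+)\\) \\{([\\d, ]+)\\}");

    // Returns the index of the target in the array, or a negative value
    // when the target is absent. Throws if the line is malformed.
    static int findInLine(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        int target = Integer.parseInt(m.group(1));       // number in parentheses
        int[] values = Arrays.stream(m.group(2).split(", "))
                .mapToInt(Integer::parseInt)
                .toArray();                               // numbers in curly braces
        return Arrays.binarySearch(values, target);
    }

    public static void main(String[] args) {
        System.out.println(findInLine("(8) {1, 4, 6, 8, 12, 22}"));
        System.out.println(findInLine("(1) {1}"));
    }
}
```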

Related

How to reverse hashmap compression (index method) (Java) [duplicate]

Background of question
I have been developing some code that, firstly, reads a string and creates a file. Secondly, it splits the string into an array. Then it gets the indexes for each word in the array and finally removes the duplicates and prints the result to a different file.
I have already written the code for this; here is a link: https://pastebin.com/gqWH0x0 (there is a menu system as well), but it is rather long, so I have refrained from including it in this question.
The compression method is done via hashmaps, getting indexes of the array and mapping them to the relevant word. Here is an example:
Original: "sea sea see sea see see"
Output: see[2, 4, 5],sea[0, 1, 3],
Question
The next stage is getting the output back into its original state. I am relatively new to Java, so I am not aware of the techniques required. The code should be able to take the output file (shown above) and turn it back into the original.
My current thinking is that you would just rewrite this hashmap (below). Would I be correct in thinking this? I thought I should check with Stack Overflow first!
Map<String, Set<Integer>> seaMap = new HashMap<>(); // new hashmap
for (int seaInt = 0; seaInt < sealist.length; seaInt++) {
    if (seaMap.keySet().contains(sealist[seaInt])) {
        Set<Integer> index = seaMap.get(sealist[seaInt]);
        index.add(seaInt);
    } else {
        Set<Integer> index = new HashSet<>();
        index.add(seaInt);
        seaMap.put(sealist[seaInt], index);
    }
}
System.out.print("Compressed: ");
seaMap.forEach((seawords, seavalues) -> System.out.print(seawords + seavalues + ","));
System.out.println("\n");
If anyone has any good ideas / answers then please let me know, I am really desperate for a solution!
Link to current code: https://pastebin.com/gqWH0x0K

1. First you will have to separate the words with their index(es) from your compressed line. Using your example:
"see[2, 4, 5],sea[0, 1, 3],"
you obtain the following Strings:
"see[2, 4, 5]" and "sea[0, 1, 3]"
2. For each one you must read the indexes, e.g. for the first:
2, 4 and 5
3. Now just write the word into an ArrayList (or array) at each given index.
For the first two steps you can use a regular expression to find each word and the index list. Then use String.split and Integer.parseInt to get all indexes.
Pattern pattern = Pattern.compile("(.*?)\\[(.*?)\\],");
String line = "see[2, 4, 5],sea[0, 1, 3],";
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
    String word = matcher.group(1);
    String[] indexes = matcher.group(2).split(", ");
    for (String str : indexes) {
        int index = Integer.parseInt(str);
        // grow the result List if needed, then set word at position index
    }
}
Now just check that the result List is big enough and set the word at the found indexes.
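Putting the three steps together, a complete version could look roughly like this. It is a sketch under a few assumptions: the class and method names are invented for the example, the whole compressed text sits on one line in exactly the format shown above, and every position is covered by some word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IndexDecompressor {
    // One entry looks like word[i, j, k], including the trailing comma.
    private static final Pattern ENTRY = Pattern.compile("(.*?)\\[(.*?)\\],");

    static String decompress(String line) {
        List<String> words = new ArrayList<>();
        Matcher matcher = ENTRY.matcher(line);
        while (matcher.find()) {
            String word = matcher.group(1);
            for (String str : matcher.group(2).split(", ")) {
                int index = Integer.parseInt(str);
                // make sure the list is big enough before setting the word
                while (words.size() <= index) {
                    words.add("");
                }
                words.set(index, word);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // prints: sea sea see sea see see
        System.out.println(decompress("see[2, 4, 5],sea[0, 1, 3],"));
    }
}
```

The order of the entries in the compressed line does not matter, because each word is written directly to its recorded positions.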

How do I convert a String to an String Array? [duplicate]

This question already has answers here:
How do I convert a String to an int in Java?
(47 answers)
Closed 7 years ago.
I'm reading from a file using Scanner, and the text contains the following.
[8, 3, 8, 2, 3, 4, 4, 4, 5, 8]
This was originally an integer Array that I had to convert to a String to be able to write in the file. Now, I need to be able to read the file back into java, but I need to be able to add the individual numbers together, so I need to get this String back into an array. Any help? Here's what I have:
File f = new File("testfile.txt");
try {
    FileWriter fw = new FileWriter(f);
    fw.write(Arrays.toString(array1));
    fw.close();
} catch (Exception ex) {
    // Exception ignored
}
Scanner file = new Scanner(f);
System.out.println(file.nextLine());
This prints out the list of numbers, but in a string. I need to access the integers in an array in order to add them up. This is my first time posting, let me know if I messed anything up.

You can use String#substring to remove the square brackets, String#split to split the String into an array, String#trim to remove the whitespace, and Integer#parseInt to convert the Strings into int values.
In Java 8 you can use the Stream API for this:
int[] values = Arrays.stream(string.substring(1, string.length() - 1)
        .split(","))
        .mapToInt(s -> Integer.parseInt(s.trim()))
        .toArray();
For summing it, you can use the IntStream#sum method instead of converting it to an array at the end.
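As a rough illustration of the summing variant, the sketch below ends the stream with sum() instead of toArray(). The class and method names are made up for the example, and it assumes the text is a non-empty list in the Arrays.toString format.

```java
import java.util.Arrays;

final class BracketedSum {
    // Sums the numbers in a string such as "[8, 3, 8]".
    // Assumption: the list between the brackets is not empty.
    static int sum(String text) {
        String inner = text.trim();
        inner = inner.substring(1, inner.length() - 1); // drop '[' and ']'
        return Arrays.stream(inner.split(","))
                .mapToInt(s -> Integer.parseInt(s.trim()))
                .sum();
    }

    public static void main(String[] args) {
        // prints: 49
        System.out.println(sum("[8, 3, 8, 2, 3, 4, 4, 4, 5, 8]"));
    }
}
```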

You don't need to read the String back into an array; just use a regex:
public static void main(String[] args) throws Exception {
    String data = "[8, 3, 8, 2, 3, 4, 41, 4, 5, 8]";
    // The "\\d+" gets the digits out of the String
    Matcher matcher = Pattern.compile("\\d+").matcher(data);
    int sum = 0;
    while (matcher.find()) {
        sum += Integer.parseInt(matcher.group());
    }
    System.out.println(sum);
}
Results:
86

List<Integer> ints = new ArrayList<>();
String original = "[8, 3, 8, 2, 3, 4, 4, 4, 5, 8]";
String[] splitted = original.replaceAll("[\\[\\] ]", "").split(",");
for(String s : splitted) {
ints.add(Integer.valueOf(s));
}

find all letters in String with regex

I know the toCharArray() method, but I am interested in regex. I have a question about the speed of two regexes:
String s = "123456";
// Warm up JVM
for (int i = 0; i < 10000000; ++i) {
    String[] arr = s.split("(?!^)");
    String[] arr2 = s.split("(?<=\\G.{1})");
}
long start = System.nanoTime();
String[] arr = s.split("(?!^)");
long stop = System.nanoTime();
System.out.println(stop - start);
System.out.println(Arrays.toString(arr));
start = System.nanoTime();
String[] arr2 = s.split("(?<=\\G.{1})");
stop = System.nanoTime();
System.out.println(stop - start);
System.out.println(Arrays.toString(arr2));
output:
Run 1:
3158
[1, 2, 3, 4, 5, 6]
3947
[1, 2, 3, 4, 5, 6]
Run 2:
2763
[1, 2, 3, 4, 5, 6]
3158
[1, 2, 3, 4, 5, 6]
The two regexes do the same job. Why is the first regex faster than the second one? Thanks for your answers.

I can never be 100% sure, but I can think of one reason.
(?!^) always fails or succeeds in one shot (one attempt): the only thing it checks is whether the current position is the start of the string, which is a single test.
(?<=\\G.{1}), which is exactly equivalent to just (?<=\\G.), always involves two steps or two matching attempts.
\\G matches either at the start of the string or at the end of the previous match, and even when it succeeds, the regex engine still has to try to match a single character.
For example, in your string 123456, at the start of the string:
(?!^): fails immediately.
(?<=\\G.): \\G succeeds, but then the engine looks for . and can't find a character behind the current position, because this is the start of the string, so it fails. As you can see, it attempted two steps versus one step for the previous expression.
The same goes for every other position in the input string: always two tests for (?<=\\G.) versus a single test for (?!^).

Cannot get a Substring of a substring

I'm trying to parse a String from a file that looks something like this:
Mark Henry, Tiger Woods, James the Golfer, Bob,
3, 4, 5, 1,
1, 2, 3, 5,
6, 2, 1, 4,
For ease of use, I'd like to split off the first line of the String, because it will be the only one that cannot be converted into integer values (the rest will be stored in a two-dimensional array of integers[line][value]).
I tried to use String.split("\\\n") to divide out each line into its own String, which works. However, I am unable to divide the new strings into substrings with String.split("\\,"). I am not sure what is going on:
String[] firstsplit = fileOne.split("\\\n");
System.out.println("file split into " + firstsplit.length + " parts");
for (int i = 0; i < firstsplit.length; i++) {
    System.out.println(firstsplit[i]); // prints values to verify it works
}
String firstLine = firstsplit[0];
String[] secondSplit = firstLine.split("\\,");
System.out.println(secondSplit[0]); // prints nothing for some reason
I've tried a variety of different things with this, and nothing seems to work (copying over to a new String is an attempt to get it to work even). Any suggestions?
EDIT: I have changed it to String.split(",") and also tried String.split(", ") but I still get nothing to print afterwards.
It occurs to me now that maybe the first location is a blank one....after testing I found this to be true and everything works for firstsplit[1];

You're splitting on "\\,", which the regex engine sees as \,. A comma is not a special character in a regex, so it does not need escaping; split(",") is enough.

A comma doesn't need a \ before it, as it isn't a special character. Use "," instead of "\\,". The latter reaches the regex engine as \, which still matches a plain comma, so the escape is redundant rather than the cause of the empty output.

Not only do you not need to escape a comma, but you also don't need three backslashes for the newline character:
String[] firstSplit = fileOne.split("\n");
That will work just fine. I tested your code with the string you specified, and it actually worked just fine, and it also worked just fine splitting without the extraneous escapes...
Have you actually tested it with the String data you provided in the question, or is the actual data perhaps something else? I was worried about the carriage return (\r\n in e.g. Windows files), but that didn't matter in my test, either. If you can scrub the String data you're actually parsing, and provide a sample output of the original String (fileOne), that would help significantly.
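A quick way to check this yourself is a small test like the one below. It is only a sketch using the first line from the question as sample data; the class and the names helper are made up for the example. The helper also trims each piece and drops empty ones, which guards against blank entries.

```java
import java.util.Arrays;

final class SplitEscapeDemo {
    // Splits on commas, trims each piece and drops empty pieces.
    static String[] names(String firstLine) {
        return Arrays.stream(firstLine.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String line = "Mark Henry, Tiger Woods, James the Golfer, Bob,";
        // In a Java regex, \, matches a plain comma, so both calls
        // split the line into the same pieces.
        System.out.println(Arrays.equals(line.split("\\,"), line.split(",")));
        System.out.println(Arrays.toString(names(line)));
    }
}
```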

You could just load the file into a list of lines:
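// FileInputStream, InputStreamReader and BufferedReader come from java.io
FileInputStream fin = new FileInputStream(filename);
BufferedReader bin = new BufferedReader(new InputStreamReader(fin));
String line = null;
List<String> lines = new ArrayList<>();
while ((line = bin.readLine()) != null) {
    lines.add(line);
}
bin.close(); // closing the reader also closes the underlying stream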
Of course you have to wrap this in a try/catch block that fits your exception handling. Then parse the lines, starting with the second one, like this:
for ( int i = 1; i < lines.size(); i++ ) {
String[] values = lines.get( i ).split( "," );
}
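To go one step further and turn those lines into the integers[line][value] table from the question, something like the following might work. It is a sketch with invented names, and it assumes every line after the first contains only comma-separated integers, possibly with a trailing comma.

```java
import java.util.Arrays;
import java.util.List;

final class ScoreTableParser {
    // Skips the first (names) line and parses every other line into a row.
    static int[][] parseRows(List<String> lines) {
        int[][] table = new int[Math.max(0, lines.size() - 1)][];
        for (int i = 1; i < lines.size(); i++) {
            table[i - 1] = Arrays.stream(lines.get(i).split(","))
                    .map(String::trim)
                    .filter(s -> !s.isEmpty()) // ignore blanks around commas
                    .mapToInt(Integer::parseInt)
                    .toArray();
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Mark Henry, Tiger Woods, James the Golfer, Bob,",
                "3, 4, 5, 1,",
                "1, 2, 3, 5,");
        System.out.println(Arrays.deepToString(parseRows(lines)));
    }
}
```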
