I have a CSV file with the following data:
20210903|0000000001|0081|A|T60|BSN|002|STATE UNITED
I read this file in my Java application with this code:
public List<EquivalenceGroupsTO> read() throws FileNotFoundException, IOException {
try (BufferedReader br = new BufferedReader(new FileReader("/home/myself/Desk/blaBla/T60.csv"))) {
List<String> file = new ArrayList<String>();
StringBuilder sb = new StringBuilder();
String line = br.readLine();
Integer count = 0;
HashSet<String> hset = new HashSet<String>();
while (line != null) {
//System.out.println("data <" + count + "> :" + line);
count++;
file.add(line);
file.add("\n");
line = br.readLine();
}
EquivalenceGroupsTO equivalenceGroupsTO = new EquivalenceGroupsTO();
List<EquivalenceGroupsTO> equivalenceGroupsTOs = new ArrayList<>();
for (String row : file) {
equivalenceGroupsTO = new EquivalenceGroupsTO();
String[] str = row.split("|");
equivalenceGroupsTO.setEquivalenceGroupsCode(str[5]);
equivalenceGroupsTO.setDescription(str[7]);
equivalenceGroupsTO.setLastUpdateDate(new Date());
equivalenceGroupsTOs.add(equivalenceGroupsTO);
System.out.println("Tutto ok!");
}
return equivalenceGroupsTOs;
}
}
I need to pass the strings after the fifth and the seventh "|" (here "BSN" and "STATE UNITED") to equivalenceGroupsTO.setEquivalenceGroupsCode and equivalenceGroupsTO.setDescription respectively; both take a String.
But when I run this code, I get this error:
java.lang.ArrayIndexOutOfBoundsException: Index 5 out of bounds for length 1
at it.utils.my2.read(OpenTXTCodifa.java:46)
What am I doing wrong?
The main issue is mentioned in the comments: String.split takes a regular expression, and the pipe character is the OR operator in regular expressions, so it has to be escaped as \\| to split on a literal |.
The next issue is that a line containing only \n is added to file after every data line. When such a line is split, the result has a single element, so str[5] fails with ArrayIndexOutOfBoundsException.
Other minor issues are the unused variables count, hset and sb.
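The difference between the two patterns can be checked with a small sketch. It uses the sample row from the question; the class name PipeSplitDemo and the helper splitOnPipe are made up for illustration.

```java
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Hypothetical helper: splits a row on the literal pipe character.
    static String[] splitOnPipe(String row) {
        return row.split("\\|");
    }

    public static void main(String[] args) {
        String row = "20210903|0000000001|0081|A|T60|BSN|002|STATE UNITED";

        // Unescaped: "|" is an empty alternation, so the row is cut between every character.
        String[] wrong = row.split("|");
        System.out.println(wrong.length + " pieces, wrong[5] = " + wrong[5]);

        // Escaped: splits on the literal pipe and yields the eight fields.
        String[] right = splitOnPipe(row);
        System.out.println(right.length + " fields, code = " + right[5]
                + ", description = " + right[7]);

        // Pattern.quote is an alternative to escaping by hand.
        String[] quoted = row.split(Pattern.quote("|"));
        System.out.println(quoted.length + " fields");

        // The extra "\n" entries from the question give a single-element array,
        // which is why str[5] is out of bounds for length 1.
        System.out.println("\n".split("\\|").length);
    }
}
```

Pattern.quote is worth considering whenever the delimiter comes from configuration rather than from a literal in the code.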
However, it may be better to refactor the existing code to use NIO and the Stream API: get a stream of lines, convert each line into an EquivalenceGroupsTO, and collect the results into a list:
public List<EquivalenceGroupsTO> read(String filename) throws IOException {
    // Files.lines keeps the file open, so close the stream with try-with-resources
    try (Stream<String> lines = Files.lines(Paths.get(filename))) { // Stream<String>
        return lines
            .map(s -> s.split("\\|")) // Stream<String[]>
            // make sure all data are available
            .filter(arr -> arr.length > 7) // Stream<String[]>
            .map(arr -> {
                EquivalenceGroupsTO egTo = new EquivalenceGroupsTO();
                egTo.setEquivalenceGroupsCode(arr[5]);
                egTo.setDescription(arr[7]);
                egTo.setLastUpdateDate(new Date());
                return egTo;
            }) // Stream<EquivalenceGroupsTO>
            .collect(Collectors.toList());
    }
}
I want to read the data from a text file, drop the header, and then save the data into an array two lines at a time, because each record continues onto the next line.
visitor.txt
1 DAILY REPORT VISITOR
DATE : 02-02-22
0+------------------------------------------------------------------+
NO. DATE NAME ADDRESS
PHONE BIRTHDAY NEED
+------------------------------------------------------------------+
1 02-02-22 ELIZABETH ZEE WASHINGTON DC
+32 62 18-10-1985 BORROW BOOK
2 02-02-22 VICTORIA GEA BRUSEELS
+32 64 24-05-1986 VISITOR
3 02-02-22 GEORGE PHILIPS BRUSEELS
+32 76 02-05-1990 VISITOR
I want the data saved into an array like this:
1 02-02-22 ELIZABETH ZEE WASHINGTON DC +32 62 18-10-1985 BORROW BOOK
2 02-02-22 VICTORIA GEA BRUSEELS +32 64 24-05-1986 VISITOR
3 02-02-22 GEORGE PHILIPS BRUSEELS +32 76 02-05-1990 VISITOR
This is the code
BufferedReader bR = new BufferedReader(new FileReader(myfile));
int i =0;
String line;
try {
while (line = bufferedReader.readLine()) != null) {
i++;
String data = line.split("\\s", "")
if(data.matches("[0-9]{1,3}\\s.+")) {
String[] dataArray = data.split("\\s", -1);
String[] result = new String[30];
System.arraycopy(fileArray, 0, result, 0, fileArray.length);
String data1 = line.get(i).split("\\s", "")
String[] fileArray1 = data.split("\\s", -1);
String[] result1 = new String[30];
System.arraycopy(fileArray1, 0, result1,0,fileArray1.length);
}
}
The problem is that I think this code is inefficient, because the second line is read twice, once into data and once into data1. I want every two lines saved as one row in the database, like the result shown above. Do you have a solution?
It seems unlikely to me that one line would be read multiple times. Try debugging your code to see whether that actually happens.
Otherwise, you could simply read the leading lines before the processing loop starts:
String line;
// try-with-resources closes the reader; the enclosing method must declare IOException
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(myfile))) {
    // alternative one
    String firstline = bufferedReader.readLine();
    String secondline = bufferedReader.readLine();
    String mergedline = firstline + secondline; // readLine() drops the line terminator, so only the data is joined
    // alternative two
    StringBuilder sb = new StringBuilder();
    sb.append(bufferedReader.readLine()); // first line
    sb.append(bufferedReader.readLine()); // second line
    ... = sb.toString(); // now do something with the merged lines
    // the other stuff
    while ((line = bufferedReader.readLine()) != null) {
        // process your data lines here
    }
}
The result actually has a dynamic number of records, so a fixed-size array is no longer suitable. Use List<String[]> instead: list.add(stringArray), list.get(i), list.size(), list.isEmpty().
The header seems to consist of 2 lines, but I may be wrong.
I saw fields containing a space, hence one cannot split on \s+ (one or more whitespace characters). I split on \s\s+ instead. It may be better to use the fixed-length field boundaries with line1.substring(i1, i2).
FileReader uses the default encoding of the current computer, which makes the file handling unportable. I have made the encoding explicit. If it is always a US-ASCII file, without special characters, you could use StandardCharsets.US_ASCII. Then you can also run the software on a Linux server, which normally uses UTF-8.
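The difference between the two split patterns can be seen in a minimal sketch. It assumes that the fields in visitor.txt are separated by at least two spaces; the sample line and the helper name splitWide are made up for illustration.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Hypothetical helper: splits on runs of two or more whitespace characters,
    // producing at most `limit` fields.
    static String[] splitWide(String line, int limit) {
        return line.split("\\s\\s+", limit);
    }

    public static void main(String[] args) {
        // Assumed layout: fields separated by at least two spaces.
        String line1 = "1  02-02-22  ELIZABETH ZEE  WASHINGTON DC";

        // \s+ also splits inside "ELIZABETH ZEE" and "WASHINGTON DC".
        System.out.println(Arrays.toString(line1.split("\\s+")));

        // \s\s+ keeps the single spaces inside a field.
        System.out.println(Arrays.toString(splitWide(line1, 4)));
    }
}
```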
So, without checking the data format (which would nevertheless make sense):
private void stackOverflow() throws IOException {
List<String[]> data = loadData("mydata.txt");
System.out.println(data.size() + " records read");
for (String[] fields: data) {
System.out.println(Arrays.toString(fields));
}
}
private List<String[]> loadData(String myFile) throws IOException {
List<String[]> data = new ArrayList<>();
Path path = Paths.get(myFile);
try (BufferedReader bufferedReader =
Files.newBufferedReader(path, Charset.defaultCharset())) {
if (bufferedReader.readLine() != null
&& bufferedReader.readLine() != null) { // Skip both header lines.
String line1, line2;
while ((line1 = bufferedReader.readLine()) != null
&& (line2 = bufferedReader.readLine()) != null) {
String[] fields1 = line1.split("\\s\\s+", 4); // Split on at least 2 spaces.
if (fields1.length != 4) {
throw new IOException("Wrong number of fields for first line: " + line1);
}
String[] fields2 = line2.split("\\s\\s+", 3); // Split on at least 2 spaces.
if (fields2.length != 3) { // check the second line's fields, not fields1
throw new IOException("Wrong number of fields for second line: " + line2);
}
String[] total = Arrays.copyOf(fields1, 7);
System.arraycopy(fields2, 0, total, 4, fields2.length);
data.add(total);
}
if (line1 != null && !line1.isBlank()) {
throw new IOException("Trailing single line: " + line1);
}
}
}
return data;
}
Substring is better, safer, than split.
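A minimal sketch of the substring approach follows. The column offsets in STARTS are purely hypothetical, since the real boundaries have to be measured in visitor.txt; the class and method names are also made up.

```java
class FixedWidthDemo {
    // Hypothetical column boundaries; the real offsets have to be measured in visitor.txt.
    private static final int[] STARTS = {0, 4, 14, 34};

    // Cuts the line at the fixed offsets and trims the padding of each field.
    // Offsets beyond the end of a short line yield empty fields instead of an exception.
    static String[] parseFirstLine(String line) {
        String[] fields = new String[STARTS.length];
        for (int i = 0; i < STARTS.length; i++) {
            int from = Math.min(STARTS[i], line.length());
            int to = (i + 1 < STARTS.length)
                    ? Math.min(STARTS[i + 1], line.length())
                    : line.length();
            fields[i] = line.substring(from, to).trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        // Build a sample line that matches the assumed layout.
        String line = String.format("%-4s%-10s%-20s%s",
                "1", "02-02-22", "ELIZABETH ZEE", "WASHINGTON DC");
        for (String field : parseFirstLine(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

Unlike split, this keeps working when a field is empty or contains several consecutive spaces, as long as the columns really are fixed.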
Instead of String[] you might use a record class (a preview feature in Java 14, standard since Java 16):
record Visitor(String no, String date, String name, String address,
String phone, String birthday, String need) { }
List<Visitor> data = new ArrayList<>();
data.add(new Visitor(fields1[0], fields1[1], fields1[2], fields1[3],
fields2[0], fields2[1], fields2[2]));
A record needs little code; however, it cannot be changed, only replaced in the list.
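Replacing a record in the list could look like the sketch below. It assumes Java 16 or later; the helper withAddress and the corrected address are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;

class VisitorRecordDemo {
    record Visitor(String no, String date, String name, String address,
                   String phone, String birthday, String need) { }

    // Hypothetical helper: returns a copy with a different address;
    // the original record stays unchanged.
    static Visitor withAddress(Visitor v, String address) {
        return new Visitor(v.no(), v.date(), v.name(), address,
                v.phone(), v.birthday(), v.need());
    }

    public static void main(String[] args) {
        List<Visitor> data = new ArrayList<>();
        data.add(new Visitor("2", "02-02-22", "VICTORIA GEA", "BRUSEELS",
                "+32 64", "24-05-1986", "VISITOR"));

        // There are no setters, so "changing" a field means replacing the list element.
        data.set(0, withAddress(data.get(0), "BRUSSELS"));
        System.out.println(data.get(0));
    }
}
```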
By reading the data of a file, I want to split the contents on "," (.split(",")). In other words, for this particular example I want two indexes, each containing up to 5 pieces of information, which I would also like to access with [0] and [1].
The file with the data:
The file reading method:
public void fileReading(ActionEvent event) throws IOException {
File file = new File("src/DateSpeicher/datenSpeicher.txt");
BufferedReader br = new BufferedReader(new FileReader(file));
String st;
while ((st = br.readLine()) != null) {
System.out.println(st);
}
}
The method works fine; however, I wonder how I can split the contents into two indexes or String arrays that can be accessed through their respective indices [0] and [1]: [0][0] for the first item in the first array (655464), and [1][4] for the last item in the second array.
My approach:
1. Making an ArrayList for every ,
2. Adding data till ","
Issue: even though the approach above works, you can't do things such as array1[0]; it gives an error. However, index access is crucial.
How can I solve this problem?
Path path = Paths.get("src/DateSpeicher/datenSpeicher.txt"); // Or, assuming the file is on the class path:
Path path = Paths.get(getClass().getResource("/DateSpeicher/datenSpeicher.txt").toURI());
Either two Strings, and then handling them:
String content = new String(Files.readAllBytes(path), Charset.defaultCharset());
String[] data = content.split(",\\R");
or a list of lists:
List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
// Result:
List<List<String>> lists = new ArrayList<>();
List<String> newList = null;
boolean addNewList = true;
for (int i = 0; i < lines.size(); ++i) {
if (addNewList) {
newList = new ArrayList<>();
lists.add(newList);
addNewList = false;
}
String line = lines.get(i);
if (line.endsWith(",")) {
line = line.substring(0, line.length() - 1);
addNewList = true;
}
newList.add(line);
}
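With the list of lists, lists.get(0).get(0) plays the role of [0][0]. If array syntax is required, the lists can be converted afterwards. The sketch below repeats the grouping idea from above on made-up file contents (only 655464 comes from the question) and assumes that a line ending in "," closes a group; the names GroupedLinesDemo and group are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupedLinesDemo {
    // Groups the lines (a line ending in "," closes the current group)
    // and converts the result to a two-dimensional array.
    static String[][] group(List<String> lines) {
        List<List<String>> lists = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            boolean endsGroup = line.endsWith(",");
            current.add(endsGroup ? line.substring(0, line.length() - 1) : line);
            if (endsGroup) {
                lists.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            lists.add(current); // last group without a trailing comma
        }
        // Convert to String[][] so that result[i][j] works.
        String[][] result = new String[lists.size()][];
        for (int i = 0; i < lists.size(); i++) {
            result[i] = lists.get(i).toArray(new String[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up file contents; only 655464 appears in the question.
        List<String> lines = Arrays.asList(
                "655464", "a", "b", "c", "d,",
                "1", "2", "3", "4", "5");
        String[][] data = group(lines);
        System.out.println(data[0][0]); // first item of the first group
        System.out.println(data[1][4]); // last item of the second group
    }
}
```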
So this method is supposed to read a text file and output the frequency of each letter. The text file reads:
aaaa
bbb
cc
So my output should be:
a = 4
b = 3
c = 2
Unfortunately, my output is:
a = 4
a = 4
b = 3
a = 4
b = 3
c = 2
Does anyone know why?
I tried modifying the loops but still haven't resolved this.
public void getFreq() throws FileNotFoundException, IOException, Exception {
File file = new File("/Users/guestaccount/IdeaProjects/Project3/src/sample/testFile.txt");
BufferedReader br = new BufferedReader(new FileReader(file));
HashMap<Character, Integer> hash = new HashMap<>();
String line;
while ((line= br.readLine()) != null) {
line = line.toLowerCase();
line = line.replaceAll("\\s", "");
char[] chars = line.toCharArray();
for (char c : chars) {
if (hash.containsKey(c)){
hash.put(c, hash.get(c)+1);
}else{
hash.put(c,1);
}
}
for (Map.Entry entry : hash.entrySet()){
System.out.println(entry.getKey() + " = " + entry.getValue());
}
}
}
Chrisvin Jem gave you the corrected code: your printing for loop was inside the while loop that reads from the file.
Does anyone know why?
Since that is your question, I am going to explain why you got that output.
Reason: You got the output a = 4, a = 4, b = 3, a = 4, b = 3, c = 2 because the printing for loop was inside the while loop, so each time the BufferedReader read a new line, you iterated through the whole HashMap and printed its contents.
Example: When the BufferedReader reads the second line of the file, the HashMap hash already holds the key-value pair for a and has just received the one for b. As a result, in addition to the a already printed after the first line, it prints the current contents of the HashMap, including the redundant a. The same thing happens for the third line of the file.
Solution: By moving the for loop out of the while loop, you print the results only once the HashMap holds all its values, not while it is still being filled.
for (Map.Entry entry : hash.entrySet())
System.out.println(entry.getKey() + " = " + entry.getValue());
I hope this answer explains why you were getting that specific output.
Just move the printing loop outside of the reading loop.
public void getFreq() throws FileNotFoundException, IOException, Exception {
File file = new File("/Users/guestaccount/IdeaProjects/Project3/src/sample/testFile.txt");
BufferedReader br = new BufferedReader(new FileReader(file));
HashMap<Character, Integer> hash = new HashMap<>();
String line;
while ((line= br.readLine()) != null) {
line = line.toLowerCase();
line = line.replaceAll("\\s", "");
char[] chars = line.toCharArray();
for (char c : chars) {
if (hash.containsKey(c)){
hash.put(c, hash.get(c)+1);
}else{
hash.put(c,1);
}
}
}
for (Map.Entry entry : hash.entrySet()){
System.out.println(entry.getKey() + " = " + entry.getValue());
}
}
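The same fix can be written a little more compactly with Map.merge. This is only a sketch: it counts an in-memory list of lines instead of reading testFile.txt, and it swaps the HashMap for a TreeMap so that the keys come out sorted. The class and method names are made up.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LetterFrequencyDemo {
    // Counts every non-whitespace character over all lines, case-insensitively.
    static Map<Character, Integer> countLetters(List<String> lines) {
        Map<Character, Integer> counts = new TreeMap<>(); // TreeMap keeps the keys sorted
        for (String line : lines) {
            for (char c : line.toLowerCase().replaceAll("\\s", "").toCharArray()) {
                counts.merge(c, 1, Integer::sum); // insert 1 or add 1 to the existing count
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the lines of testFile.txt.
        Map<Character, Integer> counts = countLetters(List.of("aaaa", "bbb", "cc"));
        // Print once, after all lines have been counted.
        for (Map.Entry<Character, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```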
-Java- I have a text file in which I am storing an ID number, first name, and last name on each line. I'm using BufferedReader to display the text file line by line. However, I then need to take only the ID number from each line and store it in an array. If anyone can help, it would be greatly appreciated.
As you said, you are already printing each line read from the file; next you just need to split the line on the delimiter used in the file. Assuming a comma is the delimiter, all you need to do is split the line on the comma, take the first element, and store it in a List.
Here is the sample code:
public static void main(String[] args) throws Exception {
try(BufferedReader br = new BufferedReader(new FileReader("filename.txt"))) {
String line = null;
List<String> idList = new ArrayList<String>();
while((line = br.readLine()) != null) {
System.out.println(line); // you are already printing it
String[] tokens = line.split("\\s*,\\s*"); // assuming your line is like this --> 123, Pushpesh, Rajwanshi
if (tokens.length > 0) {
idList.add(tokens[0]); // ID will be accessed at zero index
}
}
idList.forEach(System.out::println);
}
}
With Java 8 and above, you can do it in a one-liner:
List<String> idList = Files.lines(Paths.get("filename.txt")).filter(x -> x.trim().length() > 0)
.map(x -> x.split("\\s*,\\s*")).map(x -> x[0]).collect(Collectors.toList());
idList.forEach(System.out::println);
List<String> idList = Files.readAllLines(
Paths.get(FILE_PATH),
Charset.defaultCharset()
).stream()
.map(line -> line.split(SEPARATOR)[DATA_INDEX])
.collect(Collectors.toList());
FILE_PATH = the file location ("c://users//..").
SEPARATOR = the string that separates the data fields (for 1:NAME:LAST_NAME the separator is ":").
DATA_INDEX = the index of the wanted field (for 1:NAME:LAST_NAME the ID index is 0).
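One thing to keep in mind is that String.split treats the separator as a regular expression, so a separator such as "|" or "." needs quoting. The following sketch works on made-up in-memory lines instead of a file; the class IdColumnDemo and the method extractColumn are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class IdColumnDemo {
    // Extracts the field at dataIndex from every non-blank line.
    static List<String> extractColumn(List<String> lines, String separator, int dataIndex) {
        String regex = Pattern.quote(separator); // split takes a regex; quote makes it literal
        return lines.stream()
                .filter(line -> !line.trim().isEmpty())      // skip blank lines
                .map(line -> line.split(regex))
                .filter(fields -> fields.length > dataIndex) // skip lines that are too short
                .map(fields -> fields[dataIndex].trim())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample lines in the 1:NAME:LAST_NAME layout.
        List<String> lines = Arrays.asList("1:ANNA:SMITH", "", "2:BOB:JONES");
        System.out.println(extractColumn(lines, ":", 0));
    }
}
```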
I have a small project.
The project reads a txt file into a String (the format is similar to CSV and contains semicolons, ";").
In the next steps, the String is converted to an ArrayList.
Then, using a Predicate, I remove the elements that do not interest me.
At the end I replace the ArrayList with a TreeSet to remove duplicates.
Unfortunately, there is a problem here, because duplicates still occur...
I changed the encoding to ANSI in Notepad++ to check whether there are any unnecessary characters.
Unfortunately, everything looks good and the duplicates are still there.
Uploaded input file - https://drive.google.com/open?id=1OqIKUTvMwK3FPzNvutLu-GYpvocUsSgu
Any idea?
public class OpenSCV {
private static final String SAMPLE_CSV_FILE_PATH = "/Downloads/all.txt";
public static void main(String[] args) throws IOException {
File file = new File(SAMPLE_CSV_FILE_PATH);
String str = FileUtils.readFileToString(file, "utf-8");
str = str.trim();
String str2 = str.replace("\n", ";").replace("\"", "" ).replace("\n\n",";").replace("\\*www.*\\","")
.replace("\u0000","").replace(",",";").replace(" ","").replaceAll(";{2,}",";");
List<String> lista1 = new ArrayList<>(Arrays.asList((str2.split(";"))));
Predicate<String> predicate = s -> !(s.contains("#"));
Set<String> removeDuplicates = new TreeSet<>(lista1);
removeDuplicates.removeIf(predicate);
String fileName2 = "/Downloads/allMails.txt";
try ( BufferedWriter bw =
new BufferedWriter (new FileWriter (fileName2)) )
{
for (String line : removeDuplicates) {
bw.write (line + "\n");
}
bw.close ();
} catch (IOException e) {
e.printStackTrace ();
}
}
}
Before calling str.replace, you can try str.trim() to remove any leading or trailing spaces and other unwanted, invisible characters.
str = str.trim();
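Note that trim() on the whole String only strips its two ends. If the duplicates differ by invisible characters attached to individual entries (for example a carriage return from Windows line endings, which is an assumption about the input file), each entry has to be trimmed before it goes into the TreeSet. A minimal sketch with made-up data and hypothetical names:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TrimEntriesDemo {
    // Splits on ";" and trims every entry before it goes into the set.
    static Set<String> uniqueEntries(String raw) {
        return Arrays.stream(raw.split(";"))
                .map(String::trim)         // strips surrounding spaces, tabs, \r and \n
                .filter(s -> !s.isEmpty()) // drops entries that were only whitespace
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Made-up input: the second entry carries a trailing carriage return.
        String raw = "first#example;first#example\r;second#example";
        System.out.println(new TreeSet<>(Arrays.asList(raw.split(";"))).size()); // untrimmed
        System.out.println(uniqueEntries(raw).size());                           // trimmed
    }
}
```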