Weird BufferedReader behavior for a huge file

I am getting a very weird error. My program reads a csv file.
Whenever it gets to this line:
"275081";"cernusco astreet, milan, italy";NULL
I get an error:
In the debug screen, I see that the BufferedReader read only
"275081";"cernusco as
That is only part of the line, but it should read the whole line.
What bugs me the most is that when I simply remove that line from the csv file, the bug disappears! The program runs without any problem. I can remove the line, maybe it is bad input or whatever, but I want to understand why I am having this problem.
For better understanding, I will include part of my code here:
reader = new BufferedReader(new FileReader(userFile));
reader.readLine(); // skip first line
while ((line = reader.readLine()) != null) {
String[] fields = line.split("\";\"");
int id = Integer.parseInt(stripPunctionMark(fields[0]));
String location = fields[1];
if (location.contains("\";")) { // When there is no age. The data is represented as "location";NULL. We cannot split for ";" here. So check for "; and split.
location = location.split("\";")[0];
System.out.printf("Added %d at %s\n", id, location);
people.put(id, new Person(id, location));
numberOfPeople++;
}
else {
int age = Integer.parseInt(stripPunctionMark(fields[2]));
people.put(id, new Person(id, location, age));
System.out.printf("Added %d at: %s age: %d \n", id, location, age);
numberOfPeople++;
}
Here is a short excerpt of the csv file around the line where I encountered the error:
"275078";"el paso, texas, usa";"62"
"275079";"istanbul, eurasia, turkey";"26"
"275080";"madrid, n/a, spain";"29"
"275081";"cernusco astreet, milan, italy";NULL
"275082";"hacienda heights, california, usa";"16"
"275083";"cedar rapids, iowa, usa";"22"

This has nothing whatsoever to do with BufferedReader. It doesn't even appear in the stack trace.
It has to do with your failure to check the result and length of the array returned by String.split(). Instead you are just assuming the input is well-formed, with at least three columns in each row, and you have no defences if it isn't.
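A minimal sketch of the kind of check described above, assuming every row has exactly three semicolon-separated columns and that no location contains a semicolon itself. The class name CsvLineCheck, the Row holder and the unquote helper are hypothetical and are not part of the original code.

```java
import java.util.Optional;

public class CsvLineCheck {
    /** Hypothetical holder for one parsed row; age is null when the file says NULL. */
    public static final class Row {
        public final int id;
        public final String location;
        public final Integer age;

        Row(int id, String location, Integer age) {
            this.id = id;
            this.location = location;
            this.age = age;
        }
    }

    /** Returns the parsed row, or empty if the line is not well-formed. */
    public static Optional<Row> parse(String line) {
        // Limit -1 keeps trailing empty fields, so the length check is meaningful.
        String[] fields = line.split(";", -1);
        if (fields.length != 3) {
            return Optional.empty();
        }
        try {
            int id = Integer.parseInt(unquote(fields[0]));
            String location = unquote(fields[1]);
            String rawAge = unquote(fields[2]);
            Integer age = rawAge.equals("NULL") ? null : Integer.valueOf(rawAge);
            return Optional.of(new Row(id, location, age));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    /** Strips one pair of surrounding double quotes, if present. */
    public static String unquote(String s) {
        String t = s.trim();
        if (t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"")) {
            return t.substring(1, t.length() - 1);
        }
        return t;
    }

    public static void main(String[] args) {
        String good = "\"275081\";\"cernusco astreet, milan, italy\";NULL";
        String truncated = "\"275081\";\"cernusco as";
        System.out.println(parse(good).isPresent());
        System.out.println(parse(truncated).isPresent());
    }
}
```

With a check like this, a short or truncated line is reported as malformed instead of failing later with an exception on fields[1] or fields[2].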

Related

How do I fix a NumberFormatException (from text file input)?

I was wondering if I could have some help with a NumberFormatException in code that reads text input.
The program should run properly, first putting 50 strings into the hashTable and then removing 10 of them afterwards.
I have tried storing the result of removeLine.next() in a String and then passing that String to Integer.parseInt, which didn't work.
Here is the class:
import java.io.*;
import java.util.*;
public class hashTest {
public static void main(String args[]) throws FileNotFoundException {
HashTable hashTable = new HashTable();
Scanner insert = new Scanner(new File("data1.txt"));
while(insert.hasNext()) {
String line = insert.nextLine();
Scanner insertLine = new Scanner(line);
insertLine.next();
insertLine.next();
int index = Integer.parseInt(insertLine.next());
String data = insertLine.nextLine();
hashTable.put(index, data);
}
Scanner remove = new Scanner(new File("data2.txt"));
while(remove.hasNext()) {
String line = remove.nextLine();
Scanner removeLine = new Scanner(line);
removeLine.next();
removeLine.next();
int index = Integer.parseInt(removeLine.next());
hashTable.remove(index);
}
}
}
data1.txt :
003 : 68682774 MALIK TULLER
004 : 24248685 FRANCE COELLO
005 : 25428367 DUSTY BANNON
006 : 79430806 MELVINA CORNEJO
007 : 98698743 MALIA HOGSTRUM
008 : 20316453 TOMASA POWANDA
009 : 39977566 CHONG MCOWEN
010 : 86770985 DUSTY CONFER
011 : 92800393 LINNIE GILMAN
012 : 31850991 WANETA DEWEES
013 : 81528001 NEAL HOLSTEGE
014 : 46531276 BRADLY BOMBACI
data2.txt :
92800393 LINNIE GILMAN
86770985 DUSTY CONFER
31850991 WANETA DEWEES
46531276 BRADLY BOMBACI
25428367 DUSTY BANNON
68682774 MALIK TULLER
18088219 PENNY JOTBLAD
48235250 KENNITH GRASSMYER
20316453 TOMASA POWANDA
54920021 TYSON COLBETH
22806858 LAVERNE WOLNIK
32244214 SHEMEKA HALLOWAY
81528001 NEAL HOLSTEGE
24248685 FRANCE COELLO
23331143 JUSTIN ADKIN
79430806 MELVINA CORNEJO
59245514 LESLEE PHIFER
64357276 SCOT PARREIRA
50725704 GENARO QUIDER
52298576 AUDIE UNCAPHER
54657809 MARTY ENOCHS
54526749 TOBI HEATLEY
24903965 ALONSO GILSTAD
84936051 DEONNA STRAZZA
62522327 AHMAD THAYER
90572271 ELIJAH METEVIER
88999386 ISMAEL ELKAN

NumberFormatExceptions with Integer.parseInt() are most often caused by attempting to read something into an int that is not actually an int. Try printing each line as it is read in. If you have a line that is not purely an int (e.g., Hello123), you will get this exception with Integer.parseInt(). A cleaner debugging method (and better coding practice) would be to catch the exception and print the problematic line. You will probably see right away what's causing the issue. When reading text input from anywhere, it's never good to assume that the data is of the format you're expecting.
When your input contains data other than the int values you need, you can read each line's values into an array and extract the proper value(s). Here's an example of how you might extract the values from a single line in your second data file. Keep in mind that this still makes assumptions about the input format and is therefore not completely foolproof.
try {
// Split the line by whitespace, saving the values into an array
String[] singleLineVals = someLine.split("\\s+");
// Extract the first value
int firstValue = Integer.parseInt(singleLineVals[0]);
} catch (NumberFormatException nfe) {
// Handle the exception
}

How to fix "GetStatus Write RFID_API_UNKNOWN_ERROR data(x)- Field can Only Take Word values" Android RFID 8500 Zebra

I am trying to develop an application to read and write RF tags. Reading is flawless, but I'm having issues with writing. Specifically, I get the error "GetStatus Write RFID_API_UNKNOWN_ERROR data(x)- Field can Only Take Word values".
I have tried reverse-engineering the Zebra RFID API Mobile app by obtaining the .apk and decoding it, but the code is obfuscated and I am not able to decipher why that application's Write works and mine doesn't.
I see the error in https://www.ptsmobile.com/rfd8500/rfd8500-rfid-developer-guide.pdf on page 185, but I have no idea what's causing it.
I tried forcibly converting the writeData to hex before I realized that the API does that on its own. I've also tried changing the length of the writeData, but then I just get a null value. I'm so lost.
public boolean WriteTag(String sourceEPC, long Password, MEMORY_BANK memory_bank, String targetData, int offset) {
Log.d(TAG, "WriteTag " + targetData);
try {
TagData tagData = null;
String tagId = sourceEPC;
TagAccess tagAccess = new TagAccess();
tagAccess.getClass();
TagAccess.WriteAccessParams writeAccessParams = tagAccess.new WriteAccessParams();
String writeData = targetData; //write data in string
writeAccessParams.setAccessPassword(Password);
writeAccessParams.setMemoryBank(MEMORY_BANK.MEMORY_BANK_USER);
writeAccessParams.setOffset(offset); // start writing from word offset 0
writeAccessParams.setWriteData(writeData);
// set retries in case of partial write happens
writeAccessParams.setWriteRetries(3);
// data length in words
System.out.println("length: " + writeData.length()/4);
System.out.println("length: " + writeData.length());
writeAccessParams.setWriteDataLength(writeData.length()/4);
// 5th parameter bPrefilter flag is true which means API will apply pre filter internally
// 6th parameter should be true in case of changing EPC ID it self i.e. source and target both is EPC
boolean useTIDfilter = memory_bank == MEMORY_BANK.MEMORY_BANK_EPC;
reader.Actions.TagAccess.writeWait(tagId, writeAccessParams, null, tagData, true, useTIDfilter);
} catch (InvalidUsageException e) {
System.out.println("INVALID USAGE EXCEPTION: " + e.getInfo());
e.printStackTrace();
return false;
} catch (OperationFailureException e) {
//System.out.println("OPERATION FAILURE EXCEPTION");
System.out.println("OPERATION FAILURE EXCEPTION: " + e.getResults().toString());
e.printStackTrace();
return false;
}
return true;
}
With
Password being 00
sourceEPC being the Tag ID obtained after reading
Memory Bank being MEMORY_BANK.MEMORY_BANK_USER
target data being "8426017056458"
offset being 0
It just keeps giving me "GetStatus Write RFID_API_UNKNOWN_ERROR data(x)- Field can Only Take Word values" and I have no idea why this is the case, nor do I know what a "Word value" is, although I've searched for it. This all happens under the "OperationFailureException" as well. Any help would be appreciated, as there are almost no resources online for this kind of thing.

Even though this question is a bit old, I had the same problem, so as far as I know this should be the answer.
Your target data "8426017056458" has length 13, and in writeAccessParams.setWriteDataLength(writeData.length()/4)
you divide it by four. Integer division gives 3 words, which covers only 12 hex digits, so the target data you are trying to write is longer than the declared WriteDataLength. This throws the error.
One 'word' is 4 hex digits, i.e. 16 bits. So your data has to be padded to a whole number of words and converted to hex first.
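The word arithmetic above can be sketched as follows. This is only an illustration of the padding and length calculation; it makes no calls to the Zebra API, the class and method names are hypothetical, and padding with zeros on the left (rather than the right) is an assumption that depends on how the data is read back.

```java
public class WordPadding {
    // One word is 16 bits, which is 4 hex digits.
    static final int HEX_DIGITS_PER_WORD = 4;

    /** Left-pads a hex string with zeros so its length is a multiple of 4 (assumed padding side). */
    public static String padToWordBoundary(String hex) {
        int remainder = hex.length() % HEX_DIGITS_PER_WORD;
        if (remainder == 0) {
            return hex;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = remainder; i < HEX_DIGITS_PER_WORD; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString();
    }

    /** Number of 16-bit words in an already padded hex string. */
    public static int wordCount(String paddedHex) {
        return paddedHex.length() / HEX_DIGITS_PER_WORD;
    }

    public static void main(String[] args) {
        String padded = padToWordBoundary("8426017056458");
        System.out.println(padded + " -> " + wordCount(padded) + " words");
    }
}
```

The padded string and its word count are the kind of values that would then be passed to setWriteData and setWriteDataLength in the question's code.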

Parsing a Tab Separated File

I'm attempting to parse a TSV file from IMDB:
$hutter Battle of the Sexes (2017) (as $hutter Boy) [Bobby Riggs Fan] <10>
NVTION: The Star Nation Rapumentary (2016) (as $hutter Boy) [Himself] <1>
Secret in Their Eyes (2015) (uncredited) [2002 Dodger Fan]
Steve Jobs (2015) (uncredited) [1988 Opera House Patron]
Straight Outta Compton (2015) (uncredited) [Club Patron/Dopeman]
$lim, Bee Moe Fatherhood 101 (2013) (as Brandon Moore) [Himself - President, Passages]
For Thy Love 2 (2009) [Thug 1]
Night of the Jackals (2009) (V) [Trooth]
"Idle Talk" (2013) (as Brandon Moore) [Himself]
"Idle Times" (2012) {(#1.1)} (as Brandon Moore) [Detective Ryan Turner]
As you can see, some lines start with a tab and some do not. I want a map with the actor's name as the key and a list of movies as the value. Between the actor's name and the movie listing there are one or more tabs.
My code:
while ((line = reader.readLine()) != null) {
Matcher matcher = headerPattern.matcher(line);
boolean headerMatchFound = matcher.matches();
if (headerMatchFound) {
Logger.getLogger(ActorListParser.class.getName()).log(Level.INFO, "Header for actor list found");
String newline;
reader.readLine();
while ((newline = reader.readLine()) != null) {
String[] fullLine = null;
String actor;
String title;
Pattern startsWithTab = Pattern.compile("^\t.*");
Matcher tab = startsWithTab.matcher(newline);
boolean tabStartMatcher = tab.matches();
if (!tabStartMatcher) {
fullLine = newline.split("\t.*");
System.out.println("Actor: " + fullLine[0] +
"Movie: " + fullLine[1]);
}//this line will have code to match lines that start with tabs.
}
}
}
The way I've done this only works for a few lines before I get an ArrayIndexOutOfBoundsException. How can I parse the lines and split them into at most 2 strings if they contain one or more tabs?

There are subtleties in parsing tab/comma-delimited data files having to do with quoting and escaping.
To save yourself a lot of work, frustration and headaches you really should consider using one of the existing CSV parsing libraries such as OpenCSV or Apache Commons CSV.
Posted as an answer instead of a comment because the OP has not stated a reason for reinventing the wheel and there are some tasks that really have been "solved" once and for all.
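If a library is not an option, here is a minimal plain-Java sketch of the two-way split the question asks for. It assumes the layout described in the question (actor name, one or more tabs, first title; continuation lines start with a tab; blank lines may separate actors) and does no quoting or escaping. The class name ActorListSketch is hypothetical. Note that the question's split("\t.*") treats the tab and everything after it as the delimiter, so only one element remains and fullLine[1] is out of bounds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ActorListSketch {
    /** Groups titles under the most recent actor name seen. */
    public static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> moviesByActor = new LinkedHashMap<>();
        String currentActor = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // blank lines carry no data
            }
            if (line.startsWith("\t")) {
                // Continuation line: another title for the current actor.
                if (currentActor != null) {
                    moviesByActor.get(currentActor).add(line.trim());
                }
                continue;
            }
            // Limit 2: split once on the first run of tabs, keep the rest intact.
            String[] parts = line.split("\t+", 2);
            currentActor = parts[0];
            List<String> movies = moviesByActor.computeIfAbsent(currentActor, k -> new ArrayList<>());
            if (parts.length == 2) {
                movies.add(parts[1].trim());
            }
        }
        return moviesByActor;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "$hutter\tBattle of the Sexes (2017)  (as $hutter Boy)  [Bobby Riggs Fan]  <10>",
                "\t\t\tSteve Jobs (2015)  (uncredited)  [1988 Opera House Patron]",
                "",
                "$lim, Bee Moe\t\tFatherhood 101 (2013)  (as Brandon Moore)  [Himself - President, Passages]");
        parse(sample).forEach((actor, movies) -> System.out.println(actor + " -> " + movies));
    }
}
```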

JAVA : Reading txt and take elements with StringTokenizer

I've got an error! I open my file, read a line, and then extract information from the line with StringTokenizer.
My code works with one line, but when I try to read another I get an error. Any help?
Here is my code:
try{
line = reader.readLine();
while(line!=null){
StringTokenizer st = new StringTokenizer(line,"\t");
timer=st.nextToken("\t");
int Itimer=Integer.parseInt(timer);
// System.out.println(Itimer);
what_to_do=st.nextToken("\t");
// System.out.print(what_to_do);
flightnumber=st.nextToken();
int Iflightnumber=Integer.parseInt(flightnumber);
// System.out.print(Iflightnumber);
departure=st.nextToken("\t");
// System.out.print(departure);
flighttime=st.nextToken("\t");
int Iflighttime=Integer.parseInt(flighttime);
// System.out.print(Iflighttime);
Key=new KeyFlight(Iflightnumber,Iflighttime);
flight=new Flight(Key,true);
if(what_to_do.equals("insert")){
// System.out.print("worked");
if(departure.equals("D")){
result=true;
}else{result=false;}
flight.setdeparture(result); // I could have created a new Flight, but to save resources I did it with a setter
EV.insert(flight);
// System.out.println("worked again");
}else if(what_to_do.equals("cancel")){
EV.remove(Key);
}
else if(what_to_do.equals("update")){
EV.UpdateKey(flight, Key);
}
line=reader.readLine();
and these are the errors:
Exception in thread "main" java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(Unknown Source)
at java.util.StringTokenizer.nextToken(Unknown Source)
at FlightSchedule.loadandStoreFile(FlightSchedule.java:54)
at FlightSchedule.main(FlightSchedule.java:13)
When I replaced the last reader.readLine() with line = null, it worked.
The code is OK; it's a StringTokenizer problem.
Example of my txt format: 0 insert 370 D 425

The problem could be that you are looking for a tab ("\t") in your StringTokenizer, while the space between your data may not be a tab but just ordinary whitespace. Try line.split("\\s+") instead.
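A minimal sketch of that suggestion, assuming each valid line has exactly five whitespace-separated fields as in the sample "0 insert 370 D 425". The class and method names are hypothetical. The field-count check also guards against blank or short lines, which are one possible cause of a NoSuchElementException from StringTokenizer.nextToken().

```java
import java.util.Optional;

public class FlightLineSketch {
    /** Splits on any run of spaces or tabs; empty unless the line has exactly five fields. */
    public static Optional<String[]> fields(String line) {
        String[] parts = line.trim().split("\\s+");
        return parts.length == 5 ? Optional.of(parts) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {"0 insert 370 D 425", "0\tinsert\t370\tD\t425", ""};
        for (String sample : samples) {
            Optional<String[]> parsed = fields(sample);
            if (parsed.isPresent()) {
                String[] f = parsed.get();
                int time = Integer.parseInt(f[0]);
                int flightNumber = Integer.parseInt(f[2]);
                int flightTime = Integer.parseInt(f[4]);
                System.out.println(f[1] + ": flight " + flightNumber
                        + ", time " + time + ", " + f[3] + ", flight time " + flightTime);
            } else {
                System.out.println("Skipping malformed line: '" + sample + "'");
            }
        }
    }
}
```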

Parse a task list

A file contains the following:
HPWAMain.exe 3876 Console 1 8,112 K
hpqwmiex.exe 3900 Services 0 6,256 K
WmiPrvSE.exe 3924 Services 0 8,576 K
jusched.exe 3960 Console 1 5,128 K
DivXUpdate.exe 3044 Console 1 16,160 K
WiFiMsg.exe 3984 Console 1 6,404 K
HpqToaster.exe 2236 Console 1 7,188 K
wmpnscfg.exe 3784 Console 1 6,536 K
wmpnetwk.exe 3732 Services 0 11,196 K
skypePM.exe 2040 Console 1 25,960 K
I want to get the process ID of the skypePM.exe. How is this possible in Java?
Any help is appreciated.

Algorithm
Open the file.
In a loop, read a line of text.
If the line of text starts with skypePM.exe then extract the number.
Repeat looping until all lines have been read from the file.
Close the file.
Implementation
import java.io.*;
public class T {
public static void main( String args[] ) throws Exception {
BufferedReader br = new BufferedReader(
new InputStreamReader(
new FileInputStream( "tasklist.txt" ) ) );
String line;
while( (line = br.readLine()) != null ) {
if( line.startsWith( "skypePM.exe" ) ) {
line = line.substring( "skypePM.exe".length() );
int taskId = Integer.parseInt( (line.trim().split( " " ))[0] );
System.out.println( "Task Id: " + taskId );
}
}
br.close();
}
}
Alternate Implementation
If you have Cygwin and related tools installed, you could use:
cat tasklist.txt | grep skypePM.exe | awk '{ print $2; }'

To find the process ID of the application skypePM.exe:
Open the file.
Read the lines one by one.
Find the line that begins with skypePM.exe.
In that line, skip the spaces after the process name and parse the number that follows.
That number is the process ID.
It is all string operations.
Remember that the format of the file must not change after you write the code.

If you really want to parse the output, you may need a different strategy. If your output file really is the result of a tasklist execution, then it should have some column headers at the top of it like:
Image Name PID Session Name Session# Mem Usage
========================= ======== ================ =========== ============
I would use these, in particular the set of equal signs with spaces, to break any subsequent strings using a fixed-width column strategy. This way, you could have more flexibility in parsing the output if needed (i.e. maybe someone is looking for java.exe or wjava.exe). Do keep in mind the last column may not be padded with spaces all the way to the end.
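The fixed-width idea above can be sketched as follows: the runs of equal signs define the column boundaries, and each data line is cut at those offsets. The class and method names are hypothetical, and the two-column table in main is a shortened stand-in for real tasklist output.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSketch {
    /** Returns the [start, end) offsets of each run of '=' in the separator line. */
    public static List<int[]> columnBounds(String separator) {
        List<int[]> bounds = new ArrayList<>();
        int i = 0;
        while (i < separator.length()) {
            if (separator.charAt(i) == '=') {
                int start = i;
                while (i < separator.length() && separator.charAt(i) == '=') {
                    i++;
                }
                bounds.add(new int[] {start, i});
            } else {
                i++;
            }
        }
        return bounds;
    }

    /** Cuts one data line at the given offsets; tolerates lines shorter than the last column. */
    public static String[] cut(String line, List<int[]> bounds) {
        String[] cells = new String[bounds.size()];
        for (int c = 0; c < cells.length; c++) {
            int start = Math.min(bounds.get(c)[0], line.length());
            int end = Math.min(bounds.get(c)[1], line.length());
            cells[c] = line.substring(start, end).trim();
        }
        return cells;
    }

    public static void main(String[] args) {
        // Shortened two-column layout: image name, then PID.
        String separator = "=========== ====";
        String row       = "skypePM.exe 2040";
        String[] cells = cut(row, columnBounds(separator));
        System.out.println("PID of " + cells[0] + ": " + cells[1]);
    }
}
```

Clamping the offsets to the line length covers the case mentioned above where the last column is not padded all the way to the end.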
I will say, in the strictest sense, the existing answers should work for just getting the PID.

Implementing this in Java is not the best approach. A shell or other scripting language would make it much easier. That said, JAWK is an implementation of awk in Java, and I think it may help you.
