Java using \034 as delimiter in a string - java

I am trying to use '\034' field separator character as a delimiter in a string.
The issue is when I hardcode "\034"+opField and write it to a file it works, but if the "\034" character is read from a file, it writes the output as string "col1\034col2'.
I tried using StringBuilder but it escapes the \034 to "\\034".
I am using the following code to read the character from the file:
try (BufferedReader br = new BufferedReader(new FileReader(fConfig))){
int lc = 1;
for(String line;(line = br.readLine())!=null;){
String[] rowList = line.split(delim);
int row_len = rowList.length;
if (row_len<2){
System.out.println("Incorrect dictionary file row:"+fConfig.getAbsolutePath()+"\nNot enough values found at row:"+line);
}else{
String key = rowList[0];
String value = rowList[1];
dictKV.put(key, value);
}
lc++;
}
}catch(Exception e){
throw e;
}
Any help is welcome...
[update]: The same thing is happening with '\t' character, if harcoded fine, but if read from a file its getting appended as characters. "col0\tcol1"
if(colAl.toLowerCase().contains(" as ")){
String temp = colAl.replaceAll("[ ]+as[ ]+"," | ");
ArrayList<String> tempA = this.brittle_delim(temp,'|');
colAl = tempA.get(tempA.size()-1);
colAl = colAl.trim();
}else {
ArrayList<String> tempA = this.brittle_delim(colAl,' ');
colAl = tempA.get(tempA.size()-1);
colAl = colAl.trim();
}
if(i==0){
sb.append(colAl);
headerCols+=colAl.trim();
}else{
headerCols+= this.output_field_delim + colAl;
sb.append(this.output_field_delim);
sb.append(colAl);
}
}
}
System.out.println("SB Header Cols:"+sb.toString());
System.out.println("Header Cols:"+headerCols);
Output:
SB Header Cols:
SPRN_CO_ID\034FISC_YR_MTH_DSPLY_CD\034CST_OBJ_CD\034PRFT_CTR_CD\034LEGL_CO_CD\034HEAD_CT_TYPE_ID\034FIN_OWN_CD\034FUNC_AREA_CD\034HEAD_CT_NR
Header Cols:
SPRN_CO_ID\034FISC_YR_MTH_DSPLY_CD\034CST_OBJ_CD\034PRFT_CTR_CD\034LEGL_CO_CD\034HEAD_CT_TYPE_ID\034FIN_OWN_CD\034FUNC_AREA_CD\034HEAD_CT_NR
In the above code if I do the following I am getting correct results:
headerCols+= "\034"+ colAl;
output:
SPRN_CO_IDFISC_YR_MTH_DSPLY_CDCST_OBJ_CDPRFT_CTR_CDLEGL_CO_CDHEAD_CT_TYPE_IDFIN_OWN_CDFUNC_AREA_CDHEAD_CT_NR
The FS characters are there even if they are geting removed here

You should provide an example demonstrating your problem. Not just incomplete code snippets.
Following runable snippet does what you explained.
// create a file one line
byte[] bytes = "foo bar".getBytes(StandardCharsets.ISO_8859_1);
String fileName = "/tmp/foobar";
Files.write(Paths.get(fileName), bytes);
String headerCols = "";
String outputFieldDelim = "\034";
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
// read the line from the file and split by blank character
String[] cols = br.readLine().split(" ");
// contcatenate the values with "\034"
// but ... for your code ...
// don't concatenate String objects in a loop like below
// use a StringBuilder or StringJoiner instead
headerCols += outputFieldDelim + cols[0];
headerCols += outputFieldDelim + cols[1];
}
// output with the "\034" character
System.out.println(headerCols);

I guess this is where I found my solution and the actual words for my Question.
How to unescape string literals in java

Related

How CSV parsing can be utilized - JAVA

I am given a file that will read the following:
"String",int,int
"String",int,int
"String",int,int
...
Given an unknown number of variables, a while (scanner.hasNextLine()) can solve to the number of entries. My goal is to take these three pieces of data and store them into a Node. I am using the method BinaryTree.addNode(String, int, int) for this. My issue comes to when I am trying to read in the data. I am trying to remove the commas within the document and then attempting to re-read the data using the following:
Scanner firstpass = new Scanner(file);
String input = firstpass.nextLine().replaceAll(",", "");
Scanner secondpass = new Scanner(input);
String variable1 = secondpass.next();
int variable2 = secondpass.nextInt();
int variable3 = secondpass.nextInt();
This however is a very innefective way of going about this.
UPDATED
The compiling errors can be fixed with the following:
try {
Scanner scanner1 = new Scanner(file);
while (scanner1.hasNextLine()) {
String inventory = scanner1.nextLine().replaceAll(",", " ");
Scanner scanner2 = new Scanner(inventory);
while (scanner2.hasNext()){
String i = scanner2.next();
System.out.print(i);
}
scanner2.close();
}
scanner1.close();
}
catch (FileNotFoundException ex) {
ex.printStackTrace();
}
which gives me the output:
"String"intint"String"intint"String"intint...
So I know I am on the right track. However any (spaces) within the "String" variable are removed. So they would output "SomeString" instead of "Some String". Also I still don't know how to remove the "" from the strings.
The format you've shown matches the CSV (Comma-Separated Values) format, so your best option is to use a CSV parser, e.g. Apache Commons CSV ™.
If you don't want to add a third-party library, you could use Regular Expression to parse the line.
Reading lines from a file should not be done with a Scanner. Use a BufferedReader instead. See Scanner vs. BufferedReader.
try (BufferedReader in = new BufferedReader(new FileReader(file))) {
Pattern p = Pattern.compile("\"(.*?)\",(-?\\d+),(-?\\d+)");
for (String line; (line = in.readLine()) != null; ) {
Matcher m = p.matcher(line);
if (! m.matches())
throw new IOException("Invalid line: " + line);
String value1 = m.group(1);
int value2 = Integer.parseInt(m.group(2));
int value3 = Integer.parseInt(m.group(3));
// use values here
}
} catch (IOException | NumberFormatException ex) {
ex.printStackTrace();
}
Note that this will not work if the string contains escaped characters, e.g. if it contains embedded double-quotes. For that, you should use a parser library.
The code above will correctly handle embedded spaces and commas.
I would instead of using
String input = firstpass.nextLine().replaceAll(",", "");
Scanner secondpass = new Scanner(input);
String variable1 = secondpass.next();
int variable2 = secondpass.nextInt();
int variable3 = secondpass.nextInt();
Use the following approach
String line = firstpass.nextLine();
String[] temp = line.split(",");
String variable1 = temp[0];
int variable2 = Integer.parseInt(temp[1]);
int variable3 = Integer.parseInt(temp[2]);

How to replace a specific character in a text file by giving the index value?

Came across this code which replaces all the characters of the given value.
File temp = File.createTempFile("newfile", ".txt");
FileWriter fw = new FileWriter(temp);
Reader reader = new FileReader(file);
BufferedReader br = new BufferedReader(reader);
while (br.ready()) {
fw.write(br.readLine().replaceAll("n", "j") + "\n");
}
fw.close();
br.close();
reader.close();
temp.renameTo(file);
}
instead of replacing all 'n's with 'j's isn't there a way to specify the index I want to change only?
You can use replaceFirst (that accepts a regex), using the following pattern:
(?<=.{N-1}).
Where "N" is the index you want to replace.
Of course there are many alternatives, look at the String API to fuel your creative fire.
The line within the while statement as follows:
fw.write(br.readLine().replaceAll("n", "j") + "\n");
This can be expanded to:
String str = br.readLine(); //get text
String replace = str.replaceAll("n", "j"); //replace content
replace = replace + "\n"; //add new line
fw.write(replace); //write to file
For your instance of replacing a certain index, you would want to do something similar:
StringBuilder str = new StringBuilder(br.readLine()); //Read line into StringBuilder
if(str.length() > 3) //Check if string is long enough
str.setChar(4, 'x'); //Replace character in line at index 4 to 'x'
fw.write(str); //write to file

Retrieving part of a string using a delimiter

Okay So I am creating an application but I'm not sure how to get certain parts of the string. I have read In a file as such:
*tp*|21394398437984|163600
*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567
*2*|AAA|1234567894561237|STOP|20140527|Success||Automated|DPSRN1234568
*3*|2
I need to read the lines beginning with 2 so I done:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
while (s.hasNext()) {
String str1 = s.nextLine ();
if(str1.startsWith("*2*")) {
System.out.print(str1);
}
}
So this will read the whole line I'm fine with that, Now my issue is I need to extract the 2nd line beginning with numbers the 4th with numbers the 5th with success and the 7th(DPSRN).
I was thinking about using a String delimiter with | as the delimiter but I'm not sure where to go after this any help would be great.
You should use String.split("|"), it will give you an array - String[]
Try following:
String test="*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
String tok[]=test.split("\\|");
for(String s:tok){
System.out.println(s);
}
Output :
*2*
AAA
1234567894561236
STOP
20140527
Success
Automated
DSPRN1234567
What you require will be placed at tok[2], tok[4], tok[5] and tok[8].
Just split the returned line based on your search, which would return an array of String elements where you can retrieve your elements based on their index:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
String searchLine = "";
while (s.hasNext()) {
searchLine = s.nextLine();
if(searchLine.startsWith("*2*")) {
break;
}
}
String[] strs = searchLine.split("|");
String secondArgument = strs[2];
String forthArgument = strs[4];
String fifthArgument = strs[5];
String seventhArgument = strs[7];
System.out.println(secondArgument);
System.out.println(forthArgument);
System.out.println(fifthArgument);
System.out.println(seventhArgument);

How to replace all special characters with another character in java?

I want to replace all 'special characters' with a special character in java
For example 'cash&carry' will become 'cash+carry' and also 'cash$carry' will become 'cash+carry'
I have a sample CSV file as
Here the CSV headers are 'What' and 'Where'
What,Where
salon,new+york+metro
pizza,los+angeles+metro
crate&barrel,los+angeles+metro
restaurants,los+angeles+metro
gas+station,los+angeles+metro
persian+restaurant,los+angeles+metro
car+wash,los+angeles+metro
book store,los+angeles+metro
garment,los+angeles+metro
"cash,carry",los+angeles+metro
cash&carry,los+angeles+metro
cash carry,los+angeles+metro
The expected output
What,Where
salon,new+york+metro
pizza,los+angeles+metro
crate+barrel,los+angeles+metro
restaurants,los+angeles+metro
gas+station,los+angeles+metro
persian+restaurant,los+angeles+metro
car+wash,los+angeles+metro
book+store,los+angeles+metro
garment,los+angeles+metro
cash+carry,los+angeles+metro
cash+carry,los+angeles+metro
cash+carry,los+angeles+metro
The sample code is as follows
String csvfile="BidAPI.csv";
try{
// create the 'Array List'
ArrayList<String> What=new ArrayList<String>();
ArrayList<String> Where=new ArrayList<String>();
BufferedReader br=new BufferedReader(new FileReader(csvfile));
StringTokenizer st=null;
String line="";
int linenumber=0;
int columnnumber;
int free=0;
int free1=0;
while((line=br.readLine())!=null){
linenumber++;
columnnumber=0;
st=new StringTokenizer(line,",");
while(st.hasMoreTokens()){
columnnumber++;
String token=st.nextToken();
if("What".equals(token)){
free=columnnumber;
System.out.println("the value of free :"+free);
} else if("Where".equals(token)){
free1=columnnumber;
System.out.println("the value of free1 :"+free1);
}
if(linenumber>1){
if (columnnumber==free){
What.add(token);
} else if(columnnumber==free1){
Where.add(token);
}
}
}
}
// converting the 'What' Array List to array
String[] what=What.toArray(new String[What.size()]);
// converting the 'Where' Array List to array
String[] where = Where.toArray(new String[Where.size()]);
for(int i=0;i<what.length;i++){
String data = what[i].replaceAll("[^A-Za-z0-9\",]| (?!([^\"]*\"){2}[^\"]*$)", "+").replace("\"", "");
System.out.println(data);
System.out.println(where[i]);
String finaldata = data+where[i];
String json = readUrl(desturl);
br.close();
}catch(Exception e){
System.out.println("There is an error :"+e);
}
All the special characters, all the spaces and the double quotes should be removed and replaced as in the desired output.
I am using value.replaceAll("[^A-Za-z0-9 ]", "+") , but it is not working.
Error
cash
carry"
Any help is appreciated. new to regex.
You need to:
replace all commas within quotes with +
replace non-whitelist (and you need to add commas to your whitelist)
+
remove double quotes
Try this:
line = line.replaceAll("[^A-Za-z0-9\",]|,(?!(([^\"]*\"){2})*[^\"]*$)", "+").replace("\"", "");
I think your regex is pretty close. Add an exception for comma's as well and get rid of the space and you are good.
BufferedReader r = new BufferedReader(new InputStreamReader(System.in));
String line;
while ((line = r.readLine()) != null)
{
String replaced = line.replace("\"", "");
replaced = replaced.replaceAll("[^A-Za-z0-9,]", "+");
System.out.println(replaced);
}
Of course, Strings are immutable in Java. Keep that in mind. replaceAll() returns a new String and does not modify the original instance.
Demo here.
You need to first find quote and replace , inside it with +. Next you can just use replaceAll("[^A-Za-z0-9,]", "+") so you will replace all non alphanumeric characters or , with +. Your code for that can use
Pattern p = Pattern.compile("\"([^\"]*)\"");
pattern to locate quotations and appendReplacement, appendTail from Matcher class to replace founded quotations with its new version.
So in short your code can look something like
Scanner scanner = new Scanner(new File(csvfile));
Pattern p = Pattern.compile("\"([^\"]*)\"");
StringBuffer sb = new StringBuffer();
while(scanner.hasNextLine()){
String line = scanner.nextLine();
Matcher m = p.matcher(line);
while (m.find()){//find quotes
//and replace their content with content with replaced `,` by `+`
//BTW group(1) holds part of quotation without `"` marsk
m.appendReplacement(sb, m.group(1).replace(',', '+'));
}
m.appendTail(sb);//we need to also add rest of unmatched data to buffer
//now we can just normally replace special characters with +
String result = sb.toString().replaceAll("[^A-Za-z0-9,]", "+");
//after job is done we can use result, so lest print it
System.out.println(result);
//lets not forget to reset buffer for next line
sb.delete(0, sb.length());
}
Answer to the question
String csvfile="BidAPI.csv";
try{
// create the 'Array List'
ArrayList<String> What=new ArrayList<String>();
ArrayList<String> Where=new ArrayList<String>();
BufferedReader br=new BufferedReader(new FileReader(csvfile));
StringTokenizer st=null;
String line="";
int linenumber=0;
int columnnumber;
int free=0;
int free1=0;
while((line=br.readLine())!=null){
line =line.replaceAll("[^A-Za-z0-9\",]|,(?!(([^\"]*\"){2})*[^\"]*$)", "+").replace("\"", "");
linenumber++;
columnnumber=0;
st=new StringTokenizer(line,",");
while(st.hasMoreTokens()){
columnnumber++;
String token=st.nextToken();
if("What".equals(token)){
free=columnnumber;
System.out.println("the value of free :"+free);
} else if("Where".equals(token)){
free1=columnnumber;
System.out.println("the value of free1 :"+free1);
}
if(linenumber>1){
if (columnnumber==free){
What.add(token);
} else if(columnnumber==free1){
Where.add(token);
}
}
}
}
// converting the 'What' Array List to array
String[] what=What.toArray(new String[What.size()]);
// converting the 'Where' Array List to array
String[] where = Where.toArray(new String[Where.size()]);
for(int i=0;i<what.length;i++){
String data = what[i].replaceAll("[^A-Za-z0-9\",]| (?!([^\"]*\"){2}[^\"]*$)", "+").replace("\"", "");
System.out.println(data);
System.out.println(where[i]);
String finaldata = data+where[i];
String json = readUrl(desturl);
br.close();
}catch(Exception e){
System.out.println("There is an error :"+e);
}

Buffered Reader find specific line separator char then read that line

My program needs to read from a multi-lined .ini file, I've got it to the point it reads every line that start with a # and prints it. But i only want to to record the value after the = sign. here's what the file should look like:
#music=true
#Volume=100
#Full-Screen=false
#Update=true
this is what i want it to print:
true
100
false
true
this is my code i'm currently using:
#SuppressWarnings("resource")
public void getSettings() {
try {
BufferedReader br = new BufferedReader(new FileReader(new File("FileIO Plug-Ins/Game/game.ini")));
String input = "";
String output = "";
while ((input = br.readLine()) != null) {
String temp = input.trim();
temp = temp.replaceAll("#", "");
temp = temp.replaceAll("[*=]", "");
output += temp + "\n";
}
System.out.println(output);
}catch (IOException ex) {}
}
I'm not sure if replaceAll("[*=]", ""); truly means anything at all or if it's just searching for all for of those chars. Any help is appreciated!
Try following:
if (temp.startsWith("#")){
String[] splitted = temp.split("=");
output += splitted[1] + "\n";
}
Explanation:
To process lines only starting with desired character use String#startsWith method. When you have string to extract values from, String#split will split given text with character you give as method argument. So in your case, text before = character will be in array at position 0, text you want to print will be at position 1.
Also note, that if your file contains many lines starting with #, it should be wise not to concatenate strings together, but use StringBuilder / StringBuffer to add strings together.
Hope it helps.
Better use a StringBuffer instead of using += with a String as shown below. Also, avoid declaring variables inside loop. Please see how I've done it outside the loop. It's the best practice as far as I know.
StringBuffer outputBuffer = new StringBuffer();
String[] fields;
String temp;
while((input = br.readLine()) != null)
{
temp = input.trim();
if(temp.startsWith("#"))
{
fields = temp.split("=");
outputBuffer.append(fields[1] + "\n");
}
}

Categories