I have the following data which I want to split up.
(1,167,2,'LT2A',45,'Weekly','1,2,3,4,5,6,7,8,9,10,11,12,13'),
to obtain each of the values:
1
167
2
'LT2A'
45
'Weekly'
'1,2,3,4,5,6,7,8,9,10,11,12,13'
I am using the Scanner class for this, with , as the delimiter.
But the last string ('1,2,3,4,5,6,7,8,9,10,11,12,13') causes problems because it contains commas itself.
I have also tried using ,' as the delimiter, but some of the values are not enclosed in quotes.
The question is quite specific to my needs, but I would appreciate any suggestions on how I could split this data up.
Thanks!
You can use simple logic, for example:
String str="1,167,2,'LT2A',45,'Weekly','1,2,3,4,5,6,7,8,9,10,11,12,13'";
Scanner s = new Scanner(str);
s.useDelimiter(",");
while(s.hasNext())
{
String element = s.next();
if(element.startsWith("'") && ! element.endsWith("'"))
{
while(s.hasNext())
{
element += "," + s.next();
if(element.endsWith("'"))
break;
}
}
System.out.println(element);
}
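Try:
String s = "1,167,2,'LT2A',45,'Weekly','1,2,3,4,5,6,7,8,9,10,11,12,13'";
Scanner sc = new Scanner(s);
sc.useDelimiter(",");
while (sc.hasNext()) {
    String n = sc.next();
    // a token that opens a quote without closing it was cut at an inner comma
    if (n.startsWith("'") && !n.endsWith("'")) {
        // findInLine returns the rest up to and including the closing quote,
        // or null if there is no closing quote
        String rest = sc.findInLine(".+?'");
        if (rest != null) {
            n = n + rest;
        }
    }
    System.out.println(n);
}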
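Another option, not taken from the answers above, is to skip Scanner and split with a regular expression that only matches commas outside single quotes. This is a minimal sketch that assumes quotes are always balanced and never escaped; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class QuoteAwareSplit {
    // A comma counts as a separator only if an even number of single quotes
    // follows it, i.e. the comma is not inside a quoted value.
    // Assumption: quotes are balanced and never escaped.
    private static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^']*'[^']*')*[^']*$)";

    static List<String> split(String record) {
        return Arrays.asList(record.split(COMMA_OUTSIDE_QUOTES));
    }

    public static void main(String[] args) {
        String str = "1,167,2,'LT2A',45,'Weekly','1,2,3,4,5,6,7,8,9,10,11,12,13'";
        // prints one value per line, the quoted list staying in one piece
        for (String element : split(str)) {
            System.out.println(element);
        }
    }
}
```

The lookahead rescans the rest of the string for every comma, so this is probably best kept for short records like the one in the question.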
For example, the content of a file is:
black=white
bad=good
easy=hard
I want to store these words in a map as key and value (e.g. {black=white, bad=good}). My problem is that when I read the string I have to skip the '=' character that separates the key from the value. How can I do this?
The code below reads keys and values from a file, but it only works when the words are separated by a space, and I need them to be separated by '='.
System.out.println("File name:");
String pathToFile = in.nextLine();
File cardFile = new File(pathToFile);
try(Scanner scanner = new Scanner(cardFile)){
while(scanner.hasNext()) {
key = scanner.next();
value = scanner.next();
flashCards.put(key, value);
}
}catch (FileNotFoundException e){
System.out.println("No file found: " + pathToFile);
}
Use the split method of String in Java.
After reading a line, split the string and take the key and value like so:
String[] keyVal = line.split("=");
System.out.println("key is " + keyVal[0]);
System.out.println("value is " + keyVal[1]);
You can change the delimiter for the scanner.
public static void main (String[] args) throws java.lang.Exception
{
String s = "black=white\nbad=good\neasy=hard";
Scanner scan = new Scanner(s);
scan.useDelimiter("\\n+|=");
while(scan.hasNext()){
String key = scan.next();
String value = scan.next();
System.out.println(key + ", " + value);
}
}
The output:
black, white
bad, good
easy, hard
Changing the delimiter can be tricky, and it may be better to just read each line and then parse it. For example, "\\n+|=" splits the tokens on either one or more newlines or an "=". The line ending is hard-coded, though, so the pattern may need to change depending on the platform the file was created on (Windows files, for instance, end lines with \r\n).
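The read-each-line-then-parse approach could look roughly like this. It is only a sketch: the class and method names are invented, and skipping lines that have no "=" is an assumption about the desired behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

class KeyValueLines {
    // Reads the text line by line and splits each line at its first '='.
    static Map<String, String> parse(String text) {
        Map<String, String> pairs = new LinkedHashMap<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                int eq = line.indexOf('=');
                if (eq < 0) {
                    continue; // assumption: ignore lines that have no '='
                }
                pairs.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse("black=white\nbad=good\neasy=hard"));
    }
}
```

Because nextLine() handles the line endings itself, the pattern no longer has to know whether the file uses \n or \r\n. For a file, the same loop should work with a Scanner created from the File instead of from a String.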
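A simple "if" condition can handle it, provided "=" is read as a token of its own (for example "black = white"). Since key is a String, compare it with equals:
if (key.equals("=")) { break; }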
I have a file in the following format. Records are separated by newlines, but some records contain a line feed themselves, as shown below. I need to get each record and process it separately. The file could be a few MB in size.
<?aaaaa>
<?bbbb
bb>
<?cccccc>
I have the code:
FileInputStream fs = new FileInputStream(FILE_PATH_NAME);
Scanner scanner = new Scanner(fs);
scanner.useDelimiter(Pattern.compile("<\\?"));
while (scanner.hasNext()) {
String line = scanner.next();
System.out.println(line);
}
scanner.close();
But the result I got has the leading <? removed:
aaaaa>
bbbb
bb>
cccccc>
I know the Scanner consumes any input that matches the delimiter pattern. All I can think of is to add the delimiter pattern back to each record manually.
Is there a way to NOT have the delimiter pattern removed?
Break on a newline only when preceded by a ">" char:
scanner.useDelimiter("(?<=>)\\R"); // Note you can pass a string directly
\R matches any line break, independent of the platform (available since Java 8)
(?<=>) is a lookbehind that asserts (without consuming) that the previous char is a >
Plus it's cool because <=> looks like Darth Vader's TIE fighter.
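To see the lookbehind delimiter in action, here is a small sketch that runs it on the sample from the question, using a String instead of the file. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class RecordSplitter {
    static List<String> records(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // break on a line break only when the character before it is '>'
            scanner.useDelimiter("(?<=>)\\R");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "<?aaaaa>\n<?bbbb\nbb>\n<?cccccc>";
        for (String record : records(sample)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Each record keeps its leading <? because only the line break is consumed. The line feed inside the second record is kept as well, so it may still need to be removed before further processing.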
I'm assuming you want to ignore the newline character '\n' everywhere.
I would read the whole file into a String and then remove all of the '\n's in the String. The part of the code this question is about looks like this:
String fileString = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
fileString = fileString.replace("\n", "");
Scanner scanner = new Scanner(fileString);
... //your code
Feel free to ask any further questions you might have!
Here is one way of doing it by using a StringBuilder:
public static void main(String[] args) throws FileNotFoundException {
Scanner in = new Scanner(new File("C:\\test.txt"));
StringBuilder builder = new StringBuilder();
String input = null;
while (in.hasNextLine() && null != (input = in.nextLine())) {
for (int x = 0; x < input.length(); x++) {
builder.append(input.charAt(x));
if (input.charAt(x) == '>') {
System.out.println(builder.toString());
builder = new StringBuilder();
}
}
}
in.close();
}
Input:
<?aaaaa>
<?bbbb
bb>
<?cccccc>
Output:
<?aaaaa>
<?bbbb bb>
<?cccccc>
I've found a strange behaviour of the java.util.Scanner class.
I need to split a String variable into a set of tokens separated by ";".
If I take a string made of 1022 "a" characters followed by n ";" characters, I expect n tokens.
However, if n=3 the Scanner class fails: it "sees" just 2 tokens instead of 3. I think it is related to the internal char buffer size of the Scanner class.
a[x1022]; -> 1 token: correct
a[x1022];; -> 2 tokens: correct
a[x1022];;; -> 2 tokens: wrong (I expect 3 tokens)
a[x1022];;;; -> 4 tokens: correct
I attach a simple example:
import java.util.Scanner;
public class ScannerTokenTest {
    public static void main(String[] args) {
        // generate test string: (1022x "a") + (3x ";")
        String testLine = "";
        for (int i = 0; i < 1022; i++) {
            testLine = testLine + "a";
        }
        testLine = testLine + ";;;";
        // set up the Scanner variable
        String delimeter = ";";
        Scanner lineScanner = new Scanner(testLine);
        lineScanner.useDelimiter(delimeter);
        int p = 0;
        // tokenization
        while (lineScanner.hasNext()) {
            p++;
            String currentToken = lineScanner.next();
            System.out.println("token" + p + ": '" + currentToken + "'");
        }
        lineScanner.close();
    }
}
I would like to avoid this incorrect behaviour. Could you help me?
Thanks
My recommendation is to report the bug to Oracle, and then work around it by using a BufferedReader to read your InputStream (you'll also need the InputStreamReader class). What Scanner does isn't magic, and working directly with BufferedReader in this case only requires slightly more code than you were already using.
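As a rough illustration of the BufferedReader route, the sketch below reads one character at a time and cuts a token at every delimiter. The class and method names are invented here, and the rule that a trailing delimiter does not start a new empty token is an assumption chosen to match the token counts expected in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DelimitedReader {
    static List<String> tokens(Reader source, char delimiter) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            int c;
            while ((c = reader.read()) != -1) {
                if (c == delimiter) {
                    // a delimiter closes the current token, even an empty one
                    result.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // text after the last delimiter is a token only if it is non-empty
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        StringBuilder testLine = new StringBuilder();
        for (int i = 0; i < 1022; i++) {
            testLine.append('a');
        }
        testLine.append(";;;");
        List<String> found = tokens(new StringReader(testLine.toString()), ';');
        System.out.println("tokens: " + found.size());
    }
}
```

For input that comes from a stream, an InputStreamReader wrapped around the InputStream could be passed in place of the StringReader.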
I am currently learning Java. I am trying to read a file using the Scanner class, but I want to ignore the rest of each line after a certain character, e.g. #. Say the file reads:
P5 #ignorethis
#ignorealso
123 123 123 #thisneedstogo
355 255 345 #alsothis goes
The file I am trying to read has comments that start with '#' and last until the end of the line. I want to read the strings in the file while ignoring '#' and everything after it.
Any help would be much appreciated, thanks :)
Read the file one line at a time and then consider using the replaceAll(String regex, String replacement) method, which takes a regular expression, on the line you have just read. You could use a pattern such as #.*$ to replace the # character and whatever follows it, up to the end of the string, with an empty string.
You could then write the string to some other file or to the console once you are done.
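A minimal sketch of that idea, using the sample lines from the question as an in-memory string. The class and method names are made up, and trimming the result and skipping lines that end up empty are extra assumptions.

```java
import java.util.Scanner;

class CommentStripper {
    static String stripComment(String line) {
        // "#.*$" matches the first '#' and everything after it on the line
        return line.replaceAll("#.*$", "");
    }

    public static void main(String[] args) {
        String text = "P5 #ignorethis\n"
                + "#ignorealso\n"
                + "123 123 123 #thisneedstogo\n"
                + "355 255 345 #alsothis goes";
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                // assumption: surrounding whitespace and empty lines are not wanted
                String cleaned = stripComment(scanner.nextLine()).trim();
                if (!cleaned.isEmpty()) {
                    System.out.println(cleaned);
                }
            }
        }
    }
}
```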
From the Scanner class documentation:
A Scanner breaks its input into tokens using a delimiter pattern,
which by default matches whitespace.
You can do it using the useDelimiter method and a regular expression.
As an example:
public static void main(String[] args) {
String s = " P5 #ignorethis\n" +
" #ignorealso\n" +
" 123 123 123 #thisneedstogo\n" +
" 355 255 345 #alsothis goes";
Scanner scanner = new Scanner(s).useDelimiter("#.*");
while (scanner.hasNext()){
System.out.print(scanner.next());
}
}
You can write something like this:
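Scanner scanner = new Scanner(new FileInputStream(new File("input.txt")));
while (scanner.hasNextLine()) {
    String str = scanner.nextLine();
    int hash = str.indexOf("#");
    // indexOf returns -1 when the line has no '#'; keep the whole line then
    System.out.println(hash == -1 ? str : str.substring(0, hash));
}
scanner.close();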
Hope this helps.
You can also read the whole line and then use just one part of it, like this:
public void readFile(File file) {
    // try-with-resources closes the scanner; nothing is read if the file is missing
    try (Scanner scanner = new Scanner(file)) {
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            // limit 2 keeps parts[0] available even when the line is just "#"
            String[] parts = line.split("#", 2);
            System.out.println(parts[0]);
        }
    } catch (FileNotFoundException ex) {
        ex.printStackTrace();
        System.out.println("File not found");
    }
}
This method reads each line into a String, splits it at "#", and uses the part before the "#". Here is the output:
P5
123 123 123
355 255 345
try (BufferedReader br = new BufferedReader(new FileReader("name_file"))) {
    String line;
    String result = "";
    while ((line = br.readLine()) != null) {
        if (line.trim().startsWith("#")) {
            // the whole line is a comment
            System.out.println("next line");
        } else {
            int index = line.indexOf("#");
            // keep only the text before '#', or the whole line if there is none
            String kept = (index != -1) ? line.substring(0, index) : line;
            result = result + " " + kept;
        }
    }
    System.out.println(result);
} catch (IOException e) {
    e.printStackTrace();
}
Okay, so I am creating an application, but I'm not sure how to get certain parts of a string. I have read in a file like this:
*tp*|21394398437984|163600
*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567
*2*|AAA|1234567894561237|STOP|20140527|Success||Automated|DPSRN1234568
*3*|2
I need to read the lines beginning with *2*, so I did:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
while (s.hasNextLine()) {
String str1 = s.nextLine ();
if(str1.startsWith("*2*")) {
System.out.print(str1);
}
}
This reads the whole line, which is fine. My issue now is that I need to extract certain fields from it: the 2nd (a number), the 4th (a number), the 5th (Success) and the 7th (DPSRN...).
I was thinking about splitting the string with | as the delimiter, but I'm not sure where to go from there. Any help would be great.
You should use String.split("\\|"); it will give you an array, a String[]. The | has to be escaped because split takes a regular expression.
Try the following:
String test="*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
String tok[]=test.split("\\|");
for(String s:tok){
System.out.println(s);
}
Output:
*2*
AAA
1234567894561236
STOP
20140527
Success
Automated
DSPRN1234567
What you require will be at tok[2], tok[4], tok[5] and tok[8]; tok[6] is the empty field between the two adjacent | characters.
Just split the returned line based on your search, which would return an array of String elements where you can retrieve your elements based on their index:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
String searchLine = "";
while (s.hasNextLine()) {
searchLine = s.nextLine();
if(searchLine.startsWith("*2*")) {
break;
}
}
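// split takes a regular expression, so the | must be escaped
String[] strs = searchLine.split("\\|");
String secondArgument = strs[2];
String forthArgument = strs[4];
String fifthArgument = strs[5];
// the empty field after "Success" shifts the DSPRN value to index 8
String seventhArgument = strs[8];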
System.out.println(secondArgument);
System.out.println(forthArgument);
System.out.println(fifthArgument);
System.out.println(seventhArgument);
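Since the unescaped | keeps causing trouble, here is a small sketch that sidesteps the escaping with Pattern.quote and also shows where the empty field ends up. The class and method names are hypothetical, and the limit of -1 is an extra choice so that trailing empty fields are not dropped.

```java
import java.util.regex.Pattern;

class PipeFields {
    static String[] fields(String line) {
        // Pattern.quote makes the | literal; limit -1 keeps trailing empty fields
        return line.split(Pattern.quote("|"), -1);
    }

    public static void main(String[] args) {
        String line = "*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
        String[] f = fields(line);
        // print every field with its index; index 6 is the empty field
        for (int i = 0; i < f.length; i++) {
            System.out.println(i + ": '" + f[i] + "'");
        }
    }
}
```

Printing the index next to each field makes it easy to check which positions hold the values that are needed.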