In Java I would like to read a file line by line and print each line to the output.
I want to solve this with regular expressions.
private static java.util.regex.Pattern line = java.util.regex.Pattern.compile(".*\\n");
while (...)
{
System.out.print(scanner.next(line));
}
The regex in the code is not correct, as I get an InputMismatchException.
I have been working on this regex for 2 hours. Please help with it.
With regex powertoy I see that ".*\n" is correct, but my program still fails.
The whole source is:
/**
* Extracts the points in the standard input in off file format to the standard output in ascii points format.
*/
import java.util.regex.Pattern;
import java.util.Scanner;
class off_to_ascii_points
{
private static Scanner scanner = new Scanner(System.in);
private static Pattern fat_word_pattern = Pattern.compile("\\s*\\S*\\s*");
private static Pattern line = Pattern.compile(".*\\n", Pattern.MULTILINE);
public static void main(String[] args)
{
try
{
scanner.useLocale(java.util.Locale.US);
/* skip to the number of points */
scanner.skip(fat_word_pattern);
int n_points = scanner.nextInt();
/* skip the rest of the second line */
scanner.skip(fat_word_pattern); scanner.skip(fat_word_pattern);
for (int i = 0; i < n_points; ++i)
{
System.out.print(scanner.next(line));
/*
Here is my mistake.
next() reads only up to the delimiter,
which by default is any whitespace sequence.
That is, next() does not read to the end of the line,
which is what I wanted.
Changing "next(line)" to "nextLine()" solves the problem.
Setting the delimiter to the line separator
right before the loop solves the problem too.
*/
}
}
catch(java.lang.Exception e)
{
System.err.println("exception");
e.printStackTrace();
}
}
}
The beginning of an example input is:
OFF
4999996 10000000 0
-28.6663 -11.3788 -58.8252
-28.5917 -11.329 -58.8287
-28.5103 -11.4786 -58.8651
-28.8888 -11.7784 -58.9071
-29.6105 -11.2297 -58.6101
-29.1189 -11.429 -58.7828
-29.4967 -11.7289 -58.787
-29.1581 -11.8285 -58.8766
-30.0735 -11.6798 -58.5941
-29.9395 -11.2302 -58.4986
-29.7318 -11.5794 -58.6753
-29.0862 -11.1293 -58.7048
-30.2359 -11.6801 -58.5331
-30.2021 -11.3805 -58.4527
-30.3594 -11.3808 -58.3798
I first skip to the number 4999996, which is the number of lines containing point coordinates. These are the lines that I am trying to write to the output.
I suggest using
private static Pattern line = Pattern.compile(".*");
scanner.useDelimiter("[\\r\\n]+"); // Insert right before the for-loop
System.out.println(scanner.next(line)); //Replace print with println
Why your code doesn't work as expected:
This has to do with the Scanner class you use and how that class works.
The javadoc states:
A Scanner breaks its input into tokens
using a delimiter pattern, which by
default matches whitespace.
That means that when you call one of the Scanner's next*() methods, the scanner reads the input only until the next delimiter is encountered.
So your first call to scanner.next(line) starts reading the following line
-28.6663 -11.3788 -58.8252
and stops at the space after -28.6663. Then it checks whether the token (-28.6663) matches your pattern (.*\n). It cannot, because the token contains no newline, and that is why you get the InputMismatchException.
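To see the tokenization for yourself, here is a minimal sketch. The class name TokenDemo and the helper tokens() are made up for illustration; only Scanner, useDelimiter(), hasNext() and next() come from the JDK. It collects every token once with the default delimiter and once with a line-break delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenDemo {
    // Collects every token the scanner yields. A null delimiterRegex keeps
    // the default delimiter (any run of whitespace).
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            if (delimiterRegex != null) {
                sc.useDelimiter(delimiterRegex);
            }
            while (sc.hasNext()) {
                result.add(sc.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "-28.6663 -11.3788 -58.8252\n-28.5917 -11.329 -58.8287\n";
        // Default delimiter: six tokens, none of which can contain '\n'
        System.out.println(tokens(input, null));
        // Line-break delimiter: two tokens, each one a whole line
        System.out.println(tokens(input, "[\\r\\n]+"));
    }
}
```

With the default delimiter no token ever contains a line break, so a pattern that requires \n can never match a token.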
If you only want to print the file to standard out, why do you want to use regexps? If you know that you always want to skip the first two lines, there are simpler ways to accomplish it.
import java.util.Scanner;
import java.io.File;
public class TestClass {
public static void main(String[] args) throws Exception {
Scanner in=new Scanner(new File("test.txt"));
in.useDelimiter("\n"); // Or whatever line delimiter is appropriate
in.next(); in.next(); // Skip first two lines
while(in.hasNext())
System.out.println(in.next());
}
}
You have to switch the Pattern into multiline mode.
line = Pattern.compile("^.*$", Pattern.MULTILINE);
System.out.println(scanner.next(line));
By default the scanner uses whitespace as its delimiter. You must change the delimiter to the line separator before you read the lines that follow the initial skips. To do that, insert the following line before the for loop:
scanner.useDelimiter(Pattern.compile(System.getProperty("line.separator")));
and update the Pattern variable line as follows:
private static Pattern line = Pattern.compile(".*", Pattern.MULTILINE);
Thanks, everybody, for the help.
Now I understand my mistake:
The API documentation states that every nextT() method of the Scanner class first skips the delimiter pattern and then tries to read a T value. However, it fails to mention that each next...() method reads only up to the first occurrence of the delimiter!
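For completeness, here is a rough sketch of the nextLine() variant. It assumes the header looks like the example input (the OFF keyword, then a line whose first number is the point count). The class name OffPoints and the method copyPoints() are invented for this illustration.

```java
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

class OffPoints {
    // Copies the point lines of an OFF stream to out and returns how many were copied.
    static int copyPoints(Scanner scanner, PrintStream out) {
        scanner.useLocale(Locale.US);
        scanner.next();                   // the "OFF" keyword
        int nPoints = scanner.nextInt();  // number of point lines
        scanner.nextLine();               // discard the rest of the counts line
        int copied = 0;
        while (copied < nPoints && scanner.hasNextLine()) {
            // nextLine() ignores the delimiter and returns everything up to the line break
            out.println(scanner.nextLine());
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        copyPoints(new Scanner(System.in), System.out);
    }
}
```

No regular expression is needed here, because nextLine() does not depend on the delimiter at all.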
I have to write a program that prints the strings entered by a user, with every letter that matches the first one replaced by "#":
mum -> #u#
dad -> #a#
Swiss -> #wi## //also if it is UpperCase
Albert -> Albert //no letter is like the first
The user can enter as many strings as he wants. I thought of splitting the strings with the split method, but it doesn't work with the ArrayList.
import java.util.*;
public class CensuraLaPrima {
public static void main(String[] args) {
Scanner s= new Scanner (System.in);
String tdc;
ArrayList <String> Parolecens= new ArrayList <String>();
while (s.hasNextLine()) {
tdc=s.nextLine();
Parolecens.add(tdc);
}
System.out.println(Parolecens);
}
}
If you want to read in single words you can use Scanner.next() instead. It basically gives you every word, that is, every string without spaces or newlines. It also works if you enter two words at once.
I guess you want to do something like this. Feel free to use and change to your needs.
import java.util.*;
public class CensuraLaPrima {
public static void main(String[] args) {
Scanner s= new Scanner (System.in);
String tdc;
while (s.hasNext()) {
tdc=s.next();
char c = tdc.charAt(0);
System.out.print(tdc.replaceAll(Character.toLowerCase(c) +"|"+ Character.toUpperCase(c), "#"));
}
}
}
Edit:
Basically it boils down to this: if you want to read single words with the scanner, use .next() instead of .nextLine(). It treats every word separated by spaces or newlines as its own token, even if you enter an entire line at once.
Tasks calling for replacing characters in a string are often solved with the help of regular expressions in Java. In addition to using regex explicitly through the Pattern class, Java provides a convenience API for using regex capabilities directly on the String class through replaceAll method.
One approach to replacing all occurrences of a specific character with # in a case-insensitive manner is using replaceAll with a regex (?i)x, where x is the initial character of the string s that you are processing:
String result = s.replaceAll("(?i)"+s.charAt(0), "#");
You need to ensure that the string is non-empty before calling s.charAt(0).
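A small sketch that puts these pieces together is below. CensorFirst and censor() are hypothetical names. The Pattern.quote call is my addition, so that a first character such as . or * is not read as regex syntax. Leaving a word untouched when only its first letter matches is my reading of the "Albert -> Albert" example in the question.

```java
import java.util.regex.Pattern;

class CensorFirst {
    // Replaces every occurrence of the first character (ignoring case) with '#'.
    static String censor(String s) {
        if (s.isEmpty()) {
            return s; // guards the charAt(0) call below
        }
        Pattern first = Pattern.compile(
                Pattern.quote(String.valueOf(s.charAt(0))),
                Pattern.CASE_INSENSITIVE);
        // Assumption taken from the question's examples: if no later letter
        // equals the first one, the word is printed unchanged.
        if (!first.matcher(s.substring(1)).find()) {
            return s;
        }
        return first.matcher(s).replaceAll("#");
    }

    public static void main(String[] args) {
        for (String word : new String[] {"mum", "dad", "Swiss", "Albert"}) {
            System.out.println(word + " -> " + censor(word));
        }
    }
}
```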
Assuming that you've successfully created the ArrayList, I'd prefer using the Iterator interface to access each element in it. You can then assign each value to a String variable and call split() on that variable. Something like this:
//after your while loop
Iterator<String> it = Parolecens.iterator();
String myVariable = "";
String mySplitString[];
while(it.hasNext()) {
myVariable = it.next();
mySplitString = myVariable.split(" "); // use the delimiter of your choice
//rest of the code
}
I hope this helps :)
Suggestions are always appreciated.
My input txt file has this content:
aa1 aa2
bb1 bb2
cc1 cc2
After the cursor reaches the last line, how does hasNextLine() still return true while cc1 and cc2 are being read? I thought I would only get aa1 to bb2.
Output:
aa1
aa2
bb1
bb2
cc1
cc2
package test;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class Test {
public static void main(String[] args) throws FileNotFoundException {
File f = new File("K:\\Test\\a.txt");
System.out.println(f.exists());
Scanner reader = new Scanner(f);
while (reader.hasNextLine()) {
System.out.println(reader.next());
}
}
}
Either use hasNextLine() with nextLine(), or use hasNext() with next(). Mixing them can result in undesired behavior (unless done deliberately). See their documentation here. Quoting the documentation of next(Pattern):
Returns the next token if it matches the specified pattern. This method may block while waiting for input to scan, even if a previous invocation of hasNext(Pattern) returned true. If the match is successful, the scanner advances past the input that matched the pattern.
And also, in the Scanner class documentation, you can find this:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
So, the scanner reads the first line, it advances to the first delimiter (the white space) and prints the first part (aa1). Then, still on the same line, it prints the second part (aa2). Then, it moves to the second line, prints (bb1), then it prints (bb2), and still the while condition is true, since there is also a next line. So, finally, at the third line, it prints the first part (cc1), it prints the second part (cc2) and then stops, since there is no other line.
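As a hedged illustration of the two consistent pairings, here is a small sketch. PairingDemo, byToken() and byLine() are names I made up for it, and it reads from a string instead of the file so that it runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PairingDemo {
    // hasNext() paired with next(): one entry per whitespace-separated token
    static List<String> byToken(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    // hasNextLine() paired with nextLine(): one entry per line
    static List<String> byLine(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                out.add(sc.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "aa1 aa2\nbb1 bb2\ncc1 cc2";
        System.out.println(byToken(text)); // six tokens
        System.out.println(byLine(text));  // three lines
    }
}
```

Because each loop tests with the same kind of method it reads with, the condition always describes exactly what the next read will consume.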
I'm trying to do some homework for my computer science class and I can't seem to figure this one out. The question is:
Write a program that reads a line of text and then displays the line, but with the first occurrence of hate changed to love.
This sounded like a basic problem, so I went ahead and wrote this up:
import java.util.Scanner;
public class question {
public static void main(String[] args)
{
Scanner keyboard = new Scanner(System.in);
System.out.println("Enter a line of text:");
String text = keyboard.next();
System.out.println("I have rephrased that line to read:");
System.out.println(text.replaceFirst("hate", "love"));
}
}
I expect a string input of "I hate you" to read "I love you", but all it outputs is "I". When it detects the first occurrence of the word I'm trying to replace, it removes the rest of the string, unless it's the first word of the string. For instance, if I just input "hate", it will change it to "love". I've looked at many sites and documentations, and I believe I'm following the correct steps. If anyone could explain what I'm doing wrong here so that it does display the full string with the replaced word, that would be fantastic.
Thank you!
Your mistake was on the keyboard.next() call. This reads the first (space-separated) word. You want to use keyboard.nextLine() instead, as that reads a whole line (which is what your input is in this case).
Revised, your code looks like this:
import java.util.Scanner;
public class question {
public static void main(String[] args)
{
Scanner keyboard = new Scanner(System.in);
System.out.println("Enter a line of text:");
String text = keyboard.nextLine();
System.out.println("I have rephrased that line to read:");
System.out.println(text.replaceFirst("hate", "love"));
}
}
Try getting the whole line like this, instead of just the first token:
String text = keyboard.nextLine();
keyboard.next() only reads the next token.
Use keyboard.nextLine() to read the entire line.
In your current code, if you print the contents of text before the replace you will see that only I has been taken as input.
As an alternate answer, build a while loop and look for the word in question:
import java.util.Scanner;
public class question {
public static void main(String[] args)
{
// Start with the word we want to replace
String findStr = "hate";
// and the word we will replace it with
String replaceStr = "love";
// Need a place to put the response
StringBuilder response = new StringBuilder();
Scanner keyboard = new Scanner(System.in);
System.out.println("Enter a line of text:");
System.out.println("<Remember to end the stream with Ctrl-Z>");
String text = null;
while(keyboard.hasNext())
{
// Make sure we have a space between characters
if(text != null)
{
response.append(' ');
}
text = keyboard.next();
if(findStr.compareToIgnoreCase(text)==0)
{
// Found the word so replace it
response.append(replaceStr);
}
else
{
// Otherwise just return what was entered.
response.append(text);
}
}
System.out.println("I have rephrased that line to read:");
System.out.println(response.toString());
}
}
This takes advantage of the Scanner returning one word at a time. The matching will fail if the word is followed by a punctuation mark, though. Anyway, this is the answer that popped into my head when I read the question.
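One possible way around the punctuation problem is a word-boundary regex on the whole line. This is a different technique from the loop above, not a fix to it. Rephrase and rephrase() are made-up names, and the case-insensitive flag is an assumption that mirrors compareToIgnoreCase.

```java
import java.util.Scanner;

class Rephrase {
    // Replaces only the first whole-word occurrence of "hate", ignoring case.
    static String rephrase(String line) {
        // \b matches at a word boundary, so "hate," matches but "hateful" does not
        return line.replaceFirst("(?i)\\bhate\\b", "love");
    }

    public static void main(String[] args) {
        Scanner keyboard = new Scanner(System.in);
        System.out.println("Enter a line of text:");
        if (keyboard.hasNextLine()) {
            System.out.println("I have rephrased that line to read:");
            System.out.println(rephrase(keyboard.nextLine()));
        }
    }
}
```

Unlike the loop, this keeps the original spacing and changes only the first occurrence, which is what the assignment asks for.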
I have this code for identifying comments and printing them in Java:
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Solution {
public static void main(String[] args) {
Pattern pattern = Pattern.compile("(\\/\\*((.|\n)*)\\*\\/)|\\/\\/.*");
String code = "";
Scanner scan = new Scanner(System.in);
while(scan.hasNext())
{
code+=(scan.nextLine()+"\n");
}
Matcher matcher = pattern.matcher(code);
int nxtBrk=code.indexOf("\n");
while(matcher.find())
{
int i=matcher.start(),j=matcher.end();
if(nxtBrk<i)
{
System.out.print("\n");
}
System.out.print(code.substring(i,j));
nxtBrk = code.indexOf("\n",j);
}
scan.close();
}
}
Now when I try the code against this input
/*This is a program to calculate area of a circle after getting the radius as input from the user*/
#include<stdio.h>
int main()
{ //something
It outputs right and only the comments. But when I give the input
/*This is a program to calculate area of a circle after getting the radius as input from the user*/
#include<stdio.h>
int main()
{//ok
}
/*A test run for the program was carried out and following output was observed
If 50 is the radius of the circle whose area is to be calculated
The area of the circle is 7857.1429*/
The program outputs the whole code instead of just the comments. I don't know what goes wrong when those last lines are added.
EDIT: a parser is not an option because I am solving challenge problems and have to use a programming language. Link: https://www.hackerrank.com/challenges/ide-identifying-comments
Parsing source code with regular expressions is very unreliable. I'd suggest you use a specialized parser. Creating one is pretty simple using antlr. And, since you seem to be parsing C source files, you can use the C grammar.
Your pattern, shorn of its Java quoting (and some unnecessary backslashes), is this:
(/\*((.|
)*)\*/)|//.*
That's fine enough, except that it has only greedy quantifiers, which means that it will match from the first /* to the last */. You want a non-greedy quantifier instead, to get this pattern:
(/\*((.|
)*?)\*/)|//.*
Small change, big consequence since it now matches to the first */ after the /*. Re-encoded as Java code.
Pattern pattern = Pattern.compile("(/\\*((.|\n)*?)\\*/)|//.*");
(Be aware that you are very close to the limit of what it is sensible to match with regular expressions. Indeed, it's actually incorrect since you might have strings with /* or // in. But you'll probably get away with it…)
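To make the difference concrete, here is a small sketch that runs both patterns over the same input. CommentFinder and find() are names invented for the illustration. The two patterns are the one from the question and the corrected one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentFinder {
    // Pattern from the question: * is greedy
    static final Pattern GREEDY = Pattern.compile("(/\\*((.|\n)*)\\*/)|//.*");
    // Corrected pattern: *? is reluctant and stops at the first */
    static final Pattern RELUCTANT = Pattern.compile("(/\\*((.|\n)*?)\\*/)|//.*");

    // Returns every match of pattern in code, in order of appearance.
    static List<String> find(Pattern pattern, String code) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(code);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String code = "/*a*/\nint x; //b\n/*c\nd*/\n";
        // One match that runs from the first /* to the last */
        System.out.println(find(GREEDY, code).size());
        // Three matches, one per comment
        System.out.println(find(RELUCTANT, code).size());
    }
}
```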
I have a text file as follows:
Title
XYZ
Id name
1 abc
2 pqr
3 xyz
I need to read the content starting with the integer value and I used the regular expression as in the following code.
public static void main(String[] args) throws FileNotFoundException {
FileInputStream file= new FileInputStream("C:\\Users\\ap\\Downloads\\sample1.txt");
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
if (line.startsWith("[0-9]")) {
System.out.println("Line: "+line);
}
}
}
The above code can't detect the lines starting with integers. However, it works fine if single integer values are passed to startsWith() function.
Please tell me where I went wrong.
The String#startsWith(String) method doesn't take a regex. It takes a literal string.
To check whether the first character is a digit, get the character at index 0 with String#charAt(int index) and test it with Character#isDigit(char). Guard against empty lines first, since charAt(0) throws on an empty string:
if (!line.isEmpty() && Character.isDigit(line.charAt(0))) {
System.out.println(line);
}
For regex you can use the "matches" method, like this:
line.matches("^[0-9].*")
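Here is a quick sketch comparing the two checks. DigitLines and its method names are made up for it. One difference to keep in mind: Character.isDigit also accepts non-ASCII Unicode digits, while [0-9] accepts only ASCII ones.

```java
class DigitLines {
    // Character-based check. The isEmpty() guard avoids charAt(0) on an empty line.
    static boolean startsWithDigit(String line) {
        return !line.isEmpty() && Character.isDigit(line.charAt(0));
    }

    // Regex-based check. matches() must cover the whole string, hence the trailing .*
    static boolean startsWithDigitRegex(String line) {
        return line.matches("[0-9].*");
    }

    public static void main(String[] args) {
        String[] lines = {"Title", "XYZ", "Id name", "1 abc", "2 pqr", "3 xyz", ""};
        for (String line : lines) {
            if (startsWithDigit(line)) {
                System.out.println("Line: " + line);
            }
        }
    }
}
```

By contrast, line.startsWith("[0-9]") is true only for a line that literally begins with the five characters [0-9].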