I wrote this method to read file.txt and transfer its elements into an ArrayList.
My problem is that I don't want to store a whole line as one string. I want each element on the line as its own string.
public ArrayList<String> readData() throws IOException {
FileReader pp=new FileReader(filename);
BufferedReader nn=new BufferedReader(pp);
ArrayList<String> data=new ArrayList<String>();
String line;
while((line=nn.readLine()) != null){
data.add(line);
}
xoxo.close();
return data;
}
Is this possible?
What about reading the lines, but splitting each line into the single words?
while ((line = nn.readLine()) != null) {
for (String word : line.split(" ")) {
data.add(word); // add the single word, not the whole line
}
}
The split(" ") call in this example splits the line at every space character " " and returns the single words as an array.
If the words in the file are separated by another character (a comma, for example), you can pass that to split() instead:
line.split(",");
If I may, here is a somewhat easier way to read a text file:
Scanner scanner = new Scanner(new File(filename)); // new Scanner(filename) with a String would scan the file name itself, not the file
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
for (String word : line.split(" ")) {
data.add(word);
}
}
Well, not easier, but shorter :)
And one last piece of advice: if you give your variables more readable names, like bufferedReader instead of nn, pp, or xoxo, you will probably have fewer problems when the code grows more complex later on.
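Putting the pieces of this answer together, here is a minimal sketch of a word-by-word reader. The class name WordReader and the method readWords are made up for illustration, and it assumes that "element" means a whitespace-separated word. It takes any Reader so that it can be tried without a file; for the original case you would pass new FileReader(filename).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class WordReader {
    // Reads every line from the source and collects its whitespace-separated words.
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bufferedReader = new BufferedReader(source)) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) { // a blank line yields one empty token
                        words.add(word);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // A StringReader stands in for new FileReader(filename) here.
        System.out.println(readWords(new StringReader("one two\n  three   four\n")));
        // prints [one, two, three, four]
    }
}
```

Splitting on "\\s+" instead of " " means that tabs and runs of several spaces do not produce empty entries in the list.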
Use the split method of String.
String line = "This is line";
String[] a = line.split("\\s"); // \\s is the regular expression for a single whitespace character
// a[0] is "This"
// a[1] is "is"
// a[2] is "line"
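One thing worth knowing about this approach: "\\s" matches exactly one whitespace character, so two spaces in a row produce an empty string in the result. The sketch below (the class and method names are only illustrative) compares it with "\\s+", which treats a run of whitespace as one separator.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits at every single whitespace character.
    static String[] splitSingle(String s) {
        return s.split("\\s");
    }

    // Splits at runs of one or more whitespace characters.
    static String[] splitRuns(String s) {
        return s.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "This  is\tline"; // two spaces, then a tab
        System.out.println(Arrays.toString(splitSingle(line))); // prints [This, , is, line]
        System.out.println(Arrays.toString(splitRuns(line)));   // prints [This, is, line]
    }
}
```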
If by 'element' you mean each word, note that simply changing
line = nn.readLine()
to
line = nn.read()
will not fix your problem: read() returns a single character as an int, not a word, so that assignment does not even compile. Keep readLine() and split each line into words with split(). If by element you mean each character, read each line and break it up with a method such as String.toCharArray() or charAt().
I am trying to read an input file that contains the following:
input.txt
Hello world. Welcome,
to the java.
I have to wrap each sentence with a prefix (BEGIN) and a suffix (END), and the output should look like the following:
output expected:
BEGIN_Hello world_END.BEGIN_ Welcome,
to the java_END.
Following is my input file reading function. I am reading an entire file and storing it in array list:
InputDetails.java
private List<String> readInput = new ArrayList<>();
public void readFile() throws IOException {
while((inputLine = input.readLine()) != null ) {
readInput.add(inputLine);
}
}
//Getter to return input file content
public List<String> getReadInput() {
return readInput;
}
And following is my code for appending the string with BEGIN and END:
public void process() {
InputDetails inputD = new InputDetails();
for(int i=0;i<inputD.getReadInput().size();i++) {
String sentence = inputD.getReadInput().get(i);
String splitSentence[] = sentence.split("\\.");
for(int j=0;j<splitSentence.length;j++) {
System.out.println(splitSentence[j]);
splitSentence[j] = "BEGIN_"+splitSentence[j]+"__END";
}
sentence = String.join(".",splitSentence);
inputD.writeToFile(sentence);
}
}
output getting:
BEGIN_SENTENCE__Hello world__END_SENTENCE.BEGIN_SENTENCE__Welcome
to the java.
Note: Sentences are separated by a "." (period) character. Each output sentence should be prefixed with BEGIN_ and suffixed with __END. The period is not considered part of the sentence. Words in the input file are delimited by one or more spaces. A sentence is complete only when it reaches a period, even if that means it ends on a later line (as in the input shown above). The positions of all special characters must be retained in the output. There can also be a space between a word and a period or comma, e.g. java . or Welcome ,
Can anyone help me fix this? Thanks.
First, you'll need to join your string list input into a single string. Then, you can use the String.split() method to break up your input into parts delimited by the . character. You can then choose to either run a loop on that array or use the stream method (as shown below) to iterate over your sentences. On each part, simply append the required BEGIN_ and _END blocks to the sentence. You can use manual string concatenation using the + operator or use a string template with String.format() (as shown below). Finally, reintroduce the . delimiter used to break the input by joining the parts back into a single string.
String fullString = String.join("", getReadInput());
String result = Arrays.stream(fullString.split("\\."))
    .map(s -> String.format("BEGIN_%s_END", s))
    .collect(Collectors.joining("."));
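Note that split() drops the period after the last sentence, and joining the lines with "" removes the line breaks. As an alternative to split and join, here is a sketch that uses a single regular-expression replacement instead. The names SentenceWrapper and wrapSentences are hypothetical, the lines are assumed to be joined with "\n" first, and the BEGIN_ and _END markers follow the expected output in the question (swap in __END if that is the suffix you need).

```java
class SentenceWrapper {
    // Wraps every run of non-period characters that is followed by a period.
    // [^.] also matches line breaks, so a sentence may span several lines.
    // Text after the last period is an incomplete sentence and stays unwrapped.
    static String wrapSentences(String text) {
        return text.replaceAll("([^.]+)\\.", "BEGIN_$1_END.");
    }

    public static void main(String[] args) {
        // Lines joined with "\n", e.g. String.join("\n", getReadInput())
        String input = "Hello world. Welcome,\nto the java.";
        System.out.println(wrapSentences(input));
    }
}
```

Because the replacement works on the text in place, commas, spaces, and line breaks keep their positions.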
I am trying to store every single word of this file in an array so that I can apply my own language implementation to it. I have already applied split, but when I put the result into the variable parts, parts[0] displays the whole file instead of a single word, while parts[1] gives an error:
java.lang.ArrayIndexOutOfBoundsException: 2
How do I access every single word in this file?
String[] parts = line.split("\\s+");
System.out.print(parts[0] + '\n');
file test.snol contains
SNOL
INTO num IS 5
INTO res IS MULT num num
INTO res IS MULT res res
INTO res IS MOD res num
PRINT num
PRINT res
LONS
If you are using Java 8, you can do this in a single statement:
String[] words = Files.lines(Paths.get(PATH))
.flatMap(line -> Arrays.stream(line.split(" ")))
.toArray(String[]::new);
Alternatively, if you want each line as its own String[], collected in a list, you can use:
List<String[]> lines = Files.lines(Paths.get(PATH))
    .map(e -> e.split(" "))
    .collect(Collectors.toList());
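One caveat with Files.lines: the returned stream keeps the file open until it is closed, so it is usually wrapped in try-with-resources. A minimal sketch under that assumption follows; the class name FileWords and the method words are made up for illustration, and the demo writes a small temporary file so that it runs on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FileWords {
    // Returns all whitespace-separated words of the file, in order.
    static List<String> words(Path path) {
        // The stream holds the file open, so close it when done.
        try (Stream<String> lines = Files.lines(path)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                    .filter(word -> !word.isEmpty()) // blank lines yield one empty token
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path sample = Files.createTempFile("sample", ".snol");
        Files.write(sample, Arrays.asList("SNOL", "INTO num IS 5", "PRINT num", "LONS"));
        System.out.println(words(sample)); // prints [SNOL, INTO, num, IS, 5, PRINT, num, LONS]
        Files.delete(sample);
    }
}
```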
The regular expression token for whitespace is \s. Your code uses a forward slash (/) instead of a backslash (\). A forward slash has no special meaning, so your code is trying to match two forward slashes followed by one or more s characters.
In Java, regular expressions are passed as strings, so each backslash has to be escaped with a second backslash (unlike a forward slash, which needs no special handling). Your regular expression should read "\\s+", which matches one or more whitespace characters.
Your call to split should then return an array with each word from the line as a different element.
If you are reading your file line by line, you can access every word with code like
BufferedReader reader = new BufferedReader(new FileReader("D:\\test.snol"));
String line;
while ((line = reader.readLine()) != null) {
String[] words = line.split("\\s+");
for (String word : words) {
System.out.println(word);
}
}
I have a line in an input file.
It is arranged as follows (example):
(space)MOV(space)A,(space)(space)#20
When the program reads this line, I want to split() the string and put the parts into an array. I use the following code for this:
while((nline = bufreader.readLine()) != null)
{
String[] array = nline.split("[ ,]");
In other words, the string is split on two delimiters: space and comma. I expect the array to have a length of 3, but in practice I get 6.
As I understand it, the computer creates the array {"(space)", "MOV", "(space)", "A", "(space)", "(space)", "#20"}. However, I need this array: {"MOV", "A", "#20"}
How can I get this? Or how should I split the string with the delimiters mentioned above? (I suppose that nline.split("[ ,]") is not correct.)
I put the explanations in comments on the relevant lines. Have a look at this:
String nline;
BufferedReader bufreader = new BufferedReader(new FileReader(new File("nameOfYourFile")));
while((nline = bufreader.readLine()) != null) {
String trimmed = nline.trim(); // removing leading and trailing spaces
// System.out.println(trimmed); Output from this line: >>MOV A, #20<< (">>" and "<<" just to show where it begins and ends)
String[] splitted = trimmed.split("[ ,]+"); // split on runs of ' ' and/or ',' (so ", " counts as one delimiter); a '|' inside [] would be a literal pipe, not "or"
System.out.println(Arrays.toString(splitted)); // Output: [MOV, A, #20]
}
bufreader.close();
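To see why the original pattern gives 6 elements and the trimmed version gives 3, here is a small self-contained sketch. The class name OperandSplitter and the method tokens are only illustrative, and it assumes that spaces and commas are the only delimiters.

```java
import java.util.Arrays;

class OperandSplitter {
    // Trims the line, then splits on runs of spaces and/or commas.
    static String[] tokens(String line) {
        String trimmed = line.trim();
        // Splitting an empty string would return one empty token, so handle it separately.
        return trimmed.isEmpty() ? new String[0] : trimmed.split("[ ,]+");
    }

    public static void main(String[] args) {
        String line = " MOV A,  #20";
        // Every single delimiter splits, which leaves empty strings in the array.
        System.out.println(line.split("[ ,]").length);     // prints 6
        System.out.println(Arrays.toString(tokens(line))); // prints [MOV, A, #20]
    }
}
```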
I have a text with sentences by this format:
sentence 1 This is a sentence.
t-extraction 1 This is a sentence
s-extraction 1 This_DT is_V a_DT sentence_N
sentence 2 ...
As you can see, the lines are separated by line breaks. The words sentence, t-extraction, and s-extraction repeat for every sentence. The numbers are the sentence numbers (1, 2, ...). The parts of each line are separated by tabs; for example, the first line is: sentence(Tab)1(Tab)This is a sentence.
and the second line is: t-extraction(Tab)1(Tab)This(Tab)is(Tab)a sentence.
I need to store some of this information in an SQL table, so I have to extract it.
I need the first and second lines of each group (without the word sentence in the first line, and without t-extraction and the number in the second line). Each tab-separated part will be mapped to a column in SQL (for example, 1 in one column, This is a sentence in another, and, from the second line, This, is, and a sentence each in its own column).
What do you suggest? Thanks in advance.
You could use String.split().
The regex you could use is [^A-Za-z_]+ or [ \t]+
Using the split method on String is probably the key to this. The split command breaks a string into parts where the regex matches, returning an array of Strings of the parts between the matches.
You want to match on the tab character (written \t in a Java string). You also want to process three lines as a unit; the code below shows one way of doing this (it depends on the file being well formed).
Of course you want to use a reader created from your file not a string.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;

public class Test {
public static void main(String[] args) throws Exception {
BufferedReader reader = new BufferedReader(new FileReader("/my/file.data"));
String line = null;
for(int i = 0; (line = reader.readLine()) != null; i++){
if(i % 3 == 0){
String[] parts = line.split("\t");
System.out.printf("sentence ==> %s\n", Arrays.toString(parts));
} else if(i % 3 == 1){
String[] parts = line.split("\t");
System.out.printf("t-sentence ==> %s\n", Arrays.toString(parts));
} else {
String[] parts = line.split("\t");
System.out.printf("s-sentence ==> %s\n", Arrays.toString(parts));
}
}
}
}
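Since the question asks for the fields without the leading label, a small helper along these lines could feed the SQL mapping. This is only a sketch: the class name ExtractionLine and the method fieldsWithoutLabel are hypothetical, and it assumes every line starts with a label followed by tab-separated fields.

```java
import java.util.Arrays;
import java.util.List;

class ExtractionLine {
    // Returns the tab-separated fields of a line, skipping the leading label
    // (sentence, t-extraction, or s-extraction).
    static List<String> fieldsWithoutLabel(String line) {
        String[] parts = line.split("\t");
        // Math.min guards against a line that splits into zero parts.
        return Arrays.asList(parts).subList(Math.min(1, parts.length), parts.length);
    }

    public static void main(String[] args) {
        System.out.println(fieldsWithoutLabel("sentence\t1\tThis is a sentence."));
        // prints [1, This is a sentence.]
        System.out.println(fieldsWithoutLabel("t-extraction\t1\tThis\tis\ta sentence"));
        // prints [1, This, is, a sentence]
    }
}
```

Each element of the returned list would then correspond to one column value.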
I am trying to use the Scanner class to read a line using the next(Pattern pattern) method to capture the text before the colon and then after the colon so that s1 = textbeforecolon and s2 = textaftercolon.
The line looks like this:
something:somethingelse
There are two ways of doing this, depending on specifically what you want.
If you want to split the entire input by colons, then you can use the useDelimiter() method, like others have pointed out:
// You could also say "scanner.useDelimiter(Pattern.compile(":"))", but
// that's the exact same thing as saying "scanner.useDelimiter(":")".
scanner.useDelimiter(":");
// Examines each token one at a time
while (scanner.hasNext())
{
String token = scanner.next();
// Do something with token here...
}
If you want to split each line by a colon, then it would be much easier to use String's split() method:
while (scanner.hasNextLine())
{
String[] parts = scanner.nextLine().split(":");
// The parts array now contains ["something", "somethingelse"]
}
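If the text after the colon may itself contain colons, split() with a limit of 2 keeps everything after the first colon together, which matches the s1/s2 goal in the question. A minimal sketch (the class name ColonSplit and the method splitAtFirstColon are made up for illustration):

```java
import java.util.Arrays;

class ColonSplit {
    // Splits at the first colon only: index 0 is the text before it,
    // index 1 (if present) is everything after it.
    static String[] splitAtFirstColon(String line) {
        return line.split(":", 2);
    }

    public static void main(String[] args) {
        String[] parts = splitAtFirstColon("something:somethingelse");
        String s1 = parts[0];
        String s2 = parts.length > 1 ? parts[1] : ""; // no colon in the line
        System.out.println(s1 + " / " + s2); // prints something / somethingelse
        System.out.println(Arrays.toString(splitAtFirstColon("key:a:b"))); // prints [key, a:b]
    }
}
```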
I've never used Pattern with Scanner.
I've always just changed the delimiter with a string.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Scanner.html#useDelimiter(java.lang.String)
File file = new File("someFileWithLinesContainingYourExampleText.txt");
Scanner s = new Scanner(file);
s.useDelimiter("[:\\r\\n]+"); // split on colons and on line breaks
while (s.hasNext()) {
    String text = s.next();
    System.out.println(text);
}
s.close();