I am trying to split a combined text file that contains multiple XML documents. I want to split on <?xml version='1.0'?>, which marks the start of every new XML document inside the combined file. I'm not sure what the best way to do this is. This is what I currently have, which does not split correctly.
Updated, working code (fixed the quotes-within-quotes problem and added Pattern.quote):
Scanner scanner = new Scanner( new File("src/main/resources/Flume_Sample"), "UTF-8" );
String combinedText = scanner.useDelimiter("\\A").next();
scanner.close(); // Put this call in a finally block
String delimiter = "<?xml version=\"1.0\"?>";
String[] xmlFiles = combinedText.split("(?="+Pattern.quote(delimiter)+")");
for (int i = 0; i < xmlFiles.length; i++){
File file = new File("src/main/resources/output_"+i);
FileWriter writer = new FileWriter(file);
writer.write(xmlFiles[i]);
System.out.println(xmlFiles[i]);
writer.close();
}
The split method takes a regular expression string, so you may want to escape your delimiter String so that it is matched literally:
String[] xmlFiles = combinedText.split(Pattern.quote(delimiter));
See the Pattern.quote method.
Also be aware that you will load the entire input file into memory if you proceed this way.
A streamed approach would perform better if the input file is large.
I would use something like this if you want to parse the data manually.
public static void parseFile(File file) throws AttributeException, LineException{
BufferedReader br = null;
String s = "";
int counter = 0;
if(file != null){
try{
br = new BufferedReader(new FileReader(file));
while((s = br.readLine()) != null){
if(s.contains("<?xml version='1.0'?>")){
// A new document starts here: begin writing to a new file, e.g. with a StringBuffer and a FileWriter.
}
}
br.close();
}catch (IOException e){
System.out.println(e);
}
}
}
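The skeleton above leaves the writing step as a comment. A minimal sketch of how the streamed split could be filled in is below. It assumes every document starts with the declaration at the beginning of a line. The class name XmlSplitter, the output_N file names, and the double-quoted form of the declaration are illustrative choices, not part of the original answers.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class XmlSplitter {
    // Assumption: each document begins with exactly this declaration at the start of a line.
    static final String DECLARATION = "<?xml version=\"1.0\"?>";

    /** Streams the combined file and writes each document to outputDir/output_N. */
    public static List<Path> split(Path combined, Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        BufferedWriter writer = null;
        try (BufferedReader reader = Files.newBufferedReader(combined, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(DECLARATION)) {
                    // A new document begins: close the previous part and open the next one.
                    if (writer != null) {
                        writer.close();
                    }
                    Path part = outputDir.resolve("output_" + parts.size());
                    parts.add(part);
                    writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                }
                if (writer != null) { // lines before the first declaration are skipped
                    writer.write(line);
                    writer.newLine();
                }
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return parts;
    }
}
```

Only one line is held in memory at a time, so the size of the combined file does not matter much. If a declaration can appear in the middle of a line, this line-based check will miss it and a different approach is needed.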
In Java, is there any method to read a particular line from a file? For example, read line 32 or any other line number.
For small files:
String line32 = Files.readAllLines(Paths.get("file.txt")).get(31); // index 31 is the 32nd line
For large files:
try (Stream<String> lines = Files.lines(Paths.get("file.txt"))) {
line32 = lines.skip(31).findFirst().get();
}
Unless you have previous knowledge about the lines in the file, there's no way to directly access the 32nd line without reading the 31 previous lines.
That's true for all languages and all modern file systems.
So effectively you'll simply read lines until you've found the 32nd one.
Not that I know of, but you could loop through the first 31 lines, discarding them, using the readLine() method of BufferedReader:
FileInputStream fs= new FileInputStream("someFile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fs));
for(int i = 0; i < 31; ++i)
br.readLine();
String lineIWant = br.readLine();
Joachim is right, of course. An alternative to Chris's implementation (for small files only, because it loads the entire file) is to use commons-io from Apache. Arguably you might not want to introduce a new dependency just for this, but if you find the library useful for other things too, it could make sense.
For example:
String line32 = (String) FileUtils.readLines(file).get(31);
http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html#readLines(java.io.File, java.lang.String)
You may try indexed-file-reader (Apache License 2.0). The class IndexedFileReader has a method called readLines(int from, int to) which returns a SortedMap whose key is the line number and the value is the line that was read.
Example:
File file = new File("src/test/resources/file.txt");
reader = new IndexedFileReader(file);
lines = reader.readLines(6, 10);
assertNotNull("Null result.", lines);
assertEquals("Incorrect length.", 5, lines.size());
assertTrue("Incorrect value.", lines.get(6).startsWith("[6]"));
assertTrue("Incorrect value.", lines.get(7).startsWith("[7]"));
assertTrue("Incorrect value.", lines.get(8).startsWith("[8]"));
assertTrue("Incorrect value.", lines.get(9).startsWith("[9]"));
assertTrue("Incorrect value.", lines.get(10).startsWith("[10]"));
The above example reads a text file composed of 50 lines in the following format:
[1] The quick brown fox jumped over the lazy dog ODD
[2] The quick brown fox jumped over the lazy dog EVEN
Disclaimer: I wrote this library.
As said in other answers, it is not possible to jump to an exact line without knowing its offset (pointer) beforehand. So I achieved this by creating a temporary index file that stores the offset of every line. If the file is small enough, you can just keep the index (the offsets) in memory without needing a separate file for it.
The offsets can be calculated by using the RandomAccessFile
RandomAccessFile raf = new RandomAccessFile("myFile.txt", "r");
// "r" opens the file in read-only mode
ArrayList<Long> arrayList = new ArrayList<Long>();
arrayList.add(0L); // the first line starts at offset 0
while (raf.readLine() != null)
{
    // After each read, the file pointer sits at the start of the next line
    arrayList.add(raf.getFilePointer());
}
// Print line 32:
// seek to the offset where line 32 starts (index 31, since the list is 0-based)
raf.seek(arrayList.get(31));
System.out.println(raf.readLine());
raf.close();
Also visit the Java docs on RandomAccessFile for more information.
Complexity: This is O(n), as it reads the entire file once. Please be aware of the memory requirements. If the index is too big to hold in memory, store the offsets in a temporary file instead of the ArrayList shown above.
Note: If all you want is line 32, you just have to call readLine() (also available through other classes) 32 times. The approach above is useful if you want to get a specific line (by line number, of course) multiple times.
Another way.
try (BufferedReader reader = Files.newBufferedReader(
Paths.get("file.txt"), StandardCharsets.UTF_8)) {
List<String> line = reader.lines()
.skip(31)
.limit(1)
.collect(Collectors.toList());
line.stream().forEach(System.out::println);
}
No, unless in that file format the line lengths are pre-determined (e.g. all lines with a fixed length), you'll have to iterate line by line to count them.
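For the fixed-length case mentioned above, the offset of a line can be computed directly instead of being discovered by reading. A minimal sketch follows, assuming every line, including its terminator, occupies exactly recordLength bytes and the content is ASCII. The class and parameter names are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class FixedWidthLines {
    /**
     * Reads the line with the given 1-based number from a file in which every
     * line, terminator included, is exactly recordLength bytes long.
     * Returns null if the file does not contain a complete record at that position.
     */
    public static String readLine(String path, int lineNumber, int recordLength) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long offset = (long) (lineNumber - 1) * recordLength;
            if (offset + recordLength > raf.length()) {
                return null;
            }
            raf.seek(offset); // jump straight to the record, no earlier lines are read
            byte[] record = new byte[recordLength];
            raf.readFully(record);
            // trim() drops the line terminator and any padding spaces
            return new String(record, StandardCharsets.US_ASCII).trim();
        }
    }
}
```

For example, with five-byte records ("aaaa\n", "bbbb\n", ...), readLine(path, 2, 5) seeks to byte 5 and returns "bbbb". Note that a final line without a terminator would be shorter than recordLength and is reported as missing by this sketch.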
In Java 8,
For small files:
String line = Files.readAllLines(Paths.get("file.txt")).get(n);
For large files:
String line;
try (Stream<String> lines = Files.lines(Paths.get("file.txt"))) {
line = lines.skip(n).findFirst().get();
}
In Java 7
String line;
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
for (int i = 0; i < n; i++)
br.readLine();
line = br.readLine();
}
Source: Reading nth line from file
If you are talking about a text file, then there is really no way to do this without reading all the lines that precede it - After all, lines are determined by the presence of a newline, so it has to be read.
Use a stream that supports readline, and just read the first X-1 lines and dump the results, then process the next one.
It works for me:
I combined this with the answer to "Reading a simple text file".
Instead of returning a String, I return a LinkedList of Strings. Then I can select the line that I want.
public static LinkedList<String> readFromAssets(Context context, String filename) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(context.getAssets().open(filename)));
LinkedList<String>linkedList = new LinkedList<>();
// do reading, usually loop until end of file reading
StringBuilder sb = new StringBuilder();
String mLine = reader.readLine();
while (mLine != null) {
linkedList.add(mLine);
sb.append(mLine); // process line
mLine = reader.readLine();
}
reader.close();
return linkedList;
}
Use this code:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class FileWork
{
    public static void main(String[] args) throws IOException {
        // The list is 0-based, so get(1) returns the second line of the file
        String line = Files.readAllLines(Paths.get("D:/abc.txt")).get(1);
        System.out.println(line);
    }
}
You can use LineNumberReader instead of BufferedReader. Go through the API; you will find the setLineNumber and getLineNumber methods.
You can also take a look at LineNumberReader, a subclass of BufferedReader. Along with the readLine method, it has setter/getter methods for the line number. It is very useful for keeping track of the number of lines read while reading data from a file.
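A short sketch of how LineNumberReader could be used for this is below. One thing to keep in mind: setLineNumber only changes the value that getLineNumber reports; it does not move the reader, so the preceding lines still have to be read. The class and method names here are illustrative.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;

public class LineNumberDemo {
    /** Returns the line with the given 1-based number, or null if the input is shorter. */
    public static String lineAt(Reader source, int wanted) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() starts at 0 and is incremented for each line read
                if (reader.getLineNumber() == wanted) {
                    return line;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // prints "two"
        System.out.println(lineAt(new StringReader("one\ntwo\nthree"), 2));
    }
}
```

To read from a file, pass a FileReader (or Files.newBufferedReader) instead of the StringReader.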
public String readLine(int line){
FileReader tempFileReader = null;
BufferedReader tempBufferedReader = null;
try { tempFileReader = new FileReader(textFile);
tempBufferedReader = new BufferedReader(tempFileReader);
} catch (Exception e) { }
String returnStr = "ERROR";
for(int i = 0; i < line - 1; i++){
try { tempBufferedReader.readLine(); } catch (Exception e) { }
}
try { returnStr = tempBufferedReader.readLine(); } catch (Exception e) { }
return returnStr;
}
You can use the skip() function to skip lines from the beginning. The method below prints the last lineNum lines of the file:
public static void readFile(String filePath, long lineNum) {
    List<String> list = new ArrayList<>();
    try {
        long totalLines;
        // Count the lines with a separate stream and close it when done
        try (Stream<String> counter = Files.lines(Paths.get(filePath))) {
            totalLines = counter.count();
        }
        // skip() rejects negative values, so never go below zero
        long startLine = Math.max(0, totalLines - lineNum);
        try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
            list = lines.skip(startLine).collect(Collectors.toList());
        }
    } catch (IOException e1) {
        e1.printStackTrace();
    }
    list.forEach(System.out::println);
}
Easy way: reading a line by its line number.
Assume line numbers start at 1 and continue until readLine() returns null (end of file).
public class TextFileAssignmentOct {
private void readData(int rowNum, BufferedReader br) throws IOException {
int n=1; //Line number starts from 1
String row;
while((row=br.readLine()) != null) { // Reads every line
if (n == rowNum) { // When Line number matches with which you want to read
System.out.println(row);
}
n++; //This increments Line number
}
}
public static void main(String[] args) throws IOException {
File f = new File("../JavaPractice/FileRead.txt");
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
TextFileAssignmentOct txf = new TextFileAssignmentOct();
txf.readData(4, br); //Read a Specific Line using Line number and Passing buffered reader
}
}
For a text file, you can use an integer counter in a loop to keep track of the line number. Don't forget to import the classes used in this example.
File myObj = new File("C:\\Users\\LENOVO\\Desktop\\test.txt"); // path of the file
FileReader fr = new FileReader(myObj);
BufferedReader bf = new BufferedReader(fr); // BufferedReader wrapping the FileReader fr
String line = bf.readLine();
int lineNumber = 0;
while (line != null) {
lineNumber = lineNumber + 1;
if(lineNumber == 7)
{
//show line
System.out.println("line: " + lineNumber + " has :" + line);
break;
}
// read the next line
line = bf.readLine();
}
They are all wrong; I just wrote this in about 10 seconds.
With this I can just call object.getQuestion(lineNumber) from the main method to return whatever line I want.
public class Questions {
    File file = new File("Question2Files/triviagame1.txt");

    public Questions() {
    }

    // numLine is 1-based: getQuestion(1) returns the first line
    public String getQuestion(int numLine) throws IOException {
        // try-with-resources closes the reader when the method returns
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line = "";
            for (int i = 0; i < numLine; i++) {
                line = br.readLine();
            }
            return line;
        }
    }
}
1. My Java code takes almost 10-15 minutes to run (the input file is a 7200+ line list of queries). How do I make it run in less time and get the same results?
2. How do I make my code search only for a-z, A-Z and 0-9?
3. If I don't do #2, some characters in my output are shown as "?". How do I solve this issue?
// no parameters are used in the main method
public static void main(String[] args) {
// assumes a text file named test.txt in a folder under the C:\file\test.txt
Scanner s = null;
BufferedWriter out = null;
try {
// create a scanner to read from the text file test.txt
FileInputStream fstream = new FileInputStream("C:\\user\\query.txt");
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
// Write to the file
out = new BufferedWriter(new FileWriter("C:\\user\\outputquery.txt"));
// keep getting the next String from the text, separated by white space
// and print each token in a line in the output file
//while (s.hasNext()) {
// String token = s.next();
// System.out.println(token);
// out.write(token + "\r\n");
//}
String strLine="";
String str="";
while ((strLine = br.readLine()) != null) {
str+=strLine;
}
String st=str.replaceAll(" ", "");
char[]third =st.toCharArray();
System.out.println("Character Total");
for(int counter =0;counter<third.length;counter++){
//String ch= "a";
char ch= third[counter];
int count=0;
for ( int i=0; i<third.length; i++){
// if (ch=="a")
if (ch==third[i])
count++;
}
boolean flag=false;
for(int j=counter-1;j>=0;j--){
//if(ch=="b")
if(ch==third[j])
flag=true;
}
if(!flag){
System.out.println(ch+" "+count);
out.write(ch+" "+count);
}
}
// close the output file
out.close();
} catch (IOException e) {
// print any error messages
System.out.println(e.getMessage());
}
// optional to close the scanner here, the close can occur at the end of the code
finally {
if (s != null) {
// close the input file
s.close();
}
}
}
For something like this I would NOT recommend Java. It is entirely possible, but it is much easier with GAWK or something similar. GAWK also has Java-like syntax, so it's easy to pick up. You should check it out.
SO isn't really the place to ask such a broad how-do-I-do-this question, but I will refer you to the following page on regular expressions and text matching in Java. Also, check out the Javadocs for regexes.
If you follow that link you should get what you want; otherwise you could post a more specific question back on SO.
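If you do stay with Java, most of the running time in the question's code likely comes from the nested loops (every character is compared against every other character) and from building one huge String with +=. A minimal sketch of a single-pass count that also keeps only a-z, A-Z and 0-9 is below. The class name CharCounter is made up, and this is one possible approach rather than the only one.

```java
import java.util.Map;
import java.util.TreeMap;

public class CharCounter {
    /** Counts ASCII letters and digits in a single pass; every other character is ignored. */
    public static Map<Character, Integer> count(CharSequence text) {
        // TreeMap keeps the characters sorted, which makes the output easy to read
        Map<Character, Integer> counts = new TreeMap<>();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean wanted = (ch >= 'a' && ch <= 'z')
                    || (ch >= 'A' && ch <= 'Z')
                    || (ch >= '0' && ch <= '9');
            if (wanted) {
                counts.merge(ch, 1, Integer::sum); // add 1, or start at 1 if absent
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {1=1, a=2, b=1}
        System.out.println(count("aab 1?"));
    }
}
```

You can call count on each line as you read it and merge the results, so the whole file never has to be concatenated. As for the "?" characters, a common cause is reading or writing with the wrong character encoding, so it may help to pass an explicit charset (for example StandardCharsets.UTF_8) when opening the reader and writer. That is a guess, since the input file is not shown.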
I'm trying to read a simple text file that contains the following:
LOAD
Bill's Beans
1200
20
15
30
QUIT
I need to store and print the contents line by line. I am doing so using the following code:
String inputFile = "(file path here)";
try {
Scanner input = new Scanner(inputFile);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
String currentLine = "";
while (!currentLine.equals("QUIT}")){
currentLine = input.nextLine();
System.out.println(currentLine);
}
input.close();
However, the output is very "messy". I am trying to avoid storing all new line characters and anything else that doesn't appear in the text file. Output is:
{\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf540
{\fonttbl\f0\fmodern\fcharset0 Courier;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\deftab720
\pard\pardeftab720\ql\qnatural
\f0\fs26 \cf0 LOAD\
Bill's Beans\
1200\
20\
15\
30\
QUIT}
Any help would be greatly appreciated, thank you!
This looks like you're reading an RTF file. Is that the case, by any chance?
Otherwise, I found reading text files is most natural for me using this construct:
BufferedReader reader = new BufferedReader(
    new FileReader(new File("yourfile.txt"))
);
String text = null;
// repeat until all lines are read
while ((text = reader.readLine()) != null) {
    // do whatever with the text line
}
Because this is an RTF file, look into this for example: RTFEditorKit
If you insist on writing your own RTF reader, the correct approach would be for you to extend FilterInputStream and handle the RTF metadata in its implementation.
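For the RTFEditorKit route, a minimal sketch of extracting the plain text from an RTF stream with the classes in javax.swing.text could look like this. The class and method names RtfToText and plainText are made up for illustration, and the exact whitespace of the result may differ from the original text.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfToText {
    /** Parses RTF from the stream and returns only the document text, without control words. */
    public static String plainText(InputStream rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        kit.read(rtf, doc, 0); // parse the RTF into the document, starting at position 0
        return doc.getText(0, doc.getLength());
    }
}
```

You would call it with something like plainText(new FileInputStream("yourfile.rtf")) and then split the result into lines. The simpler fix, if you control the file, is to save it as plain text in the editor instead of RTF.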
Just add the following code to your class, then call it with the path parameter. It returns all lines as a List object.
public List<String> readStudentsNoFromText(String path) throws IOException {
List<String> result = new ArrayList<String>();
// Open the file at the given path
FileInputStream fstream = new FileInputStream(new File(path));
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println(strLine);
result.add(strLine.trim());
}
//Close the input stream
in.close();
return result;
}
Currently I am trying something very simple. I am looking through an XML document for a certain phrase, which I then try to replace. The problem is that I read the file line by line and store each line in a StringBuffer, and when I write it back to the document everything ends up on a single line.
Here my code:
File xmlFile = new File("abc.xml");
BufferedReader br = new BufferedReader(new FileReader(xmlFile));
String line = null;
while((line = br.readLine())!= null)
{
if(line.indexOf("abc") != -1)
{
line = line.replaceAll("abc","xyz");
}
sb.append(line);
}
br.close();
BufferedWriter bw = new BufferedWriter(new FileWriter(xmlFile));
bw.write(sb.toString());
bw.close();
I am assuming I need a newline character when I call sb.append, but unfortunately I don't know which character to use, as "\n" does not work.
Thanks in advance!
P.S. I figured there must be a way to use Xalan to format the XML file after I write to it or something. Not sure how to do that though.
readLine returns the text between the newline characters, so when you write the lines back out, the newline characters are missing. These characters depend on the OS: Windows uses two characters for a newline and Unix uses one, for example. To be OS agnostic, retrieve the system property "line.separator":
String newline = System.getProperty("line.separator");
and append it to your stringbuffer:
sb.append(line).append(newline);
Modified as suggested by Brel, your text-substituting approach should work, and it will work well enough for simple applications.
If things start to get a little hairier, and you end up wanting to select elements based on their position in the XML structure, and if you need to be sure to change element text but not tag names (think <abc>abc</abc>), then you'll want to call in the cavalry and process the XML with an XML parser.
Essentially, you read in a Document using a DocumentBuilder, hop around the document's nodes doing whatever you need to, and then write the Document back to a file. With the JDK's built-in API, the writing is done by a Transformer rather than by the Document or the parser. Most XML libraries have a handful of options that let you format the XML output: you can specify indentation (or not) and maybe newlines for every opening tag, that kind of thing, to make your XML look pretty.
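A minimal sketch of that parser-based approach, using only the XML classes that ship with the JDK, is below. It replaces the phrase in text nodes only, so the tag in <abc>abc</abc> is left alone. The class and method names are made up, and CDATA sections and attribute values are deliberately not handled.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlTextReplace {
    /** Replaces oldText with newText in text nodes only; element names are untouched. */
    public static String replaceText(String xml, String oldText, String newText) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        replaceInTextNodes(doc.getDocumentElement(), oldText, newText);

        // The Transformer serializes the DOM tree; INDENT asks for pretty-printed output
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Walks the tree recursively and rewrites the value of every text node
    private static void replaceInTextNodes(Node node, String oldText, String newText) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            // String.replace is a literal replacement, not a regular expression
            node.setNodeValue(node.getNodeValue().replace(oldText, newText));
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            replaceInTextNodes(child, oldText, newText);
        }
    }
}
```

To work on files instead of strings, parse(File) on the DocumentBuilder and new StreamResult(File) can be used in place of the StringReader and StringWriter. How exactly the indented output is laid out depends on the JDK version.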
sb would be the StringBuffer object, which has not been instantiated in this example. It can be added before the while loop:
StringBuffer sb = new StringBuffer();
Scanner scan = new Scanner(System.in);
String filePath = scan.next();
String oldString = "old_string";
String newString = "new_string";
String oldContent = "";
BufferedReader br = null;
FileWriter writer = null;
File xmlFile = new File(filePath);
try {
br = new BufferedReader(new FileReader(xmlFile));
String line = br.readLine();
while (line != null) {
oldContent = oldContent + line + System.lineSeparator();
line = br.readLine();
}
String newContent = oldContent.replaceAll(oldString, newString);
writer = new FileWriter(xmlFile);
writer.write(newContent);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
scan.close();
// br or writer is still null if opening the file failed
if (br != null) br.close();
if (writer != null) writer.close();
} catch (IOException e) {
e.printStackTrace();
}
}