How to split single text file into multiple with character as delimiter

I have a text document that has multiple separate entries all compiled into one .log file.
The format of the file looks something like this.
$#UserID#$
Date
User
UserInfo
SteamFriendID
=========================
<p>Message</p>
$#UserID#$
Date
User
UserInfo
SteamFriendID
========================
<p>Message</p>
$#UserID#$
Date
User
UserInfo
SteamFriendID
========================
<p>Message</p>
I'm trying to take everything in between the instances of "$#UserID#$" and write each block to a separate text file.
So far, based on what I've found, I tried implementing it with a StringBuilder, like this:
FileReader fr = new FileReader("Path to raw file.");
int idCount = 1;
FileWriter fw = new FileWriter("Path to parsed files" + idCount);
BufferedReader br = new BufferedReader(fr);
//String line, date, user, userInfo, steamID;
StringBuilder sb = new StringBuilder();
//br.readLine();
while ((line = br.readLine()) != null) {
if(line.substring(0,1).contains("$#")) {
if (sb.length() != 0) {
File file = new File("Path to parsed logs" + idCount);
PrintWriter pw = new PrintWriter(file, "UTF-8");
pw.println(sb.toString());
pw.close();
//System.out.println(sb.toString());
Sb.delete(0, sb.length());
idCount++;
}
continue;
}
sb.append(line + "\r\n");
}
But this only gives me the first two entries as separate parsed files; the third one is left out for some reason.
The other approach I was considering is to read all the lines with .readAllLines(), store them as an array, loop through the lines to find "$#", take that line's index, and then write out the lines starting at that index.
Does anyone know of a better way to do this, or can someone explain why only two of the three entries are being parsed?

The short, quick fix is to write the contents of the StringBuilder once more after your while loop, like this:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

public static void main(String[] args) {
    int idCount = 1;
    // try-with-resources closes the reader even if an exception is thrown
    try (BufferedReader br = new BufferedReader(new FileReader("<path to desired file>"))) {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null) {
            if (line.startsWith("$#")) {
                // a delimiter line: write out the block collected so far
                if (sb.length() != 0) {
                    writeFile(sb.toString(), idCount);
                    System.out.println(sb);
                    sb.setLength(0);
                    idCount++;
                }
                continue;
            }
            sb.append(line + "\r\n");
        }
        // the last block has no delimiter after it, so write it here
        if (sb.length() != 0) {
            writeFile(sb.toString(), idCount);
            System.out.println(sb);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

private static void writeFile(String content, int id) throws IOException {
    File file = new File("<path to desired dir>\\ID_" + id + ".txt");
    // the PrintWriter creates the file and is closed automatically
    try (PrintWriter pw = new PrintWriter(file, "UTF-8")) {
        pw.println(content);
    }
}
I've changed two additional things:
The condition "line.substring(0,1).contains("$#")" did not work: the substring call returns only one character, which is then compared against two characters, so the condition is never true. I changed it to use the 'startsWith' method.
After the content of the StringBuilder is written to a file, the builder has to be emptied; otherwise the second and third files would also contain every previous block (the third file would then equal the whole input). That is done with "sb.setLength(0);".
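For comparison, here is a minimal sketch of the readAllLines idea mentioned in the question. It assumes the whole log fits in memory and that every entry starts with a line beginning with "$#". The class name LogSplitter, the method splitEntries, and the two command-line arguments are hypothetical names chosen for illustration, not part of the original code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class LogSplitter {

    // Groups lines into entries; a line starting with the delimiter opens a new entry.
    static List<String> splitEntries(List<String> lines, String delimiter) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith(delimiter)) {
                if (current.length() > 0) {
                    entries.add(current.toString());
                    current.setLength(0);
                }
                continue; // the delimiter line itself is not copied
            }
            current.append(line).append(System.lineSeparator());
        }
        // Flush the last entry, which has no delimiter after it.
        if (current.length() > 0) {
            entries.add(current.toString());
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args[0]);     // assumed: path to the raw .log file
        Path outputDir = Paths.get(args[1]); // assumed: an existing output directory
        List<String> entries = splitEntries(
                Files.readAllLines(input, StandardCharsets.UTF_8), "$#");
        for (int i = 0; i < entries.size(); i++) {
            Path target = outputDir.resolve("ID_" + (i + 1) + ".txt");
            Files.write(target, entries.get(i).getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Keeping the grouping logic in a method that takes a list makes the "flush the last entry" step easy to check without touching the file system.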

Related

Writing to File duplicating data the second time JAVA

I'm creating a program to remove doctors from an ArrayList that is backed by a queue. This works perfectly the first time; however, the second time it duplicates the data inside the text file. How can I solve this?
/**
*
* @throws Exception
*/
public void writeArrayListToFile() throws Exception {
String path = "src/assignment1com327ccab/DoctorRecordsFile.txt";
OutputStreamWriter os = new OutputStreamWriter(new FileOutputStream(path));
BufferedWriter br = new BufferedWriter(os);
PrintWriter out = new PrintWriter(br);
DoctorNode temp; //create a temporary doctorNode object
temp = end; //temp is equal to the end of the queue
//try this while temp is not equal to null (queue is not empty)
StringBuilder doctor = new StringBuilder();
while (temp != null) {
{
doctor.append(temp.toStringFile());
doctor.append("\n");
//temp is equal to temp.getNext doctor to get the next doctor to count
temp = temp.getNext();
}
}
System.out.println("Finished list");
System.out.println("Doctors is : " + doctor.toString());
out.println(doctor.toString());
System.out.println("Done");
br.newLine();
br.close();
}

This is not a complete solution, but I think it will point you in the right direction. I don't want to do 100% of the work for you :)
In my comment I said:
Read file content
Store it in variable
Remove file
Remove doctors from variable
Write variables to new file
So, to read the file content we would use something like this (if it's a txt file):
public static String read(File file) throws FileNotFoundException {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(file.getAbsoluteFile()));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
line = br.readLine();
if (line != null) sb.append(System.lineSeparator());
}
String everything = sb.toString();
return everything;
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null) br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return null;
}
This method returns the file content as a String. We can store it in a variable like this:
String fileContent = MyClass.read(new File("path to file"));
Next step would be to remove our file. Since we have it in memory, and we don't want duplicate values...
file.delete();
Now we should remove our doctors from fileContent. These are basic String operations; I would recommend using the replace() or replaceAll() method.
After the String manipulation, just write fileContent to our file again.
File file = new File("the same path");
file.createNewFile();
Writer out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(file, true), "UTF-8"));
out.write(fileContent);
out.flush();
out.close();
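The same read, filter, and rewrite steps can also be sketched with java.nio.file.Files. This is only an illustration: it assumes one record per line, and the class name RecordFileRewriter, the method rewriteWithout, and the "Dr. ..." sample lines are made up, since the real record format produced by toStringFile() is not shown in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class RecordFileRewriter {

    // Reads every line, drops those matching shouldRemove, and overwrites the file.
    static void rewriteWithout(Path file, Predicate<String> shouldRemove) throws IOException {
        List<String> kept = Files.readAllLines(file, StandardCharsets.UTF_8).stream()
                .filter(shouldRemove.negate())
                .collect(Collectors.toList());
        // Without extra options, Files.write truncates an existing file,
        // so no lines from the previous version survive.
        Files.write(file, kept, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("doctors", ".txt");
        Files.write(file, Arrays.asList("Dr. A", "Dr. B", "Dr. C"), StandardCharsets.UTF_8);
        rewriteWithout(file, line -> line.equals("Dr. B"));
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Because the write truncates rather than appends, running it twice cannot duplicate records, which is the property the steps above are after.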

Creating an inverted index with limited memory in java

I'm curious about how to create an inverted index on data that doesn't fit into memory. Right now I'm reading a directory and indexing the files based on their contents, using a HashMap to store the index. The code below is a snippet from a function that I call on an entire directory. What do I do if this directory is massive and the HashMap can't hold all the entries? Yes, this does sound like premature optimization; I'm just having fun. I don't want to use Lucene, so please don't mention it; I'm tired of seeing it as the majority answer to anything about indexing. The HashMap is my only constraint; everything else is stored in files so it can be referenced easily later on.
I'm just curious how I can do this, since the map stores the index like so:
keyword -> file1,file2,file3,etc..(locations)
keyword2 -> file9,file11,file13,etc..(locations)
My thought was to create a file that would somehow update itself to keep the format above, but I feel that's not efficient.
Code Snippet
br = new BufferedReader(new FileReader(file));
while ((line = br.readLine()) != null) {
for (String _word : line.split("\\W+")) {
word = _word.toLowerCase();
if (!ignore_words.contains(word)) {
fileLocations = index.get(word);
if (fileLocations == null) {
fileLocations = new LinkedList<Long>();
index.put(word, fileLocations);
}
fileLocations.add(file_offset);
}
}
}
br.close();
Update:
So I managed to come up with something, but performance-wise I feel it is slow, especially with a large amount of data. I basically created a file with one line per occurrence, holding the word and the offset at which it appeared. Let's name it index.txt.
It has a format like so:
word1:offset
word2:offset
word1:offset <-encountered again.
word3:offset
etc...
I then created one file per word and appended the offset to that file each time the word was encountered in index.txt.
So basically the format of the word files is like so:
word1.txt -- Format
word1:offset1:offset2:offset3:offset4...and so on
Each time word1 is encountered in index.txt, its offset is appended to the end of word1.txt.
Finally, I go through all the word files I created and overwrite index.txt, so that the final index looks like so:
word1:offset1:offset2:offset3:offset4:...
word2:offset9:offset11:offset13:offset14:...
etc..
Then, to finish up, I delete all the word files.
The nasty code snippet for this is below; it's a fair amount.
public void createIndex(String word, long file_offset)
{
PrintWriter writer;
try {
writer = new PrintWriter(new FileWriter(this.file,true));
writer.write(word + ":" + file_offset + "\n");
writer.close();
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
}
public void mergeFiles()
{
String line;
String wordLine;
String[] contents;
String[] wordContents;
BufferedReader reader;
BufferedReader mergeReader;
PrintWriter writer;
PrintWriter mergeWriter;
try {
reader = new BufferedReader(new FileReader(this.file));
while((line = reader.readLine()) != null)
{
contents = line.split(":");
writer = new PrintWriter(new FileWriter(
new File(contents[0] + ".txt"),true));
if(this.words.get(contents[0]) == null)
{
this.words.put(contents[0], contents[0]);
writer.write(contents[0] + ":");
}
writer.write(contents[1] + ":");
writer.close();
}
//This could be put in its own method below.
mergeWriter = new PrintWriter(new FileWriter(this.file));
for(String word : this.words.keySet())
{
mergeReader = new BufferedReader(
new FileReader(new File(word + ".txt")));
while((wordLine = mergeReader.readLine()) != null)
{
mergeWriter.write(wordLine + "\n");
}
}
mergeWriter.close();
deleteFiles();
}
catch(IOException ioe)
{
ioe.printStackTrace();
}
}
public void deleteFiles()
{
File toDelete;
for(String word : this.words.keySet())
{
toDelete = new File(word + ".txt");
if(toDelete.exists())
{
toDelete.delete();
}
}
}
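One possible alternative to one file per word, sketched under assumptions: keep a bounded in-memory buffer, spill it to disk as a sorted "run" file whenever it fills up, and merge the runs at the end into the word:offset1:offset2 format described above. The class SpillingIndex, its methods, and the maxPostings limit are hypothetical names introduced for this sketch. It also assumes that words never contain ':' (true for the \W+ split in the question) and that the number of run files stays small enough to keep all of them open at once.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not from the question: spills sorted partial indexes to disk.
class SpillingIndex {
    private final TreeMap<String, List<Long>> buffer = new TreeMap<>();
    private final List<Path> runs = new ArrayList<>();
    private final Path dir;
    private final int maxPostings; // assumed limit on offsets held in memory
    private int buffered = 0;

    SpillingIndex(Path dir, int maxPostings) {
        this.dir = dir;
        this.maxPostings = maxPostings;
    }

    void add(String word, long offset) throws IOException {
        buffer.computeIfAbsent(word, k -> new ArrayList<>()).add(offset);
        if (++buffered >= maxPostings) {
            flush();
        }
    }

    // Writes the buffer as one run file: a "word:offset:offset" line per word, sorted by word.
    private void flush() throws IOException {
        if (buffer.isEmpty()) {
            return;
        }
        Path run = Files.createTempFile(dir, "run", ".txt");
        try (BufferedWriter out = Files.newBufferedWriter(run, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, List<Long>> entry : buffer.entrySet()) {
                out.write(entry.getKey());
                for (long offset : entry.getValue()) {
                    out.write(":" + offset);
                }
                out.newLine();
            }
        }
        runs.add(run);
        buffer.clear();
        buffered = 0;
    }

    private static String wordOf(String line) {
        return line.substring(0, line.indexOf(':'));
    }

    // Merges all runs into target, holding only one line per run in memory.
    void finish(Path target) throws IOException {
        flush();
        List<BufferedReader> readers = new ArrayList<>();
        List<String> heads = new ArrayList<>(); // next unread line of each run
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run, StandardCharsets.UTF_8);
                readers.add(reader);
                heads.add(reader.readLine());
            }
            while (true) {
                String smallest = null;
                for (String head : heads) {
                    if (head != null && (smallest == null || wordOf(head).compareTo(smallest) < 0)) {
                        smallest = wordOf(head);
                    }
                }
                if (smallest == null) {
                    break; // every run is exhausted
                }
                out.write(smallest);
                for (int i = 0; i < heads.size(); i++) {
                    String head = heads.get(i);
                    if (head != null && wordOf(head).equals(smallest)) {
                        out.write(head.substring(head.indexOf(':')));
                        heads.set(i, readers.get(i).readLine());
                    }
                }
                out.newLine();
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
            for (Path run : runs) {
                Files.deleteIfExists(run);
            }
        }
    }
}
```

Compared with the per-word files, each posting is written once to a run and read once during the merge, and no file is reopened for every occurrence of a word. Whether that is actually faster on a given data set would need measuring.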

File Writer that puts a number of a line before the line it's about to write

I am writing a method for my Java class. It looks like this so far:
String file_name;
String line;
void addLine(file_name, line){
int line_number;
try {
FileWriter writer = new FileWriter(file_name, true);
PrintWriter out = new PrintWriter(writer);
out.println(line_number + line);
}
catch (IOException e){
System.out.println(e);
}
}
How should I define line_number so that it reflects how many lines are already in the file before I print the next one?

int totalLines = 0;
BufferedReader br = new BufferedReader(new FileReader("C:\\filename.txt"));
String currentLine;
while ((currentLine = br.readLine()) != null) {
    ++totalLines;
}
br.close();
I think you have to actually read the file with a BufferedReader and keep incrementing totalLines until you reach the end of the file.

You can count them with a function posted here: Number of lines in a file in Java
They tested it with a 150 MB log file and it seems to be fast.
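Putting the two steps together, a rough sketch could look like the following. It assumes numbering starts at 1, that the number and the text are separated by a space, and that the file is small enough for a full recount on every call. The class name NumberedAppender is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.stream.Stream;

class NumberedAppender {

    // Counts the existing lines, then appends the new line prefixed with its number.
    static void addLine(Path file, String line) throws IOException {
        long existing = 0;
        if (Files.exists(file)) {
            // Files.lines reads lazily; the stream must be closed to release the file.
            try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
                existing = lines.count();
            }
        }
        String numbered = (existing + 1) + " " + line;
        Files.write(file, Collections.singletonList(numbered), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

Recounting on every call is simple but costs a full read per appended line; if many lines are added in one run, keeping the counter in a field after the first count would avoid that.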

How to delete a line from text line by id java

How do I delete a line from a text file in Java?
I searched everywhere, and even so I can't find a way to delete a line.
I have the text file: a.txt
1, Anaa, 23
4, Mary, 3
and this function, taken from the internet:
public void removeLineFromFile(Long id){
try{
File inputFile = new File(fileName);
File tempFile = new File("C:\\Users\\...myTempFile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
String lineToRemove = Objects.toString(id,null);
String currentLine;
while((currentLine = reader.readLine()) != null) {
// trim newline when comparing with lineToRemove
String trimmedLine = currentLine.trim();
String trimmLine[] = trimmedLine.split(" ");
if(!trimmLine.equals(lineToRemove)) {
writer.write(currentLine + System.getProperty("line.separator"));
}
}
writer.close();
reader.close();
boolean successful = tempFile.renameTo(inputFile);
}catch (IOException e){
e.printStackTrace();
}
}
where fileName is the path to a.txt.
I have to delete the line that matches the id entered. That's why I split trimmedLine. At the end of execution I have two files, a.txt and myTempFile, both containing the same lines (the original ones). Why wasn't the line deleted?

If I understand your question correctly, you want to delete the line whose id matches with the id passed in the removeLineFromFile method.
To make your code work, only a few changes are needed.
To extract the id, you need to split using both " " and ","
i.e.
String trimmLine[] = trimmedLine.split(" |,");
where | is the regex OR operator.
See Java: use split() with multiple delimiters.
Also, trimmLine is an array, so you can't just compare trimmLine with lineToRemove. You first need to extract the first element, which is the id. If this is hard to follow, I would suggest looking at how the split method works; see How to split a string in Java.
So, extract the id, which is the first element of the array trimmLine, using:
String part1 = trimmLine[0];
and then compare part1 with lineToRemove.
Whole code looks like:
public void removeLineFromFile(Long id){
try{
File inputFile = new File(fileName);
File tempFile = new File("C:\\Users\\...myTempFile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
String lineToRemove = Objects.toString(id,null);
String currentLine;
while((currentLine = reader.readLine()) != null) {
// trim newline when comparing with lineToRemove
String trimmedLine = currentLine.trim();
String trimmLine[] = trimmedLine.split(" |,");
String part1 = trimmLine[0];
if(!part1.equals(lineToRemove)) {
writer.write(currentLine + System.getProperty("line.separator"));
}
}
writer.close();
reader.close();
boolean successful = tempFile.renameTo(inputFile);
}catch (IOException e){
e.printStackTrace();
}
}
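As a variation on the code above, here is a sketch that uses java.nio.file instead of File.renameTo. The Javadoc for File.renameTo notes that the rename might not succeed if a file with the destination name already exists, and it only reports that through its return value, which the code above stores but never checks. Files.move with REPLACE_EXISTING throws an exception on failure instead. The class name LineRemover is hypothetical, and the sketch assumes the id is the first comma-separated field of each line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LineRemover {

    // Copies every line whose first field differs from id, then replaces the original file.
    static void removeLineById(Path file, long id) throws IOException {
        // The temp file lives next to the original so the final move stays on one file system.
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "lines", ".tmp");
        String wanted = Long.toString(id);
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String firstField = line.split(",", 2)[0].trim();
                if (!firstField.equals(wanted)) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }
        // Both streams are closed here, so the move is not blocked by open handles.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Calling LineRemover.removeLineById(path, 1L) on the sample a.txt would leave only the "4, Mary, 3" line.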

Updating a single line on a text file with a Java method

I know previous questions like this one have been asked, but this question has to do with the specifics of the code that I have written. I am trying to update a single line of a file so that the change persists after the program terminates and the data can be brought up again. The method I am writing currently looks like this (Eclipse reports no compile errors):
public static void editLine(String fileName, String name, int element,
String content) throws IOException {
try {
// Open the file specified in the fileName parameter.
FileInputStream fStream = new FileInputStream(fileName);
BufferedReader br = new BufferedReader(new InputStreamReader(
fStream));
String strLine;
StringBuilder fileContent = new StringBuilder();
// Read line by line.
while ((strLine = br.readLine()) != null) {
String tokens[] = strLine.split(" ");
if (tokens.length > 0) {
if (tokens[0].equals(name)) {
tokens[element] = content;
String newLine = tokens[0] + " " + tokens[1] + " "
+ tokens[2];
fileContent.append(newLine);
fileContent.append("\n");
} else {
fileContent.append(strLine);
fileContent.append("\n");
}
}
/*
* File Content now has updated content to be used to override
* content of the text file
*/
FileWriter fStreamWrite = new FileWriter(fileName);
BufferedWriter out = new BufferedWriter(fStreamWrite);
out.write(fileContent.toString());
out.close();
// Close InputStream.
br.close();
}
} catch (IOException e) {
System.out.println("COULD NOT UPDATE FILE!");
System.exit(0);
}
}
If you could look at the code and let me know what you would suggest, that would be wonderful, because currently I am only getting my catch message.

Okay. Right off the bat, StringBuilder fileContent = new StringBuilder(); is bad practice, as the file could well be larger than the available memory. You should not keep much of the file in memory at all. Instead, read into a buffer, process the buffer (adjusting it if necessary), and write the buffer to a new file. When done, delete the old file and rename the new one to the old one's name. Hope this helps.
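A minimal sketch of that streaming approach, under assumptions: lines are space-separated tokens as in the question, only lines whose first token equals name are changed, and an out-of-range element index leaves the line untouched. The class name LineEditor is hypothetical. Note also that the question's version opens a FileWriter on the same file inside the read loop, while the reader is still open; here the output goes to a separate temp file and the original is replaced only after both streams are closed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LineEditor {

    // Streams the file line by line into a temp file, rewriting one token on matching lines.
    static void editLine(Path file, String name, int element, String content) throws IOException {
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "edit", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] tokens = line.split(" ");
                if (tokens[0].equals(name) && element >= 0 && element < tokens.length) {
                    tokens[element] = content;
                    line = String.join(" ", tokens);
                }
                writer.write(line);
                writer.newLine();
            }
        }
        // Replace the original only after reading and writing have both finished.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Only one line is held in memory at a time, so the size of the file no longer matters.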
