Compare 2 *.txt files and print difference between them - java

I have a problem writing the code for comparing two files (first, the reference file):
PROTOCOL STATE SERVICE
1 open icmp
6 open tcp
17 open udp
and (execution file)
PROTOCOL STATE SERVICE
1 open icmp
6 open tcp
17 open udp
255 closed unknown
and save difference between these two files in new file (255 closed unknown).
For comparing I have used following code but it seems it doesn't work.
public String[] compareResultsAndDecide(String refFile, String execFile) throws IOException {
String[] referenceFile = parseFileToStringArray(refFile);
String[] execCommand = parseFileToStringArray(execFile);
List<String> tempList = new ArrayList<String>();
for(int i = 1; i < execCommand.length ; i++)
{
boolean foundString = false; // To be able to track if the string was found in both arrays
for(int j = 1; j < referenceFile.length; j++)
{
if(referenceFile[j].equals(execCommand[i]))
{
foundString = true;
break; // If it exist in both arrays there is no need to look further
}
}
if(!foundString) // If the same is not found in both..
tempList.add(execCommand[i]); // .. add to temporary list
}
// toArray never returns null, so the list can be returned directly
return tempList.toArray(new String[0]);
}
For refFile I pass the path to the reference file, /home/xxx/Ref.txt, and likewise for execFile (the second file shown above).
Can anyone help me with this?
Just to add, I'm using for parsing File to String Array:
public String[] parseFileToStringArray(String filename) throws FileNotFoundException {
Scanner sc = new Scanner(new File(filename));
List<String> lines = new ArrayList<String>();
while (sc.hasNextLine()) {
lines.add(sc.nextLine());
}
String[] arr = lines.toArray(new String[0]);
return arr;
}

Change int i = 1 to int i = 0 and int j = 1 to int j = 0

Your compareResultsAndDecide method can be changed like this:
public String[] compareResultsAndDecide(String refFile, String execFile) throws IOException {
String[] referenceFile = parseFileToStringArray(refFile);
String[] execCommand = parseFileToStringArray(execFile);
// Start with every line of the exec file, then drop the lines the reference file also contains
List<String> diff = new ArrayList<String>(Arrays.asList(execCommand));
diff.removeAll(Arrays.asList(referenceFile));
return diff.toArray(new String[0]);
}
and your parseFileToStringArray like:
public String[] parseFileToStringArray(String filename) throws FileNotFoundException {
Scanner sc = new Scanner(new File(filename));
List<String> lines = new ArrayList<String>();
while (sc.hasNextLine()) {
lines.add(sc.nextLine());
}
String[] arr = new String[lines.size()];
return lines.toArray(arr);
}
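None of the answers covers the last part of the question, saving the difference to a new file. A minimal sketch of the whole round trip, assuming Java 8 or later and that comparing whole lines is good enough; the class name FileDiff, the method name linesOnlyIn and the three command-line arguments are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class FileDiff {
    // Returns the lines of exec that do not occur anywhere in ref.
    static List<String> linesOnlyIn(List<String> exec, List<String> ref) {
        List<String> diff = new ArrayList<>(exec);
        diff.removeAll(ref);
        return diff;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: FileDiff <refFile> <execFile> <outFile>");
            return;
        }
        List<String> ref = Files.readAllLines(Paths.get(args[0]));
        List<String> exec = Files.readAllLines(Paths.get(args[1]));
        // Files.write creates the output file and writes one line per list element
        Files.write(Paths.get(args[2]), linesOnlyIn(exec, ref));
    }
}
```

Because the header line is identical in both files, it is removed along with the other matching lines, so no special handling of index 0 is needed here.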

The problem is in your .txt files; their encoding must be different.
I know this isn't the best way, but if you use the replaceAll() method to strip the spaces from each line of the text files, your code should work. Unfortunately, you will lose the spaces within each line.
Change:
lines.add(sc.nextLine());
To:
lines.add(sc.nextLine().replaceAll(" ", ""));
Note:
I tried using trim() but it didn't work well for me.
Also, array indices start at 0, not 1. Change that too.
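If stray whitespace really is what makes equal-looking lines compare as different, a gentler option than deleting every space is to normalize each line before comparing. A small sketch, assuming the differences are limited to leading or trailing whitespace, tabs, carriage returns and repeated spaces; LineNormalizer and normalize are hypothetical names:

```java
class LineNormalizer {
    // Trims both ends and collapses every run of whitespace into a single space.
    static String normalize(String line) {
        return line.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String fromReference = "255 closed unknown";
        String fromExecution = "255\tclosed   unknown \r";
        // Compare the normalized forms instead of the raw lines
        System.out.println(normalize(fromReference).equals(normalize(fromExecution)));
    }
}
```

This keeps the words separated, so the lines written to the difference file stay readable.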

Related

Read & split multiple column text file into arrays

For a project I'm working on a fairly big animal dataset with up to 14 parameters of data. I was able to read it in and display it as strings using this:
public static void readIn(String file) throws IOException {
Scanner scanner = new Scanner(new File(file));
while (scanner.hasNext()) {
String[] columns = scanner.nextLine().split("/t");
String data = columns[columns.length-1];
System.out.println(data);
}
}
and displaying something like this:
04:00:01 0.11 0.04 -0.1 1047470 977.91 91.75
04:00:01 0.32 -0.03 -0.07 1047505 977.34 92.91
04:00:01 0.49 -0.03 -0.08 1047493 978.66 92.17
But I'm currently having trouble trying to split each column into separate arrays so that I can process the data (e.g. calculating means). Any idea of how I can do this? Any help would be much appreciated.
Edit: thanks, I've found out a solution that works and also lets me choose which channel it reads specifically. I've also decided to store the data as arrays within the class, here's what I have now:
public static void readChannel(String file, int channel) throws IOException
{
List<Double> dataArr = new ArrayList<>();
Scanner scanner = new Scanner(new File(file));
while (scanner.hasNext()) {
String[] columns = scanner.nextLine().split("\t");
for (int i = channel; i < columns.length; i+=(columns.length-channel)) {
dataArr.add(Double.parseDouble(columns[i]));
dataArr.toArray();
}
}
}
You can store all rows in an ArrayList and then create arrays for each column and store values in them. Sample code:
Scanner scanner = new Scanner(new File(file));
ArrayList<String> animalData = new ArrayList<String>();
while (scanner.hasNextLine()) {
String data = scanner.nextLine(); // keep the whole row; it is split on tabs below
animalData.add(data);
System.out.println(data);
}
int size = animalData.size();
String[] arr1 = new String[size]; String[] arr2 = new String[size];
String[] arr3 = new String[size]; String[] arr4 = new String[size];
for(int i=0;i<size;i++)
{
String[] temp = animalData.get(i).split("\t");
arr1[i] = temp[0];
arr2[i] = temp[1];
arr3[i] = temp[2];
arr4[i] = temp[3];
}
I think you should split your problem in two:
File reading:
Your program reads each line and saves it in an instance of a class you define:
public class MyData {
private String time;
private double percent;
//... and so on, with getters and setters
}
public MyData readLine( String line ) {
String[] columns = line.split("\t");
MyData md = new MyData();
md.setTime( columns[ 0 ] );
md.setPercent( Double.parseDouble(columns[ 1 ]) );
return md;
}
public List<MyData> readFile( File file ) throws FileNotFoundException {
Scanner scanner = new Scanner(file);
List<MyData> myList = new ArrayList<>();
while (scanner.hasNextLine()) {
MyData md = readLine( scanner.nextLine() );
myList.add( md );
}
scanner.close();
return myList;
}
Data processing:
Once the file has been read, you can write whatever methods you need to process the data:
double sum = 0;
for ( MyData md : myList ) {
sum = sum + md.getPercent();
}
I hope it helps.
The following snippet collects the values of every column, keyed by column index, and prints them:
public static void readIn(String file) throws Exception {
Scanner scanner = new Scanner(new File(file));
final Map<Integer,List<String>> resultMap = new HashMap<>();
while (scanner.hasNextLine()) {
String[] columns = scanner.nextLine().split("\t"); // "\t" is the tab character; "/t" would not split anything
for(int i=0;i<columns.length;i++){
resultMap.computeIfAbsent(i, k -> new ArrayList<>()).add(columns[i]);
}
}
resultMap.keySet().forEach(index -> System.out.println(resultMap.get(index).toString()));
}
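Since the stated goal was to calculate means, here is a rough sketch of pulling one column out as numbers and averaging it. It assumes the columns are tab-separated (as in the asker's edit) and that column 0 holds the time stamp, so only columns 1 and up are numeric; ColumnMeans, column and mean are illustrative names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnMeans {
    // Splits each tab-separated line and collects column `index` as doubles.
    static List<Double> column(List<String> lines, int index) {
        List<Double> values = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            values.add(Double.parseDouble(fields[index]));
        }
        return values;
    }

    // Arithmetic mean; NaN for an empty list.
    static double mean(List<Double> values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return values.isEmpty() ? Double.NaN : sum / values.size();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "04:00:01\t0.11\t0.04\t-0.1\t1047470\t977.91\t91.75",
            "04:00:01\t0.32\t-0.03\t-0.07\t1047505\t977.34\t92.91",
            "04:00:01\t0.49\t-0.03\t-0.08\t1047493\t978.66\t92.17");
        System.out.println("mean of column 1: " + mean(column(lines, 1)));
    }
}
```

In a real program the lines would come from the file, for example via the Scanner loop shown in the answers above.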

Take values from a text file and put them in a array

For now my program uses hard-coded values, but I want the user to be able to use any text file and get the same result.
import java.io.IOException;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.File;
public class a1_12177903
{
public static void main(String [] args) throws IOException
{
if (args.length == 0) // args[0] would throw if no argument was given
{
System.out.println("File not found");
}
else
{
File file = new File(args[0]);
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr);
String line = "";
while (br.ready())
{
line += br.readLine();
}
String[] work = line.split(",");
double[] doubleArr = new double[work.length];
for (int i =0; i < doubleArr.length; i++)
{
doubleArr[i] = Double.parseDouble(work[i]);
}
double maxStartIndex=0;
double maxEndIndex=0;
double maxSum = 0;
double total = 0;
double maxStartIndexUntilNow = 0;
for (int currentIndex = 0; currentIndex < doubleArr.length; currentIndex++)
{
double eachArrayItem = doubleArr[currentIndex];
total += eachArrayItem;
if(total > maxSum)
{
maxSum = total;
maxStartIndex = maxStartIndexUntilNow;
maxEndIndex = currentIndex;
}
if (total < 0)
{
maxStartIndexUntilNow = currentIndex;
total = 0;
}
}
System.out.println("Max sum : "+ maxSum);
System.out.println("Max start index : "+ maxStartIndex);
System.out.println("Max end index : " +maxEndIndex);
}
}
}
I've fixed it so it takes the name of the text file from the command line. If anyone has ways to improve this, I'll happily accept any improvements.
You can do this with Java 8 streams, assuming each entry is on its own line:
double[] doubleArr = Files.lines(pathToFile)
.mapToDouble(Double::parseDouble)
.toArray();
If you were using this on production systems (rather than as an exercise), it would be worthwhile to create the Stream inside a try-with-resources block. This makes sure your input file is closed properly.
double[] doubleArr;
try(Stream<String> lines = Files.lines(path)){
doubleArr = lines.mapToDouble(Double::parseDouble)
.toArray();
}
If you have a comma-separated list, you will need to split each line first and use flatMap:
double[] doubleArr = Files.lines(pathToFile)
.flatMap(line->Stream.of(line.split(",")))
.mapToDouble(Double::parseDouble)
.toArray();
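Putting the two ideas together (try-with-resources plus flatMap) might look like the sketch below. It assumes comma-separated numbers that may be spread over several lines and may have spaces around them; NumberFile, parse and read are hypothetical names:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Stream;

class NumberFile {
    // Splits every line on commas and parses each non-empty token as a double.
    static double[] parse(Stream<String> lines) {
        return lines
            .flatMap(line -> Arrays.stream(line.split(",")))
            .map(String::trim)
            .filter(token -> !token.isEmpty())
            .mapToDouble(Double::parseDouble)
            .toArray();
    }

    // Opens the file in a try-with-resources block so it is closed afterwards.
    static double[] read(String fileName) throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get(fileName))) {
            return parse(lines);
        }
    }

    public static void main(String[] args) {
        // In-memory demo; with a real file, call read(args[0]) instead
        double[] values = parse(Stream.of("1.5, -2, 3", "4"));
        System.out.println(Arrays.toString(values));
    }
}
```

Keeping parse separate from read makes the parsing testable without touching the file system.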
public static void main(String[] args) throws IOException {
String fileName = "";
File inputFile = new File(fileName);
BufferedReader br = new BufferedReader(new FileReader(inputFile));
// if input is in single line
StringTokenizer str = new StringTokenizer(br.readLine());
int tokenCount = str.countTokens(); // read once: countTokens() shrinks as tokens are consumed
double[] intArr = new double[tokenCount];
for (int i = 0; i < tokenCount; i++) {
intArr[i] = Double.parseDouble(str.nextToken());
}
// if multiple lines in input file for a single case
String line = "";
ArrayList<Double> arryList = new ArrayList<>();
while ((line = br.readLine()) != null) {
// delimiter of your choice
for (String x : line.split(" ")) {
arryList.add(Double.parseDouble(x));
}
}
// convert arraylist to array or maybe process arrayList
}
This link may help: How to use BufferedReader. It gives you a String containing the array contents.
Next, you have several ways to turn the string into an array:
1. Use JSONArray to parse it. For further information, search for JSON.
2. Use the split() function to parse the string into an array. See below.
Code for way 2:
String line="10,20,50";// in practice you get this from the file input
String[] arr=line.split(",");
//now arr is what you want
Use streams if you are on JDK 8. Please also pay attention to design principles and patterns; a strategy or template pattern could be applied here. I know nobody here would ask you to focus on design guidelines. Also take care with naming conventions: "File" as a class name is not a good name.
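As for the search loop in the question itself: it looks like a maximum-sum contiguous subarray search. One possible tidy-up, offered as a sketch and not as the asker's method, is Kadane's algorithm with int indices in place of doubles. MaxSubarray, Result and find are illustrative names, and unlike the question's code this version returns the largest single element when every value is negative:

```java
class MaxSubarray {
    static final class Result {
        final int start;
        final int end;
        final double sum;

        Result(int start, int end, double sum) {
            this.start = start;
            this.end = end;
            this.sum = sum;
        }
    }

    // Kadane's algorithm: one pass, tracking the best run that ends at index i.
    static Result find(double[] a) {
        if (a.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        double best = a[0];
        double current = a[0];
        int bestStart = 0, bestEnd = 0, currentStart = 0;
        for (int i = 1; i < a.length; i++) {
            if (current < 0) {
                // A negative running sum can only hurt, so start a new run here
                current = a[i];
                currentStart = i;
            } else {
                current += a[i];
            }
            if (current > best) {
                best = current;
                bestStart = currentStart;
                bestEnd = i;
            }
        }
        return new Result(bestStart, bestEnd, best);
    }

    public static void main(String[] args) {
        Result r = find(new double[] {1, -3, 4, -1, 2, -5, 3});
        System.out.println("Max sum : " + r.sum);
        System.out.println("Max start index : " + r.start);
        System.out.println("Max end index : " + r.end);
    }
}
```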

reading text from .db file one line at a time

First off, I don't know how a .db file stores its data, whether on one line or over many lines. That probably makes a difference to how the problem should be solved.
The problem I'm facing is that I don't know how much data the file contains, only that it will be a date, a time, and a description for x number of events in the form given below.
I have to convert the text into strings and put them in an array, but I don't know how to separate the text. When I tried I just ended up with one long string.
Can anybody help me?
01.01.2015|07:00-07:15|get up
01.01.2015|08:00|get to work
01.01.2015|08:00-16:00| work
01.01.2015|16:00-16:30| go home
what I want:
array[0] = "01.01.2015|07:00-07:15|get up"
array[1] = "01.01.2015|08:00|get to work"
array[2] = "01.01.2015|08:00-16:00| work"
array[3] = "01.01.2015|16:00-16:30| go home"
string table[] = new String [100];
void readFile(String fileName){
String read = "";
try {
x = new Scanner (new File(fileName));
}
catch (Exception e) {
}
while (x.hasNext()) {
read += x.nextLine();
}
}
Assuming here that your first code-block is in fact a copy of the file you're trying to read, you can do:
Scanner s = new Scanner(new File("file1.txt"));
List<String> lines = new LinkedList<>();
while (s.hasNextLine())
lines.add(s.nextLine());
If you really want to work with arrays and not lists, you can do
String[] table = lines.toArray(new String[lines.size()]);
after the loop.
If you're fortunate enough to work with Java 8, you can use:
List<String> lines = Files.lines(Paths.get("big.txt"))
.collect(Collectors.toList());
Again, if you really want to work with an array, you can convert the list using lines.toArray.
Since Java 8 you can use Paths.get(String first, String... more), Files.lines(Path path), and Stream.toArray():
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class SOPlayground {
public static void main(String[] args) throws Exception {
Path path = Paths.get("/tmp", "db.txt");
Object[] lines = Files.lines(path).toArray();
System.out.println(lines.length);
System.out.println(lines[0]);
System.out.println(lines[lines.length - 1]);
}
}
Output:
4
01.01.2015|07:00-07:15|get up
01.01.2015|16:00-16:30| go home
Try this solution using arrays:
code
Scanner sc = new Scanner(new File("file.txt"));
int index;
String[] arr = new String[0]; // grown by one slot per line below, so an empty file yields an empty array
for(index = 0; sc.hasNextLine(); index++) {
arr = Arrays.copyOf(arr, index + 1);
arr[index] = sc.nextLine();
}
for(int i = 0; i<arr.length; i++) {
System.out.print(arr[i] + "\n");
}
I have used arr = Arrays.copyOf(arr, index + 1) to increase the size of the array to add next element.
Output
01.01.2015|07:00-07:15|get up
01.01.2015|08:00|get to work
01.01.2015|08:00-16:00| work
01.01.2015|16:00-16:30| go home
Well, it took me some hours. Thanks to all who lent a hand. This is what I got in the end.
int i=0;
String array [] = new String [100];
try {
FileReader textFileReader= new FileReader (fileName);
BufferedReader textReader= new BufferedReader(textFileReader);
boolean keepReading = true; // "continue" is a reserved word in Java and cannot be a variable name
while (keepReading) {
String text = textReader.readLine();
if (text != null){
array[i] = text;
i++;
}else {
keepReading = false;
}
}
}catch (Exception e) {}
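A fixed array of 100 entries overflows on a longer file, and the empty catch block hides any read error. A sketch of the same loop with a list that grows as needed and a try-with-resources block, assuming the file holds one event per line; EventReader and readLines are hypothetical names, and a StringReader stands in for the file so the example runs anywhere:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class EventReader {
    // Reads every line from the source; the reader is closed when done.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String sample = "01.01.2015|07:00-07:15|get up\n01.01.2015|08:00|get to work\n";
        // For a real file, pass new java.io.FileReader(fileName) instead
        for (String line : readLines(new StringReader(sample))) {
            // "|" is a regex metacharacter, so it must be escaped for split
            String[] fields = line.split("\\|", 3);
            System.out.println(fields[0] + " / " + fields[1] + " / " + fields[2].trim());
        }
    }
}
```

If an array is still required, lines.toArray(new String[0]) converts the list at the end.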

Check if a file contains strings and create an array for new strings

I need to create a method that will read the file, and check each word in the file. Each new word in the file should be stored in a string array. The method should be case insensitive. Please help.
The file says the following:
Ask not what your country can do for you
ask what you can do for your country
So the array should only contain: ask, not, what, your, country, can, do, for, you
import java.util.*;
import java.io.*;
public class TextAnalysis {
public static void main (String [] args) throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
System.out.println("TEXT FILE STATISTICS");
System.out.println("--------------------");
System.out.println("Length of the longest word: " + longestWord(fileScanner));
System.out.println("Number of words in file wordlist: " );
countWords();
System.out.println("Word-frequency statistics");
}
public static String longestWord (Scanner s) {
String longest = "";
while (s.hasNext()) {
String word = s.next();
if (word.length() > longest.length()) {
longest = word;
}
}
return (longest.length() + " " + "(\"" + longest + "\")");
}
public static void countWords () throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int count = 0;
while(fileScanner.hasNext()) {
String word = fileScanner.next();
count++;
}
System.out.println("Number of words in file: " + count);
}
public static int wordList (int words) {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int size = words;
String [] list = new String[size];
for (int i = 0; i <= size; i++) {
while(fileScanner.hasNext()){
if(!list[].contains(fileScanner.next())){
list[i] = fileScanner.next();
}
}
}
}
}
You could take advantage of the following code snippet (it will not store the duplicate words)!
File file = new File("names.txt");
FileReader fr = new FileReader(file);
StringBuilder sb = new StringBuilder();
char[] c = new char[256];
int n;
while((n = fr.read(c)) > 0){
sb.append(c, 0, n); // append only the characters actually read
}
fr.close();
// split on any whitespace so that line breaks also separate words
String[] ss = sb.toString().toLowerCase().trim().split("\\s+");
TreeSet<String> ts = new TreeSet<String>();
for(String s : ss)
ts.add(s);
for(String s : ts){
System.out.println(s);
}
And the output is:
ask
can
country
do
for
not
what
you
your
You could always just try:
List<String> words = new ArrayList<String>();
//read lines in your file all at once
List<String> allLines = Files.readAllLines(yourFile, Charset.forName("UTF-8"));
for(int i = 0; i < allLines.size(); i++) {
//change each line from your file to an array of words using "split(" ")".
//Then add all those words to the list "words"
words.addAll(Arrays.asList(allLines.get(i).split(" ")));
}
//convert the list of words to an array.
String[] arr = words.toArray(new String[words.size()]);
Using Files.readAllLines(yourFile, Charset.forName("UTF-8")); to read all the lines of yourFile is much cleaner than reading each individually. The problem with your approach is that you're counting the number of lines, not the number of words. If there are multiple words on one line, your output will be incorrect.
Alternatively, if you are not on Java 7 or later, you can create a list of lines as follows and then count the words at the end (as opposed to your approach in countWords()):
List<String> allLines = new ArrayList<String>();
Scanner fileScanner = new Scanner(yourFile);
while (fileScanner.hasNextLine()) {
allLines.add(fileScanner.nextLine());
}
fileScanner.close();
Then split each line as shown in the previous code and create your array. Also note that, ideally, you should wrap your scanner in a try/catch block rather than declaring throws.
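To get the array the question actually asks for (each new word once, ignoring case, in order of first appearance), one option is a LinkedHashSet. A minimal sketch, assuming words are separated by whitespace and punctuation needs no special handling; UniqueWords and uniqueWords are illustrative names:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class UniqueWords {
    // Collects each word once, lower-cased, in order of first appearance.
    static String[] uniqueWords(List<String> lines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    seen.add(word.toLowerCase(Locale.ROOT));
                }
            }
        }
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Ask not what your country can do for you",
            "ask what you can do for your country");
        // prints [ask, not, what, your, country, can, do, for, you]
        System.out.println(Arrays.toString(uniqueWords(lines)));
    }
}
```

A TreeSet, as in the earlier answer, would give the same words in alphabetical order instead.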

Using Java to read and process textfile with custom column and row separators

I have a text file which contains content scraped from webpages. The text file is structured like this:
|NEWTAB|lkfalskdjlskjdflsj|NEWTAB|lkjsldkjslkdjf|NEWTAB|sdlfkjsldkjf|NEWLINE|lksjlkjsdl|NEWTAB|lkjlkjlkj|NEWTAB|sdkjlkjsld
|NEWLINE| indicates the start of a new line (i.e., a new row in the data)
|NEWTAB| indicates the start of a new field within a line (i.e. a new column in the data)
I need to split the text file into fields and lines and store in an array or some other data structure. Content between |NEWLINE| strings may contain actual new lines (i.e. \n), but these don't indicate an actual new row in the data.
I started by reading each character in one by one and looking at sets of 8 consecutive characters to see if they contained |NEWTAB|. My method proved to be unreliable and ugly. I am looking for the best practice on this. Would the best method be to read the whole text file in as a single string, and then use a string split on "|NEWLINE|" and then string splits on the resulting strings using "|NEWTAB|"?
Many thanks!
I think that the other answers will work too, but my solution is as follows:
FileReader inputStream = null;
StringBuilder builder = new StringBuilder();
try {
inputStream = new FileReader(args[0]);
int c;
char d;
while ((c = inputStream.read()) != -1) {
d = (char)c;
builder.append(d);
}
}
finally {
if (inputStream != null) {
inputStream.close();
}
}
String myString = builder.toString();
String rows[] = myString.split("\\|NEWLINE\\|");
for (String row : rows) {
String cols[] = row.split("\\|NEWTAB\\|");
/* do something with cols - e.g., store */
}
You could do something like this:
Scanner scanner = new Scanner(new File("myFile.txt"));
List<List<String>> rows = new ArrayList<List<String>>();
List<String> column = new ArrayList<String>();
while (scanner.hasNext()) {
for (String elem : scanner.nextLine().split("\\|")) {
System.out.println(elem);
if (elem.equals("NEWTAB") || elem.equals(""))
continue;
else if (elem.equals("NEWLINE")) {
rows.add(column);
column = new ArrayList<String>();
} else
column.add(elem);
}
}
if (!column.isEmpty())
rows.add(column); // the last row has no NEWLINE marker after it
Took me a while to write it up, since I don't have IntelliJ or Eclipse on this computer and had to use Emacs.
EDIT: This is a bit more verbose than I like, but it works with |s that are part of the text:
Scanner scanner = new Scanner(new File("myFile.txt"));
List<List<String>> rows = new ArrayList<List<String>>();
List<String> lines = new ArrayList<String>();
String line = "";
while (scanner.hasNext()) {
line += scanner.nextLine();
int index = 0;
while ((index = line.indexOf("|NEWLINE|")) >= 0) {
lines.add(line.substring(0, index));
line = line.substring(index + 9);
}
}
if (!line.equals(""))
lines.add(line);
for (String l : lines) {
List<String> columns = new ArrayList<String>();
for (String column : l.split("\\|NEWTAB\\|"))
if (!column.equals(""))
columns.add(column);
rows.add(columns);
}
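The split-twice idea from the question can also be written with Pattern.quote, which avoids escaping the | characters by hand. A sketch that works on content already read into a String; DelimitedParser and parse are hypothetical names, and keeping empty fields (limit -1) is a design choice so that column positions stay aligned:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DelimitedParser {
    // Pattern.quote turns the markers into literals, so "|" is not treated as alternation.
    private static final Pattern ROW = Pattern.compile(Pattern.quote("|NEWLINE|"));
    private static final Pattern FIELD = Pattern.compile(Pattern.quote("|NEWTAB|"));

    static List<List<String>> parse(String content) {
        List<List<String>> rows = new ArrayList<>();
        for (String row : ROW.split(content)) {
            if (row.isEmpty()) {
                continue; // skip empty rows, e.g. when the content starts with a row marker
            }
            // limit -1 keeps empty fields, including trailing ones
            rows.add(new ArrayList<>(Arrays.asList(FIELD.split(row, -1))));
        }
        return rows;
    }

    public static void main(String[] args) {
        String content = "|NEWTAB|a|NEWTAB|b|NEWLINE|c|NEWTAB|d\ne";
        for (List<String> row : parse(content)) {
            System.out.println(row);
        }
    }
}
```

A row that begins with |NEWTAB|, like the first one in the sample file, produces an empty first field. Real line breaks inside a field are left untouched because only the two markers are treated as separators.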
