Read & split multiple column text file into arrays - java

For a project I'm working on a fairly big animal dataset with up to 14 parameters of data. I was able to read it in and display it as strings using this:
public static void readIn(String file) throws IOException {
Scanner scanner = new Scanner(new File(file));
while (scanner.hasNext()) {
String[] columns = scanner.nextLine().split("/t");
String data = columns[columns.length-1];
System.out.println(data);
}
}
It displays something like this:
04:00:01 0.11 0.04 -0.1 1047470 977.91 91.75
04:00:01 0.32 -0.03 -0.07 1047505 977.34 92.91
04:00:01 0.49 -0.03 -0.08 1047493 978.66 92.17
But I'm currently having trouble trying to split each column into separate arrays so that I can process the data (e.g. calculating means). Any idea of how I can do this? Any help would be much appreciated.
Edit: thanks, I've found a solution that works and also lets me choose which channel it reads. I've also decided to store the data as arrays within the class. Here's what I have now:
public static void readChannel(String file, int channel) throws IOException
{
List<Double> dataArr = new ArrayList<>();
Scanner scanner = new Scanner(new File(file));
while (scanner.hasNext()) {
String[] columns = scanner.nextLine().split("\t");
for (int i = channel; i < columns.length; i+=(columns.length-channel)) {
dataArr.add(Double.parseDouble(columns[i]));
dataArr.toArray();
}
}
}

You can store all rows in an ArrayList and then create arrays for each column and store values in them. Sample code:
Scanner scanner = new Scanner(new File(file));
ArrayList<String> animalData = new ArrayList<String>();
while (scanner.hasNextLine()) {
// keep the whole row here; it is split on tabs in the loop below
String data = scanner.nextLine();
animalData.add(data);
System.out.println(data);
}
scanner.close();
int size = animalData.size();
String[] arr1 = new String[size]; String[] arr2 = new String[size];
String[] arr3 = new String[size]; String[] arr4 = new String[size];
for(int i=0;i<size;i++)
{
String[] temp = animalData.get(i).split("\t");
arr1[i] = temp[0];
arr2[i] = temp[1];
arr3[i] = temp[2];
arr4[i] = temp[3];
}
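To get numeric columns that can be averaged, one option is to parse each row once and fill one double[] per column. The sketch below is only an illustration: it assumes tab-separated rows whose first column is a timestamp and whose remaining columns are all numeric, and the names ColumnArrays, toColumns and mean are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

class ColumnArrays {

    // Converts tab-separated rows into one double[] per numeric column.
    // Assumption: column 0 is a timestamp and is skipped, and every row
    // has the same number of columns as the first row.
    static double[][] toColumns(List<String> rows) {
        if (rows.isEmpty()) {
            return new double[0][0];
        }
        int numericCols = rows.get(0).split("\t").length - 1;
        double[][] columns = new double[numericCols][rows.size()];
        for (int r = 0; r < rows.size(); r++) {
            String[] fields = rows.get(r).split("\t");
            for (int c = 0; c < numericCols; c++) {
                columns[c][r] = Double.parseDouble(fields[c + 1].trim());
            }
        }
        return columns;
    }

    // Arithmetic mean of one column; NaN for an empty column.
    static double mean(double[] column) {
        return Arrays.stream(column).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "04:00:01\t0.11\t0.04\t-0.1",
                "04:00:01\t0.32\t-0.03\t-0.07",
                "04:00:01\t0.49\t-0.03\t-0.08");
        double[][] columns = toColumns(rows);
        for (int c = 0; c < columns.length; c++) {
            System.out.printf("column %d mean = %.4f%n", c + 1, mean(columns[c]));
        }
    }
}
```

For a real file, the rows could come from Files.readAllLines(Paths.get(file)) instead of the hard-coded list.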

I think you should split your problem into two parts:
File reading:
Your program reads each line and saves it in an instance of a class that you define:
public class MyData {
private String time;
private double percent;
//... and so on, with a getter and a setter for each field
}
public MyData readLine( String line ) {
String[] columns = line.split("\t");
MyData md = new MyData();
md.setTime( columns[ 0 ] );
md.setPercent( Double.parseDouble(columns[ 1 ]) );
//... and so on for the remaining columns
return md;
}
public List<MyData> readFile( File file ) throws FileNotFoundException {
List<MyData> myList = new ArrayList<>();
try (Scanner scanner = new Scanner(file)) {
while (scanner.hasNextLine()) {
myList.add( readLine( scanner.nextLine() ) );
}
}
return myList; // hand the parsed rows back to the caller
}
Data processing:
Once the file has been read, you can write whatever methods you need to process the data, for example a sum over one field:
double sum = 0;
for ( MyData md : myList ) {
sum = sum + md.getPercent();
}
I hope it helps.
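The two-step idea (parse each line into an object, then process the list) can be sketched end to end as follows. This is a minimal illustration, not the answerer's exact class: Reading, parseLine and averageValue are hypothetical names, and only the first two tab-separated columns are used.

```java
import java.util.ArrayList;
import java.util.List;

class RowAverages {

    // Hypothetical row type: one timestamp plus one numeric reading.
    static final class Reading {
        final String time;
        final double value;

        Reading(String time, double value) {
            this.time = time;
            this.value = value;
        }
    }

    // Step 1: turn one tab-separated line into a Reading (columns 0 and 1 only).
    static Reading parseLine(String line) {
        String[] columns = line.split("\t");
        return new Reading(columns[0], Double.parseDouble(columns[1]));
    }

    // Step 2: process the parsed rows; returns NaN when the list is empty.
    static double averageValue(List<Reading> readings) {
        return readings.stream().mapToDouble(r -> r.value).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Reading> readings = new ArrayList<>();
        readings.add(parseLine("04:00:01\t0.11\t0.04"));
        readings.add(parseLine("04:00:01\t0.32\t-0.03"));
        System.out.println("average = " + averageValue(readings));
    }
}
```

Keeping parsing and processing separate means a further statistic only needs a new method over the list, not another pass over the file.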

The following snippet collects the values of each column index into its own list and prints them:
public static void readIn(String file) throws Exception {
Scanner scanner = new Scanner(new File(file));
final Map<Integer,List<String>> resultMap = new HashMap<>();
while (scanner.hasNextLine()) {
// split on the tab character ("\t"), not on the literal text "/t"
String[] columns = scanner.nextLine().split("\t");
for(int i=0;i<columns.length;i++){
resultMap.computeIfAbsent(i, k -> new ArrayList<>()).add(columns[i]);
}
}
scanner.close();
resultMap.keySet().forEach(index -> System.out.println(resultMap.get(index).toString()));
}

Related

How to add up all the values in an ArrayList<String> or convert to ArrayList<Integer>

I'm trying to add up all the values inside of an ArrayList but nothing allows me to get the sum. I have to find the average of the numbers pulled from a text file.
public static void main(String[] args) throws IOException {
File file = new File("C:\\Users\\[REDACTED]\\Desktop\\mbox.txt");
Scanner inputFile = new Scanner(file);
ArrayList<Integer> ConfidenceLevels = new ArrayList<>();
String [] DSPAM;
String line;
while(inputFile.hasNext())
{
line = inputFile.nextLine();
if(line.startsWith("X-DSPAM-Confidence:"))
{
DSPAM = line.split(" 0");
int x = Integer.parseInt(DSPAM[1].trim());
ConfidenceLevels.add(x);
}
}
System.out.println(ConfidenceLevels);
System.out.println(ConfidenceLevels.size());
}
You are only adding the elements to the list.
If you want their sum, keep a running total in a variable and add each value to it.
public static void main(String[] args) throws IOException {
File file = new File("C:\\Users\\[REDACTED]\\Desktop\\mbox.txt");
Scanner inputFile = new Scanner(file);
Integer confidenceLevelsSum = 0;
String [] DSPAM;
String line;
while(inputFile.hasNext())
{
line = inputFile.nextLine();
if(line.startsWith("X-DSPAM-Confidence:"))
{
DSPAM = line.split(" 0");
// split() never returns null elements, so check the array length instead.
if (DSPAM.length < 2) {
continue;
}
int x = Integer.parseInt(DSPAM[1].trim());
confidenceLevelsSum += x;
}
}
System.out.println(confidenceLevelsSum);
}
You can also use IntSummaryStatistics to get the min, max, average and sum of this list.
IntSummaryStatistics intSummaryStatistics = ConfidenceLevels.stream().mapToInt(Integer::intValue).summaryStatistics();
System.out.println(intSummaryStatistics.getAverage());
System.out.println(intSummaryStatistics.getMin());
System.out.println(intSummaryStatistics.getMax());
System.out.println(intSummaryStatistics.getSum());
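If the confidence values are decimals rather than whole numbers, the same idea works with DoubleSummaryStatistics. The sketch below assumes (this is not stated in the question) that each matching line looks like "X-DSPAM-Confidence: 0.8475"; the class name ConfidenceStats and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

class ConfidenceStats {

    static final String PREFIX = "X-DSPAM-Confidence:";

    // Collects statistics over every line that starts with PREFIX.
    // Assumption: the text after the prefix is a decimal number.
    static DoubleSummaryStatistics summarize(List<String> lines) {
        return lines.stream()
                .filter(line -> line.startsWith(PREFIX))
                .map(line -> line.substring(PREFIX.length()).trim())
                .mapToDouble(Double::parseDouble)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "X-DSPAM-Confidence: 0.5",
                "Subject: unrelated header",
                "X-DSPAM-Confidence: 1.0");
        DoubleSummaryStatistics stats = summarize(lines);
        System.out.println("count   = " + stats.getCount());
        System.out.println("sum     = " + stats.getSum());
        System.out.println("average = " + stats.getAverage());
    }
}
```

Taking the substring after the prefix avoids depending on how the number itself starts, which is what splitting on " 0" relies on.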

About sorting elements of a line in a csv file and storing it into different Arraylists

So I have a CSV file that kinda looks like this :
LS1312, AWFXQA, -12107909
LS1313, VEDSBJV, -55003726
LS1314, UCRQSE, -1711111
....
Therefore, I'm wondering how I can store the first element of each line (for example LS1312, LS1313) into my first ArrayList, the second element (AWFXQA, VEDSBJV) into the second ArrayList, and the last element (-12107909, -55003726) into the third ArrayList.
This is what I have right now: I managed to read the file but am not sure how to store the elements in their respective ArrayLists.
public static void main(String[] args) throws FileNotFoundException {
Scanner scanner = new Scanner(new File(args[0]));
scanner.useDelimiter(",");
ArrayList < String > arrlist1 = new ArrayList < >();
ArrayList < String > arrlist2 = new ArrayList < >();
ArrayList < String > arrlist3 = new ArrayList < >();
while (scanner.hasNext()) {
arrlist1.add(scanner.next());
}
scanner.close();
}
You can use the Apache Commons CSV library.
A simple example:
Reader in = new FileReader("yourFile.csv");
Iterable<CSVRecord> records = CSVFormat.RFC4180.parse(in);
for (CSVRecord record : records) {
// trim() drops the space that follows each comma in your file
arrlist1.add(record.get(0).trim());
arrlist2.add(record.get(1).trim());
arrlist3.add(record.get(2).trim());
}
You should use the split() method to split each line and add each part to the corresponding list; trim() removes the space that follows each comma in your file:
public static void main(String[] args) throws FileNotFoundException {
Scanner scanner = new Scanner(new File(args[0]));
ArrayList<String> arrlist1 = new ArrayList<>();
ArrayList<String> arrlist2 = new ArrayList<>();
ArrayList<String> arrlist3 = new ArrayList<>();
String[] split;
while (scanner.hasNextLine()) {
split= scanner.nextLine().split(",");
if (split.length < 3) {
continue; // skip blank or incomplete lines
}
arrlist1.add(split[0].trim());
arrlist2.add(split[1].trim());
arrlist3.add(split[2].trim());
}
scanner.close();
}
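The same splitting logic can be written for any number of columns by keeping a list of lists. This is a hedged sketch on in-memory lines; CsvColumns and splitColumns are made-up names, and it only handles plain comma-separated values (no quoted fields containing commas).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvColumns {

    // Splits each "a, b, c" line on commas (ignoring surrounding spaces) and
    // returns one list per column. Lines with fewer than columnCount fields
    // are skipped.
    static List<List<String>> splitColumns(List<String> lines, int columnCount) {
        List<List<String>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(new ArrayList<>());
        }
        for (String line : lines) {
            String[] fields = line.trim().split("\\s*,\\s*");
            if (fields.length < columnCount) {
                continue; // malformed or blank line
            }
            for (int c = 0; c < columnCount; c++) {
                columns.get(c).add(fields[c]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "LS1312, AWFXQA, -12107909",
                "LS1313, VEDSBJV, -55003726",
                "LS1314, UCRQSE, -1711111");
        List<List<String>> columns = splitColumns(lines, 3);
        System.out.println(columns.get(0)); // [LS1312, LS1313, LS1314]
        System.out.println(columns.get(1)); // [AWFXQA, VEDSBJV, UCRQSE]
        System.out.println(columns.get(2)); // [-12107909, -55003726, -1711111]
    }
}
```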

Input is a String array, Output is an ArrayList

My problem is that I want to insert a String[] into an ArrayList, which works fine.
But when I try to get my String[] back from the ArrayList, I only get the individual strings. Is there any way I can retrieve my String[]?
I cannot seem to find a method that solves this problem, and I would like to avoid programming around it.
public ArrayList<String> getAllAttributes() throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(vmxConfigFile));
ArrayList<String> attributes = new ArrayList<String>();
for(int i = 0 ; i < getAmountOfAttributes() ; i++) {
String line = reader.readLine();
String[] attribute = line.split("=");
attributes.addAll(Arrays.asList(attribute));
}
return attributes;
}
public static void main(String[] args) throws IOException {
VMXConverter vmx = new VMXConverter("file:///C://Users//trisi//Downloads//vCloud-Availability.vmx_ ");
ArrayList<String> list = vmx.getAllAttributes();
for(int i = 0; i < list.size(); i++) {
// What i want to do: String[] x = list.get(i);
String x = list.get(i); // this is currently my only option
}
}
I expect that each element I grab from my list is an array of size 2, holding 2 keywords.
Instead, I get a list in which the elements are no longer grouped as arrays.
This is because you are not adding the String array itself to your ArrayList; addAll adds every element of the String array, so you end up with one big flat list.
What you can do is return not an ArrayList of String but an ArrayList of String[]:
public ArrayList<String[]> getAllAttributes() throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(vmxConfigFile));
ArrayList<String[]> attributes = new ArrayList<>();
for(int i = 0 ; i < getAmountOfAttributes() ; i++) {
String line = reader.readLine();
String[] attribute = line.split("=");
attributes.add(attribute);
}
return attributes;
}
public static void main(String[] args) throws IOException {
VMXConverter vmx = new VMXConverter("file:///C://Users//trisi//Downloads//vCloud-Availability.vmx_ ");
ArrayList<String[]> list = vmx.getAllAttributes();
for(int i = 0; i < list.size(); i++) {
String[] x = list.get(i);
}
}
However, I do not think this is what you want. If I understand correctly, you have a file containing key=value pairs. I suggest loading them into a Map, which is much easier to work with.
You can load them into a map like this:
public Map<String,String> getAllAttributesMap() throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(vmxConfigFile));
Map<String,String> attributes = new HashMap<>();
String line;
while( (line = reader.readLine()) != null){
String[] attribute = line.split("=");
if(attribute.length>0 && attribute[0]!=null)
attributes.put(attribute[0],attribute.length<2?null:attribute[1]); // null value when the line has no '=' part
}
reader.close();
return attributes;
}
And so you can easily get a value of a key from the map.
Map<String,String> attributes=getAllAttributesMap();
System.out.println("Value of foo: "+attributes.get("foo"));
System.out.println("Value of bar: "+attributes.get("bar"));
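A slightly more defensive variant of the map-loading idea is sketched below. It is an illustration under assumptions: it works on lines already read into memory, the class name KeyValueLines and the sample keys are invented, and it uses split("=", 2) so that a value containing '=' is not cut off.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeyValueLines {

    // Parses "key = value" lines into a map that keeps the file order.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split("=", 2); // limit 2: keep any '=' inside the value
            if (parts.length < 2 || parts[0].trim().isEmpty()) {
                continue; // no '=' on this line, or an empty key
            }
            attributes.put(parts[0].trim(), parts[1].trim());
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Made-up sample lines in key=value form.
        List<String> lines = Arrays.asList(
                "foo = 1",
                "bar = a=b",
                "this line has no separator");
        Map<String, String> attributes = parse(lines);
        System.out.println("Value of foo: " + attributes.get("foo")); // Value of foo: 1
        System.out.println("Value of bar: " + attributes.get("bar")); // Value of bar: a=b
    }
}
```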
At the moment you have a list of Strings. When you add the String[] to the list each single element of the array will be added separately to the list.
If you want to keep the arrays as they are, then you need to use a list of String[].
public ArrayList<String[]> getAllAttributes() throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(vmxConfigFile));
ArrayList<String[]> attributes = new ArrayList<>();
for(int i = 0 ; i < getAmountOfAttributes() ; i++) {
String line = reader.readLine();
String[] attribute = line.split("=");
attributes.add(attribute);
}
return attributes;
}
public static void main(String[] args) throws IOException {
VMXConverter vmx = new VMXConverter("file:///C://Users//trisi//Downloads//vCloud-Availability.vmx_ ");
ArrayList<String[]> list = vmx.getAllAttributes();
for(int i = 0; i < list.size(); i++) {
String[] x = list.get(i);
}
}
You need ArrayList<String[]> instead of ArrayList<String>.
Something like this:
public ArrayList<String[]> getAllAttributes() throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(vmxConfigFile));
ArrayList<String[]> attributes = new ArrayList<>();
for(int i = 0 ; i < getAmountOfAttributes() ; i++) {
String line = reader.readLine();
String[] attribute = line.split("=");
attributes.add(attribute);
}
return attributes;
}

Compare 2 *.txt files and print difference between them

I have a problem writing the code for comparing two files (first the reference file):
PROTOCOL STATE SERVICE
1 open icmp
6 open tcp
17 open udp
and (execution file)
PROTOCOL STATE SERVICE
1 open icmp
6 open tcp
17 open udp
255 closed unknown
and save difference between these two files in new file (255 closed unknown).
For comparing I have used following code but it seems it doesn't work.
public String[] compareResultsAndDecide(String refFile, String execFile) throws IOException {
String[] referenceFile = parseFileToStringArray(refFile);
String[] execCommand = parseFileToStringArray(execFile);
List<String> tempList = new ArrayList<String>();
for(int i = 1; i < execCommand.length ; i++)
{
boolean foundString = false; // To be able to track if the string was found in both arrays
for(int j = 1; j < referenceFile.length; j++)
{
if(referenceFile[j].equals(execCommand[i]))
{
foundString = true;
break; // If it exist in both arrays there is no need to look further
}
}
if(!foundString) // If the same is not found in both..
tempList.add(execCommand[i]); // .. add to temporary list
}
String diff[] = tempList.toArray(new String[0]);
if(diff != null) {
return diff;
}
For String refFile I would use /home/xxx/Ref.txt, the path to the reference file, and likewise for execFile (the second file shown above).
Can anyone help me with this?
Just to add, this is what I use to parse a file into a String array:
public String[] parseFileToStringArray(String filename) throws FileNotFoundException {
Scanner sc = new Scanner(new File(filename));
List<String> lines = new ArrayList<String>();
while (sc.hasNextLine()) {
lines.add(sc.nextLine());
}
String[] arr = lines.toArray(new String[0]);
return arr;
}
Your loops start at index 1, which skips the first line of each file. Change int i = 1 to int i = 0 and int j = 1 to int j = 0 if the first line should be compared as well.
Your compareResultsAndDecide method can be changed like this:
public String[] compareResultsAndDecide(String refFile, String execFile) throws IOException {
String[] referenceFile = parseFileToStringArray(refFile);
String[] execCommand = parseFileToStringArray(execFile);
// start with every line of the execution file, then drop the ones found in the reference file
List<String> diff = new ArrayList<>(Arrays.asList(execCommand));
diff.removeAll(Arrays.asList(referenceFile));
String[] toReturn = new String[diff.size()];
toReturn = diff.toArray(toReturn);
return toReturn;
}
and your parseFileToStringArray like this:
public String[] parseFileToStringArray(String filename) throws FileNotFoundException {
Scanner sc = new Scanner(new File(filename));
List<String> lines = new ArrayList<String>();
while (sc.hasNextLine()) {
lines.add(sc.nextLine());
}
String[] arr = new String[lines.size()];
return lines.toArray(arr);
}
The problem may be in your .txt files: lines that look the same can still differ in whitespace, or the two files can use different encodings.
I know this isn't the best way, but if you use the replaceAll() method to strip the spaces from each line as you read it, your code should work. Unfortunately, you will lose the spaces between the words of each line.
Change:
lines.add(sc.nextLine());
To:
lines.add(sc.nextLine().replaceAll(" ", ""));
Note:
I tried using trim() but it didn't work well for me.
As mentioned above, array elements start from 0, not from 1. Change that too.
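Putting the suggestions above together, a line-by-line difference that tolerates stray leading or trailing whitespace can be sketched as follows. The names LineDiff and missingFrom are invented for this example, and comparing lines after trim() is a design choice of the sketch, not something the question requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LineDiff {

    // Returns the lines of exec that do not occur in reference, in their
    // original order. Lines are compared after trim(), so leading and
    // trailing whitespace does not count as a difference.
    static List<String> missingFrom(List<String> reference, List<String> exec) {
        Set<String> known = new HashSet<>();
        for (String line : reference) {
            known.add(line.trim());
        }
        List<String> diff = new ArrayList<>();
        for (String line : exec) {
            if (!known.contains(line.trim())) {
                diff.add(line);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        List<String> reference = Arrays.asList(
                "PROTOCOL STATE SERVICE", "1 open icmp", "6 open tcp", "17 open udp");
        List<String> exec = Arrays.asList(
                "PROTOCOL STATE SERVICE", "1 open icmp", "6 open tcp ", "17 open udp",
                "255 closed unknown");
        System.out.println(missingFrom(reference, exec)); // [255 closed unknown]
    }
}
```

To save the result, the returned list can be passed to Files.write(Paths.get("diff.txt"), diff), where diff.txt is just a placeholder file name.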

Check if a file contains strings and create an array for new strings

I need to create a method that will read the file, and check each word in the file. Each new word in the file should be stored in a string array. The method should be case insensitive. Please help.
The file says the following:
Ask not what your country can do for you
ask what you can do for your country
So the array should only contain: ask, not, what, your, country, can, do, for, you
import java.util.*;
import java.io.*;
public class TextAnalysis {
public static void main (String [] args) throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
System.out.println("TEXT FILE STATISTICS");
System.out.println("--------------------");
System.out.println("Length of the longest word: " + longestWord(fileScanner));
System.out.println("Number of words in file wordlist: " );
countWords();
System.out.println("Word-frequency statistics");
}
public static String longestWord (Scanner s) {
String longest = "";
while (s.hasNext()) {
String word = s.next();
if (word.length() > longest.length()) {
longest = word;
}
}
return (longest.length() + " " + "(\"" + longest + "\")");
}
public static void countWords () throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int count = 0;
while(fileScanner.hasNext()) {
String word = fileScanner.next();
count++;
}
System.out.println("Number of words in file: " + count);
}
public static int wordList (int words) {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int size = words;
String [] list = new String[size];
for (int i = 0; i <= size; i++) {
while(fileScanner.hasNext()){
if(!list[].contains(fileScanner.next())){
list[i] = fileScanner.next();
}
}
}
}
}
You could use the following code snippet (it will not store duplicate words):
File file = new File("names.txt");
FileReader fr = new FileReader(file);
StringBuilder sb = new StringBuilder();
char[] c = new char[256];
int n;
while((n = fr.read(c)) > 0){
sb.append(c, 0, n); // append only the characters actually read
}
fr.close();
// split on any whitespace so that line breaks also separate words
String[] ss = sb.toString().toLowerCase().trim().split("\\s+");
TreeSet<String> ts = new TreeSet<String>();
for(String s : ss)
ts.add(s);
for(String s : ts){
System.out.println(s);
}
And the output is:
ask
can
country
do
for
not
what
you
your
You could always just try:
List<String> words = new ArrayList<String>();
//read all lines of your file at once (yourFile must be a java.nio.file.Path)
List<String> allLines = Files.readAllLines(yourFile, Charset.forName("UTF-8"));
for(int i = 0; i < allLines.size(); i++) {
//change each line from your file to an array of words using "split(" ")".
//Then add all those words to the list "words"
words.addAll(Arrays.asList(allLines.get(i).split(" ")));
}
//convert the list of words to an array.
String[] arr = words.toArray(new String[words.size()]);
Using Files.readAllLines(yourFile, Charset.forName("UTF-8")) to read all the lines of yourFile is much cleaner than reading each one individually. Keep in mind that a line can contain several words, so counting lines is not the same as counting words; each line has to be split into words first.
Alternatively, if you are not on Java 7 or later, you can build the list of lines as follows and then count the words at the end (as opposed to your approach in countWords()):
List<String> allLines = new ArrayList<String>();
Scanner fileScanner = new Scanner(yourFile);
while (fileScanner.hasNextLine()) {
allLines.add(fileScanner.nextLine());
}
fileScanner.close();
Then split each line as shown in the previous code and create your array. Also note that, ideally, you should wrap your scanner in a try/catch block rather than just declaring throws.
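Neither answer above keeps the words in the order the question lists them (ask, not, what, ...). A LinkedHashSet does, because it drops duplicates while remembering insertion order. The following is a minimal sketch working on lines already read into memory; UniqueWords and uniqueWords are made-up names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UniqueWords {

    // Returns each distinct word once, lower-cased, in order of first appearance.
    static String[] uniqueWords(List<String> lines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            for (String word : line.trim().toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) { // a blank line yields one empty token
                    seen.add(word);
                }
            }
        }
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Ask not what your country can do for you",
                "ask what you can do for your country");
        // prints [ask, not, what, your, country, can, do, for, you]
        System.out.println(Arrays.toString(uniqueWords(lines)));
    }
}
```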
