Recently one of my data servers went down and a large number of video files are damaged (over 15,000 files, or more than 60TB). I wrote a script to check all files and put results in a very big log.txt file (almost 8GB).
I wrote code to find all lines starting with "Input #0" and all lines that contain "damaged", then added their line numbers to two ArrayLists. Next, I need to compare those two ArrayLists: for each number in list2, find the closest preceding number in list1, so I can get the file names back from the log file.
For example:
if list1 contains numbers {1, 5, 45, 55, 100, 2000... etc}
and list2 contains numbers {50, 51, 53, 2010... etc} the result should be {45, 2000... etc}
This is my current code:
import java.io.*;
import java.util.*;
public class Log {
public static void main(String [] args) throws IOException{
ArrayList<Integer> list1 = new ArrayList<Integer>();
ArrayList<Integer> list2 = new ArrayList<Integer>();
File file = new File("C:\\log.txt");
try {
Scanner scanner = new Scanner(file);
Scanner scanner2 = new Scanner(file);
int lineNum = 0;
int lineNum2 = 0;
while (scanner.hasNextLine()){
String line = scanner.nextLine();
String line2 = scanner.nextLine();
lineNum++;
lineNum2++;
if((line.startsWith("Input #0"))) {
list1.add(lineNum);
}
if((line2.contains("damaged"))) {
list2.add(lineNum2);
}
}
This is what I'm getting from the code above:
list1 [5, 262, 304, 488, 523, 1189, 1796, 2503, 2722, 4052, 4201, 4230, 4298, 4312, 4559, 4887, 4903, 5067....]
list2 [1838, 1841, 1842, 1844, 1851, 1861, 1865, 1866, 1868, 1875, 1878, 1879, 1880, 1881, 1886, 1887, 1891....]
Some log data:
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. NOVHighb668ca7d201411141051110636.m2v':
.
.
.
.
.
.
Data with damage:
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. NOVHighb668ca7d201411141051110636.m2v':
.
.
.
.
.
[error 0x090010] file damaged at 16 09
[error 0x090010] file damaged at 19 15
The log for each individual file does not contain any pattern except for the first 5-6 lines or so. Both damaged and non-damaged files contain info written in 20 to 100+ lines.
So, from these numbers the first result should be number 1796.
I'm pretty much a novice in Java and I need help.
Here's a small piece of code that will do the job. I don't know whether you want duplicate values in the result, so I saved them in both a list and a set; choose the one you prefer:
public static void main(String[] args) {
int[] list1 = {5, 262, 304, 488, 523, 1189, 1796, 2503, 2722, 4052, 4201, 4230, 4298, 4312, 4559};
int[] list2 = {1838, 1841, 1842, 1844, 1851, 1861, 1865, 1866, 1868, 1875, 1878, 1879, 1880, 1881};
ArrayList<Integer> resultList = new ArrayList<Integer>();
Set<Integer> resultSet = new HashSet<Integer>();
int j = 0;
for(int i = 0; i < list2.length; i++){
for(; j < list1.length; j++){
if(list1[j] > list2[i])
break;
}
if(j > 0){ // skip values of list2 that come before the first entry of list1
resultList.add(list1[j-1]);
resultSet.add(list1[j-1]);
}
}
System.out.println(resultList);
System.out.println(resultSet);
}
Output:
[1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796, 1796]
[1796]
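The same lookup can also be written with a TreeSet, whose floor() method returns the greatest element less than or equal to a given value. This is only a minimal sketch using the example numbers from the question; the class and method names are made up for illustration, and it assumes that a "damaged" line always belongs to the nearest "Input #0" line before it.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ClosestPreceding {

    // For every value in list2, find the largest value in list1 that is <= it.
    // Values of list2 that come before the first entry of list1 are skipped.
    static Set<Integer> closestPreceding(List<Integer> list1, List<Integer> list2) {
        TreeSet<Integer> starts = new TreeSet<>(list1);
        Set<Integer> result = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (int damagedLine : list2) {
            Integer start = starts.floor(damagedLine); // null when nothing precedes it
            if (start != null) {
                result.add(start);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list1 = List.of(1, 5, 45, 55, 100, 2000);
        List<Integer> list2 = List.of(50, 51, 53, 2010);
        System.out.println(closestPreceding(list1, list2)); // prints [45, 2000]
    }
}
```

Each floor() call is a logarithmic-time lookup, so this should stay fast even with millions of line numbers.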
You defined two scanners (which seems unnecessary), but you only use one of them and call nextLine() twice on it. That does not look intended, and as a consequence each check only sees every other line, so the results you are getting are wrong. It would be very helpful if you could post a sample excerpt from your log file (you can filter out the sensitive data) so that we can determine the best approach.
I think you should scrap your current approach because it does not seem like an efficient way to solve your problem of needing to find filenames of damaged files.
Depending on how your data looks, you can use regular expressions and possibly even extract the filenames directly into a Set.
Edit: Added some rough code that should do the job for you if you are indeed correct that each file starts with "Input #0". As long as there is a pattern in the log data for each file, then you should always be able to extract the data you need directly instead of going through the mess of matching entries from two separate arraylists.
public static void main(String [] args) throws FileNotFoundException{
Set<String> damagedFiles = new LinkedHashSet<String>();
File file = new File("C:\\log.txt");
Scanner scanner = new Scanner(file);
String filename = null;
try {
int lineNum = 0;
while (scanner.hasNextLine()){
String line = scanner.nextLine();
if(line.startsWith("Input #0")){
/*if desired, can use a regex lookahead to get only the path and filename
instead of the entire Input #0 line */
filename = line;
}
if(line.contains("damaged")){
if (filename != null){
damagedFiles.add(filename);
}
}
}
} finally {
scanner.close();
for (String s : damagedFiles){
System.out.println(s);
}
}
}
This is the result I got when running this code on a sample log file where I named the damaged files dmg#.m2v
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg1.m2v':
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg2.m2v':
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg3.m2v':
Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg4.m2v':
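If you only want the path rather than the whole "Input #0" line, a regular expression with a capturing group is one way to do it. The sketch below assumes every such line has the shape shown in the sample log (the path in single quotes after "from", followed by a colon); the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InputLineParser {

    // Captures the text between "from '" and the closing "':" at the end of the line.
    private static final Pattern INPUT_LINE =
            Pattern.compile("Input #0, .*?from '(.*)':\\s*");

    // Returns the path inside the quotes, or null if the line is not an "Input #0" line.
    static String extractPath(String line) {
        Matcher m = INPUT_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "Input #0, mpegvideo, from '/cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg1.m2v':";
        System.out.println(extractPath(line));
        // prints /cinegy/cinegy/VIDEO/BSF/BLOK 3 - 14. dmg1.m2v
    }
}
```

In the loop above you would then store extractPath(line) in filename instead of the whole line.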
I have the following sample data in a .txt file
111, Sybil, 21
112, Edith, 22
113, Mathew, 30
114, Mary, 25
the required output is
[{"number":"111","name":"Sybil","age":"21"},
{"number":"112","name":"Edith","age":"22"},
{"number":"113","name":"Mathew","age":"30"},
{"number":"114","name":"Mary","age":"25"}]
Sadly, I have not gotten far because I can't seem to get the values out of each line. Instead, this is what is displayed:
[one, two, three]
private void loadFile() throws FileNotFoundException, IOException {
File txt = new File("Users.txt");
try (Scanner scan = new Scanner(txt)) {
ArrayList data = new ArrayList<>() ;
while (scan.hasNextLine()) {
data.add(scan.nextLine());
System.out.print(scan.nextLine());
}
System.out.print(data);
}
I would appreciate any help. thank you
Not too sure about the requirements. If you just need to know how to get the values out, then use String.split() combined with Scanner.nextLine().
Code below:
private void loadFile() throws FileNotFoundException, IOException {
File txt = new File("Users.txt");
try (Scanner scan = new Scanner(txt)) {
ArrayList<String> data = new ArrayList<>();
while (scan.hasNextLine()) {
// call nextLine() only once per iteration, otherwise every other line is skipped
// split the line on ", " into at most 3 parts
String[] input = scan.nextLine().split(", ", 3);
data.add(input[0]);
data.add(input[1]);
data.add(input[2]);
}
System.out.print(data);
}
}
The output would be as below and you can further modify it yourself:
[111, Sybil, 21, 112, Edith, 22, 113, Mathew, 30, 114, Mary, 25]
However, if you need the required format as well, the closest I can get is to use a HashMap for each line and put it into the ArrayList.
Code below:
private void loadFile() throws FileNotFoundException, IOException {
File txt = new File("Users.txt");
try (Scanner scan = new Scanner(txt)) {
ArrayList data = new ArrayList<>();
while (scan.hasNextLine()) {
// Create a hashmap to store data in correct format,
HashMap<String, String> info = new HashMap<>();
String[] input = scan.nextLine().split(", ", 3);
info.put("number", input[0]);
info.put("name", input[1]);
info.put("age", input[2]);
// Put it inside the ArrayList
data.add(info);
}
System.out.print(data);
}
}
And the output would be:
[{number=111, name=Sybil, age=21}, {number=112, name=Edith, age=22}, {number=113, name=Mathew, age=30}, {number=114, name=Mary, age=25}]
Hope this answer helps you well.
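If you need the exact JSON-style text from the question rather than the Map's default toString() format, you can build the string yourself. This is a rough sketch, not a full JSON writer: it does no escaping, so it assumes the fields never contain quotes or backslashes, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class UsersToJson {

    // Turns lines such as "111, Sybil, 21" into a JSON array of objects.
    // No escaping is done, so fields must not contain quotes or backslashes.
    static String toJson(List<String> lines) {
        List<String> objects = new ArrayList<>();
        for (String line : lines) {
            // split on commas, ignoring any spaces around them
            String[] f = line.trim().split("\\s*,\\s*", 3);
            if (f.length < 3) {
                continue; // skip blank or malformed lines
            }
            objects.add(String.format(
                    "{\"number\":\"%s\",\"name\":\"%s\",\"age\":\"%s\"}",
                    f[0], f[1], f[2]));
        }
        return "[" + String.join(",\n", objects) + "]";
    }

    public static void main(String[] args) {
        List<String> lines = List.of("111, Sybil, 21", "112, Edith, 22");
        System.out.println(toJson(lines));
    }
}
```

To run it against the real file, the lines could come from Files.readAllLines(Paths.get("Users.txt")). For anything beyond a quick script, a JSON library would be the safer choice.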
Currently, you're skipping lines. A quote from the Scanner::nextLine documentation:
This method returns the rest of the current line, excluding any line separator at the end. The position is set to the beginning of the next line.
So you're adding one line to your list, and writing the next one to the console.
To get the data from each line, you can use the String::split method, which supports RegEx.
Example:
"line of my file".split(" ")
We can use streams to write some compact code.
First we define a record to hold our data.
Files.lines opens your file and produces a stream of strings, one per line, read lazily as the stream is consumed.
We call Stream#map to produce another stream, a series of string arrays. Each array has three elements, the three fields within each line.
We call map again, this time to produce a stream of Person objects. We construct each Person object by parsing each of the line's three fields and passing them to the constructor.
We call Stream#toList to collect those person objects into a list.
We call List#toString to generate text representing the contents of the list of person objects.
record Person ( int id , String name , int age ) {}
String output =
Files
.lines( Path.of( "/path/to/Users.txt" ) )
.map( line -> line.split( ", " ) )
.map( parts -> new Person(
Integer.parseInt( parts[ 0 ] ) ,
parts[ 1 ] ,
Integer.parseInt( parts[ 2 ] )
) )
.toList()
.toString()
;
If the format of the default Person#toString method does not suit you, add an override of that method to produce your desired output.
I am trying to read a file but I am not able to get the correct output from it. Can someone tell me how should I change the code to make it work? isNum() function in the code is a method that checks whether the string is a number or not (because I need to put 5 and 10 in a separate variable).
Edit: I have changed the code a bit after listening to the suggestions and it looks better now, but there is still a problem. The code and output below have been updated.
int numEv = 0;
Scanner input = new Scanner(System.in);
ArrayList<String> evtList = new ArrayList<String>();
try {
input = new Scanner(Paths.get("src/idse/Events.txt"));
} catch (IOException e) {
System.out.println(e);
}
try {
while(input.hasNext()) {
String a = input.nextLine();
if (isNum(a)){
numEv = Integer.parseInt(a);
System.out.println(numEv);
}
else if(!a.isEmpty()&&!isNum(a)){
String[] parts = a.split(":");
for (String part : parts) {
evtList.add(part);
}
System.out.println(evtList);
}
if(isNum(a)){
evtList.clear();
}
}
The output that I am getting is:
5
[Logins, 2, Total time online, 1, Emails sent, 1, Orders processed, 1]
[Logins, 2, Total time online, 1, Emails sent, 1, Orders processed, 1, Pizza’s ordered online, 0.5]
10
[Logins, 7, Total time online, 5, Emails sent, 9, Orders processed, 15]
[Logins, 7, Total time online, 5, Emails sent, 9, Orders processed, 15, Pizza’s ordered online, 0.9, Logouts, 6]
The output that I want is:
5
[Logins, 2, Total time online, 1, Emails sent, 1, Orders processed, 1, Pizza’s ordered online, 0.5]
10
[Logins, 7, Total time online, 5, Emails sent, 9, Orders processed, 15, Pizza’s ordered online, 0.9, Logouts, 6]
There are 3 fixes you should make; follow these steps:
Correct your file format.
Change the format to:
5
Logins:2:Total time online:1:Emails sent:1:Orders processed:1:Pizza’s ordered online:0.5:
10
Logins:7:Total time online:5:Emails sent:9:Orders processed:15:Pizza’s ordered online:0.9:Logouts:6:
This will separate the file by lines as you want.
Move the System.out.println() calls into the code blocks:
if (isNum(a)){
numEv = Integer.parseInt(a);
System.out.println(numEv);
}
else if(!a.isEmpty()&&!isNum(a)){
String[] parts = a.split(":");
for (String part : parts) {
evtList.add(part);
}
System.out.println(evtList);
}
This will fix your overly long output, because the code currently prints some unnecessary lines.
Clear the event list:
evtList.clear();
Add this line at the end of every iteration of the while loop, so that the list holds only the entries of the current line instead of filling up with entries from previous events.
Based on what you specified in the comments (e.g. You cannot change the input file format), you would always have to check the next line of the file to see if the specific input code has ended. I would use this trick to read the next line without moving the pointer.
int numEv = 0;
Scanner input = new Scanner(System.in); // idk what you need this for
ArrayList<String> evtList = new ArrayList<String>();
// try-with-resources keeps the reader in scope for the loop and closes it afterwards
try (BufferedReader reader = Files.newBufferedReader(Paths.get("src/idse/Events.txt"))) {
    String a;
    while ((a = reader.readLine()) != null) {
        if (isNum(a)) { // Reading and printing the number
            numEv = Integer.parseInt(a);
            System.out.println(numEv);
        } else if (!a.isEmpty()) { // Getting and storing the code
            String[] parts = a.split(":");
            for (String part : parts) {
                evtList.add(part);
            }
        }
        // Peek at the next line: mark the position, read ahead, then rewind.
        // The read-ahead limit must be larger than the longest line in the file.
        reader.mark(8192);
        String next = reader.readLine();
        if (next == null || isNum(next)) { // If the next line is a number or doesn't exist, we print and clear the code
            System.out.println(evtList);
            evtList.clear();
        }
        reader.reset();
    }
} catch (IOException e) {
    System.out.println(e);
}
I hope this works!
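An alternative that avoids peeking is to start a new list whenever a number line appears and keep appending to it until the next one. The sketch below is only an illustration: the input layout (a number line followed by one or more "name:value:" lines) is an assumption based on the output shown in the question, isNum() is a stand-in for the asker's own method, and duplicate numbers would overwrite each other in the map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EventGroups {

    // Stand-in for the question's isNum(): true for a line holding only digits.
    static boolean isNum(String s) {
        return s.trim().matches("\\d+");
    }

    // Collects every "name:value:..." line under the most recent number line.
    static Map<Integer, List<String>> group(List<String> lines) {
        Map<Integer, List<String>> groups = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            if (isNum(line)) {
                current = new ArrayList<>(); // a number starts a new group
                groups.put(Integer.parseInt(line.trim()), current);
            } else if (current != null) {
                for (String part : line.split(":")) {
                    current.add(part);
                }
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "5",
                "Logins:2:Total time online:1:",
                "Emails sent:1:",
                "10",
                "Logins:7:Logouts:6:");
        group(lines).forEach((count, events) -> {
            System.out.println(count);
            System.out.println(events);
        });
    }
}
```

Because each list is only printed after the whole input has been grouped, every number is followed by exactly one complete list.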
I want to write a program that can print and modify irregular CSV files. The format is as follows:
1.date
2.organization name
3. student name, id number, residence
student name, id number, residence
student name, id number, residence
student name, id number, residence
student name, id number, residence
1.another date
2.another organization name
3. student name, id number, residence
student name, id number, residence
student name, id number, residence
..........
For instance, the data may be given as follows:
1. 10/09/2016
2. cycling club
3. sam, 1000, oklahoma
henry, 1001, california
bill, 1002, NY
1. 11/15/2016
2. swimming club
3. jane, 9001, georgia
elizabeth, 9002, lousiana
I am a beginner and I have not found any viable resource online that deals with this type of problem. My main concern is: how do we iterate through the file, identify the date and the name of the club, and feed them into an array?
Please advise.
I think this should be helpful for you. Basically there should be some pattern in your messed up csv. Below is my code to arrange your csv
public static void main(String[] args) throws FileNotFoundException, UnsupportedEncodingException {
PrintWriter writer = new PrintWriter("file.txt", "UTF-8");
try{
//Create object of FileReader
FileReader inputFile = new FileReader("csv.txt");
//Instantiate the BufferedReader Class
BufferedReader bufferReader = new BufferedReader(inputFile);
//Variable to hold the one line data
String line;
String date="";String org ="";String student ="";
// Read file line by line and print on the console
while ((line = bufferReader.readLine()) != null) {
if(line.startsWith("1.")){
// compare string contents with isEmpty(); != only compares references
if(!date.isEmpty() || !org.isEmpty()){
writer.println(date+","+org+","+student);
student ="";
}
date = line.substring(2);
}else if(line.startsWith("2.")){
org = line.substring(2);
}else{
line = "("+line+")";
student += line+",";
}
System.out.println(line);
}
writer.println(date+","+org+","+student);
//Close the buffer reader
bufferReader.close();
}catch(Exception e){
System.out.println("Error while reading file line by line:" + e.getMessage());
}
writer.close();
}
This is the output you will get for this
10/09/2016, cycling club,(3. sam, 1000, oklahoma),( henry, 1001, california),( bill, 1002, NY),
11/15/2016, swimming club,(3. jane, 9001, georgia),( elizabeth, 9002, lousiana),
I am reading the file from csv.txt. The while loop goes through each line of the text file, and the fields are stored in the variables date, org and student. When the next date comes up, I write all of them to the output file. The last record of the CSV is written to the file after the while loop terminates.
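If you would rather keep the data in objects than in one concatenated string, the same line-by-line idea can fill a small class per block. This is a sketch under the assumption that every block follows the "1." / "2." / "3." layout from the question and that student rows never start with "1." or "2."; the Club class and parse method are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ClubFileParser {

    // One block of the file: a date, an organization and its students.
    static class Club {
        String date;
        String organization;
        final List<String[]> students = new ArrayList<>(); // {name, id, residence}
    }

    static List<Club> parse(List<String> lines) {
        List<Club> clubs = new ArrayList<>();
        Club current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("1.")) {            // a date starts a new block
                current = new Club();
                current.date = line.substring(2).trim();
                clubs.add(current);
            } else if (current == null) {
                continue;                            // ignore anything before the first date
            } else if (line.startsWith("2.")) {     // organization name
                current.organization = line.substring(2).trim();
            } else {                                 // student row, with or without "3."
                if (line.startsWith("3.")) {
                    line = line.substring(2).trim();
                }
                current.students.add(line.split("\\s*,\\s*"));
            }
        }
        return clubs;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "1. 10/09/2016", "2. cycling club",
                "3. sam, 1000, oklahoma", "henry, 1001, california", "bill, 1002, NY",
                "1. 11/15/2016", "2. swimming club",
                "3. jane, 9001, georgia", "elizabeth, 9002, lousiana");
        for (Club c : parse(lines)) {
            System.out.println(c.date + " | " + c.organization + " | " + c.students.size() + " students");
        }
    }
}
```

Once the data is in a List of Club objects, printing or modifying a single date, club name or student is just a matter of changing a field and writing the list back out.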
Try uniVocity-parsers to handle this. For parsing this sort of format, you'll find a few examples here. For writing, look here and here.
Adapting from the examples I've given, you could write:
final ObjectRowListProcessor dateProcessor = new ObjectRowListProcessor();
final ObjectRowListProcessor clubProcessor = new ObjectRowListProcessor();
final ObjectRowListProcessor memberProcessor = new ObjectRowListProcessor();
// "switch" is a reserved word in Java, so the variable needs a different name
InputValueSwitch inputSwitch = new InputValueSwitch(0){
public void rowProcessorSwitched(RowProcessor from, RowProcessor to) {
//your custom logic here
if (to == dateProcessor) {
//processing dates.
}
if (to == clubProcessor) {
//processing clubs.
}
if (to == memberProcessor){
//processing members
}
}
};
inputSwitch.addSwitchForValue("1.", dateProcessor, 1); //getting values of column 1 and sending them to `dateProcessor`
inputSwitch.addSwitchForValue("2.", clubProcessor, 1); //getting values of column 1 and sending them to `clubProcessor`
inputSwitch.addSwitchForValue("3.", memberProcessor, 1, 2, 3); //getting values of columns 1, 2, and 3 and sending them to `memberProcessor`
inputSwitch.setDefaultSwitch(memberProcessor, 1, 2, 3); //Rows with blank value at column 0 are members. Also get columns 1, 2, and 3 and send them to `memberProcessor`
CsvParserSettings settings = new CsvParserSettings(); //many options here, check the tutorial and examples
// configure the parser to use the switch
settings.setRowProcessor(inputSwitch);
//creates a parser
CsvParser parser = new CsvParser(settings);
//parse everything. Rows will be sent to the RowProcessor of each switch, depending on the value at column 0.
parser.parse(new File("/path/to/file.csv"));
Disclaimer: I'm the author of this library, it's open-source and free (Apache 2.0 license)
So basically what I need to do is:
Read a text file like this:
[Student ID], [Student Name], Asg 1, 10, Asg 2, 10, Midterm, 40, Final, 40
01234567, Timture Choi, 99.5, 97, 100.0, 99.0
02345678, Elaine Tam, 89.5, 88.5, 99.0, 100
and present it like this (with calculations of rank and average):
ID Name Asg 1 Asg 2 Midterm Final Overall Rank
01234567 Timture Choi 99.5 97.0 100.0 99.0 99.3 1
02345678 Elaine Tam 89.5 88.5 99.0 100.0 97.4 2
Average: 94.5 92.75 99.5 99.5 98.3
Using printf() function
now this is what I have done so far:
import java.io.*;
import java.util.Scanner;
class AssignmentGrades {
public static void main(String args[]) throws Exception {
Scanner filename = new Scanner(System.in);
String fn = filename.nextLine(); //scannig the file name
System.out.println("Enter your name of file : ");
FileReader fr = new FileReader(fn+".txt");
BufferedReader br = new BufferedReader (fr);
String list;
while((list = br.readLine()) !=null) {
System.out.println(list);
}
fr.close();
}
}
So I can ask the user for the name of the file, then read it and print.
Now.. I'm stuck. I think I need to probably put it in to array and split?
String firstrow = br.readLine();
String[] firstrow = firstrow.split(", ");
something like that?.. ugh ive been stuck here for more than an hour
I really need help!! I appreciate your attention!! ( I started to learn java this week)
There are two ways for splitting the input line just read from the file
Using String object's split() method which would return an array. Read more about the split here.
StringTokenizer class: this class can be used to divide the input string into separate tokens based on a set of delimiters. Here is a good tutorial to get started.
You should be able to get more examples using google :)
In case you want to parse integers from String. Check this.
Here I store the columns as an array of Strings and I store the record set as an ArrayList of String arrays. In the while loop if the column set is not initialized yet (first iteration) I initialize it with the split. Otherwise I add the split to the ArrayList. Import java.util.ArrayList.
String[] columns = null;
ArrayList<String[]> values = new ArrayList<String[]>();
String list;
while((list = br.readLine()) !=null) {
if (columns == null) { // first line: the header row
columns = list.split(", ");
} else {
values.add(list.split(", "));
}
}
fr.close();
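Once the rows are split, the scores can be parsed with Double.parseDouble() and printed in columns with printf(). The following is a rough sketch rather than a full solution: the weights 10/10/40/40 are hard-coded from the header row in the question instead of being parsed, rank and averages are left out, and the class and method names are made up.

```java
import java.util.List;

class GradeTable {

    // Weighted overall score; weights are percentages taken from the header row.
    static double overall(double[] scores, double[] weights) {
        double total = 0;
        for (int i = 0; i < scores.length; i++) {
            total += scores[i] * weights[i] / 100.0;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] weights = {10, 10, 40, 40}; // Asg 1, Asg 2, Midterm, Final (assumed fixed)
        List<String> rows = List.of(
                "01234567, Timture Choi, 99.5, 97, 100.0, 99.0",
                "02345678, Elaine Tam, 89.5, 88.5, 99.0, 100");

        // %-10s left-aligns text in 10 columns; %8.1f right-aligns a number with one decimal
        System.out.printf("%-10s%-15s%8s%8s%9s%8s%9s%n",
                "ID", "Name", "Asg 1", "Asg 2", "Midterm", "Final", "Overall");
        for (String row : rows) {
            String[] f = row.split(", ");
            double[] scores = new double[4];
            for (int i = 0; i < 4; i++) {
                scores[i] = Double.parseDouble(f[i + 2]); // columns 2..5 hold the scores
            }
            System.out.printf("%-10s%-15s%8.1f%8.1f%9.1f%8.1f%9.1f%n",
                    f[0], f[1], scores[0], scores[1], scores[2], scores[3],
                    overall(scores, weights));
        }
    }
}
```

The rank could then be found by sorting the rows on their overall score, and the averages by summing each column and dividing by the number of rows.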
I have a tsv txt file containing data in 3 columns.
It looks like:
HG sn FA
PC 2 16:0
PI 1 18:0
PS 3 20:0
PE 2 24:0
26:0
16:1
18:2
I want to read this file into a 2 dimensional array in java.
But I get an error all the time, no matter what I try.
File file = new File("table.txt");
Scanner scanner = new Scanner(file);
final int maxLines = 100;
String[][] resultArray = new String[maxLines][];
int linesCounter = 0;
while (scanner.hasNextLine() && linesCounter < maxLines) {
resultArray[linesCounter] = scanner.nextLine().split("\t");
linesCounter++;
}
System.out.print(resultArray[1][1]);
I keep getting this error
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at exercise.exercise2.main(exercise2.java:31)
Line 31 is
System.out.print(resultArray[1][1]);
I cannot find any reasons why this error keeps emerging
In your case I would use Java 7 Files.readAllLines.
Something like:
String[][] resultArray;
List<String> lines = Files.readAllLines(Paths.get("table.txt"), StandardCharsets.UTF_8);
//lines.removeAll(Arrays.asList("", null)); // <- remove empty lines
resultArray = new String[lines.size()][];
for(int i =0; i<lines.size(); i++){
resultArray[i] = lines.get(i).split("\t"); //tab-separated
}
Output:
[[HG, sn FA ], [PC, 2, 16:0], [PI, 1, 18:0], [PS, 3, 20:0], [PE, 2, 24:0], [, , 26:0], [, , 16:1], [, , 18:2]]
And this is the file (press edit and grab the content, it should be tab separated):
HG sn FA
PC 2 16:0
PI 1 18:0
PS 3 20:0
PE 2 24:0
26:0
16:1
18:2
[EDIT]
To get 16:1:
System.out.println(resultArray[6][2]);
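A possible cause of the ArrayIndexOutOfBoundsException in the question (an assumption, since the original file is not shown) is a row with fewer cells than expected, for example a line separated by spaces instead of tabs, or a blank line. The sketch below keeps trailing empty cells by passing -1 to split() and reads cells through a bounds-checked helper; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TsvTable {

    // Splits each line on tabs; the -1 limit keeps trailing empty cells.
    static String[][] parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(line.split("\t", -1));
        }
        return rows.toArray(new String[0][]);
    }

    // Returns the cell, or "" when the row or column does not exist.
    static String cell(String[][] table, int row, int col) {
        if (row < 0 || row >= table.length) {
            return "";
        }
        if (col < 0 || col >= table[row].length) {
            return "";
        }
        return table[row][col];
    }

    public static void main(String[] args) {
        String[][] table = parse(List.of(
                "HG\tsn\tFA", "PC\t2\t16:0", "\t\t26:0", "no tabs here"));
        System.out.println(cell(table, 2, 2));           // prints 26:0
        System.out.println(cell(table, 3, 1).isEmpty()); // prints true, the row has one cell
    }
}
```

Printing table[row].length for each row is a quick way to check whether the file really is tab-separated.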