Mapreduce Program(Java) using Distributed Cache

Hi, I am new to Hadoop MapReduce programming. I have the following requirement.
This is the larger file, i.e. the input file input.txt:
101 Vince 12000
102 James 33
103 Tony 32
104 John 25
105 Nataliya 19
106 Anna 20
107 Harold 29
And this is the smaller file, lookupfile.txt:
101 Vince 12000
102 James 10000
103 Tony 20000
104 John 25000
105 Nataliya 15000
Now we want the records that share a common ID number. To achieve this, the smaller file is used as the lookup file and the larger file as the input file. The complete Java code is given further below.
This is the result we should get after running the code:
102 James 33 10000
103 Tony 32 20000
104 John 25 25000
105 Nataliya 19 15000
Code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Join extends Configured implements Tool
{
    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text>
    {
        Path[] cachefiles = new Path[0]; // To store the paths of the lookup files
        List<String> exEmployees = new ArrayList<String>(); // To store the data of the lookup file

        /******************** Setup Method ********************/
        @Override
        public void setup(Context context)
        {
            Configuration conf = context.getConfiguration();
            try
            {
                cachefiles = DistributedCache.getLocalCacheFiles(conf);
                // try-with-resources closes the reader once the lookup file is loaded
                try (BufferedReader reader = new BufferedReader(new FileReader(cachefiles[0].toString())))
                {
                    String line;
                    while ((line = reader.readLine()) != null)
                    {
                        exEmployees.add(line); // Each line of the lookup file is stored in the list
                    }
                }
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }
        /******************** setup method ends ********************/

        /******************** Map Method ********************/
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
        {
            String[] line = value.toString().split("\t");
            for (String e : exEmployees)
            {
                String[] listLine = e.split("\t");
                // Emit a record only when the IDs match
                if (line[0].equals(listLine[0]))
                {
                    context.write(new Text(line[0]), new Text(line[1] + "\t" + line[2] + "\t" + listLine[2]));
                }
            }
        } // map method ends
    }

    /******************** run Method ********************/
    public int run(String[] args) throws Exception
    {
        // getConf() returns the configuration prepared by ToolRunner,
        // so generic options given on the command line are not lost
        Configuration conf = getConf();
        Job job = new Job(conf, "aggprog");
        job.setJarByClass(Join.class);
        // args[0] is the lookup (smaller) file, args[1] the input, args[2] the output directory
        DistributedCache.addCacheFile(new Path(args[0]).toUri(), job.getConfiguration());
        FileInputFormat.addInputPath(job, new Path(args[1]));
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        job.setMapperClass(JoinMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        return (job.waitForCompletion(true) ? 0 : 1);
    }

    public static void main(String[] args) throws Exception
    {
        int ecode = ToolRunner.run(new Join(), args);
        System.exit(ecode);
    }
}
Execution Command :
case1:
hadoop jar '/home/cloudera/Desktop/DistributedCache.jar' Join My_Job/MultiInput_1/Input/input.txt My_Job/MultiInput_1/Input/smallerinput.txt My_Job/MultiInput_1/My_Output
case2:
hadoop jar '/home/cloudera/Desktop/DistributedCache.jar' Join My_Job/MultiInput_1/Input/input.txt My_Job/MultiInput_1/Input/smallerinput.txt My_Job/MultiInput_1/My_Output
I have tried the two commands above, but neither works. I don't know what the problem is or where it lies, and I am unable to run the code.
Finally I tried the command below, and it worked:
hadoop jar '/home/cloudera/Desktop/DistributedCache.jar' Join hdfs/Input/smallerfile.txt hdfs/Input/input.txt My_Job/MultiInput_1/MyOutput
I found my mistake: I was passing the large file where the small file was expected. When I tried it the other way round it ran, but the output was not as expected.
Expected output is:
101 Vince 12000
102 James 33 10000
103 Tony 32 20000
104 John 25 25000
105 Nataliya 19 15000
106 Anna 20
107 Harold 29
But my output is:
101 Vince 12000
102 James 33 10000
103 Tony 32 20000
104 John 25 25000
105 Nataliya 19 15000
106 Anna 20
107 Harold 29
Can somebody help me?

Yes, user3475485. Your files need to be in HDFS for this code to run. Alternatively, because you are running the job through ToolRunner (which uses GenericOptionsParser), you can pass the files in this format:
hadoop jar jarname.jar drivername -files file1,file2
That should work for you.
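As a side note, the mapper in the question scans the whole lookup list for every input record. Below is a minimal sketch of the same map-side lookup using a hash map keyed by ID, written in plain Java without Hadoop so it can be tried locally. The class name LookupJoin, the method names, and the choice to pass unmatched records through unchanged are assumptions made for illustration; they are not part of the original job.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupJoin {

    // Build an ID -> third-column map from the lines of the (small) lookup file.
    static Map<String, String> buildLookup(List<String> lookupLines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : lookupLines) {
            String[] fields = line.split("\t");
            if (fields.length >= 3) {
                lookup.put(fields[0], fields[2]);
            }
        }
        return lookup;
    }

    // Join one input record against the lookup.
    // Assumption: unmatched records are passed through unchanged.
    static String join(String inputLine, Map<String, String> lookup) {
        String id = inputLine.split("\t")[0];
        String extra = lookup.get(id);
        return extra == null ? inputLine : inputLine + "\t" + extra;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = buildLookup(
                Arrays.asList("102\tJames\t10000", "103\tTony\t20000"));
        for (String line : Arrays.asList("102\tJames\t33", "106\tAnna\t20")) {
            System.out.println(join(line, lookup));
        }
    }
}
```

In a Hadoop job, something like buildLookup would run once in setup() and something like join once per map() call.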

Related

How to get only characters from a file in java [duplicate]

I have a text file. This is the file:
Team P W L D F A Pts
1. Arsenal 38 26 9 3 79 - 36 87
2. Liverpool 38 24 8 6 67 - 30 80
3. Manchester_U 38 24 5 9 87 - 45 77
4. Newcastle 38 21 8 9 74 - 52 71
5. Leeds 38 18 12 8 53 - 37 66
6. Chelsea 38 17 13 8 66 - 38 64
7. West_Ham 38 15 8 15 48 - 57 53
8. Aston_Villa 38 12 14 12 46 - 47 50
9. Tottenham 38 14 8 16 49 - 53 50
How can I get only the team names? I tried to use a regex in the following way, but it doesn't work:
FileReader f;
f = new FileReader("file.txt");
BufferedReader b;
b = new BufferedReader(f);
String s = b.readLine();
String[] name = s.split("\\w+");
for (int i = 0; i < name.length; i++)
    System.out.println(name[i]);
How do I solve this? Thanks to everyone in advance!

FileReader f;
f = new FileReader("file.txt");
BufferedReader b;
b = new BufferedReader(f);
String s;
// The assignment needs its own parentheses; otherwise != binds first
while ((s = b.readLine()) != null) {
    Matcher name = Pattern.compile("(?<=\\d\\.\\s)\\S+").matcher(s);
    if (name.find())
        System.out.println(name.group());
}
Here the regex (?<=\d\.\s)\S+ matches only the name that follows the serial number.

If you want to read line by line and your file has the structure you presented, this code will get you the club names.
File f = new File("file.txt");
Scanner sc = new Scanner(f);
sc.nextLine(); // skip the header row
while (sc.hasNextLine()) {
    String[] name = sc.nextLine().split("\\s+");
    System.out.println(name[1]);
}

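Try replaceAll: match every character that is not a letter or an underscore ([^a-zA-Z_]) and replace it with the empty string. What remains is the team name. Strings are immutable, so assign the result back.
s = b.readLine();
s = s.replaceAll("[^a-zA-Z_]+", "");
System.out.println(s);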

Your string s is one line:
1. Arsenal 38 26 9 3 79 - 36 87
All you need to do is split by space and get second entry:
s.split(" ")[1]
RegEx is overkill here. Do this for each line and add the name to a list at each step.
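A minimal sketch of that approach, assuming the first line is the header row and the team name is always the second whitespace-separated token. The class name TeamNames and the method extract are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TeamNames {

    // Returns the second whitespace-separated token of every row after the header.
    static List<String> extract(List<String> lines) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) { // index 0 is the header row
            String[] tokens = lines.get(i).trim().split("\\s+");
            if (tokens.length > 1) {
                names.add(tokens[1]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Team P W L D F A Pts",
                "1. Arsenal 38 26 9 3 79 - 36 87",
                "2. Liverpool 38 24 8 6 67 - 30 80");
        System.out.println(extract(lines)); // prints [Arsenal, Liverpool]
    }
}
```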

for loop iteration cant find whats making i jump values

FILE THATS BEING READ
Rob Gronkowski 48
Zach Ertz 34
Travis Kelce 29
Evan Engram 15
Jimmy Graham 12
Cameron Brate 10
Delanie Walker 9
Kyle Rudolph 6
Austin Seferian-Jenkins 6
Jack Doyle 6
Hunter Henry 5
Jason Witten 4
Jordan Reed 4
Vernon Davis 3
Jared Cook 3
Tyler Kroft 3
Ed Dickson 3
Charles Clay 3
George Kittle 3
Antonio Brown 67
DeAndre Hopkins 62
A.J. Green 62
Mike Evans 62
Julio Jones 56
Michael Thomas 55
Dez Bryant 53
Michael Crabtree 45
Brandin Cooks 42
Tyreek Hill 42
Doug Baldwin 42
Keenan Allen 32
Jarvis Landry 29
Will Fuller 29
Amari Cooper 29
Stefon Diggs 29
Alshon Jeffery 27
Nelson Agholor 24
Adam Thielen 24
Chris Hogan 24
Golden Tate 24
Demaryius Thomas 22
Jordy Nelson 22
Larry Fitzgerald 22
DeSean Jackson 21
JuJu Smith-Schuster 19
Devante Parker 18
Devin Funchess 18
Kelvin Benjamin 18
T.Y. Hilton 17
Emmanuel Sanders 17
Marvin Jones 15
Rishard Matthews 14
Pierre Garcon 14
Cooper Kupp 14
Sterling Shepard 14
Paul Richardson 11
Danny Amendola 10Le’Veon Bell 70
Kareem Hunt 63
Todd Gurley 63
Leonard Fournette 60
Melvin Gordon 60
LeSean McCoy 60
Mark Ingram 50
Devonta Freeman 50
Jordan Howard 50
Lamar Miller 41
Doug Martin 34
Carlos Hyde 34
Aaron Jones 27
Alvin Kamara 27
Jerick McKinnon 24
DeMarco Murray 21
Chris Thompson 21
Jay Ajayi 21
Joe Mixon 18
C.J. Anderson 17
Tevin Coleman 17
Christian McCaffrey 17
Derrick Henry 16
Alex Collins 16
Dion Lewis 15
Adrian Peterson 13
Duke Johnson 12
Marshawn Lynch 11
Ameer Abdullah 10
Bilal Powell 9
LeGarrette Blount 9
Marlon Mack 9
James White 8
Ezekiel Elliott 7
Latavius Murray 7
Frank Gore 7
Isaiah Crowell 7
Orleans Darkwa 7
Kenyan Drake 5
Matt Forte 5
Darren McFadden 5
Alfred Morris 5
Damien Williams 3
Tarik Cohen 3
Jonathan Stewart 3
Robert Kelley 3
Danny Woodhead 3
Ty Montgomery 2
Javorius Allen 2
Mike Gillislee 2
Thomas Rawls 2
Theo Riddick 2
DeAndre Washington 2
Eddie Lacy 2
Giovani Bernard 2
Andre Ellington 2
Austin Ekeler 2
Jalen Richard 2
Ted Ginn 10
Robby Anderson 10
Jermaine Kearse 9
Davante Adams 9
Kenny Stills 9
Sammy Watkins 9
Marqise Lee 5
Mohamed Sanu 5
Allen Hurns 5
Josh Doctson 5
Jamison Crowder 4
Jeremy Maclin 3
Randall Cobb 3
Tyrell Williams 3
Robert Woods 3
Corey Davis 3
Jordan Matthews 3
Tyler Lockett 3
John Brown 2
Willie Snead 2
Donte Moncrief 2
Deshaun Watson 31
Dak Prescott 26
Tom Brady 24
Russell Wilson 22
Drew Brees 22
Carson Wentz 20
Alex Smith 14
Kirk Cousins 13
Matthew Stafford 11
Marcus Mariota 11
Tyrod Taylor 11
Cam Newton 11
Matt Ryan 11
Philip Rivers 8
I've been having some problems and have been looking all over for answers. I found out that my for loop iteration is incorrect: it prints the series 0, 1, 2, 10, etc. I was wondering if someone could point out my flaw so I can fix it. I appreciate anyone reading this, and apologize for the length of the code; I just wanted to include everything so I don't miss anything. The for loop is at line 87. Thanks again. Sincerely, a Java noob.
CODE
package trades;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.regex.*;
public class Fantasy {
public static void main(String[] args) {
int[] playerRanking = new int[75];
String infoComingIn = null;
//Finding file path
String filename = "C:\\Users\\Karanvir\\Desktop\\21days\\players.txt";
File filez = new File(filename);
BufferedReader br;
String[] playerNames = new String[75];
int counterOfReadLines = 0;
Pattern p = Pattern.compile("[0-9]{2,3}");
ArrayList<Integer> arrayList = new ArrayList<Integer>();
try {
br = new BufferedReader(new FileReader(filez));
playerNames[counterOfReadLines] = br.readLine();
while (br.readLine() != null) {
counterOfReadLines = counterOfReadLines + 1;
playerNames[counterOfReadLines] = br.readLine();
System.out.println(playerNames[counterOfReadLines - 1]);
}
br.close();
for (int i = 0; i < playerNames.length; i++) {
Matcher m = p.matcher(playerNames[i]);
if (m.find()) {
String matched = m.group(0);
int addToArray = Integer.parseInt(matched);
playerRanking[i] = addToArray;
System.out.println(i);
}
}
} catch (Exception e) {}
}
}

Okay, from the post I can point out only one issue. You increment the counterOfReadLines variable before the line
playerNames[counterOfReadLines] = br.readLine();
so inside the while loop playerNames is filled starting at index 1, not 0, whereas the loop below:
for (int i = 0; i < playerNames.length; i++) {
    Matcher m = p.matcher(playerNames[i]);
    if (m.find()) {
        String matched = m.group(0);
        int addToArray = Integer.parseInt(matched);
        playerRanking[i] = addToArray;
        System.out.println(i);
    }
}
starts at 0. So either start it from i = 1, or increment counterOfReadLines after the line
playerNames[counterOfReadLines] = br.readLine();
and your error should go away. If not, let me know.
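Two other things in the posted code may also explain the skipped indices. First, br.readLine() is called twice per pass (once in the while condition and once in the body), so every other line is thrown away. Second, the pattern [0-9]{2,3} requires at least two digits, so single-digit values never match. Here is a sketch that reads each line exactly once and matches a trailing number of any length; the class name RankingParser and the method parseRankings are hypothetical, and the input is taken from a string only to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RankingParser {

    // One or more digits at the end of the line, e.g. "Rob Gronkowski 48" -> 48.
    private static final Pattern TRAILING_NUMBER = Pattern.compile("(\\d+)\\s*$");

    static List<Integer> parseRankings(String text) {
        List<Integer> rankings = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // Call readLine() once per pass and keep the result in a variable.
            while ((line = reader.readLine()) != null) {
                Matcher m = TRAILING_NUMBER.matcher(line);
                if (m.find()) {
                    rankings.add(Integer.parseInt(m.group(1)));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rankings;
    }

    public static void main(String[] args) {
        String sample = "Rob Gronkowski 48\nZach Ertz 34\nDelanie Walker 9\n";
        System.out.println(parseRankings(sample)); // prints [48, 34, 9]
    }
}
```

Using a list instead of a fixed 75-element array also avoids passing null entries to the matcher when the file has fewer lines than the array has slots.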

Loop issue on reading from file

I am having trouble with my loop. If anyone could take a look and try to find where I'm going wrong, it would be awesome. I am reading from two different files and I want my code to loop through the entire files. So far it only loops through the first 11 lines of the file.
package lab.pkg02;
import java.util.Scanner;
import java.io.*;
public class Lab02 {
public static void main(String[] args) throws IOException {
File usageFile;
File historyFile;
PrintWriter resultsFile;
PrintWriter newHistoryFile;
Scanner usageSC,historySC;
String vin,make,model;
int year, beginingOdo, endingOdo, currentGallons, currentGas,
currentRepair, mpg, costPerMile, totalGas, totalRepair,
currentMiles;
//Display Report Heading to Report File
resultsFile = new PrintWriter("reportfile.txt");
resultsFile.printf("%-5s%10s%15s%12s%13s%16s%5s%16s%17s%20s\n", "VIN",
"Vehicle Description", "Beginning Odo",
"Ending Odo", "Current Gas","Current Repair", "MPG",
"Cost Per Mile", "Historical Gas", "Historical Repair");
//Process Each Vehicle
for(int cnt = 0; cnt < 15; cnt++) {
//Get Vehicle Information from Usage File
usageFile = new File("usage.txt");
usageSC = new Scanner(usageFile);
vin = usageSC.nextLine( );
year = usageSC.nextInt( );
usageSC.nextLine();
make = usageSC.nextLine( );
model = usageSC.nextLine( );
beginingOdo = usageSC.nextInt( );
usageSC.nextLine();
endingOdo = usageSC.nextInt( );
usageSC.nextLine();
currentGallons = usageSC.nextInt( );
usageSC.nextLine();
currentGas = usageSC.nextInt( );
usageSC.nextLine();
currentRepair = usageSC.nextInt( );
usageSC.nextLine();
mpg = usageSC.nextInt( );
usageSC.nextLine();
costPerMile = usageSC.nextInt( );
usageSC.close( );
//Get Vehicle History from History File
historyFile = new File ("historyfile.txt");
historySC = new Scanner(historyFile);
vin = historySC.nextLine( );
totalGas = historySC.nextInt( );
historySC.nextLine();
totalRepair = historySC.nextInt( );
historySC.nextLine();
historySC.close( );
//Calculate Updated Vehicle Information
currentMiles = endingOdo - beginingOdo;
mpg = currentMiles / currentGallons;
costPerMile = (currentGas + currentRepair) / currentMiles;
totalGas = totalGas + currentGas;
totalRepair = totalRepair + currentRepair;
//Store Updated Vehicle Information to New History File
newHistoryFile = new PrintWriter("newhistoryfile.txt");
newHistoryFile.println(vin);
newHistoryFile.println(totalGas);
newHistoryFile.println(totalRepair);
newHistoryFile.close( );
//Display Vehicle Summary Line to Report File
resultsFile.printf("%-5s%10s%15s%12s%13s%16s%5s%16s%17s%20s\n", vin,
year,make,model, beginingOdo,endingOdo, currentGas,currentRepair, mpg
,costPerMile, totalGas, totalRepair);
resultsFile.close( );
}
}
}
Both files are posted below. I'm sure the loop issue is not caused by the files but is due to an error in the code.
****Usage File*****
1FTSW2BR8AEA51037
2017
Ford
Fiesta
12345
123456
200
2500
50
40
100
4S7AU2F966C091212
2016
Ford
Focus
2356
23567
80
150
10
30
101
1FTEX1EM9EFD29979
2015
Ford
Mustang
23
235
86
100
30
29
102
1XPVD09X5AD163651
2015
Ford
Escape
15000
235679
800
350
750
28
103
2G1WF5EK0B1163554
2014
Ford
Explorer
7854
12498
736
259
123
27
104
1GDP8C1Y7GV522436
2013
Audi
A6
5269
54697
456
2464
61431
26
104
1FMCU92709KC54353
2012
Audi
A8
123
3456
52
86
10
25
106
1GDHK44K59F125839
2011
Audi
TT
5689
46546
14
89
15
24
107
3GYFNBE38ES603704
2010
Audi
Q5
54875
646656
69
84
1000
23
108
SAJPX1148VC828077
2009
Audi
R8
1201
1209
213
1321
11000
25
109
JS2RF9A72C6152147
2008
Audi
A7
2589
36644
874
1511
110
41
111
JT2SK13E4S0334527
BMW
2007
i8
652
3664
856
151
11
26
110
1GTHC34K6KE580545
BMW
2006
X6
65
324
231
1636
11136
19
112
1FDNS24L0XHA16500
BMW
2005
X1
546
64654
2654
16354
112
21
113
2C3AA53G55H689466
BMW
2004
M4
1233
6464
264
1354
12
32
114
*****historyfile*******
1FTSW2BR8AEA51037
4500
150
4S7AU2F966C091212
2150
1000
1FTEX1EM9EFD29979
10000
15000
1XPVD09X5AD163651
3500
7500
2G1WF5EK0B1163554
2590
1230
1GDP8C1Y7GV522436
24640
614310
1FMCU92709KC54353
860
100
1GDHK44K59F125839
8909
150
3GYFNBE38ES603704
8408
10000
SAJPX1148VC828077
132107
110000
JS2RF9A72C6152147
151106
1100
JT2SK13E4S0334527
15105
110
1GTHC34K6KE580545
163604
111360
1FDNS24L0XHA16500
1635403
1120
2C3AA53G55H689466
135402
1201

From what I can see, you are re-initializing
usageFile = new File("usage.txt");
usageSC = new Scanner(usageFile);
historyFile = new File("historyfile.txt");
historySC = new Scanner(historyFile);
newHistoryFile = new PrintWriter("newhistoryfile.txt");
on every pass of the loop, which runs 15 times, and you also close the scanners on every pass.
Move those statements outside the loop and it will work. Also change nextLine() to next() to read the next strings from the usage file.
Your usage file has empty lines after the 11th VIN.
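A rough sketch of that idea for the history file: the Scanner is created once by the caller, and the loop runs until the input is exhausted instead of a fixed 15 times. The class name HistoryReader and the method readAll are invented for this example, and the Scanner reads from a string here only so the snippet runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class HistoryReader {

    // Reads (vin, totalGas, totalRepair) triples until the input is exhausted.
    static List<String> readAll(Scanner in) {
        List<String> records = new ArrayList<>();
        // The Scanner is opened once outside this loop, so each pass continues
        // where the previous one stopped instead of restarting at the top.
        while (in.hasNext()) {
            String vin = in.next();
            int totalGas = in.nextInt();
            int totalRepair = in.nextInt();
            records.add(vin + " " + totalGas + " " + totalRepair);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "1FTSW2BR8AEA51037\n4500\n150\n4S7AU2F966C091212\n2150\n1000\n";
        try (Scanner in = new Scanner(sample)) {
            for (String record : readAll(in)) {
                System.out.println(record);
            }
        }
    }
}
```

In the original program the same shape would apply with new Scanner(new File("historyfile.txt")) created before the loop and closed after it. The two PrintWriter objects would likewise be opened once before the loop and closed once after it.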

HashMap keeps returning null value for no obvious reason

I have a HashMap and an ArrayList. Both of them are populated; I have printed them out and they look fine. The ArrayList contains MeterNumbers (MeterNumber is the key type of the HashMap). The map has MeterNumbers as keys and Strings as values.
What I want to do is get the String value from the HashMap for a MeterNumber key taken from the ArrayList. I don't think I need to check whether the key exists, because I know for sure that it does. I have tried everything I can think of to get the value, but it keeps giving me null. Here is my code.
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
public class Try {
static Map <MeterNumber, String> map2 = new HashMap <MeterNumber, String>();
static ArrayList<MeterNumber> blackOutMeters = new ArrayList<MeterNumber>();
public static void main (String args[]) {
try {
Scanner sc2 = new Scanner(new java.io.File("meters.txt"));
Scanner sc3 = new Scanner(new java.io.File("outages.txt"));
while (sc2.hasNextLine()) {
String transformerId;
MeterNumber meterId;
String line = sc2.nextLine();
String[] array = line.split(" ");
if (array.length>3){
transformerId = array[3];
meterId = MeterNumber.fromString(array [0] + array [1] + array [2]);
map2.put(meterId, transformerId);
}
}
// System.out.println (map2.values());
while (sc3.hasNextLine()) {
MeterNumber meterId;
String line = sc3.nextLine();
String[] array = line.split(" ");
if (array.length>2){
meterId = MeterNumber.fromString(array [0] + array [1] + array [2]);
blackOutMeters.add(meterId);
}
}
for (int i = 0; i <blackOutMeters.size(); i++){
String s = map2.get(blackOutMeters.get(i));
System.out.println (s);
}
}
catch (FileNotFoundException e) {
e.printStackTrace();
}
}}
file format for meters.txt is:
900 791 330 T1
379 165 846 T1
791 995 073 T1
342 138 557 T1
114 125 972 T1
970 324 636 T1
133 997 798 T1
308 684 630 T1
169 329 493 T1
540 085 209 T1
265 229 117 T1
970 173 664 T1
264 943 573 T1
462 043 136 T1
087 307 071 T1
001 343 243 T1
file format for outages.txt is:
900 791 330
379 165 846
791 995 073
342 138 557
114 125 972
970 324 636
133 997 798
Thank you in advance.

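You need to implement hashCode and equals for MeterNumber.
Otherwise HashMap falls back to the defaults inherited from Object, which compare by identity, so two separate MeterNumber objects built from the same digits are treated as different keys.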
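The MeterNumber class is not shown in the question, so the following is only a sketch of what such a class could look like. The single digits field is an assumed internal representation; the point is the equals/hashCode pair.

```java
import java.util.HashMap;
import java.util.Map;

public final class MeterNumber {

    private final String digits; // assumed internal representation

    private MeterNumber(String digits) {
        this.digits = digits;
    }

    public static MeterNumber fromString(String s) {
        return new MeterNumber(s);
    }

    // Two meter numbers are equal when their digits are equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof MeterNumber)) {
            return false;
        }
        return digits.equals(((MeterNumber) other).digits);
    }

    // Must be consistent with equals: equal objects give equal hash codes.
    @Override
    public int hashCode() {
        return digits.hashCode();
    }

    public static void main(String[] args) {
        Map<MeterNumber, String> map = new HashMap<>();
        map.put(MeterNumber.fromString("900791330"), "T1");
        // A second, distinct instance with the same digits now finds the entry.
        System.out.println(map.get(MeterNumber.fromString("900791330"))); // prints T1
    }
}
```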

Comparing lines in a file

I am trying to compare File 1 and File 2.
File 1:
7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1
7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 1
7.3 0.25 0.39 6.4 0.034 8 84 0.9942 3.18 0.46 11.5 5 1
6.9 0.38 0.25 9.8 0.04 28 191 0.9971 3.28 0.61 9.2 5 1
5.1 0.11 0.32 1.6 0.028 12 90 0.99008 3.57 0.52 12.2 6 1
File 2:
5.1 0.11 0.32 1.6 0.028 12 90 0.99008 3.57 0.52 12.2 6 -1
7.3 0.25 0.39 6.4 0.034 8 84 0.9942 3.18 0.46 11.5 5 1
6.9 0.38 0.25 9.8 0.04 28 191 0.9971 3.28 0.61 9.2 5 -1
7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 -1
7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1
In both files the last element in each line is class label.
I am comparing if the class labels are equal.
ie compare the classlabel of
line1:7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1
with
line2:7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1
Matches.
compare
line1:7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 1
with
line2:7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 -1
Not matches
Updated
What I did is
String line1;
String line2;
int notequalcnt = 0;
while((line1 = bfpart.readLine())!=null){
found = false;
while((line2 = bfin.readLine())!=null){
if(line1.equals(line2)){
found = true;
break;
}
else{
System.out.println("not equal");
notequalcnt++;
}
}
}
But every line is reported as not equal.
Am I doing anything wrong?

After the first pass of the outer loop, the inner reader has already reached the end of file 2, so line2 is null and the inner loop never runs again. Open the reader for file 2 inside the outer loop instead. Use this code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CompareFile {
    public static void main(String args[]) throws IOException {
        String line1;
        String line2;
        boolean found;
        int notequalcnt = 0;
        BufferedReader bfpart = new BufferedReader(new FileReader("file1.txt"));
        while ((line1 = bfpart.readLine()) != null) {
            found = false;
            // Re-open file 2 for every line of file 1 so the scan starts at the top
            BufferedReader bfin = new BufferedReader(new FileReader("file2.txt"));
            while ((line2 = bfin.readLine()) != null) {
                System.out.println("line1" + line1);
                System.out.println("line2" + line2);
                if (line1.equals(line2)) {
                    System.out.println("equal");
                    found = true;
                    break;
                }
                else {
                    System.out.println("not equal");
                }
            }
            bfin.close();
            // Count a line of file 1 only if no line of file 2 matched it
            if (found == false)
                notequalcnt++;
        }
        bfpart.close();
    }
}

You're comparing every line from file 1 with every line from file 2, and you are printing "not equal" every time any one of them doesn't match.
If file 2 has 6 lines, and you are looking for a given line from file 1 (say it's also in file 2), then 5 of the lines from file 2 won't match, and "not equal" will be output 5 times.
Your current implementation says "if any lines in file 2 don't match, it's not a match", but what you really mean is "if any lines in file 2 do match, it is a match". So your logic (pseudocode) should be more like this:
for each line in file 1 {
    found = false
    reset file 2 to beginning
    for each line in file 2
        if line 1 equals line 2
            found = true, break.
    if found
        "found!"
    else
        "not found!"
}
Also you describe this as comparing "nth line of file 1 with nth line of file 2", but that's not actually what your implementation does. Your implementation is actually comparing the first line of file 1 with every line of file 2 then stopping, because you've already consumed every line of file 2 in that inner loop.
Your code has a lot of problems, and you probably need to sit back and work out your logic on paper first.
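If the actual goal is the one described in the question (pair up lines that have the same features and check whether their class labels agree), one possible sketch is to index file 2 by its features and look each line of file 1 up in that index. This assumes the label is the last space-separated token and that no feature row occurs twice in file 2; LabelCompare, splitLabel and countMismatches are names invented for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelCompare {

    // Splits "f1 f2 ... fn label" into {"f1 f2 ... fn", "label"}.
    static String[] splitLabel(String line) {
        String trimmed = line.trim();
        int cut = trimmed.lastIndexOf(' ');
        if (cut < 0) {
            return new String[] { trimmed, "" }; // no label present
        }
        return new String[] { trimmed.substring(0, cut), trimmed.substring(cut + 1) };
    }

    // Counts lines of file 1 whose features occur in file 2 with a different label.
    static int countMismatches(List<String> file1, List<String> file2) {
        Map<String, String> labelByFeatures = new HashMap<>();
        for (String line : file2) {
            String[] parts = splitLabel(line);
            labelByFeatures.put(parts[0], parts[1]);
        }
        int mismatches = 0;
        for (String line : file1) {
            String[] parts = splitLabel(line);
            String other = labelByFeatures.get(parts[0]);
            if (other != null && !other.equals(parts[1])) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList(
                "7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1",
                "7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 1");
        List<String> file2 = Arrays.asList(
                "7.4 0.33 0.26 15.6 0.049 67 210 0.99907 3.06 0.68 9.5 5 -1",
                "7.3 0.28 0.36 12.7 0.04 38 140 0.998 3.3 0.79 9.6 6 1");
        System.out.println(countMismatches(file1, file2)); // prints 1
    }
}
```

This reads each file once instead of re-reading file 2 for every line of file 1.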

If the goal is to compare the files and find the matching lines, read the contents of each file into an ArrayList and compare the values. Note that hasNextLine()/nextLine() read whole lines, whereas hasNext()/next() would read single tokens.
Scanner s = new Scanner(new File("file1.txt"));
ArrayList<String> file1_list = new ArrayList<String>();
while (s.hasNextLine()) {
    file1_list.add(s.nextLine());
}
s.close();
s = new Scanner(new File("file2.txt"));
ArrayList<String> file2_list = new ArrayList<String>();
while (s.hasNextLine()) {
    file2_list.add(s.nextLine());
}
s.close();
for (String line1 : file1_list) {
    if (file2_list.contains(line1)) {
        // found the line
    } else {
        // did not find the line
    }
}

Check Apache Commons IO FileUtils to compare files.
http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html
