Recently I have been working on an exercise that uses the Java 7 fork/join framework and FileChannel to copy a file. Here is my code (Test.java):
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;
public class Test {
private ArrayList<FileProcessor> processors = new ArrayList<FileProcessor>();
public Test(){
String outputDir = "C:\\temp";
if (!Files.isDirectory(Paths.get(outputDir))) {
System.out.println("this is not a path");
} else {
try {
//start copying file
ForkJoinPool pool = new ForkJoinPool();
int numberOfThread = 2;
File file = new File("C:\\abc.cdm");
long length = file.length();
long lengthPerCopy = (long)(length/numberOfThread);
long position = 0L;
for (int i = 0; i < numberOfThread; i++) {
FileProcessor processor = null;
if (i == numberOfThread - 1) {
//the last thread
processor = new FileProcessor("abc.cdm", "C:\\abc.cdm", "C:\\temp", position, length - position);
} else {
processor = new FileProcessor("abc.cdm", "C:\\abc.cdm", "C:\\temp", position, lengthPerCopy);
position = position + lengthPerCopy + 1;
}
processors.add(processor);
pool.execute(processor);
}
do {
System.out.printf("******************************************\n");
System.out.printf("Main: Parallelism: %d\n", pool.getParallelism());
System.out.printf("Main: Active Threads: %d\n", pool.getActiveThreadCount());
System.out.printf("Main: Task Count: %d\n", pool.getQueuedTaskCount());
System.out.printf("Main: Steal Count: %d\n", pool.getStealCount());
System.out.printf("******************************************\n");
try
{
TimeUnit.SECONDS.sleep(1);
} catch (InterruptedException e)
{
e.printStackTrace();
}
} while (!isDone()); //keep polling until all the tasks are done
pool.shutdown();
System.out.println("copy done");
} catch (Exception ex) {
//log the error here...
}
}
}
private boolean isDone(){
boolean res = false;
for (int i = 0; i < processors.size(); i++) {
res = res || processors.get(i).isDone();
}
return res;
}
public static void main(String args[]) {
Test test = new Test();
}
class FileProcessor extends RecursiveTask<Integer>
{
private static final long serialVersionUID = 1L;
private long copyPosition;
private long copyCount;
FileChannel source = null;
FileChannel destination = null;
//Implement the constructor of the class to initialize its attributes
public FileProcessor(String fileName, String filePath, String outputPath, long position, long count) throws FileNotFoundException, IOException{
this.copyPosition = position;
this.copyCount = count;
this.source = new FileInputStream(new File(filePath)).getChannel().position(copyPosition);
this.destination = new FileOutputStream(new File(outputPath + "/" + fileName), true).getChannel().position(copyPosition);
}
@Override
protected Integer compute()
{
try {
this.copyFile();
} catch (IOException ex) {
Logger.getLogger(FileProcessor.class.getName()).log(Level.SEVERE, null, ex);
}
return new Integer(0);
}
private void copyFile() throws IOException {
try {
destination.transferFrom(source, copyPosition, copyCount);
}
finally {
if (source != null) {
source.close();
}
if (destination != null) {
destination.close();
}
}
}
}
}
I run my code, and when the number of threads is 1, the file is copied exactly. But when the number of threads is 2, the source file C:\abc.cdm is 77 KB (78,335 bytes), while the copy C:\temp\abc.cdm ends up at only 39 KB.
Where did I go wrong?
Update: my problem has been solved.
The problem was in the isDone method; it must be:
boolean res = true;
for (int i = 0; i < processors.size(); i++) {
res = res && processors.get(i).isDone();
}
return res;
I also edited the following lines of code:
File file = new File(selectedFile[i].getPath());
long length = file.length();
new RandomAccessFile("C:\\temp\\abc.cdm", "rw").setLength(length);
This is just a practice for FORK/JOIN usage!
Your isDone() method was indeed wrong, and you corrected it in your update. But there is another issue, in FileProcessor: you assume that setting the position on the destination channel past the end of the file will automatically grow the file when you transfer to it. This is not the case.
Your first segment is always written, because its write position is 0 and a file's length can never be less than that. That was the 39 KB you saw (78,335 / 2 ≈ 39,167 bytes), roughly half of the total file size. The second segment never got written.
In order to get your code to run, you can do the following at the start:
File file = new File("C:\\abc.cdm");
long length = file.length();
new RandomAccessFile("C:\\temp\\abc.cdm", "rw").setLength(length);
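For completeness, a minimal sketch of that preparation step with the RandomAccessFile closed properly (paths taken from the question; the class wrapper is mine):
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
public class PrepareDestination {
public static void main(String[] args) throws IOException {
File source = new File("C:\\abc.cdm");
long length = source.length();
// Pre-size the destination: transferFrom() will not grow the file,
// so every segment's write position must already exist.
try (RandomAccessFile dest = new RandomAccessFile("C:\\temp\\abc.cdm", "rw")) {
dest.setLength(length);
}
}
}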
Lately I've been experimenting with Java and Commons IO, trying to create a web file downloader, but I've run into a problem: the download speed never seems to exceed 1 MB/s, while the same download from the browser runs smoothly at 3 MB/s. Could you help me? I would be really grateful.
This is my downloader code:
package com.application.steammachine;
import com.github.junrar.Archive;
import com.github.junrar.Junrar;
import com.github.junrar.exception.RarException;
import com.github.junrar.rarfile.FileHeader;
import com.github.junrar.volume.FileVolumeManager;
import javafx.beans.property.SimpleStringProperty;
import javafx.concurrent.Task;
import org.apache.commons.io.IOUtils;
import org.ini4j.Wini;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.*;
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.net.URL;
public class Downloader extends Task<Void> {
private URL url;
private String fileName;
private Game game;
public Downloader(URL url, String fileName, Game game) {
this.url = url;
this.fileName = fileName;
this.game = game;
}
public class ProgressListener implements ActionListener {
private double bytes = 0;
private double mbDownloaded = 0;
private double fileSize = 0;
private double lastMB = 0;
private long initialTime;
private double speed = 0;
private String downloadedText = "";
private String sizeText;
public ProgressListener(double fileSize){
this.fileSize = fileSize;
initialTime = System.nanoTime();
}
@Override
public void actionPerformed(ActionEvent e) {
bytes = ((DownloadCountingOutputStream) e.getSource()).getByteCount();
updateProgress(bytes, fileSize);
mbDownloaded = round(bytes/1e+6, 2);
if(fileSize >= 1073741824){ //>= 1GB
double temp = ((fileSize/1e+6)/1024);
sizeText = round(temp,2) + " GB";
}else {
double temp = (fileSize/1e+6);
sizeText = round(temp,2) + " MB";
}
if(mbDownloaded >= 1024){
downloadedText = String.valueOf(round(mbDownloaded/1024,2));
}else{
downloadedText = String.valueOf(mbDownloaded);
}
if((System.nanoTime() - initialTime) >= (Math.pow(10, 9))){
speed = round((mbDownloaded - lastMB), 3);
initialTime = System.nanoTime();
lastMB = mbDownloaded;
}
updateMessage(String.valueOf(speed)+"MB/s,"+String.valueOf(downloadedText + "/" + sizeText));
}
}
@Override
protected Void call() throws Exception {
URL dl = this.url;
File fl = null;
String x = null;
OutputStream os = null;
InputStream is = null;
try {
updateMessage("Searching files...,---/---");
fl = new File(Settings.getInstallPath() +"/"+ this.fileName);
os = new FileOutputStream(fl);
is = dl.openStream();
DownloadCountingOutputStream dcount = new DownloadCountingOutputStream(os);
double fileSize = Double.valueOf(dl.openConnection().getHeaderField("Content-Length"));
ProgressListener progressListener = new ProgressListener(fileSize);
dcount.setListener(progressListener);
IOUtils.copy(is, dcount, 512000 );
updateMessage("Concluding...,Almost finished");
} catch (Exception e) {
System.out.println(e);
IOUtils.closeQuietly(os);
IOUtils.closeQuietly(is);
updateMessage(",");
this.cancel(true);
return null;
} finally {
IOUtils.closeQuietly(os);
IOUtils.closeQuietly(is);
updateMessage(",");
updateProgress(0, 0);
this.cancel(true);
return null;
}
}
protected static double round(double value, int places) {
if (places < 0) throw new IllegalArgumentException();
BigDecimal bd = new BigDecimal(Double.toString(value));
bd = bd.setScale(places, RoundingMode.HALF_UP);
return bd.doubleValue();
}
}
This is the DownloadCountingOutputStream class, which I use to keep track of the download status:
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.commons.io.output.CountingOutputStream;
public class DownloadCountingOutputStream extends CountingOutputStream {
private ActionListener listener = null;
public DownloadCountingOutputStream(OutputStream out) {
super(out);
}
public void setListener(ActionListener listener) {
this.listener = listener;
}
@Override
protected void afterWrite(int n) throws IOException {
super.afterWrite(n);
if (listener != null) {
listener.actionPerformed(new ActionEvent(this, 0, null));
}
}
}
Thanks in advance
First, there are a couple of simple things you could try to speed up the transfers:
Try using a larger transfer buffer size. Change the 8K buffer size to 64K or 512K.
Get rid of the DownloadCountingOutputStream and transfer directly to the FileOutputStream.
These should be simple to try ... and they may help a bit.
On a Linux system, it may also be worthwhile to replace the Apache IOUtils.copy call with code that uses the kernel's zero-copy transfer support. See Efficient data transfer through zero copy for an explanation.
The example in the article is for uploading using transferTo, but downloading using transferFrom should be analogous. The article claims a 65% speedup for large file transfers compared with conventional Java I/O. But that is likely to depend on the characteristics of your network connection.
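As a rough sketch (not your code, and with the progress callback left out), a transferFrom-based download loop could look like this:
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
public class ChannelDownload {
public static void download(URL url, String destFile) throws IOException {
try (ReadableByteChannel in = Channels.newChannel(url.openStream());
FileOutputStream fos = new FileOutputStream(destFile);
FileChannel out = fos.getChannel()) {
long position = 0;
long transferred;
// transferFrom moves bytes without an intermediate user-space buffer;
// for a blocking stream-backed source channel, 0 means end of stream.
while ((transferred = out.transferFrom(in, position, 1 << 20)) > 0) {
position += transferred;
}
}
}
}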
At first I create an empty file, and then I invoke some threads to search the database, get the result content, and append it to the file. The result content is a String and may be around 20 MB. Each thread should write to the file one at a time. I have tested many times and found that locking does not seem to be necessary. Is that right? The example writes 1000 lines in total. When do I need to add a write lock to operate on the file?
String currentName = "test.txt";
final String LINE_SEPARATOR = System.getProperty("line.separator");
ThreadPoolExecutor pool = new ThreadPoolExecutor(
10, 100, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<Runnable>());
for (int i = 0; i < 500; i++) {
pool.execute(() -> {
try {
appendFileByFilesWrite(currentName, "abc" +
ThreadLocalRandom.current().nextInt(1000) + LINE_SEPARATOR);
} catch (IOException e) {
e.printStackTrace();
}
});
}
IntStream.range(0, 500).<Runnable>mapToObj(a -> () -> {
try {
appendFileByFilesWrite( currentName,
"def" + ThreadLocalRandom.current().nextInt(1000) +
LINE_SEPARATOR);
} catch (IOException e) {
e.printStackTrace();
}
}).forEach(pool::execute);
pool.shutdown();
Here is the method:
public static void appendFileByFilesWrite(String fileName,String fileContent) throws IOException {
Files.write(Paths.get(fileName), fileContent.getBytes(),StandardOpenOption.APPEND);
}
The answer is: always.
Your test works for you. Right now. Today. Maybe during a full moon, it won't. Maybe if you buy a new computer, or your OS vendor updates, or the JDK updates, or you're playing a Britney Spears song in Winamp, it won't.
The spec says that it is legitimate for a write to be smeared out over multiple steps, and the behaviour of StandardOpenOption.APPEND is undefined at that point. Possibly if you write 'Hello' and 'World' simultaneously, the file may end up containing 'HelWorllod'. It probably won't. But it could.
Generally, bugs in concurrency are very hard (sometimes literally impossible) to test for. Doesn't make it any less of a bug; mostly you end up with a ton of bug reports, and you answering 'cannot reproduce' on all of them. This is not a good place to be.
Most likely if you want to observe the problem in action, you should write extremely long strings in your writer; the aim is to end up with the actual low-level disk command involving multiple separated out blocks. Even then there is no guarantee that you'll observe a problem. And yet, absence of proof is not proof of absence.
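For instance, a minimal stress-test sketch along those lines (file name, thread count, and line length are arbitrary; assumes Java 11+ for String.repeat):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;
public class AppendSmearTest {
public static void main(String[] args) throws Exception {
Path file = Paths.get("smear-test.txt");
Files.deleteIfExists(file);
Files.createFile(file);
ExecutorService pool = Executors.newFixedThreadPool(8);
for (int t = 0; t < 8; t++) {
char marker = (char) ('A' + t);
// very long lines make it more likely that one append is split
// into several low-level disk operations
String line = String.valueOf(marker).repeat(1 << 16) + System.lineSeparator();
pool.execute(() -> {
for (int i = 0; i < 100; i++) {
try {
Files.write(file, line.getBytes(), StandardOpenOption.APPEND);
} catch (IOException e) {
e.printStackTrace();
}
}
});
}
pool.shutdown();
pool.awaitTermination(5, TimeUnit.MINUTES);
// a clean line contains a single distinct character; an interleaved one doesn't
try (Stream<String> lines = Files.lines(file)) {
long interleaved = lines.filter(l -> l.chars().distinct().count() > 1).count();
System.out.println("interleaved lines: " + interleaved);
}
}
}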
I use this class when I need to lock a file. It allows for read write locks across multiple JVMs and multiple threads.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.util.Date;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
public class FileLocks {
private static final String WRITE_MODE = "rws";
private static final String READ_MODE = "r";
private static final Map<String, LockContext> JVM_LOCK_MAP = new ConcurrentHashMap<>();
private FileLocks() {
}
public static <X> X read(File file, ReadAccessor<X> accessor) throws IOException {
return access(file, false, fc -> {
try (var is = Channels.newInputStream(fc);) {
return accessor.read(fc, is);
}
});
}
public static void write(File file, WriterAccessor accessor) throws IOException {
access(file, true, fc -> {
try (var os = Channels.newOutputStream(fc);) {
accessor.write(fc, os);
}
return null;
});
}
public static <X> X access(File file, boolean write, FileChannelAccessor<X> accessor)
throws FileNotFoundException, IOException {
Objects.requireNonNull(file);
Objects.requireNonNull(accessor);
String path = file.getAbsolutePath();
var lockContext = JVM_LOCK_MAP.compute(path, (k, v) -> {
if (v == null)
v = new LockContext();
v.incrementAndGetThreadCount();
return v;
});
var jvmLock = write ? lockContext.getAndLockWrite() : lockContext.getAndLockRead();
try (var randomAccessFile = new RandomAccessFile(file, write ? WRITE_MODE : READ_MODE);
var fileChannel = randomAccessFile.getChannel();) {
var fileLock = write ? fileChannel.lock() : null;
try {
return accessor.access(fileChannel);
} finally {
if (fileLock != null && fileLock.isValid())
fileLock.close();
}
} finally {
jvmLock.unlock();
JVM_LOCK_MAP.compute(path, (k, v) -> {
if (v == null)
return null;
var threadCount = v.decrementAndGetThreadCount();
if (threadCount <= 0)
return null;
return v;
});
}
}
public static interface FileChannelAccessor<X> {
X access(FileChannel fileChannel) throws IOException;
}
public static interface ReadAccessor<X> {
X read(FileChannel fileChannel, InputStream inputStream) throws IOException;
}
public static interface WriterAccessor {
void write(FileChannel fileChannel, OutputStream outputStream) throws IOException;
}
private static class LockContext {
private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
private long threadCount = 0;
public long incrementAndGetThreadCount() {
threadCount++;
return threadCount;
}
public long decrementAndGetThreadCount() {
threadCount--;
return threadCount;
}
public Lock getAndLockWrite() {
var lock = rwLock.writeLock();
lock.lock();
return lock;
}
public Lock getAndLockRead() {
var lock = rwLock.readLock();
lock.lock();
return lock;
}
}
}
You can then use it for writing like so:
File file = new File("test/lock-test.txt");
FileLocks.write(file, (fileChannel, outputStream) -> {
try (var bw = new BufferedWriter(new OutputStreamWriter(outputStream));) {
bw.append("cool beans " + new Date().getTime());
}
});
And reading:
File file = new File("test/lock-test.txt")
var lines = FileLocks.read(file, (fileChannel, inputStream) -> {
try (var br = new BufferedReader(new InputStreamReader(inputStream));) {
return br.lines().collect(Collectors.toList());
}
});
You can use a FileLock, which also works across processes, or just add synchronized to the method, which is enough when all writers live in the same JVM.
// fc is a FileChannel opened for writing on the target file
while (true) {
try {
lock = fc.lock();
break;
} catch (OverlappingFileLockException e) {
// another thread in this JVM already holds the lock; wait and retry
Thread.sleep(1 * 1000);
}
}
appendFileByFilesWrite(fileName, fileContent);
or just change the method like this:
public synchronized static void appendFileByFilesWrite(String fileName,String fileContent) throws IOException {
Files.write(Paths.get(fileName), fileContent.getBytes(),StandardOpenOption.APPEND);
}
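Combining the two approaches, a minimal self-contained sketch (class and method names are mine, not from the question): synchronized serializes the threads inside one JVM, which also avoids OverlappingFileLockException, while the FileLock keeps writers in other processes out.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
public class LockedAppend {
public static synchronized void appendWithLock(String fileName, String content) throws IOException {
try (FileChannel fc = FileChannel.open(Paths.get(fileName),
StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
// blocks until writers in other processes release the file
try (FileLock lock = fc.lock()) {
fc.write(ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8)));
}
}
}
}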
I am trying to write a program which converts all G1 lines of a G-code file into lines that say MOVX(x-coordinate of the G1 command).
E.g. G1 X0.1851 should become MOVX(0.1851)
At the moment the program just appends to the text file that has been read, printing the new code below the old one in the same text file.
The problem is that when I try to build an array list of the number after the X in the G-code, the memory in the heap space overflows.
I have added a clear() call after each iteration over a line of the G-code to try to stop the array list from growing larger and larger, but it keeps overflowing.
Here is my code:
package textfiles;
import java.io.IOException;
import java.util.ArrayList;
public class FileData {
public static void main(String[] args) throws IOException {
String file_name = "C:/blabla";
try {
ReadFile file = new ReadFile(file_name);
WriteFile data = new WriteFile(file_name, true);
String[] aryLines = file.OpenFile();
int i;
int j;
int y;
for (i=0; i < aryLines.length; i++ ) { //goes through whole text file
System.out.println( aryLines[ i ]);
if (i == 0) {
data.writeToFile("");
System.lineSeparator();
}
char[] ch = aryLines[ i ].toCharArray();
ArrayList<Character> num = new ArrayList<Character>();
String xCo = null;
boolean counterX = false;
if ((ch[0]) == 'G' && ch[1] == '1') {
for (j = 0; j < ch.length; j++) { //goes through each line of text file
for (y = 0; counterX == true; y++) {
num.add(ch[j]);
}
if (ch[j] == 'X') {
counterX = true;
}
else if (ch[j] == ' ') {
counterX = false;
}
}
xCo = num.toString();
data.writeToFile("MOVX (" + xCo + ")");
}
num.clear();
}
}
catch (IOException e) {
System.out.println( e.getMessage() );
}
System.out.println("Text File Written To");
}
}
I'd suggest avoiding reading all the data into memory and using streaming instead.
The function that converts the lines could then look like this:
public void convertFile(String fileName, String tmpFileName) throws IOException {
try (FileWriter writer = new FileWriter(tmpFileName, true);
BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
Pattern pG1_X = Pattern.compile("^G1 X");
reader.lines().forEach(line -> {
try {
double x = Double.parseDouble(pG1_X.split(line)[1]); // get coordinate
String newLine = String.format("MOVX(%f)\n", x); // replace coordinate format
writer.write(newLine);
} catch (Exception e) {
LOGGER.log(Level.WARNING, String.format("error while converting line %s", line), e);
}
});
}
}
A test case which demonstrates how it works:
package com.github.vtitov.test;
import org.junit.experimental.theories.DataPoints;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.rules.TemporaryFolder;
import org.junit.runner.RunWith;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Random;
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Pattern;
import java.nio.file.StandardCopyOption;
@RunWith(Theories.class)
public class ReadWriteTest {
final static Logger LOGGER = Logger.getLogger(ReadWriteTest.class.getName());
public void convertFile(String fileName, String tmpFileName) throws IOException {
try (FileWriter writer = new FileWriter(tmpFileName, true);
BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
Pattern pG1_X = Pattern.compile("^G1 X");
reader.lines().forEach(line -> {
try {
double x = Double.parseDouble(pG1_X.split(line)[1]); // get coordinate
String newLine = String.format("MOVX(%f)\n", x); // replace coordinate format
writer.write(newLine);
} catch (Exception e) {
LOGGER.log(Level.WARNING, String.format("error while converting line %s", line), e);
}
});
}
}
@DataPoints static public Long[] fileSizes() { return new Long[]{100L, 10_000L, 1_000_000L}; }
@Theory
public void readWriteTest(Long fileSize) throws Exception {
TemporaryFolder folder = TemporaryFolder.builder().parentFolder(new File("target")).build();
folder.create();
File file = folder.newFile(UUID.randomUUID() + ".txt");
File tmpFile = folder.newFile(file.getName() + ".tmp");
createFile(fileSize, file);
String filePath = file.getPath();
LOGGER.info(String.format("created file %s of %d lines", filePath, fileSize));
String tmpFilePath = filePath + ".tmp";
convertFile(filePath, tmpFilePath);
LOGGER.info(String.format("file %s converted to %s", filePath, tmpFilePath));
//assert false;
Files.move(new File(tmpFilePath).toPath(), new File(filePath).toPath(),
StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
LOGGER.info(String.format("file %s moved to %s", tmpFilePath, filePath));
folder.delete();
}
private void createFile(long fileSize, File file) throws Exception {
try (FileWriter writer = new FileWriter(file,true)) {
Random rnd = new Random();
rnd.doubles(fileSize).forEach(l -> {
try { writer.write(String.format("G1 X%f\n", l)); } catch (IOException ignored) {}
});
}
}
}
package com.otp.util;
import java.io.FileWriter;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import com.otp.servlets.MessageServlet;
public class CDRWriter {
public FileWriter fileWriter = null;
static int lineCounter = 0;
static String fileName = null;
public void writeCDR(String cdrData) throws IOException {
if(lineCounter == 0){
fileName = createFile();
}else if(lineCounter>500){
String temp=fileName;
fileName = createFile();
lineCounter=0;
Runtime rt = Runtime.getRuntime();
String zipCmd="7z a "+"\""+MessageServlet.filePath+temp+".7z"+"\""+" "+"\""+MessageServlet.filePath+temp+"\"";
System.out.println("zipCmd = "+zipCmd);
rt.exec(zipCmd);
//rt.exec("del "+MessageServlet.filePath+temp);
}
System.out.println("cdr data = "+cdrData);
try {
if(lineCounter == 0){
fileWriter = new FileWriter(MessageServlet.filePath+fileName);
}else{
fileWriter = new FileWriter(MessageServlet.filePath+fileName,true);
}
System.out.println("cdr after if else condition ="+cdrData);
fileWriter.write(cdrData.toString());
System.out.println("cdr after write method ="+cdrData);
fileWriter.write("\r\n");
fileWriter.flush();
//fileWriter.close();
lineCounter++;
System.out.println("CDRWriter : lineCounter = "+lineCounter); } catch (IOException e) {
e.printStackTrace();
}
}// end of WriterCDR method
public String createFile() throws IOException {
SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy-HH-mm-ss");
String fileName = "GSMS_CDR_" + sdf.format(new Date()) + ".txt";
return fileName;
}// end of the createFile method
}// end of CDRWriter class
I would do something like this:
import java.io.*;
import SevenZip.Compression.LZMA.*;
public class Create7Zip
{
public static void main(String[] args) throws Exception
{
// file to compress
File inputToCompress = new File(args[0]);
BufferedInputStream inputStream = new BufferedInputStream(new java.io.FileInputStream(inputToCompress));
// archive
File compressedOutput = new File(args[1] + ".7z");
BufferedOutputStream outputStream = new BufferedOutputStream(new java.io.FileOutputStream(compressedOutput));
Encoder encoder = new Encoder();
encoder.SetAlgorithm(2);
encoder.SetDictionarySize(8388608);
encoder.SetNumFastBytes(128);
encoder.SetMatchFinder(1);
encoder.SetLcLpPb(3,0,2);
encoder.SetEndMarkerMode(false);
encoder.WriteCoderProperties(outputStream);
long fileSize;
fileSize = inputToCompress.length();
for (int i = 0; i < 8; i++)
{
outputStream.write((int) (fileSize >>> (8 * i)) & 0xFF);
}
encoder.Code(inputStream, outputStream, -1, -1, null);
// free resources
outputStream.flush();
outputStream.close();
inputStream.close();
}
}
The SevenZip packages come from the official LZMA SDK, which you can download from the project site.
Disclaimer: I believe I found this snippet a while ago on the net, but I can't find the source anymore.
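Note that WriteCoderProperties() followed by the 8-byte length header and the encoded data is the layout of a raw .lzma stream rather than a complete .7z archive, so despite the .7z file name the result is a single LZMA-compressed payload.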
I am trying to write a Java program that receives the source code of a MapReduce job, compiles it dynamically, and runs the job on a Hadoop cluster. To reach this, I have written three methods called compile(), makeJAR() and run_Hadoop_Job(). Everything works fine with the compilation and creation of the JAR file. However, when the job is submitted to Hadoop, as soon as the job starts it fails to find the required Mapper/Reducer classes and throws a ClassNotFoundException for both the Mapper_Class and Reducer_Class (java.lang.ClassNotFoundException: reza.rCloud.Mapper_Reducer_Classes$Mapper_Class.class). I know there must be something wrong with how I reference the required Mapper/Reducer classes, but I was not able to figure it out after several attempts. Any help/suggestion on how to solve the issue is highly appreciated.
Regarding the details of the project: I have a file called "rCloud_test/src/reza/Mapper_Reducer_Classes.java" that contains the source code of Mapper_Class and Reducer_Class. This file is ultimately received at runtime, but for now I have copied the Hadoop WordCount example into it and stored it locally in the same folder as my main class file: rCloud_test/src/reza/Platform2.java.
Below you can see the main() method of Platform2.java, which is the main class of this project:
public static void main(String[] args){
System.out.println("Code Execution Started");
String className = "Mapper_Reducer_Classes";
Platform2 myPlatform = new Platform2();
//step 1: compile the received class file dynamically:
boolean compResult = myPlatform.compile(className);
System.out.println(className + ".java compilation result: "+compResult);
//step 2: make a JAR file out of the compiled file:
if (compResult) {
compResult = myPlatform.makeJAR("jar_file", myPlatform.compilation_Output_Folder);
System.out.println("JAR creation result: "+compResult);
}
//step 3: Now let's run the Hadoop job:
if (compResult) {
compResult = myPlatform.run_Hadoop_Job(className);
System.out.println("Running on Hadoop result: "+compResult);
}
}
The method that is causing me all the problems is the run_Hadoop_Job() which is as below:
private boolean run_Hadoop_Job(String className){
try{
System.out.println("*Starting to run the code on Hadoop...");
String[] argsTemp = { "project_test/input", "project_test/output" };
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:54310");
conf.set("mapred.job.tracker", "localhost:54311");
conf.set("mapred.jar", jar_Output_Folder + "/jar_file"+".jar");
conf.set("libjars", required_Execution_Classes);
//This is where it can't find the mentioned classes, although they exist
//both on disk and in the created JAR file:
System.out.println("Getting Mapper/Reducer package name: " +
Mapper_Reducer_Classes.class.getName());
conf.set("mapreduce.map.class", "reza.rCloud.Mapper_Reducer_Classes$Mapper_Class");
conf.set("mapreduce.reduce.class", "reza.rCloud.Mapper_Reducer_Classes$Reducer_Class");
Job job = new Job(conf, "Hadoop Example for dynamically and programmatically compiling-running a job");
job.setJarByClass(Platform2.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(argsTemp[0]));
FileSystem fs = FileSystem.get(conf);
Path out = new Path(argsTemp[1]);
fs.delete(out, true);
FileOutputFormat.setOutputPath(job, new Path(argsTemp[1]));
//job.submit();
System.out.println("*and now submitting the job to Hadoop...");
System.exit(job.waitForCompletion(true) ? 0 : 1);
System.out.println("Job Finished!");
} catch (Exception e) {
System.out.println("****************Exception!" );
e.printStackTrace();
return false;
}
return true;
}
If needed, here's the source code for the compile() method:
private boolean compile(String className) {
String fileToCompile = JOB_FOLDER + "/" +className+".java";
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
FileOutputStream errorStream = null;
try{
errorStream = new FileOutputStream(JOB_FOLDER + "/logs/Errors.txt");
} catch(FileNotFoundException e){
//if there is a problem creating the file, errors will default to the console
}
int compilationResult =
compiler.run( null, null, errorStream,
"-classpath", required_Compilation_Classes,
"-d", compilation_Output_Folder,
fileToCompile);
if (compilationResult == 0) {
//Compilation is successful:
return true;
} else {
//Compilation Failed:
return false;
}
}
and the source code for the makeJAR() method:
private boolean makeJAR(String outputFileName, String inputDirectory) {
Manifest manifest = new Manifest();
manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION,
"1.0");
JarOutputStream target = null;
try {
target = new JarOutputStream(new FileOutputStream(
jar_Output_Folder+ "/"
+ outputFileName+".jar" ), manifest);
add(new File(inputDirectory), target);
} catch (Exception e) { return false; }
finally {
if (target != null)
try{
target.close();
} catch (Exception e) { return false; }
}
return true;
}
private void add(File source, JarOutputStream target) throws IOException
{
BufferedInputStream in = null;
try
{
if (source.isDirectory())
{
String name = source.getPath().replace("\\", "/");
if (!name.isEmpty())
{
if (!name.endsWith("/"))
name += "/";
JarEntry entry = new JarEntry(name);
entry.setTime(source.lastModified());
target.putNextEntry(entry);
target.closeEntry();
}
for (File nestedFile: source.listFiles())
add(nestedFile, target);
return;
}
JarEntry entry = new JarEntry(source.getPath().replace("\\", "/"));
entry.setTime(source.lastModified());
target.putNextEntry(entry);
in = new BufferedInputStream(new FileInputStream(source));
byte[] buffer = new byte[1024];
while (true)
{
int count = in.read(buffer);
if (count == -1)
break;
target.write(buffer, 0, count);
}
target.closeEntry();
}
finally
{
if (in != null)
in.close();
}
}
and finally the fixed parameters used for accessing the files:
private String JOB_FOLDER = "/Users/reza/My_Software/rCloud_test/src/reza/rCloud";
private String HADOOP_SOURCE_FOLDER = "/Users/reza/My_Software/hadoop-0.20.2";
private String required_Compilation_Classes = HADOOP_SOURCE_FOLDER + "/hadoop-0.20.2-core.jar";
private String required_Execution_Classes = required_Compilation_Classes + "," +
"/Users/reza/My_Software/ActorFoundry_dist_ver/lib/commons-cli-1.1.jar," +
"/Users/reza/My_Software/ActorFoundry_dist_ver/lib/commons-logging-1.1.1.jar";
public String compilation_Output_Folder = "/Users/reza/My_Software/rCloud_test/dyn_classes";
private String jar_Output_Folder = "/Users/reza/My_Software/rCloud_test/dyn_jar";
As a result of running Platform2, the structure of the project on disk looks as below:
rCloud_test/classes/reza/rCloud/Platform2.class: contain the Platform2 class
rCloud_test/dyn_classes/reza/rCloud/ contains the classes for Mapper_Reducer_Classes.class, Mapper_Reducer_Classes$Mapper_Class.class, and Mapper_Reducer_Classes$Reducer_Class.class
rCloud_test/dyn_jar/jar_file.jar contains the created jar file
REVISED: here's the source code for rCloud_test/src/reza/rCloud/Mapper_Reducer_Classes.java:
package reza.rCloud;
import java.io.IOException;
import java.lang.InterruptedException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class Mapper_Reducer_Classes {
/**
* The map class of WordCount.
*/
public static class Mapper_Class
extends Mapper<Object, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
}
/**
* The reducer class of WordCount
*/
public static class Reducer_Class
extends Reducer<Text, IntWritable, Text, IntWritable> {
public void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
int sum = 0;
for (IntWritable value : values) {
sum += value.get();
}
context.write(key, new IntWritable(sum));
}
}
}
Try to set them by using the setClass() method:
conf.setClass("mapreduce.map.class",
Class.forName("reza.rCloud.Mapper_Reducer_Classes$Mapper_Class"),
Mapper.class);
conf.setClass("mapreduce.reduce.class",
Class.forName("reza.rCloud.Mapper_Reducer_Classes$Reducer_Class"),
Reducer.class);
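If the dynamically compiled classes are resolvable on the runtime classpath, the typed Job setters are an alternative sketch to the raw configuration keys (the unchecked casts are needed because the classes do not exist at compile time):
job.setMapperClass((Class<? extends Mapper>)
Class.forName("reza.rCloud.Mapper_Reducer_Classes$Mapper_Class"));
job.setReducerClass((Class<? extends Reducer>)
Class.forName("reza.rCloud.Mapper_Reducer_Classes$Reducer_Class"));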