I am using IntelliJ IDEA on Ubuntu 14.04 to test my Hadoop program. To change the number of reducers, I use the following code:
job.setNumReduceTasks(3);
I use Build Artifacts in IDEA to build a jar file and then run hadoop jar xxx.jar MyClass input output in the Linux shell. The output contains 3 files (part-r-00000, part-r-00001, part-r-00002), which is exactly what I expect. However, when I run the program inside IDEA for convenience, with the arguments input/ output/, the output contains only one file, part-r-00000. So I am wondering what is going wrong.
When you run in local mode only one reducer will be used - there is no parallelism in local mode. Nothing is going wrong with your code here.
Also see https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Standalone_Operation:
Standalone Operation
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
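To see why the number of reducers determines the number of part-r-NNNNN files, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes String keys and uses the same arithmetic as Hadoop's default HashPartitioner; the class name, the sample keys, and the helper methods are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class ReducerOutputSketch {
    // Same arithmetic as Hadoop's default HashPartitioner.getPartition,
    // applied here to plain String keys (an assumption for illustration).
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Name of the output file written by the reducer with the given index.
    static String outputFileName(int partition) {
        return String.format("part-r-%05d", partition);
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "date", "elderberry"}; // hypothetical keys
        for (int reducers : new int[] {3, 1}) {
            Set<String> files = new TreeSet<>();
            for (String key : keys) {
                files.add(outputFileName(partitionFor(key, reducers)));
            }
            // Only files that received at least one sample key are listed here.
            System.out.println(reducers + " reducer(s) -> " + files);
        }
    }
}
```

With a single reducer every key maps to partition 0, so only part-r-00000 appears, which matches what the run inside IDEA produced.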
I can't get multiple threads to run using JOMP no matter what I try. In fact, I can't run a JOMP program from the command line at all, although ironically it will compile there and then run in Eclipse! Even in Eclipse, though, I only get one thread. I've been through the notes from my university course about installing JOMP carefully, but they have not helped. To be more specific:
Items in quotes below are from those notes:
"There are a couple of websites that tell you how to make jomp run under Eclipse, see http://www.lst.inf.ethz.ch/teaching/lectures/ss10/24/assignments/assignment_10/eclipse.txt"
This refers to a now broken link. It also seems to be the only link anyone on forums like Stack Overflow refers to when talking about this issue. Apparently it has instructions on runtime settings for Eclipse that allow multiple threads to run, but since the link is currently broken I can't access those valuable instructions.
"All that is required in order to do that is to ensure that jomp1.0b.jar is on the CLASSPATH"
I ran echo %CLASSPATH% at the command prompt to check if it was on the class path and got the following response:
C:\Program Files\Java\jre1.8.0_162\lib\jomp1.0b.jar
On my PC the jomp jar file is in that folder, so it appears I should be able to execute compiled JOMP programs from the command line, but unfortunately that is not the case. By executing one of these commands it should run:
java -Djomp.threads=2 parallel
java -Djomp.threads=2 -cp . parallel
java -Djomp.threads=2 -cp C:\Users\terry\eclipse-workspace\JOMPHello\src parallel
This is the folder that the jomp, java and compiled class files are in. I also checked whether "parallel" is the fully qualified class name given the way I have set it up in Eclipse, and it does appear to be. So, as near as I can tell, any of these commands should run the jomp program from the command line, but they all return the following error:
Error: Could not find or load main class parallel
Caused by: java.lang.ClassNotFoundException: parallel
(To which I feel like telling Java, "You're not looking hard enough! It is right in the folder I am running this command from!")
Clearly I am missing something. Can anyone tell me how to get JOMP programs running on the command line, or alternatively point me to accessible instructions for setting up the workaround runtime settings in Eclipse?
My implementation of the program seems to run with only one thread, so hopefully that means it is correct, but I can only be sure once I have run it with at least a few more threads.
Thanks,
Terry.
I figured out how to set up the runtime argument in Eclipse. You just have to add the following line to the VM arguments box under the Arguments tab in Run Configurations for the file:
-Djomp.threads=n
(where n as before is the number of threads you want).
I'd still like to know why it's not working on the Command Line though. It makes me think my Java is set up weirdly.
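For context, a -Dname=value VM argument defines a Java system property that code in the same JVM can read. The sketch below only demonstrates that mechanism. It does not show how JOMP itself reads the setting, and the class name and the fallback of 1 thread are assumptions made for illustration.

```java
class ThreadCountProperty {
    // Reads the jomp.threads system property as an int.
    // The default of 1 is an assumption for this sketch, not JOMP's documented behaviour.
    static int configuredThreads() {
        return Integer.getInteger("jomp.threads", 1);
    }

    public static void main(String[] args) {
        System.out.println("jomp.threads = " + configuredThreads());
    }
}
```

Running it as java -Djomp.threads=2 ThreadCountProperty from the folder containing the class file is a quick way to check that the VM argument is actually reaching the JVM.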
I am working with Hadoop, but I don't understand how to configure it with Eclipse, or which software I need to install to run a Hadoop program on a single machine under Windows 7.
Is there a plugin available for running Hadoop programs?
Can you please suggest a link with detailed information on how to run a Hadoop program?
I have looked at many sites, but I can't find exact instructions for running a program.
Is there a site that covers everything about running Hadoop programs in Eclipse,
and how to write the map and reduce parts of the program?
I followed the instructions on the Datomic site: http://docs.datomic.com/getting-started.html, but I'm getting this error when trying to start up the datomic shell prompt. I'm using a windows machine. Any suggestions? I tried the same thing on my linux box and did not get this error.
Edit: moved to a different windows machine and it's working. If I have time to troubleshoot this problem and I find a solution I'll report back
I noticed that you cannot run the shell.cmd from within the bin directory, you need to call it with bin\shell.cmd from the parent directory... hope that helps.
In case you are using cygwin/bash and call bin/shell :
The java runtime on windows does not understand classpath with a ":"
but this is what you get from bin/classpath.
Either correct this or use DOS-CMD shell and call bin/shell.cmd inside.
Regards
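To illustrate the separator point above: Java exposes the platform's classpath separator as File.pathSeparator, which is ";" on Windows and ":" on Unix-like systems. This is a minimal sketch; the class name and the jar paths are hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

class ClasspathSeparatorSketch {
    // Joins classpath entries with the separator the current platform expects.
    static String joinClasspath(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        System.out.println("separator: " + File.pathSeparator);
        // Hypothetical jar paths, used only to show the joined result.
        System.out.println(joinClasspath(Arrays.asList("lib/a.jar", "lib/b.jar")));
    }
}
```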
Some tips for running datomic on Windows (7 at least):
Do not download datomic into Program Files. On startup, it creates logging directories and temp files into its own directories, so unless you run the command prompt as Administrator, you're gonna have screens full of Unable to write to file... errors.
You need to run datomic as such (assuming you extracted the download to C:)
c:\datomic-free-0.x.xxxx>bin\shell.cmd
Note the backslash. Tripped me up forever coming from *nix world.
After that, return to your regularly scheduled datomic tutorials.
In a project, I'm trying to set up an automated build system for Apache Karaf (there are several commands I need to run in Karaf to set up a working environment on a fresh install). Karaf contains a batch/script file that sets several parameters, and then calls the actual Java program. Essentially, I'd like to be able to do something like:
java MyProgramClass.class < commandTextFile.txt
But when I try this it doesn't do anything. My goal is to simply copy the karaf.bat file and modify it slightly (as below) to make a "karaf-install.bat" that I can just run. The part of karaf.bat I've modified is below, and all I've done is add < "C:\commandFile.txt" at the end (the following is all on one line, broken for readability):
"%JAVA%" %JAVA_OPTS% %OPTS% -classpath "%CLASSPATH%"
-Djava.endorsed.dirs="%JAVA_HOME%\jre\lib\endorsed;%JAVA_HOME%\lib\endorsed;%KARAF_HOME%\lib\endorsed"
-Djava.ext.dirs="%JAVA_HOME%\jre\lib\ext;%JAVA_HOME%\lib\ext;%KARAF_HOME%\lib\ext"
-Dkaraf.instances="%KARAF_HOME%\instances" -Dkaraf.home="%KARAF_HOME%"
-Dkaraf.base="%KARAF_BASE%" -Dkaraf.data="%KARAF_DATA%"
-Djava.util.logging.config.file="%KARAF_BASE%\etc\java.util.logging.properties"
%KARAF_OPTS% %MAIN% %ARGS% < "C:\commandFile.txt"
However, Karaf shows nothing. It just runs as if I executed it as normal; my commands are not executed. Is there a way to redirect INTO a java program from the console? Am I doing it way wrong?
For what it's worth, this will eventually be done on both Windows and OS X, but I'm focusing on Windows at the moment.
Update: it turns out that this seems to work for me on OS X. Karaf struggles there (it says "Command not found: "), but I think that's because it receives the commands before it has finished initializing. Windows still doesn't even get the commands. I'll poke around more.
When input is redirected INTO a Java program, you can read it from System.in.
Treat it as character data: wrap the InputStream in a Reader rather than reading raw bytes.
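A minimal sketch of that idea, assuming one command per line and UTF-8 input. The class and method names are made up for illustration, and this says nothing about how Karaf's own console consumes standard input.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

class StdinCommandReader {
    // Wraps the byte stream in a Reader and returns the non-blank lines, trimmed.
    static List<String> readCommands(InputStream in) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Reads whatever was redirected or piped into the process.
        for (String command : readCommands(System.in)) {
            System.out.println("command: " + command);
        }
    }
}
```

It could be tried with java StdinCommandReader < commandTextFile.txt. Note that the argument to java is the class name, without the .class suffix.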
I'm just going to write this issue off as Karaf weirdness, seeing as it works on OS X. I was able to work around it by using the client program that comes with Karaf by doing (on OS X in a .sh file):
"$KARAF/bin/client" "karaf_command_here"
or (on Windows in a .bat file)
call "%KARAF%\bin\client.bat" "karaf_command_here"
And instead of having a list of commands to pipe into Karaf, I just made the list of commands a shell/batch script file that would call Karaf's client for each command. Not as pretty as I'd have liked it, but it got the job done.
(Note you need to start Karaf before using the client with start (and close it with stop)).
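The same per-command approach could also be driven from Java instead of a shell or batch script. This is only a sketch: it builds the argument list for one client call per command and prints it rather than running it. The client paths mirror the ones above; the class name, the fallback install path, and the use of cmd.exe /c to launch the .bat file are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KarafClientSketch {
    // Builds the argv for one client call; karafHome is the Karaf install directory.
    static List<String> clientInvocation(String karafHome, boolean windows, String command) {
        List<String> argv = new ArrayList<>();
        if (windows) {
            // Batch files are launched through cmd.exe here (an assumption of this sketch).
            argv.addAll(Arrays.asList("cmd.exe", "/c", karafHome + "\\bin\\client.bat"));
        } else {
            argv.add(karafHome + "/bin/client");
        }
        argv.add(command);
        return argv;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        // Hypothetical fallback path; pass the real install directory as the first argument.
        String karafHome = args.length > 0 ? args[0] : "/opt/karaf";
        for (String command : Arrays.asList("karaf_command_here")) {
            List<String> argv = clientInvocation(karafHome, windows, command);
            // To actually run it: new ProcessBuilder(argv).inheritIO().start().waitFor();
            System.out.println(String.join(" ", argv));
        }
    }
}
```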
I added a value at:
HKLM\Software\Microsoft\Windows\CurrentVersion\Run
That looks like this:
Value Name: LDE
Value Data: "java -jar C:\LDE\lde.jar"
Really with the quotes (because all the others were also in quotes). After adding this, I restarted my computer, but the program didn't start automatically.
Will wrapping my jar in an exe help?
I'm running Windows 7.
Any help?
Thanks in advance.
Update:
When I remove the quotes, it works. But now a terminal window also appears, which I don't need...
A couple of things to note here, concerning the two different issues in the problem:
Format of Windows Run keys
From the Microsoft Windows XP knowledge base:
Run keys cause programs to automatically run each time that a user logs on. The Windows XP registry includes the following four Run keys:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunOnce
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\RunOnce
Each of these keys has a series of values. The values allow multiple entries to exist without overwriting one another. The data value for a value is a command line.
Note the emphasis on the last line. Once the whole value is wrapped in quotes, the command fails in the same way it would fail if executed like that from a command prompt.
Also, note that the above approach is documented for Windows XP and also holds good for Windows 7. More details can be found in this Microsoft Technet article on the options available in Windows 7.
The javaw vs java application launcher
Once the java process can be initialized at Windows startup, one will get a console window that continues to stay around until the process is terminated. This occurs if the java executable is utilized to initialize the application.
From the technotes of the java application launcher:
The javaw command is identical to java, except that with javaw there is no associated console window. Use javaw when you don't want a command prompt window to appear. The javaw launcher will, however, display a dialog box with error information if a launch fails for some reason.
Therefore, if you wish to avoid opening a console window for the Java process, you ought to use the javaw executable.
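Putting both points together, the value data would be an unquoted command line that uses javaw. The sketch below builds that command line, plus the arguments for a reg add call that would write it to the Run key. It only prints the result. The class and method names are made up, and writing to HKLM would normally require an elevated prompt.

```java
import java.util.Arrays;
import java.util.List;

class RunKeySketch {
    static final String RUN_KEY =
            "HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run";

    // The value data: a plain command line, not wrapped in quotes as a whole.
    // A jar path containing spaces would need its own quotes around the path only.
    static String commandLine(String jarPath) {
        return "javaw -jar " + jarPath;
    }

    // Arguments for "reg add" that would store the command line under the Run key.
    static List<String> regAddCommand(String valueName, String jarPath) {
        return Arrays.asList("reg", "add", RUN_KEY,
                "/v", valueName, "/t", "REG_SZ", "/d", commandLine(jarPath), "/f");
    }

    public static void main(String[] args) {
        // Value name and jar path are taken from the question.
        System.out.println(commandLine("C:\\LDE\\lde.jar"));
        System.out.println(regAddCommand("LDE", "C:\\LDE\\lde.jar"));
    }
}
```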
This is very simple. You will find the startup folder at C:/Documents and Settings/AllUsers/YourUserName/StartUp. It will be on a similar kind of path, so just check it. Then paste your jar file into that folder and it will work nicely. Remember to put the jar file in the startup folder under your own user name. These folders might be hidden, so check for that. If you find this answer useful, vote it up. Enjoy.