sparklyr failing with java.lang.OutOfMemoryError: GC overhead limit exceeded - java

I'm hitting a GC overhead limit exceeded error in Spark using spark_apply. Here are my specs:
sparklyr v0.6.2
Spark v2.1.0
4 workers with 8 cores and 29G of memory
The closure get_dates pulls data from Cassandra one row at a time. There are about 200k rows total. The process runs for about an hour and a half and then gives me this memory error.
I've experimented with spark.driver.memory, which is supposed to increase the heap size, but it hasn't helped.
Any ideas? Usage below:
> config <- spark_config()
> config$spark.executor.cores = 1 # this ensures a max of 32 separate executors
> config$spark.cores.max = 26 # this ensures that cassandra gets some resources too, not all to spark
> config$spark.driver.memory = "4G"
> config$spark.driver.memoryOverhead = "10g"
> config$spark.executor.memory = "4G"
> config$spark.executor.memoryOverhead = "1g"
> sc <- spark_connect(master = "spark://master",
+ config = config)
> accounts <- sdf_copy_to(sc, insight %>%
+ # slice(1:100) %>%
+ {.}, "accounts", overwrite=TRUE)
> accounts <- accounts %>% sdf_repartition(78)
> dag <- spark_apply(accounts, get_dates, group_by = c("row"),
+ columns = list(row = "integer",
+ last_update_by = "character",
+ last_end_time = "character",
+ read_val = "numeric",
+ batch_id = "numeric",
+ fail_reason = "character",
+ end_time = "character",
+ meas_type = "character",
+ svcpt_id = "numeric",
+ org_id = "character",
+ last_update_date = "character",
+ validation_status = "character"
+ ))
> peak_usage <- dag %>% collect
Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.spark.sql.execution.SparkPlan$$anon$1.next(SparkPlan.scala:260)
at org.apache.spark.sql.execution.SparkPlan$$anon$1.next(SparkPlan.scala:254)
at scala.collection.Iterator$class.foreach(Iterator.scala:743)
at org.apache.spark.sql.execution.SparkPlan$$anon$1.foreach(SparkPlan.scala:254)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:276)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:275)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2371)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2370)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2778)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.collect(Dataset.scala:2351)
at sparklyr.Utils$.collect(utils.scala:196)
at sparklyr.Utils.collect(utils.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sparklyr.Invoke$.invoke(invoke.scala:102)
at sparklyr.StreamHandler$.handleMethodCall(stream.scala:97)
at sparklyr.StreamHandler$.read(stream.scala:62)
at sparklyr.BackendHandler.channelRead0(handler.scala:52)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)

Maybe I have misread your example, but the memory problem seems to occur when you collect, not when you use spark_apply. Try
config$spark.driver.maxResultSize <- XXX
where XXX is what you expect to need (I have set it to 4G for a similar job). See https://spark.apache.org/docs/latest/configuration.html for further details.

This is a GC problem; maybe you should try configuring your JVM with other arguments. Are you using G1 as your GC?
If you are not able to provide more memory and you have issues with GC collection times, you should try using another JVM (maybe Zing from Azul Systems?).
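For example (a hedged sketch: spark.executor.extraJavaOptions and spark.driver.extraJavaOptions are standard Spark properties, and the flag shown is just one reasonable starting point), you could switch Spark's JVMs to G1 in spark-defaults.conf:
spark.executor.extraJavaOptions  -XX:+UseG1GC
spark.driver.extraJavaOptions    -XX:+UseG1GC
or set the matching entries on the spark_config() object used above.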

I've set the overhead memory needed for spark_apply using spark.yarn.executor.memoryOverhead. I've found that using the by= argument of sdf_repartition is useful, and using the group_by= in spark_apply also helps. The more you are able to split up your data between executors, the better.

Related

Iterative GraphFrames AggregateMessages hitting memory limits

I'm using GraphFrame's aggregateMessages capability to build a custom clustering algorithm. I tested this algorithm on a small sample dataset (~100 items) and verified that it works. But when I run it on my real dataset of 50k items, I get OOM errors after ~10 iterations. Interestingly, the first few iterations are processed in a couple of minutes and memory stays in the normal range. It's after iteration 6 that memory usage creeps to ~30 GB and eventually bombs. I am running this on a 2-node cluster with 16 cores and 32 GB of memory.
Since this is an iterative algorithm, and memory only increases after each iteration, I wonder if I need to release memory somehow. I added unpersist blocks at the end of the loop, but that hasn't helped.
Are there any other efficiencies I could use? Are there best practices for using GraphFrames in an iterative setting?
Another thing I've noticed: on the executors page of the Spark UI, the used "storage memory" is ~300 MB, but the Spark process is in fact taking ~30 GB. Not sure if this is a memory leak!
while (true) {
    System.out.println("[" + new Date() + "] Running " + i);
    Dataset<Row> lastRoutesDs = groups;
    Dataset<Row> groupUnwind = groups.withColumn("id", explode(col("routeItems")));
    GraphFrame gf = new GraphFrame(groupUnwind, edgesDs);
    Dataset<Row> lvl1 = gf.aggregateMessages()
        .sendToSrc(when(
            callUDF("contains_in_array_str", AggregateMessages.dst().getField("routeItems"),
                AggregateMessages.src().getField("id")).equalTo(false),
            struct(AggregateMessages.dst().getField("routeItems").as("routeItems"),
                AggregateMessages.dst().getField("routeScores").as("routeScores"),
                AggregateMessages.dst().getField("grpId").as("grpId"),
                AggregateMessages.dst().getField("grpScore").as("grpScore"),
                AggregateMessages.edge().getField("score").as("edgeScore"))))
        .agg(collect_set(AggregateMessages.msg()).as("incomings"))
        .withColumn("inItem", explode(col("incomings")))
        .groupBy("id", "inItem.grpId")
        .agg(first("inItem.routeItems").as("routeItems"), first("inItem.routeScores").as("routeScores"),
            first("inItem.grpScore").as("grpScore"), collect_list("inItem.edgeScore").as("inScores"))
        .groupBy("grpId")
        .agg(bestRouteAgg.apply(col("routeItems"), col("routeScores"), col("inScores"), col("grpScore"),
            col("id"), col("grpScore")).as("best"))
        .withColumn("newScore", callUDF("calcRouteScores", expr("size(best.routeItems)+1"),
            col("best.routeScores"), col("best.inScores")))
        .withColumn("edgeCount", expr("size(best.routeScores)"))
        .persist(StorageLevel.MEMORY_AND_DISK());
    lvl1
        .filter("newScore > " + groupMaxScore)
        .withColumn("itr", lit(i))
        .select("grpId", "best.routeItems", "best.routeScores", "best.grpScore", "edgeCount", "itr")
        .write()
        .mode(SaveMode.Append)
        .json(workspaceDir + "clusters-rank-collect");
    if (lvl1.count() == 0) {
        System.out.println("****** End reached " + i);
        break;
    }
    Dataset<Row> newGroups = lvl1.filter("newScore <= " + groupMaxScore)
        .withColumn("routeItems_new",
            callUDF("merge2Array", col("best.routeItems"), array(col("best.newNode"))))
        .withColumn("routeScores_new",
            callUDF("merge2ArrayDouble", col("best.routeScores"), col("best.inScores")))
        .select(col("grpId"), col("routeItems_new").as("routeItems"),
            col("routeScores_new").as("routeScores"), col("newScore").as("grpScore"));
    if (i > 0 && (i % 2) == 0) {
        newGroups = newGroups.checkpoint();
    }
    newGroups = newGroups.persist(StorageLevel.DISK_ONLY());
    System.out.println(newGroups.count());
    groups.unpersist();
    lastRoutesDs.unpersist();
    groupUnwind.unpersist();
    lvl1.unpersist();
    groups = newGroups;
    i++;
}
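One pattern worth trying here (a hedged sketch, not a verified fix for the code above): checkpoint every iteration instead of every second one, so the ever-growing query plan is truncated each round, and make the unpersist calls blocking so the old blocks are really dropped before the next iteration allocates new ones. computeNextLevel below is a placeholder for the aggregateMessages pipeline above; the other names mirror the loop.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.storage.StorageLevel;

// Inside the loop body, once the next level has been computed:
Dataset<Row> newGroups = computeNextLevel(groups); // placeholder for the pipeline above
newGroups = newGroups.persist(StorageLevel.DISK_ONLY());
newGroups = newGroups.checkpoint(); // eager: materializes the result and truncates the lineage
// unpersist(true) blocks until the cached blocks are actually freed.
groups.unpersist(true);
lvl1.unpersist(true);
groups = newGroups;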

Sigar ProcCpu gather method always returns 0 for percentage value

I'm using Sigar to try and get the CPU and memory usage of individual processes (under Windows). I am able to get these stats correctly for the system as a whole with the code below:
Sigar sigar = new Sigar();
long totalMemory = sigar.getMem().getTotal() / 1024 /1024;
model.addAttribute("totalMemory", totalMemory);
double usedPercentage = sigar.getMem().getUsedPercent();
model.addAttribute("usedPercentage", String.format( "%.2f", usedPercentage));
double freePercentage = sigar.getMem().getFreePercent();
model.addAttribute("freePercentage", String.format( "%.2f", freePercentage));
double cpuUsedPercentage = sigar.getCpuPerc().getCombined() * 100;
model.addAttribute("cpuUsedPercentage", String.format( "%.2f", cpuUsedPercentage));
This displays the following quite nicely on my web page:
Total System Memory : 16289 MB
Used Memory Percentage : 66.81 %
Free Memory Percentage : 33.19 %
CPU Usage : 30.44 %
Now I'm trying to get info for individual processes, such as Java and SQL Server, and while the memory is gathered correctly, the CPU usage for both processes is ALWAYS 0. Below is the code I'm using:
Sigar sigar = new Sigar();
List<ProcessInfo> processes = new ArrayList<>();
ProcessFinder processFinder = new ProcessFinder(sigar);
long[] javaPIDs = null;
Long sqlPID = null;
try {
    javaPIDs = processFinder.find("Exe.Name.ct=" + "java.exe");
    sqlPID = processFinder.find("Exe.Name.ct=" + "sqlservr.exe")[0];
} catch (Exception ex) {
}
int i = 0;
while (i < javaPIDs.length) {
    Long javaPID = javaPIDs[i];
    ProcessInfo javaProcess = new ProcessInfo();
    javaProcess.setPid(javaPID);
    javaProcess.setName("Java");
    ProcMem javaMem = new ProcMem();
    javaMem.gather(sigar, javaPID);
    javaProcess.setMemoryUsage(javaMem.getResident() / 1024 / 1024);
    MultiProcCpu javaCpu = new MultiProcCpu();
    javaCpu.gather(sigar, javaPID);
    javaProcess.setCpuUsage(String.format("%.2f", javaCpu.getPercent() * 100));
    processes.add(javaProcess);
    i++;
}
if (sqlPID != null) {
    ProcessInfo sqlProcess = new ProcessInfo();
    sqlProcess.setPid(sqlPID);
    sqlProcess.setName("SQL Server");
    ProcMem sqlMem = new ProcMem();
    sqlMem.gather(sigar, sqlPID);
    sqlProcess.setMemoryUsage(sqlMem.getResident() / 1024 / 1024);
    ProcCpu sqlCpu = new MultiProcCpu();
    sqlCpu.gather(sigar, sqlPID);
    sqlProcess.setCpuUsage(String.format("%.2f", sqlCpu.getPercent()));
    processes.add(sqlProcess);
}
model.addAttribute("processes", processes);
I have tried both ProcCpu and MultiProcCpu, and both of them always return 0.0, even though I can see Java using 15% CPU in Task Manager. The documentation on the Sigar library is virtually non-existent, but the research I did tells me that I appear to be doing this correctly.
Does anyone know what I'm doing wrong?
Thanks!
I found the issue while continuing to search online. Basically, the Sigar library can only retrieve correct CPU values after some time has elapsed between samples. My issue was that I was initializing a new Sigar instance every time the page was displayed. I made my Sigar instance global to my Spring controller, and now it returns correct percentages.
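A minimal sketch of that fix (the class and method names are mine; the Sigar calls mirror the question): keep one Sigar instance alive and gather twice with a delay, so getPercent() has an interval to measure over.
import org.hyperic.sigar.ProcCpu;
import org.hyperic.sigar.Sigar;
import org.hyperic.sigar.SigarException;

public class ProcCpuSampler {
    // One long-lived instance (e.g. a Spring singleton) instead of one per request.
    private static final Sigar SIGAR = new Sigar();

    public static double cpuPercent(long pid) throws SigarException, InterruptedException {
        ProcCpu cpu = new ProcCpu();
        cpu.gather(SIGAR, pid); // first sample establishes the baseline (reads 0)
        Thread.sleep(1000);     // give Sigar an interval to measure over
        cpu.gather(SIGAR, pid); // second sample yields a meaningful percentage
        return cpu.getPercent() * 100;
    }
}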

Orientdb - SQL query with millions of vertices causes Java OutOfMemory error

I need to create edges between all vertices of class V1 and all vertices of class V2. My classes have 2-3 million vertices each. A double for loop with a SELECT * FROM V1, SELECT * FROM V2 gives a Java OutOfMemory (heap space) error (see below). This is an offline process that will be performed once or twice if needed (not a frequent operation) as the graph will not be regularly updated by the users, only myself.
How can I do it in batches (using SELECT...LIMIT or g.getVertices()) to avoid this?
Here's my code:
OrientGraphNoTx G = MyOrientDBFactory.getNoTx();
G.setUseLightweightEdges(false);
G.declareIntent(new OIntentMassiveInsert());
for (Vertex p1 : (Iterable<Vertex>) G.command(new OCommandSQL("SELECT * FROM V1")).execute()) {
    for (Vertex p2 : (Iterable<Vertex>) G.command(new OCommandSQL("SELECT * FROM V2")).execute()) {
        if (p1.getProperty("prop1").equals(p2.getProperty("prop1"))) {
            //p1.addEdge("MyEdge", p2);
            G.command(new OCommandSQL("create edge MyEdge from " + p1.getId() + " to " + p2.getId() + " retry 100")).execute();
        }
    }
}
G.shutdown();
OrientDB 2.1.5 with Java/Graph API
NetBeans 8.1 with VM options -Xmx4096m and -Dstorage.diskCache.bufferSize=7200
Error message in console:
2016-05-24 15:48:06:112 INFO {db=MyDB} [TIP] Query 'SELECT * FROM V1' returned a result set with more than 10000 records. Check if you really need all these records, or reduce the resultset by using a LIMIT to improve both performance and used RAM
[OProfilerStub] java.lang.OutOfMemoryError: Java heap space Dumping heap to java_pid7896.hprof ...
Error message in NetBeans output:
Exception in thread "main" com.orientechnologies.orient.enterprise.channel.binary.OResponseProcessingException: Exception during response processing.
at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.throwSerializedException(OChannelBinaryAsynchClient.java:443)
at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.handleStatus(OChannelBinaryAsynchClient.java:398)
at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:282)
at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:171)
at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:2166)
at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:1189)
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.command(OStorageRemoteThread.java:444)
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
at com.tinkerpop.blueprints.impls.orient.OrientGraphCommand.execute(OrientGraphCommand.java:49)
at xx.xxx.xxx.xx.MyEdge.(MyEdge.java:40)
at xx.xxx.xxx.xx.GMain.main(GMain.java:60)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
As a workaround you can use code similar to the following:
Iterable<Vertex> cv1 = g.command(new OCommandSQL("SELECT count(*) FROM V1")).execute();
long counterv1 = cv1.iterator().next().getProperty("count");
int[] ids = g.getRawGraph().getMetadata().getSchema().getClass("V1").getClusterIds();
long repeat = counterv1 / 10000;
long rest = counterv1 - (repeat * 10000);
List<Vertex> v1 = new ArrayList<Vertex>();
int rid = 0;
for (int i = 0; i < repeat; i++) {
    Iterable<Vertex> v = g.command(new OCommandSQL("SELECT * FROM V1 WHERE @rid >= #" + ids[0] + ":" + rid + " limit 10000")).execute();
    CollectionUtils.addAll(v1, v.iterator());
    rid = 10000 * (i + 1);
}
if (rest > 0) {
    Iterable<Vertex> v = g.command(new OCommandSQL("SELECT * FROM V1 WHERE @rid >= #" + ids[0] + ":" + rid + " limit " + rest)).execute();
    CollectionUtils.addAll(v1, v.iterator());
}
Hope it helps.

From where does the JVM take memory for the heap on a 64-bit OS?

I'm getting a 104 GB heap using the options -Xms500m -Xmx139000m.
I'm using a Core i5 processor on 64-bit Windows 7, with a 500 GB hard disk and only 4 GB of RAM.
I just want to know where the JVM gets 104 GB of heap memory from.
In the output, no memory usage is displayed.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;

public class CPUusage {
    public static void main(String[] args) throws Exception {
        int mb = 1024 * 1024;
        int GB = 1024 * 1024 * 1024;
        /* Total number of processors or cores available to the JVM */
        System.out.println("Available processors (cores): " + Runtime.getRuntime().availableProcessors());
        /* Total amount of free memory available to the JVM */
        System.out.println("Free memory (MB): " + Runtime.getRuntime().freeMemory() / mb);
        /* This will return Long.MAX_VALUE if there is no preset limit */
        long maxMemory = Runtime.getRuntime().maxMemory() / GB;
        /* Maximum amount of memory the JVM will attempt to use */
        System.out.println("Maximum memory (GB): " + maxMemory);
        /* Total memory currently in use by the JVM */
        System.out.println("Total memory (MB): " + Runtime.getRuntime().totalMemory() / mb);
        /* Write the same information to a log file */
        File log = new File("D:\\log.txt");
        log.createNewFile();
        FileWriter fstream = new FileWriter(log);
        BufferedWriter out = new BufferedWriter(fstream);
        out.write("--------------------------------------------" + "\n\n");
        out.write("Available processors (cores): " + Runtime.getRuntime().availableProcessors());
        out.newLine();
        out.write("Free memory (MB): " + Runtime.getRuntime().freeMemory() / mb);
        out.newLine();
        out.write("Maximum memory (GB): " + (maxMemory == Long.MAX_VALUE ? "no limit" : maxMemory));
        out.newLine();
        out.write("Total memory (MB): " + Runtime.getRuntime().totalMemory() / mb);
        out.newLine();
        /* Get a list of all filesystem roots on this system */
        File[] roots = File.listRoots();
        /* For each filesystem root, print some info */
        for (File root : roots) {
            System.out.println("-------------------------------------------");
            System.out.println("File system root: " + root.getAbsolutePath());
            System.out.println("Total space (GB): " + root.getTotalSpace() / GB);
            System.out.println("Free space (GB): " + root.getFreeSpace() / GB);
            System.out.println("Usable space (GB): " + root.getUsableSpace() / GB);
            out.write("-------------------------------------------");
            out.newLine();
            out.write("File system root: " + root.getAbsolutePath());
            out.newLine();
            out.write("Total space (GB): " + root.getTotalSpace() / GB);
            out.newLine();
            out.write("Free space (GB): " + root.getFreeSpace() / GB);
            out.newLine();
            out.write("Usable space (GB): " + root.getUsableSpace() / GB);
            out.newLine();
        }
        out.write("-------------------------------------------");
        out.newLine();
        out.close();
    }
}
And the output is:
Available processors (cores): 4
Free memory (MB): 476
Maximum memory (GB): 104
Total memory (MB): 479
-------------------------------------------
File system root: C:\
Total space (GB): 97
Free space (GB): 70
Usable space (GB): 70
-------------------------------------------
File system root: D:\
Total space (GB): 368
Free space (GB): 366
Usable space (GB): 366
The options "-Xms500m -Xmx139000m" mean "allocate an initial heap size of 500Mb, and let it grow to a maximum of 139GB ... if it needs to".
The output you are seeing from your program is entirely consistent with that. At the point the program ran, the heap had not reached 139Gb. And it might never reach that level. And it may not even be able to reach that level ... depending on the resources that the operating system is able to give the JVM if / when it asks for them.
If you really want to force the JVM to use a 139Gb heap, you should try setting 139Gb as the initial heap size too; e.g. "-Xms139000m -Xmx139000m". But that's probably not a good idea, especially if you don't have that much physical RAM.
Virtual memory means RAM is allocated on usage, not when the address space is reserved. If you actually use the 104GB, the OS will use swap (a file or partition on disk) to maintain the illusion of more RAM than you physically have.
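You can watch this happen with a small sketch (a hedged example; run it with, say, -Xms500m -Xmx2g): the committed heap reported by totalMemory() only grows as memory is actually allocated, even though -Xmx reserved far more address space up front.
import java.util.ArrayList;
import java.util.List;

public class HeapGrowth {
    public static void main(String[] args) {
        int mb = 1024 * 1024;
        Runtime rt = Runtime.getRuntime();
        List<byte[]> hold = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            hold.add(new byte[100 * mb]); // actually touch another 100 MB
            System.out.println("Committed (MB): " + rt.totalMemory() / mb
                    + ", max (MB): " + rt.maxMemory() / mb);
        }
    }
}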

Get OS-level system information

I'm currently building a Java app that could end up being run on many different platforms, but primarily variants of Solaris, Linux and Windows.
Has anyone been able to successfully extract information such as the current disk space used, CPU utilisation and memory used in the underlying OS? What about just what the Java app itself is consuming?
Preferably, I'd like to get this information without using JNI.
You can get some limited memory information from the Runtime class. It really isn't exactly what you are looking for, but I thought I would provide it for the sake of completeness. Here is a small example. Edit: You can also get disk usage information from the java.io.File class. The disk space usage stuff requires Java 1.6 or higher.
import java.io.File;

public class Main {
    public static void main(String[] args) {
        /* Total number of processors or cores available to the JVM */
        System.out.println("Available processors (cores): " +
            Runtime.getRuntime().availableProcessors());
        /* Total amount of free memory available to the JVM */
        System.out.println("Free memory (bytes): " +
            Runtime.getRuntime().freeMemory());
        /* This will return Long.MAX_VALUE if there is no preset limit */
        long maxMemory = Runtime.getRuntime().maxMemory();
        /* Maximum amount of memory the JVM will attempt to use */
        System.out.println("Maximum memory (bytes): " +
            (maxMemory == Long.MAX_VALUE ? "no limit" : maxMemory));
        /* Total memory currently available to the JVM */
        System.out.println("Total memory available to JVM (bytes): " +
            Runtime.getRuntime().totalMemory());
        /* Get a list of all filesystem roots on this system */
        File[] roots = File.listRoots();
        /* For each filesystem root, print some info */
        for (File root : roots) {
            System.out.println("File system root: " + root.getAbsolutePath());
            System.out.println("Total space (bytes): " + root.getTotalSpace());
            System.out.println("Free space (bytes): " + root.getFreeSpace());
            System.out.println("Usable space (bytes): " + root.getUsableSpace());
        }
    }
}
The java.lang.management package does give you a whole lot more info than Runtime - for example it will give you heap memory (ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()) separate from non-heap memory (ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()).
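For instance, a minimal sketch of those two calls:
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryDemo {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        System.out.println("Heap used/committed/max (bytes): "
                + heap.getUsed() + " / " + heap.getCommitted() + " / " + heap.getMax());
        System.out.println("Non-heap used/committed (bytes): "
                + nonHeap.getUsed() + " / " + nonHeap.getCommitted());
    }
}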
You can also get process CPU usage (without writing your own JNI code), but you need to cast the java.lang.management.OperatingSystemMXBean to a com.sun.management.OperatingSystemMXBean. This works on Windows and Linux, I haven't tested it elsewhere.
For example ... call the getCpuUsage() method more frequently to get more accurate readings.
import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

public class PerformanceMonitor {
    private int availableProcessors = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();
    private long lastSystemTime = 0;
    private long lastProcessCpuTime = 0;

    public synchronized double getCpuUsage() {
        if (lastSystemTime == 0) {
            baselineCounters();
            return 0.0; // no interval to measure yet
        }
        long systemTime = System.nanoTime();
        long processCpuTime = 0;
        if (ManagementFactory.getOperatingSystemMXBean() instanceof OperatingSystemMXBean) {
            processCpuTime = ((OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean()).getProcessCpuTime();
        }
        double cpuUsage = (double) (processCpuTime - lastProcessCpuTime) / (systemTime - lastSystemTime);
        lastSystemTime = systemTime;
        lastProcessCpuTime = processCpuTime;
        return cpuUsage / availableProcessors;
    }

    private void baselineCounters() {
        lastSystemTime = System.nanoTime();
        if (ManagementFactory.getOperatingSystemMXBean() instanceof OperatingSystemMXBean) {
            lastProcessCpuTime = ((OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean()).getProcessCpuTime();
        }
    }
}
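A hypothetical polling loop around it (the one-second interval is arbitrary, and the first call only establishes the baseline):
// Assumes the enclosing method declares throws InterruptedException.
PerformanceMonitor monitor = new PerformanceMonitor();
monitor.getCpuUsage(); // baseline sample, returns 0.0
while (true) {
    Thread.sleep(1000);
    System.out.printf("Process CPU: %.1f%%%n", monitor.getCpuUsage() * 100);
}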
I think the best method out there is to use the SIGAR API by Hyperic. It works on most of the major operating systems (darn near anything modern) and is very easy to work with. The developer(s) are very responsive on their forum and mailing lists. I also like that it is Apache licensed. They provide a ton of examples in Java too!
SIGAR == System Information Gatherer And Reporter.
There's a Java project that uses JNA (so no native libraries to install) and is in active development. It currently supports Linux, OSX, Windows, Solaris and FreeBSD and provides RAM, CPU, Battery and file system information.
https://github.com/oshi/oshi
For Windows, I went this way:
com.sun.management.OperatingSystemMXBean os = (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
long physicalMemorySize = os.getTotalPhysicalMemorySize();
long freePhysicalMemory = os.getFreePhysicalMemorySize();
long freeSwapSize = os.getFreeSwapSpaceSize();
long commitedVirtualMemorySize = os.getCommittedVirtualMemorySize();
Here is the link with details.
You can get some system-level information by using System.getenv(), passing the relevant environment variable name as a parameter. For example, on Windows:
System.getenv("PROCESSOR_IDENTIFIER")
System.getenv("PROCESSOR_ARCHITECTURE")
System.getenv("PROCESSOR_ARCHITEW6432")
System.getenv("NUMBER_OF_PROCESSORS")
For other operating systems the presence/absence and names of the relevant environment variables will differ.
Add the OSHI dependency via Maven:
<dependency>
<groupId>com.github.dblock</groupId>
<artifactId>oshi-core</artifactId>
<version>2.2</version>
</dependency>
Get the remaining battery capacity as a percentage:
SystemInfo si = new SystemInfo();
HardwareAbstractionLayer hal = si.getHardware();
for (PowerSource pSource : hal.getPowerSources()) {
System.out.println(String.format("%n %s # %.1f%%", pSource.getName(), pSource.getRemainingCapacity() * 100d));
}
Have a look at the APIs available in the java.lang.management package. For example:
OperatingSystemMXBean.getSystemLoadAverage()
ThreadMXBean.getCurrentThreadCpuTime()
ThreadMXBean.getCurrentThreadUserTime()
There are loads of other useful things in there as well.
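A minimal sketch pulling those three calls together:
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

public class MxBeanDemo {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // getSystemLoadAverage() returns -1.0 where unavailable (e.g. on Windows).
        System.out.println("System load average: " + os.getSystemLoadAverage());
        System.out.println("Current thread CPU time (ns): " + threads.getCurrentThreadCpuTime());
        System.out.println("Current thread user time (ns): " + threads.getCurrentThreadUserTime());
    }
}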
Usually, to get low-level OS information you can call OS-specific commands which give you the information you want with Runtime.exec(), or read files such as /proc/* on Linux.
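For example, a Linux-only sketch that reads /proc/meminfo directly (the two field names shown exist on any reasonably modern kernel):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ProcMemInfo {
    public static void main(String[] args) throws IOException {
        // Each line looks like "MemTotal:       16333852 kB".
        try (Stream<String> lines = Files.lines(Paths.get("/proc/meminfo"))) {
            lines.filter(l -> l.startsWith("MemTotal") || l.startsWith("MemAvailable"))
                 .forEach(System.out::println);
        }
    }
}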
CPU usage isn't straightforward. java.lang.management via com.sun.management.OperatingSystemMXBean.getProcessCpuTime comes close (see Patrick's excellent code snippet above), but note that it only gives access to the time the CPU spent in your process. It won't tell you about CPU time spent in other processes, or even CPU time spent doing system activities related to your process.
For instance, I have a network-intensive Java process; it's the only thing running, and the CPU is at 99%, but only 55% of that is reported as "processor CPU".
Don't even get me started on "load average", as it's next to useless, despite being the only CPU-related item on the MX bean. If only Sun in their occasional wisdom exposed something like "getTotalCpuTime"...
For serious CPU monitoring, SIGAR, mentioned by Matt, seems the best bet.
On Windows, you can run the systeminfo command and retrieve its output, for instance with the following code:
// Requires: import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader;
private static class WindowsSystemInformation {
    static String get() throws IOException {
        Runtime runtime = Runtime.getRuntime();
        Process process = runtime.exec("systeminfo");
        BufferedReader systemInformationReader = new BufferedReader(new InputStreamReader(process.getInputStream()));
        StringBuilder stringBuilder = new StringBuilder();
        String line;
        while ((line = systemInformationReader.readLine()) != null) {
            stringBuilder.append(line);
            stringBuilder.append(System.lineSeparator());
        }
        return stringBuilder.toString().trim();
    }
}
If you are using the JRockit VM, then here is another way of getting VM CPU usage. The Runtime bean can also give you the CPU load per processor. I have used this only on Red Hat Linux to observe Tomcat performance. You have to enable JMX remote in catalina.sh for this to work.
JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://my.tomcat.host:8080/jmxrmi");
JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
MBeanServerConnection conn = jmxc.getMBeanServerConnection();
ObjectName name = new ObjectName("oracle.jrockit.management:type=Runtime");
Double jvmCpuLoad =(Double)conn.getAttribute(name, "VMGeneratedCPULoad");
It is still under development, but you can already use jHardware.
It is a simple library that scrapes system data using Java. It works on both Linux and Windows.
ProcessorInfo info = HardwareInfo.getProcessorInfo();
//Get named info
System.out.println("Cache size: " + info.getCacheSize());
System.out.println("Family: " + info.getFamily());
System.out.println("Speed (Mhz): " + info.getMhz());
//[...]
Here is a simple way to get OS-level information; I tested it on my Mac and it works well:
OperatingSystemMXBean osBean =
(OperatingSystemMXBean)ManagementFactory.getOperatingSystemMXBean();
return osBean.getProcessCpuLoad();
You can find many relevant metrics of the operating system here
To get the system load average over the last 1, 5 and 15 minutes inside Java code, you can execute the command cat /proc/loadavg and interpret it as below:
Runtime runtime = Runtime.getRuntime();
BufferedReader br = new BufferedReader(
new InputStreamReader(runtime.exec("cat /proc/loadavg").getInputStream()));
String avgLine = br.readLine();
System.out.println(avgLine);
List<String> avgLineList = Arrays.asList(avgLine.split("\\s+"));
System.out.println(avgLineList);
System.out.println("Average load 1 minute : " + avgLineList.get(0));
System.out.println("Average load 5 minutes : " + avgLineList.get(1));
System.out.println("Average load 15 minutes : " + avgLineList.get(2));
And to get the physical system memory, execute the command free -m and interpret it as below:
Runtime runtime = Runtime.getRuntime();
BufferedReader br = new BufferedReader(
new InputStreamReader(runtime.exec("free -m").getInputStream()));
String line;
String memLine = "";
int index = 0;
while ((line = br.readLine()) != null) {
if (index == 1) {
memLine = line;
}
index++;
}
// total used free shared buff/cache available
// Mem: 15933 3153 9683 310 3097 12148
// Swap: 3814 0 3814
List<String> memInfoList = Arrays.asList(memLine.split("\\s+"));
int totalSystemMemory = Integer.parseInt(memInfoList.get(1));
int totalSystemUsedMemory = Integer.parseInt(memInfoList.get(2));
int totalSystemFreeMemory = Integer.parseInt(memInfoList.get(3));
System.out.println("Total system memory in mb: " + totalSystemMemory);
System.out.println("Total system used memory in mb: " + totalSystemUsedMemory);
System.out.println("Total system free memory in mb: " + totalSystemFreeMemory);
You can do this with Java/COM integration. By accessing WMI features you can get all the information.
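If you don't want a full COM bridge, one hedged approximation is to shell out to WMI's command-line client (wmic ships with Windows, though it is deprecated on recent versions):
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class WmiQuery {
    public static void main(String[] args) throws IOException {
        // Ask WMI for the total physical memory in bytes.
        Process p = Runtime.getRuntime().exec("wmic computersystem get TotalPhysicalMemory");
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    System.out.println(line.trim());
                }
            }
        }
    }
}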
Not exactly what you asked for, but I'd recommend checking out ArchUtils and SystemUtils from commons-lang3. These also contain some relevant helper facilities, e.g.:
import static org.apache.commons.lang3.ArchUtils.*;
import static org.apache.commons.lang3.SystemUtils.*;
System.out.printf("OS architecture: %s\n", OS_ARCH); // OS architecture: amd64
System.out.printf("OS name: %s\n", OS_NAME); // OS name: Linux
System.out.printf("OS version: %s\n", OS_VERSION); // OS version: 5.18.16-200.fc36.x86_64
System.out.printf("Is Linux? - %b\n", IS_OS_LINUX); // Is Linux? - true
System.out.printf("Is Mac? - %b\n", IS_OS_MAC); // Is Mac? - false
System.out.printf("Is Windows? - %b\n", IS_OS_WINDOWS); // Is Windows? - false
System.out.printf("JVM name: %s\n", JAVA_VM_NAME); // JVM name: Java HotSpot(TM) 64-Bit Server VM
System.out.printf("JVM vendor: %s\n", JAVA_VM_VENDOR); // JVM vendor: Oracle Corporation
System.out.printf("JVM version: %s\n", JAVA_VM_VERSION); // JVM version: 11.0.12+8-LTS-237
System.out.printf("Username: %s\n", getUserName()); // Username: johndoe
System.out.printf("Hostname: %s\n", getHostName()); // Hostname: garage-pc
var processor = getProcessor();
System.out.printf("CPU arch: %s\n", processor.getArch()) // CPU arch: BIT_64
System.out.printf("CPU type: %s\n", processor.getType()); // CPU type: X86
