Neo4j import tool - OutOfMemory error: GC overhead limit exceeded - java

I am using the neo4j-import tool (Windows) to import ~1 million nodes with ~20 million relationships, all of which should be unique. The process proceeds smoothly until it gets to the "Relationship counts" task, where it loads all the way up to 20M (seemingly all of the relationships) but then hangs for a while (30 minutes to 1 hour), eventually returning "java.lang.OutOfMemoryError: GC overhead limit exceeded".
I have loaded large graph databases successfully before (39M nodes, 21M relationships) so I'm not sure what the issue is. Is it because the graph database is more densely connected compared to the previous database that I loaded?
Or, could there be a memory leak? In my task manager, the Java Platform SE Binary process requires an increasingly large amount of memory (up to 12-13GB out of 16GB of RAM) as the import loads, especially towards the end. This seems suspiciously large, especially since the 39M node/21M relationship graph database imported successfully and relatively quickly with the same tool (it didn't hang at the relationship counts step).
Any thoughts as to what could be going wrong? Thanks in advance!
If it helps to look at my nodes/relationships files, here is a link to them:
https://drive.google.com/open?id=0Bw7N-SlJA3ZCei0ycEhoa2YwNUU
Here is the neo4j shell output:
C:Users\Username\Documents\Neo4j>neo4jImport -into graphDB1.graphdb --nodes D:\concept.csv --relationships D:\predicate.csv --stacktrace --idtype integer
WARNING! This batch script has been deprecated. Please use the provided PowerShell scripts instead: http://neo4j.com/docs/stable/powershell.html
The system cannot find the path specified.
Importing the contents of these files into graphDB1.graphdb:
Nodes:
D:\concept.csv
Relationships:
D:\predicate.csv
Available memory:
Free machine memory: 13.50 GB
Max heap memory : 12.75 GB
Nodes
[>:|PR|NOD|*LABEL SCAN---------------------------------|v:6.79 MB/s----------------------------] 1M
Done in 40s 562ms
Prepare node index
[*DETECT:20.37 MB------------------------------------------------------------------------------] 1M
Done in 802ms
Calculate dense nodes
[*>:59.38 MB/s----------------------------------|PREPARE(3)====================================] 20M
Done in 12s 566ms
Relationships
[>:2.01 |PREPARE-----------|P|RELATIONSHI|*v:4.05 MB/s-----------------------------------------] 20M
Done in 6m 3s 655ms
Node --> Relationship
[>:3.19 MB/s--------------------------|L|*v:2.39 MB/s------------------------------------------] 1M
Done in 8s 421ms
Relationship --> Relationship
[*>:6.82 MB/s--------------------------------------|LINK-----------|v:6.82 MB/s----------------] 20M
Done in 1m 36s 849ms
Node counts
[*COUNT:91.55 MB-------------------------------------------------------------------------------] 1M
Done in 3m 35s 21ms
Relationship counts
[*>:8.62 MB/s-----------------------------------------------------------|COUNT-----------------] 20MException in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Unknown Source)
at java.util.ArrayList.toArray(Unknown Source)
at java.util.ArrayList.<init>(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.stats.StepStats.<init>(StepStats.java:39)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.stats(AbstractStep.java:220)
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution$1.compare(StageExecution.java:123)
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution$1.compare(StageExecution.java:118)
at java.util.TimSort.countRunAndMakeAscending(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.Arrays.sort(Unknown Source)
at java.util.Collections.sort(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stepsOrderedBy(StageExecution.java:117)
at org.neo4j.unsafe.impl.batchimport.staging.DynamicProcessorAssigner.assignProcessorsToPotentialBottleNeck(DynamicProcessorAssigner.java:94)
at org.neo4j.unsafe.impl.batchimport.staging.DynamicProcessorAssigner.check(DynamicProcessorAssigner.java:81)
at org.neo4j.unsafe.impl.batchimport.staging.MultiExecutionMonitor.check(MultiExecutionMonitor.java:106)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:65)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseExecution(ExecutionSupervisors.java:80)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:224)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:185)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:363)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:279)
UPDATE 1:
Here is the thread dump at the moment(s) that the import hangs at relationship counts:
2016-02-17 08:28:12
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
"MuninnPageCache[1]-FlushTask" daemon prio=6 tid=0x0000000026855800 nid=0xfe0 waiting on condition [0x00000000288fe000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000004c0189810> (a org.neo4j.io.pagecache.impl.muninn.MuninnPageCache)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.continuouslyFlushPages(MuninnPageCache.java:909)
at org.neo4j.io.pagecache.impl.muninn.FlushTask.run(FlushTask.java:36)
at org.neo4j.io.pagecache.impl.muninn.BackgroundTask.run(BackgroundTask.java:45)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
"MuninnPageCache[1]-EvictionTask" daemon prio=6 tid=0x0000000026904000 nid=0x3bd4 runnable [0x00000000287fe000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000004c0189810> (a org.neo4j.io.pagecache.impl.muninn.MuninnPageCache)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.parkEvictor(MuninnPageCache.java:697)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.parkUntilEvictionRequired(MuninnPageCache.java:751)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.continuouslySweepPages(MuninnPageCache.java:732)
at org.neo4j.io.pagecache.impl.muninn.EvictionTask.run(EvictionTask.java:39)
at org.neo4j.io.pagecache.impl.muninn.BackgroundTask.run(BackgroundTask.java:45)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
"Service Thread" daemon prio=6 tid=0x0000000024ee8000 nid=0x301c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x0000000024ee6000 nid=0x3060 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x0000000024ee2800 nid=0x2198 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Attach Listener" daemon prio=10 tid=0x0000000024ee2000 nid=0x1ae4 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x0000000024ee1000 nid=0x135c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=8 tid=0x0000000024ed9000 nid=0x3480 in Object.wait() [0x00000000278ff000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000004c000d4b0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
- locked <0x00000004c000d4b0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)
"Reference Handler" daemon prio=10 tid=0x0000000024ed8000 nid=0x1ae8 in Object.wait() [0x00000000277ff000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000004c000d300> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
- locked <0x00000004c000d300> (a java.lang.ref.Reference$Lock)
"main" prio=6 tid=0x00000000023c2800 nid=0x2e7c waiting on condition [0x00000000023bf000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.neo4j.io.fs.FileUtils.waitAndThenTriggerGC(FileUtils.java:253)
at org.neo4j.io.fs.FileUtils.deleteFile(FileUtils.java:110)
at org.neo4j.io.fs.DefaultFileSystemAbstraction.deleteFile(DefaultFileSystemAbstraction.java:127)
at org.neo4j.kernel.impl.storemigration.FileOperation$3.perform(FileOperation.java:93)
at org.neo4j.kernel.impl.storemigration.StoreFile.fileOperation(StoreFile.java:267)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:389)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:279)
"VM Thread" prio=10 tid=0x0000000024ed1800 nid=0x3058 runnable
"GC task thread#0 (ParallelGC)" prio=6 tid=0x00000000023d7000 nid=0x313c runnable
"GC task thread#1 (ParallelGC)" prio=6 tid=0x00000000023d9000 nid=0x3144 runnable
"GC task thread#2 (ParallelGC)" prio=6 tid=0x00000000023da800 nid=0x974 runnable
"GC task thread#3 (ParallelGC)" prio=6 tid=0x00000000023dc000 nid=0x3a3c runnable
"GC task thread#4 (ParallelGC)" prio=6 tid=0x00000000023de800 nid=0x3684 runnable
"GC task thread#5 (ParallelGC)" prio=6 tid=0x00000000023e1000 nid=0x35b8 runnable
"GC task thread#6 (ParallelGC)" prio=6 tid=0x00000000023e4000 nid=0x3950 runnable
"GC task thread#7 (ParallelGC)" prio=6 tid=0x00000000023e5800 nid=0x318c runnable
"GC task thread#8 (ParallelGC)" prio=6 tid=0x00000000023e8800 nid=0x30b8 runnable
"GC task thread#9 (ParallelGC)" prio=6 tid=0x00000000023e9800 nid=0x32dc runnable
"VM Periodic Task Thread" prio=10 tid=0x0000000024eed800 nid=0x3710 waiting on condition
JNI global references: 377
Heap
PSYoungGen total 2071552K, used 0K [0x0000000780000000, 0x0000000800000000, 0x0000000800000000)
eden space 2043904K, 0% used [0x0000000780000000,0x0000000780000000,0x00000007fcc00000)
from space 27648K, 0% used [0x00000007fe500000,0x00000007fe500000,0x0000000800000000)
to space 25600K, 0% used [0x00000007fcc00000,0x00000007fcc00000,0x00000007fe500000)
ParOldGen total 11534336K, used 10982258K [0x00000004c0000000, 0x0000000780000000, 0x0000000780000000)
object space 11534336K, 95% used [0x00000004c0000000,0x000000075e4dcb50,0x0000000780000000)
PSPermGen total 21504K, used 13521K [0x00000004bae00000, 0x00000004bc300000, 0x00000004c0000000)
object space 21504K, 62% used [0x00000004bae00000,0x00000004bbb34588,0x00000004bc300000)
2016-02-17 08:28:20

That is very strange on such a small dataset. How many unique relationships and labels do you expect there to be in this dataset? Also can you provide a thread dump some way into that pause when it happens?
EDIT: The problem was that a column containing property values was used as the LABEL column. This mistakenly produced an enormous number of distinct labels, and the counting step doesn't scale with that.
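One way to catch this kind of mistake before running the import is to count how many distinct values the column used as LABEL actually contains. The following is a minimal sketch, not part of the import tool: the class name, the sample header, and the sample rows are made up for illustration, and the naive split on the delimiter assumes the file has no quoted fields containing the delimiter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper: counts distinct values in one CSV column.
class LabelCardinalityCheck {

    /** Returns the number of distinct non-empty values in the zero-based column. */
    public static int countDistinct(BufferedReader reader, int columnIndex, String delimiter)
            throws IOException {
        Set<String> distinct = new HashSet<>();
        reader.readLine(); // skip the header line
        String line;
        while ((line = reader.readLine()) != null) {
            // Naive split: assumes no quoted fields that contain the delimiter.
            String[] fields = line.split(Pattern.quote(delimiter), -1);
            if (columnIndex < fields.length && !fields[columnIndex].isEmpty()) {
                distinct.add(fields[columnIndex]);
            }
        }
        return distinct.size();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample in which a property value ended up in the label column.
        String sample = "conceptId:ID,name,:LABEL\n"
                + "1,aspirin,aspirin\n"
                + "2,ibuprofen,ibuprofen\n"
                + "3,fever,fever\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            int distinctLabels = countDistinct(reader, 2, ",");
            System.out.println("Distinct values in label column: " + distinctLabels);
        }
    }
}
```

If the count is close to the number of rows, the column is most likely a property rather than a label. For a real file, the StringReader would be replaced by a reader over the nodes file.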

Related

Flyway regularly hangs (MariaDB connector, RDS)

I've been seeing frequent hangs on deployment, at the migration step. Java/Scala application packaged in WAR for Tomcat. Database is RDS Aurora using MariaDB connector (https://downloads.mariadb.org/connector-java/).
This probably has nothing to do with Flyway and is a generic problem with getting a connection.
Migration is run from shell in container:
java -cp `echo WEB-INF/lib/*|tr ' ' :` foo.Migrate
Migration code looks like:
def main(args: Array[String]): Unit = {
  Environment.dbFlywayPassword.foreach { pass =>
    val flyway = new Flyway
    flyway.setDataSource(Environment.jdbcUrl, "flyway", pass)
    flyway.migrate
  }
}
Connection string:
jdbc:mysql:aurora://%RDS_HOST%/xxx?serverSslCert=/rds-ca-2015-root.pem&useSSL=true&connectTimeout=10000
I've tried increasing logging level in Flyway, but nothing is logged after this line:
15:57:35.115 [main] INFO o.f.c.internal.util.VersionPrinter - Flyway 4.2.0 by Boxfuse
So I got a thread dump, that looks like this:
15:57:35.115 [main] INFO o.f.c.internal.util.VersionPrinter - Flyway 4.2.0 by Boxfuse
2017-06-08 15:57:56
Full thread dump OpenJDK 64-Bit Server VM (25.121-b13 mixed mode):
"MariaDb-failover-1" #8 daemon prio=5 os_prio=0 tid=0x00005555f80ae000 nid=0x14 waiting on condition [0x00007fc330b8f000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000f5c59b10> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00005555f70bf000 nid=0x12 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00005555f7063000 nid=0x11 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00005555f7060800 nid=0x10 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00005555f705e800 nid=0xf waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00005555f702f000 nid=0xe in Object.wait() [0x00007fc331616000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f5a30c58> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000f5a30c58> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00005555f702c000 nid=0xd in Object.wait() [0x00007fc331717000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f5a30e10> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000f5a30e10> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00005555f6f85000 nid=0xb runnable [0x00007fc34341d000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
- locked <0x00000000f0a66090> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x00000000f0a81eb0> (a sun.security.ssl.AppInputStream)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x00000000f06b5118> (a java.io.BufferedInputStream)
at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:125)
at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacket(StandardPacketInputStream.java:95)
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1002)
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:982)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.readRequestSessionVariables(AbstractConnectProtocol.java:498)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.readPipelineAdditionalData(AbstractConnectProtocol.java:544)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:410)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:357)
at org.mariadb.jdbc.internal.protocol.AuroraProtocol.loop(AuroraProtocol.java:149)
at org.mariadb.jdbc.internal.failover.impl.AuroraListener.reconnectFailedConnection(AuroraListener.java:179)
at org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.initializeConnection(MastersSlavesListener.java:154)
at org.mariadb.jdbc.internal.failover.FailoverProxy.<init>(FailoverProxy.java:94)
at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:464)
at org.mariadb.jdbc.Driver.connect(Driver.java:103)
at org.flywaydb.core.internal.util.jdbc.DriverDataSource.getConnectionFromDriver(DriverDataSource.java:416)
at org.flywaydb.core.internal.util.jdbc.DriverDataSource.getConnection(DriverDataSource.java:381)
at org.flywaydb.core.internal.util.jdbc.JdbcUtils.openConnection(JdbcUtils.java:51)
at org.flywaydb.core.Flyway.execute(Flyway.java:1418)
at org.flywaydb.core.Flyway.migrate(Flyway.java:971)
at tgam.service.data.Migrate$.$anonfun$main$1(Migrate.scala:11)
at foo.Migrate$.$anonfun$main$1$adapted(Migrate.scala:8)
at foo.Migrate$$$Lambda$4/458209687.apply(Unknown Source)
at scala.Option.foreach(Option.scala:257)
at foo.Migrate$.main(Migrate.scala:8)
at foo.Migrate.main(Migrate.scala)
"VM Thread" os_prio=0 tid=0x00005555f7022000 nid=0xc runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00005555f70da800 nid=0x13 waiting on condition
JNI global references: 232
Heap
def new generation total 4928K, used 1611K [0x00000000f0600000, 0x00000000f0b50000, 0x00000000f5950000)
eden space 4416K, 24% used [0x00000000f0600000, 0x00000000f0712c10, 0x00000000f0a50000)
from space 512K, 100% used [0x00000000f0a50000, 0x00000000f0ad0000, 0x00000000f0ad0000)
to space 512K, 0% used [0x00000000f0ad0000, 0x00000000f0ad0000, 0x00000000f0b50000)
tenured generation total 10944K, used 4187K [0x00000000f5950000, 0x00000000f6400000, 0x0000000100000000)
the space 10944K, 38% used [0x00000000f5950000, 0x00000000f5d66e78, 0x00000000f5d67000, 0x00000000f6400000)
Metaspace used 13061K, capacity 13306K, committed 13568K, reserved 1060864K
class space used 1381K, capacity 1449K, committed 1536K, reserved 1048576K
Looks like an I/O hang in org.mariadb.jdbc.Driver.connect, but I do have a connectTimeout set (10 seconds). This timeout doesn't seem to be effective (would I need a socketTimeout per https://github.com/brettwooldridge/HikariCP/issues/754 ?)
This has been happening for a while. The same thing happened when I was using Tomcat's contextInitialized hook to do migrations. I decided to refactor out into a separate invocation before starting Tomcat, which looks like a better idea in general, but it hasn't affected this behaviour.
What typically happens is that the code hangs, ECS times out after 2-3 minutes and triggers a redeploy. After some number of these retries (e.g. up to 10), Flyway runs successfully and the service starts.
OK, seems this is a known bug in RDS Aurora - and it's alluded to in the MariaDB Connector doc (surely it should be a runtime warning, though!)
https://mariadb.com/kb/en/mariadb/about-mariadb-connector-j/#infrequently-used
usePipelineAuth: Not compatible with aurora. During connection, different queries are executed. When option is active those queries are send using pipeline (all queries are send, then only all results are reads), permitting faster connection creation. Default: true. Since 1.6.0
Also kudos to wlad_ in Freenode #maria who pointed me in the right direction.
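For anyone applying this, a minimal sketch of adding the option to the connection string follows. It only builds the URL string and does not open a connection. The helper class, the host name, and the socketTimeout value are illustrative assumptions; usePipelineAuth is the option quoted above, and socketTimeout is the option I asked about in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: appends driver options to a JDBC URL.
class AuroraJdbcUrl {

    /** Appends each option as key=value, using '?' or '&' as appropriate. */
    public static String withOptions(String baseUrl, Map<String, String> options) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = baseUrl.contains("?") ? '&' : '?';
        for (Map.Entry<String, String> option : options.entrySet()) {
            url.append(separator).append(option.getKey()).append('=').append(option.getValue());
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Disable pipelined authentication, which the doc lists as not compatible with Aurora.
        options.put("usePipelineAuth", "false");
        // Assumed value: bound socket reads so a stalled connect cannot hang forever.
        options.put("socketTimeout", "30000");

        // Placeholder host; substitute the real RDS endpoint.
        String base = "jdbc:mysql:aurora://db.example.invalid/xxx?useSSL=true&connectTimeout=10000";
        System.out.println(withOptions(base, options));
    }
}
```

The resulting string can be passed to flyway.setDataSource in place of the original URL.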

why does trying to attach to myself crash my vm?

Running the following code works fine on macOS (Sun JDK 1.8.0_51), but on CentOS (Sun JDK 1.8.0_101) it crashes every time with a thread dump when it tries to attach to itself, while it succeeds in attaching to any other visible VM:
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;

public class Test {
    public static void main(String[] args) throws Exception {
        // Try to attach to every visible JVM, including this one.
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            VirtualMachine vm = VirtualMachine.attach(vmd);
        }
    }
}
the thread dump:
"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007fbb780bf000 nid=0x28a0 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fbb780bc000 nid=0x289f runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fbb780ba000 nid=0x289e runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fbb780b7000 nid=0x289d runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fbb780b6000 nid=0x289c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fbb78082800 nid=0x289b in Object.wait() [0x00007fbb62eed000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000d6608ee0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000d6608ee0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fbb7807e000 nid=0x289a in Object.wait() [0x00007fbb62fee000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000d6606b50> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000d6606b50> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00007fbb78008000 nid=0x2894 waiting on condition [0x00007fbb7f307000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:100)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
at Test.main(Test.java:10)
"VM Thread" os_prio=0 tid=0x00007fbb78076800 nid=0x2899 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fbb7801d800 nid=0x2895 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fbb7801f800 nid=0x2896 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fbb78021000 nid=0x2897 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fbb78023000 nid=0x2898 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fbb780c2000 nid=0x28a1 waiting on condition
JNI global references: 19
Heap
PSYoungGen total 37888K, used 9835K [0x00000000d6600000, 0x00000000d9000000, 0x0000000100000000)
eden space 32768K, 30% used [0x00000000d6600000,0x00000000d6f9ad98,0x00000000d8600000)
from space 5120K, 0% used [0x00000000d8b00000,0x00000000d8b00000,0x00000000d9000000)
to space 5120K, 0% used [0x00000000d8600000,0x00000000d8600000,0x00000000d8b00000)
ParOldGen total 86016K, used 0K [0x0000000083200000, 0x0000000088600000, 0x00000000d6600000)
object space 86016K, 0% used [0x0000000083200000,0x0000000083200000,0x0000000088600000)
Metaspace used 3451K, capacity 4728K, committed 4864K, reserved 1056768K
class space used 382K, capacity 424K, committed 512K, reserved 1048576K
Exception in thread "main" com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
at Test.main(Test.java:10)
Can anyone explain?

How to interpret this thread dump from a hung Java Swing application?

I have the following thread dump from a hung Java Swing application. It hung after a button was clicked, and the GUI went blank. Other threads handling socket communication and task management are still working (I can tell from the log file). I have removed some irrelevant output.
Thread #13, AWT-EventQueue-0, should send out a command through the socket, but it seems to have failed there. Threads #20 and #21 are AWT-EventQueue-0-SharedResourceRunner, which is not the same as #13? There seems to be no deadlock, but the GUI is unresponsive and has gone blank.
Do you see any useful information about the cause of the hang?
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.20-b23 mixed mode):
"DestroyJavaVM" #32 prio=5 os_prio=0 tid=0x00007f286c009800 nid=0xa41 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"TimerQueue" #22 daemon prio=5 os_prio=0 tid=0x00007f28002a8800 nid=0xa65 waiting on condition [0x00007f284c56f000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000088a8f5c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.DelayQueue.take(DelayQueue.java:211)
at javax.swing.TimerQueue.run(TimerQueue.java:171)
at java.lang.Thread.run(Thread.java:745)
"AWT-EventQueue-0-SharedResourceRunner" #21 daemon prio=6 os_prio=0 tid=0x00007f280021d000 nid=0xa64 in Object.wait() [0x00007f284d434000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000088aec748> (a jogamp.opengl.SharedResourceRunner)
at java.lang.Object.wait(Object.java:502)
at jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:276)
- locked <0x0000000088aec748> (a jogamp.opengl.SharedResourceRunner)
at java.lang.Thread.run(Thread.java:745)
"AWT-EventQueue-0-SharedResourceRunner" #20 daemon prio=6 os_prio=0 tid=0x00007f28001f3000 nid=0xa63 in Object.wait() [0x00007f284f7f5000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000088aed588> (a jogamp.opengl.SharedResourceRunner)
at java.lang.Object.wait(Object.java:502)
at jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:276)
- locked <0x0000000088aed588> (a jogamp.opengl.SharedResourceRunner)
at java.lang.Thread.run(Thread.java:745)
"AWT-EventQueue-0" #13 prio=6 os_prio=0 tid=0x00007f286c444800 nid=0xa59 in Object.wait() [0x00007f2858913000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000dc467018> (a java.lang.Object)
at java.lang.Object.wait(Object.java:502)
at com.mycp.common.task.BMBTaskBase.startTask(BMBTaskBase.java:551)
- locked <0x00000000dc467018> (a java.lang.Object)
at com.mycp.uiapp.workmgmt.WorkMgmtMgr.sendBegCmd(WorkMgmtMgr.java:334)
at com.mycp.uiapp.workmgmt.WorkMgmtPanelBase.prepareAndSendBegWork(WorkMgmtPanelBase.java:559)
at com.mycp.uiapp.workmmgmt.WorkMgmtPanel.prepareAndSendBegWork(WorkMgmtPanel.java:1479)
at com.mycp.uiapp.workmgmt.WorkMgmtPanelBase.btnPrepareClicked(WorkMgmtPanelBase.java:363)
at com.mycp.uiapp.workmgmt.WorkMgmtPanel.btnPrepareClicked(WorkMgmtPanel.java:1412)
at com.mycp.uiapp.workmgmt.WorkMgmtPanel.actionPerformed(WorkMgmtPanel.java:1336)
at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2346)
"AWT-Shutdown" #14 prio=5 os_prio=0 tid=0x00007f286c443000 nid=0xa58 in Object.wait() [0x00007f2858a17000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000088ae8c28> (a java.lang.Object)
at java.lang.Object.wait(Object.java:502)
at sun.awt.AWTAutoShutdown.run(AWTAutoShutdown.java:295)
- locked <0x0000000088ae8c28> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:745)
"AWT-XAWT" #12 daemon prio=6 os_prio=0 tid=0x00007f286c384000 nid=0xa51 runnable [0x00007f285914f000]
java.lang.Thread.State: RUNNABLE
at sun.awt.X11.XToolkit.waitForEvents(Native Method)
at sun.awt.X11.XToolkit.run(XToolkit.java:559)
at sun.awt.X11.XToolkit.run(XToolkit.java:523)
at java.lang.Thread.run(Thread.java:745)
"Java2D Disposer" #10 daemon prio=10 os_prio=0 tid=0x00007f286c35e000 nid=0xa50 in Object.wait() [0x00007f2859250000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000087ab7ec0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
- locked <0x0000000087ab7ec0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at sun.java2d.Disposer.run(Disposer.java:148)
at java.lang.Thread.run(Thread.java:745)
"Thread-0" #9 prio=5 os_prio=0 tid=0x00007f286c234800 nid=0xa4f waiting on condition [0x00007f2859af5000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.mycp.logging.BMBLogging$Task.run(BMBLogging.java:1072)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f286c0cf800 nid=0xa4d runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f286c0b2000 nid=0xa4c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f286c0b0000 nid=0xa4b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f286c0ad800 nid=0xa4a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f286c0ab000 nid=0xa49 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f286c07c000 nid=0xa48 in Object.wait() [0x00007f285a2dd000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000087a7e6c8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
- locked <0x0000000087a7e6c8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f286c07a000 nid=0xa47 in Object.wait() [0x00007f285a3de000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000087a7e708> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
- locked <0x0000000087a7e708> (a java.lang.ref.Reference$Lock)
"VM Thread" os_prio=0 tid=0x00007f286c072800 nid=0xa46 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f286c01e800 nid=0xa42 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f286c020800 nid=0xa43 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f286c022000 nid=0xa44 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f286c024000 nid=0xa45 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f286c0d2000 nid=0xa4e waiting on condition
JNI global references: 485
Heap
PSYoungGen total 118272K, used 98176K [0x00000000d6e00000, 0x00000000de700000, 0x0000000100000000)
eden space 113152K, 82% used [0x00000000d6e00000,0x00000000dc8e00c8,0x00000000ddc80000)
from space 5120K, 100% used [0x00000000de180000,0x00000000de680000,0x00000000de680000)
to space 5120K, 0% used [0x00000000ddc80000,0x00000000ddc80000,0x00000000de180000)
ParOldGen total 159744K, used 76671K [0x0000000084a00000, 0x000000008e600000, 0x00000000d6e00000)
object space 159744K, 47% used [0x0000000084a00000,0x00000000894dfc50,0x000000008e600000)
Metaspace used 30027K, capacity 30212K, committed 30464K, reserved 1077248K
class space used 3528K, capacity 3582K, committed 3584K, reserved 1048576K
The thread dump shows stack traces from about 22 different threads. Many of them look like application threads (as opposed to JVM internal threads). Most of the application threads are waiting for something. Which of those threads should not be waiting?
I'd start by looking at thread 13: it looks like the Swing EDT, and it's waiting inside a call to a button's actionPerformed(...) handler. That can't be good.
I may be late to this, but your thread dump shows a thread in the WAITING (parking) state:
"TimerQueue" #22 daemon prio=5 os_prio=0 tid=0x00007f28002a8800 nid=0xa65 waiting on condition [0x00007f284c56f000]
java.lang.Thread.State: WAITING (parking)
This thread is waiting for an element from a DelayQueue.
From the DelayQueue Javadoc: "An unbounded blocking queue of Delayed elements, in which an element can only be taken when its delay has expired."
In the same TimerQueue stack trace, we have:
at java.util.concurrent.DelayQueue.take(DelayQueue.java:211)
take() blocks until an element with an expired delay is available on the queue.
This could be the reason the application hangs: the thread is still waiting and never shuts down, so there are still live threads. To resolve this, you need to stop those threads.
For that you could call ExecutorService.shutdown(), or simply call System.exit().
I would recommend System.exit().
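For the shutdown() suggestion above, here is a minimal sketch of an orderly ExecutorService shutdown. It assumes Java 8+ and that the lingering threads belong to a pool you created yourself; the class and method names (ShutdownSketch, runAndShutDown) are made up for illustration and are not taken from the question's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ShutdownSketch {
    /** Runs one task, then shuts the pool down; returns true if the pool terminated. */
    static boolean runAndShutDown() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() ->
                System.out.println("task ran on " + Thread.currentThread().getName()));

        // Stop accepting new tasks; tasks already submitted still run to completion.
        pool.shutdown();
        try {
            // Give running tasks a bounded amount of time to finish.
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                // Still busy: interrupt the workers as a last resort.
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return pool.isTerminated();
    }
}
```

Once the pool has terminated, its worker threads are gone and can no longer keep the JVM alive, so System.exit() is only needed for threads you do not control.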

How to handle a SSLSocketImpl Deadlock properly?

Very rarely I get a deadlock while using wiki-java. Having a look at the full thread dump (acquired via kill -3 $JAVA-PID) suggests that the deadlock seems to be originating somewhere in the SSLSocketImpl. I'd prefer to avoid this deadlock in the first place (instead of doing some hacky recovery) but I am unsure how to find the cause and prevent it. Is there a way to set a timeout in the SSLSocketImpl or throw an exception in case of the deadlock? (It would be pretty straightforward to catch it in the main loop and redo the last call)
Full thread dump OpenJDK 64-Bit Server VM (24.51-b03 mixed mode):
"Service Thread" daemon prio=10 tid=0x00007f3cd816b000 nid=0x102c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007f3cd8168800 nid=0x102b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007f3cd8165800 nid=0x102a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007f3cd8163800 nid=0x1029 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007f3cd8140800 nid=0x1028 waiting on condition [0x00007f3ccb9f7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007d77a9080> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:799)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:672)
at sun.security.ssl.SSLSocketImpl.sendAlert(SSLSocketImpl.java:2005)
at sun.security.ssl.SSLSocketImpl.warning(SSLSocketImpl.java:1832)
at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1600)
- locked <0x00000007d77a8d78> (a sun.security.ssl.SSLSocketImpl)
at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1538)
at sun.security.ssl.BaseSSLSocketImpl.finalize(BaseSSLSocketImpl.java:249)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
"Reference Handler" daemon prio=10 tid=0x00007f3cd813e800 nid=0x1027 in Object.wait() [0x00007f3ccbaf9000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000078495a2c8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
- locked <0x000000078495a2c8> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007f3cd8008000 nid=0x1021 waiting for monitor entry [0x00007f3cdfdb7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.security.ssl.SSLSocketImpl.getConnectionState(SSLSocketImpl.java:649)
- waiting to lock <0x00000007d77a8d78> (a sun.security.ssl.SSLSocketImpl)
at sun.security.ssl.SSLSocketImpl.isClosed(SSLSocketImpl.java:1446)
at java.net.Socket.getTcpNoDelay(Socket.java:965)
at sun.security.ssl.BaseSSLSocketImpl.getTcpNoDelay(BaseSSLSocketImpl.java:345)
at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:819)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:801)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
- locked <0x00000007d77a8d60> (a sun.security.ssl.AppOutputStream)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
- locked <0x00000007d77a8d48> (a java.io.BufferedOutputStream)
at java.io.PrintStream.flush(PrintStream.java:338)
- locked <0x00000007d77a8d28> (a java.io.PrintStream)
at sun.net.www.MessageHeader.print(MessageHeader.java:297)
- locked <0x00000007d6d057b0> (a sun.net.www.MessageHeader)
at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:599)
at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:610)
at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:619)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1321)
- locked <0x00000007d6d05640> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderFieldKey(HttpURLConnection.java:2731)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getHeaderFieldKey(HttpsURLConnectionImpl.java:307)
at shared.Wiki.grabCookies(Wiki.java:6907)
at shared.Wiki.fetch(Wiki.java:6462)
at shared.Wiki.getPageText(Wiki.java:1465)
at smallBots.Bot1.getText(Bot1.java:204)
at smallBots.Bot1.crawlCategory(Bot1.java:74)
at smallBots.Bot1.main(Bot1.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
"VM Thread" prio=10 tid=0x00007f3cd813a000 nid=0x1026 runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f3cd801d800 nid=0x1022 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f3cd801f800 nid=0x1023 runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f3cd8021800 nid=0x1024 runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f3cd8023000 nid=0x1025 runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f3cd8175800 nid=0x102d waiting on condition
JNI global references: 205
Found one Java-level deadlock:
=============================
"Finalizer":
waiting for ownable synchronizer 0x00000007d77a9080, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "main"
"main":
waiting to lock monitor 0x00007f3cac0015c8 (object 0x00000007d77a8d78, a sun.security.ssl.SSLSocketImpl),
which is held by "Finalizer"
Java stack information for the threads listed above:
===================================================
"Finalizer":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007d77a9080> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:799)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:672)
at sun.security.ssl.SSLSocketImpl.sendAlert(SSLSocketImpl.java:2005)
at sun.security.ssl.SSLSocketImpl.warning(SSLSocketImpl.java:1832)
at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1600)
- locked <0x00000007d77a8d78> (a sun.security.ssl.SSLSocketImpl)
at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1538)
at sun.security.ssl.BaseSSLSocketImpl.finalize(BaseSSLSocketImpl.java:249)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
"main":
at sun.security.ssl.SSLSocketImpl.getConnectionState(SSLSocketImpl.java:649)
- waiting to lock <0x00000007d77a8d78> (a sun.security.ssl.SSLSocketImpl)
at sun.security.ssl.SSLSocketImpl.isClosed(SSLSocketImpl.java:1446)
at java.net.Socket.getTcpNoDelay(Socket.java:965)
at sun.security.ssl.BaseSSLSocketImpl.getTcpNoDelay(BaseSSLSocketImpl.java:345)
at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:819)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:801)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
- locked <0x00000007d77a8d60> (a sun.security.ssl.AppOutputStream)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
- locked <0x00000007d77a8d48> (a java.io.BufferedOutputStream)
at java.io.PrintStream.flush(PrintStream.java:338)
- locked <0x00000007d77a8d28> (a java.io.PrintStream)
at sun.net.www.MessageHeader.print(MessageHeader.java:297)
- locked <0x00000007d6d057b0> (a sun.net.www.MessageHeader)
at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:599)
at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:610)
at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:619)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1321)
- locked <0x00000007d6d05640> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderFieldKey(HttpURLConnection.java:2731)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getHeaderFieldKey(HttpsURLConnectionImpl.java:307)
at shared.Wiki.grabCookies(Wiki.java:6907)
at shared.Wiki.fetch(Wiki.java:6462)
at shared.Wiki.getPageText(Wiki.java:1465)
at smallBots.Bot1.getText(Bot1.java:204)
at smallBots.Bot1.crawlCategory(Bot1.java:74)
at smallBots.Bot1.main(Bot1.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Found 1 deadlock.
Heap
PSYoungGen total 10752K, used 801K [0x00000007d6d00000, 0x00000007d7880000, 0x0000000800000000)
eden space 9728K, 1% used [0x00000007d6d00000,0x00000007d6d1fc20,0x00000007d7680000)
from space 1024K, 65% used [0x00000007d7780000,0x00000007d7828b40,0x00000007d7880000)
to space 1024K, 0% used [0x00000007d7680000,0x00000007d7680000,0x00000007d7780000)
ParOldGen total 93696K, used 69956K [0x0000000784800000, 0x000000078a380000, 0x00000007d6d00000)
object space 93696K, 74% used [0x0000000784800000,0x0000000788c51160,0x000000078a380000)
PSPermGen total 21504K, used 9537K [0x000000077a200000, 0x000000077b700000, 0x0000000784800000)
object space 21504K, 44% used [0x000000077a200000,0x000000077ab50720,0x000000077b700000)
The asker's answer: update to Java 8 to fix it.
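On the question of getting an exception or a signal when this happens: as far as I know SSLSocketImpl has no lock timeout, but the JVM can report Java-level deadlocks through ThreadMXBean.findDeadlockedThreads(), which covers both monitors and ownable synchronizers such as the ReentrantLock in the dump above. A minimal sketch, with a made-up class name (DeadlockProbe); it only reports a deadlock and cannot break one:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class DeadlockProbe {
    /** Returns one line per deadlocked thread, or an empty list if none is found. */
    static List<String> describeDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Returns null (not an empty array) when no threads are deadlocked.
        long[] ids = bean.findDeadlockedThreads();
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : bean.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return lines;
    }
}
```

A watchdog thread could call describeDeadlocks() periodically and log the result or exit the process so that an outer script restarts the bot. The blocked threads themselves cannot be recovered this way.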

My program will not stop after run() method in main thread finishes

I am using Java. The main thread sends data, while a worker thread listens for responses. I also have a Timer in case a timeout occurs. In main(), I am calling run(), which finishes, according to the output. Here is what it looks like:
class Send {
    Worker w;
    void run() throws InterruptedException {
        // w was initialized in constructor
        w.start();
        ....
        w.join();
    }
    public static void main(String[] args) throws InterruptedException {
        Send s = new Send();
        s.run();
    }
    private class Worker extends Thread {
        public void run() {
            ....
        }
    }
}
In s.run(), every time I need to cancel or restart the Timer, I do:
timer.cancel();
timer.purge();
timer = new Timer();
timer.schedule(...);
The TimerTask is simply calling a static method in Send to handle the timeout.
So what did I do wrong that causes my program to hang after the main thread finishes?
Thank you.
EDIT: the output of kill -3 process-id:
Full thread dump OpenJDK 64-Bit Server VM (19.0-b09 mixed mode):
"DestroyJavaVM" prio=10 tid=0x00007f9f20035000 nid=0x73f1 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Timer-10" prio=10 tid=0x0000000001a7d800 nid=0x740f in Object.wait() [0x00007f9f1e39c000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000075833fe78> (a java.util.TaskQueue)
at java.lang.Object.wait(Object.java:502)
at java.util.TimerThread.mainLoop(Timer.java:505)
- locked <0x000000075833fe78> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:484)
"Low Memory Detector" daemon prio=10 tid=0x00007f9f20004800 nid=0x7402 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"CompilerThread1" daemon prio=10 tid=0x0000000001a70000 nid=0x7401 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"CompilerThread0" daemon prio=10 tid=0x00007f9f20001000 nid=0x7400 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x0000000001a6e800 nid=0x73ff waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x0000000001a49000 nid=0x73fe in Object.wait() [0x00007f9f1f3f2000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007580b1310> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
- locked <0x00000007580b1310> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
"Reference Handler" daemon prio=10 tid=0x0000000001a47000 nid=0x73fd in Object.wait() [0x00007f9f1f4f3000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007580b11e8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
- locked <0x00000007580b11e8> (a java.lang.ref.Reference$Lock)
"VM Thread" prio=10 tid=0x0000000001a40000 nid=0x73fc runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000019d7000 nid=0x73f2 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000019d9000 nid=0x73f3 runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000019da800 nid=0x73f4 runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000019dc800 nid=0x73f5 runnable
"GC task thread#4 (ParallelGC)" prio=10 tid=0x00000000019de800 nid=0x73f6 runnable
"GC task thread#5 (ParallelGC)" prio=10 tid=0x00000000019e0000 nid=0x73f7 runnable
"GC task thread#6 (ParallelGC)" prio=10 tid=0x00000000019e2000 nid=0x73f8 runnable
"GC task thread#7 (ParallelGC)" prio=10 tid=0x00000000019e4000 nid=0x73f9 runnable
"GC task thread#8 (ParallelGC)" prio=10 tid=0x00000000019e5800 nid=0x73fa runnable
"GC task thread#9 (ParallelGC)" prio=10 tid=0x00000000019e7800 nid=0x73fb runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f9f20007800 nid=0x7403 waiting on condition
JNI global references: 886
Heap
PSYoungGen total 150528K, used 7745K [0x00000007580b0000, 0x00000007628a0000, 0x0000000800000000)
eden space 129088K, 6% used [0x00000007580b0000,0x00000007588405b8,0x000000075fec0000)
from space 21440K, 0% used [0x00000007613b0000,0x00000007613b0000,0x00000007628a0000)
to space 21440K, 0% used [0x000000075fec0000,0x000000075fec0000,0x00000007613b0000)
PSOldGen total 343936K, used 0K [0x0000000608200000, 0x000000061d1e0000, 0x00000007580b0000)
object space 343936K, 0% used [0x0000000608200000,0x0000000608200000,0x000000061d1e0000)
PSPermGen total 21248K, used 3130K [0x00000005fdc00000, 0x00000005ff0c0000, 0x0000000608200000)
object space 21248K, 14% used [0x00000005fdc00000,0x00000005fdf0ea30,0x00000005ff0c0000)
A Java program exits only after its last non-daemon thread finishes.
To keep the other threads from holding the JVM open, mark them as daemon threads.
Get a thread dump by running kill -3 <process id>. This will tell you which threads are hanging around & preventing the java process from exiting. Feel free to post (some of) the thread dump & folks can help you figure out what to do next.
It looks like you are creating several Timer threads. Check that all of them have exited properly; that is probably the problem.
If you check the Timer JavaDoc - http://download.oracle.com/javase/6/docs/api/java/util/Timer.html - you will notice the following note:
By default, the task execution thread does not run as a daemon thread
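In the dump above, "Timer-10" is not marked daemon and is parked in TimerThread.mainLoop, which fits that note. Here is a minimal sketch of one way around it, assuming Java 8+: keep a single daemon Timer and cancel the pending TimerTask rather than creating a new Timer on every restart. The class name TimeoutGuard and its methods are made up for illustration and are not the asker's code.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class TimeoutGuard {
    // Second argument true = daemon: this timer thread alone cannot keep the JVM alive.
    private final Timer timer = new Timer("timeout-guard", true);
    private TimerTask pending;

    /** Cancels the pending timeout, if any, and schedules a fresh one. */
    synchronized void restart(long delayMillis, final Runnable onTimeout) {
        if (pending != null) {
            pending.cancel(); // cancel the task only; the Timer thread is reused
        }
        pending = new TimerTask() {
            @Override
            public void run() {
                onTimeout.run();
            }
        };
        timer.schedule(pending, delayMillis);
    }

    /** Ends the timer thread; restart() must not be called afterwards. */
    synchronized void stop() {
        timer.cancel();
    }

    /** Usage example: the first timeout is replaced, only the second one fires. */
    static boolean demo() {
        CountDownLatch fired = new CountDownLatch(1);
        TimeoutGuard guard = new TimeoutGuard();
        guard.restart(60_000, () -> System.out.println("replaced timeout, should not print"));
        guard.restart(50, fired::countDown);
        try {
            return fired.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            guard.stop();
        }
    }
}
```

If you prefer to keep a non-daemon Timer, the equivalent fix is to make sure cancel() is called on the last Timer you created before run() returns.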
You can use a debugger (both Eclipse and NetBeans have excellent ones) to see which threads are still alive.
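If attaching a debugger is inconvenient, the same information can be printed from inside the program. A minimal sketch using Thread.getAllStackTraces(); the class name LiveThreads is made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class LiveThreads {
    /** Returns the live threads that are not daemon threads. */
    static List<Thread> nonDaemon() {
        List<Thread> result = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                result.add(t);
            }
        }
        return result;
    }

    /** Prints the name and state of each non-daemon thread. */
    static void print() {
        for (Thread t : nonDaemon()) {
            System.out.println(t.getName() + " [" + t.getState() + "]");
        }
    }
}
```

Calling LiveThreads.print() at the end of main() lists the threads that can still hold the JVM open at that point.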
