Closed
Description
We - @acogoluegnes @Gsantomaggio - started looking at why stream-max-length.publish.java-6c7cbd7d98-dv42n
Java Publisher stopped publishing since:
2021-09-02 23:08:23 | 135, published 1877 msg/s, confirmed 1805 msg/s, consumed 0 msg/s, latency min/median/75th/95th/99th 0/0/0/0/0 µs, chunk size 0
We ssh-ed into the pod, downloaded https://github.com/adoptium/temurin11-binaries/releases/download/jdk-11.0.12%2B7/OpenJDK11U-jdk_x64_linux_hotspot_11.0.12_7.tar.gz (requires curl or wget) and ran the following commands:
cd jdk-11.0.12+7/bin
./jps
1 stream-perf-test.jar
3305 Jps
./jstack 1
2021-09-06 14:49:27
Full thread dump OpenJDK 64-Bit Server VM (11.0.12+7-LTS mixed mode):
Threads class SMR info:
_java_thread_list=0x00007fb98c001f00, length=20, elements={
0x00007fb9b0027000, 0x00007fb9b010b000, 0x00007fb9b010f000, 0x00007fb9b0124000,
0x00007fb9b0126000, 0x00007fb9b0128000, 0x00007fb9b012a800, 0x00007fb9b012c800,
0x00007fb9b01af000, 0x00007fb9b0438800, 0x00007fb9b095f000, 0x00007fb960001800,
0x00007fb9b087d800, 0x00007fb9b03f0000, 0x00007fb95c001000, 0x00007fb9b03f2800,
0x00007fb9b03f4000, 0x00007fb9b0970800, 0x00007fb96000e800, 0x00007fb98c001000
}
"main" #1 prio=5 os_prio=0 cpu=1721.24ms elapsed=319407.51s tid=0x00007fb9b0027000 nid=0x7 waiting on condition [0x00007fb9b688e000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fae9aae8> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.12/AbstractQueuedSynchronizer.java:885)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.12/AbstractQueuedSynchronizer.java:1039)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.12/AbstractQueuedSynchronizer.java:1345)
at java.util.concurrent.CountDownLatch.await(java.base@11.0.12/CountDownLatch.java:232)
at com.rabbitmq.stream.perf.StreamPerfTest.call(StreamPerfTest.java:743)
at com.rabbitmq.stream.perf.StreamPerfTest.call(StreamPerfTest.java:84)
at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
at picocli.CommandLine.executeHelpRequest(CommandLine.java:1929)
at picocli.CommandLine.executeHelpRequest(CommandLine.java:1909)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2176)
at picocli.CommandLine.execute(CommandLine.java:2078)
at com.rabbitmq.stream.perf.StreamPerfTest.run(StreamPerfTest.java:353)
at com.rabbitmq.stream.perf.StreamPerfTest.main(StreamPerfTest.java:344)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.12/Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.12/NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.12/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@11.0.12/Method.java:566)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:108)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88)
"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=5.42ms elapsed=319407.48s tid=0x00007fb9b010b000 nid=0x9 waiting on condition [0x00007fb9b4104000]
java.lang.Thread.State: RUNNABLE
at java.lang.ref.Reference.waitForReferencePendingList(java.base@11.0.12/Native Method)
at java.lang.ref.Reference.processPendingReferences(java.base@11.0.12/Reference.java:241)
at java.lang.ref.Reference$ReferenceHandler.run(java.base@11.0.12/Reference.java:213)
"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.68ms elapsed=319407.48s tid=0x00007fb9b010f000 nid=0xa in Object.wait() [0x00007fb9a0ffe000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@11.0.12/Native Method)
- waiting on <0x00000000fab3bd68> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@11.0.12/ReferenceQueue.java:155)
- waiting to re-lock in wait() <0x00000000fab3bd68> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@11.0.12/ReferenceQueue.java:176)
at java.lang.ref.Finalizer$FinalizerThread.run(java.base@11.0.12/Finalizer.java:170)
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.83ms elapsed=319407.43s tid=0x00007fb9b0124000 nid=0xb runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.07ms elapsed=319407.43s tid=0x00007fb9b0126000 nid=0xc runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 cpu=13131.35ms elapsed=319407.43s tid=0x00007fb9b0128000 nid=0xd waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"C1 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=3724.80ms elapsed=319407.43s tid=0x00007fb9b012a800 nid=0xe waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"Sweeper thread" #8 daemon prio=9 os_prio=0 cpu=0.22ms elapsed=319407.43s tid=0x00007fb9b012c800 nid=0xf runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Common-Cleaner" #9 daemon prio=8 os_prio=0 cpu=382.06ms elapsed=319407.24s tid=0x00007fb9b01af000 nid=0x11 in Object.wait() [0x00007fb9a0610000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base@11.0.12/Native Method)
- waiting on <no object reference available>
at java.lang.ref.ReferenceQueue.remove(java.base@11.0.12/ReferenceQueue.java:155)
- waiting to re-lock in wait() <0x00000000fab3c558> (a java.lang.ref.ReferenceQueue$Lock)
at jdk.internal.ref.CleanerImpl.run(java.base@11.0.12/CleanerImpl.java:148)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
at jdk.internal.misc.InnocuousThread.run(java.base@11.0.12/InnocuousThread.java:134)
"stream-perf-test-env-0" #13 prio=5 os_prio=0 cpu=168292.99ms elapsed=319399.64s tid=0x00007fb9b0438800 nid=0x14 waiting on condition [0x00007fb979d46000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fad11488> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:1177)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"nioEventLoopGroup-2-1" #15 prio=10 os_prio=0 cpu=4500.71ms elapsed=319399.32s tid=0x00007fb9b095f000 nid=0x15 runnable [0x00007fb979c45000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPoll.wait(java.base@11.0.12/Native Method)
at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.12/EPollSelectorImpl.java:120)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.12/SelectorImpl.java:124)
- locked <0x00000000fad55fa8> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000fad55f50> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(java.base@11.0.12/SelectorImpl.java:136)
at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:814)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-3-thread-1" #16 prio=5 os_prio=0 cpu=535.18ms elapsed=319399.23s tid=0x00007fb960001800 nid=0x16 waiting on condition [0x00007fb979b44000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fae9aef0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.12/LinkedBlockingQueue.java:433)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"stream-perf-test-env-1" #17 prio=5 os_prio=0 cpu=169655.49ms elapsed=319399.14s tid=0x00007fb9b087d800 nid=0x17 waiting on condition [0x00007fb979a43000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fad11488> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.12/LockSupport.java:234)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.12/AbstractQueuedSynchronizer.java:2123)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:1182)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"nioEventLoopGroup-2-2" #18 prio=10 os_prio=0 cpu=3078.23ms elapsed=319399.13s tid=0x00007fb9b03f0000 nid=0x18 runnable [0x00007fb979942000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPoll.wait(java.base@11.0.12/Native Method)
at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.12/EPollSelectorImpl.java:120)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.12/SelectorImpl.java:124)
- locked <0x00000000fad56540> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000fad564e8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(java.base@11.0.12/SelectorImpl.java:136)
at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:814)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-4-thread-1" #19 prio=5 os_prio=0 cpu=612.73ms elapsed=319399.12s tid=0x00007fb95c001000 nid=0x19 waiting on condition [0x00007fb979841000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fae9b490> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.12/LinkedBlockingQueue.java:433)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-5-thread-1" #20 prio=5 os_prio=0 cpu=3517.46ms elapsed=319399.01s tid=0x00007fb9b03f2800 nid=0x1a waiting on condition [0x00007fb979740000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000faead570> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.12/LinkedBlockingQueue.java:433)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-5-thread-2" #21 prio=5 os_prio=0 cpu=3510.34ms elapsed=319399.01s tid=0x00007fb9b03f4000 nid=0x1b waiting on condition [0x00007fb97963f000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000faead570> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.12/LinkedBlockingQueue.java:433)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-6-thread-1" #22 prio=5 os_prio=0 cpu=19099.76ms elapsed=319398.92s tid=0x00007fb9b0970800 nid=0x1c waiting on condition [0x00007fb97953e000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fae9bb38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.12/LockSupport.java:234)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.12/AbstractQueuedSynchronizer.java:2123)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:1182)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@11.0.12/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"pool-7-thread-1" #26 prio=5 os_prio=0 cpu=316.80ms elapsed=318800.33s tid=0x00007fb96000e800 nid=0x1f waiting on condition [0x00007fb97813b000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method)
- parking to wait for <0x00000000fafd1a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.12/AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.12/LinkedBlockingQueue.java:433)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.12/ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.12/ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.12/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)
"Attach Listener" #27 daemon prio=9 os_prio=0 cpu=0.39ms elapsed=0.10s tid=0x00007fb98c001000 nid=0xce8 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 cpu=13002.25ms elapsed=319407.49s tid=0x00007fb9b0102800 nid=0x8 runnable
"VM Periodic Task Thread" os_prio=0 cpu=249760.35ms elapsed=319407.34s tid=0x00007fb9b0171000 nid=0x10 waiting on condition
JNI global refs: 17, weak refs: 0
We would first like to understand if the publishing thread crashes (we can't find it in the thread list above), and how does it crash (we are currently discarding the exception).
Originally posted by @gerhard in https://github.com/rabbitmq/opportunities/issues/99 (private repository)