transport-test deadlocks intermittently #617
Comments
I've finally worked out how the deadlock occurs.
I consider this a JDK bug, and I'm considering reporting it to the JDK team unless you have a different opinion. I'll also send a quick-fix PR, which replaces |
… service/start!
Wow AkihiroSuda, this is terrific detective work! Thank you! That'll save a whole ton of headache in the merge process. |
I want to try and preserve service startup parallelism if possible--what do you think about calling IOUtil/load to force the static initializer to run earlier? |
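The idea of forcing a static initializer to run before parallel startup can be sketched in Java as follows. This is only an illustration under assumptions: `NativeBacked` and `EagerInit` are hypothetical names standing in for the class whose initializer is involved, and nothing here is riemann or JDK-internal code. It relies on `Class.forName` with `initialize=true`, which runs the static initializer on the calling thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a class whose static initializer does
// lock-sensitive work (for example, loading a native library).
class NativeBacked {
    static volatile String initThread;
    static {
        // Record which thread ran the static initializer.
        initThread = Thread.currentThread().getName();
    }
    static int use() { return 42; }
}

class EagerInit {
    // Force the named class's static initializer to run now, on this thread.
    static void preload(String className) throws ClassNotFoundException {
        Class.forName(className, true, EagerInit.class.getClassLoader());
    }

    // Preload serially, then start the "services" in parallel.
    static int startAll(int services) throws Exception {
        // A class literal alone does not trigger initialization; forName does.
        preload(NativeBacked.class.getName());
        ExecutorService pool = Executors.newFixedThreadPool(services);
        try {
            Callable<Integer> task = NativeBacked::use;
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < services; i++) {
                results.add(pool.submit(task));
            }
            int sum = 0;
            for (Future<Integer> f : results) {
                sum += f.get();
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(startAll(4));
        System.out.println("initialized on: " + NativeBacked.initThread);
    }
}
```

Startup stays parallel; only the initialization step is moved onto a single thread, so worker threads never race to initialize the class.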
So this seems to work on JDK8, but there's gotta be something different going on in JDK7 because it doesn't have IOUtil/load. I'm gonna try promoting the lock of Runtime to the server start functions. |
Ugh, so I introduced a lock mutexing sse-server and tcp-server's startup, and it still deadlocks in Travis. Maybe a different codepath this time? |
Couldn't reproduce with my local OpenJDK 7u79. 😞 |
I know, right? I tried too! Outta time to hack on this today but keep me posted, and thank you! |
In JDK7, |
On second thought, riemann itself is deadlocking at transport_test.clj:76. |
The bug still seems unresolved on JDK 8 (at least for the build run 14 days ago): https://travis-ci.org/riemann/riemann/jobs/107986718 |
For debugging riemann#617 on Travis.
The latest 0ed63b29 (Feb 14) is still hanging with JDK8. |
Here is the JDK bug: https://bugs.openjdk.java.net/browse/JDK-8194653 |
The bug has been observed on Travis #990, #979, and more.
`jstack` indicates that `java.lang.Runtime.loadLibrary0` is in `BLOCKED` state, but I'm still not sure why it's in such a weird state.
Note that no packets are observed when we hit the bug, so the problem does not seem to be related to network IO.
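The same kind of information that `jstack` reports can also be collected from inside the JVM. The sketch below is an assumption-laden illustration, not what was run for this issue: `BlockedThreads` and the `hypothetical-waiter` thread are made-up names, and the demo blocks a thread on an ordinary monitor rather than reproducing the bug. It uses only `Thread.getAllStackTraces()` and `Thread.State.BLOCKED` from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class BlockedThreads {
    // List every live thread currently in BLOCKED state, with its top frame.
    static List<String> describeBlocked() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getState() != Thread.State.BLOCKED) {
                continue;
            }
            StackTraceElement[] stack = e.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "<no frames>";
            lines.add(t.getName() + " blocked at " + top);
        }
        return lines;
    }

    // Demo: hold a monitor so that a second thread blocks trying to enter it.
    static List<String> demo() throws InterruptedException {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; we only need to contend for the monitor.
            }
        }, "hypothetical-waiter");
        List<String> seen;
        synchronized (lock) {
            waiter.start();
            // Wait until the waiter is actually blocked on our monitor.
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            seen = describeBlocked();
        }
        waiter.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Calling something like `describeBlocked()` from a watchdog thread in a hanging test run would show which frames are stuck without attaching an external tool.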
Reproduction procedure
On Terminal 1:
On Terminal 2:
Environment