DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Mastering Thread-Local Variables in Java: Explanation and Issues
  • Unlocking Performance: Exploring Java 21 Virtual Threads [Video]
  • Visualizing Thread-Safe Singletons in Java
  • Double-Checked Locking Design Pattern in Java

Trending

  • Fraud Detection Using Artificial Intelligence and Machine Learning
  • Beyond ChatGPT, AI Reasoning 2.0: Engineering AI Models With Human-Like Reasoning
  • Testing SingleStore's MCP Server
  • What Is Plagiarism? How to Avoid It and Cite Sources
  1. DZone
  2. Software Design and Architecture
  3. Integration
  4. Java Thread Deadlock: A Case Study

Java Thread Deadlock: A Case Study

By 
Pierre - Hugues Charbonneau user avatar
Pierre - Hugues Charbonneau
·
May. 06, 12 · Interview
Likes (0)
Comment
Save
Tweet
Share
14.6K Views

Join the DZone community and get the full member experience.

Join For Free

This article will describe the complete root cause analysis of a recent Java deadlock problem observed from a Weblogic 11g production system running on the IBM JVM 1.6.This case study will also demonstrate the importance of mastering Thread Dump analysis skills; including for the IBM JVM Thread Dump format.

Environment specification

 

  • Java EE server: Oracle Weblogic Server 11g & Spring 2.0
  • OS: AIX 5.3
  • Java VM: IBM JRE 1.6.0
  • Platform type: Portal & ordering application

Monitoring and troubleshooting tools


  • JVM Thread Dump (IBM JVM format)
  • Compuware Server Vantage (Weblogic JMX monitoring & alerting)

Problem overview

A major stuck Threads problem was observed & reported from Compuware Server Vantage and affecting 2 of our Weblogic 11g production managed servers causing application impact and timeout conditions from our end users.

Gathering and validation of facts

As usual, a Java EE problem investigation requires gathering of technical and non-technical facts so we can either derived other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:


·        What is the client impact? MEDIUM (only 2 managed servers / JVM affected out of 16)
·        Recent change of the affected platform? Yes (new JMS related asynchronous component)
·        Any recent traffic increase to the affected platform? No
·        How does this problem manifest itself?  A sudden increase of Threads was observed leading to rapid Thread depletion
·        Did a Weblogic managed server restart resolve the problem? Yes, but problem is returning after few hours (unpredictable & intermittent pattern)

-        Conclusion #1: The problem is related to an intermittent stuck Threads behaviour affecting only a few Weblogic managed servers at the time
-        Conclusion #2: Since problem is intermittent, a global root cause such as a non-responsive downstream system is not likely

Thread Dump analysis – first pass


The first thing to do when dealing with stuck Thread problems is to generate a JVM Thread Dump. This is a golden rule regardless of your environment specifications & problem context. A JVM Thread Dump snapshot provides you with crucial information about the active Threads and what type of processing / tasks they are performing at that time.


Now back to our case study, an IBM JVM Thread Dump (javacore.xyz format) was generated which did reveal the following Java Thread deadlock condition below:

1LKDEADLOCK    Deadlock detected !!!
NULL           ---------------------
NULL          
2LKDEADLOCKTHR  Thread "[STUCK] ExecuteThread: '8' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012CC08B00)
3LKDEADLOCKWTR    is waiting for:
4LKDEADLOCKMON      sys_mon_t:0x0000000126171DF8 infl_mon_t: 0x0000000126171E38:
4LKDEADLOCKOBJ      weblogic/jms/frontend/FESession@0x07000000198048C0/0x07000000198048D8:
3LKDEADLOCKOWN    which is owned by:
2LKDEADLOCKTHR  Thread "[STUCK] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012E560500)
3LKDEADLOCKWTR    which is waiting for:
4LKDEADLOCKMON      sys_mon_t:0x000000012884CD60 infl_mon_t: 0x000000012884CDA0:
4LKDEADLOCKOBJ      weblogic/jms/frontend/FEConnection@0x0700000019822F08/0x0700000019822F20:
3LKDEADLOCKOWN    which is owned by:
2LKDEADLOCKTHR  Thread "[STUCK] ExecuteThread: '8' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012CC08B00)

This deadlock situation can be translated as per below:


-        Weblogic Thread #8 is waiting to acquire an Object monitor lock owned by Weblogic Thread #10
-        Weblogic Thread #10 is waiting to acquire an Object monitor lock owned by Weblogic Thread #8

Conclusion: both Weblogic Threads #8 & #10 are waiting on each other; forever!

Now before going any deeper in this root cause analysis, let me provide you a high level overview on Java Thread deadlocks.

Java Thread deadlock overview


Most of you are probably familiar with Java Thread deadlock principles but did you really experience a true deadlock problem?

From my experience, true Java deadlocks are rare and I have only seen ~5 occurrences over the last 10 years. The reason is that most stuck Threads related problems are due to Thread hanging conditions (waiting on remote IO call etc.) but not involved in a true deadlock condition with other Thread(s).

A Java Thread deadlock is a situation for example where Thread A is waiting to acquire an Object monitor lock held by Thread B which is itself waiting to acquire an Object monitor lock held by Thread A. Both these Threads will wait for each other forever. This situation can be visualized as per below diagram:



Thread deadlock is confirmed…now what can you do?


Once the deadlock is confirmed (most JVM Thread Dump implementations will highlight it for you), the next step is to perform a deeper dive analysis by reviewing each Thread involved in the deadlock situation along with their current task & wait condition.Find below the partial Thread Stack Trace from our problem case for each Thread involved in the deadlock condition:


** Please note that the real application Java package name was renamed for confidentiality purposes **

Weblogic Thread #8
"[STUCK] ExecuteThread: '8' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x000000012CC08B00, j9thread_t:0x00000001299E5100, java/lang/Thread:0x070000001D72EE00, state:B, prio=1
(native thread ID:0x111200F, native priority:0x1, native policy:UNKNOWN)
Java callstack:
       at weblogic/jms/frontend/FEConnection.stop(FEConnection.java:671(Compiled Code))
       at weblogic/jms/frontend/FEConnection.invoke(FEConnection.java:1685(Compiled Code))
       at weblogic/messaging/dispatcher/Request.wrappedFiniteStateMachine(Request.java:961(Compiled Code))
       at weblogic/messaging/dispatcher/DispatcherImpl.syncRequest(DispatcherImpl.java:184(Compiled Code))
       at weblogic/messaging/dispatcher/DispatcherImpl.dispatchSync(DispatcherImpl.java:212(Compiled Code))
       at weblogic/jms/dispatcher/DispatcherAdapter.dispatchSync(DispatcherAdapter.java:43(Compiled Code))
       at weblogic/jms/client/JMSConnection.stop(JMSConnection.java:863(Compiled Code))
       at weblogic/jms/client/WLConnectionImpl.stop(WLConnectionImpl.java:843)
       at org/springframework/jms/connection/SingleConnectionFactory.closeConnection(SingleConnectionFactory.java:342)
       at org/springframework/jms/connection/SingleConnectionFactory.resetConnection(SingleConnectionFactory.java:296)
       at org/app/JMSReceiver.receive()
……………………………………………………………………
Weblogic Thread #10
"[STUCK] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x000000012E560500, j9thread_t:0x000000012E35BCE0, java/lang/Thread:0x070000001ECA9200, state:B, prio=1
 (native thread ID:0x4FA027, native priority:0x1, native policy:UNKNOWN)
Java callstack:
       at weblogic/jms/frontend/FEConnection.getPeerVersion(FEConnection.java:1381(Compiled Code))
       at weblogic/jms/frontend/FESession.setUpBackEndSession(FESession.java:755(Compiled Code))
       at weblogic/jms/frontend/FESession.consumerCreate(FESession.java:1025(Compiled Code))
       at weblogic/jms/frontend/FESession.invoke(FESession.java:2995(Compiled Code))
       at weblogic/messaging/dispatcher/Request.wrappedFiniteStateMachine(Request.java:961(Compiled Code))
       at weblogic/messaging/dispatcher/DispatcherImpl.syncRequest(DispatcherImpl.java:184(Compiled Code))
       at weblogic/messaging/dispatcher/DispatcherImpl.dispatchSync(DispatcherImpl.java:212(Compiled Code))
       at weblogic/jms/dispatcher/DispatcherAdapter.dispatchSync(DispatcherAdapter.java:43(Compiled Code))
       at weblogic/jms/client/JMSSession.consumerCreate(JMSSession.java:2982(Compiled Code))
       at weblogic/jms/client/JMSSession.setupConsumer(JMSSession.java:2749(Compiled Code))
       at weblogic/jms/client/JMSSession.createConsumer(JMSSession.java:2691(Compiled Code))
       at weblogic/jms/client/JMSSession.createReceiver(JMSSession.java:2596(Compiled Code))
       at weblogic/jms/client/WLSessionImpl.createReceiver(WLSessionImpl.java:991(Compiled Code))
       at org/springframework/jms/core/JmsTemplate102.createConsumer(JmsTemplate102.java:204(Compiled Code))
       at org/springframework/jms/core/JmsTemplate.doReceive(JmsTemplate.java:676(Compiled Code))
       at org/springframework/jms/core/JmsTemplate$10.doInJms(JmsTemplate.java:652(Compiled Code))
       at org/springframework/jms/core/JmsTemplate.execute(JmsTemplate.java:412(Compiled Code))
       at org/springframework/jms/core/JmsTemplate.receiveSelected(JmsTemplate.java:650(Compiled Code))
       at org/springframework/jms/core/JmsTemplate.receiveSelected(JmsTemplate.java:641(Compiled Code))
       at org/app/JMSReceiver.receive()
……………………………………………………………
As you can see in the above Thread Strack Traces, such deadlock did originate from our application code which is using the Spring framework API for the JMS consumer implementation (very useful when not using MDB’s). The Stack Traces are quite interesting and revealing that both Threads are in a race condition against the same Weblogic JMS consumer session / connection and leading to a deadlock situation:

-        Weblogic Thread #8 is attempting to reset and close the current JMS connection
-        Weblogic Thread #10 is attempting to use the same JMS Connection / Session in order to create a new JMS consumer
-        Thread deadlock is triggered!

Root cause: non Thread safe Spring JMS SingleConnectionFactory implementation

A code review and a quick research from Spring JIRA bug database did reveal the following Thread safe defect below with a perfect correlation with the above analysis:

# SingleConnectionFactory's resetConnection is causing deadlocks with underlying OracleAQ's JMS connection
https://jira.springsource.org/browse/SPR-5987

A patch for Spring SingleConnectionFactory was released back in 2009 which did involve adding proper synchronized{} block in order to prevent Thread deadlock in the event of a JMS Connection reset operation:
synchronized (connectionMonitor) {
  //if condition added to avoid possible deadlocks when trying to reset the target connection
  if (!started) {
    this.target.start();
    started = true;
  }
}

Solution

Our team is currently planning to integrate this Spring patch in to our production environment shortly. The initial tests performed in our test environment are positive.


Conclusion

I hope this case study has helped understand a real-life Java Thread deadlock problem and how proper Thread Dump analysis skills can allow you to quickly pinpoint the root cause of stuck Thread related problems at the code level. Please don’t hesitate to post any comment or question.

 
 
Java (programming language) Threading

Published at DZone with permission of Pierre - Hugues Charbonneau, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Mastering Thread-Local Variables in Java: Explanation and Issues
  • Unlocking Performance: Exploring Java 21 Virtual Threads [Video]
  • Visualizing Thread-Safe Singletons in Java
  • Double-Checked Locking Design Pattern in Java

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!