
How to Investigate High CPU Utilization of Java Processes
Recently, at one of our client sites, we observed a spike in CPU utilization. At random intervals, CPU usage climbed to 100% and stayed there until the application was restarted. The application is a standalone Java application that performs heavy database operations. It was critical to find and fix the issue quickly, because the high CPU utilization affected other applications on the same server. This post explains the method we used to find and fix it.

Environment
Sun Solaris Server – 12 cores
Finding the process
We used the top command to find the applications consuming the most CPU. Because the server had 12 cores, top initially reported each process's CPU usage summed across all cores. After we turned off Irix mode (by pressing ‘I’), it reported each process's usage averaged over the cores instead. We then noted the PID of the process using the most CPU.
Irix/Solaris_Mode_toggle
When operating in 'Solaris mode' ('I' toggled Off), a task's CPU usage will be divided by the total number of CPUs. After issuing this command, you'll be informed of the new state of this toggle.
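
To make the effect of the toggle concrete, here is a minimal Java sketch of the arithmetic. The class and method names are hypothetical, and the sketch assumes the only difference between the two modes is the division by the CPU count described above.

```java
// Hypothetical helper illustrating the Irix/Solaris mode arithmetic; it is not part of top.
class CpuModeMath {
    // Convert a per-task figure from Irix mode (relative to a single CPU)
    // to Solaris mode (divided by the total number of CPUs).
    static double toSolarisMode(double irixPercent, int cpuCount) {
        if (cpuCount <= 0) {
            throw new IllegalArgumentException("cpuCount must be positive");
        }
        return irixPercent / cpuCount;
    }

    public static void main(String[] args) {
        int cpuCount = 12; // core count of the server described in this post
        // One thread saturating one core.
        System.out.printf("%.2f%%%n", toSolarisMode(100.0, cpuCount));
        // A process saturating all twelve cores.
        System.out.printf("%.2f%%%n", toSolarisMode(1200.0, cpuCount));
    }
}
```

With 12 cores, a single thread spinning on one core reads as roughly 8.33% in Solaris mode, so a modest-looking figure can still mean that one core is fully busy.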

Finding the thread
Once we had the PID, we pressed the ‘H’ key in top to list the threads of that process. We then noted the thread ID (the NID, or ‘Soft process Id’) of the busiest thread from the first column.

Get the thread dump
We used the following command to obtain a thread dump of the JVM process identified in the "Finding the process" step.
jstack -l PID > jstack.txt
Isolate the stack trace
The thread dump file contains the stack trace of every thread, so the important part is identifying the right one. Each stack trace has a nid field in hexadecimal that corresponds to the thread ID obtained in the "Finding the thread" step. After converting that thread ID to hexadecimal, we searched the dump for it and isolated the matching stack trace.
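
The conversion and lookup can be sketched in a few lines of Java. This is an illustration only: the class name, method names, thread ID 12345, and the two sample dump lines are made up for the example, and the sketch assumes the dump writes the field as `nid=0x` followed by lowercase hex digits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: maps a decimal thread ID to the nid form used in a thread dump.
class NidLookup {
    // Decimal thread ID -> "nid=0x..." token (lowercase hex, no padding).
    static String toNid(long threadId) {
        return "nid=0x" + Long.toHexString(threadId);
    }

    // Return the dump lines that contain exactly this nid token.
    // Comparing whole tokens avoids matching nid=0x303 when looking for nid=0x3039.
    static List<String> findThreadHeaders(List<String> dumpLines, long threadId) {
        String nid = toNid(threadId);
        List<String> matches = new ArrayList<>();
        for (String line : dumpLines) {
            for (String token : line.trim().split("\\s+")) {
                if (token.equals(nid)) {
                    matches.add(line);
                    break;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the contents of jstack.txt.
        List<String> sample = Arrays.asList(
            "\"worker-1\" prio=5 tid=0x00000001 nid=0x3039 runnable",
            "\"worker-2\" prio=5 tid=0x00000002 nid=0x303 waiting on condition");
        System.out.println(toNid(12345));
        System.out.println(findThreadHeaders(sample, 12345));
    }
}
```

For a real dump, the sample list could be replaced with the lines of jstack.txt, for example read with `Files.readAllLines`. The stack trace to inspect is the block of lines that follows the matching header.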

Find the culprit code block
From the stack trace, we identified the code block that was running at 100% CPU. In our case, it was a while loop that became infinite when a certain condition was met.
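
The following is not the code from our application. It is a hypothetical Java sketch of one common way to guard this kind of loop: instead of `while (!done) { }`, which spins on a core forever if the condition never changes, the wait is bounded and pauses between checks. All names and parameters here are assumptions made for the illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical guard against an unbounded busy loop.
class BoundedWait {
    // Poll 'done' at most maxAttempts times, sleeping between checks.
    // Returns true if the condition became true, false if we gave up or were interrupted.
    static boolean waitUntil(BooleanSupplier done, int maxAttempts, long pauseMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (done.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis); // yield the CPU instead of spinning
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false; // the condition never held: report it rather than loop forever
    }

    public static void main(String[] args) {
        // A condition that never becomes true no longer hangs the thread.
        System.out.println(waitUntil(() -> false, 3, 10));
        // A condition that is already true returns immediately.
        System.out.println(waitUntil(() -> true, 3, 10));
    }
}
```

Whether a bounded retry, a timeout, or a corrected exit condition is the right fix depends on what the loop is waiting for. The point of the sketch is only that every loop needs an exit path that does not depend on a condition that may never occur.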
Fix it!
So we fixed it, tested the fix, and deployed it to production. Now the JVM process runs so smoothly that even the CPU doesn’t notice it 🙂