Creating and analyzing Heap Dumps is a standard technique for diagnosing memory issues in JVM-based programs. To do this, you have to:
- create the Heap Dump (duh!)
- move the data to your computer
- actually analyze the data
Each of these steps has its own challenges, though the first two seem trivial at first sight.
Creating the Heap Dump
The main challenge I have faced when creating Heap Dumps in today's production environments is that only essential software is installed, in order to keep containers and the potential attack surface small. So there is likely only a bare JVM: enough to run the program, but without the tools needed to get a Heap Dump. A colleague pointed me to jattach, which describes itself as:
All-in-one jmap + jstack + jcmd + jinfo functionality in a single tiny program.
So with:
jattach <PID-OF-JVM-PROCESS> dumpheap <PATH_TO_YOUR_HEAPDUMP>
a Heap Dump is created. At roughly 24 KB, this standalone binary should fit in any environment.
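The call above can be wrapped in a small helper that first checks that the target process is reachable. This is only a minimal sketch: the function name dump_heap and the default output path under /tmp are my own choices for illustration, and it assumes jattach is on the PATH and is run as the same user as the JVM.

```shell
#!/usr/bin/env bash
# Sketch: dump the heap of a running JVM with jattach.
# Assumptions: jattach is on PATH; dump_heap and the default path are hypothetical.

dump_heap() {
  local pid="$1"
  # Default output location is an arbitrary choice for this example.
  local out="${2:-/tmp/heap-${pid}.hprof}"

  # kill -0 sends no signal; it only checks that the process exists
  # and that we are allowed to signal it.
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "no accessible process with PID $pid" >&2
    return 1
  fi

  # The target JVM writes the file itself, so the path must be writable by it.
  # jattach's own messages go to stderr so that stdout carries only the path.
  jattach "$pid" dumpheap "$out" >&2 || return 1
  echo "$out"
}
```

A call such as `dump_heap 1` would then be a typical use in a container, where the JVM often runs as PID 1.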
Moving the data
Depending on the heap size of the program, there is now quite a lot of data to move. This can be quite annoying and slow with scp. One obvious workaround in cloud environments is to upload the data to the object storage of your cloud provider and download it from there to your local machine. The only obstacle is likely to be missing authorization to do so. That means changing the IAM policies of your machine/pod to allow the upload and changing them back afterwards, which is quite cumbersome for a one-time need.
Luckily, S3 has a feature called presigned URLs: you can use it to generate a one-off URL that allows you to upload the data to S3 without the hassle of changing IAM policies. This useful feature is not fully supported by the AWS CLI yet: there is only support for creating URLs to share data that is already in S3. I wrote a little utility to create URLs for upload. Like jattach, it is a single binary without dependencies.
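Once a presigned upload URL exists, the upload itself is a plain HTTP PUT, so curl is enough on the machine holding the Heap Dump. The following is a minimal sketch under the assumption that curl is available there; the function name upload_heap_dump is hypothetical, and the URL has to be generated beforehand on a machine that does have the necessary credentials.

```shell
#!/usr/bin/env bash
# Sketch: upload a heap dump to a presigned URL with curl.
# Assumptions: curl is installed; upload_heap_dump is a hypothetical helper name.

upload_heap_dump() {
  local file="$1"
  local url="$2"

  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 1
  fi

  # --upload-file sends the file as the body of an HTTP PUT.
  # --fail makes curl exit non-zero on HTTP errors, e.g. when the URL has expired.
  curl --fail --silent --show-error --upload-file "$file" "$url"
}
```

Quote the URL when calling the function, since presigned URLs contain `&` characters. Also keep in mind that S3 limits a single PUT to 5 GB, so a very large Heap Dump may need to be compressed first.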
Analysis
In all the cases I came across, the reason for the memory issue was quite apparent when looking at the data in the Heap Dump: whatever fills up most of the heap tends to be the root cause of the problem. There are quite a lot of programs for analyzing Heap Dumps. MAT worked well for me and is free to use, but it seems that YourKit is still the gold standard in that domain.