String Deduplication in Java

This article explains and demonstrates String Deduplication in Java.

TL;DR
String Deduplication allows multiple Strings to share the same backing array. It is activated with the JVM option -XX:+UseStringDeduplication.
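As a minimal sketch, the hypothetical helper below checks whether the running JVM was started with the -XX:+UseStringDeduplication option. The class and method names are made up for illustration; only the flag itself and the RuntimeMXBean call come from the JDK. It inspects the arguments reported by the JVM and does not prove that deduplication is actually taking place.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class DedupFlagCheck {

    // The HotSpot option that enables String Deduplication.
    public static final String DEDUP_FLAG = "-XX:+UseStringDeduplication";

    // Returns true if the given JVM arguments contain the deduplication option.
    public static boolean isDeduplicationRequested(List<String> jvmArgs) {
        return jvmArgs.contains(DEDUP_FLAG);
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with, for example:
        //   java -XX:+UseG1GC -XX:+UseStringDeduplication DedupFlagCheck
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("String Deduplication requested: "
                + isDeduplicationRequested(jvmArgs));
    }
}
```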

I’m using OpenJDK 13, but as long as you’re using Java 9 or above you will be able to follow along and reproduce the results presented here.

First, let’s explain the basics of the String type. A String contains a field called value, which holds the actual content (the characters); since Java 9 this field is a byte array.

You probably know that it is bad practice to create new String objects with the String constructor instead of using a “String literal”.
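A minimal sketch of an application for this kind of experiment could look as follows. The class name, the string content and the exact composition of the list are assumptions made for illustration: two “worst” strings built from a char array, one “bad” string created with the String constructor from a literal, and one “good” literal. The program prints its PID and then waits, so that a heap dump can be taken while the list is still alive.

```java
import java.util.ArrayList;
import java.util.List;

public class StringDedupDemo {

    // Builds four strings with equal content but different origins.
    public static List<String> buildStrings() {
        List<String> strings = new ArrayList<>();
        char[] chars = {'h', 'e', 'l', 'l', 'o'};
        strings.add(new String(chars));   // "worst": new object with its own copy of the content
        strings.add(new String(chars));   // "worst": another new object, another copy
        strings.add(new String("hello")); // "bad": new object created from a literal
        strings.add("hello");             // "good": the literal from the String pool
        return strings;
    }

    public static void main(String[] args) throws Exception {
        List<String> strings = buildStrings();

        // Equal by content, but the "bad" string is a different object than the literal.
        System.out.println(strings.get(2).equals(strings.get(3))); // prints "true"
        System.out.println(strings.get(2) == strings.get(3));      // prints "false"

        System.out.println("PID: " + ProcessHandle.current().pid());

        // Keep the JVM (and the list) alive until Enter is pressed.
        System.in.read();
        System.out.println("Strings held: " + strings.size());
    }
}
```

On Java 11 or later the file can be started directly with java StringDedupDemo.java; on older versions, compile it with javac first.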

Then open up another console window to extract the heap dump (replace {PID} with your pid).
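Besides command-line tools, a HotSpot JVM can also write a heap dump from inside the process through HotSpotDiagnosticMXBean. The sketch below is an alternative to the external approach described above, not the method used in this article; it assumes a HotSpot-based JDK, and the class name and output file name are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeapDumper {

    // Writes a heap dump of the current JVM to the given .hprof file.
    // With live == true only reachable objects are included in the dump.
    public static void dump(Path file, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap throws an IOException if the target file already exists.
        bean.dumpHeap(file.toAbsolutePath().toString(), live);
    }

    public static void main(String[] args) throws IOException {
        dump(Paths.get("heap.hprof"), true);
        System.out.println("Heap dump written to heap.hprof");
    }
}
```

The resulting file can be opened in VisualVM in the same way as a dump taken from another console window.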

Open this dump in VisualVM, select “Objects”, then “GC Roots”, and navigate to the ArrayList. You will find that the ArrayList contains 4 elements, as below.

Notice that the first 4 strings are different references, while the last two are the same. Expanding the elements further, you can see that the first two strings also point to two different byte arrays (the value field in String).

The “bad” string contains the same value field because the String constructor is given a String literal, which comes from the String pool, and the new String reuses that literal’s value array. In this case it’s unnecessary to use the String constructor; there are some special cases where it can be useful, but they are not discussed here.

What to do? Here is where String Deduplication comes into the picture. (Note that String Deduplication only works for the G1 garbage collector.) Run the following commands to start the application.

Then open up another console window to force a GC and then extract the heap dump (replace {PID} with your pid).

We do a GC first because that is when String Deduplication happens, which can result in slightly longer pause times. Hopefully, though, this makes other phases more efficient, as fewer objects need to be moved around.

Now open up VisualVM and note the difference in the underlying value field of the first two strings.

Expanded the “worst” strings with String Deduplication.

Voilà, they now share the same underlying value field.
