DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Singleton: 6 Ways To Write and Use in Java Programming
  • Google Guava and Its Two Fantastic Libraries: Graph and Eventbus

Trending

  • Immutable Secrets Management: A Zero-Trust Approach to Sensitive Data in Containers
  • Optimizing Integration Workflows With Spark Structured Streaming and Cloud Services
  • Cosmos DB Disaster Recovery: Multi-Region Write Pitfalls and How to Evade Them
  • Endpoint Security Controls: Designing a Secure Endpoint Architecture, Part 2
  1. DZone
  2. Coding
  3. Java
  4. Guava Splitter vs StringUtils

Guava Splitter vs StringUtils

By 
Tom Jefferys user avatar
Tom Jefferys
·
Apr. 25, 12 · Interview
Likes (0)
Comment
Save
Tweet
Share
40.0K Views

Join the DZone community and get the full member experience.

Join For Free

So I recently wrote a post about good old reliable Apache Commons StringUtils, which provoked a couple of comments, one of which was that Google Guava provides better mechanisms for joining and splitting Strings. I have to admit, this is a corner of Guava I've yet to explore. So thought I ought to take a closer look, and compare with StringUtils, and I have to admit I was surprised at what I found.

Splitting strings eh? There can't be many different ways of doing this surely?

Well Guava and StringUtils do take a sylisticly different approach. Lets start with the basic usage.

// Apache StringUtils...
String[] tokens1 = StringUtils.split("one,two,three",',');
 
// Guava splitter...
Iterable<String> tokens2 = Splitter.on(',').split("one,two,three");

So, my first observation is that Splitter is more object orientated. You have to create a splitter object, which you then use to do the splitting. Whereas the StringUtils splitter methods uses a more functional style, with static methods.

Here I much prefer Splitter. Need a reusable splitter that splits comma separated lists? A splitter that also trims leading and trailing white space, and ignores empty elements? Not a problem:

Splitter niceCommaSplitter = Splitter.on(',')
                              .omitEmptyString()
                              .trimResults();
 
niceCommaSplitter.split("one,, two,  three"); //"one","two","three"
niceCommaSplitter.split("  four  ,  five  "); //"four","five"

That looks really useful, any other differences?

The other thing to notice is that Splitter returns an Iterable<String>, whereas StringUtils.split returns a String array.

Don't really see that making much of a difference, most of the time I just want to loop through the tokens in order anyway!

I also didn't think it was a big deal, until I examined the performance of the two approaches. To do this I tried running the following code:

final String numberList = "One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten";
 
long start = System.currentTimeMillis(); 
for(int i=0; i<1000000; i++) {
    StringUtils.split(numberList , ',');  
}
System.out.println(System.currentTimeMillis() - start);
   
start = System.currentTimeMillis();
for(int i=0; i<1000000; i++) {
    Splitter.on(',').split(numberList );
}
System.out.println(System.currentTimeMillis() - start);

On my machine this output the following times:

594
31

Guava's Splitter is almost 10 times faster!

Now this is a much bigger difference than I was expecting, Splitter is over 10 times faster than StringUtils. How can this be? Well, I suspect it's something to do with the return type. Splitter returns an Iterable<String>, whereas StringUtils.split gives you an array of Strings! So Splitter doesn't actually need to create new String objects.

It's also worth noting you can cache your Splitter object, which results in an even faster runtime.

Blimey, end of argument? Guava's Splitter wins every time?

Hold on a second. This isn't quite the full story. Notice we're not actually doing anything with the result of the Strings? Like I mentioned, it looks like the Splitter isn't actually creating any new Strings. I suspect it's actually deferring this to the Iterator object it returns.

So can we test this?

Sure thing. Here's some code to repeatedly check the lengths of the generated substrings:

final String numberList = "One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten";
long start = System.currentTimeMillis(); 
for(int i=0; i<1000000; i++) {
  final String[] numbers = StringUtils.split(numberList, ',');
    for(String number : numbers) {
      number.length();
    }
  }
System.out.println(System.currentTimeMillis() - start);
   
Splitter splitter = Splitter.on(',');
start = System.currentTimeMillis();
for(int i=0; i<1000000; i++) {
  Iterable<String> numbers = splitter.split(numberList);
    for(String number : numbers) {
      number.length();
    }
  }
System.out.println(System.currentTimeMillis() - start);

On my machine this outputs:

609
2048

Guava's Splitter is almost 4 times slower!

Indeed, I was expecting them to be about the same, or maybe Guava slightly faster, so this is another surprising result. Looks like by returning an Iterable, Splitter is trading immediate gains, for longer term pain. There's also a moral here about making sure performance tests are actually testing something useful.

In conclusion I think I'll still use Splitter most of the time. On small lists the difference in performance is going to be negligible, and Splitter just feels much nicer to use. Still I was surprised by the result, and if you're splitting lots of Strings and performance is an issue, it might be worth considering switching back to Commons StringUtils.

 

 

 

 

 

 

 

 

Google Guava

Published at DZone with permission of Tom Jefferys, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Singleton: 6 Ways To Write and Use in Java Programming
  • Google Guava and Its Two Fantastic Libraries: Graph and Eventbus

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!