DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Implementing Multi-Level Caching in Java With NCache
  • Mastering Distributed Caching on AWS: Strategies, Services, and Best Practices
  • Architectural Insights: Designing Efficient Multi-Layered Caching With Instagram Example
  • Introduction Garbage Collection Java

Trending

  • The 4 R’s of Pipeline Reliability: Designing Data Systems That Last
  • AI's Dilemma: When to Retrain and When to Unlearn?
  • Comprehensive Guide to Property-Based Testing in Go: Principles and Implementation
  • Unlocking AI Coding Assistants Part 2: Generating Code
  1. DZone
  2. Data Engineering
  3. Data
  4. In-Process Caching vs. Distributed Caching

In-Process Caching vs. Distributed Caching

By 
Faheem Sohail user avatar
Faheem Sohail
·
Aug. 08, 13 · Interview
Likes (8)
Comment
Save
Tweet
Share
106.5K Views

Join the DZone community and get the full member experience.

Join For Free

In this post I will compare the pros and cons of using an in-process cache versus a distributed cache and when you should use either one. First, let's see what each one is.

As the name suggests, an in-process cache is an object cache built within the same address space as your application. The Google Guava Library provides a simple in-process cache API that is a good example. On the other hand, a distributed cache is external to your application and quite possibly deployed on multiple nodes forming a large logical cache. Memcached is a popular distributed cache. Ehcache from Terracotta is a product that can be configured to function either way.

Following are some considerations that should be kept in mind when making a decision.

Considerations In-Process Cache Distributed Cache Comments
Consistency While using an in-process cache, your cache elements are local to a single instance of your application. Many medium-to-large applications, however, will not have a single application instance as they will most likely be load-balanced. In such a setting, you will end up with as many caches as your application instances, each having a different state resulting in inconsistency. State may however be eventually consistent as cached items time-out or are evicted from all cache instances. Distributed caches, although deployed on a cluster of multiple nodes, offer a single logical view (and state) of the cache. In most cases, an object stored in a distributed cache cluster will reside on a single node in a distributed cache cluster. By means of a hashing algorithm, the cache engine can always determine on which node a particular key-value resides. Since there is always a single state of the cache cluster, it is never inconsistent. If you are caching immutable objects, consistency ceases to be an issue. In such a case, an in-process cache is a better choice as many overheads typically associated with external distributed caches are simply not there. If your application is deployed on multiple nodes, you cache mutable objects and you want your reads to  always be consistent rather than eventually consistent, a distributed cache is the way to go.
Overheads This dated but very descriptive article describes how an in-process cache can negatively effect performance of an application with an embedded cache primarily due to garbage collection overheads. Your results however are heavily dependent on factors such as the size of the cache and how quickly objects are being evicted and timed-out. A distributed cache will have two major overheads that will make it slower than an in-process cache (but better than not caching at all): network latency and object serialization As described earlier, if you are looking for an always-consistent global cache state in a multi-node deployment, a distributed cache is what you are looking for (at the cost of performance that you may get from a local in-process cache).
Reliability An in-process cache makes use of the same heap space as your program so one has to be careful when determining the upper limits of memory usage for the cache. If your program runs out of memory there is no easy way to recover from it. A distributed cache runs as an independent processes across multiple nodes and therefore failure of a single node does not result in a complete failure of the cache. As a result of a node failure, items that are no longer cached will make their way into surviving nodes on the next cache miss. Also in the case of distributed caches, the worst consequence of a complete cache failure should be degraded performance of the application as opposed to complete system failure. An in-process cache seems like a better option for a small and predictable number of frequently accessed, preferably immutable objects. For large, unpredictable volumes, you are better off with a distributed cache.

Recommendation

For a small, predictable number of preferably immutable objects that have to be read multiple times, an in-process cache is a good solution because it will perform better than a distributed cache. However, for cases in which the number of objects that can be or should be cached is unpredictable and large, and consistency of reads is a must-have, a distributed cache is perhaps a better solution even though it may not bring the same performance benefits as an in-process cache. It goes without saying that your application can use both schemes for different types of objects depending on what suits the scenario best.

Cache (computing) Distributed cache application Object (computer science)

Published at DZone with permission of Faheem Sohail, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Implementing Multi-Level Caching in Java With NCache
  • Mastering Distributed Caching on AWS: Strategies, Services, and Best Practices
  • Architectural Insights: Designing Efficient Multi-Layered Caching With Instagram Example
  • Introduction Garbage Collection Java

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!