
Allow customized expiry time for cache entry #1203

Closed
gissuebot opened this issue Oct 31, 2014 · 29 comments

Comments

@gissuebot

Original issue created by nikazim on 2012-11-14 at 11:01 AM


Allow per-entry expiry time for caches built with expireAfterWrite / expireAfterAccess (something like put(K key, V value, long timeout)).

@gissuebot
Author

Original comment posted by wasserman.louis on 2012-11-14 at 04:43 PM


(No comment entered for this change.)


Labels: Type-Addition, Package-Cache

@gissuebot
Author

Original comment posted by lowasser@google.com on 2012-12-20 at 08:18 PM


Wait. Wouldn't you want that only if the cache was not built with expireAfterWrite/expireAfterAccess?

In any event, this seems really awkward from an API perspective, in addition to being problematic from an implementation point of view. In particular, customized expiration times more or less require an O(log n) priority queue, as opposed to the current O(1) implementation, which is possible because we only maintain linked lists ordered by most recent write or most recent access.

@gissuebot
Author

Original comment posted by kevinb@google.com on 2013-03-12 at 06:43 PM


(No comment entered for this change.)


CC: fry@google.com

@gissuebot
Author

Original comment posted by Ben.Manes on 2013-08-21 at 05:01 AM


FYI,

You wouldn't use a priority queue with O(lg n) insertion and removal. Instead you'd require an expected range for expiration times and a resolution (e.g. 1..60 in minutes). This would allow you to create an array of linked lists (i.e. a bounded-height priority queue) so that add/remove/reorder is O(1). The penalty would be O(r): scanning the first element in every slot to see if an eviction is required.

The API would still be awkward, even more so than the O(lg n) version requested, but would be much cheaper to maintain.

(Not advocating, but mentioning as this feature is often requested)
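
For illustration, a minimal sketch of the bounded-height structure described above, assuming a 1..60 minute expiration range at one-minute resolution (all names are hypothetical; this is not part of Guava):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/**
 * Bounded-height priority queue: an array of linked lists bucketed by expiration
 * minute. add is O(1); expiring scans the head of each of the r slots, so it is
 * O(r) plus the number of entries actually evicted. Ordering within a slot is
 * approximate, which is acceptable because expiration is best effort.
 */
final class BucketedExpiryQueue<E> {
  private static final int SLOTS = 60;            // resolution: 1..60 minutes
  private static final long SLOT_MILLIS = 60_000; // one minute per slot

  private static final class Node<E> {
    final E value;
    final long deadlineMillis;

    Node(E value, long deadlineMillis) {
      this.value = value;
      this.deadlineMillis = deadlineMillis;
    }
  }

  private final Deque<Node<E>>[] slots;

  @SuppressWarnings("unchecked")
  BucketedExpiryQueue() {
    slots = new Deque[SLOTS];
    for (int i = 0; i < SLOTS; i++) {
      slots[i] = new ArrayDeque<>();
    }
  }

  /** O(1): bucket the entry by the minute in which it expires. */
  void add(E value, long deadlineMillis) {
    slots[slotFor(deadlineMillis)].addLast(new Node<>(value, deadlineMillis));
  }

  /** O(r): peek at the head of every slot and evict the entries that are due. */
  void expire(long nowMillis, Consumer<E> onExpire) {
    for (Deque<Node<E>> slot : slots) {
      Node<E> head;
      while ((head = slot.peekFirst()) != null && head.deadlineMillis <= nowMillis) {
        onExpire.accept(slot.pollFirst().value);
      }
    }
  }

  private int slotFor(long deadlineMillis) {
    return (int) ((deadlineMillis / SLOT_MILLIS) % SLOTS);
  }
}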

@gissuebot
Author

Original comment posted by kak@google.com on 2013-08-22 at 11:29 PM


(No comment entered for this change.)


Status: Research

@edudar

edudar commented Dec 8, 2015

+1 and went with net.jodah:expiringmap for now

@jonheg

jonheg commented Dec 16, 2015

+1 as well. Custom expiry time per entry would be nice.

@lialun

lialun commented Feb 6, 2016

+1

@jruillier

+1

@imgeorgeabraham

+1

@violette

+1 Indeed, custom expiry time per entry would be nice.

@ben-manes
Contributor

I think providing this feature in O(1) time with a clean API that does not detract from the existing one is a potentially difficult problem. In almost every algorithmic design I can think of, the approach could be served equally well by an external library decorating the cache.

The simplest O(1) approach is to rely on the maximum size constraint. Large caches (like memcached) augment this with a sweeper thread to proactively discard expired entries. The more general approach requires an O(lg n) algorithm to retain a sorted order. As we've tried to avoid managing our own threads, only use O(1) algorithms, and not be excessively opinionated about how the cache is used, this is a little difficult to resolve.

The first step is to make the argument for why variable expiration is useful in a general-purpose library. Most cases we came across, then and now, use a fixed time for domain-oriented caches. The variable cases were for specific applications, e.g. HTTP caching, where different data commingles into a single, large cache. Justifying this feature as a common case is needed before considering the implementation details.

The second step is whether the algorithm benefits from the internal mechanisms used by the cache (periodic maintenance). If not, e.g. the lazy O(1) approach, then the feature can easily be contributed as a decorating library (see the sketch below). That then sidesteps the next problem of how to evolve the API cleanly.
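
As a concrete illustration of that decorator approach (purely a sketch with hypothetical names, not a proposal for the Guava API), a wrapper that stores per-entry deadlines and lazily discards expired entries on read; combined with a maximum size, entries that are never read again are eventually pushed out by the size constraint:

import com.google.common.cache.Cache;
import java.util.concurrent.TimeUnit;

/** Illustrative decorator: lazy per-entry expiration on top of a size-bounded Cache. */
final class VariableExpiryCache<K, V> {
  private static final class Timestamped<V> {
    final V value;
    final long deadlineNanos;

    Timestamped(V value, long deadlineNanos) {
      this.value = value;
      this.deadlineNanos = deadlineNanos;
    }
  }

  private final Cache<K, Timestamped<V>> delegate;

  VariableExpiryCache(Cache<K, Timestamped<V>> delegate) {
    this.delegate = delegate;
  }

  /** Stores the value together with its per-entry deadline. */
  void put(K key, V value, long duration, TimeUnit unit) {
    delegate.put(key, new Timestamped<>(value, System.nanoTime() + unit.toNanos(duration)));
  }

  /** Returns the value, treating an expired entry as absent and evicting it lazily. */
  V getIfPresent(K key) {
    Timestamped<V> entry = delegate.getIfPresent(key);
    if (entry == null) {
      return null;
    }
    if (entry.deadlineNanos - System.nanoTime() <= 0) {
      delegate.invalidate(key);
      return null;
    }
    return entry.value;
  }
}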

@ben-manes
Contributor

I should amend the above since @lowasser referenced this ticket on SO and, as hinted at in my '14 comment, the approach I keep asking myself about is a hierarchical / hashed wheel timer. This is the scheme used by kernel timers: hierarchical clocks where a hand points to a bucket, and each entry is marked with the number of rotations remaining before it expires. This "approximated timer" scheme works because expiration is best effort and not exact. The hierarchical version is O(m) / O(1) time (where m is small) and would fit well into the cache's periodic maintenance.

This is the only approach I could see making it into the cache as a native feature. Unfortunately I still don't see a nice way to evolve the API (Cache and CacheLoader) to feel natural.

@jwcarman

Could we perhaps create a new interface and not evolve the existing API? It would be a special type of cache.

@Maaartinus

Unfortunately I still don't see a nice way to evolve the API (Cache and CacheLoader) to feel natural.

Wouldn't something similar to Weigher do? Maybe

interface Expirer<K, V> {
    long expireAt(K key, V value, long lastWritten, long lastAccessed);
}

maybe with methods determining if it's to be called after write and/or access.

@ben-manes
Contributor

For comparison, JSR-107 (JCache) has a version of this called ExpiryPolicy. It uses null, a sentinel value (zero), and exceptions as alternative return values to the duration. It does not provide the context (key, value, metadata), due to not wanting to fetch that information if the entry is not in local memory.

I think variable access expiration could be problematic, depending on the implementation details. It may also be the less useful variant and might not be worth supporting. So I might argue that this would be an expireAfterWrite with a function similar to your proposal.
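
For readers unfamiliar with it, the JSR-107 interface has roughly this shape (reproduced from memory, so treat it as approximate):

// javax.cache.expiry.ExpiryPolicy (approximate shape, shown for comparison only)
public interface ExpiryPolicy {
  Duration getExpiryForCreation(); // Duration.ZERO expires immediately, Duration.ETERNAL never
  Duration getExpiryForAccess();   // null leaves the current expiry unchanged
  Duration getExpiryForUpdate();   // null leaves the current expiry unchanged
}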

@ben-manes
Contributor

I am starting to work on this and feedback would be appreciated.

The goal of the Expiry interface is simplicity and avoiding object allocations. We could make it return either a duration or a point in time, in both cases expressed in nanoseconds.

interface Expiry<K, V> {
  long expireAfterCreate(K key, V value, long currentTimeNanos);
  long expireAfterUpdate(K key, V value, long currentTimeNanos, long expirationTimeNanos);
  long expireAfterRead(K key, V value, long currentTimeNanos, long expirationTimeNanos);
}

The duration approach would have the following characteristics:

  • The duration may not be zero or negative, meaning that it has to be re-evaluated to be considered expired (an immediately expired entry would interact poorly with loaders).
  • An entry can be made eternal by returning Long.MAX_VALUE
  • To avoid updating the expiration time on a read or update, you can return expirationTimeNanos - currentTimeNanos (see the sketch below).
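
A hypothetical caller implementing the duration semantics above might look like this (the key/value types, durations, and the "static/" prefix are illustrative; TimeUnit is java.util.concurrent.TimeUnit):

// Assumes java.util.concurrent.TimeUnit is imported.
Expiry<String, String> expiry = new Expiry<String, String>() {
  @Override public long expireAfterCreate(String key, String value, long currentTimeNanos) {
    return TimeUnit.MINUTES.toNanos(30);                      // new entries live for 30 minutes
  }
  @Override public long expireAfterUpdate(String key, String value,
      long currentTimeNanos, long expirationTimeNanos) {
    return key.startsWith("static/")
        ? Long.MAX_VALUE                                       // effectively eternal
        : TimeUnit.MINUTES.toNanos(30);                        // restart the 30 minute window
  }
  @Override public long expireAfterRead(String key, String value,
      long currentTimeNanos, long expirationTimeNanos) {
    return expirationTimeNanos - currentTimeNanos;             // keep the existing deadline
  }
};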

The point-in-time approach would instead behave as follows:

  • The new expiration time may not be less than the current time, which would mean the entry has already expired.
  • An entry can be made eternal by returning currentTimeNanos + Long.MAX_VALUE
  • To avoid updating the expiration time on a read or update, you can return expirationTimeNanos

Suggestions on preferences, a better name than Expiry, etc. would be appreciated.


Implementation-wise, this will be done using a hierarchical timer wheel (prototype). This allows add, remove, rescheduling, and expiring to be performed in O(1) time. Instead of using an O(lg n) priority queue, it uses a hierarchy of ring buffers, where each slot represents a time span (second, minute, hour). As the wheels rotate, the entries in the buckets are expired or rescheduled. This cascading effect amortizes the penalty, and all operations are run during the cache's maintenance operation.

[timer wheel diagram]
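
For readers unfamiliar with the structure, a toy sketch of the bucket indexing (the linked prototype differs; the spans, slot counts, and names here are illustrative):

/**
 * Toy sketch of hierarchical timer wheel indexing. Each wheel covers a coarser
 * time span; an entry is filed into the finest wheel whose total span can hold
 * its delay, and when a coarse slot comes due its entries cascade into the
 * finer wheels.
 */
final class TimerWheelSketch {
  // Duration of one slot per wheel, in nanoseconds: ~1 second, ~1 minute, ~1 hour.
  private static final long[] SLOT_SPANS =
      {1_000_000_000L, 60_000_000_000L, 3_600_000_000_000L};
  private static final int SLOTS_PER_WHEEL = 64;

  /** O(1): pick the wheel and slot for an entry expiring at deadlineNanos. */
  static int[] schedule(long nowNanos, long deadlineNanos) {
    long delay = Math.max(1, deadlineNanos - nowNanos);
    for (int wheel = 0; wheel < SLOT_SPANS.length; wheel++) {
      long wheelSpan = SLOT_SPANS[wheel] * SLOTS_PER_WHEEL;
      // Delays beyond the largest wheel are clamped into the top wheel.
      if (delay < wheelSpan || wheel == SLOT_SPANS.length - 1) {
        int slot = (int) Math.floorMod(deadlineNanos / SLOT_SPANS[wheel], (long) SLOTS_PER_WHEEL);
        return new int[] {wheel, slot};
      }
    }
    throw new AssertionError();
  }
}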

@Maaartinus

An entry can be made eternal by returning currentTimeNanos + Long.MAX_VALUE

Note that the sum is negative; a lone Long.MAX_VALUE should do. You can consider any time above, e.g., Long.MAX_VALUE / 2 as "never" (as there's no practical difference between allowing expiration durations of up to 146 or 292 years).
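
The wrap-around is easy to check:

long now = System.nanoTime();         // arbitrary-origin timestamp, typically positive
long eternal = now + Long.MAX_VALUE;  // overflows whenever now > 0
System.out.println(eternal < now);    // prints true for any positive now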

I guess, after an operation (create, update, or read), the corresponding method gets called. I wonder how their results compose. Does the last returned result win?

@ben-manes
Contributor

Technically no, but practically yes. Ticker uses System.nanoTime(), whose documentation states:

The value returned represents nanoseconds since some fixed but arbitrary origin time (perhaps in the future, so values may be negative).... The values returned by this method become meaningful only when the difference between two such values... is computed.

I am also leaning towards a duration API since that is consistent with the builder, e.g. expireAfterWrite.

And you're right, the last value wins. Otherwise we'd want an exclusive lock on the entry for reads to ensure atomicity. expireAfterRead (like access) has limited value as it is not about data freshness but lifetime (e.g. a session token).

@Maaartinus

@ben-manes You're right, I forgot about how System.nanoTime() works.

The duration API could possibly be simpler:

interface Expiry<K, V> {
  long expireAfterCreate(K key, V value);
  long expireAfterUpdate(K key, V value, long currentDuration);
  long expireAfterRead(K key, V value, long currentDuration);
}

as you usually don't care about the current time. This is questionable and currentDuration probably needs to be computed.

@ben-manes
Contributor

That's a good point on the API. My concern is that the duration may be a delta from a timestamp of an external resource. For example, basing it on a database row's creation_date with a 1hr TTL before a hard refresh is required. An example is a Google Maps geocode cache, whose entries may only be held for 30 days under the license.

In these cases you need the current time to calculate with. A call to nanoTime or currentTimeMillis can be expensive under contention, so minimizing those calls is good practice in hot paths. Also, currentTimeMillis can fluctuate by going backwards in time, so hopefully a caller wouldn't accidentally violate the contract by returning a zero/negative duration.

But perhaps I'm complicating the API by leaking optimizations? It is more important for an API to be elegant than fast. Per Josh Bloch's rule,

Consider the performance consequences of API design decisions, but don’t warp an API to achieve performance gains. Luckily, good APIs typically lend themselves to fast implementations.

@ben-manes
Contributor

interface Expiry<K, V> {
  long expireAfterCreate(K key, V value, long currentTime);
  long expireAfterUpdate(K key, V value, long currentTime, long currentDuration);
  long expireAfterRead(K key, V value, long currentTime, long currentDuration);
}

Do you think this is the best compromise, or should currentTime be dropped altogether? I agree that currentDuration is more useful than the previous expirationTime field.

@Maaartinus

Do you think this is the best compromise or that currentTime should be dropped altogether.

I'd keep it, assuming you're sure you can always provide it for free.

I agree that currentDuration is more useful than the previous expirationTime field.

It makes things simpler in the common case where you don't want to change the duration.

@ben-manes
Contributor

This feature is now implemented in Caffeine. It could be ported into Guava if a core member were so inclined.
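
For reference, usage against Caffeine's Expiry looks roughly like this (sketched from memory; the Token type, its ttlSeconds() accessor, and the chosen policies are illustrative):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.Expiry;
import java.util.concurrent.TimeUnit;

Cache<String, Token> cache = Caffeine.newBuilder()
    .expireAfter(new Expiry<String, Token>() {
      @Override public long expireAfterCreate(String key, Token token, long currentTime) {
        return TimeUnit.SECONDS.toNanos(token.ttlSeconds());  // per-entry lifetime
      }
      @Override public long expireAfterUpdate(String key, Token token,
          long currentTime, long currentDuration) {
        return TimeUnit.SECONDS.toNanos(token.ttlSeconds());  // reset on write
      }
      @Override public long expireAfterRead(String key, Token token,
          long currentTime, long currentDuration) {
        return currentDuration;                               // reads leave the expiry unchanged
      }
    })
    .build();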

@ben-manes
Contributor

@Maaartinus Unfortunately, thinking about currentTime after the release, I realized that it is misleading. The currentTime is from System.nanoTime(), which is a timestamp that should only be used to determine the elapsed time between two calls. It cannot be used when calculating against wall-clock time, where System.currentTimeMillis() is appropriate. For entries calculated from a timestamp, e.g. 30 days since the database record's creation_date, the caller cannot use the currentTime parameter. So while the parameter was free to provide, it will likely be useless and confusing.
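
For example (the Row type and its accessors are hypothetical), such a caller ends up reading the wall clock itself and simply ignoring currentTime:

// Hypothetical: a 30 day lifetime measured from the database row's creation_date.
// Assumes java.util.concurrent.TimeUnit is imported.
@Override public long expireAfterCreate(String key, Row row, long currentTime) {
  long ageMillis = System.currentTimeMillis() - row.getCreationDate().getTime();
  long remainingMillis = TimeUnit.DAYS.toMillis(30) - ageMillis;
  return TimeUnit.MILLISECONDS.toNanos(Math.max(remainingMillis, 1)); // never zero or negative
}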

In v3.0 I may need to drop it. One of the nice things about Guava is the internal baking period to catch these mistakes that an independent OSS library may make.

@vanjor

vanjor commented Dec 26, 2017

Any update on this issue? Voting for this feature.

@lowasser
Contributor

We are not doing any feature development for common.cache at this time.

@FoosballGG

Any clarification as to why this issue was closed unresolved? This was one of the most voted for feature requests for the Guava Cache.

@ben-manes
Contributor

@FoosballGG The feature is available in Caffeine, which includes Guava adapters. Currently Guava's cache is not being actively developed.
