...

I'm breaking this proposal up into three sections, since it involves pretty invasive changes not only to the core but also to existing (experimental) APIs, as well as the elimination of an experimental API.

API additions

We will use the term "Key" or "Cache Key" to mean the original data (by default the URL) that externally identifies the object. The term "ID" will be used for the post-hash data used internally to identify objects in the cache.

A type to represent the post-hash data. This should include a data pointer (either void* or uint8_t*) and a size indicator (an enumerated type or a literal data length).

TSCacheID
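
As a rough illustration of the type described above, the following C sketch pairs a data pointer with a literal length. The field names `data` and `len` and the helper `cache_id_equal` are assumptions made for this example; the proposal does not fix a layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: post-hash bytes plus a literal length. */
typedef struct {
  const uint8_t *data; /* points at the hashed bytes (not owned here) */
  size_t len;          /* number of valid bytes behind data */
} TSCacheID;

/* Hypothetical helper: returns 1 when both IDs hold the same bytes. */
int cache_id_equal(const TSCacheID *a, const TSCacheID *b)
{
  assert(a != NULL && b != NULL);
  return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Using an enumerated size indicator instead of `size_t` would work the same way, with the comparison switching on the enum value to find the byte count.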

The following APIs would be added:

  TSReturnCode TSHttpTxnCacheKeySet(TSHttpTxn txnp, void *data, size_t len);
  TSReturnCode TSHttpTxnCacheKeyUpdate(TSHttpTxn txnp, void *data, size_t len);
  TSCacheID *TSHttpTxnCacheIDGet(TSHttpTxn txnp);
  TSReturnCode TSCacheKeyGenerate(TSCacheID *result, void *data, size_t data_len);

The Set() API will perform the internal hashing function on the data, replacing the old cache key for the transaction completely. The Update() API will modify the currently active cache key (by default, a hash of the remapped request URL computed by the core) by, in effect, appending the data provided to the original cache key. This is the best way to make incremental changes to the cache key, e.g. in a plugin implementing generation IDs over a set of URLs. IDGet() will return the result of hashing the key. Generate() exposes the hashing function: it computes the hash ID from the input and stores it in the TSCacheID.
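
To make the difference between Set() and Update() concrete, here is a small self-contained model in C. It is only a sketch: it uses 64-bit FNV-1a as a stand-in for the core's real hash, and the names `txn_model`, `model_key_set`, `model_key_update` and `model_key_generate` are invented for this example and are not part of any Traffic Server API. FNV-1a is a streaming hash, so continuing from the current state is exactly equivalent to appending to the original input; a real implementation would only need to behave this way "in effect".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET_BASIS 14695981039346656037ULL
#define FNV_PRIME 1099511628211ULL

/* Continue a 64-bit FNV-1a hash from state over len bytes of data. */
static uint64_t fnv1a_continue(uint64_t state, const void *data, size_t len)
{
  const uint8_t *p = (const uint8_t *)data;
  for (size_t i = 0; i < len; i++) {
    state ^= p[i];
    state *= FNV_PRIME;
  }
  return state;
}

/* Hypothetical stand-in for a transaction holding a single cache ID. */
typedef struct {
  uint64_t cache_id;
} txn_model;

/* Models Set(): hash the data and replace the old ID completely. */
void model_key_set(txn_model *txn, const void *data, size_t len)
{
  assert(txn != NULL);
  txn->cache_id = fnv1a_continue(FNV_OFFSET_BASIS, data, len);
}

/* Models Update(): fold extra data into the currently active ID. */
void model_key_update(txn_model *txn, const void *data, size_t len)
{
  assert(txn != NULL);
  txn->cache_id = fnv1a_continue(txn->cache_id, data, len);
}

/* Models Generate(): expose the hash function without a transaction. */
uint64_t model_key_generate(const void *data, size_t len)
{
  return fnv1a_continue(FNV_OFFSET_BASIS, data, len);
}
```

In this model, a generation-ID plugin would call `model_key_update` with the current generation number after the core has set the default key, so bumping the generation changes the ID for every URL in the set without rebuilding the key from scratch.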

NOTE: We also need a way to optionally preserve the data used to generate the cache key for debugging purposes. Users who manipulate the cache key (e.g. using the cacheurl plugin) need to be able to verify their changes.

NOTE: We should also study the TSCacheKey API and determine whether it makes sense to extend or repurpose it.

API removal

I'm proposing that we eliminate the 2nd cache state machine and all the APIs related to it. This includes

...

  • Allow the cache core to tag a cache entry to be served stale, and for how long. This functionality would be exposed through three paths:
    1. Explicitly set via a plugin using new APIs.
    2. Implicitly set via the stale-while-revalidate Cache-Control: header as specified in RFC 5861. This could be overridden by a plugin, as in path 1 above.
    3. Defaulted via records.config and/or hosting.config settings. If in records.config, it could be configured per-remap via conf_remap plugin.
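
One possible reading of how the three paths combine is a simple precedence chain: a plugin setting wins, then the response header, then the configured default. The C sketch below shows that reading only; the struct `stale_inputs`, its fields, and `effective_stale_window` are hypothetical names, and the actual precedence is still open in this proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs gathered for one cache entry. */
typedef struct {
  bool plugin_set;         /* path 1: a plugin set a value explicitly */
  int plugin_secs;
  bool header_set;         /* path 2: stale-while-revalidate was present */
  int header_secs;
  int config_default_secs; /* path 3: default from configuration */
} stale_inputs;

/* Returns how long (in seconds) the entry may be served stale. */
int effective_stale_window(const stale_inputs *in)
{
  assert(in != NULL);
  if (in->plugin_set) {
    return in->plugin_secs; /* plugin overrides everything else */
  }
  if (in->header_set) {
    return in->header_secs; /* header overrides the configured default */
  }
  return in->config_default_secs;
}
```

With a per-remap override via conf_remap, only `config_default_secs` would change per rule in this model; the plugin and header paths would be unaffected.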

...

There's a large amount of code around this. I'm suggesting we remove the 2nd cache state machine, and reduce the (fairly large) number of URL objects stored. Instead, we store a single INK_MD5 for the cache key, and keep it updated through the state machine as well as via APIs. I will also provide benchmarks on the effects of this once completed, but we already have some anecdotal evidence of how expensive modifying the cache key is today via the current API.

We should also investigate how the existing cache key (TSCacheKey, for non-HTTP caches) can possibly interact with this.