The current state of affairs when it comes to cache lookup URLs is quite a mess. This document outlines the changes I'd like to make. Part of this is eliminating a few APIs that depend on a strange concept of a 2nd cache state machine. The proposal includes the alternatives I have for these APIs, but bear in mind that the implementation of those alternatives belongs in different project plans.

The problem

There are a number of issues with how we deal with cache URLs in the code right now:

  • There are a number of URL objects that are set up to store potential cache keys. It's quite a mess, and some of them are already dead code. Some of them come from the 2nd cache state machine (more details later). This makes the code overly complicated and suboptimal.
  • The current API (TSCacheUrlSet()) to modify the cache key is inflexible and slow. The reason for this is twofold:
    1. The API takes a string, which has to be parsed as a URL, creating a new URL object internally. This is not only slow, but also means the cache key has to be a parseable URL.
    2. To create the cache key (an INK_MD5), we have to stringify this URL object again (undoing the parsing), and do an MD5 over this string. These are simply wasted cycles.
  • There is no easy way to make a small modification to the cache key. You have to create an entirely new URL cache key.
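To illustrate the last point, here is a minimal sketch of what a plugin has to do today to fold a generation ID into the cache key: build a complete replacement URL string. The helper name build_generation_url() and the "gen-N" path layout are hypothetical and not part of any existing plugin; the resulting string would then be handed to TSCacheUrlSet(), where it is parsed, stringified again and hashed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a Traffic Server API): builds a full replacement
 * URL that carries a generation ID. Returns the number of characters
 * written, or -1 if the URL does not fit in the output buffer. */
static int
build_generation_url(char *out, size_t out_len, const char *host,
                     const char *path, unsigned generation)
{
  assert(out != NULL && host != NULL && path != NULL);

  /* The whole URL is rebuilt even though only one small component changes. */
  int n = snprintf(out, out_len, "http://%s/gen-%u%s", host, generation, path);

  if (n < 0 || (size_t)n >= out_len) {
    return -1; /* encoding error or truncated output */
  }
  return n;
}
```

With the current API, every change to the generation ID repeats this string building, plus the parse and re-stringify steps inside the core.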

Proposal

I'm breaking this proposal up into three sections, since it involves pretty invasive changes to not only the core, but also the elimination of an experimental API.

API additions

The following APIs would be added:

  TSReturnCode TSHttpCacheKeySet(void *data, size_t len);
  TSReturnCode TSHttpCacheKeyUpdate(void *data, size_t len);

The Set() API will perform the internal hashing function of the data, replacing the old cache key completely. The Update() API will modify the currently active cache key, defaulted by the core as a hash of the remapped request URL, using the data provided. This is the best way to make incremental changes to the cache key, e.g. in a plugin implementing generation IDs over a set of URLs.
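The difference between the two calls can be modelled with a small standalone sketch. Everything here is an assumption made for illustration: the sketch_ names are hypothetical, and a 64-bit FNV-1a hash stands in for the INK_MD5 the core would actually use, only because it keeps the example free of dependencies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the cache key. The proposal stores an INK_MD5; a 64-bit
 * FNV-1a value is used here purely for illustration. */
typedef struct {
  uint64_t hash;
} sketch_cache_key;

#define FNV_OFFSET_BASIS 0xcbf29ce484222325ULL
#define FNV_PRIME        0x100000001b3ULL

/* Fold len bytes into an existing FNV-1a state and return the new state. */
static uint64_t
fnv1a_fold(uint64_t state, const void *data, size_t len)
{
  const unsigned char *p = (const unsigned char *)data;

  for (size_t i = 0; i < len; i++) {
    state ^= p[i];
    state *= FNV_PRIME;
  }
  return state;
}

/* Models Set(): discard the old key and hash only the new data. */
static void
sketch_key_set(sketch_cache_key *key, const void *data, size_t len)
{
  assert(key != NULL && (data != NULL || len == 0));
  key->hash = fnv1a_fold(FNV_OFFSET_BASIS, data, len);
}

/* Models Update(): keep the current key and mix the new data into it. */
static void
sketch_key_update(sketch_cache_key *key, const void *data, size_t len)
{
  assert(key != NULL && (data != NULL || len == 0));
  key->hash = fnv1a_fold(key->hash, data, len);
}
```

In this model a generation ID plugin would call the Update() equivalent with just the few bytes of the generation ID, leaving the key derived from the remapped request URL in place. Neither call requires the data to be a parseable URL.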

API removal

I'm proposing that we eliminate the 2nd cache state machine, and all the APIs related to this. This includes

  tsapi TSReturnCode TSHttpTxnNewCacheLookupDo(TSHttpTxn txnp, TSMBuffer bufp, TSMLoc url_loc);
  tsapi TSReturnCode TSHttpTxnSecondUrlTryLock(TSHttpTxn txnp);

There's currently only one plugin using these APIs, the rfc5861 plugin. Instead of the old functionality, I propose we make two changes to the core (and additional new APIs):

  • Allow the cache core to tag a cache entry to be served stale, and for how long. This functionality would be exposed through three paths:
    1. Explicitly set via a plugin using new APIs.
    2. Implicitly set via the stale-while-revalidate directive of the Cache-Control: header, as specified in RFC 5861. This could be overridden by a plugin, as in path 1.
    3. Defaulted via records.config and/or hosting.config settings. If in records.config, it could be configured per-remap via the conf_remap plugin.
  • Allow the HTTP State Machine to restart the cache-sm any number of times (not just once or twice). This requires both changes to the core and additions to the APIs. It's a more generic replacement for the current 2nd cache state machine.

It's not within the scope of this cleanup project to implement either of these new features.

Code cleanup

There's a large amount of code around this. I'm suggesting we remove the 2nd cache state machine, and reduce the number of URL objects stored. Instead, we store a single INK_MD5 for the cache key. We should also investigate how the existing cache key (TSCacheKey, for non-HTTP caches) can possibly interact with this.
