
...

An extreme example of this happens when users share a limited, possibly unreliable Internet connection, as is common in parts of Africa, for example.

How to cache openSUSE repositories with Squid is another example of a use case where picking a URL that's already cached is valuable.

...

When it sees a response with a "Location: ..." header and a "Digest: SHA-256=..." header, it checks if the URL in the Location header is already cached. If it isn't, it tries to find a URL that is cached to use instead. It looks in the cache for an object that matches the digest in the Digest header, and if it succeeds, it rewrites the Location header with that object's URL.

This way a client should get sent to a URL that's already cached, and the proxy won't have to download the file again.
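
For example, a mirror might redirect a client with a response like this (the hostname here is a placeholder and the digest value is left symbolic):

Code Block

HTTP/1.1 302 Found
Location: http://mirror.example.com/pub/example.iso
Digest: SHA-256=<base64 encoding of the file's SHA-256 hash>

If http://mirror.example.com/pub/example.iso isn't already cached but some other cached object has the same SHA-256 digest, the plugin rewrites the Location header to point at that object's URL.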

...

Just build the plugin and add it to your plugin.config file.

The code is distributed along with recent versions of Traffic Server, in the "plugins/experimental/metalink" directory. To build it, pass the "--enable-experimental-plugins" option to the configure script when you build Traffic Server:

Code Block
languagebash

$ ./configure --enable-experimental-plugins
$ make

When you're done building Traffic Server, add "metalink.so" to your plugin.config file to start using the plugin.
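
Assuming metalink.so ends up in Traffic Server's plugin directory, the plugin.config entry is just the file name:

Code Block

metalink.so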

It implements TS_HTTP_SEND_RESPONSE_HDR_HOOK to check and potentially rewrite the "Location: ..." and "Digest: SHA-256=..." headers after responses are cached. It doesn't do this before they're cached because the contents of the cache can change after responses are cached. It uses TSCacheRead() to check if the URL in the "Location: ..." header is already cached. In the future, the plugin should also check whether that URL is still fresh.
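
The sketch below illustrates that flow. It is not the plugin's actual code: plugin registration, error handling, the header rewriting itself, and cleanup of the URL buffer and cache key are all omitted.

Code Block
languagec

/* Sketch only: registration, error handling, header rewriting, and cleanup
 * of the URL buffer and cache key are omitted. */
#include <ts/ts.h>

static int
cache_read_handler(TSCont contp, TSEvent event, void *edata)
{
  TSHttpTxn txnp = (TSHttpTxn) TSContDataGet(contp);

  if (event == TS_EVENT_CACHE_OPEN_READ) {
    /* The Location URL is already cached, so the header can stay as it is. */
    TSVConnClose((TSVConn) edata);
  } else {
    /* TS_EVENT_CACHE_OPEN_READ_FAILED: this is where the plugin would look
     * up the "Digest: SHA-256=..." value and rewrite the Location header if
     * it finds a cached object with that digest. */
  }

  TSHttpTxnReenable(txnp, TS_EVENT_HTTP_CONTINUE);
  TSContDestroy(contp);
  return 0;
}

static int
send_response_handler(TSCont contp, TSEvent event, void *edata)
{
  TSHttpTxn txnp = (TSHttpTxn) edata;
  TSMBuffer bufp;
  TSMLoc hdr_loc;

  if (TSHttpTxnClientRespGet(txnp, &bufp, &hdr_loc) == TS_SUCCESS) {
    TSMLoc field_loc = TSMimeHdrFieldFind(bufp, hdr_loc, TS_MIME_FIELD_LOCATION, TS_MIME_LEN_LOCATION);

    if (field_loc != TS_NULL_MLOC) {
      int length;
      const char *value = TSMimeHdrFieldValueStringGet(bufp, hdr_loc, field_loc, -1, &length);
      const char *end   = value + length;

      /* Parse the Location value into a URL and use it as the cache key. */
      TSMBuffer url_bufp = TSMBufferCreate();
      TSMLoc url_loc;

      if (TSUrlCreate(url_bufp, &url_loc) == TS_SUCCESS &&
          TSUrlParse(url_bufp, url_loc, &value, end) == TS_PARSE_DONE) {
        TSCacheKey key = TSCacheKeyCreate();
        TSCacheKeyDigestFromUrlSet(key, url_loc);

        TSCont cache_contp = TSContCreate(cache_read_handler, TSMutexCreate());
        TSContDataSet(cache_contp, txnp);
        TSCacheRead(cache_contp, key);
        return 0; /* the transaction is reenabled from cache_read_handler */
      }
    }
  }

  TSHttpTxnReenable(txnp, TS_EVENT_HTTP_CONTINUE);
  return 0;
}

void
TSPluginInit(int argc, const char *argv[])
{
  TSHttpHookAdd(TS_HTTP_SEND_RESPONSE_HDR_HOOK, TSContCreate(send_response_handler, NULL));
}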

It implements TS_HTTP_READ_RESPONSE_HDR_HOOK and a null transformation to compute the SHA-256 digest for content as it's added to the cache. It uses SHA256_Init(), SHA256_Update(), and SHA256_Final() from OpenSSL to compute the digest, then it uses TSCacheWrite() to associate the digest with the request URL. This adds a new cache object where the key is the digest and the object is the request URL.
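
A rough sketch of that digest bookkeeping follows. The helper names (digest_init, digest_update, digest_final) are illustrative only, and the vconnection plumbing that actually copies content through the null transformation is omitted.

Code Block
languagec

/* Sketch only: helper names are illustrative and the transform vconnection
 * plumbing is omitted. */
#include <openssl/sha.h>
#include <ts/ts.h>

typedef struct {
  TSHttpTxn txnp;
  SHA256_CTX ctx; /* running SHA-256 state for the response content */
} TransformData;

/* Called once, before the first chunk of content arrives. */
static void
digest_init(TransformData *data, TSHttpTxn txnp)
{
  data->txnp = txnp;
  SHA256_Init(&data->ctx);
}

/* Called for each block of content that passes through the transformation. */
static void
digest_update(TransformData *data, const char *block, int64_t length)
{
  SHA256_Update(&data->ctx, block, (size_t) length);
}

/* Called when the content is complete: key a new cache object on the digest
 * so that the request URL can be stored as its content. */
static void
digest_final(TransformData *data, TSCont contp)
{
  unsigned char digest[SHA256_DIGEST_LENGTH];

  SHA256_Final(digest, &data->ctx);

  TSCacheKey key = TSCacheKeyCreate();
  TSCacheKeyDigestSet(key, (const char *) digest, (int) sizeof(digest));

  /* The continuation later receives a cache vconnection and writes the
   * request URL into the new object. */
  TSCacheWrite(contp, key);
}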

To check if the cache already contains content that matches a digest, the plugin must call TSCacheRead() with the digest as the key, read the URL stored in the resultant object, and then call TSCacheRead() again with this URL as the key. This is probably inefficient and should be improved.
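
A condensed sketch of that two-step lookup is below; it is only an outline, with the buffer bookkeeping and error paths left out.

Code Block
languagec

/* Sketch only: buffer bookkeeping and error handling are omitted. */
#include <ts/ts.h>

static int
digest_lookup_handler(TSCont contp, TSEvent event, void *edata)
{
  switch (event) {
  case TS_EVENT_CACHE_OPEN_READ: {
    /* First lookup succeeded: the object keyed on the digest contains a URL.
     * Read it out of the cache vconnection. */
    TSVConn vconn = (TSVConn) edata;
    TSIOBuffer buffer = TSIOBufferCreate();

    TSVConnRead(vconn, contp, buffer, TSVConnCacheObjectSizeGet(vconn));
    break;
  }
  case TS_EVENT_VCONN_READ_COMPLETE:
    /* The URL is now in the buffer: parse it, build a cache key from it, and
     * call TSCacheRead() a second time to confirm the content itself is still
     * cached before rewriting the Location header. */
    break;
  case TS_EVENT_CACHE_OPEN_READ_FAILED:
  default:
    /* No object for this digest (or it has since been evicted), so leave the
     * Location header alone. */
    break;
  }

  return 0;
}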

...

The "Digest: SHA-256=..." header is also more efficient than "Link: <...>; rel=duplicate" headers because it involves a constant number of cache lookups. RFC 6249 requires a "Digest: SHA-256=..." header or "Link: <...>; rel=duplicate" headers MUST be ignored:

If Instance Digests are not provided by the Metalink servers, the Link header fields pertaining to this specification MUST be ignored.
Metalinks contain whole file hashes as described in Section 6, and MUST include SHA-256, as specified in [FIPS-180-3].

Alex Rousskov pointed out a project for Squid to implement Duplicate Transfer Detection.

Per Jessen is working on another project for Squid with a similar goal: http://wiki.jessen.ch/index/How_to_cache_openSUSE_repositories_with_Squid