What is HTTP Health Checking for?

  • HTTP health checking is a Layer 7 (L7) health check method: it sends a dedicated HTTP request to the origin server and checks for a well-defined healthy result, typically the HTTP response code.
  • Why is L7 health checking required?
    • Maintenance: nowadays most sites have more than one origin server for load balancing and service redundancy. Since servers are expected to have maintenance downtime, there needs to be an easy-to-manage way to take a server out of service.
    • Broken service: even when the L4 port is open, the server may still fail to handle any HTTP request.
    • Broken back-end service: when the origin server runs a LAMP-like stack and MySQL is down, requests will probably return bad results.
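
As a rough illustration of the kind of probe described above, here is a minimal Python sketch. It is not Traffic Server code: the function names probe_origin and is_healthy_status are hypothetical, and treating only 2xx responses as healthy is an assumption, since this page only says the check typically looks at the HTTP response code.

```python
import http.client
from typing import Optional


def is_healthy_status(status: int) -> bool:
    """Assumption: any 2xx response code means the origin is healthy."""
    return 200 <= status < 300


def probe_origin(host: str, port: int = 80, path: str = "/status.html",
                 method: str = "GET", host_header: Optional[str] = None,
                 timeout: float = 2.0) -> bool:
    """Send one HTTP request and report whether the origin looks healthy.

    A connection error, timeout, or malformed response counts as unhealthy,
    which covers the "port is open but HTTP is broken" case.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Passing an explicit Host header stops http.client from adding its own.
        conn.request(method, path, headers={"Host": host_header or host})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return is_healthy_status(response.status)
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

A status page that itself queries the database would let such a probe catch the back-end failure case as well, because the origin can then answer with a non-2xx code while its port is still open.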

What has been done in TS?

  • L4 health checking:
    • when a connection, read, or write fails, the L4 failure function is triggered
    • the failed host is put back into the ring to be retried after the failure timeout expires
  • L4 health status is stored in HostDB
  • HostDB information is shared across the cluster, when enabled
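
The "mark failed, retry after the failure timeout" behaviour can be sketched as follows. This is only an illustration in Python, not the HostDB implementation: the class L4HostStatus, its method names, and the 30-second window are all assumptions made for the example.

```python
import time


class L4HostStatus:
    """Tracks L4 failures for one origin address (hypothetical helper)."""

    def __init__(self, fail_window: float = 30.0, clock=time.monotonic):
        self.fail_window = fail_window  # seconds to wait before retrying (assumed value)
        self._clock = clock             # injectable clock, handy for testing
        self._failed_at = None          # None means the host is considered up

    def mark_failed(self) -> None:
        """Record a connect/read/write failure at the current time."""
        self._failed_at = self._clock()

    def mark_ok(self) -> None:
        """Clear the failure after a successful transaction."""
        self._failed_at = None

    def is_eligible(self) -> bool:
        """True if the host never failed or its failure window has elapsed."""
        if self._failed_at is None:
            return True
        return self._clock() - self._failed_at >= self.fail_window
```

A selector would skip hosts for which is_eligible() is false, and call mark_ok() or mark_failed() again once the retried connection finishes.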

What do we need to implement for L7 checking?

  • In the beginning we'll start off with plain HTTP services, excluding raw-link services like HTTPS.
  • We'll merge it into the server-session manager.
  • L7 status will be stored in the HostDB info struct.
  • We must reuse server sessions as much as possible.
  • We'll create a dedicated config file, with each config mapped to an origin hostname.
  • L7 health checking should be triggered by a user request for the origin hostname.
  • As with L4 health checks, it should be possible to store a hashed L7 status in the cluster ENV and share it across the cluster.

How to use the L7 OS health checking?

  • Syntax (the full documentation is in health_check.config):
    • Each directive may have the following tags:
      • hostname: the hostname to health-check; use the origin server name here
      • port: the TCP port to check; defaults to the HTTP port, 80
      • interval: the check interval in seconds; defaults to 5
      • request_method: the HTTP method for the check request; defaults to GET
      • request_path: the HTTP request path sent to the origin server; defaults to /status.html
      • request_host: the HTTP request Host: header; defaults to hostname
    • The "hostname" tag indicates the start of a config directive.
  • Example:
    hostname=trafficserver.apache.org;request_path=/status.html
    This enables L7 health checking for the origin server (OS) trafficserver.apache.org and affects all requests in TS that map to trafficserver.apache.org.
  • comments:
    • Health checking does not start until HostDB holds a DNS entry for the target OS. That is, health checking is driven by user requests, so no resources are wasted until they are needed.
    • The L7 health status is verified while choosing and opening a ServerSession.
    • When every server fails its check, the status falls back to treating all of them as good, so that requests still have some chance of being processed.
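
To make the directive syntax and the all-failed fallback concrete, here is a small Python sketch. It is an illustration under stated assumptions rather than the actual parser: it assumes one directive per line with tags separated by ";" (as in the example above), does not enforce that hostname comes first, and the names HealthCheck, parse_directive, and usable_servers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Tag names and defaults are taken from the syntax list on this page.
_TAGS = {"hostname", "port", "interval",
         "request_method", "request_path", "request_host"}


@dataclass
class HealthCheck:
    hostname: str
    port: int = 80
    interval: int = 5                      # seconds
    request_method: str = "GET"
    request_path: str = "/status.html"
    request_host: Optional[str] = None     # falls back to hostname

    def __post_init__(self):
        if self.request_host is None:
            self.request_host = self.hostname


def parse_directive(line: str) -> HealthCheck:
    """Parse one 'tag=value;tag=value' directive into a HealthCheck."""
    fields: Dict[str, object] = {}
    for part in line.strip().split(";"):
        if not part.strip():
            continue  # tolerate a trailing ';'
        key, sep, value = part.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or key not in _TAGS:
            raise ValueError(f"unknown or malformed tag: {part!r}")
        fields[key] = int(value) if key in ("port", "interval") else value
    if "hostname" not in fields:
        raise ValueError("directive has no hostname tag")
    return HealthCheck(**fields)


def usable_servers(l7_status: Dict[str, bool]) -> List[str]:
    """Return servers whose last L7 check passed.

    If every server failed, fall back to returning all of them, so that a
    request still has some chance of being processed.
    """
    healthy = [addr for addr, ok in l7_status.items() if ok]
    return healthy if healthy else list(l7_status)
```

Parsing the example directive above with this sketch gives port 80, interval 5, method GET, and a Host header equal to trafficserver.apache.org, since only hostname and request_path are set explicitly.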