Comment: Migrated to Confluence 5.3

...

  • Should provide a way for operators to monitor the framework QPS.
    • This tells them what values to use when configuring the limiter, how much traffic a framework has put on the Master, and how frameworks react to throttling.
  • (Stage 2, not in current scope) Should support online tuning of rate limits (add/delete/update).
  • (Stage 3, not in current scope) The rate limit configuration should survive Master failover.

Usage

RateLimits Configuration

The user specifies the rate limits in JSON format via a Master flag.

mesos-master --rate_limits=rates.json

JSON format

Sample rates.json:

{
    "limits": [
        {
            "principal": "foo",
            "qps": 55.5,
            "capacity": 100000
        },
        {
            "principal": "bar"
        }
    ],
    "aggregate_default_qps": 33.3,
    "aggregate_default_capacity": 1000000
}

The JSON contains a list of per-principal limits (principal, qps, capacity) plus the aggregate_default_qps and aggregate_default_capacity fields.

  • principal: (Required) Frameworks are identified by their principal, a la --credentials.
    • In the current implementation the principal uniquely identifies the throttled entity. In the future there might be finer-grained entities under the principal.
    • Multiple frameworks can use the same principal (e.g. some Mesos users launch a framework instance for each job), in which case the combined traffic from all frameworks using that principal is throttled at the specified QPS.
  • qps: (Optional) Queries per second, i.e., the rate.
    • Once set, the Master guarantees that it does not process messages from this principal at a rate higher than this.
    • When qps is not present, this principal is given an unlimited rate, i.e., it is not throttled.
  • capacity: (Optional) Number of outstanding messages that frameworks of this principal can put on the Master.
    • If not specified, this principal has unlimited capacity. (In that case the queued messages may use enough memory to OOM the Master.)

The JSON also has fields to safeguard the Master from unspecified frameworks.

  • aggregate_default_qps and aggregate_default_capacity: All the frameworks not specified in 'limits' get this default rate and capacity.
    • The rate and capacity are aggregate values for all of them, i.e., their combined traffic is throttled together.
    • If these fields are not present, the unspecified frameworks are not throttled.
    • aggregate_default_capacity takes effect only when aggregate_default_qps is also specified; otherwise it is ignored.
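
The lookup rules above can be sketched in a few lines of Python. This is an illustration only, not the Master's code: the names Limit, UNTHROTTLED, DEFAULT_BUCKET, SAMPLE and effective_limit are hypothetical. The sketch also assumes that a per-principal capacity without qps is ignored, mirroring the rule stated for the aggregate default.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Limit:
    """Effective limit for a throttled entity; None means unlimited."""
    qps: Optional[float]
    capacity: Optional[int]


UNTHROTTLED = Limit(qps=None, capacity=None)
DEFAULT_BUCKET = "<default>"  # hypothetical key for all unlisted principals

SAMPLE = """
{
    "limits": [
        {"principal": "foo", "qps": 55.5, "capacity": 100000},
        {"principal": "bar"}
    ],
    "aggregate_default_qps": 33.3,
    "aggregate_default_capacity": 1000000
}
"""


def effective_limit(config: dict, principal: str) -> Tuple[str, Limit]:
    """Return (bucket, limit); traffic in the same bucket is throttled together."""
    for entry in config.get("limits", []):
        if entry["principal"] == principal:
            qps = entry.get("qps")
            if qps is None:
                # Listed without qps: not throttled (capacity ignored; assumption).
                return principal, UNTHROTTLED
            return principal, Limit(qps, entry.get("capacity"))
    default_qps = config.get("aggregate_default_qps")
    if default_qps is None:
        # No default rate: unlisted principals are not throttled.
        return DEFAULT_BUCKET, UNTHROTTLED
    return DEFAULT_BUCKET, Limit(default_qps, config.get("aggregate_default_capacity"))
```

With the sample file, effective_limit(json.loads(SAMPLE), "foo") yields the limits listed for foo, "bar" comes back unthrottled, and any unlisted principal lands in the shared default bucket.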

The usage notes are published here.

Design for the Current Scope

...

The user-specified configuration in JSON format is converted into a protobuf object which is used to initialize the RateLimiters.

...

RateLimits ProtoBuf definition

...

The ProtoBuf format is published here.

Rate Limiting in Master

Mesos already has a RateLimiter implementation that queues up incoming requests via a Future-based interface and services them at a configured rate. The majority of the work for MESOS-1306 is to have the limiters correctly configured for each framework and to use them in message handling, plus a few improvements to RateLimiter.
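
To make the queueing behaviour concrete, here is a minimal Python sketch of a limiter that services messages no faster than a configured rate and holds a bounded number of outstanding messages. It is not the Mesos RateLimiter and does not use its Future-based interface. The class QueueingLimiter and its enqueue method are hypothetical, and time is passed in explicitly so the behaviour is deterministic. How the Master reacts to an over-capacity message is not covered here, so the sketch simply returns None.

```python
from typing import List, Optional


class QueueingLimiter:
    """Services queued messages no faster than `qps`, holding at most `capacity`."""

    def __init__(self, qps: float, capacity: Optional[int] = None) -> None:
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.interval = 1.0 / qps         # minimum spacing between serviced messages
        self.capacity = capacity          # None means unlimited outstanding messages
        self.pending: List[float] = []    # service times that are still in the future
        self.next_free = 0.0              # earliest time the next message may be serviced

    def enqueue(self, now: float) -> Optional[float]:
        """Return the time the message will be serviced, or None if over capacity."""
        # Messages whose service time has passed are no longer outstanding.
        self.pending = [t for t in self.pending if t > now]
        if self.capacity is not None and len(self.pending) >= self.capacity:
            return None
        service_at = max(now, self.next_free)
        self.next_free = service_at + self.interval
        self.pending.append(service_at)
        return service_at
```

Under this sketch, frameworks that share a principal would share one QueueingLimiter instance, and all unlisted frameworks would share a single default instance, so that their combined traffic is throttled together.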

...