...
- Should provide a way for operators to monitor the framework QPS.
- This gives operators an idea of what values to use when configuring the limiter, shows how much traffic a framework has put on the Master, and lets them see how frameworks react to throttling.
- (Stage 2, not in current scope) Should support online tuning of rate limits (add/delete/update).
- (Stage 3, not in current scope) The rate limit configuration should survive Master failover.
Usage
RateLimits Configuration
The user specifies the rate limits in JSON format via a Master flag.
mesos-master --rate_limits=rates.json
JSON format
    {
      "limits": [
        {
          "principal": "foo",
          "qps": 55.5,
          "capacity": 100000
        },
        {
          "principal": "bar"
        }
      ],
      "aggregate_default_qps": 33.3,
      "aggregate_default_capacity": 1000000
    }
The JSON contains a list of (principal, rate) tuples and an aggregate_default_qps field.
- principal: (Required) Frameworks are identified by their principal, a la --credentials.
- In the current implementation the principal uniquely identifies the throttled entity; in the future there might be finer-grained entities under the principal.
- You can have multiple frameworks use the same principal (e.g. some Mesos users launch a framework instance for each job), in which case the combined traffic from all frameworks using the same principal is throttled at the specified QPS.
- qps: (Optional) Queries per Second, i.e., the rate.
- Once set, the Master guarantees that it does not process messages from this principal at a rate higher than this.
- If qps is not present, this principal is given an unlimited rate, i.e., it is not throttled.
- capacity: (Optional) The number of outstanding messages that frameworks of this principal can put on the Master.
- If not specified, this principal has unlimited capacity. (The queued messages could then use so much memory that the Master runs out of memory.)
The JSON also has a field to safeguard the Master from unspecified frameworks.
- aggregate_default_qps and aggregate_default_capacity: All the frameworks not specified in 'limits' get this default rate and capacity.
- The rate and capacity are aggregate values for all of them, i.e., their combined traffic is throttled together.
- If these fields are not present, the unspecified frameworks are not throttled.
- Same as above, if aggregate_default_qps is not specified, aggregate_default_capacity is ignored.
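The rules above can be sketched as a small parser. This is an illustrative Python sketch, not the Master's actual parsing code: the names `Limit`, `make_limit`, `parse_rate_limits` and `effective_limit` are hypothetical, and rejecting duplicate principals is an assumption. Dropping capacity when qps is absent is stated above only for the aggregate fields; applying it to per-principal entries follows the "same as above" wording.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Limit:
    qps: Optional[float]      # None means "not throttled"
    capacity: Optional[int]   # None means "unlimited outstanding messages"


def make_limit(qps, capacity) -> Limit:
    # Capacity is ignored when no rate is configured.
    if qps is None:
        return Limit(None, None)
    return Limit(float(qps), None if capacity is None else int(capacity))


def parse_rate_limits(text: str) -> Tuple[Dict[str, Limit], Limit]:
    """Returns (per-principal limits, aggregate default limit)."""
    config = json.loads(text)
    limits: Dict[str, Limit] = {}
    for entry in config.get("limits", []):
        principal = entry.get("principal")
        if not principal:
            raise ValueError("every entry in 'limits' requires a principal")
        if principal in limits:
            # Assumption: a principal may appear only once.
            raise ValueError(f"duplicate principal: {principal}")
        limits[principal] = make_limit(entry.get("qps"), entry.get("capacity"))
    default = make_limit(config.get("aggregate_default_qps"),
                         config.get("aggregate_default_capacity"))
    return limits, default


def effective_limit(principal: str, limits: Dict[str, Limit],
                    default: Limit) -> Limit:
    # Principals not listed in 'limits' all share the aggregate default.
    return limits.get(principal, default)


EXAMPLE = """
{
  "limits": [
    {"principal": "foo", "qps": 55.5, "capacity": 100000},
    {"principal": "bar"}
  ],
  "aggregate_default_qps": 33.3,
  "aggregate_default_capacity": 1000000
}
"""
```

With the example configuration, "foo" is throttled at 55.5 QPS with a capacity of 100000, "bar" is listed without a qps and is therefore not throttled, and any other principal falls under the shared aggregate default.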
The usage notes are published here.
Design for the Current Scope
...
The user-specified configuration in JSON format is converted into a protobuf object which is used to initialize the RateLimiters.
...
RateLimits ProtoBuf definition:
...
The ProtoBuf format is published here.
Rate Limiting in Master
Mesos already has a RateLimiter implementation that queues up incoming requests via a Future-based interface and services them at a configured rate. The majority of the work for MESOS-1306 is to configure the limiters correctly for each framework and use them in message handling, plus a few improvements to RateLimiter.
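To illustrate what a Future-based limiter of this kind looks like, here is a minimal Python `asyncio` sketch. It is not the Mesos RateLimiter or its API: the class shape, the `acquire` method name, and the choice to fail the future with `OverflowError` when capacity is exceeded are all assumptions made for illustration.

```python
import asyncio
from collections import deque
from typing import Deque, Optional


class RateLimiter:
    """Grants permits in FIFO order, at most `qps` permits per second."""

    def __init__(self, qps: float, capacity: Optional[int] = None):
        self._interval = 1.0 / qps
        self._capacity = capacity          # None means unlimited queue
        self._waiters: Deque[asyncio.Future] = deque()
        self._drainer: Optional[asyncio.Task] = None

    def acquire(self) -> asyncio.Future:
        """Returns a future that completes when the caller may proceed."""
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        if self._capacity is not None and len(self._waiters) >= self._capacity:
            # Assumption: requests beyond capacity are rejected immediately.
            future.set_exception(OverflowError("capacity exceeded"))
            return future
        self._waiters.append(future)
        # Start the draining task if it is not already running.
        if self._drainer is None or self._drainer.done():
            self._drainer = loop.create_task(self._drain())
        return future

    async def _drain(self) -> None:
        # Release one waiter, then wait one interval before the next.
        while self._waiters:
            self._waiters.popleft().set_result(None)
            await asyncio.sleep(self._interval)
```

A message handler would `await limiter.acquire()` before processing a message. Frameworks that share a principal, or that fall under the aggregate default, would share a single limiter instance so that their combined traffic is throttled together.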
...