...

The existing GET /admin/loggers and GET /admin/loggers/{logger} endpoints will be augmented to provide a timestamp for when the logging level for each namespace was last modified. The timestamp will be a standard Unix timestamp with millisecond precision; that is, it will be the number of milliseconds that have elapsed between January 1st, 1970 (UTC) and when the namespace was modified on the worker. Timestamps will be updated regardless of whether the namespace update was applied using scope=worker or scope=cluster.

Modification times will be tracked by when they are applied by the worker, as opposed to when they are requested by the user or persisted to the config topic (details below). If no modifications to a namespace have been made since the worker was started, its last modified timestamp will be null.
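
The tracking rules above can be sketched roughly as follows. This is an illustrative model only, not the Connect implementation: the LoggerRegistry class, its method names, and the "level" and "last_modified" keys are hypothetical. The rule that re-applying an unchanged level leaves the timestamp alone is taken from the test plan. Propagation of levels to child namespaces is deliberately not modeled.

```python
import time
from typing import Callable, Dict, Optional


class LoggerRegistry:
    """Toy model of per-namespace level tracking on a single worker (hypothetical)."""

    def __init__(self, clock_ms: Callable[[], int] = lambda: int(time.time() * 1000)):
        # clock_ms returns milliseconds since the Unix epoch; injectable for tests.
        self._clock_ms = clock_ms
        self._levels: Dict[str, str] = {}
        self._last_modified: Dict[str, Optional[int]] = {}

    def register(self, namespace: str, level: str) -> None:
        # Levels present at worker startup have never been modified,
        # so their last modified timestamp is null (None).
        self._levels[namespace] = level.upper()
        self._last_modified[namespace] = None

    def apply(self, namespace: str, level: str) -> None:
        # Called at the moment the worker applies a change, whether it
        # originated from scope=worker or scope=cluster.
        level = level.upper()
        if self._levels.get(namespace) == level:
            return  # same level requested again: timestamp is not updated
        self._levels[namespace] = level
        self._last_modified[namespace] = self._clock_ms()

    def describe(self, namespace: str) -> dict:
        return {
            "level": self._levels[namespace],
            "last_modified": self._last_modified[namespace],
        }


# Usage with a fake clock so the timestamps are predictable.
ticks = iter([1_000, 2_000])
registry = LoggerRegistry(clock_ms=lambda: next(ticks))
registry.register("org.apache.kafka.connect", "INFO")
registry.apply("org.apache.kafka.connect", "DEBUG")   # timestamp becomes 1000
registry.apply("org.apache.kafka.connect", "DEBUG")   # no change, still 1000
```

Stamping the time inside apply, rather than when a request arrives, mirrors the choice described above: the reported timestamp reflects when the worker's logging behavior actually changed.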

...

  • Ensure that records produced to the config topic have the expected format
  • Ensure that updates to a logging level are reported with the correct last modified timestamp
  • Ensure that logging levels that have not been updated have a null last modified timestamp
  • Ensure that distributed workers that have completed startup correctly handle logging adjustment config topic records
  • Ensure that distributed workers that have not completed startup ignore logging adjustment config topic records
  • Ensure that requests to the existing PUT /admin/loggers/{logger} endpoint with no scope query parameter, or with scope=worker, result in the same herder-level behavior as before (most likely accomplished by verifying that no interactions with the Herder object have taken place)

  • Ensure that cluster-scoped requests with invalid logging levels are rejected with a 4xx response
  • Ensure that repeated requests to set the same logging level for a namespace do not cause its last modified timestamp to be updated

Integration tests

A new integration test will be added for standalone mode, which will run through this series of scenarios and assertions:

  1. Start a standalone Connect worker
    1. Ensure that the last modified timestamp for all reported logging namespaces is null
  2. Modify the logging level for a specific namespace with no scope parameter
    1. Ensure that the response body is non-empty and matches the same format it had prior to this KIP
    2. Ensure that the last modified timestamp for that namespace is non-null and at least as recent as the time at which the request was issued
    3. Ensure that the logging level for that namespace is correct
    4. Ensure that the last modified timestamp for all other namespaces is still null
    5. Ensure that no other namespaces have been modified
  3. Modify the logging level for a specific namespace with scope=worker
    1. Ensure that the response body is non-empty and matches the same format it had prior to this KIP
    2. Ensure that the last modified timestamp for that namespace is non-null and at least as recent as the time at which the request was issued
    3. Ensure that the logging level for that namespace is correct
    4. Ensure that the last modified timestamp for all other namespaces is still null
    5. Ensure that no other namespaces have been modified
  4. Issue a second request to set the same logging level for the same namespace with scope=worker
    1. Ensure that the last modified timestamp for that namespace is not updated
  5. Modify the logging level for a different namespace with scope=cluster
    1. Ensure that the response body is empty
    2. Ensure that the last modified timestamp and level for that namespace are correct
    3. Ensure that the last modified timestamp and level for all other namespaces remain unchanged
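
The requests issued in the scenarios above might be constructed along these lines. This is a sketch under assumptions: the helper name build_set_level_request is hypothetical, the base URL is a placeholder, and the {"level": ...} JSON body is assumed to be the shape accepted by the existing PUT endpoint. The function only builds the request; it does not send it.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_set_level_request(
    base_url: str, logger: str, level: str, scope: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a PUT /admin/loggers/{logger} request."""
    path = "/admin/loggers/" + urllib.parse.quote(logger, safe="")
    # Omitting scope entirely exercises the "no scope parameter" scenario.
    query = "?" + urllib.parse.urlencode({"scope": scope}) if scope else ""
    body = json.dumps({"level": level}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        base_url.rstrip("/") + path + query,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder address; substitute the worker's actual REST listener.
request = build_set_level_request(
    "http://localhost:8083", "org.apache.kafka.connect", "DEBUG", scope="cluster"
)
print(request.get_method(), request.full_url)
```

A test would record the current time in milliseconds just before sending such a request, so that the later "at least as recent as the time at which the request was issued" assertions have something to compare against.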

System tests

A single test will be added that runs through this series of scenarios and assertions:

  1. Start a distributed Connect cluster with three workers 
    1. Ensure that the last modified timestamp for all reported logging namespaces is null
  2. Modify the logging level for a specific namespace for a single worker
    1. Ensure that the response body is non-empty and matches the same format it had prior to this KIP 
    2. Ensure that the last modified timestamp for that namespace on the affected worker is non-null and at least as recent as the time at which the request was issued (some margin of error may be necessary in the highly unlikely but technically possible event that the node responsible for running tests and the one running the worker have skewed clocks)
    3. Ensure that the logging level for that namespace on the affected worker is reported (via the admin REST API) with the correct level
    4. Ensure that the last modified timestamp for that namespace on all other workers is still null
    5. Ensure that the logging level for that namespace on all other workers is unchanged
  3. Modify the logging level for the root namespace for all workers (using scope=cluster)
    1. Ensure that the response body is empty 
    2. Ensure that, after a reasonable timeout, the logging level for all reported namespaces on all workers is reported with the correct level
    3. Ensure that the last modified timestamp for all namespaces on all workers is non-null and at least as recent as the time at which the request was issued
  4. Modify the logging level for a specific namespace for all workers (using scope=cluster)
    1. Ensure that the response body is empty
    2. Ensure that, after a reasonable timeout, the logging level for that namespace on all workers is reported (via the admin REST API) with the correct level
    3. Ensure that the last modified timestamp for that namespace on all workers is non-null and at least as recent as the time at which the request was issued
  5. Issue a second request to set the same logging level for the same namespace for all workers (using scope=cluster)
    1. No assertions will be made for this step
  6. Modify the logging level for a different namespace for all workers (using scope=cluster)
    1. Ensure that, after a reasonable timeout, the logging level for that namespace on all workers is reported with the correct level
    2. Ensure that the last modified timestamp for that namespace on all workers is non-null and at least as recent as the time at which the request was issued
    3. Ensure that the last modified timestamp for the namespace affected in steps 4 and 5 is unchanged from when it was tested in step 4 (i.e., the second request in step 5 did not affect it)
  7. Modify the logging level for a specific namespace for a single worker (again)
    1. Ensure that the response body is non-empty and matches the same format it had prior to this KIP 
    2. Ensure that the last modified timestamp for that namespace on the affected worker is at least as recent as the time at which the request was issued
    3. Ensure that the logging level for that namespace on the affected worker is reported with the correct level
    4. Ensure that the last modified timestamp for all namespaces except the modified namespace on the affected worker, and all namespaces for all other workers, is unchanged since the root level was modified for all workers*
    5. Ensure that the logging levels for all namespaces except the modified namespace on the affected worker, and all namespaces for all other workers, are unchanged since the root level was modified for all workers*
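
Two assertions recur throughout the scenarios above: "non-null and at least as recent as the time at which the request was issued" (with a possible margin for clock skew) and "after a reasonable timeout". A minimal sketch of helpers for both follows. The function names, the default timeout, and the polling interval are all hypothetical choices, not part of this KIP.

```python
import time
from typing import Callable, Optional


def is_recent_enough(
    last_modified_ms: Optional[int], request_time_ms: int, skew_margin_ms: int = 0
) -> bool:
    """True if the timestamp is non-null and no older than the request time,
    allowing skew_margin_ms of clock skew between test node and worker."""
    return (
        last_modified_ms is not None
        and last_modified_ms >= request_time_ms - skew_margin_ms
    )


def wait_until(
    condition: Callable[[], bool], timeout_s: float = 30.0, interval_s: float = 0.5
) -> bool:
    """Poll condition until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# A null timestamp never passes; a slightly old one passes only within the margin.
print(is_recent_enough(None, 1_000))                      # False
print(is_recent_enough(900, 1_000))                       # False
print(is_recent_enough(900, 1_000, skew_margin_ms=200))   # True
```

For cluster-scoped changes, a test could wrap the per-worker checks in wait_until, since those changes reach each worker only after it reads the corresponding record from the config topic.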

...

A system test is used here instead of one or more integration tests because the latter colocate workers in the same JVM, making it difficult to distinguish between changes to the logging levels of a single worker and those of the whole cluster.

Rejected Alternatives

Request-time modified timestamp tracking

Instead of tracking the last modified timestamp for a logging namespace based on when it was applied by a worker, we could track it by when the request was received, or when it was written to the config topic. This would provide at least one advantage: assuming all workers are caught up on the config topic, every worker would give the exact same response for requests to view the levels of loggers. However, it would also be less accurate: users may be dismayed to see that the logging level for a given namespace had a last modified time of T, but that the actual level of logs emitted by that worker for that namespace was different until time T+n, for some non-negative number n.

...