...
As there aren't any details as to how the separation will work, specifically whether a TaskManager -> WebInterface heartbeat will exist, I will assume that there is no message that we can piggyback on.
As such, the WebRuntimeMonitor will contain a MetricFetcher which queries the JobManager for all available TaskManagers, and then queries each of them for a metric dump. Metrics are only fetched if they are actually accessed via REST calls, with a minimum time period (10 seconds) between updates.
This will be done with a separate TimerTask inside the MetricFetcher, which also has the responsibility to merge the returned dumps.
The merged dump is kept in a central location inside the MetricFetcher, available to different handlers.
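The access-triggered fetch with a minimum interval between updates could look roughly like the following sketch. All names here (MetricFetcherSketch, taskManagerLookup, dumpQuery, getMetrics) are hypothetical and not part of any existing API. The two functional parameters stand in for the JobManager and TaskManager queries, prefixing metric names with the TaskManager ID is only an assumed merge strategy, and for brevity the fetch runs inline in the calling thread rather than in a separate task.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of the fetch-on-access logic; not an existing class. */
class MetricFetcherSketch {
    // Minimum time between two updates (10 seconds, as stated above).
    private static final long MIN_UPDATE_INTERVAL_MS = 10_000L;

    // Assumption: stands in for "ask the JobManager for all TaskManagers".
    private final Supplier<List<String>> taskManagerLookup;
    // Assumption: stands in for "ask one TaskManager for its metric dump".
    private final Function<String, Map<String, Object>> dumpQuery;

    // Central location of the merged dump, shared by all handlers.
    private final Map<String, Object> mergedDump = new HashMap<>();
    private boolean fetchedOnce = false;
    private long lastUpdateMs;

    MetricFetcherSketch(Supplier<List<String>> taskManagerLookup,
                        Function<String, Map<String, Object>> dumpQuery) {
        this.taskManagerLookup = taskManagerLookup;
        this.dumpQuery = dumpQuery;
    }

    /** Called by REST handlers; only refetches if the last update is old enough. */
    synchronized Map<String, Object> getMetrics(long nowMs) {
        if (!fetchedOnce || nowMs - lastUpdateMs >= MIN_UPDATE_INTERVAL_MS) {
            fetchAndMerge();
            fetchedOnce = true;
            lastUpdateMs = nowMs;
        }
        return new HashMap<>(mergedDump); // defensive copy for the caller
    }

    /** Queries every known TaskManager and merges the returned dumps. */
    private void fetchAndMerge() {
        mergedDump.clear();
        for (String tm : taskManagerLookup.get()) {
            for (Map.Entry<String, Object> e : dumpQuery.apply(tm).entrySet()) {
                // Assumed merge strategy: prefix each name with the TaskManager ID.
                mergedDump.put(tm + "." + e.getKey(), e.getValue());
            }
        }
    }
}
```

Passing the current time as an argument is a choice made only to keep the sketch deterministic in unit tests; a real implementation would more likely read the clock itself.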
...
class MetricStore {
    void addMetric(String name, Object value);

    JobManagerMetricStore jobManager;
    class JobManagerMetricStore {
        Map<String, Object> metrics;
    }

    Map<String, TaskManagerMetricStore> taskmanagers;
    class TaskManagerMetricStore {
        Map<String, Object> metrics;
    }

    Map<String, JobMetricStore> jobs;
    class JobMetricStore {
        Map<String, Object> metrics;
        Map<String, TaskMetricStore> tasks;
    }
    class TaskMetricStore {
        Map<String, Object> metrics;
        Map<String, SubtaskMetricStore> subtasks;
    }
    class SubtaskMetricStore {
        Map<String, Object> metrics;
    }
}
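To illustrate how handlers might fill and read such a nested store, here is a minimal runnable variant. It is only a sketch: the class name MetricStoreSketch and the scope-specific add methods are assumptions, since the outline above only defines a single addMetric(String name, Object value) and does not say how the scope is encoded in the name. The JobManager, TaskManager and subtask levels are collapsed into plain maps for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical runnable variant of the MetricStore outline. */
class MetricStoreSketch {
    final Map<String, Object> jobManager = new HashMap<>();
    final Map<String, Map<String, Object>> taskManagers = new HashMap<>();
    final Map<String, JobStore> jobs = new HashMap<>();

    static class JobStore {
        final Map<String, Object> metrics = new HashMap<>();
        final Map<String, TaskStore> tasks = new HashMap<>();
    }

    static class TaskStore {
        final Map<String, Object> metrics = new HashMap<>();
        // Keyed by subtask identifier (assumed to be a string here).
        final Map<String, Map<String, Object>> subtasks = new HashMap<>();
    }

    void addJobManagerMetric(String name, Object value) {
        jobManager.put(name, value);
    }

    void addTaskManagerMetric(String tmId, String name, Object value) {
        taskManagers.computeIfAbsent(tmId, k -> new HashMap<>()).put(name, value);
    }

    void addJobMetric(String jobId, String name, Object value) {
        jobs.computeIfAbsent(jobId, k -> new JobStore()).metrics.put(name, value);
    }

    void addTaskMetric(String jobId, String taskId, String name, Object value) {
        jobs.computeIfAbsent(jobId, k -> new JobStore())
            .tasks.computeIfAbsent(taskId, k -> new TaskStore())
            .metrics.put(name, value);
    }

    void addSubtaskMetric(String jobId, String taskId, String subtask,
                          String name, Object value) {
        jobs.computeIfAbsent(jobId, k -> new JobStore())
            .tasks.computeIfAbsent(taskId, k -> new TaskStore())
            .subtasks.computeIfAbsent(subtask, k -> new HashMap<>())
            .put(name, value);
    }
}
```

Intermediate levels are created lazily via computeIfAbsent, so a subtask metric can arrive before any job-level or task-level metric for the same job.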
...
Everything can be tested with unit tests.
Rejected Alternatives
-