...

We propose introducing a blacklist mechanism to solve this problem. The blacklist filters out problematic resources: once a resource is judged to be abnormal, it is blacklisted so that no tasks are assigned to it. We will introduce the following two ways to specify blacklisted resources:

  1. Manually specify the blacklisted resources through the REST API. When users find abnormal nodes/TMs, they can blacklist them manually in this way.
  2. Automatically detect abnormal resources and blacklist them. Users can specify a blacklist strategy which identifies abnormal resources according to the received exception and related locations.
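The filtering idea described above can be sketched as a small in-memory structure. This is only an illustration under stated assumptions: the class `BlacklistSketch` and its method names are hypothetical and are not part of the proposed interfaces, and it assumes that a task manager counts as blacklisted when either it or the node hosting it has been blacklisted.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not part of the proposed Flink interfaces. */
public class BlacklistSketch {
    private final Set<String> blacklistedTaskManagers = new HashSet<>();
    private final Set<String> blacklistedNodes = new HashSet<>();

    /** Records a task manager as abnormal (manually or via a strategy). */
    public void blacklistTaskManager(String taskManagerId) {
        blacklistedTaskManagers.add(taskManagerId);
    }

    /** Records a whole node as abnormal. */
    public void blacklistNode(String nodeId) {
        blacklistedNodes.add(nodeId);
    }

    /**
     * Assumption: a task manager is filtered out if it is blacklisted directly
     * or if the node it runs on is blacklisted.
     */
    public boolean isBlacklisted(String taskManagerId, String nodeId) {
        return blacklistedTaskManagers.contains(taskManagerId)
                || blacklistedNodes.contains(nodeId);
    }
}
```

A scheduler-side check would then skip any candidate for which `isBlacklisted` returns `true` before assigning a task.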

Public Interfaces

We propose to introduce the following configuration options for the blacklist:

...

ResourceManagerGateway:
public interface ResourceManagerGateway {
   CompletableFuture<BlacklistInfo> requestBlacklist(@RpcTimeout Time timeout);
   // ...
}


GET: http://{jm_rest_address:port}/blacklist

Request: {}

Response:

{
  /** This group only contains directly blacklisted task managers */
  "blacklistedTaskManagers": [
      {
          "id" : "container_XXX_000002",
          "timestamp" : "XXX",
          "action" : "MARK_BLACKLISTED"
      },
      {
          "id" : "container_XXX_000003",
          "timestamp" : "XXX",
          "action" : "MARK_BLACKLISTED"
      }, 
      ...
  ],
  "blacklistedNodes": [
      {
          "id" : "node1",
          "timestamp" : "XXX",
          "action" : "MARK_BLACKLISTED",
          "taskManagers" : ["container_XXX_000004", "container_XXX_000005", ...]
      },
      ...
  ]
}
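
As a rough client-side sketch, the query above could be issued with the JDK's built-in `java.net.http` client (Java 11+). The class name `BlacklistQuerySketch` and the base URL passed in are assumptions for illustration; only the `/blacklist` path comes from this proposal.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper; only the "/blacklist" path is taken from the proposal. */
public class BlacklistQuerySketch {

    /** Builds (but does not send) the GET request that lists the blacklist. */
    public static HttpRequest listRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/blacklist"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

The request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the returned body would then be expected to have the shape shown in the response above.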

Add

POST: http://{jm_rest_address:port}/blacklist/add

Request:

{
  "newBlacklistedTaskManagers": [
      {
          "id" : "container_XXX_000002",
          "action" : "MARK_BLACKLISTED"
      },
      {
          "id" : "container_XXX_000003",
          "action" : "MARK_BLACKLISTED"
      }, 
      ...
  ],
  "newBlacklistedNodes": [
      {
          "id" : "node1",
          "action" : "MARK_BLACKLISTED"
      },
      ...
  ]
}
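
A minimal sketch of how a client might assemble this request, again using the JDK's `java.net.http` API. The class `BlacklistAddSketch` and its methods are hypothetical; the field names and the `/blacklist/add` path are taken from the request above. The JSON is built by plain string concatenation, which assumes the ids contain no characters that need JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that mirrors the request body shown above. */
public class BlacklistAddSketch {

    /** Builds the JSON body; assumes ids need no JSON escaping. */
    public static String addRequestBody(List<String> taskManagerIds, List<String> nodeIds) {
        return "{\"newBlacklistedTaskManagers\":" + entries(taskManagerIds)
                + ",\"newBlacklistedNodes\":" + entries(nodeIds) + "}";
    }

    /** Renders each id as an object with the MARK_BLACKLISTED action. */
    private static String entries(List<String> ids) {
        return ids.stream()
                .map(id -> "{\"id\":\"" + id + "\",\"action\":\"MARK_BLACKLISTED\"}")
                .collect(Collectors.joining(",", "[", "]"));
    }

    /** Builds (but does not send) the POST request for the add endpoint. */
    public static HttpRequest addRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/blacklist/add"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

In real code a JSON library would be a safer choice than string concatenation; it is used here only to keep the sketch free of dependencies.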

...