...

This design only supports manually specifying blocked resources via the REST API; auto-detection may be introduced in the future.

...

  1. REST API for querying blocklist information.
  2. REST API for adding new blocked items.
  3. REST API for removing existing blocked items.

Query

Add a REST API to obtain blocklist information. Each request returns all currently blocked items, which are obtained from ResourceManagerBlocklistHandler.

GET: http://{jm_rest_address:port}/blocklist

...

Code Block
titleResponse
{
  /** This group only contains directly blocked task managers */
  "blockedTaskManagers": [
      {
          "id" : "container1",
          "action" : "MARK_BLOCKED",
          "startTimestamp" : "XXX",
          "endTimestamp" : "XXX",
          "cause" : "XXX"
      },
      {
          "id" : "container2",
          "action" : "MARK_BLOCKED_AND_EVACUATE_TASKS",
          "startTimestamp" : "XXX",
          "endTimestamp" : "XXX",
          "cause" : "XXX"
      },
      ...
  ],
  "blockedNodes": [
      {
          "id" : "node1",
          "action" : "MARK_BLOCKED",
          "startTimestamp" : "XXX",
          "endTimestamp" : "XXX",
          "cause" : "XXX",
          /** The task managers on this blocked node */
          "taskManagers" : ["container3", "container4"]
      },
      ...
  ]
}

Field meanings in responses:

  1. id: The identifier of the blocked task manager or node.
  2. action: The block action when a task manager/node is marked as blocked.
  3. startTimestamp: The timestamp of creating this item.
  4. endTimestamp: The timestamp at which the item should be removed.
  5. cause: The cause for creating this item.
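
As an illustration of the query endpoint, here is a minimal client-side sketch that builds (but does not send) the GET request. It assumes Java 11 or later and the JDK's `java.net.http` API; the class name `BlocklistQueryExample`, the method `buildQueryRequest`, and the address `localhost:8081` are placeholders chosen for this example and are not part of the design.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class BlocklistQueryExample {

    /**
     * Builds a GET request for the blocklist endpoint.
     * jmRestAddress is "host:port" of the JobManager REST endpoint (placeholder value in main).
     */
    public static HttpRequest buildQueryRequest(String jmRestAddress) {
        return HttpRequest.newBuilder(URI.create("http://" + jmRestAddress + "/blocklist"))
                .GET()
                .header("Accept", "application/json")
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildQueryRequest("localhost:8081");
        // The request is only constructed here; sending it would need an HttpClient
        // and a running cluster.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `HttpClient.newHttpClient().send(...)` would return the JSON document described above.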

Add

POST: http://{jm_rest_address:port}/blocklist/nodes

POST: http://{jm_rest_address:port}/blocklist/taskmanagers

Request:

Code Block
titleRequest
{
    [
        {
            "id" : "node1/container1",
            "action" : "MARK_BLOCKED",
            "timeout" : "XXX",
            "endTimestamp" : "XXX",
            "cause" : "XXX",
        },
        {
            "id" : "node2/container2",
            "action" : "MARK_BLOCKED",
            "timeout" : "XXX",
            "endTimestamp" : "XXX",
            "cause" : "XXX",
        }, 
        ...
    ]
}

Response: {}

Field meanings in requests:

  1. id: The identifier of the blocked task manager or node.
  2. action: The block action when a task manager/node is marked as blocked.
  3. timeout (optional): The duration, counted from the time the item is added, after which the item should be removed.
  4. endTimestamp (optional): The timestamp at which the item should be removed.
  5. cause: The cause for creating this item.

Note that the fields timeout and endTimestamp are optional. If neither is specified, the blocked item is permanent and will not be removed. If only one is specified, it alone determines the removal time. If both are specified, the minimum of current+timeout and endTimestamp is used as the time to remove the blocked item.
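
The rule for combining timeout and endTimestamp can be sketched as follows. This is only an illustration under stated assumptions: the class `BlockedItemExpiry`, the method `resolveEndTimestamp`, the use of `null` for an unspecified field, the `Long.MAX_VALUE` sentinel for "permanent", and milliseconds as the time unit are all choices made for this example, not part of the design.

```java
public class BlockedItemExpiry {

    /** Sentinel for "never remove". An assumption of this sketch, not part of the design. */
    public static final long PERMANENT = Long.MAX_VALUE;

    /**
     * Computes the time at which a blocked item should be removed.
     *
     * @param now          current time in milliseconds (unit assumed)
     * @param timeout      optional duration in milliseconds, null if not specified
     * @param endTimestamp optional absolute removal time, null if not specified
     */
    public static long resolveEndTimestamp(long now, Long timeout, Long endTimestamp) {
        // A missing field imposes no limit, so it maps to PERMANENT.
        long byTimeout = (timeout != null) ? now + timeout : PERMANENT;
        long byEndTimestamp = (endTimestamp != null) ? endTimestamp : PERMANENT;
        // If both are present, the earlier removal time wins.
        return Math.min(byTimeout, byEndTimestamp);
    }

    public static void main(String[] args) {
        System.out.println(resolveEndTimestamp(1_000L, 500L, 2_000L)); // both given
        System.out.println(resolveEndTimestamp(1_000L, null, 2_000L)); // only endTimestamp
        System.out.println(resolveEndTimestamp(1_000L, null, null) == PERMANENT); // permanent
    }
}
```

With `now = 1000`, `timeout = 500` and `endTimestamp = 2000`, the sketch picks 1500, because the timeout expires before the explicit end timestamp.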


Remove

DELETE: http://{jm_rest_address:port}/blocklist/node/<id>/<action>

...

  1. type: The blocked item type, TASK_MANAGER or NODE
  2. id: The identifier of the blocked task manager or node.
  3. action: The block action when a task manager/node is marked as blocked, MARK_BLOCKED or MARK_BLOCKED_AND_EVACUATE_TASKS
  4. startTimestamp: The timestamp of creating this item.
  5. endTimestamp: The timestamp at which the item should be removed.
  6. cause: The cause for creating this item.


Code Block
titleBlocklistedItem
/**
 * This class represents a blocked item.
 *
 * @param <ID> Identifier of the blocked item.
 */
public abstract class BlockedItem<ID> {
    public BlockedItemType getType();

    public BlockAction getAction();

    public long getStartTimestamp();

    public long getEndTimestamp();

    public String getCause();

    public abstract ID getId();
}

/** This class represents a blocked node. */
public class BlockedNode extends BlockedItem<String> {
}

/** This class represents a blocked task manager. */
public class BlockedTaskManager extends BlockedItem<ResourceID> {
}
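
To make the class outline above concrete, here is a compilable sketch of one possible shape for the hierarchy. It is an assumption-laden illustration and not the actual implementation: the wrapper class `BlockedItemSketch`, the constructors, and the private fields are invented for this example; the enum values are taken from the field descriptions above; and `BlockedTaskManager` is omitted because `ResourceID` is defined elsewhere.

```java
public class BlockedItemSketch {

    public enum BlockedItemType { TASK_MANAGER, NODE }

    public enum BlockAction { MARK_BLOCKED, MARK_BLOCKED_AND_EVACUATE_TASKS }

    /** Common state of a blocked item; only the identifier type varies per subclass. */
    public abstract static class BlockedItem<ID> {
        private final BlockedItemType type;
        private final BlockAction action;
        private final String cause;
        private final long startTimestamp;
        private final long endTimestamp;

        protected BlockedItem(BlockedItemType type, BlockAction action, String cause,
                long startTimestamp, long endTimestamp) {
            this.type = type;
            this.action = action;
            this.cause = cause;
            this.startTimestamp = startTimestamp;
            this.endTimestamp = endTimestamp;
        }

        public BlockedItemType getType() { return type; }

        public BlockAction getAction() { return action; }

        public long getStartTimestamp() { return startTimestamp; }

        public long getEndTimestamp() { return endTimestamp; }

        public String getCause() { return cause; }

        public abstract ID getId();
    }

    /** A blocked node, identified by its node id string. */
    public static class BlockedNode extends BlockedItem<String> {
        private final String nodeId;

        public BlockedNode(String nodeId, BlockAction action, String cause,
                long startTimestamp, long endTimestamp) {
            super(BlockedItemType.NODE, action, cause, startTimestamp, endTimestamp);
            this.nodeId = nodeId;
        }

        @Override
        public String getId() { return nodeId; }
    }

    public static void main(String[] args) {
        BlockedNode node = new BlockedNode(
                "node1", BlockAction.MARK_BLOCKED, "manually blocked", 1_000L, 2_000L);
        System.out.println(node.getType() + " " + node.getId() + " " + node.getAction());
    }
}
```

The constructor argument order mirrors the `blockNode` parameters (id, action, cause, startTimestamp, endTimestamp) so that a handler could pass its arguments straight through.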

...

Code Block
titleBlocklistHandler & JobMasterBlocklistHandler
public interface BlocklistHandler extends BlocklistTracker {
    /** Add a new blocked node. */
    void blockNode(String nodeId, BlockAction action, String cause, long startTimestamp, long endTimestamp);

    /** Add a new blocked task manager. */
    void blockTaskManager(ResourceID taskManagerId, BlockAction action, String cause, long startTimestamp, long endTimestamp);
}

public interface JobMasterBlocklistHandler extends BlocklistHandler {
}

...