Work in Progress

Overview

The sequence diagrams below describe in detail the interactions that occur while defining, submitting and executing a MapReduce job on a secure Hadoop 2.x cluster. Each diagram covers one phase of the overall process.

  • Bootstrap
  • Job Definition
  • Job Submission
  • Job Initiation
  • Map Task Execution
  • Reduce Task Execution
  • Job Completion
  • Client Monitoring

Legend

The descriptions of the interactions in the sequence diagrams below take this form.

message [Protocol] ( input ) : output

The [Protocol] portion describes the protocol, authentication mechanism and identities exchanged.

Abbreviation

Description

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="4461cf26-2d6f-4b30-88a7-fbd96a93053a"><ac:plain-text-body><![CDATA[

[KRB]

Kerberos Protocol

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="90587019-0748-431e-a690-1f33fd54ae73"><ac:plain-text-body><![CDATA[

[RSKT:{kerberos-service-ticket}]

RPC protocol with SASL mutual authentication using Kerberos tickets.

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="53ec0a05-5c04-49fe-84a3-06506c767497"><ac:plain-text-body><![CDATA[

[RSAT:{access-token}]

RPC protocol with SASL mutual authentication using access tokens.

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="8553b5bb-61f2-4242-ae98-17d9892ada97"><ac:plain-text-body><![CDATA[

[RSDT:{delegation-token}]

RPC protocol with SASL mutual authentication using delegation tokens.

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="6bbffa8f-2c83-4c3e-be40-aa53cf98a0b7"><ac:plain-text-body><![CDATA[

[STP]

Shuffle data transfer protocol between ShuffleService and ReduceTask. HTTP protocol with TODO.

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="38a73d3f-6abc-4c05-80d4-045c2411b7c2"><ac:plain-text-body><![CDATA[

[DTP]

Block data transfer protocol between the DataNode and a client. HTTP protocol with block tokens plus SHA1 hash exchange.

]]></ac:plain-text-body></ac:structured-macro>
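
The notation message [Protocol] ( input ) : output is regular enough to be parsed mechanically, which can help when checking diagram labels for consistency. The following is a minimal sketch, not part of the original page. It assumes that the braces in the legend mark a placeholder, so a concrete label carries the credential name directly, as in [RSKT:u-nn-kt]. The names Interaction and parse_interaction, and the sample labels in the usage note, are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Interaction(NamedTuple):
    """One label from a sequence diagram, split into its parts."""
    message: str
    protocol: str               # e.g. "KRB", "RSKT", "RSDT"
    credential: Optional[str]   # e.g. "u-nn-kt"; None if the protocol names none
    inputs: List[str]
    output: str


# message [Protocol] ( input ) : output
# The [Protocol] part may carry a credential after a colon: [RSKT:u-nn-kt].
_LABEL = re.compile(
    r"^\s*(?P<message>[^\[\]]+?)\s*"
    r"\[(?P<protocol>[A-Z]+)(?::(?P<credential>[^\]]+))?\]\s*"
    r"\(\s*(?P<inputs>[^)]*?)\s*\)\s*"
    r":\s*(?P<output>.*?)\s*$"
)


def parse_interaction(label: str) -> Interaction:
    """Parse a label written in the legend's notation (hypothetical helper)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"label does not follow the legend notation: {label!r}")
    credential = match.group("credential")
    if credential is not None:
        # Tolerate labels that keep the legend's placeholder braces.
        credential = credential.strip().strip("{}")
    inputs = [p.strip() for p in match.group("inputs").split(",") if p.strip()]
    return Interaction(
        message=match.group("message"),
        protocol=match.group("protocol"),
        credential=credential,
        inputs=inputs,
        output=match.group("output"),
    )
```

For example, parse_interaction("getDelegationToken [RSKT:u-nn-kt] ( renewer ) : c-nn-dt") yields the protocol RSKT, the credential u-nn-kt, one input and the output c-nn-dt. That label is an illustrative assumption and is not taken from the diagrams.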

Suffixes are used in many cases to denote type.

Abbreviation  Description
tgt           Kerberos Ticket Granting Ticket
kt            Kerberos Service Ticket: u-jt-kt = A Kerberos ticket for User u to access the JobTracker jt
kp            Kerberos Principal: nn-kp = The Kerberos principal for the NameNode nn
dt            Delegation Token: c-nn-dt = A delegation token for the identity of the Client that can be presented to the NameNode
tkn           Access Token: am-tkn = An access token that can be presented to the ApplicationMaster for access
tkn-sk        Token Secret Key
id            Identifier: job-id = Job Identifier

Kerberos principals use the principal abbreviation and the kp suffix.

Abbreviation  Description
nn-kp         NameNode's Kerberos Principal
dn-kp         DataNode's Kerberos Principal (the DataNode on each node has its own unique principal)
jt-kp         JobTracker's Kerberos Principal
tt-kp         TaskTracker's Kerberos Principal (the TaskTracker on each node has its own unique principal)

Kerberos tickets use the consumer principal abbreviation, provider principal abbreviation and kt suffix.

Abbreviation  Description
u-nn-kt       Kerberos service ticket for User u to access NameNode nn
u-jt-kt       Kerberos service ticket for User u to access JobTracker jt
dn-nn-kt      Kerberos service ticket for DataNode dn to access NameNode nn
jt-nn-kt      Kerberos service ticket for JobTracker jt to access NameNode nn
tt-jt-kt      Kerberos service ticket for TaskTracker tt to access JobTracker jt
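
The naming scheme above (component abbreviations joined by hyphens, followed by a type suffix) can be decoded programmatically. The sketch below is illustrative only. The lookup tables contain just the abbreviations that appear in this legend, and the function names decode and describe are hypothetical.

```python
from typing import List, Tuple

# Component abbreviations used in the legend.
COMPONENTS = {
    "u": "User",
    "c": "Client",
    "nn": "NameNode",
    "dn": "DataNode",
    "jt": "JobTracker",
    "tt": "TaskTracker",
    "am": "ApplicationMaster",
}

# Type suffixes used in the legend.
SUFFIXES = {
    "tgt": "Kerberos Ticket Granting Ticket",
    "kt": "Kerberos Service Ticket",
    "kp": "Kerberos Principal",
    "dt": "Delegation Token",
    "tkn": "Access Token",
    "tkn-sk": "Token Secret Key",
    "id": "Identifier",
}


def decode(name: str) -> Tuple[str, List[str]]:
    """Split a name such as 'u-nn-kt' into (type, [components])."""
    # Try longer suffixes first so that 'tkn-sk' is matched as a whole.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if name == suffix or name.endswith("-" + suffix):
            prefix = name[: -len(suffix)].rstrip("-")
            # Unknown parts (for example 'job' in 'job-id') are kept as written.
            parts = [COMPONENTS.get(p, p) for p in prefix.split("-") if p]
            return SUFFIXES[suffix], parts
    raise ValueError(f"unknown suffix in {name!r}")


def describe(name: str) -> str:
    """Render a decoded name in the wording used by the tables above."""
    kind, parts = decode(name)
    if kind == "Kerberos Service Ticket" and len(parts) == 2:
        return f"Kerberos service ticket for {parts[0]} to access {parts[1]}"
    if kind == "Kerberos Principal" and len(parts) == 1:
        return f"{parts[0]}'s Kerberos Principal"
    return f"{kind} ({', '.join(parts)})" if parts else kind
```

With these tables, describe("jt-nn-kt") reads the consumer as the JobTracker and the provider as the NameNode, following the consumer-then-provider order stated above.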


Secure MapReduce2 - Bootstrap

This diagram illustrates the interactions that occur when a Hadoop system is starting up and stabilizing. It involves various master components generating secret keys and slave components registering with the masters to receive these secret keys.
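
As a rough illustration of why the registration step matters, the sketch below models a master that generates a secret key at startup and a worker that receives the key when it registers, so that the worker can later check tokens issued by the master without contacting it. This is a simplified in-process model under stated assumptions, not Hadoop's implementation. The class names Master and Worker are hypothetical, HMAC-SHA1 is chosen only for illustration, and in a real cluster the registration would take place over an authenticated RPC channel.

```python
import hashlib
import hmac
import secrets


class Master:
    """Hypothetical master component that owns a token secret key."""

    def __init__(self) -> None:
        # The secret key is generated once, when the master starts.
        self.secret_key = secrets.token_bytes(32)
        self.registered = set()

    def register(self, worker_id: str) -> bytes:
        """Record the worker and hand it the current secret key."""
        self.registered.add(worker_id)
        return self.secret_key

    def issue_token(self, identifier: bytes) -> bytes:
        """Sign an identifier with the secret key (assumed HMAC-SHA1)."""
        return hmac.new(self.secret_key, identifier, hashlib.sha1).digest()


class Worker:
    """Hypothetical worker component that verifies tokens locally."""

    def __init__(self, worker_id: str) -> None:
        self.worker_id = worker_id
        self.secret_key = b""

    def bootstrap(self, master: Master) -> None:
        # Registration returns the secret key needed for later verification.
        self.secret_key = master.register(self.worker_id)

    def verify(self, identifier: bytes, token: bytes) -> bool:
        """Recompute the signature and compare in constant time."""
        expected = hmac.new(self.secret_key, identifier, hashlib.sha1).digest()
        return hmac.compare_digest(expected, token)
```

Once a worker has called bootstrap, a token produced by the master's issue_token for a given identifier passes verify on that worker, while a token presented with a different identifier does not.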


Secure MapReduce2 - Job Definition

This diagram illustrates the steps taken by a client to define a MapReduce job that will later be submitted.


Secure MapReduce2 - Job Submission

This diagram illustrates the steps taken during the submission of a MapReduce job.


Secure MapReduce2 - Job Initiation

This diagram illustrates the steps taken when a MapReduce job is scheduled for execution.


Secure MapReduce2 - Map Task Execution

This diagram illustrates the steps taken when the Map portion of a MapReduce job is executed.


Secure MapReduce2 - Reduce Task Execution

This diagram illustrates the steps taken when the Reduce portion of a MapReduce job is executed.


Secure MapReduce2 - Job Completion

This diagram illustrates the steps taken after a MapReduce job has completed.


Secure MapReduce2 - Client Monitoring

This diagram illustrates the steps taken by a Client to monitor the status of a Job throughout the Job's life-cycle. The timeframe for this diagram spans several of the diagrams above, from Job Submission through Job Completion.
