This information is specific to the "Jobs" feature of Ambari when used with the HDP 1.x Stack.

Why are my workflows displayed as individual MapReduce jobs on the Ambari apps page?

Ambari can group MapReduce jobs into a single workflow only if each job's configuration sets the workflow (DAG) properties described below.

How to let Ambari know about your DAGs

Many applications construct workflows that consist of multiple MapReduce jobs. To enable logging and analysis of whole workflows, applications may set the following properties in each MapReduce job's configuration. The first three properties, mapreduce.workflow.id, mapreduce.workflow.name, and mapreduce.workflow.adjacency.*, should be identical for all jobs in a particular run of the workflow. Each job in a workflow has a unique value for mapreduce.workflow.node.name, indicating which node of the workflow it is. A further property, mapreduce.workflow.tags, offers another way to group sets of jobs so you can analyze them together.

  • mapreduce.workflow.id - a unique ID for the workflow, ideally prepended with the application name
    e.g. appname_UID or pig_efca34ea-7496-446a-b5aa-df502cd5a5be
  • mapreduce.workflow.name - a name for the workflow, to distinguish this workflow from other workflows and to group different runs of the same workflow
    e.g. a hive query or the name of a pig script
  • mapreduce.workflow.adjacency.* - an adjacency list for the workflow graph, encoded as mapreduce.workflow.adjacency.<source node> = <comma-separated list of target nodes>
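  • mapreduce.workflow.node.name - the name of the node corresponding to this MapReduce job in the workflow adjacency list
  • mapreduce.workflow.tags - optional tags for grouping sets of jobs so they can be analyzed together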
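
As a rough illustration of the naming conventions above, the following sketch builds a workflow ID of the form appname_UID and encodes one adjacency entry as a property key with a comma-separated value. The class WorkflowIds and its methods are hypothetical helpers, not part of Hadoop or Ambari, and using a random UUID as the unique part is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical helper for illustration; not part of Hadoop or Ambari.
class WorkflowIds {
    // Builds an ID of the form appname_UID (assumption: a random UUID is unique enough).
    static String newWorkflowId(String appName) {
        return appName + "_" + UUID.randomUUID();
    }

    // Property key for the adjacency entry of one source node.
    static String adjacencyKey(String sourceNode) {
        return "mapreduce.workflow.adjacency." + sourceNode;
    }

    // Property value: the target nodes joined with commas.
    static String adjacencyValue(List<String> targetNodes) {
        return String.join(",", targetNodes);
    }

    public static void main(String[] args) {
        System.out.println("mapreduce.workflow.id = " + newWorkflowId("appname"));
        System.out.println(adjacencyKey("A") + " = " + adjacencyValue(Arrays.asList("B", "C")));
    }
}
```

Generate the ID once per run of the workflow and reuse it for every job in that run, since the ID must be identical across those jobs.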

Example

For all MRs in the workflow:

conf.set("mapreduce.workflow.id", "appname_run0001");
conf.set("mapreduce.workflow.name", "workflow1");
conf.setStrings("mapreduce.workflow.adjacency.A", new String[]{"B", "C"});
conf.setStrings("mapreduce.workflow.adjacency.B", new String[]{"C"});

For the first MR in the workflow (labeled "A"):

conf.set("mapreduce.workflow.node.name", "A");

For the second MR in the workflow (labeled "B"):

conf.set("mapreduce.workflow.node.name", "B");

For the third MR in the workflow (labeled "C"):

conf.set("mapreduce.workflow.node.name", "C");
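
The per-job settings above could also be produced by one shared routine, so that the common properties stay identical across jobs. The sketch below is one possible way to do that under stated assumptions: WorkflowJobProps and forNode are hypothetical names, and the result is a plain map rather than a Hadoop configuration object, so it runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of Hadoop or Ambari.
class WorkflowJobProps {
    // Returns the workflow properties for one job (node) of the workflow.
    static Map<String, String> forNode(String workflowId, String workflowName,
                                       Map<String, List<String>> adjacency, String nodeName) {
        // Collect every node mentioned as a source or a target.
        Set<String> nodes = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> e : adjacency.entrySet()) {
            nodes.add(e.getKey());
            nodes.addAll(e.getValue());
        }
        // Reject a node name that the adjacency list never mentions.
        if (!nodes.contains(nodeName)) {
            throw new IllegalArgumentException("Node not in adjacency list: " + nodeName);
        }
        Map<String, String> props = new LinkedHashMap<>();
        // Shared by all jobs in this run of the workflow.
        props.put("mapreduce.workflow.id", workflowId);
        props.put("mapreduce.workflow.name", workflowName);
        for (Map.Entry<String, List<String>> e : adjacency.entrySet()) {
            props.put("mapreduce.workflow.adjacency." + e.getKey(), String.join(",", e.getValue()));
        }
        // Unique to this job.
        props.put("mapreduce.workflow.node.name", nodeName);
        return props;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dag = new LinkedHashMap<>();
        dag.put("A", Arrays.asList("B", "C"));
        dag.put("B", Arrays.asList("C"));
        for (String node : Arrays.asList("A", "B", "C")) {
            System.out.println(forNode("appname_run0001", "workflow1", dag, node));
        }
    }
}
```

Each entry of the returned map can then be copied onto the job's configuration with conf.set(key, value). As in the example above, node C has no outgoing edges and therefore no adjacency entry of its own.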

To see how these properties are added for Pig and Hive, see PIG-3048 and HIVE-3708.
