...
```json
[ {
  "jobId" : 1,
  "name" : "sum at <stdin>:1",
  "submissionTime" : "2016-01-13T09:56:43.335GMT",
  "completionTime" : "2016-01-13T09:56:43.710GMT",
  "stageIds" : [ 1 ],
  "status" : "FAILED",
  "numTasks" : 2,
  "numActiveTasks" : 1,
  "numCompletedTasks" : 0,
  "numSkippedTasks" : 0,
  "numFailedTasks" : 7,
  "numActiveStages" : 0,
  "numCompletedStages" : 0,
  "numSkippedStages" : 0,
  "numFailedStages" : 1
}, {
  "jobId" : 0,
  "name" : "count at <stdin>:1",
  "submissionTime" : "2016-01-13T09:56:07.496GMT",
  "completionTime" : "2016-01-13T09:56:09.299GMT",
  "stageIds" : [ 0 ],
  "status" : "SUCCEEDED",
  "numTasks" : 2,
  "numActiveTasks" : 0,
  "numCompletedTasks" : 2,
  "numSkippedTasks" : 2,
  "numFailedTasks" : 0,
  "numActiveStages" : 0,
  "numCompletedStages" : 1,
  "numSkippedStages" : 0,
  "numFailedStages" : 0
} ]
```
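A response like the one above can be post-processed to flag problem jobs. A minimal sketch, assuming the JSON has already been fetched from the History Server's `/api/v1/applications/<app-id>/jobs` endpoint (the sample below abbreviates the full payload to the fields used):

```python
import json

# Abbreviated sample of the /jobs response shown above.
jobs_json = """
[ {"jobId": 1, "name": "sum at <stdin>:1", "status": "FAILED",
   "numTasks": 2, "numFailedTasks": 7},
  {"jobId": 0, "name": "count at <stdin>:1", "status": "SUCCEEDED",
   "numTasks": 2, "numFailedTasks": 0} ]
"""

jobs = json.loads(jobs_json)

# Collect jobs whose status is FAILED, with their failed-task counts.
failed = [j for j in jobs if j["status"] == "FAILED"]
for j in failed:
    print(f'job {j["jobId"]} ({j["name"]}): {j["numFailedTasks"]} failed tasks')
# prints: job 1 (sum at <stdin>:1): 7 failed tasks
```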
Notes
The Spark History Server relies on event logs written by Spark applications to report application status.
Sometimes, however, these logs are not correctly finalized by Spark jobs. For example, the following application actually completed, but its event log on HDFS still carries the `.inprogress` suffix, which causes the History Server to report the wrong status:
ID | User | Name | Application Type | Queue | StartTime | FinishTime | State | FinalStatus | Progress | Tracking UI |
---|---|---|---|---|---|---|---|---|---|---|
application_1452593058395_0006 | root | PySparkShell | SPARK | default | Tue, 12 Jan 2016 15:27:54 GMT | Tue, 12 Jan 2016 18:05:49 GMT | FINISHED | SUCCEEDED | | History |
```
hdfs dfs -ls /directory/
Found 4 items
-rwxrwx--- 3 root supergroup 13227 2016-01-12 15:27 /directory/application_1452593058395_0005
-rwxrwx--- 3 root supergroup 13227 2016-01-12 18:05 /directory/application_1452593058395_0006.inprogress
-rwxrwx--- 3 root supergroup 51025 2016-01-13 09:48 /directory/application_1452593058395_0007
-rwxrwx--- 3 root supergroup 67994 2016-01-13 09:57 /directory/application_1452593058395_0008
```
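One way to detect such stale logs is to cross-check `.inprogress` files in the event-log directory against applications that YARN already reports as finished. A sketch under the assumption that both lists have been collected beforehand (the finished-application set below is illustrative; in practice it would come from the YARN ResourceManager):

```python
# Event-log file names found in the HDFS log directory (as in the
# `hdfs dfs -ls` listing above).
log_files = [
    "application_1452593058395_0005",
    "application_1452593058395_0006.inprogress",
    "application_1452593058395_0007",
    "application_1452593058395_0008",
]

# Applications YARN reports as FINISHED (illustrative values).
finished_apps = {"application_1452593058395_0005",
                 "application_1452593058395_0006"}

SUFFIX = ".inprogress"

# A log is stale if it is still marked in-progress although the
# corresponding application has finished according to YARN.
stale = [f for f in log_files
         if f.endswith(SUFFIX) and f[: -len(SUFFIX)] in finished_apps]
print(stale)  # ['application_1452593058395_0006.inprogress']
```

A stale log found this way can be renamed to drop the suffix (e.g. `hdfs dfs -mv /directory/app.inprogress /directory/app`) so the History Server treats the application as complete; this matches the default event-log naming scheme but should be verified against the Spark version in use.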