...

Page properties

...

Discussion

...

thread

...

Vote thread

...

JIRA

...


Release	1.4

Motivation

The current architecture around the BLOB server and cache components has grown through successive patches and has issues with concurrency ([FLINK-6380]), cleanup, and API inconsistencies or currently unused API ([FLINK-6329], [FLINK-6008]). These issues make future integration with FLIP-6, or extensions such as offloading oversized RPC messages ([FLINK-6046]), difficult. We therefore propose an improved architecture, described below, that addresses these issues, provides some cleanup, and enables further BLOB server use cases.

Public Interfaces

The proposed changes mainly affect the back-end and are not user-facing.
Currently, we also do not plan any changes to the configuration or the monitoring information, except for:

...

[Diagram: blob-store-architecture]

BlobServer

offers file upload and download facilities based on jobId and BlobKey
local store (file system): read/write access, using "<path>/<jobId>/<BlobKey>"
HA store: read/write access for high availability, using "<path>/<jobId>/<BlobKey>"
responsible for cleanup of local and HA storage
uploads go to the local store first, then to the HA store (possibly in parallel, but both must finish before the upload is acknowledged)
downloads are served from local storage only
on recovery (HA): download needed files from the HA store to the local store and take over cleanup responsibility for all other files on the path, including orphaned files (see below)
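
The storage layout and upload rule listed above can be illustrated with a minimal sketch. This is not Flink's actual BlobServer API: the class BlobUploadSketch and its methods are hypothetical, jobId and BlobKey are simplified to plain strings, and the HA store is assumed to be just a second directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of the proposed behaviour; not the real BlobServer. */
final class BlobUploadSketch {

    /** Builds "<path>/<jobId>/<BlobKey>" below the given storage root. */
    static Path blobPath(Path root, String jobId, String blobKey) {
        return root.resolve(jobId).resolve(blobKey);
    }

    /** Writes the BLOB into one store, creating the job directory if needed. */
    static void store(Path root, String jobId, String blobKey, byte[] data) {
        try {
            Path target = blobPath(root, jobId, blobKey);
            Files.createDirectories(target.getParent());
            Files.write(target, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Writes to the local and HA store in parallel and returns (that is,
     * acknowledges the upload) only after both writes have finished.
     * A failure in either store surfaces as a CompletionException.
     */
    static void upload(Path localRoot, Path haRoot, String jobId, String blobKey, byte[] data) {
        CompletableFuture<Void> local =
                CompletableFuture.runAsync(() -> store(localRoot, jobId, blobKey, data));
        CompletableFuture<Void> ha =
                CompletableFuture.runAsync(() -> store(haRoot, jobId, blobKey, data));
        CompletableFuture.allOf(local, ha).join();
    }

    /** Downloads are served from the local store only. */
    static byte[] download(Path localRoot, String jobId, String blobKey) {
        try {
            return Files.readAllBytes(blobPath(localRoot, jobId, blobKey));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Waiting on both futures before returning mirrors the "acknowledge only after both finished" rule; a sequential variant (local first, then HA) would satisfy the same list item.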

...

During recovery, the JobManager (or the Dispatcher for FLIP-6) will:

fetch all jobs to recover
download their BLOBs lazily and increase reference counts appropriately (at the JobManager only after successful job submission)
put any other (i.e. orphaned) file found in the configured storage path into staged cleanup
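
The recovery steps above could be modelled roughly as follows. This is an in-memory sketch under stated assumptions: BlobRecoverySketch, refCounts and stagedCleanup are hypothetical names, files are identified by the string "<jobId>/<BlobKey>", and reference counts are increased directly during recovery, which simplifies the rule that the JobManager does so only after successful job submission. The lazy download itself is not shown.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of the recovery steps; not Flink code. */
final class BlobRecoverySketch {

    /** Reference count per "<jobId>/<BlobKey>" entry. */
    final Map<String, Integer> refCounts = new HashMap<>();

    /** Entries handed over to staged cleanup. */
    final Set<String> stagedCleanup = new HashSet<>();

    /**
     * @param jobsToRecover  job ID mapped to the BLOB keys that job needs
     * @param filesInStorage all "<jobId>/<BlobKey>" entries found in the storage path
     */
    void recover(Map<String, List<String>> jobsToRecover, Set<String> filesInStorage) {
        // Count one reference for every BLOB of every job being recovered.
        for (Map.Entry<String, List<String>> job : jobsToRecover.entrySet()) {
            for (String blobKey : job.getValue()) {
                refCounts.merge(job.getKey() + "/" + blobKey, 1, Integer::sum);
            }
        }
        // Anything in storage that no recovered job references is orphaned.
        for (String file : filesInStorage) {
            if (!refCounts.containsKey(file)) {
                stagedCleanup.add(file);
            }
        }
    }
}
```

Keeping orphaned files in a separate staged set, rather than deleting them immediately, follows the "staged cleanup" wording of the last step.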

...
