The Accumulo manager currently has the following management functionality.

Primary function | Sub function | Storage | Primary Key | Description
Tablet server heart beat | | | | Periodically pings all tablet servers and gathers stats that are aggregated for the monitor.  This entire functionality needs to be reevaluated.
Tablet management | Find tablets needing attention | ZK/RT/MT | <table id>;<end row> | Finds tablets that need split, compaction, assignment, or unassignment, or that are assigned to a dead tserver.  This functionality already scales out in that this search is run in parallel in tablet server iterators.
 | Assign and balance tablets | ZK/RT/MT | <table id>;<end row> | The balancer plugin is currently called in the manager process to get tablet assignments and tablet migrations.  This plugin is given per tablet server stats.  The manager tracks active migrations in memory.  The manager makes RPCs to tablet servers to load and unload tablets and also writes location information to the metadata table.
Compaction coordination | job priority queues | Memory | <queue id>:<priority> | Stores compaction jobs in multiple priority queues that compactors use to find work.
 | reservation/commit | ZK/RT/MT | <table id>;<end row> | Atomically reserves and commits files in a compaction job for a tablet on behalf of compactors (maybe compactors could commit themselves?)
 | refresh | ZK/RT/MT | ~refresh<???> | Ensures that after a system compaction happens, hosted tablets are reliably asked to reload their metadata.
FATE | Execution | ZK | <64bit int> | Reliably executes (even in the case of process faults) a multistep internal operation in Accumulo.  Each fate transaction has a random 64bit id, and in ZooKeeper it stores an operation stack under this id.
 | Table locks | ZK | <table id> | Some fate transactions acquire a table lock that gives them exclusive access to the table.  This lock is stored in ZK for the lifetime of the fate transaction.
Service client request | manage FATEs | ZK | <64bit int> | The internal implementation of many Accumulo client operations uses common RPCs to initiate and get the status of FATE ops.
 | Modify ZK | ZK | various | (todo expand?) There are some client request RPCs the manager handles that are not FATEs.
 | Advertisement | ZK | <host>:<port> | The manager process advertises itself in ZK.
Upgrade | | ZK/RT/MT | various | Updates internal metadata in ZK/RT/MT when a new version of Accumulo starts for the first time.
GC | candidate collection | ZK/RT/MT | ~del<hash><file> | Loads a batch of candidate files to delete (todo use % for GC batch size) into memory.
 | candidate use detection | ZK/RT/MT | <table id> | Scans all in use files, removing them from the in memory candidate set.
 | file deletion | DFS | <file path> | Deletes any candidates that are not in use.  Currently a thread pool is used to process the deletions in parallel in the single process.
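
The three GC sub functions in the table (candidate collection, candidate use detection, file deletion) can be sketched as below.  This is only an illustration under stated assumptions: `GcSketch`, its method names, and the use of plain strings for file paths are all hypothetical and are not Accumulo APIs.  The sketch assumes that the files deleted are the candidates that remain after the in use files have been removed from the set.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch of the three GC phases; none of these names are Accumulo APIs.
class GcSketch {

  // Phase 1: load at most batchSize delete candidates into memory.
  static Set<String> collectCandidates(Collection<String> deleteMarkers, int batchSize) {
    Set<String> candidates = new LinkedHashSet<>();
    for (String file : deleteMarkers) {
      if (candidates.size() >= batchSize) {
        break;
      }
      candidates.add(file);
    }
    return candidates;
  }

  // Phase 2: remove every candidate that is still referenced (in use).
  static void removeInUse(Set<String> candidates, Collection<String> filesInUse) {
    candidates.removeAll(filesInUse);
  }

  // Phase 3: delete the remaining candidates in parallel using a thread pool.
  static void deleteFiles(Set<String> candidates, Consumer<String> deleter, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String file : candidates) {
        futures.add(pool.submit(() -> deleter.accept(file)));
      }
      for (Future<?> future : futures) {
        future.get(); // wait for each deletion to finish
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch all three phases run in one process, matching the current design; the `deleter` callback stands in for whatever would actually remove a file from the DFS.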

Accumulo's management functionality deals with metadata, meta operations (like creating a table), and managing the worker processes.  The GC and the Monitor have historically been separate processes from the manager process; however, their functionality is definitely related to metadata and meta operations in Accumulo.  Since a complete redesign of the manager is being considered, the GC and monitor are included as management functional components for consideration.  The non-management components of Accumulo are those that deal directly with data, namely the worker processes (compactors, tablet servers, and scan servers).

The functional management components have the following dependencies on each other.

Function | Dependency | Description
Upgrade | Tablet management | The upgrade process relies on tablet management to assign tablets during the upgrade process.
Tablet management | Upgrade | Tablet management cannot begin to bring a data level of Accumulo online until it has been upgraded.
FATE | Tablet management | Some FATE operations (like split, merge, and delete table) need tablets to be unassigned during the fate transaction.  These operations rely on the tablet management functionality to do this.  The merge code is currently too tightly coupled with tablet management; it would be nice to have it less tightly coupled, like the new split code.
Service client request | FATE | Many (todo use % for GC batch size) client requests to the manager initiate and check on FATEs.
Monitor | Tablet server heart beat | The monitor asks the manager for statistics about the current tservers.
Tablet management | Tablet server heart beat | The balancer plugin is given stats gathered from the tablet server heart beat.

Multiple active manager processes

Warning, this is an incomplete idea of how to scale the manager and should not be viewed as a definite way forward.

For scalability, Accumulo needs multiple processes executing management functionality.  One way to achieve this is to take each of the functional components above and make it an independently scalable service inside of Accumulo.  These independent services would have internal APIs (not SPI or public API) that dependent services could use.  For example, the Tablet Management service in Accumulo could have an internal API for requesting tablet unassignment that the split, merge, and delete table operations running in FATE could call.  The callers of this internal API would not know or care how or where that functionality is processed.  It could be processed by tablet management code running in another manager process.
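
As a rough illustration of the internal API idea, the sketch below shows a FATE step that depends only on an interface, so it cannot tell whether the request is handled locally or forwarded to another manager process.  Every name here (`TabletManagementService`, `requestUnassignment`, `LocalTabletManagement`, `RemoteTabletManagement`, `DeleteTableStep`) is hypothetical and invented for this example; none of it is existing Accumulo code, and the "remote" implementation only records the request where a real one would make an RPC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical internal API (not SPI, not public API); all names are illustrative.
interface TabletManagementService {
  // Request that the tablets of a table be unassigned.
  // How and where the request is processed is hidden from the caller.
  void requestUnassignment(String tableId);
}

// One possible implementation: process the request in the local manager process.
class LocalTabletManagement implements TabletManagementService {
  final List<String> unassigned = new ArrayList<>();

  @Override
  public void requestUnassignment(String tableId) {
    unassigned.add(tableId);
  }
}

// Another possible implementation: forward the request to tablet management
// code running in a different manager process (the RPC is only simulated here).
class RemoteTabletManagement implements TabletManagementService {
  final List<String> sentRequests = new ArrayList<>();

  @Override
  public void requestUnassignment(String tableId) {
    sentRequests.add("unassign:" + tableId); // stand-in for an RPC
  }
}

// A FATE step (for example, part of delete table) that uses only the interface.
class DeleteTableStep {
  private final TabletManagementService tablets;

  DeleteTableStep(TabletManagementService tablets) {
    this.tablets = tablets;
  }

  void run(String tableId) {
    tablets.requestUnassignment(tableId);
  }
}
```

With this shape, swapping `LocalTabletManagement` for `RemoteTabletManagement` requires no change to `DeleteTableStep`, which is the decoupling the paragraph above describes.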
