...

The following is a possible execution path for how a compaction might be found and run.

  1. In the manager, the TabletGroupWatcher sets up a batch scanner with the TabletManagementIterator to look for tablets that need compaction, split, merge, or assignment.  Using the batch scanner, this iterator can run on many tservers at once, allowing tablet inspection to execute in parallel.  Note the TabletManagementIterator is completely new in the elasticity branch, see #3409.
  2. The TabletManagementIterator calls the CompactionDispatcher and CompactionPlanner plugins for each tablet to see if any compaction jobs are produced.  If any jobs are produced, then the tablet is returned to the manager as needing compaction.  This assumes the selected set of files is stored in the metadata table.
  3. On the manager, the compaction jobs are recomputed for each tablet that the TabletManagementIterator returned as needing compaction.
  4. If the tablet has any jobs in the designated priority queue that differ from the jobs produced, then the old jobs in the queue are removed and the new ones are enqueued.
  5. Once the job is in a compaction queue, compactors can take the highest priority job.  The manager will need to create an entry for the compaction running on a compactor using a conditional mutation.  This is currently done in the tserver (using in-memory locks instead of conditional mutations).
  6. Once a compactor is finished with a compaction, it will need to report this to the manager.  The manager can use a conditional mutation to commit the compaction.  The manager will need to reliably inform hosted tablets that the tablet's files changed, like bulk import does in the elasticity branch.  This notification cannot be lost and must be delivered before the user compaction API call returns.
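
Steps 4 and 5 above could be sketched roughly as follows.  This is only an illustration of the queue reconciliation idea, not code from the elasticity branch: CompactionQueueSketch, Job, reconcile, and poll are hypothetical names, tablets and files are plain strings, and a single in-memory queue stands in for the designated priority queue.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical sketch; none of these types are Accumulo classes.
class CompactionQueueSketch {

  // A compaction job: the tablet it belongs to, its input files, and a priority.
  record Job(String tablet, Set<String> files, int priority) {}

  // Jobs with the largest priority value are handed out first.
  private final PriorityQueue<Job> queue =
      new PriorityQueue<>(Comparator.comparingInt(Job::priority).reversed());

  // Jobs currently queued for each tablet, used to detect differences (step 4).
  private final Map<String, Set<Job>> queuedByTablet = new HashMap<>();

  // Step 4: replace a tablet's queued jobs only if the recomputed jobs differ.
  // Returns true when the queue was changed.
  synchronized boolean reconcile(String tablet, Set<Job> recomputed) {
    Set<Job> current = queuedByTablet.getOrDefault(tablet, Set.of());
    if (current.equals(recomputed)) {
      return false; // nothing changed, leave the queue alone
    }
    queue.removeAll(current); // drop the stale jobs
    queue.addAll(recomputed); // enqueue the new ones
    if (recomputed.isEmpty()) {
      queuedByTablet.remove(tablet);
    } else {
      queuedByTablet.put(tablet, new HashSet<>(recomputed));
    }
    return true;
  }

  // Step 5: a compactor takes the highest priority job, if there is one.
  synchronized Optional<Job> poll() {
    Job job = queue.poll();
    if (job != null) {
      Set<Job> jobs = queuedByTablet.get(job.tablet());
      jobs.remove(job);
      if (jobs.isEmpty()) {
        queuedByTablet.remove(job.tablet());
      }
    }
    return Optional.ofNullable(job);
  }

  synchronized int size() {
    return queue.size();
  }
}
```

In this sketch, calling reconcile twice with the same job set leaves the queue untouched, while a job set with different files or priority replaces the tablet's old entries.  The conditional mutation that records the running compaction (step 5) and the commit (step 6) are deliberately left out, since they depend on metadata table details not covered here.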

...