...

The earlier approach pulled the remote log segment metadata through the remote log storage APIs, as described in the earlier RemoteStorageManager_Old section. This approach worked fine for storage systems like HDFS. One problem with relying on the remote storage to maintain metadata is that tiered storage needs that metadata to be strongly consistent, which affects not only the metadata itself (e.g. LIST in S3) but also the segment data (e.g. GET after a DELETE in S3). In addition to consistency and availability, the cost (and to a lesser extent the performance) of maintaining metadata in remote storage needs to be factored in. This is particularly true for S3, where LIST APIs incur huge costs.

So, we decoupled the remote storage from the remote log metadata store and introduced RemoteStorageManager for remote log storage and RemoteLogMetadataManager for remote log metadata storage. You can see the discussion details in the doc located here.
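
To illustrate the separation, here is a minimal in-memory sketch. It is not the actual RemoteStorageManager or RemoteLogMetadataManager API: the names SegmentDataStore, SegmentMetadataStore, SegmentMetadata, and the upload/read/delete helpers are hypothetical and exist only for this example. The point it shows is that offset lookups and listings are served by the metadata store alone, so the blob store only ever sees put, get, and delete by key and never a LIST call.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class DecoupledTieredStorageSketch {

    /** Hypothetical metadata record: which offsets a remote segment covers. */
    static final class SegmentMetadata {
        final String segmentId;
        final long startOffset;
        final long endOffset;

        SegmentMetadata(String segmentId, long startOffset, long endOffset) {
            this.segmentId = segmentId;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Hypothetical stand-in for the storage side: bytes by key, no listing. */
    interface SegmentDataStore {
        void put(String segmentId, byte[] data);
        Optional<byte[]> get(String segmentId);
        void delete(String segmentId);
    }

    /** Hypothetical stand-in for the metadata side: lookups and listings. */
    interface SegmentMetadataStore {
        void add(SegmentMetadata metadata);
        Optional<SegmentMetadata> findByOffset(long offset);
        List<SegmentMetadata> list();
        void remove(String segmentId);
    }

    static final class InMemoryDataStore implements SegmentDataStore {
        private final Map<String, byte[]> blobs = new HashMap<>();

        public void put(String segmentId, byte[] data) { blobs.put(segmentId, data.clone()); }
        public Optional<byte[]> get(String segmentId) { return Optional.ofNullable(blobs.get(segmentId)); }
        public void delete(String segmentId) { blobs.remove(segmentId); }
    }

    static final class InMemoryMetadataStore implements SegmentMetadataStore {
        // Keyed by start offset so the segment containing an offset is a floor lookup.
        private final TreeMap<Long, SegmentMetadata> byStartOffset = new TreeMap<>();

        public void add(SegmentMetadata metadata) { byStartOffset.put(metadata.startOffset, metadata); }

        public Optional<SegmentMetadata> findByOffset(long offset) {
            Map.Entry<Long, SegmentMetadata> entry = byStartOffset.floorEntry(offset);
            if (entry == null || offset > entry.getValue().endOffset) {
                return Optional.empty();
            }
            return Optional.of(entry.getValue());
        }

        public List<SegmentMetadata> list() { return new ArrayList<>(byStartOffset.values()); }

        public void remove(String segmentId) {
            byStartOffset.values().removeIf(m -> m.segmentId.equals(segmentId));
        }
    }

    /** Sketch ordering (an assumption): store bytes first, then publish metadata. */
    static void upload(SegmentDataStore data, SegmentMetadataStore meta,
                       String segmentId, long startOffset, long endOffset, byte[] bytes) {
        data.put(segmentId, bytes);
        meta.add(new SegmentMetadata(segmentId, startOffset, endOffset));
    }

    /** Resolve the segment via metadata, then issue a single keyed GET. */
    static Optional<byte[]> read(SegmentDataStore data, SegmentMetadataStore meta, long offset) {
        return meta.findByOffset(offset).flatMap(m -> data.get(m.segmentId));
    }

    /** Sketch ordering (an assumption): hide the segment in metadata, then drop the bytes. */
    static void delete(SegmentDataStore data, SegmentMetadataStore meta, String segmentId) {
        meta.remove(segmentId);
        data.delete(segmentId);
    }

    public static void main(String[] args) {
        SegmentDataStore data = new InMemoryDataStore();
        SegmentMetadataStore meta = new InMemoryMetadataStore();
        upload(data, meta, "seg-0", 0L, 99L, "first".getBytes(StandardCharsets.UTF_8));
        upload(data, meta, "seg-1", 100L, 199L, "second".getBytes(StandardCharsets.UTF_8));

        System.out.println("segments known to metadata store: " + meta.list().size());
        read(data, meta, 150L)
            .ifPresent(b -> System.out.println("offset 150 -> " + new String(b, StandardCharsets.UTF_8)));

        delete(data, meta, "seg-0");
        System.out.println("offset 10 readable after delete: " + read(data, meta, 10L).isPresent());
    }
}
```

Because readers resolve segments through the metadata store, a deleted segment stops being visible as soon as its metadata entry is removed, regardless of how quickly the underlying object store reflects the DELETE.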

...