...

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

def calculateRemoteTierSize(): Long = {
  // Find the leader epochs from the leader epoch cache.
  val validLeaderEpochs = fromLeaderEpochCacheToEpochs(log)
  // For each leader epoch in the current lineage, add up the size of the remote log.
  val remoteLogSizeBytes = validLeaderEpochs.map(epoch => rlmm.getRemoteLogSize(epoch)).sum
  remoteLogSizeBytes
}

// The new API would be used for size-based retention as follows:
val totalLogSize = remoteLogSizeBytes + log.localOnlyLogSegmentsSize
var remainingSize = if (shouldDeleteBySize) totalLogSize - retentionSize else 0
val segmentsIterator = remoteLogMetadataManager.listRemoteLogSegment
while (remainingSize > 0 && segmentsIterator.hasNext) {
  // delete segments
}
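
The retention flow above can be tried out with a small self-contained sketch. Everything here is a hypothetical stand-in rather than Kafka code: `RemoteSegment`, `InMemoryRlmm`, and `SizeRetentionSketch` are illustrative names, the segment listing is assumed to be ordered oldest-first, and a negative `retentionSize` is assumed to mean that size-based retention is disabled. How `remainingSize` shrinks as segments are deleted is also an assumption, since the original leaves the loop body open.

```scala
// Hypothetical stand-in for remote segment metadata; not the actual Kafka type.
final case class RemoteSegment(leaderEpoch: Int, sizeBytes: Long)

// Minimal in-memory model of the metadata manager (assumption: sizes are looked up per leader epoch).
final class InMemoryRlmm(segments: Seq[RemoteSegment]) {
  def getRemoteLogSize(leaderEpoch: Int): Long =
    segments.iterator.filter(_.leaderEpoch == leaderEpoch).map(_.sizeBytes).sum

  // Assumed to return segments oldest-first.
  def listRemoteLogSegments: Iterator[RemoteSegment] = segments.iterator
}

object SizeRetentionSketch {
  // Sum the remote size over the epochs in the current leader lineage only.
  def calculateRemoteTierSize(rlmm: InMemoryRlmm, validLeaderEpochs: Seq[Int]): Long =
    validLeaderEpochs.map(epoch => rlmm.getRemoteLogSize(epoch)).sum

  // Collect segments until the excess over retentionSize has been covered.
  def segmentsToDelete(rlmm: InMemoryRlmm,
                       validLeaderEpochs: Seq[Int],
                       localOnlySize: Long,
                       retentionSize: Long): Seq[RemoteSegment] = {
    val totalLogSize = calculateRemoteTierSize(rlmm, validLeaderEpochs) + localOnlySize
    // Assumption: a negative retentionSize disables size-based retention.
    var remainingSize = if (retentionSize >= 0) totalLogSize - retentionSize else 0L
    val segmentsIterator = rlmm.listRemoteLogSegments
    val toDelete = Seq.newBuilder[RemoteSegment]
    while (remainingSize > 0 && segmentsIterator.hasNext) {
      val segment = segmentsIterator.next()
      toDelete += segment
      remainingSize -= segment.sizeBytes
    }
    toDelete.result()
  }
}
```

Note that a segment belonging to an epoch outside the current lineage does not contribute to the computed remote size, which is why the size is summed per valid leader epoch instead of over all stored segments.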

Code changes

  1. Add the new API to RemoteLogMetadataManager
  2. Implement the new API at TopicBasedRemoteLogMetadataManager (with unit tests)
  3. Add the new metric when code for RemoteLogManager has been merged.

...

This KIP proposes to add a new metric, RemoteLogSizeBytes, which tracks the size of the data stored in the remote tier for a topic partition.
This metric will be useful for both admins and users to monitor, in real time, the volume of tiered data. In the future, it would also be used to add the size of the remote tier to the response of the DescribeLogDirs API call. RemoteLogSizeBytes will be updated with the value obtained from the getRemoteLogSize API call each time the log retention check runs (that is, every log.retention.check.interval.ms) and whenever a user explicitly calls getRemoteLogSize().
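
As a rough illustration of the update path described above, the sketch below refreshes a gauge value once per retention check. `RemoteLogSizeGauge`, `RetentionCheckSketch`, and the `sizeOfEpoch` function (standing in for the getRemoteLogSize call) are hypothetical names introduced for this example; the actual metric registration in the broker is not shown in this KIP and is not modeled here.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical holder for the RemoteLogSizeBytes value of one topic partition.
final class RemoteLogSizeGauge {
  private val current = new AtomicLong(0L)
  def value: Long = current.get()
  def update(sizeBytes: Long): Unit = current.set(sizeBytes)
}

object RetentionCheckSketch {
  // Invoked once per retention check; sizeOfEpoch stands in for getRemoteLogSize.
  def onRetentionCheck(gauge: RemoteLogSizeGauge,
                       validLeaderEpochs: Seq[Int],
                       sizeOfEpoch: Int => Long): Long = {
    val remoteLogSizeBytes = validLeaderEpochs.map(sizeOfEpoch).sum
    gauge.update(remoteLogSizeBytes) // the metric reflects the latest computed size
    remoteLogSizeBytes
  }
}
```

Between two retention checks the gauge keeps the last computed value, so the metric can lag the true remote size by up to one check interval under this model.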

Compatibility, Deprecation, and Migration Plan

...