...

Unfortunately, the script does not use the slice method of the FileRecords object, which could partially read a part of the segment log file; instead, it reads the whole segment log file(s) and then outputs the result.

Reading the whole file(s) drastically reduces the usefulness of this script, as it can affect a production environment when several files are read in a short period of time; in the end, reading just a few MB or a few batches gives us the needed information about the current pattern on the topic.

...

The kafka-dump-log.sh script uses DumpLogSegments.scala, which uses the FileRecords object to read the content of the segment log.

The change calls the slice method (already supported by the FileRecords object) when opening a segment log, passing the amount of bytes to be read as the end parameter.

The batch iterator will then read only the amount of bytes passed as a parameter instead of the whole file.
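The effect of the bounded read can be sketched outside Kafka with a plain FileChannel. This is a minimal illustration, not Kafka's actual FileRecords code: the hypothetical readSlice helper reads at most the requested window of bytes, no matter how large the segment file is.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SliceSketch {
    // Read at most (end - start) bytes from the file, mirroring the idea of
    // FileRecords.open(...).slice(start, end): the channel is opened once,
    // but only the requested window is ever read.
    static byte[] readSlice(Path file, int start, int end) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            int length = (int) Math.min(end - start, channel.size() - start);
            ByteBuffer buffer = ByteBuffer.allocate(length);
            channel.read(buffer, start);
            return buffer.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".log");
        Files.write(tmp, new byte[64 * 1024]); // pretend this is a 64 KB segment
        // Only 1 KB is read, regardless of the segment size.
        byte[] slice = readSlice(tmp, 0, 1024);
        System.out.println(slice.length); // prints 1024
    }
}
```

The same principle is what makes the script safe to run against a production broker: the I/O cost is bounded by the requested byte window, not by the segment size.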

...

As mentioned above, the change calls the slice method of the FileRecords class to allow passing the end parameter.

Code Block
languagescala
titleFileRecords open
    // Open the segment log, then slice it so that only maxMessageSize bytes
    // starting at startOffset are read, limiting the batches in bytes.
    val fileRecords = FileRecords.open(file, false, false, 0, false)
      .slice(startOffset, maxMessageSize)




  • The code changes can be seen here in the open PR

...

  • There is no impact on existing users; it adds a new feature.
  • I am using Integer.MAX_VALUE as the default value, since FileRecords accepts an Integer as the end parameter; that means a limit of this value already existed. When the new parameter is not passed, Integer.MAX_VALUE is sent, which is equivalent to reading the whole file (if it does not exceed 2 GB), matching the limitation that already exists in the FileRecords class.
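That defaulting behavior can be sketched as follows; this is an illustrative helper, not the KIP's actual code, and the name maxBytesArg is hypothetical:

```java
public class MaxBytesDefault {
    // Hypothetical handling of the proposed parameter: when it is not
    // supplied, fall back to Integer.MAX_VALUE, i.e. read the whole file
    // (FileRecords already caps a segment below 2 GB, so behavior is
    // unchanged for existing users).
    static int maxBytesOrDefault(Integer maxBytesArg) {
        return maxBytesArg != null ? maxBytesArg : Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(maxBytesOrDefault(1024)); // prints 1024: read at most 1 KB
        System.out.println(maxBytesOrDefault(null)); // prints 2147483647: effectively the whole file
    }
}
```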