Status

Discussion thread: https://lists.apache.org/thread/3gcxhnqpsvb85golnlxf9tv5p43xkjg
Vote thread: https://lists.apache.org/thread/99cw4y50xhvc1h9z7v07j5v1krqcxr27
JIRA

Release

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The Path class is currently mutable to support IOReadableWritable serialization. However, many parts of the code assume that the Path is immutable. By making the Path class immutable, we can ensure that paths are stored correctly without the possibility of mutation and eliminate the occurrence of subtle errors.

Public Interfaces

Modify the Path class to no longer implement the IOReadableWritable interface.

Add two static methods to the Path class to support serializing to DataOutputView and deserializing from DataInputView:

public static Path deserializeFromDataInputView(DataInputView in) throws IOException

public static void serializeToDataOutputView(Path path, DataOutputView out) throws IOException
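As a rough illustration of the shape of these two methods, the sketch below uses a hypothetical stand-in class, ImmutablePathSketch, rather than Flink's real Path. It assumes java.io.DataInput and java.io.DataOutput in place of DataInputView and DataOutputView (which extend them), and it encodes the path as a single UTF string, which is an assumption and not the wire format Flink actually uses. The roundTrip helper is also hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;

// Hypothetical stand-in for Flink's Path; not the real class.
final class ImmutablePathSketch {
    // final: the URI cannot be reassigned once the object is constructed.
    private final URI uri;

    ImmutablePathSketch(String pathString) {
        this.uri = URI.create(pathString);
    }

    URI toUri() {
        return uri;
    }

    // Mirrors the proposed serializeToDataOutputView.
    // Assumption: the path is written as one UTF string.
    static void serializeToDataOutputView(ImmutablePathSketch path, DataOutput out)
            throws IOException {
        out.writeUTF(path.uri.toString());
    }

    // Mirrors the proposed deserializeFromDataInputView: it returns a new
    // instance instead of mutating an existing one.
    static ImmutablePathSketch deserializeFromDataInputView(DataInput in)
            throws IOException {
        return new ImmutablePathSketch(in.readUTF());
    }

    // Hypothetical helper: serializes to a byte array and reads it back.
    static ImmutablePathSketch roundTrip(ImmutablePathSketch path) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            serializeToDataOutputView(path, new DataOutputStream(bytes));
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return deserializeFromDataInputView(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because deserialization is a static factory, no caller ever needs an empty, not-yet-populated instance, which is what allows the field to be final.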

Proposed Changes

There are two steps to modify the Path class to no longer implement the IOReadableWritable interface.

First, add the two static methods to the Path class to support serialization and deserialization. Currently, three classes serialize and deserialize the Path through the IOReadableWritable interface: FileSourceSplitSerializer, TestManagedSinkCommittableSerializer, and TestManagedFileSourceSplitSerializer. Modify these classes to use the two static methods instead of IOReadableWritable.
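The change to such a serializer could look roughly like the sketch below. All class names here (PathStandIn, SplitStandIn, SplitSerializerSketch) are hypothetical stand-ins, not Flink's FileSourceSplitSerializer or its actual field layout, and java.io.DataInput/DataOutput are assumed in place of DataInputView/DataOutputView. The point is only that the serializer calls the static methods instead of creating an empty path and filling it in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the immutable Path with the two static methods.
final class PathStandIn {
    private final String value;

    PathStandIn(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    static void serializeToDataOutputView(PathStandIn path, DataOutput out) throws IOException {
        out.writeUTF(path.value);
    }

    static PathStandIn deserializeFromDataInputView(DataInput in) throws IOException {
        return new PathStandIn(in.readUTF());
    }
}

// Hypothetical split holding a path plus a byte range.
final class SplitStandIn {
    final PathStandIn path;
    final long offset;
    final long length;

    SplitStandIn(PathStandIn path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }
}

// Hypothetical serializer that uses the static methods on the path.
final class SplitSerializerSketch {
    static byte[] serialize(SplitStandIn split) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            // Previously: split.path.write(out) via IOReadableWritable.
            PathStandIn.serializeToDataOutputView(split.path, out);
            out.writeLong(split.offset);
            out.writeLong(split.length);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static SplitStandIn deserialize(byte[] serialized) {
        try {
            DataInput in = new DataInputStream(new ByteArrayInputStream(serialized));
            // Previously: create an empty path, then call read(in) to mutate it.
            PathStandIn path = PathStandIn.deserializeFromDataInputView(in);
            long offset = in.readLong();
            long length = in.readLong();
            return new SplitStandIn(path, offset, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```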

Second, mark the implementation methods of IOReadableWritable in the Path class as deprecated and remove them in the next major release.
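During the deprecation window the old and new entry points would coexist. The sketch below shows one way that transitional state might look, using a hypothetical interface ReadableWritableStandIn in place of IOReadableWritable and a hypothetical class TransitionalPathSketch in place of Path. The single-string encoding and the writeOldReadNew helper are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for IOReadableWritable.
interface ReadableWritableStandIn {
    void write(DataOutput out) throws IOException;

    void read(DataInput in) throws IOException;
}

// Hypothetical transitional state of the path class.
final class TransitionalPathSketch implements ReadableWritableStandIn {
    // Still non-final while the deprecated read(...) exists; it can only
    // become final once the deprecated methods are removed.
    private String value;

    TransitionalPathSketch() {
        this.value = "";
    }

    TransitionalPathSketch(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    /** @deprecated use the static serializeToDataOutputView instead. */
    @Deprecated
    @Override
    public void write(DataOutput out) throws IOException {
        // Delegates so both entry points produce the same bytes.
        serializeToDataOutputView(this, out);
    }

    /** @deprecated use the static deserializeFromDataInputView instead. */
    @Deprecated
    @Override
    public void read(DataInput in) throws IOException {
        this.value = in.readUTF();
    }

    static void serializeToDataOutputView(TransitionalPathSketch path, DataOutput out)
            throws IOException {
        out.writeUTF(path.value);
    }

    static TransitionalPathSketch deserializeFromDataInputView(DataInput in)
            throws IOException {
        return new TransitionalPathSketch(in.readUTF());
    }

    // Hypothetical helper: writes with the deprecated method and reads with
    // the new static method, showing that the two remain compatible.
    static String writeOldReadNew(String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new TransitionalPathSketch(value).write(new DataOutputStream(bytes));
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return deserializeFromDataInputView(in).value();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Having the deprecated write delegate to the static method keeps a single encoding in one place, so data written through either path stays readable by the other during the transition.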

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
  • Users who rely on IOReadableWritable to serialize or deserialize the Path will need to migrate to the serializeToDataOutputView and deserializeFromDataInputView methods in the Path class. This migration should not be difficult.

  • Because the IOReadableWritable implementation methods in the Path class are part of the public API, we will deprecate them two releases before removing them.