...

Option1 gives a clearer module organization and clearer user dependencies, at the cost of adding one more module.
Option2 leaves the current module organization unchanged, but places the general interface implementation in the hudi-hive-sync module. User-defined implementations would then have to depend on hudi-hive-sync, which is semantically odd.

I personally prefer Option1.

2.2 Code (class) structure

[Image: image.png]

Here, AbstractHoodieSyncClient is the abstract synchronization client. Its default implementation is HoodieHiveClient, and users can provide their own client implementation.
The abstract methods in AbstractHoodieSyncClient are listed below:

public abstract void createTable(String tableName, MessageType storageSchema,
    String inputFormatClass, String outputFormatClass, String serdeClass);

public abstract boolean doesTableExist(String tableName);

public abstract Option<String> getLastCommitTimeSynced(String tableName);

public abstract void updateLastCommitTimeSynced(String tableName);

public abstract void addPartitionsToTable(String tableName, List<String> partitionsToAdd);

public abstract void updatePartitionsToTable(String tableName, List<String> changedPartitions);
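
To make the contract above more concrete, here is a minimal sketch of what a user-defined client could look like. It is not the actual Hudi code: SketchSyncClient is a hypothetical stand-in for AbstractHoodieSyncClient, String replaces MessageType and java.util.Optional replaces Option so that the example compiles without Hudi or Parquet on the classpath. InMemorySyncClient, its latestCommitTime constructor argument, and the getPartitions helper are also assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for AbstractHoodieSyncClient (types simplified).
abstract class SketchSyncClient {
    public abstract void createTable(String tableName, String storageSchema,
        String inputFormatClass, String outputFormatClass, String serdeClass);
    public abstract boolean doesTableExist(String tableName);
    public abstract Optional<String> getLastCommitTimeSynced(String tableName);
    public abstract void updateLastCommitTimeSynced(String tableName);
    public abstract void addPartitionsToTable(String tableName, List<String> partitionsToAdd);
    public abstract void updatePartitionsToTable(String tableName, List<String> changedPartitions);
}

// Hypothetical user-defined client that keeps all metadata in memory.
class InMemorySyncClient extends SketchSyncClient {
    private final Map<String, List<String>> partitionsByTable = new HashMap<>();
    private final Map<String, String> lastSyncedByTable = new HashMap<>();
    // Assumption: the latest commit time is handed in by the caller.
    private final String latestCommitTime;

    InMemorySyncClient(String latestCommitTime) {
        this.latestCommitTime = latestCommitTime;
    }

    @Override
    public void createTable(String tableName, String storageSchema,
        String inputFormatClass, String outputFormatClass, String serdeClass) {
        // Only the table name is recorded; schema and formats are ignored here.
        partitionsByTable.putIfAbsent(tableName, new ArrayList<>());
    }

    @Override
    public boolean doesTableExist(String tableName) {
        return partitionsByTable.containsKey(tableName);
    }

    @Override
    public Optional<String> getLastCommitTimeSynced(String tableName) {
        return Optional.ofNullable(lastSyncedByTable.get(tableName));
    }

    @Override
    public void updateLastCommitTimeSynced(String tableName) {
        lastSyncedByTable.put(tableName, latestCommitTime);
    }

    @Override
    public void addPartitionsToTable(String tableName, List<String> partitionsToAdd) {
        partitionsByTable.computeIfAbsent(tableName, t -> new ArrayList<>()).addAll(partitionsToAdd);
    }

    @Override
    public void updatePartitionsToTable(String tableName, List<String> changedPartitions) {
        // Partition locations are not tracked, so only register unknown partitions.
        List<String> known = partitionsByTable.computeIfAbsent(tableName, t -> new ArrayList<>());
        for (String partition : changedPartitions) {
            if (!known.contains(partition)) {
                known.add(partition);
            }
        }
    }

    List<String> getPartitions(String tableName) {
        return partitionsByTable.getOrDefault(tableName, new ArrayList<>());
    }
}

class SyncClientDemo {
    public static void main(String[] args) {
        InMemorySyncClient client = new InMemorySyncClient("20200601120000");
        if (!client.doesTableExist("trips")) {
            client.createTable("trips", "schema-placeholder", "in", "out", "serde");
        }
        client.addPartitionsToTable("trips", Arrays.asList("2020/06/01"));
        client.updateLastCommitTimeSynced("trips");
        System.out.println(client.getLastCommitTimeSynced("trips").orElse("never synced"));
    }
}
```

A real implementation would talk to an external metastore instead of in-memory maps, but the call sequence (check existence, create, add partitions, record the last synced commit) would presumably stay the same.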

If Option1 is chosen, AbstractHoodieSyncClient goes into the hudi-common-sync module; if Option2 is chosen, it goes into hudi-hive-sync.
[Image]

AbstractSyncTool is the abstract synchronization tool. Every synchronization tool must extend this class; the default implementation is HiveSyncTool.
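
As a rough illustration of the inheritance relationship, the sketch below shows a custom tool extending an abstract base. The text above does not spell out the members of AbstractSyncTool, so everything here is an assumption: SketchSyncTool is a hypothetical stand-in, the Properties-based constructor and the syncHoodieTable() entry point are assumed shapes, and the property key sketch.table.name is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for AbstractSyncTool; constructor and method are assumptions.
abstract class SketchSyncTool {
    protected final Properties props;

    protected SketchSyncTool(Properties props) {
        this.props = props;
    }

    // Assumed single entry point that performs the synchronization.
    public abstract void syncHoodieTable();
}

// Hypothetical custom tool that only records which table it was asked to sync.
class RecordingSyncTool extends SketchSyncTool {
    private final List<String> syncedTables = new ArrayList<>();

    RecordingSyncTool(Properties props) {
        super(props);
    }

    @Override
    public void syncHoodieTable() {
        // "sketch.table.name" is an invented key, not a real Hudi config.
        String table = props.getProperty("sketch.table.name", "unknown_table");
        syncedTables.add(table);
    }

    List<String> getSyncedTables() {
        return syncedTables;
    }
}

class SyncToolDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("sketch.table.name", "trips");
        RecordingSyncTool tool = new RecordingSyncTool(props);
        tool.syncHoodieTable();
        System.out.println(tool.getSyncedTables());
    }
}
```

Keeping the base class down to a configuration holder plus one abstract method would let the caller treat HiveSyncTool and any user-defined tool uniformly.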

...