...
Argument | Description | Remark
---|---|---
--source-base-path | Base path for the source Hudi dataset to be snapshotted | required
--target-base-path | Base path for the target output files (snapshots) | required
--snapshot-prefix | Snapshot prefix or directory under the target base path, used to segregate different snapshots | optional; may default to a daily prefix generated at run time, such as 2019/11/12/
--output-format | "HUDI", "PARQUET" | required; when "HUDI", behaves the same as HoodieSnapshotCopier; may support more data formats in the future
--output-partition-field | A field to be used by Spark repartitioning | optional; ignored when the output format is "HUDI" or when --output-partitioner is specified. By default, the output dataset inherits its partition field from the source Hudi dataset. When this argument is specified, the provided value is used for both in-memory Spark repartitioning and output file partitioning. See the code snippet below. If more flexibility is needed for repartitioning, use --output-partitioner.
--output-partitioner | A class to facilitate custom repartitioning | optional; ignored when the output format is "HUDI"
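
The way these arguments interact can be sketched in plain Java. This is only an illustration under stated assumptions, not the exporter's actual code: the class `SnapshotExportPlan`, its method names, and the `Partitioning` enum are hypothetical, and the rule that a custom partitioner takes precedence over the partition field is an assumption. The Spark calls mentioned in the comment (`repartition`, `partitionBy`, `parquet`) are standard Dataset and DataFrameWriter methods, shown only to indicate what the field-based case might look like.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hudi: models how the arguments in the table interact.
final class SnapshotExportPlan {

    // How the output would be partitioned, following the Remark column.
    enum Partitioning { SOURCE_LAYOUT, BY_FIELD, CUSTOM_PARTITIONER }

    // Daily prefix format, e.g. 2019/11/12/ (slashes are emitted literally).
    private static final DateTimeFormatter DAILY_PREFIX =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/");

    // Joins the target base path with the snapshot prefix.
    // Falls back to a daily prefix when no prefix is given (assumed default behavior).
    static String resolveOutputPath(String targetBasePath, String snapshotPrefix, LocalDate runDate) {
        String prefix = (snapshotPrefix == null || snapshotPrefix.isEmpty())
                ? DAILY_PREFIX.format(runDate)
                : snapshotPrefix;
        String base = targetBasePath.endsWith("/") ? targetBasePath : targetBasePath + "/";
        return base + prefix;
    }

    // Decides which partitioning rule applies for a given combination of arguments.
    static Partitioning resolvePartitioning(String outputFormat, String partitionField,
                                            String partitionerClass) {
        if ("HUDI".equals(outputFormat)) {
            // Both partitioning arguments are ignored for the "HUDI" format.
            return Partitioning.SOURCE_LAYOUT;
        }
        if (partitionerClass != null) {
            // Assumption: a custom partitioner takes precedence over the field.
            return Partitioning.CUSTOM_PARTITIONER;
        }
        if (partitionField != null) {
            // With Spark, this case might correspond roughly to:
            //   df.repartition(df.col(partitionField))
            //     .write().partitionBy(partitionField).parquet(outputPath);
            return Partitioning.BY_FIELD;
        }
        // No argument given: keep the partition field of the source Hudi dataset.
        return Partitioning.SOURCE_LAYOUT;
    }
}
```

For example, calling `resolveOutputPath("/data/snapshots", null, LocalDate.of(2019, 11, 12))` yields `/data/snapshots/2019/11/12/`, matching the daily prefix shown in the table.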
...