...
If Hive Sync is enabled in the deltastreamer tool or datasource, the dataset is available in Hive as a couple of tables that can be read using HiveQL, Presto or SparkSQL. See here for more.
...
Hudi provides built-in support for rewriting your entire dataset into Hudi as a one-time operation, using the HDFSParquetImporter tool available from the hudi-cli. You could also do this via a simple read and write of the dataset using the Spark datasource APIs. Once migrated, writes can be performed using the normal means discussed here. This topic is discussed in detail here, including how to do only a partial migration.
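As a rough illustration of the "simple read and write" route, here is a minimal PySpark sketch. It is not the project's official migration recipe. The table name, field names, and helper functions (`hudi_write_options`, `migrate_parquet_to_hudi`) are hypothetical, and the format name `org.apache.hudi` is an assumption that depends on your Hudi version. The `hoodie.*` keys are the standard datasource write options; check them against the configuration reference for your release.

```python
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build datasource write options for a one-time bulk load into Hudi."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        # bulk_insert is assumed here because this is an initial load
        "hoodie.datasource.write.operation": "bulk_insert",
    }


def migrate_parquet_to_hudi(spark, source_path, target_path, options):
    """Read an existing parquet dataset and rewrite it as a Hudi dataset."""
    df = spark.read.parquet(source_path)
    (df.write
        .format("org.apache.hudi")  # assumption: name varies across Hudi versions
        .options(**options)
        .mode("overwrite")          # first write creates the dataset
        .save(target_path))


# Hypothetical column names for an example "trips" dataset
options = hudi_write_options("trips", "uuid", "datestr", "ts")
```

With a live SparkSession, the migration would then be a single call such as `migrate_parquet_to_hudi(spark, source_path, target_path, options)`, where both paths are placeholders for your own locations.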
How can I pass Hudi configurations to my Spark job?
...
Can I register my Hudi dataset with Apache Hive metastore?
<Answer WIP> Yes. This can be performed either via the standalone Hive Sync tool or using options in the deltastreamer tool or datasource.
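For the datasource route, the sketch below shows how the Hive sync options might be assembled in Python and merged into a write. The helper name `hive_sync_options`, the database and table names, and the JDBC URL are all hypothetical. The `hoodie.datasource.hive_sync.*` keys are the usual datasource options, but confirm them against the configuration reference for your Hudi version.

```python
def hive_sync_options(database, table, jdbc_url, partition_fields):
    """Build datasource options that enable Hive sync on write."""
    return {
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.database": database,
        "hoodie.datasource.hive_sync.table": table,
        "hoodie.datasource.hive_sync.jdbcurl": jdbc_url,
        # Hive partition columns are passed as a comma-separated string
        "hoodie.datasource.hive_sync.partition_fields": ",".join(partition_fields),
    }


# Hypothetical values for illustration only
sync_options = hive_sync_options(
    "default", "trips", "jdbc:hive2://localhost:10000", ["datestr"]
)
```

These entries would be passed alongside the regular write options, for example via `.options(**sync_options)` on the DataFrame writer, so that the table is registered in the metastore as part of the write.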
How does Hudi indexing work & what are its benefits?
...
What's Hudi's schema evolution story?
<Answer WIP>
...
How do I run compaction for a MOR dataset?
Simplest way to cio
Performance
...