...

Does AWS Glue support Hudi?

AWS Glue jobs can write, read and update the Glue Data Catalog for Hudi tables. To integrate successfully with the Glue Data Catalog, you need to subscribe to the AWS-provided Glue connector named "AWS Glue Connector for Apache Hudi". The Glue job also needs to have the "Use Glue data catalog as the Hive metastore" option ticked. Detailed steps with sample scripts are available in this article provided by AWS: https://aws.amazon.com/blogs/big-data/writing-to-apache-hudi-tables-using-aws-glue-connector/.

If you are using notebooks or Zeppelin through Glue dev endpoints, your script might not be able to integrate with the Glue Data Catalog when writing to Hudi tables.
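As a rough illustration of the catalog integration described above, the sketch below assembles the Hudi write options a Glue job would typically pass so that the table is also registered in the metastore. The helper names (hudi_glue_options, write_hudi) and the example table, database and field names are hypothetical. The option keys are standard Hudi datasource and hive_sync settings, but the exact set your job needs may differ depending on the Hudi and connector versions, so treat this as a starting point and not as the script from the AWS article.

```python
def hudi_glue_options(table, database, record_key, partition_field, precombine_field):
    """Build Hudi write options that also sync the table to the metastore.

    Hypothetical helper: the argument names are illustrative only.
    """
    return {
        # Core Hudi write settings
        "hoodie.table.name": table,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
        # Metastore sync settings. With "Use Glue data catalog as the Hive
        # metastore" ticked on the job, the sync is assumed to land in the
        # Glue Data Catalog.
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.use_jdbc": "false",
        "hoodie.datasource.hive_sync.database": database,
        "hoodie.datasource.hive_sync.table": table,
        "hoodie.datasource.hive_sync.partition_fields": partition_field,
        "hoodie.datasource.hive_sync.partition_extractor_class":
            "org.apache.hudi.hive.MultiPartKeysValueExtractor",
    }


def write_hudi(df, base_path, options):
    """Write a Spark DataFrame as a Hudi table (needs Hudi on the job classpath)."""
    df.write.format("hudi").options(**options).mode("append").save(base_path)


# Example values are assumptions for illustration.
options = hudi_glue_options(
    table="trips",
    database="analytics",
    record_key="trip_id",
    partition_field="trip_date",
    precombine_field="updated_at",
)
```

Inside a Glue job, a call such as write_hudi(df, "s3://<your-bucket>/<prefix>/trips", options) would perform the write; the bucket and prefix are placeholders to fill in.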

Why are partition fields also stored in parquet files in addition to the partition path?

...