...

The fix for this is to create an uber schema using all the schema versions evolved so far for the event in question, and to use this uber schema as the target schema. One good approach is to fetch the schema from the Hive metastore and merge it with the current schema.
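
A minimal sketch of such a merge using the plain Avro API (the mergeSchemas helper is hypothetical and assumes Avro 1.9+, where Schema.Field#defaultVal is available; fetching the old schema from the Hive metastore is left out):

Code Block
languagescala
import org.apache.avro.Schema
import scala.collection.JavaConverters._

// Hypothetical helper: build an "uber" schema containing every field seen so
// far. Fields present in the old schema (e.g. fetched from the Hive
// metastore) but missing from the new batch schema are carried over, so
// records written with either schema can be rewritten against the merged one.
def mergeSchemas(oldSchema: Schema, newSchema: Schema): Schema = {
  val newFieldNames = newSchema.getFields.asScala.map(_.name).toSet
  val carriedOver = oldSchema.getFields.asScala
    .filterNot(f => newFieldNames.contains(f.name))
  // Field objects cannot be reused across records, so copy each one.
  val allFields = (newSchema.getFields.asScala ++ carriedOver)
    .map(f => new Schema.Field(f.name, f.schema, f.doc, f.defaultVal))
  val merged = Schema.createRecord(
    newSchema.getName, newSchema.getDoc, newSchema.getNamespace, false)
  merged.setFields(allFields.asJava)
  merged
}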

Sample stacktrace where a field named "toBeDeletedStr" was omitted from a new batch of updates: https://gist.github.com/nsivabalan/cafc53fc9a8681923e4e2fa4eb2133fe

1.2 Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.avro.AvroConverters$FieldIntegerConverter

...

For such errors, ensure that only valid data type conversions are happening in the primary data source from which you are ingesting.

Sample stacktrace when trying to evolve a field from Long type to Integer type with Hudi: https://gist.github.com/nsivabalan/0d81cd60a3e7a0501e6a0cb50bfaacea
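
As a pre-flight check, you can validate an evolved schema against the previous one with Avro's built-in compatibility checker before ingesting; a minimal sketch (the inline schemas are illustrative):

Code Block
languagescala
import org.apache.avro.{Schema, SchemaCompatibility}

// Can records written with the old (writer) schema be read with the new
// (reader) schema? int -> long is a legal promotion, but long -> int is not.
val oldSchema = new Schema.Parser().parse(
  """{"type":"record","name":"Event","fields":[{"name":"count","type":"long"}]}""")
val newSchema = new Schema.Parser().parse(
  """{"type":"record","name":"Event","fields":[{"name":"count","type":"int"}]}""")

val result = SchemaCompatibility.checkReaderWriterCompatibility(newSchema, oldSchema)
println(result.getType)  // INCOMPATIBLE: a long field cannot be read as int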

1.3 org.apache.hudi.exception.SchemaCompatabilityException: Unable to validate the rewritten record <record> against schema <schema>
at org.apache.hudi.common.util.HoodieAvroUtils.rewrite(HoodieAvroUtils.java:215)

This can occur if your schema has a non-nullable field whose value is not present or is null. It is recommended to evolve the schema in backwards-compatible ways: in essence, either make every newly added field nullable or define a default value for every new field. If you are relying on a default value for your field, note that as of Hudi version 0.5.1 this is not handled.
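
A minimal sketch of a backwards-compatible field addition (the Event schema is illustrative): the new field is a nullable union defaulting to null, so older records that lack it still validate against the evolved schema.

Code Block
languagescala
import org.apache.avro.Schema

// "newField" is nullable with a default of null, so this evolution is
// backwards compatible with records written before the field existed.
val evolved: Schema = new Schema.Parser().parse(
  """{
    |  "type": "record", "name": "Event",
    |  "fields": [
    |    {"name": "id", "type": "string"},
    |    {"name": "newField", "type": ["null", "string"], "default": null}
    |  ]
    |}""".stripMargin)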

1.4 Hudi consumes too much space in a temp folder during upsert

When upserting large input data, Hudi spills part of the input to disk once it reaches the maximum memory allowed for merging. If there is enough memory available, increase the Spark executor's memory and the "hoodie.memory.merge.fraction" option, for example:

Code Block
option("hoodie.memory.merge.fraction", "0.8")  //    

Ingestion

2.1 Caused by: java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:381)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)

...

Code Block
languagebash
Steps: (build with hive-2.1.0)
1. git clone git@github.com:apache/incubator-hudi.git
2. rm hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java
3. mvn clean package -DskipTests -DskipITs -Dhive.version=2.1.0

3.5 Caused by: java.lang.UnsupportedOperationException: Table rename is not supported

This issue can occur when syncing to Hive. A possible reason is that Hive does not play well with table names containing both upper and lower case letters. Use all lower case letters for your table name and the issue should likely be fixed. Related issue: https://github.com/apache/hudi/issues/2409
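
For example, when writing through the Spark datasource with Hive sync enabled, keep both the table name and the Hive sync table name all lower case; a minimal sketch (df and basePath are placeholders):

Code Block
languagescala
// Lower-case table names avoid the Hive "Table rename is not supported" error.
df.write.format("hudi")
  .option("hoodie.table.name", "my_hudi_table")
  .option("hoodie.datasource.hive_sync.enable", "true")
  .option("hoodie.datasource.hive_sync.table", "my_hudi_table")
  .mode("append")
  .save(basePath)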


Running from IDE

4.1 "java.lang.IllegalArgumentException: Unsupported class file major version 56".

Please use Java 8 instead of Java 11.