...

  • Efficient predicate pushdown. Metadata for all columns can be stored together, and a query reads only the columns it needs.
  • Parallel processing of splits is natively supported.
  • Can provide UDF support if needed.
    • This may be useful for commonly used geo queries.
    • Example: a table has latitude/longitude columns, but we can query the data by hexagon or quad-tree cell efficiently using the data skipping index.
  • Better storage compression
  • We can try different layouts by sorting the data on different parameters (partition, fileId, columnBeingIndexed, etc.).
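
To make the data skipping idea concrete, here is a minimal sketch of pruning files by per-column min/max statistics for a latitude/longitude bounding-box query. It is an illustration only, not Hudi's implementation: the `ColumnStats` record, the `overlapping_files` and `files_for_bounding_box` helpers, and the sample file ids and ranges are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical index row: value range of one column in one data file."""
    file_id: str
    column: str
    min_value: float
    max_value: float


def overlapping_files(stats, column, lo, hi):
    """Ids of files whose [min, max] range for `column` overlaps [lo, hi]."""
    return {
        s.file_id
        for s in stats
        if s.column == column and s.min_value <= hi and s.max_value >= lo
    }


def files_for_bounding_box(stats, lat_range, lon_range):
    """Keep only files that may contain points inside the bounding box."""
    lat_files = overlapping_files(stats, "latitude", *lat_range)
    lon_files = overlapping_files(stats, "longitude", *lon_range)
    return sorted(lat_files & lon_files)


# Made-up statistics for three data files.
INDEX = [
    ColumnStats("f1", "latitude", 10.0, 20.0),
    ColumnStats("f1", "longitude", -80.0, -70.0),
    ColumnStats("f2", "latitude", 35.0, 45.0),
    ColumnStats("f2", "longitude", -125.0, -115.0),
    ColumnStats("f3", "latitude", 18.0, 37.0),
    ColumnStats("f3", "longitude", -90.0, -60.0),
]

if __name__ == "__main__":
    # Only f2 overlaps the box on both columns; f1 and f3 are skipped.
    print(files_for_bounding_box(INDEX, (36.0, 40.0), (-122.0, -118.0)))
```

A UDF that maps each point to a hexagon or quad-tree cell id could feed the same kind of range check with a derived cell column in place of the raw coordinates.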

...

  • Doesn’t work well with the Hudi metadata table: the metadata table's base file format is HFile, and a Hudi table cannot use different file formats for different partitions.
  • No fast single-key lookup, so it may not be ideal for other index types such as UUID lookup.
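
One way to read the single-key limitation is sketched below, under stated assumptions. Range statistics prune poorly when keys are effectively random (as UUIDs are), because every file's min/max range tends to cover most of the key space. A sorted key-to-file mapping, by contrast, can answer a point lookup with a binary search. The short hex strings stand in for UUIDs, and `FILES`, `candidates_by_range`, `build_sorted_index`, and `point_lookup` are hypothetical names, not Hudi or HFile APIs.

```python
from bisect import bisect_left

# Made-up keys per file; short hex strings stand in for UUIDs.
FILES = {
    "f0": ["03", "7c", "f1"],
    "f1": ["0a", "55", "e9"],
    "f2": ["11", "80", "fe"],
}


def candidates_by_range(files, key):
    """Files whose [min, max] key range contains `key` (min/max pruning)."""
    return sorted(f for f, keys in files.items() if min(keys) <= key <= max(keys))


def build_sorted_index(files):
    """Sorted list of (key, file_id) pairs, like a sorted key-value store."""
    return sorted((k, f) for f, keys in files.items() for k in keys)


def point_lookup(sorted_index, key):
    """Binary search for `key`; returns its file id, or None if absent."""
    pos = bisect_left(sorted_index, (key,))
    if pos < len(sorted_index) and sorted_index[pos][0] == key:
        return sorted_index[pos][1]
    return None


if __name__ == "__main__":
    # Range pruning keeps all three files for this key.
    print(candidates_by_range(FILES, "7c"))
    # The sorted index resolves the same key to a single file.
    print(point_lookup(build_sorted_index(FILES), "7c"))
```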

...