...

CarbonData supports customised column compressors, so users can plug in their own compressor implementation. To use a custom compressor, specify its full class name either when creating the table or by setting it as a carbon property.
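As a rough illustration of how a compressor named by its full class name could be resolved, the sketch below loads an implementation through reflection. The `ColumnCompressor` interface, `IdentityCompressor` and `CompressorLoader` are hypothetical names invented for this example; they are not CarbonData's actual compressor contract.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface for illustration only; CarbonData's real
// compressor contract is not reproduced here.
interface ColumnCompressor {
    String getName();
    byte[] compress(byte[] input);
}

// A trivial pass-through implementation standing in for a user's compressor.
class IdentityCompressor implements ColumnCompressor {
    public IdentityCompressor() {}

    @Override
    public String getName() {
        return "identity";
    }

    @Override
    public byte[] compress(byte[] input) {
        return input.clone();
    }
}

class CompressorLoader {
    // Resolve a compressor from its full class name via reflection.
    static ColumnCompressor load(String className) {
        try {
            Object instance = Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            if (!(instance instanceof ColumnCompressor)) {
                throw new IllegalArgumentException(className + " is not a ColumnCompressor");
            }
            return (ColumnCompressor) instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load compressor " + className, e);
        }
    }

    public static void main(String[] args) {
        ColumnCompressor c = load(IdentityCompressor.class.getName());
        byte[] out = c.compress("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(c.getName() + " compressed " + out.length + " bytes");
    }
}
```

Requiring a public no-argument constructor keeps the lookup simple: the only thing the user has to supply is the class name.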

Performance Improvements

Optimised

...

CarbonData scan performance

CarbonData scan performance is improved by avoiding multiple data copies in the vector flow. This is achieved by short-circuiting the read and vector-filling steps: data is filled directly into the vector after it is read from the file, without any intermediate copies.
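The difference between the two paths can be sketched as follows. This is a simplified model under stated assumptions, not CarbonData's reader code: the "page" is a plain `ByteBuffer` of 4-byte integers, the "vector" is an `int[]`, and all class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class VectorFillSketch {
    // Copying path: decode the page into a temporary array first,
    // then copy that array into the vector.
    static void fillWithIntermediateCopy(ByteBuffer page, int[] vector) {
        int[] temp = new int[vector.length];
        for (int i = 0; i < temp.length; i++) {
            temp[i] = page.getInt(i * Integer.BYTES);
        }
        System.arraycopy(temp, 0, vector, 0, temp.length);
    }

    // Short-circuit path: decode each value straight into the vector,
    // so no intermediate array is allocated or copied.
    static void fillDirect(ByteBuffer page, int[] vector) {
        for (int i = 0; i < vector.length; i++) {
            vector[i] = page.getInt(i * Integer.BYTES);
        }
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(4 * Integer.BYTES);
        for (int i = 0; i < 4; i++) {
            page.putInt(i * Integer.BYTES, i * 10);
        }
        int[] vector = new int[4];
        fillDirect(page, vector);
        System.out.println(Arrays.toString(vector)); // prints [0, 10, 20, 30]
    }
}
```

Both methods produce the same vector; the direct path simply skips the temporary allocation and the extra copy.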

Row filter pruning: Row-level filter processing is now handled in the execution engine. For the vector flow, CarbonData itself handles only blocklet and page pruning using the filter. This is controlled by the property carbon.push.rowfilters.for.vector, which defaults to false.
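A minimal sketch of reading this flag with its documented default is shown below. It uses plain `java.util.Properties` rather than CarbonData's own property API, and the class and method names are hypothetical. Reading `true` as "push row-level filtering down to CarbonData" is an assumption based on the property name.

```java
import java.util.Properties;

class RowFilterPushdownConfig {
    // Property name taken from the release note above.
    static final String PUSH_ROW_FILTERS_FOR_VECTOR = "carbon.push.rowfilters.for.vector";

    // Returns false unless the property is explicitly set to "true",
    // matching the documented default.
    static boolean pushRowFilters(Properties props) {
        return Boolean.parseBoolean(props.getProperty(PUSH_ROW_FILTERS_FOR_VECTOR, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(pushRowFilters(props)); // prints false (default)

        props.setProperty(PUSH_ROW_FILTERS_FOR_VECTOR, "true");
        System.out.println(pushRowFilters(props)); // prints true
    }
}
```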

...

To enable integration with non-Java execution engines, CarbonData supports a C++ writer, implemented as a JNI wrapper, for writing CarbonData files. This writer can be integrated with any execution engine and writes data to CarbonData files without depending on Spark or Hadoop.

...

Enhanced the CLI with more options.
Supported a fallback mechanism: when off-heap memory is not enough, switch to on-heap memory instead of failing the job.
Supported a separate audit log.
Supported reading rows in batches in the CSDK to improve performance.
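The off-heap to on-heap fallback mentioned in the list above can be illustrated with a small sketch. It is not CarbonData's memory manager: the budget parameter, class name and method name are hypothetical, and `ByteBuffer` stands in for whatever memory blocks the engine actually uses.

```java
import java.nio.ByteBuffer;

class MemoryFallbackSketch {
    // Try off-heap first; fall back to on-heap when the (hypothetical)
    // off-heap budget is exhausted or the direct allocation fails.
    static ByteBuffer allocate(int size, long offHeapBudgetRemaining) {
        if (size <= offHeapBudgetRemaining) {
            try {
                return ByteBuffer.allocateDirect(size);
            } catch (OutOfMemoryError e) {
                // Direct memory exhausted: fall through to the on-heap path
                // instead of failing the caller.
            }
        }
        return ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        System.out.println(allocate(16, 1024).isDirect()); // prints true
        System.out.println(allocate(16, 0).isDirect());    // prints false
    }
}
```

The caller receives a usable buffer either way, which is the point of the fallback: a shortage of off-heap memory degrades to on-heap allocation rather than aborting the job.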

Behaviour Change

Enable Local dictionary by default.
Make inverted index false by default.

New Configuration Parameters

...

