...

This version of CarbonData adds the following new features to improve performance, compatibility, and usability.

Carbon Core

...

Improved Data Load performance

Data loading performance has improved dramatically thanks to several enhancements, including improved handling of sort temp files, a sort boundary mechanism, and direct write without data movement. In one production environment, data loading throughput rose from 35 MB/s/node to 102 MB/s/node compared with the previous version, roughly three times the earlier throughput.

Improved Compaction performance

By employing data prefetching and several improvements to the vectorized reader used during compaction, compaction on CarbonData tables is up to 5 times faster than in the previous version.
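The general idea behind data prefetching can be illustrated with a minimal sketch. This is not CarbonData's implementation: the class `PrefetchSketch`, the method `readWithPrefetch`, and the `depth` parameter are hypothetical names chosen for illustration. A background thread loads input blocks into a bounded queue while the consuming (merge) step processes blocks that are already loaded, so reading and processing overlap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: a generic producer/consumer prefetch, not CarbonData code.
public class PrefetchSketch {
    // Sentinel object marking the end of input; compared by identity.
    private static final List<Integer> END = new ArrayList<>();

    // Reads all blocks in order, keeping at most `depth` blocks prefetched.
    static List<Integer> readWithPrefetch(List<List<Integer>> blocks, int depth) {
        BlockingQueue<List<Integer>> queue = new ArrayBlockingQueue<>(depth);

        // Producer: stands in for the reader that loads blocks ahead of use.
        Thread producer = new Thread(() -> {
            try {
                for (List<Integer> block : blocks) {
                    queue.put(block); // waits while `depth` blocks are pending
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        // Consumer: stands in for the merge step that processes loaded blocks.
        List<Integer> merged = new ArrayList<>();
        try {
            while (true) {
                List<Integer> block = queue.take();
                if (block == END) {
                    break;
                }
                merged.addAll(block);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> blocks = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));
        System.out.println(readWithPrefetch(blocks, 2)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

The bounded queue caps memory use: a larger `depth` lets the reader run further ahead of the consumer at the cost of holding more blocks in memory.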

DataMap Management Enhancement

...

Provides a Carbon SDK for writing and reading CarbonData files through a Java API, without Hadoop or Spark dependencies. Users can use this SDK in a standalone Java application to convert existing data into CarbonData files. The SDK can write to local disk or cloud storage from the following formats:

  1. CSV data, schema specified by user.
  2. JSON data, schema defined by Avro.

...