...

In this version of CarbonData, more than 240 JIRA tickets covering new features, improvements, and bug fixes have been resolved. The following is a summary.

Carbon Core

Improved Data Load

...

Performance

Data loading performance has improved dramatically thanks to several enhancements, including improved handling of sort temp files, a sort boundary mechanism, and direct write without data movement. In one production environment, we observed data loading throughput rise from 35 MB/s/node to 102 MB/s/node, roughly three times the throughput of the previous version.
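As a quick check of the figures quoted above, the following sketch computes the speedup implied by the two throughput numbers. The function name is hypothetical and used only for illustration; the 35 and 102 MB/s/node values are the ones reported in this section.

```python
def speedup(old_mb_per_s, new_mb_per_s):
    """Return (ratio, percent_increase) between two throughput figures."""
    ratio = new_mb_per_s / old_mb_per_s
    percent_increase = (new_mb_per_s - old_mb_per_s) / old_mb_per_s * 100
    return ratio, percent_increase

# Figures reported for one production environment (MB/s/node).
ratio, increase = speedup(35, 102)
print(f"{ratio:.2f}x the previous throughput, a {increase:.0f}% increase")
```

The ratio comes out just under 3, which is what the "300%" figure refers to.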

Improved Compaction

...

Performance

By employing data prefetching and various improvements to the vectorized reader used during compaction, compaction on CarbonData tables is up to 500% faster than in the previous version. In one production environment, an application loads data every 5 minutes (hundreds of GB per load) while maintaining second-level query performance. It does so by compacting automatically every 30 and 60 minutes (configured with "carbon.compaction.level.threshold" set to "6,2") to reduce the number of segments.
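The relationship between the load interval and the compaction intervals can be sketched as follows. This is a simplified model, assuming loads arrive at a steady interval and that each compaction level triggers once the configured number of segments from the level below has accumulated; the function name is hypothetical and is not part of CarbonData.

```python
def compaction_intervals(load_interval_minutes, level_threshold):
    """Estimate the minutes between compactions at each level.

    level_threshold is the value of carbon.compaction.level.threshold,
    for example "6,2". This is an illustrative model only, not
    CarbonData's implementation.
    """
    intervals = []
    period = load_interval_minutes
    for count in (int(part) for part in level_threshold.split(",")):
        # Each level waits for `count` segments from the level below.
        period *= count
        intervals.append(period)
    return intervals

# A load every 5 minutes with the threshold "6,2":
print(compaction_intervals(5, "6,2"))  # [30, 60]
```

With a 5-minute load interval, six loads accumulate in 30 minutes and two of the resulting compacted segments accumulate in 60 minutes, which matches the intervals described above.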

...

You can specify cloud storage, such as AWS S3 or HuaweiCloud OBS, as the location of an external table.

SDK Support for Standalone Applications

The Carbon SDK writes and reads CarbonData files through a Java API without depending on Hadoop or Spark. You can use this SDK in a standalone Java application to convert existing data into CarbonData files. It can write to local disk or cloud storage from the following formats.

...