...

Please find the detailed JIRA list: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12341005&styleName=Html&projectId=12320220&version=12342754

...

Sub-task

  • [CARBONDATA-1522] - 6. Loading aggregation tables for streaming data tables.
  • [CARBONDATA-1575] - Support large scale data on DataMap
  • [CARBONDATA-1601] - Add carbon store module
  • [CARBONDATA-1998] - Support FileReader Java API for file level carbondata
  • [CARBONDATA-2165] - Remove spark dependency in carbon-hadoop module
  • [CARBONDATA-2189] - Support add and drop interface
  • [CARBONDATA-2206] - Integrate lucene as datamap
  • [CARBONDATA-1506] - SDV tests error in CI
  • [CARBONDATA-2247] - Support writing index in CarbonWriter
  • [CARBONDATA-1763] - Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists
  • [CARBONDATA-2098] - Add documentation for pre-aggregate tables
  • [CARBONDATA-2119] - CarbonDataWriterException thrown when loading using global_sort
  • [CARBONDATA-2131] - Alter table adding long datatype is failing but Create table with long type is successful, in Spark 2.1
  • [CARBONDATA-2133] - Exception displays after performing select query on newly added Boolean data type
  • [CARBONDATA-2134] - Prevent implicit column filter list from getting serialized while submitting task to executor
  • [CARBONDATA-2142] - Fixed aggregate data map creation issue in case of hive metastore
  • [CARBONDATA-2294] - Support preaggregate table creation on partition tables
  • [CARBONDATA-2301] - Support query interface in CarbonStore
  • [CARBONDATA-2359] - Support applicable load options and table properties for Non Transactional table
  • [CARBONDATA-2360] - Insert into and Insert Into overwrite support for Non Transactional table
  • [CARBONDATA-2361] - Refactor Read Committed Scope implementation.
  • [CARBONDATA-2369] - Add a document for Non Transactional table with SDK writer guide
  • [CARBONDATA-2388] - Avro Nested Datatype Support
  • [CARBONDATA-2423] - CarbonReader Support To Read Non Transactional Table
  • [CARBONDATA-2430] - Reshuffling of Columns given by user in SDK
  • [CARBONDATA-2433] - Executor OOM because of GC when blocklet pruning is done using Lucene datamap
  • [CARBONDATA-2443] - Multi Level Complex Type Support for AVRO SDK
  • [CARBONDATA-2457] - Add converter to get Carbon SDK Schema from Avro schema directly.
  • [CARBONDATA-2474] - Support Modular Plan
  • [CARBONDATA-2475] - Support Materialized View query rewrite
  • [CARBONDATA-2484] - Refactor the datamap code and clear the datamap from executor on table drop

Bug

  • [CARBONDATA-1114] - Failed to run tests in windows env
  • [CARBONDATA-1990] - Null values shown when the basic word count example is tried on carbon streaming table
  • [CARBONDATA-2002] - Streaming segment status is not getting updated to finished or success
  • [CARBONDATA-2056] - Hadoop Configuration with access key and secret key should be passed while creating InputStream of distributed carbon file.
  • [CARBONDATA-2080] - Hadoop Conf not propagated from driver to executor in S3
  • [CARBONDATA-2085] - It's different between load twice and create datamap with load again after load data and create datamap
  • [CARBONDATA-2130] - Find some Spelling error in CarbonData
  • [CARBONDATA-2143] - Fixed query memory leak issue for task failure during initialization of record reader
  • [CARBONDATA-2147] - Exception displays while loading data with streaming
  • [CARBONDATA-2149] - Displayed complex type data is error when use DataFrame to write complex type data
  • [CARBONDATA-2152] - Min function working incorrectly for string type with dictionary include in presto.
  • [CARBONDATA-2150] - Unwanted updatetable status files are being generated for the delete operation where no records are deleted
  • [CARBONDATA-2151] - Filter query on Timestamp/Date column of streaming table throwing exception
  • [CARBONDATA-2155] - IS NULL not working correctly on string datatype with dictionary_include in presto integration
  • [CARBONDATA-2161] - Compacted Segment of Streaming Table should update "mergeTo" column
  • [CARBONDATA-2182] - add one more param called ExtraParmas in SessionParams for session Level operations
  • [CARBONDATA-2183] - fix compaction when segment is delete during compaction and remove unnecessary parameters in functions
  • [CARBONDATA-2194] - Exception message is improper when use incorrect bad record action type
  • [CARBONDATA-2185] - add InputMetrics for Streaming Reader
  • [CARBONDATA-2198] - Streaming data to a table with bad_records_action as IGNORE throws ClassCastException
  • [CARBONDATA-2199] - Exception occurs when change the datatype of measure having sort_column
  • [CARBONDATA-2200] - Like operation on streaming table throwing Exception
  • [CARBONDATA-2207] - TestCase Fails using Hive Metastore
  • [CARBONDATA-2208] - Pre aggregate datamap creation is failing when count(*) present in query
  • [CARBONDATA-2209] - Rename table with partitions not working issue and batch_sort and no_sort with partition table issue
  • [CARBONDATA-2211] - Alter Table Streaming DDL should blocking DDL like other DDL ( All DDL are blocking DDL)
  • [CARBONDATA-2212] - Event should be fired from Stream before and after updating the status
  • [CARBONDATA-2213] - Wrong version in datamap example module cause compilation failure
  • [CARBONDATA-2216] - Error in compilation and execution in sdvtest
  • [CARBONDATA-2217] - nullpointer issue drop partition where column does not exists and clean files issue after second level of compaction
  • [CARBONDATA-2219] - Add validation for external partition location to use same schema
  • [CARBONDATA-2221] - Drop table should throw exception when metastore operation failed
  • [CARBONDATA-2222] - Update the FAQ doc for some mistakes
  • [CARBONDATA-2229] - Unable to save dataframe as carbontable with specified external database path
  • [CARBONDATA-2232] - Wrong logic in spilling unsafe pages to disk
  • [CARBONDATA-2235] - add system configuration to filter datamaps from show tables command
  • [CARBONDATA-2236] - Add SDV Test Cases for Standard Partition
  • [CARBONDATA-2237] - Scala Parser failures are accumulated into memory from thread local
  • [CARBONDATA-2241] - Wrong Query written in Preaggregation Document
  • [CARBONDATA-2244] - When there are some invisibility INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on main table, it can not create preaggregate table on it.
  • [CARBONDATA-2248] - Removing parsers thread local objects after parsing of carbon query
  • [CARBONDATA-2249] - Not able to query data through presto with local carbondata-store
  • [CARBONDATA-2261] - Support Set segment command for Streaming Table
  • [CARBONDATA-2264] - There is error when we create table using CarbonSource
  • [CARBONDATA-2265] - [DFX]-Load]: Load job fails if 1 folder contains 1000 files
  • [CARBONDATA-2266] - All Examples are throwing NoSuchElement Exception in current master branch
  • [CARBONDATA-2274] - Partition table having more than 4 column giving zero record
  • [CARBONDATA-2275] - Query Failed for 0 byte deletedelta file
  • [CARBONDATA-2277] - Filter on default values are not working
  • [CARBONDATA-2287] - Add event to alter partition table
  • [CARBONDATA-2289] - If carbon merge index is enabled then after IUD operation if some blocks of a segment is deleted, then during query and IUD operation the driver is throwing FileNotFoundException while preparing BlockMetaInfo.
  • [CARBONDATA-2302] - Fix some bugs when separate visible and invisible segments info into two files
  • [CARBONDATA-2303] - If dataload is failed for partition table then cleanup is not working.
  • [CARBONDATA-2307] - OOM when using DataFrame.coalesce
  • [CARBONDATA-2308] - Compaction should be allow when loading is in progress
  • [CARBONDATA-2314] - Data mismatch in Pre-Aggregate table after Streaming load due to threadset issue
  • [CARBONDATA-2319] - carbon_scan_time and carbon_IO_time are incorrect in task statistics
  • [CARBONDATA-2320] - Fix error in lucene coarse grain datamap suite
  • [CARBONDATA-2321] - Selection after a Concurrent Load Failing for Partition columns
  • [CARBONDATA-2327] - invalid schema name _system shows when executed show schemas in presto
  • [CARBONDATA-2329] - Non Serializable extra info in session is overwritten by values from thread
  • [CARBONDATA-2333] - Block insert overwrite on parent table if any of the child tables are not partitioned on the specified partition columns
  • [CARBONDATA-2335] - Autohandoff is failing when preaggregate is created on streaming table
  • [CARBONDATA-2337] - Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming
  • [CARBONDATA-2343] - Improper filter resolver cause more filter scan on data that could be skipped
  • [CARBONDATA-2346] - Dropping partition failing with null error for Partition table with Pre-Aggregate tables
  • [CARBONDATA-2347] - Fix Functional issues in LuceneDatamap in load and query and make stable
  • [CARBONDATA-2350] - Fix bugs in minmax datamap example
  • [CARBONDATA-2364] - Remove useless and time consuming code block
  • [CARBONDATA-2366] - Concurrent Datamap creation is failing when using hive metastore
  • [CARBONDATA-2374] - Fix bugs in minmax datamap example
  • [CARBONDATA-2386] - Query on Pre-Aggregate table is slower
  • [CARBONDATA-2391] - Thread leak in compaction operation if prefetch is enabled and compaction process is killed
  • [CARBONDATA-2394] - Setting segments in thread local space but not getting reflected in the driver
  • [CARBONDATA-2401] - Date and Timestamp options are not working in SDK
  • [CARBONDATA-2406] - Dictionary Server and Dictionary Client MD5 Validation failed with hive.server2.enable.doAs = true
  • [CARBONDATA-2408] - Before register to master, the master maybe not finished the start service.
  • [CARBONDATA-2410] - Error message correction when column value length exceeds 320000 characters
  • [CARBONDATA-2413] - After running CarbonWriter, there is null directory about datamap
  • [CARBONDATA-2417] - SDK writer goes to infinite wait when consumer thread goes dead
  • [CARBONDATA-2419] - sortColumns Order we are getting wrong as we set for external table is fixed
  • [CARBONDATA-2426] - IOException after compaction on Pre-Aggregate table on Partition table
  • [CARBONDATA-2427] - Fix SearchMode Serialization Issue during Load
  • [CARBONDATA-2431] - Incremental data added after table creation is not reflecting while doing select query.
  • [CARBONDATA-2432] - BloomFilter DataMap should be contained in carbon assembly jar
  • [CARBONDATA-2435] - SDK dependency Spark jar
  • [CARBONDATA-2436] - Block pruning problem post the carbon schema restructure.
  • [CARBONDATA-2437] - Complex Type data loading is failing is for null values
  • [CARBONDATA-2438] - Remove spark/hadoop related classes in carbon assembly
  • [CARBONDATA-2439] - Update guava version for bloom datamap
  • [CARBONDATA-2440] - In SDK user can not specified the Unsafe memory , so it should take complete from Heap , and it should not be sorted using unsafe.
  • [CARBONDATA-2441] - Implement distribute interface for bloom datamap
  • [CARBONDATA-2442] - Reading two sdk writer output with different schema should prompt exception
  • [CARBONDATA-2463] - if two insert operations are running concurrently 1 task fails and causes wrong no of records in select
  • [CARBONDATA-2464] - Fixed OOM in case of complex type
  • [CARBONDATA-2465] - Improve the carbondata file reliability in data load when direct hdfs write is enabled
  • [CARBONDATA-2468] - sortcolumns considers all dimension also if few columns specified for sort_columns prop
  • [CARBONDATA-2469] - External Table must show its location instead of default store path in describe formatted
  • [CARBONDATA-2472] - Refactor NonTransactional table code for Index file IO performance
  • [CARBONDATA-2476] - Fix bug in bloom datamap cache
  • [CARBONDATA-2477] - No dictionary Complex type with double/date/decimal data type table creation is failing
  • [CARBONDATA-2479] - Multiple issue in sdk writer and external table flow
  • [CARBONDATA-2480] - Search mode RuntimeException: Error while resolving filter expression
  • [CARBONDATA-2486] - set search mode information is not updated in the documentation
  • [CARBONDATA-2487] - Block filters for lucene with more than one text_match udf
  • [CARBONDATA-2489] - Fix coverity reported warnings
  • [CARBONDATA-2492] - Thread leak issue in case of any data load failure
  • [CARBONDATA-2493] - DataType.equals() fails for complex types
  • [CARBONDATA-2498] - Change CarbonWriterBuilder interface to take schema while creating writer
  • [CARBONDATA-2503] - Data write fails if empty value is provided for sort columns in sdk
  • [CARBONDATA-2520] - datamap writers are not getting closed on task failure
  • [CARBONDATA-2538] - No exception is thrown if writer path has only lock files
  • [CARBONDATA-2545] - Fix some spell error in CarbonData
  • [CARBONDATA-2552] - Fix Data Mismatch for Complex Data type Array of Timestamp with Dictionary Include
  • [CARBONDATA-2555] - SDK Reader should have isTransactionalTable = false by default, to be inline with SDK writer

New Feature

  • [CARBONDATA-1516] - Support pre-aggregate tables and timeseries in carbondata
  • [CARBONDATA-2055] - Support integrating Streaming table with Spark Streaming

...

  • [CARBONDATA-2242] - Support materialized view
  • [CARBONDATA-2253] - Support write JSON/Avro data to carbon files
  • [CARBONDATA-2262] - Create table should support using carbondata and stored as carbondata
  • [CARBONDATA-2267] - Implement Reading Of Carbon Partition From Presto
  • [CARBONDATA-2276] - Support SDK API to read schema in data file and schema file
  • [CARBONDATA-2278] - Save the datamaps to system folder of warehouse
  • [CARBONDATA-2291] - Add datamap status and refresh command to sync data manually to datamaps
  • [CARBONDATA-2296] - Test framework should take the location of local module target folder if not integration module
  • [CARBONDATA-2297] - Support SEARCH_MODE for basic filter query
  • [CARBONDATA-2312] - Support In Memory catalog
  • [CARBONDATA-2323] - Distributed search mode using gRPC
  • [CARBONDATA-2103] - Avoid 2 time lookup in ShowTables
  • [CARBONDATA-2371] - Add Profiler output in EXPLAIN command
  • [CARBONDATA-2137] - Delete query is taking more time while processing the carbondata.
  • [CARBONDATA-2373] - Add bloom filter datamap to support precise query
  • [CARBONDATA-2378] - Support enable/disable search mode in ThriftServer
  • [CARBONDATA-2380] - Support visible/invisible datamap for performance tuning
  • [CARBONDATA-2415] - All DataMap should support REFRESH command
  • [CARBONDATA-2416] - Index DataMap should support immediate load and deferred load when creating the DataMap

Improvement

  • [CARBONDATA-1663] - Decouple spark in carbon modules
  • [CARBONDATA-2018] - Optimization in reading/writing for sort temp row during data loading
  • [CARBONDATA-2032] - Skip writing final data files to local disk to save disk IO in data loading
  • [CARBONDATA-2099] - Refactor on query scan process to improve readability
  • [CARBONDATA-2139] - Optimize CTAS documentation and test case
  • [CARBONDATA-2140] - Presto Integration - Code Refactoring
  • [CARBONDATA-2144] - There are some improper place in pre-aggregate documentation
  • [CARBONDATA-2148] - Use Row parser to replace current default parser:CSVStreamParserImp
  • [CARBONDATA-2159] - Remove carbon-spark dependency for sdk module
  • [CARBONDATA-2168] - Support global sort on partition tables
  • [CARBONDATA-2184] - Improve memory reuse for heap memory in `HeapMemoryAllocator`
  • [CARBONDATA-2187] - Restructure the partition folders as per the standard hive folders
  • [CARBONDATA-2196] - during stream sometime carbontable is null in executor side
  • [CARBONDATA-2204] - Access tablestatus file too many times during query
  • [CARBONDATA-2223] - Adding Listener Support for Partition
  • [CARBONDATA-2226] - Refactor UT's to remove duplicate test scenarios to improve CI time for PreAggregate create and drop feature
  • [CARBONDATA-2201] - firing the LoadTablePreExecutionEvent before streaming causes NPE

Task

  • [CARBONDATA-2227] - Add Partition Values and Location information in describe formatted for Standard partition feature
  • [CARBONDATA-2230] - Add a path into table path to store lock files and delete useless segment lock files before loading
  • [CARBONDATA-2231] - Refactor FT's to remove duplicate test scenarios to improve CI time for Streaming feature
  • [CARBONDATA-2234] - Support UTF-8 with BOM encoding in CSVInputFormat
  • [CARBONDATA-2250] - Reduce massive object generation in global sort
  • [CARBONDATA-2251] - Refactored sdv failures running on different environment
  • [CARBONDATA-2254] - Optimize CarbonData documentation
  • [CARBONDATA-2255] - Should rename the streaming examples to make it easy to understand
  • [CARBONDATA-2256] - Adding sdv Testcases for SET_Parameter_Dynamically_Feature
  • [CARBONDATA-2258] - Separate visible and invisible segments info into two files to reduce the size of tablestatus file.
  • [CARBONDATA-2260] - CarbonThriftServer should support S3 carbon table
  • [CARBONDATA-2271] - Collect SQL execution information to driver side
  • [CARBONDATA-2285] - spark integration code refactor
  • [CARBONDATA-2295] - Add UNSAFE_WORKING_MEMORY_IN_MB as a configuration parameter in presto integration
  • [CARBONDATA-2298] - Delete segment lock files before update metadata
  • [CARBONDATA-2299] - Support showing all segment information(include visible and invisible segments)
  • [CARBONDATA-2304] - Enhance compaction performance by enabling prefetch
  • [CARBONDATA-2310] - Refactored code to improve Distributable interface
  • [CARBONDATA-2315] - DataLoad is showing success and failure message in log,when no data is loaded into table during LOAD
  • [CARBONDATA-2316] - Even though one of the Compaction task failed at executor. All the executor task is showing success in UI and Job fails from driver.
  • [CARBONDATA-2317] - concurrent datamap with same name and schema creation throws exception
  • [CARBONDATA-2324] - Support config ExecutorService in search mode
  • [CARBONDATA-2325] - Page level uncompress and Query performance improvement for Unsafe No Dictionary
  • [CARBONDATA-2338] - Add example to upload data to S3 by using SDK
  • [CARBONDATA-2341] - Add CleanUp for Pre-Aggregate table
  • [CARBONDATA-2353] - Add cache for DataMap schema provider to avoid IO for each read
  • [CARBONDATA-2357] - Add column name and index mapping in lucene datamap writer
  • [CARBONDATA-2358] - Dataframe overwrite does not work properly if the table is already created and has deleted segments
  • [CARBONDATA-2365] - Add QueryExecutor in SearchMode for row-based CarbonRecordReader
  • [CARBONDATA-2375] - Add CG prune before FG prune
  • [CARBONDATA-2376] - Improve Lucene datamap performance by eliminating blockid while writing and reading index.
  • [CARBONDATA-2379] - Support Search mode run in the cluster and fix some error
  • [CARBONDATA-2381] - Improve compaction performance by filling batch result in columnar format and performing IO at blocklet level
  • [CARBONDATA-2384] - SDK support write/read data into/from S3
  • [CARBONDATA-2390] - Refresh Lucene data map for the exists table with data
  • [CARBONDATA-2392] - Add close method for CarbonReader
  • [CARBONDATA-2396] - Add CTAS support for using DataSource Syntax
  • [CARBONDATA-2404] - Add documentation for using carbondata and stored as carbondata
  • [CARBONDATA-2407] - Removed All Unused Executor BTree code
  • [CARBONDATA-2414] - Optimize documents for sort_column_bounds
  • [CARBONDATA-2422] - Search mode Master port should be dynamic
  • [CARBONDATA-2448] - Adding compacted segments to load and alter events
  • [CARBONDATA-2454] - Add false positive probability property for bloom filter datamap
  • [CARBONDATA-2455] - Fix _System Folder creation and lucene AND,OR,NOT Filter fix
  • [CARBONDATA-2458] - Remove unnecessary TableProvider interface
  • [CARBONDATA-2459] - Support cache for bloom datamap
  • [CARBONDATA-2467] - Null is printed in the SDK writer logs for operations logged
  • [CARBONDATA-2470] - Refactor AlterTableCompactionPostStatusUpdateEvent usage in compaction flow
  • [CARBONDATA-2473] - Support Materialized View as enhanced Preaggregate DataMap
  • [CARBONDATA-2494] - Improve Lucene datamap size and performance.
  • [CARBONDATA-2495] - Add document for bloomfilter datamap
  • [CARBONDATA-2496] - Change the bloom implementation to hadoop for better performance and compression
  • [CARBONDATA-2524] - Support create carbonReader with default projection

Test

Task