...
For instance, we might want to create an empty table backed by Druid using a CREATE TABLE statement and then append and overwrite data using INSERT and INSERT OVERWRITE Hive statements, respectively.
```sql
CREATE EXTERNAL TABLE druid_table_1
(`__time` TIMESTAMP, `dimension1` STRING, `dimension2` STRING, `metric1` INT, `metric2` FLOAT)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';
```
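Once the table is created, data can be appended or replaced with the Hive DML statements mentioned above. A minimal sketch, assuming the `druid_table_1` table from the previous statement and a hypothetical source Hive table named `src_events` with compatible columns:

```sql
-- Append new rows to the Druid-backed table
INSERT INTO TABLE druid_table_1
SELECT `timecolumn` AS `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`
FROM src_events;

-- Replace the existing contents of the datasource
INSERT OVERWRITE TABLE druid_table_1
SELECT `timecolumn` AS `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`
FROM src_events;
```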
Another possible scenario is that our data is stored in Hive tables and we want to preprocess it and create Druid datasources from Hive to accelerate our SQL query workload. We can do that by executing a Create Table As Select (CTAS) statement. For example:
```sql
CREATE EXTERNAL TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
<select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`....>;
```
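As a concrete sketch of this CTAS pattern (the source table `src_events` and its column names are hypothetical):

```sql
CREATE EXTERNAL TABLE druid_table_2
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
SELECT `timecolumn` AS `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`
FROM src_events;
```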
...
In both statements, the column types (either specified statically for CREATE TABLE statements or inferred from the query result for CTAS statements) are used to infer the corresponding Druid column category.
Further, note that we do not specify a value for the druid.datasource property: Hive automatically uses the fully qualified name of the table to create the corresponding datasource with the same name.
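Conversely, to map a Hive table onto a Druid datasource whose name differs from the table name (for instance, one that already exists in Druid), the druid.datasource property can be supplied explicitly. A sketch, where the datasource name `my_datasource` is a placeholder:

```sql
CREATE EXTERNAL TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "my_datasource");
```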
Info: Before Hive 3.0.0, the CREATE TABLE syntax is used when data is managed via Hive:

```sql
CREATE TABLE druid_table_1
(`__time` TIMESTAMP, `dimension1` STRING, `dimension2` STRING, `metric1` INT, `metric2` FLOAT)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';
```

NOTE: Before Hive 3.0.0, we do not use EXTERNAL tables and do not specify a value for the druid.datasource property.
Druid Kafka ingestion from Hive
...