Comment: Fix HIVE-19291

...

For instance, we might want to create an empty table backed by Druid using a CREATE TABLE statement and then append and overwrite data using INSERT and INSERT OVERWRITE Hive statements, respectively.

CREATE EXTERNAL TABLE druid_table_1
(`__time` TIMESTAMP, `dimension1` STRING, `dimension2` STRING, `metric1` INT, `metric2` FLOAT)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';

Another possible scenario is that our data is stored in Hive tables and we want to preprocess it and create Druid datasources from Hive to accelerate our SQL query workload. We can do that by executing a Create Table As Select (CTAS) statement. For example:

CREATE EXTERNAL TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
<select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`....>;

...

In both statements, the column types (either specified statically for CREATE TABLE statements or inferred from the query result for CTAS statements) are used to infer the corresponding Druid column category.
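As an illustration, the earlier CREATE TABLE statement can be annotated with the category each column would be expected to fall into. This is a sketch, not a statement of the storage handler's exact rules: the mapping in the comments (the `__time` TIMESTAMP column as the Druid timestamp, STRING columns as dimensions, numeric columns as metrics) is an assumption based on the column names used in the example.

```sql
-- Sketch only: the category noted next to each column is an assumed mapping.
CREATE EXTERNAL TABLE druid_table_1 (
  `__time`     TIMESTAMP,  -- assumed: Druid timestamp column
  `dimension1` STRING,     -- assumed: STRING column becomes a dimension
  `dimension2` STRING,     -- assumed: STRING column becomes a dimension
  `metric1`    INT,        -- assumed: numeric column becomes a metric
  `metric2`    FLOAT       -- assumed: numeric column becomes a metric
)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';
```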

Further, note that if we do not use EXTERNAL tables, we do not specify the value for the druid.datasource property. In that case, Hive automatically uses the fully qualified name of the table as the name of the corresponding datasource.
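For contrast, a table that points at a named datasource sets the property explicitly through TBLPROPERTIES. The following is a minimal sketch; the table name `druid_table_wikipedia` and the datasource name `wikipedia` are placeholders, not names defined on this page.

```sql
-- Sketch: explicitly name the Druid datasource behind an EXTERNAL table.
-- Both druid_table_wikipedia and "wikipedia" are hypothetical names.
CREATE EXTERNAL TABLE druid_table_wikipedia
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "wikipedia");
```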

Version Info

Version 2.2.0: CREATE TABLE syntax when data is managed via Hive.


CREATE TABLE druid_table_1
(`__time` TIMESTAMP, `dimension1` STRING, `dimension2` STRING, `metric1` INT, `metric2` FLOAT)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';


NOTE - Before Hive 3.0.0, we do not use EXTERNAL tables and do not specify the value for the druid.datasource property.

For versions 3.0.0+, all Druid tables are EXTERNAL (HIVE-20085).


Druid Kafka ingestion from Hive

...

INSERT, INSERT OVERWRITE and DROP statements

Version Info

Version 2.2.0: These statements are supported by Hive-managed (non-external) tables backed by Druid.

For versions 3.0.0+, all Druid tables are EXTERNAL (HIVE-20085) and these statements are supported for any table.
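A minimal sketch of the three statements against the table created earlier might look as follows. The source table `hive_source_table` and its `timecolumn` column are hypothetical. What DROP does to the underlying Druid datasource is not covered here and may depend on the Hive version and table properties.

```sql
-- Append rows to the Druid-backed table.
-- hive_source_table is a hypothetical Hive table holding the raw data.
INSERT INTO TABLE druid_table_1
SELECT `timecolumn` AS `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`
FROM hive_source_table;

-- Replace the existing contents of the Druid-backed table.
INSERT OVERWRITE TABLE druid_table_1
SELECT `timecolumn` AS `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`
FROM hive_source_table;

-- Remove the table from Hive. Check how your Hive version treats the
-- underlying Druid datasource before running this on real data.
DROP TABLE druid_table_1;
```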

Querying Druid from Hive

Once we have created our first table stored in Druid using the DruidStorageHandler, we are ready to execute our queries against Druid.
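As a first query, an aggregation over the example table could be sketched like this. The alias `total_metric1` is made up for illustration; prefixing the query with EXPLAIN is one way to inspect the plan Hive generates for it.

```sql
-- Sketch: sum a metric per dimension value on the Druid-backed table.
SELECT `dimension1`, SUM(`metric1`) AS total_metric1
FROM druid_table_1
GROUP BY `dimension1`;

-- Inspect the plan Hive produces for the same query.
EXPLAIN
SELECT `dimension1`, SUM(`metric1`) AS total_metric1
FROM druid_table_1
GROUP BY `dimension1`;
```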

...