Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: trivial edits in "Create Druid datasources from Hive"

...

Another possible scenario is that our data is stored in Hive tables and we want to preprocess it and create Druid datasources from Hive to accelerate our SQL query workload. We can do that by executing a Create Table As Select (CTAS) statement. In the following we provide multiple examples for each of these statements.For example:

Code Block
sql
sql
CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
<select `timecolumn` as `___time`, `dimension1`, `dimension2`, `metric1`, `metric2`....>;

Observe that we still create three different groups of columns corresponding to the Druid categories: the timestamp column (__time) mandatory in Druid, the dimension columns (whose type is STRING), and the metrics columns (all the rest).

In both statements, the column types (either specified statically for CREATE TABLE statements or infer inferred from the query result for CTAS statements) are used to infer the corresponding Druid column category.

...