...

  • Name → Local name used to reference the connector. This name is shown in "show connectors" and is used in connector DDL such as drop/alter, as well as in "create remote database .." statements.
  • TYPE → "HIVEJDBC", so that Hive Metastore knows which Connector class to use.
  • URL → JDBC URL for the remote HiveServer instance.
  • DCPROPERTIES → A freeform list of additional information, such as credentials and other optional properties. These properties are passed on to the table definitions for the databases created using this connector.

Note: Data Connectors in Hive are currently used only to read data from remote sources. Write operations are not supported.
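
The connector DDL referred to in the list above might look like the following sketch. It assumes a connector named my_connector already exists (a hypothetical name) and that your Hive release supports these statement forms, so check them against the DDL reference for your version. The host in the URL is a placeholder.

        -- List all connectors known to the metastore.
        SHOW CONNECTORS;

        -- Show the definition of one connector (my_connector is a hypothetical name).
        DESCRIBE CONNECTOR my_connector;

        -- Point the connector at a different HiveServer instance (placeholder host).
        ALTER CONNECTOR my_connector SET URL 'jdbc:hive2://<host>:10000';

        -- Update one of the freeform connector properties.
        ALTER CONNECTOR my_connector SET DCPROPERTIES ("hive.sql.dbcp.username"="hive");

        -- Remove the connector when it is no longer needed.
        DROP CONNECTOR IF EXISTS my_connector;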

How do I use it?

  1. Create a connector first.
        CREATE CONNECTOR hiveserver_connector TYPE 'hivejdbc' URL 'jdbc:hive2://<maskedhost>:10000'
        WITH DCPROPERTIES ("hive.sql.dbcp.username"="hive", "hive.sql.dbcp.password"="hive");

  2. Create a database of type REMOTE in Hive using the connector from Step 1. This maps a remote database named "default" to a Hive database named "hiveserver_remote".

...

33 rows selected (6.099 seconds)


  4. To offload the remote table to the local cluster, run a CTAS statement (the example below pulls all of the data into the local table, but you can pull in selected columns and rows by applying predicates).

...

INFO  : Completed executing command(queryId=ngangam_20240129182647_7544c9d1-c68b-4a34-b6b0-910945a1dba5); Time taken: 2.344 seconds

INFO  : OK

+------+
| _c0  |
+------+
|   |
+------+

1 row selected (8.795 seconds)
0: jdbc:hive2://localhost:10000>


  5. To fetch data from the remote tables, run SELECT queries with column specs and predicates, as you would with any SQL table.

...
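
As a minimal sketch of the selective variants of steps 4 and 5, the statements below copy only some columns and rows into a local table and then query the remote table directly. The table name orders, its columns, and the local table name orders_local are hypothetical placeholders and do not come from this page. Only the database name hiveserver_remote is taken from step 2.

        -- Hypothetical names: "orders", "order_id", "amount" and "orders_local" are placeholders.
        -- Step 4 variant: offload only selected columns and rows into a local table.
        CREATE TABLE default.orders_local AS
        SELECT order_id, amount
        FROM hiveserver_remote.orders
        WHERE amount > 100;

        -- Step 5: read from the remote table with a column list and a predicate.
        SELECT order_id, amount
        FROM hiveserver_remote.orders
        WHERE order_id = 42;

Because connectors are read-only, the CTAS target has to be a table in a local database, not in the REMOTE database.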