Hive Concurrency Model

Use Cases

Concurrency support (http://issues.apache.org/jira/browse/HIVE-1293) is a must in databases, and its use cases are well understood. At a minimum, we want to support concurrent readers and writers whenever possible. It would be useful to add a mechanism to discover the current locks which have been acquired. There is no immediate requirement to add an API to explicitly acquire any locks, so all locks would be acquired implicitly.

...

The compatibility matrix is as follows:

Lock compatibility (rows: requested lock; columns: existing lock)

Requested Lock   Existing S   Existing X
S                True         False
X                False        False
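
The matrix can be read as a simple lookup. The following is a minimal sketch, not Hive's actual lock manager code; the names COMPATIBLE and can_grant are hypothetical and exist only for illustration.

```python
# Lock modes from the compatibility matrix: S (shared) and X (exclusive).
# COMPATIBLE[(requested, existing)] is True when the requested lock can be
# granted while the existing lock is still held.
# Hypothetical names; this is an illustration, not Hive's implementation.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)
```

With this sketch, any number of readers can share an object, while a writer has to wait until no other lock is held on it.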

For some operations, locks are hierarchical in nature. For example, for some partition operations the table is also locked (to make sure, for instance, that the table cannot be dropped while a new partition is being created).

...

Hive Command                                             Locks Acquired
select .. T1 partition P1                                S on T1, T1.P1
insert into T2(partition P2) select .. T1 partition P1   S on T2, T1, T1.P1 and X on T2.P2
insert into T2(partition P.Q) select .. T1 partition P1  S on T2, T2.P, T1, T1.P1 and X on T2.P.Q
alter table T1 rename T2                                 X on T1
alter table T1 add cols                                  X on T1
alter table T1 replace cols                              X on T1
alter table T1 change cols                               X on T1
alter table T1 add partition P1                          S on T1, X on T1.P1
alter table T1 drop partition P1                         S on T1, X on T1.P1
alter table T1 touch partition P1                        S on T1, X on T1.P1
alter table T1 set serdeproperties                       S on T1
alter table T1 set serializer                            S on T1
alter table T1 set file format                           S on T1
alter table T1 set tblproperties                         X on T1
drop table T1                                            X on T1
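
The select, insert, and partition rows follow one hierarchical pattern: a shared lock on every containing object, plus a shared or exclusive lock on the object itself. The sketch below illustrates that pattern under the assumption that objects are named with dotted paths such as T2.P.Q, as in the table. The function names are hypothetical, and the sketch does not cover rows such as "set serdeproperties", which take only S on the table.

```python
def ancestors(path):
    """Return every proper prefix of a dotted object path, outermost first.

    For example, "T2.P.Q" yields ["T2", "T2.P"].
    """
    parts = path.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def locks_for_read(path):
    """S on the object being read and on everything that contains it."""
    return {obj: "S" for obj in ancestors(path) + [path]}


def locks_for_write(path):
    """X on the object being written, S on everything that contains it."""
    locks = {obj: "S" for obj in ancestors(path)}
    locks[path] = "X"
    return locks


def locks_for_insert_select(target, source):
    """Combine write locks on the target with read locks on the source."""
    locks = locks_for_read(source)
    for obj, mode in locks_for_write(target).items():
        # If an object is both read and written, the stronger mode (X) wins.
        if mode == "X" or obj not in locks:
            locks[obj] = mode
    return locks
```

For instance, locks_for_insert_select("T2.P.Q", "T1.P1") produces S on T2, T2.P, T1, T1.P1 and X on T2.P.Q, matching the third row of the table.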

In order to avoid deadlocks, a very simple scheme is proposed here. All the objects to be locked are sorted lexicographically, and the locks are acquired in that order, each in its required mode. Note that in some cases the list of objects may not be known. For example, with dynamic partitions, the list of partitions being modified is not known at compile time, so the list is generated conservatively: since the number of partitions may not be known, an exclusive lock is taken on the table, or on the prefix that is known.
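
The ordering rule can be sketched as follows. This is an illustration only: acquisition_order and conservative_target are hypothetical helpers, and object names are assumed to be dotted paths such as T2.P2.

```python
def acquisition_order(locks):
    """Return (object, mode) pairs sorted lexicographically by object name.

    Every query acquires its locks in this same global order, so two
    queries cannot each hold a lock that the other is waiting for.
    """
    return sorted(locks.items())


def conservative_target(table, known_prefix=None):
    """Pick the object to lock exclusively when partitions are dynamic.

    The final partitions are not known at compile time, so X is taken on
    the table itself or, if part of the partition spec is static, on that
    known prefix. (Assumption: the prefix is written as a dotted suffix.)
    """
    if known_prefix is None:
        return table
    return f"{table}.{known_prefix}"
```

A query needing S on T1, T1.P1, T2 and X on T2.P2 would acquire them in the order T1, T1.P1, T2, T2.P2, regardless of the order in which the planner discovered them.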

...

The proposed scheme favors readers over writers. Because shared locks are compatible with each other but not with exclusive locks, long-running readers may starve writers.

The default Hive behavior will not be changed, and concurrency will not be supported.

Turn Off Concurrency

You can turn off concurrency by setting the following variable to false: hive.support.concurrency.

Debugging

You can see the locks on a table by issuing one of the following commands:

  • SHOW LOCKS <TABLE_NAME>;
  • SHOW LOCKS <TABLE_NAME> EXTENDED;
  • SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>);
  • SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>) EXTENDED;

Configuration

Configuration properties for Hive locking are described in Locking.

Locking in Hive Transactions

Hive 0.13.0 adds transactions with row-level ACID semantics, using a new lock manager. For more information, see: