Comment: Added note about port restrictions in JDK 1.7u51+

Hive Using Derby in Server Mode

Hive in embedded mode is limited to one active user at a time. Running Derby as a Network Server removes that limit, so multiple users can access the metastore simultaneously from different systems.

...

My structure looks like this:

Code Block

/opt/hadoop/hadoop-0.17.2.1
/opt/hadoop/db-derby-10.4.1.3-bin
/opt/hadoop/hive

Code Block

cd /opt/hadoop
<download>
tar -xzf db-derby-10.4.1.3-bin.tar.gz
mkdir db-derby-10.4.1.3-bin/data

...

/etc/profile.d/derby.sh

Code Block

DERBY_INSTALL=/opt/hadoop/db-derby-10.4.1.3-bin
DERBY_HOME=/opt/hadoop/db-derby-10.4.1.3-bin
export DERBY_INSTALL
export DERBY_HOME

...

/etc/profile.d/hive.sh

Code Block

HADOOP=/opt/hadoop/hadoop-0.17.2.1/bin/hadoop
export HADOOP

...

You will probably want Derby to start whenever Hadoop starts. Besides an LSB init script, one place to do this is alongside the Hadoop startup scripts such as start-dfs. By default, Derby creates databases in the directory it was started from, so change into the data directory first.

Code Block

cd /opt/hadoop/db-derby-10.4.1.3-bin/data
 
# If you're using JDK 1.7u51 or later, you'll need to either specify an ephemeral port
# (typically between 49152 and 65535) or add a grant to your JDK version's java.policy file.
# See http://stackoverflow.com/questions/21154400/unable-to-start-derby-database-from-netbeans-7-4 for details.
nohup /opt/hadoop/db-derby-10.4.1.3-bin/bin/startNetworkServer -h 0.0.0.0 &
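
The port note in the comments above can be made concrete. The following is a minimal sketch, assuming the `startNetworkServer` script accepts `-p <port>` alongside `-h` and sits under `$DERBY_HOME/bin`. The helper names `is_ephemeral_port` and `derby_start_cmd` and the port 50000 are illustrative and not part of the original setup. The script only prints the command so it can be inspected before it is run.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when $1 is a number in 49152-65535.
is_ephemeral_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 49152 ] && [ "$1" -le 65535 ]
}

# Hypothetical helper: prints the Derby start command for a given port.
derby_start_cmd() {
  port=$1
  # Assumption: fall back to this page's install path when DERBY_HOME is unset.
  derby_home=${DERBY_HOME:-/opt/hadoop/db-derby-10.4.1.3-bin}
  if ! is_ephemeral_port "$port"; then
    echo "port $port is outside 49152-65535" >&2
    return 1
  fi
  echo "nohup $derby_home/bin/startNetworkServer -h 0.0.0.0 -p $port &"
}

# Port 50000 is an arbitrary example value.
derby_start_cmd 50000
```

If the server is started on a port other than 1527, the port in the Hive connection URL has to be changed to match.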

...

/opt/hadoop/hive/conf/hive-site.xml

Code Block

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://hadoop1:1527/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.ClientDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
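
The connection URL above encodes the Derby host, port, and database name, and these have to agree with how the network server was started. As a rough sketch, the hypothetical helper `parse_derby_url` below splits such a URL into its parts so they can be checked in a script. It assumes the simple `jdbc:derby://host:port/db;attributes` form used on this page and falls back to Derby's default port 1527 when no port is given.

```shell
#!/bin/sh
# Hypothetical helper: splits jdbc:derby://host:port/db;attrs into "host port db".
parse_derby_url() {
  rest=${1#jdbc:derby://}                   # strip the scheme prefix
  [ "$rest" != "$1" ] || return 1           # not a Derby network URL
  hostport=${rest%%/*}                      # host or host:port
  path=${rest#*/}                           # db;attrs
  host=${hostport%%:*}
  port=${hostport#*:}
  [ "$port" != "$hostport" ] || port=1527   # no port given: use Derby's default
  db=${path%%;*}                            # drop ;create=true and similar attributes
  echo "$host $port $db"
}

parse_derby_url 'jdbc:derby://hadoop1:1527/metastore_db;create=true'
# prints: hadoop1 1527 metastore_db
```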

...

Version: JPOX properties are NOT used in Hive 5.0 or later.
JPOX properties can be specified in hive-site.xml. Normally, changes to jpox.properties are not required.

Code Block

javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema=false
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
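# Duplicate key: when the file is read as a standard Java properties file,
# this later value overrides the autoCreateSchema=false entry above.
org.jpox.autoCreateSchema=true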
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://hadoop1:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine

...

Because Hive now connects through the Derby network client, you MUST make sure Hive has these jar files in its lib directory or on the classpath. The same would be true if you used MySQL or some other database.

Code Block

cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbyclient.jar /opt/hadoop/hive/lib
cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbytools.jar /opt/hadoop/hive/lib

If you receive the error "javax.jdo.JDOFatalInternalException: Error creating transactional connection factory" and the stack trace originates at "org.datanucleus.exceptions.ClassNotResolvedException: Class 'org.apache.derby.jdbc.ClientDriver' was not found in the CLASSPATH. Please check your specification and your CLASSPATH", try copying the Derby jar files directly into the Hadoop lib directory as well:

Code Block

cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbyclient.jar /opt/hadoop/hadoop-0.17.2.1/lib
cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbytools.jar /opt/hadoop/hadoop-0.17.2.1/lib
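
Both copy steps above move the same two jars, so they can be wrapped in one small function that fails loudly when a jar is missing. This is only a sketch. The function name `install_derby_client_jars` is hypothetical, and the paths in the example call are the ones used on this page.

```shell
#!/bin/sh
# Hypothetical helper: copies the Derby client jars from $1 into the lib directory $2.
install_derby_client_jars() {
  derby_lib=$1
  target_lib=$2
  for jar in derbyclient.jar derbytools.jar; do
    if [ ! -f "$derby_lib/$jar" ]; then
      echo "missing $derby_lib/$jar" >&2
      return 1
    fi
    cp "$derby_lib/$jar" "$target_lib/" || return 1
  done
}

# Example calls with this page's paths (commented out so nothing is copied by accident):
# install_derby_client_jars /opt/hadoop/db-derby-10.4.1.3-bin/lib /opt/hadoop/hive/lib
# install_derby_client_jars /opt/hadoop/db-derby-10.4.1.3-bin/lib /opt/hadoop/hadoop-0.17.2.1/lib
```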

...

The metastore will not be created until the first query hits it.

Code Block

cd /opt/hadoop/hive
bin/hive
hive> show tables;
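
Since Derby creates databases in the directory it was started from, and the connection URL names `metastore_db`, one way to confirm that the first query created the metastore is to look for that directory under the Derby data directory. A minimal sketch, assuming the layout used on this page; the helper name `metastore_created` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the database directory $2 (default metastore_db)
# exists under the Derby data directory $1.
metastore_created() {
  data_dir=$1
  db_name=${2:-metastore_db}
  [ -d "$data_dir/$db_name" ]
}

# Assumption: Derby was started from this page's data directory.
if metastore_created /opt/hadoop/db-derby-10.4.1.3-bin/data; then
  echo "metastore_db exists"
else
  echo "metastore_db not created yet"
fi
```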

...