HBase Java API with Oozie
One way to access HBase is through its Java API. This can be done in several ways, depending on the use case and the tools involved. Here’s how to do it with an Oozie Java action and with a Pig action whose UDF performs lookups in HBase.
HOW TO
I used CDH 5.8.4 with Kerberos.
For some HBase 1.2.0 Java API basics take a look here.
Pig UDF for HBase lookups can be found here.
Oozie
Oozie can execute many kinds of actions, but none of them is dedicated to HBase. To use the HBase Java API, you can connect to the database from a Java action or a Pig action. To make it work you need to:
- add hbase credentials
    <credential name="hbase" type="hbase"></credential>
- attach hbase-site.xml using a <file> element. It should contain at least these parameters:
  - hbase.security.authentication=kerberos, if you use Kerberos
  - hbase.zookeeper.quorum
  - hbase.master.kerberos.principal
  - hbase.regionserver.kerberos.principal
- specify the same parameters as action configuration properties. Otherwise the action won’t be able to start.
- attach all needed HBase jar files. These will depend on the functionality you implement in the action. I needed:
  - for the Pig UDF
    – hbase-client.jar
    – hbase-protocol.jar
    – hbase-server.jar
    – hbase-common.jar
  - for the Java action
    – hbase-common.jar
    – htrace-core.jar
And don’t forget to attach the compiled jar of your own program 🙂
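For reference, the minimal hbase-site.xml described in the list above could look roughly like the sketch below. The property names come from the list. Every value is a placeholder I made up for illustration (the host names, the port-less quorum and the EXAMPLE.COM realm are assumptions), so replace them with the settings of your own cluster.

```xml
<?xml version="1.0"?>
<!-- Minimal hbase-site.xml sketch. All values are hypothetical placeholders. -->
<configuration>
  <!-- Needed only on a Kerberos-secured cluster -->
  <property>
    <name>hbase.security.authentication</name>
    <value>kerberos</value>
  </property>
  <!-- Comma-separated ZooKeeper hosts (example hosts, not real ones) -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <!-- Service principals of the HBase Master and RegionServers (example realm) -->
  <property>
    <name>hbase.master.kerberos.principal</name>
    <value>hbase/_HOST@EXAMPLE.COM</value>
  </property>
  <property>
    <name>hbase.regionserver.kerberos.principal</name>
    <value>hbase/_HOST@EXAMPLE.COM</value>
  </property>
</configuration>
```

The same four names and values are the ones repeated in the action's configuration block in the workflows that follow.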
Here’s what the Pig action looks like:

    <workflow-app name="pig_hbase_udf" xmlns="uri:oozie:workflow:0.4">
        <global>
            <job-xml>job-conf.xml</job-xml>
        </global>
        <!-- HBase credentials, referenced by the action through cred="hbase" -->
        <credentials>
            <credential name="hbase" type="hbase"></credential>
        </credentials>
        <start to="pig_action"/>
        <action name="pig_action" cred="hbase">
            <pig>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <!-- Same parameters as in hbase-site.xml -->
                <configuration>
                    <property>
                        <name>hbase.security.authentication</name>
                        <value>kerberos</value>
                    </property>
                    <property>
                        <name>hbase.zookeeper.quorum</name>
                        <value>${zookeeper_quorum}</value>
                    </property>
                    <property>
                        <name>hbase.master.kerberos.principal</name>
                        <value>${hbase_master_principal}</value>
                    </property>
                    <property>
                        <name>hbase.regionserver.kerberos.principal</name>
                        <value>${hbase_region_server_principal}</value>
                    </property>
                </configuration>
                <script>pig_script.pig</script>
                <!-- HBase client configuration, HBase jars and the UDF jar -->
                <file>hbase-site.xml</file>
                <file>hbase-client.jar</file>
                <file>hbase-protocol.jar</file>
                <file>hbase-server.jar</file>
                <file>hbase-common.jar</file>
                <file>pig-udf.jar</file>
            </pig>
            <ok to="end"/>
            <error to="kill"/>
        </action>
        <kill name="kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>

And the Java action:

    <workflow-app name="hbase_java" xmlns="uri:oozie:workflow:0.4">
        <global>
            <job-xml>job-conf.xml</job-xml>
        </global>
        <!-- HBase credentials, referenced by the action through cred="hbase" -->
        <credentials>
            <credential name="hbase" type="hbase"></credential>
        </credentials>
        <start to="hbase_action"/>
        <action name="hbase_action" cred="hbase">
            <java>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <!-- Same parameters as in hbase-site.xml -->
                <configuration>
                    <property>
                        <name>hbase.security.authentication</name>
                        <value>kerberos</value>
                    </property>
                    <property>
                        <name>hbase.zookeeper.quorum</name>
                        <value>${zookeeper_quorum}</value>
                    </property>
                    <property>
                        <name>hbase.master.kerberos.principal</name>
                        <value>${hbase_master_principal}</value>
                    </property>
                    <property>
                        <name>hbase.regionserver.kerberos.principal</name>
                        <value>${hbase_region_server_principal}</value>
                    </property>
                </configuration>
                <main-class>HBaseManager</main-class>
                <arg>tableName</arg>
                <!-- Program jar, HBase client configuration and HBase jars -->
                <file>jar-file.jar</file>
                <file>hbase-site.xml</file>
                <file>hbase-common.jar</file>
                <file>htrace-core.jar</file>
            </java>
            <ok to="end"/>
            <error to="kill"/>
        </action>
        <kill name="kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>