HBase in Oozie shell action
You can manipulate HBase tables through Oozie with Java or Shell actions. To use a Shell action, you need to prepare a shell script.
HOW TO
I used CDH 5.8.4 with Kerberos.
If you use Kerberos, the first thing you need to do is run the kinit command. You can either run it manually in the console before processing or include it at the very beginning of your shell script.
    kinit -kt ${path_to_keytab}/${user_name}.keytab -V ${user_name}
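If the script may run where a valid ticket already exists, you could skip the kinit call. The sketch below assumes MIT Kerberos, where `klist -s` prints nothing and exits non-zero when the credentials cache is missing or expired. The function name `ensure_ticket` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: obtain a ticket only when the cache holds no valid one.
ensure_ticket() {
  local keytab="$1" principal="$2"
  # klist -s is silent; exit status 0 means a valid ticket is already cached
  if klist -s; then
    return 0
  fi
  # fail early with a readable message if the keytab was not shipped
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 1
  fi
  kinit -kt "$keytab" "$principal"
}

# Example call, using the same variables as the command above:
# ensure_ticket "${path_to_keytab}/${user_name}.keytab" "${user_name}"
```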
Within the shell script you can run any command available in the HBase shell (I found this blog very useful). Just follow the pattern below:
    echo "hbase shell command" | hbase shell
Example
    echo "create '${namespace}:${table_name}', 'column_family1', 'column_family2'" | hbase shell
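Every `hbase shell` call starts a new JVM, so it can help to send several commands through one session. Here is a minimal sketch of a wrapper around the echo-and-pipe pattern. The name `run_hbase` is hypothetical, and the `-n` (non-interactive) flag is an assumption about your HBase version, so check `hbase shell --help` before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: send one or more commands to a single HBase shell session.
run_hbase() {
  # Each argument becomes one line of input for the shell.
  # -n runs non-interactively and exits non-zero on the first failed command
  # (assumption: supported by the installed HBase version).
  printf '%s\n' "$@" | hbase shell -n
}

# Example call:
# run_hbase "create '${namespace}:${table_name}', 'column_family1'" \
#           "describe '${namespace}:${table_name}'"
```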
Inside the script you may of course add any non-HBase logic like loops or conditions. Just remember to finish your script with the exit command.
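As an illustration of mixing a loop with HBase commands, the sketch below generates one create statement per table name. The function `create_statements` and the table names in the commented call are invented for this example. The exit status matters because Oozie takes the error transition when the script exits with a non-zero code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one create statement per table name.
# Usage: create_statements <namespace> <column_family> <table>...
create_statements() {
  local namespace="$1" family="$2" table
  shift 2
  for table in "$@"; do
    printf "create '%s:%s', '%s'\n" "$namespace" "$table" "$family"
  done
}

# Example wiring (commented out because it needs a cluster):
# create_statements "${namespace}" column_family1 events users | hbase shell -n
# exit $?   # a pipeline's status is that of its last command, here hbase shell
```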
Example script
    #!/usr/bin/env bash

    # obtain a Kerberos ticket
    kinit -kt ${path_to_keytab}/${user_name}.keytab -V ${user_name}

    # create the table if it does not exist
    echo "exists '${namespace}:${table_name}'" | hbase shell > log
    # the HBase shell prints "Table <name> does exist" when the table is present
    if ! grep -q "does exist" log; then
        echo "create '${namespace}:${table_name}', 'column_family1', 'column_family2'" | hbase shell
    fi

    # and drop the table (a table must be disabled before it can be dropped)
    echo "disable '${namespace}:${table_name}'" | hbase shell
    echo "drop '${namespace}:${table_name}'" | hbase shell

    exit
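The existence check can also be wrapped in a function, which avoids the temporary log file. This is a sketch only: `table_exists` is a made-up name, and it assumes the `exists` command prints "Table <name> does exist" for an existing table, so verify the wording against your HBase version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the HBase shell reports that the table exists.
table_exists() {
  local table="$1"
  # Assumption: `exists` prints "Table <name> does exist" for an existing table
  # and "Table <name> does not exist" otherwise.
  echo "exists '${table}'" | hbase shell 2>/dev/null | grep -q "does exist"
}

# Example call:
# if ! table_exists "${namespace}:${table_name}"; then
#   echo "create '${namespace}:${table_name}', 'column_family1'" | hbase shell
# fi
```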
Now that you have a script, you need an Oozie action:
    <workflow-app name="hbase_shell" xmlns="uri:oozie:workflow:0.4">
        <global>
            <job-xml>job-conf.xml</job-xml>
        </global>
        <start to="hbase_shell"/>
        <action name="hbase_shell" cred="">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>hbase_script.sh</exec>
                <file>hbase_script.sh</file>
                <file>${user_name}.keytab</file>
            </shell>
            <ok to="end"/>
            <error to="kill"/>
        </action>
        <kill name="kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
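The workflow refers to `${jobTracker}`, `${nameNode}` and `${user_name}`, which are usually supplied through a job.properties file. Below is a sketch of how that file could be generated and the job submitted. Every host name, port, user name and path is a placeholder, not a value from my setup.

```shell
#!/usr/bin/env bash
# Write a minimal job.properties for the workflow.
# All hosts, ports, names and paths are placeholders; replace them with your own.
# The quoted 'EOF' keeps ${...} literal so that Oozie resolves it, not the shell.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
user_name=etl_user
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user_name}/apps/hbase_shell
EOF

# Submit with the Oozie CLI (placeholder server URL, commented out on purpose):
# oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```

The script and the keytab listed in the `<file>` elements are expected next to workflow.xml in the HDFS application directory.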