Tag: Oozie

HBase Java API with Oozie

One way to access HBase is through its Java API. There are several ways to do this, depending on the use case and the tools involved. Here’s how to do it with an Oozie Java action, and with a Pig action whose UDF performs lookups in HBase.

Sqoop with HCatalog and Oozie

Sqoop can use HCatalog to import data directly into Hive tables and export data from them. It relies on HCatalog to read a table’s structure, data formats, and partitions, and then imports or exports the data accordingly. It’s a very useful combination for moving data efficiently, but it requires matching column names on both sides. Here’s how to make Sqoop with HCatalog work through Oozie.

Pig with HCatalog + Oozie

HCatalog lets Pig read from and write to Hive tables directly, using the Hive metastore. Pig determines the table’s structure dynamically, which makes data manipulation easier. Here’s how to make Pig work with HCatalog and how to run such jobs through Oozie.

HBase + Pig + Oozie

Although HBase is mostly used for lookups, bulk reads and writes are sometimes needed. Doing them through Pig is very convenient. Here’s how to set up communication between Pig and HBase.

HBase + Sqoop + Oozie

Sqoop can be used to import data from a relational database into HBase. Although exporting data from HBase is not natively supported, you can still manage it by putting Hive and HCatalog between HBase and Sqoop. Here’s how to do both importing and exporting with Oozie in a Kerberised environment.
