Tag: Hive

Spark and Hive

Spark lets us use SQL for data processing. Over JDBC we can connect to pretty much any relational database (and external tools can reach Spark itself over JDBC or ODBC), and we can work with structured data formats such as Avro, Parquet, and ORC. We can also connect to Hive and use all the structures we have there. In Spark 2.0 the separate entry points for SQL (SQLContext) and Hive (HiveContext) were replaced by a single object, SparkSession. A SparkSession with Hive support enabled lets you read from and write to Hive tables, write queries in HiveQL, and call Hive UDFs.
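
To make this concrete, here is a minimal PySpark sketch of that single entry point. It assumes PySpark is installed with Hive support and that a Hive metastore is reachable (without one, Spark typically falls back to a local embedded metastore). The table name `demo_events` and the helpers `hive_session`, `preview_query`, and `run_example` are hypothetical names made up for illustration, not part of Spark.

```python
def hive_session(app_name="spark-hive-sketch"):
    """Return a SparkSession with Hive support enabled."""
    # Imported lazily so that preview_query below stays usable
    # on a machine without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # metastore access, HiveQL, Hive UDFs
        .getOrCreate()
    )


def preview_query(table, limit=10):
    """Build a statement that previews a table (hypothetical helper)."""
    return f"SELECT * FROM {table} LIMIT {int(limit)}"


def run_example():
    """Write a small DataFrame to a metastore table and read it back with SQL."""
    spark = hive_session()

    # Assumption: "demo_events" is a made-up table name for this sketch.
    df = spark.createDataFrame([(1, "click"), (2, "view")], ["id", "event"])
    df.write.mode("overwrite").saveAsTable("demo_events")

    # The same session runs SQL against the table registered above.
    spark.sql(preview_query("demo_events", 5)).show()

    spark.stop()
```

Calling `run_example()` starts a session, so it needs a working Spark installation; `preview_query` on its own only builds the query string.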