DataWorks Summit 2017 in Munich

08/04/2017 3:10 PM
Alice
Tags: conferences, Hadoop

A few days ago I attended DataWorks Summit in Munich. It used to be called Hadoop Summit, but apparently Hadoop itself has moved a bit into the background, giving space to other cool technologies that are emerging at a very fast pace. Big data processing is no longer the key conference topic. The main focus is now on AI, personalized machine learning, deep learning and IoT. The driving question is “what more can we squeeze out of the data, and how?”. The goal is to expand AI and at the same time move it into the background, so that we don’t feel that we are actually using it. Self-driving cars, pattern recognition and speech-driven applications are just a few of the concepts that companies are turning their focus to. They aim to identify and respond to our emotions and needs before we are even able to recognise them ourselves. This leads us more towards data science than towards big data concepts and solutions, although the two often come together anyway. On the other hand, concerns were raised about what will happen when AI takes over all repeatable processes and many of our jobs. But we’ll have to see what the future brings.

Nevertheless, the new trends didn’t stop Hadoop 3 from emerging. Version 3.0.0-alpha2 was released at the beginning of this year. Apart from some fixes, Hadoop 3 will include a few cool features:

Erasure coding – as an alternative for 3x replication, resulting in less space used
Support for 3 NameNodes
Optimisations for storing small files – which up till now was not advised
Docker support
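
To put a number on the "less space used" claim for erasure coding, here is a rough back-of-the-envelope sketch. It assumes a Reed-Solomon layout with 6 data units and 3 parity units, and it ignores partially filled stripes and block padding. The function names are mine and purely illustrative, they are not part of any Hadoop API.

```python
def raw_bytes_replicated(logical_bytes: int, replicas: int = 3) -> int:
    """Raw storage consumed when every block is stored `replicas` times."""
    return logical_bytes * replicas


def raw_bytes_erasure_coded(
    logical_bytes: int, data_units: int = 6, parity_units: int = 3
) -> float:
    """Raw storage consumed when each group of `data_units` data cells
    is stored together with `parity_units` parity cells.

    Assumption: full stripes only, no padding overhead.
    """
    return logical_bytes * (data_units + parity_units) / data_units


if __name__ == "__main__":
    logical = 600  # logical data size, e.g. in GB
    print("3x replication:", raw_bytes_replicated(logical))
    print("6+3 erasure coding:", raw_bytes_erasure_coded(logical))
```

Under these assumptions, 3x replication stores three raw bytes for every logical byte, while the 6+3 layout stores one and a half, so the same data takes half the raw space while still tolerating the loss of up to three units per stripe.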

Although Hadoop no longer plays the key part at the conference, it is still used to build applications with Spark, NiFi or TensorFlow, as well as data warehouses and data processing pipelines.

A few sessions that I found worth attending, and whose slide decks I recommend taking a look at:

Hadoop 3 in a Nutshell – brief summary of planned Hadoop 3 features
An Apache Hive Based Data Warehouse – nice overview of big data warehousing technologies and security
Unified, Efficient and Portable Data Processing with Apache Beam – if you want to find out what Beam is about
Using SparkR to Scale Data Science Applications in Production – good “how to” for Spark and R
HBase in Practice – practical HBase overview
