DataWorks Summit 2017 in Munich

A few days ago I attended DataWorks Summit in Munich. It used to be called Hadoop Summit, but apparently Hadoop itself has moved a bit into the background, giving space to other cool technologies that are emerging at a very fast pace. Big data processing is no longer the key conference topic. The main focus is now on AI, personalized machine learning, deep learning and IoT. The driving question is “what more can we squeeze out of the data, and how?”. The goal is to expand AI and at the same time to move it into the background, so that we don’t even notice we are using it. Self-driving cars, pattern recognition and speech-driven applications are just a few of the concepts that companies are turning their focus to. They aim to identify and respond to our emotions and needs before we’re even able to recognise them ourselves. This leads us more towards data science than towards big data concepts and solutions, although the two often come together anyway. On the other hand, concerns were raised about what will happen when AI takes over all repeatable processes and many of our jobs. But we’ll have to see what the future brings.

Nevertheless, the new trends didn’t stop Hadoop 3 from emerging. Version 3.0.0-alpha2 was released at the beginning of this year. Apart from some fixes, it brings a few cool features:

  • Erasure coding – an alternative to 3x replication that uses less storage space
  • Support for 3 or more NameNodes
  • Optimisations for storing small files – something that was not advised until now
  • Docker support
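
To get a feel for how much space erasure coding can save compared to 3x replication, here is a rough back-of-the-envelope sketch. It assumes a Reed-Solomon layout with 6 data blocks and 3 parity blocks, which is my choice for the illustration; the function names and the 100 TB figure are made up for the example, and real clusters will differ depending on the policy and file sizes.

```python
def replication_footprint(logical_tb, replicas=3):
    """Raw storage needed when every block is stored `replicas` times."""
    return logical_tb * replicas


def erasure_coding_footprint(logical_tb, data_units=6, parity_units=3):
    """Raw storage needed when each group of `data_units` data blocks
    gets `parity_units` extra parity blocks (assumed RS(6,3) layout)."""
    return logical_tb * (data_units + parity_units) / data_units


logical_tb = 100  # hypothetical amount of user data, in TB

print(replication_footprint(logical_tb))     # 300
print(erasure_coding_footprint(logical_tb))  # 150.0
```

Under these assumptions the same 100 TB of data takes 150 TB of raw disk instead of 300 TB, so roughly half the space. The trade-off is that reading and repairing erasure-coded data costs more CPU and network than simply reading another replica.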

Although Hadoop no longer plays the key part at the conference, it’s still used as a foundation for applications built with Spark, NiFi or TensorFlow, as well as for building data warehouses and data processing.

A few sessions that I found worth attending, and whose slide decks I recommend taking a look at:

