Apache Zeppelin – build from source
Zeppelin is a notebook-based framework for data analysis and data visualization.
Although building it from source is quite straightforward, I ran into a few configuration issues that made the process longer.
I used the CDH 5.8.0 VM.
TROUBLESHOOTING
1. ERROR:
    ERROR: spawn npm ENOENT.
SOLUTION:
npm is not installed. To fix this, install npm on your node, for example with yum:
    sudo yum install npm
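Since a missing tool only surfaces partway through the build, it can save time to check for the required commands up front. The following is a minimal sketch; the helper name `missing_tools` and the list of tools (git, mvn, npm) are assumptions based on the commands used in this post, not something the Zeppelin build provides.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the given tools that are not on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v returns non-zero when the tool cannot be found
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing
    echo "${missing# }"
}

# Assumed tool list for this build; adjust it to your setup.
result=$(missing_tools git mvn npm)
if [ -n "$result" ]; then
    echo "Missing tools: $result"
else
    echo "All build tools found"
fi
```

If npm shows up in the list, install it as described above before starting the Maven build.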
2. ERROR:
    Failed to execute goal on project zeppelin-server: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-server:jar:0.8.0-SNAPSHOT: Could not find artifact org.apache.zeppelin:zeppelin-zengine:jar:0.8.0-SNAPSHOT in cloudera
SOLUTION:
There are two possible causes. First, the jar file can't be found in the specified repository. Try the default repository instead: drop the -Pvendor-repo option and rerun the failing build step without it.
The other possible cause is that the build references an outdated or nonexistent zeppelin-zengine.jar. Change the version in your pom.xml file to the most recent one. For reference, check the Maven repository.
    ../Zeppelin/incubator-zeppelin/zeppelin-zengine/pom.xml
In my case:
    <groupId>org.apache.zeppelin</groupId>
    <artifactId>zeppelin-zengine</artifactId>
    <packaging>jar</packaging>
    <version>0.7.0</version>
    <name>Zeppelin: Zengine</name>
    <description>Zeppelin Zengine</description>
3. ERROR:
    Could not resolve dependencies for project org.apache.zeppelin:zeppelin-server:jar:0.8.0-SNAPSHOT: Failure to find org.apache.zeppelin:zeppelin-zengine:jar:0.8.0-SNAPSHOT in http://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced
SOLUTION:
The specified zeppelin-zengine.jar can't be found. It is either outdated or doesn't exist. Look up the most recent version in the Maven repository and update the pom.xml file for Zeppelin Server.
    ../Zeppelin/incubator-zeppelin/zeppelin-server/pom.xml
In my case:
    <dependency>
      <groupId>${project.groupId}</groupId>
      <artifactId>zeppelin-zengine</artifactId>
      <version>0.7.0</version>
    </dependency>
4. ERROR:
    Compilation failure:
    [ERROR] /home/cloudera/Zeppelin/incubator-zeppelin/zeppelin-server/src/main/java/org/apache/zeppelin/server/ZeppelinServer.java:[36,34] package org.apache.zeppelin.helium does not exist
SOLUTION:
To get past this error, run the command:
    git config --global url."https://".insteadOf git://
and rerun the failing step.
5. ERROR:
    Failed to execute goal on project zeppelin-distribution: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-distribution:pom:0.8.0-SNAPSHOT: Could not find artifact org.apache.zeppelin:zeppelin-web:war:0.8.0-SNAPSHOT in apache.snapshots (http://repository.apache.org/snapshots)
SOLUTION:
In this case you need to manually update the versions of the zeppelin-server.jar and zeppelin-web.war dependencies in your zeppelin-distribution pom.xml file.
    ../Zeppelin/incubator-zeppelin/zeppelin-distribution/pom.xml
In my case:
    <dependency>
      <artifactId>zeppelin-server</artifactId>
      <groupId>${project.groupId}</groupId>
      <version>0.7.0</version>
    </dependency>
    <dependency>
      <artifactId>zeppelin-web</artifactId>
      <groupId>${project.groupId}</groupId>
      <version>0.7.0</version>
      <type>war</type>
    </dependency>
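Errors 2, 3 and 5 all come down to version numbers in the module pom.xml files. Before editing them by hand, it can help to list every declared version with its line number. This is a rough sketch, assuming it is run from the root of the source tree; the helper name `list_versions` is hypothetical, and the module directories are the ones from the paths above.

```shell
#!/bin/sh
# Hypothetical helper: print "<line number>: <value>" for each <version> element in a pom.xml.
list_versions() {
    grep -n '<version>' "$1" | sed 's/^\([0-9]*\):.*<version>\(.*\)<\/version>.*/\1: \2/'
}

# Module directories taken from the paths in this post; missing ones are skipped.
for module in zeppelin-zengine zeppelin-server zeppelin-distribution; do
    if [ -f "$module/pom.xml" ]; then
        echo "== $module/pom.xml"
        list_versions "$module/pom.xml"
    fi
done
```

The output includes parent and plugin versions as well, so use the line numbers to find the entries that refer to the Zeppelin modules.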
HOW TO
How to build Zeppelin from source with Maven 3.3.9 from the command line:
1. Clone the source repository into the desired folder:
    cd /home/Zeppelin
    git clone https://github.com/apache/incubator-zeppelin
Note: Make sure that your Maven version is at least 3.1.0.
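The Maven version requirement can be checked from the shell before starting the build. The sketch below is one way to do it, assuming `sort -V` (version sort, as in GNU coreutils) is available and that the first line of `mvn -version` has the form "Apache Maven X.Y.Z ..."; the helper name `version_ge` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is greater than or equal to version $2.
# The smaller of the two versions sorts first, so $2 must come out on top.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="3.1.0"
if command -v mvn >/dev/null 2>&1; then
    # Assumes the version is the third field of the "Apache Maven ..." line
    current=$(mvn -version 2>/dev/null | awk '/Apache Maven/ {print $3; exit}')
    if version_ge "$current" "$required"; then
        echo "Maven $current is new enough"
    else
        echo "Maven $current is older than $required"
    fi
else
    echo "mvn not found on PATH"
fi
```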
2. Build Zeppelin with Maven:
    cd /home/Zeppelin/incubator-zeppelin
    mvn clean package -Dhadoop.version=2.6.0-cdh5.8.0 -Pspark-1.6 -Phadoop-2.6 -Pvendor-repo -DskipTests
I built against Spark 1.6 and Hadoop 2.6. You can also enable other features such as SparkR or PySpark support (for more information, refer to the official webpage).
3. Start Zeppelin daemons:
    ./bin/zeppelin-daemon.sh start
By default it runs on http://localhost:8080.