Installing and Running Apache NiFi on your HDP Cluster


Hey everyone,

I learned today about a cool ETL/data pipeline/make-your-life-easier tool that was recently released by the NSA (not kidding) as a way to manage the flow of data in and out of a system: Apache NiFi. To me, that functionality seems to match PERFECTLY with what people like to do with Hadoop. This guide just sets up NiFi and doesn’t do anything with it yet (that’ll come later!).

Things you’ll need:

  • HDP2.3
  • Maven 3.1 or newer
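
If you're not sure which Maven is on your path, here's a rough sketch of a check. The helper names version_ge and maven_version are made up for this post (they are not part of Maven or NiFi), and the comparison assumes a sort that understands -V, as the GNU coreutils one does.

```shell
# Succeed when version $1 is at least version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ -n "$1" ] &&
        [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Pull the version number out of the "Apache Maven x.y.z ..." line.
maven_version() {
    mvn -version 2>/dev/null |
        sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v mvn >/dev/null 2>&1; then
    found="$(maven_version)"
    if version_ge "$found" 3.1; then
        echo "Maven $found is new enough"
    else
        echo "Need Maven 3.1 or newer, found: ${found:-unknown}"
    fi
else
    echo "mvn not found on PATH"
fi
```

Comparing with a version sort rather than plain string comparison means 3.10 is correctly treated as newer than 3.9.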

You don’t even need root access for the build (only for the service install at the end)!


Here’s how to get it running on your HDP2.3 cluster:

First we need to get the source code and build it:

wget https://github.com/apache/nifi/archive/master.zip

unzip master.zip

cd nifi-master

export MAVEN_OPTS="-Xms1024m -Xmx3076m -XX:MaxPermSize=256m"

mvn -T C2.0 clean install

(Takes about 8 minutes to run all the tests)

After that’s done, unpack the build and open its config:

cd nifi-assembly/target

tar -zxvf nifi-0.3.1-SNAPSHOT-bin.tar.gz

cd nifi-0.3.1-SNAPSHOT
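
The 0.3.1-SNAPSHOT in those names is just whatever version master happened to be at when I built it, so yours may differ. Here's a small sketch that finds the tarball by pattern instead of hard-coding the version. The function names find_nifi_tarball and nifi_dir_name are my own, and it assumes the unpacked folder is the tarball name minus -bin.tar.gz, as it is above.

```shell
# Print the path of the single NiFi binary tarball in directory $1.
# Fails if there is no match or more than one.
find_nifi_tarball() {
    tarball_dir="$1"
    set -- "$tarball_dir"/nifi-*-bin.tar.gz
    # With no match the glob stays literal, so also check the file exists.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one nifi-*-bin.tar.gz in $tarball_dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Turn .../nifi-X-bin.tar.gz into nifi-X, the folder the tarball unpacks to.
nifi_dir_name() {
    name="${1##*/}"                      # drop leading directories
    printf '%s\n' "${name%-bin.tar.gz}"  # drop the suffix
}

# Usage, from inside nifi-assembly/target:
#   tarball="$(find_nifi_tarball .)" &&
#       tar -zxf "$tarball" && cd "$(nifi_dir_name "$tarball")"
```

Refusing to guess when several tarballs are lying around is deliberate: after a few rebuilds across versions, picking the wrong one silently would be worse than an error.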

vi conf/nifi.properties

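On line 106 (:106 in vim), change the HTTP port to 9000, since the default of 8080 is usually already taken by Ambari on an HDP cluster: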

nifi.web.http.port=9000
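
If you'd rather not hunt for the line by number (it may well move between versions), here's a sketch that makes the same change by key. The function name set_nifi_http_port is made up for this post; the only thing it assumes about the file is that it contains a nifi.web.http.port= line.

```shell
# Set nifi.web.http.port in properties file $1 to port $2.
# Returns non-zero and leaves the file untouched if the key is missing.
set_nifi_http_port() {
    props="$1"
    port="$2"
    grep -q '^nifi\.web\.http\.port=' "$props" || return 1
    # Write to a temp file and move it into place, which sidesteps the
    # differences between GNU and BSD `sed -i`.
    sed "s/^nifi\.web\.http\.port=.*/nifi.web.http.port=$port/" \
        "$props" > "$props.tmp" &&
        mv "$props.tmp" "$props"
}

# Usage, from the unpacked NiFi folder:
#   set_nifi_http_port conf/nifi.properties 9000
```

The pattern is anchored on the full key, so the separate nifi.web.https.port line is left alone.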

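Then install NiFi as a service and start it (these two commands do need root; without it, bin/nifi.sh start runs NiFi directly):

bin/nifi.sh install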

service nifi start

Then navigate to http://localhost:9000/nifi
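
NiFi can take a little while to come up after the service starts, so the page may not load right away. Here's a rough sketch of a polling loop you could use instead of mashing refresh. The function name wait_for_url is my own, it assumes curl is installed, and the retry count and delay in the usage line are arbitrary.

```shell
# Poll URL $1 until it answers, trying $2 times with $3 seconds in between.
# Returns 0 as soon as a request succeeds, 1 if every attempt fails.
wait_for_url() {
    url="$1"
    tries="$2"
    delay="$3"
    while [ "$tries" -gt 0 ]; do
        # -s silent, -f fail on HTTP errors, -L follow redirects,
        # -o /dev/null discard the body, --max-time cap each attempt
        if curl -sfL -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        # Only sleep when another attempt is coming.
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage:
#   wait_for_url http://localhost:9000/nifi 30 5 && echo "NiFi is up"
```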

There you have it! With any luck, you now have NiFi installed!