Apache NiFi Overview

What is Apache NiFi?

Apache NiFi is a robust open-source framework for data ingestion and distribution, and more. It can propagate any data content from any source to any destination.

NiFi is based on a programming paradigm called Flow-Based Programming (FBP). Rather than start with the formal definition of Flow-Based Programming, I will explain how NiFi works, and you can then connect it with the definition.

How Does NiFi Work?

NiFi consists of atomic elements that can be combined into groups to build simple or complex dataflows.

NiFi has Processors & Process Groups.

What is a Processor in NiFi?

A Processor is an atomic element in NiFi that performs a specific task.

The latest version of NiFi has more than 280 processors, each with its own responsibility.

For example, the GetFile processor reads files from a specific location, whereas the PutFile processor writes files to a particular location. NiFi has many other processors like these, each with its own purpose.
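To make the GetFile and PutFile idea concrete, here is a minimal Python sketch of what such a pair does conceptually. It is a toy model and not NiFi code: the names `get_files`, `put_file`, and `copy_all` are hypothetical, and real NiFi processors are configured in the NiFi UI rather than written like this.

```python
import tempfile
from pathlib import Path


def get_files(input_dir):
    """Toy stand-in for GetFile: yield (file name, content) for each file in a directory."""
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file():
            yield path.name, path.read_bytes()


def put_file(output_dir, name, content):
    """Toy stand-in for PutFile: write the content to a file in the target directory."""
    target = Path(output_dir) / name
    target.write_bytes(content)
    return target


def copy_all(input_dir, output_dir):
    """Chain the two steps: read every file from one directory and write it to another."""
    return [put_file(output_dir, name, content) for name, content in get_files(input_dir)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        (Path(src) / "orders.csv").write_text("id,amount\n1,10\n")
        written = copy_all(src, dst)
        print([p.name for p in written])
```

Each function does exactly one job, which is the point of a processor: small single-purpose steps that become useful once they are chained together.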

We have processors to Get Data from various data sources and processors to Write Data to various data sources.

The data source can be almost anything.

It can be a SQL database server such as Postgres, Oracle, or MySQL; a NoSQL database such as MongoDB, Couchbase, or HBase; a search engine such as Solr or Elasticsearch; or a cache server such as Redis. NiFi can even connect to a Kafka message queue.

NiFi also has a rich set of processors to connect with Amazon AWS entities like S3 Buckets and DynamoDB.

NiFi has a processor for almost everything you typically need when working with data. We will go deeper into the various types of processors available in NiFi in later videos. Even if you don’t find a processor that fits your requirement, NiFi gives you a simple way to write your own custom processors.

Now let’s move on to the next term, FlowFile.

What is a FlowFile in NiFi?

The actual data in NiFi propagates in the form of a FlowFile. A FlowFile can contain any data: CSV, JSON, XML, plain text, and even SQL queries or binary data.

The FlowFile abstraction is the reason NiFi can propagate any data from any source to any destination. A processor can process a FlowFile to generate a new FlowFile.
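A rough way to picture this is a small object that carries opaque content plus some descriptive key/value attributes. The Python sketch below is only an illustration under that assumption: the `FlowFile` class and the `to_upper` processor are hypothetical names and do not mirror NiFi's Java API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowFile:
    """Toy model of a FlowFile: opaque content bytes plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def to_upper(flowfile):
    """Toy processor: build a NEW FlowFile with upper-cased content; the input is left untouched."""
    new_attributes = {**flowfile.attributes, "transformed.by": "to_upper"}
    return FlowFile(content=flowfile.content.upper(), attributes=new_attributes)


if __name__ == "__main__":
    original = FlowFile(b"id,name\n1,alice\n", {"filename": "users.csv"})
    result = to_upper(original)
    print(result.content)
    print(result.attributes)
```

Because the content is just bytes, the same wrapper works for CSV, JSON, XML, or binary data, which is why one abstraction can serve every source and destination.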

The next important term is Connections.

What is a Connection in NiFi?

In NiFi, processors can be connected to create a dataflow. The link between two processors is called a Connection. Each connection also acts as a queue for FlowFiles.

Next come Process Groups and Input and Output Ports.
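The queueing role of a connection can be sketched with Python's standard `queue.Queue` sitting between an upstream and a downstream step. This is a simplified illustration, not how NiFi implements connections, and the `run_flow` function is a hypothetical name.

```python
from queue import Queue


def run_flow(lines):
    """Toy dataflow: two steps linked by a queue that plays the role of a Connection."""
    connection = Queue()  # items wait here until the downstream step is ready

    # Upstream "processor": put each item on the connection.
    for line in lines:
        connection.put(line)

    queued = connection.qsize()  # how many items are waiting in the connection

    # Downstream "processor": drain the connection and transform each item.
    results = []
    while not connection.empty():
        results.append(connection.get().strip().upper())
    return queued, results


if __name__ == "__main__":
    print(run_flow([" first ", "second"]))
```

The queue decouples the two steps: the upstream step can keep producing even when the downstream step is slower, and the backlog simply waits in the connection.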

What are Process Groups, Input Ports & Output Ports in NiFi?

In NiFi, one or more connected processors can be combined into a Process Group. When you have a complex dataflow, it’s better to combine processors into logical process groups. This makes the flows easier to maintain.

Process Groups can have input and output ports, which are used to move data between groups.
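As a loose analogy, a process group behaves like a named pipeline with one way in and one way out. The sketch below models that in Python. `ProcessGroup`, `input_port`, and `connect` are hypothetical names chosen for illustration, and the return value stands in for the output port.

```python
class ProcessGroup:
    """Toy process group: a named chain of steps with one input port and one output port."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = list(steps)

    def input_port(self, item):
        """Data enters here, passes through every step, and leaves via the return value (the output port)."""
        for step in self.steps:
            item = step(item)
        return item


def connect(*groups):
    """Wire the output port of each group to the input port of the next one."""
    def flow(item):
        for group in groups:
            item = group.input_port(item)
        return item
    return flow


if __name__ == "__main__":
    cleanse = ProcessGroup("cleanse", [str.strip, str.lower])
    enrich = ProcessGroup("enrich", [lambda s: s + "@example.com"])
    pipeline = connect(cleanse, enrich)
    print(pipeline("  Alice "))
```

Grouping the steps this way means each group can be understood, tested, and changed on its own, which is the maintenance benefit described above.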

The last term you should know for now is Controller Services.

What is a Controller Service in NiFi?

Controller Services are shared services that can be used by Processors. For example, a processor that reads from or writes to a SQL database can use a Controller Service that holds the required DB connection details.

Controller Services are not limited to DB connections.
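The idea of configuring connection details once and sharing them can be sketched in Python with the standard `sqlite3` module. This is only an analogy: `DatabaseService`, `put_rows`, and `get_rows` are hypothetical names, and an in-memory SQLite database stands in for a real database server.

```python
import sqlite3


class DatabaseService:
    """Toy controller service: holds the connection details once and shares the connection."""

    def __init__(self, database=":memory:"):
        self._connection = sqlite3.connect(database)

    def connection(self):
        return self._connection


def put_rows(service, rows):
    """Toy 'put' processor: writes rows using the shared service."""
    conn = service.connection()
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()


def get_rows(service):
    """Toy 'get' processor: reads rows using the same shared service."""
    cursor = service.connection().execute("SELECT id, name FROM events ORDER BY id")
    return cursor.fetchall()


if __name__ == "__main__":
    service = DatabaseService()
    put_rows(service, [(1, "created"), (2, "updated")])
    print(get_rows(service))
```

Neither function knows how the connection was set up, so changing the database details means touching only the service, not every processor that uses it.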

To learn more about Apache NiFi, kindly visit my YouTube channel. I have created a playlist especially for beginners.

After finishing my YouTube tutorial, if you wish to dive deeper into advanced topics, you can opt for my Udemy course.

Apache NiFi – The Complete Guide

$10 Udemy NiFi Course

16 comments On Apache NiFi Overview

  • hi,
    i am using NIFI on windows 10.but can not open nifi UI.
    i wrote
    i checked port number in nifi properties.
    some times open
    also have problem if i have run-nifi.bat
    i got bellow messages

    The JAVA_HOME environment variable is not defined correctly.
    Instead the PATH will be used to find the java executable.

    2019-04-04 01:19:45,999 INFO [main] org.apache.nifi.bootstrap.Command Starting Apache NiFi…
    2019-04-04 01:19:46,015 INFO [main] org.apache.nifi.bootstrap.Command Working Directory: C:\Users\shailesh\Downloads\nifi-1.9.1
    2019-04-04 01:19:46,015 INFO [main] org.apache.nifi.bootstrap.Command Command: java -classpath C:\Users\shailesh\Downloads\nifi-1.9.1\.\conf;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\javax.servlet-api-3.1.0.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\jcl-over-slf4j-1.7.25.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\jetty-schemas-3.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\jul-to-slf4j-1.7.25.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\log4j-over-slf4j-1.7.25.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\logback-classic-1.2.3.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\logback-core-1.2.3.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\nifi-api-1.9.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\nifi-framework-api-1.9.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\nifi-nar-utils-1.9.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\nifi-properties-1.9.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\nifi-runtime-1.9.1.jar;C:\Users\shailesh\Downloads\nifi-1.9.1\.\lib\slf4j-api-1.7.25.jar -Dorg.apache.jasper.compiler.disablejsr199=true -Xmx512m -Xms512m -Djavax.security.auth.useSubjectCredsOnly=true -Djava.security.egd=file:/dev/urandom -Dsun.net.http.allowRestrictedHeaders=true -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC -Djava.protocol.handler.pkgs=sun.net.www.protocol -Dnifi.properties.file.path=C:\Users\shailesh\Downloads\nifi-1.9.1\.\conf\nifi.properties -Dnifi.bootstrap.listen.port=52347 -Dapp=NiFi -Dorg.apache.nifi.bootstrap.config.log.dir=C:\Users\shailesh\Downloads\nifi-1.9.1\bin\..\\logs org.apache.nifi.NiFi
    2019-04-04 01:19:48,763 WARN [main] org.apache.nifi.bootstrap.Command Failed to set permissions so that only the owner can read pid file C:\Users\shailesh\Downloads\nifi-1.9.1\bin\..\run\nifi.pid; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
    2019-04-04 01:19:48,966 WARN [main] org.apache.nifi.bootstrap.Command Failed to set permissions so that only the owner can read status file C:\Users\shailesh\Downloads\nifi-1.9.1\bin\..\run\nifi.status; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
    2019-04-04 01:19:49,276 INFO [main] org.apache.nifi.bootstrap.Command Launched Apache NiFi with Process ID 6652

  • Hi..
    I am creating a flow, NiFi should fetch data from Splunk and send to kafka and then postgres.
    Problem is, NiFi runs continuously and it is keep fetching the same data repeatedly.
    Can you pls suggest mechanism where NiFi should not fetch the same data from splunk

    • Hi Chandan,

      I’m not so familiar with Splunk. But at a high level, I can see a “State” option if I right-click the “GetSplunk” processor. This means it has the intelligence to maintain the last fetch timestamp. Try playing around with the “Earliest Time” or “Latest Time” property, together with the field name in Splunk that holds the “last modified on” or “created on” timestamp. That should solve your problem.


  • Hello Manoj GT,

    Nice blog! I am editor at Java Code Geeks (www.javacodegeeks.com). We have the JCG program (see //www.javacodegeeks.com/join-us/jcg/), that I think you’d be perfect for.

    If you’re interested, send me an email to eleftheria.drosopoulou@javacodegeeks.com and we can discuss further.

    Best regards,
    Eleftheria Drosopoulou

  • Hi Manoj,
    Thanks for all the information that you have been publishing for NiFi. It has been really useful. I was working on a NiFi project wherein I have to performance test NiFi flows from various sources. Are you aware of any tools that can be used to performance test NiFi flows and how can we collect metrics if there are any errors during the data transfer using NiFi flows.

    Any direction on this will be really helpful.
    Thanks in advance !!!!
    Looking forward for your response

  • Hi Manoj,

    Need help from you regarding making nifi secure with https i have applied the client provided certs on server end , i made a client certificate by my own but its not autenticating to nifi.

    Can you please help me with this , i can send you the commands i used for making the client certificates.

  • Hello Manoj,

    Can you advise what is wrong in the below? It says as syntax of the command is incorrect…

    Directory of C:\nifi-1.10.0\bin
    12/10/2019 03:55 PM .
    12/10/2019 03:55 PM ..
    10/29/2019 08:12 AM 1,872 dump-nifi.bat
    12/10/2019 04:02 PM 1,128 nifi-env.bat
    10/29/2019 08:12 AM 1,664 nifi-env.sh
    10/29/2019 08:12 AM 12,965 nifi.sh
    10/29/2019 08:12 AM 1,871 run-nifi.bat
    10/29/2019 08:12 AM 1,832 status-nifi.bat
    The syntax of the command is incorrect.
    C:\nifi-1.10.0\bin>echo %JAVA_HOME%
    C:\Program Files (x86)\Java\jdk-13.0.1"

  • Awesome post! Keep up the great work! 🙂

  • Hi Manoj,

    Thank you. Your nifi classes are very helpful to professional career,
    I am currently working on Nifi, I have usecase for my client.
    Usecase : Connect Nifi to Oracle database do the incremental loads to HDFS (Hadoop) directory
    Source is oracle table CLOB column , within in the CLOB xml file, The requirement to pull xml file from clob to HDFS directory using nifi tool
    Do the incremental loads based on transaction dates from same table.
    I am not finding any class regarding this i was looking for this from longtime can you please help me with this ?
    I tried using Execute SQL Processor:
    Controller Service:
    Database Connection URL: jdbc:oracle:thin:xxxxxx/xxxx.com
    Database Driver Class Name: oracle.jdbc.driver.OracleDriver
    Database Driver Location(s): ?
    Database User: xxxx
    Password: xxxx

  • Hi Manoj ,

    Make a video on nifi for ssl setup.

  • Thank Dear,
    For update me. But i tell you my problem. i have a google compute server and i setup nifi. It’s a basic nifi setup. I tell you what i want.

    1) Setup LDAP server with user name and password for nifi.
    2) Setup SSL for nifi.
    So please help me out for this. Also provide me your contact number so i can talk with you. Also tell me your charges for above 2 points.

    Thanks & Regards
    Ashish Jain

  • Hi Manoj,

    I mean want to run NiFi with SSL as a standalone instance and also Ldap Server steup for nifi.
