Kafka, NiFi, HDFS


Category : hadoop


Requirement

Set up a distributed Kafka cluster with 3 nodes (brokers) and, preferably, an external ZooKeeper ensemble. Create topics in Kafka (each with at least 2 partitions and a replication factor of 3) that will hold data from a processed log file. Set up a NiFi cluster using the same ZooKeeper quorum that Kafka uses.

ZooKeeper status, showing the initial state of the machine.

Starting ZooKeeper.

Checking the status of ZooKeeper.
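A minimal sketch of these steps, assuming a standard ZooKeeper tarball install and that you are in the ZooKeeper directory (paths will differ on your machines):

```bash
# Check the initial state (on a fresh machine this reports that ZooKeeper is not running)
bin/zkServer.sh status

# Start the ZooKeeper server using conf/zoo.cfg
bin/zkServer.sh start

# Check again; on a quorum member it reports Mode: leader or Mode: follower
bin/zkServer.sh status
```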

See the running Java processes with jps.
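For reference, jps simply lists the JVM processes on the box; with ZooKeeper running you should see something like this (PIDs will differ):

```bash
$ jps
4721 QuorumPeerMain   # the ZooKeeper server process
5012 Jps
```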

So, how do we configure ZooKeeper, and why do we need it?

Whenever we need to manage a cluster, ZooKeeper comes in handy: both Kafka and NiFi rely on it for coordination and cluster state.
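A sketch of a minimal conf/zoo.cfg for a 3-node ensemble; the hostnames zk1/zk2/zk3 and the data directory are assumptions, replace them with your own:

```bash
# conf/zoo.cfg (same file on all three ZooKeeper nodes)
cat > conf/zoo.cfg <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF

# Each node also needs a myid file matching its server.N entry
echo 1 > /var/lib/zookeeper/myid   # use 2 and 3 on the other nodes
```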

Start the Kafka server.
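Roughly, on each of the three brokers (the hostnames are assumptions; only broker.id must differ per node):

```bash
# In config/server.properties on each broker, set a unique id and point it at the external quorum, e.g.:
#   broker.id=0                                    # 1 and 2 on the other brokers
#   zookeeper.connect=zk1:2181,zk2:2181,zk3:2181

# Start the broker in the background
bin/kafka-server-start.sh -daemon config/server.properties
```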

Now let's check if it's running with jps.

As the requirement demands, we need some topics, so let's create them.

And another topic.
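A sketch of the topic creation commands; with older Kafka versions topics are created against ZooKeeper (newer versions use --bootstrap-server instead), and the topic names here are placeholders:

```bash
# Topic 1: 2 partitions, replication factor 3
bin/kafka-topics.sh --create --zookeeper zk1:2181,zk2:2181,zk3:2181 \
  --topic log-topic-1 --partitions 2 --replication-factor 3

# Topic 2, same settings
bin/kafka-topics.sh --create --zookeeper zk1:2181,zk2:2181,zk3:2181 \
  --topic log-topic-2 --partitions 2 --replication-factor 3

# Verify the topics, partitions and replicas
bin/kafka-topics.sh --describe --zookeeper zk1:2181,zk2:2181,zk3:2181
```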

NiFi initial status (empty canvas).
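To bring the NiFi nodes up and reach this empty canvas, roughly (assuming a standard NiFi install directory):

```bash
bin/nifi.sh start     # start the node
bin/nifi.sh status    # confirm it is running

# The UI is then reachable at http://<node>:8080/nifi
# (the port is set by nifi.web.http.port in conf/nifi.properties)
```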

NiFi cluster, configured using the same ZooKeeper quorum as Kafka.
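The cluster wiring lives in conf/nifi.properties; a sketch of the relevant entries, reusing the same ZooKeeper quorum (the node hostname nifi1 and port are assumptions, adjust per node):

```bash
# Edit conf/nifi.properties on each NiFi node
sed -i \
  -e 's|^nifi.cluster.is.node=.*|nifi.cluster.is.node=true|' \
  -e 's|^nifi.cluster.node.address=.*|nifi.cluster.node.address=nifi1|' \
  -e 's|^nifi.cluster.node.protocol.port=.*|nifi.cluster.node.protocol.port=11443|' \
  -e 's|^nifi.zookeeper.connect.string=.*|nifi.zookeeper.connect.string=zk1:2181,zk2:2181,zk3:2181|' \
  conf/nifi.properties
```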

NiFi dataflow overview.

NiFi SplitText, to show the dataflow in a better way.

Topics created by NiFi according to the logs.
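To confirm the topics NiFi produced from the log data, list them from any broker (older-style --zookeeper flag shown here):

```bash
bin/kafka-topics.sh --list --zookeeper zk1:2181,zk2:2181,zk3:2181
```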

FlowFiles.

HDFS ls.
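Listing the target directory in HDFS; the path and file name are placeholders:

```bash
hdfs dfs -ls /user/nifi/kafka_data

# Inspect the start of one of the files NiFi wrote
hdfs dfs -cat /user/nifi/kafka_data/<filename> | head
```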

NiFi ConsumeKafka.

HDFS data ingestion.

Sample JSON.

ReplaceText in NiFi.

NiFi process group.

Setting the file name using a variable.

The flow working.

The full flow.

Output.


About Mohit Manna

Hi, my name is Mohit Manna. I am a Data Engineer who knows about coding and other stuff.
