Requirement
Set up a distributed Kafka cluster with 3 nodes (brokers) and (preferably) external ZooKeeper nodes. Create topics in Kafka (each with at least 2 partitions and a replication factor of 3) that will contain data from a processed log file. Set up a distributed NiFi cluster using the same ZooKeeper quorum as the one used by Kafka.
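ZooKeeper status, showing the initial state of the machine
Starting ZooKeeper
Status of ZooKeeper
Listing the running Java processes
So, how do we configure ZooKeeper, and why do we need it?
ZooKeeper comes in handy whenever we need to manage a cluster. Here both Kafka and NiFi rely on it for coordination, such as tracking which nodes are alive and electing a leader among them.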
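To make the configuration question a bit more concrete, here is a minimal sketch that renders a zoo.cfg for a three-node ensemble. The host names, the data directory and the timing values are assumptions chosen for illustration; they are not taken from the setup shown in the screenshots.

```python
# Sketch: render a zoo.cfg for a three-node ZooKeeper ensemble.
# Host names, dataDir and timing values below are hypothetical placeholders.

def render_zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    lines = [
        "tickTime=2000",   # base time unit in milliseconds
        "initLimit=10",    # ticks a follower may take to connect and sync at startup
        "syncLimit=5",     # ticks a follower may lag behind the leader
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # server.N=host:peerPort:electionPort
    # N must match the number stored in the myid file on that host.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


ZK_HOSTS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]

if __name__ == "__main__":
    print(render_zoo_cfg(ZK_HOSTS), end="")
```

The same file goes on every ZooKeeper node; only the myid file in the data directory differs from node to node.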
Starting the Kafka server
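Before a broker is started (typically with kafka-server-start.sh and a server.properties file), each of the three brokers needs its own id and a pointer to the shared ZooKeeper quorum. The sketch below renders the relevant properties per broker; the host names, port and log directory are assumptions, not the values used on the machines in the screenshots.

```python
# Sketch: render the cluster-related part of server.properties for each broker.
# Host names, ports and log_dir are hypothetical placeholders.

def zk_connect_string(zk_hosts, port=2181):
    # Comma-separated host:port list, as expected by zookeeper.connect
    return ",".join(f"{host}:{port}" for host in zk_hosts)


def render_broker_properties(broker_id, host, zk_hosts, log_dir="/var/lib/kafka/logs"):
    lines = [
        f"broker.id={broker_id}",              # must be unique per broker
        f"listeners=PLAINTEXT://{host}:9092",
        f"log.dirs={log_dir}",
        f"zookeeper.connect={zk_connect_string(zk_hosts)}",
    ]
    return "\n".join(lines) + "\n"


ZK_HOSTS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
BROKER_HOSTS = ["kafka1.example.com", "kafka2.example.com", "kafka3.example.com"]

if __name__ == "__main__":
    for broker_id, host in enumerate(BROKER_HOSTS, start=1):
        print(f"# server.properties for {host}")
        print(render_broker_properties(broker_id, host, ZK_HOSTS))
```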
Now let's check if it's running
The requirement calls for some topics, so let's create them
Another topic
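The topic creation step can be sketched as a small helper that builds the kafka-topics.sh command line with the 2 partitions and replication factor of 3 from the requirement. The topic names and the broker address are made up for illustration. The sketch uses --bootstrap-server, which newer Kafka releases expect; older releases took a --zookeeper argument instead, so adjust for the version in use.

```python
import shlex

# Sketch: build the kafka-topics.sh invocation for each required topic.
# Topic names and the bootstrap address are hypothetical placeholders.

def create_topic_command(topic, bootstrap_server, partitions=2, replication_factor=3):
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication_factor must be at least 1")
    return [
        "kafka-topics.sh",
        "--create",
        "--bootstrap-server", bootstrap_server,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


TOPICS = ["logs-info", "logs-error"]

if __name__ == "__main__":
    for topic in TOPICS:
        # shlex.join produces a shell-safe string that can be pasted into a terminal
        print(shlex.join(create_topic_command(topic, "kafka1.example.com:9092")))
```

A replication factor of 3 only works when at least three brokers are registered, which is why the topics are created after all brokers are up.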
NiFi initial status
NiFi cluster, configured using ZooKeeper
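For reference, pointing NiFi at the same ZooKeeper quorum mostly comes down to a handful of entries in nifi.properties on each node. The sketch below renders them; the host names, the protocol port and the root node are assumptions and will differ from the cluster in the screenshot.

```python
# Sketch: render the cluster-related entries of nifi.properties for one node.
# Host names, protocol_port and root_node are hypothetical placeholders.

def render_nifi_cluster_properties(node_host, zk_hosts, protocol_port=11443,
                                   root_node="/nifi", zk_port=2181):
    connect = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    lines = [
        "nifi.cluster.is.node=true",
        f"nifi.cluster.node.address={node_host}",
        f"nifi.cluster.node.protocol.port={protocol_port}",
        # Reuse the external quorum that Kafka uses instead of the embedded ZooKeeper
        "nifi.state.management.embedded.zookeeper.start=false",
        f"nifi.zookeeper.connect.string={connect}",
        f"nifi.zookeeper.root.node={root_node}",
    ]
    return "\n".join(lines) + "\n"


ZK_HOSTS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]

if __name__ == "__main__":
    print(render_nifi_cluster_properties("nifi1.example.com", ZK_HOSTS), end="")
```

Only nifi.cluster.node.address changes from node to node; the ZooKeeper connect string stays identical across the cluster.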
NiFi dataflow overview
NiFi SplitText, to show the dataflow in a better way
Topics created by NiFi according to the logs
Flow files
HDFS ls
NiFi ConsumeKafka
HDFS data ingestion
Sample JSON
NiFi ReplaceText
NiFi process group
Setting the file name using a variable
Setting the file name using a variable
Working
Full flow
Output