Simple Kafka Twitter Streaming that streams tweets through Apache Kafka and stores the data in HBase on the HDFS file system
- HADOOP & HDFS
- HBASE (localhost:9000)
- YARN
- ZOOKEEPER (localhost:2181)
- KAFKA
- HBase Thrift
- Kafka Server for the producer, streaming to topic ['myWorld']
- Kafka Producer Console @ localhost:9092, streaming to topic ['myWorld']
- Kafka Consumer Console listening to Zookeeper @ localhost:2181 for topic ['myWorld']
Create an `api-keys.py` file in the project directory (you can rename `api-keys.py.example`):

```python
KEY = 'wjRs...............fKef'
SECRET = '3xB5n................................GWEV6HMDbPhth'
TOKEN = '14959................................7QwUIBKyRZB2QN'
TOKEN_SECRET = 'ysLFG2v..............................CA8p1GXo'
```

Get your own tokens by creating a Twitter app at [apps.twitter.com](https://apps.twitter.com/).
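A minimal sketch of how the scripts might load and sanity-check these credentials (the helper name `load_api_keys` is hypothetical, not part of this repo; the hyphen in `api-keys.py` rules out a plain `import`, so the module is loaded by path):

```python
import importlib.util

REQUIRED = ("KEY", "SECRET", "TOKEN", "TOKEN_SECRET")

def load_api_keys(path="api-keys.py"):
    """Load api-keys.py as a module and verify all four names are defined."""
    spec = importlib.util.spec_from_file_location("api_keys", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    missing = [name for name in REQUIRED if not hasattr(module, name)]
    if missing:
        raise ValueError(f"api-keys.py is missing: {', '.join(missing)}")
    return {name: getattr(module, name) for name in REQUIRED}
```

Failing fast here gives a clearer error than an authentication failure deep inside the Twitter client.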
Run (each in its own terminal):

- Zookeeper: `$ZOOKEEPER_HOME/bin/zkServer.sh start`
- Kafka Server: `$KAFKA_HOME/bin/kafka-server-start.sh config/server.properties`
- Create the Kafka topic: `$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic myWorld`
- Kafka Producer Console: `$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic myWorld`
- Kafka Consumer Console: `$KAFKA_HOME/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic myWorld`
- HBase Thrift: `$HBASE_HOME/bin/hbase thrift start`
- Producer: `python producer.py`
- Consumer: `python consumer.py`
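The producer/consumer hand-off above can be sketched in plain Python (a sketch only; the actual scripts would additionally use a Kafka client such as kafka-python and an HBase Thrift client such as happybase): the producer serializes each tweet to bytes for topic 'myWorld', and the consumer shapes each message as a put on the 'json:data' column of 'tweet-table'.

```python
import json

TOPIC = "myWorld"
TABLE_NAME = "tweet-table"

def encode_tweet(tweet: dict) -> bytes:
    """Producer side: serialize one tweet into the bytes sent to the topic."""
    return json.dumps(tweet).encode("utf-8")

def to_hbase_put(message: bytes) -> dict:
    """Consumer side: map a raw Kafka message onto column family 'json',
    column qualifier 'data' (HBase columns are addressed as b'family:qualifier')."""
    return {b"json:data": message}
```

Keeping the whole tweet as one JSON blob in a single column means the consumer never has to parse tweets; it just moves bytes from the topic into the table.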
| Setting | Value |
|---|---|
| TOPIC | `myWorld` |
| TABLE NAME | `tweet-table` |
| COLUMN FAMILY | `json` |
| COLUMN NAME | `data` |
HBASE STORAGE STRUCTURE

|  | COLUMN FAMILY (`json`) |
|---|---|
| KEY | COLUMN NAME (`data`) |
| row 1 | Data 1 |
| row 2 | Data 2 |
| row 3 | Data 3 |
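The storage layout above can be illustrated with an in-memory stand-in for the table (a sketch only, not the happybase calls the real consumer would make): every row key maps to the raw tweet JSON under the single column 'json:data'.

```python
import json

def put(table: dict, row_key: str, tweet: dict) -> None:
    """Mimic an HBase put: store the tweet's JSON bytes under column
    family 'json', qualifier 'data' for the given row key."""
    table.setdefault(row_key, {})[b"json:data"] = json.dumps(tweet).encode("utf-8")

tweet_table = {}
put(tweet_table, "row 1", {"text": "Data 1"})
put(tweet_table, "row 2", {"text": "Data 2"})
```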