Commit 0df409b

added (#1169)
1 parent bc66191 commit 0df409b

File tree: 4 files changed (+109, -63 lines)

docs/en/guides/40-load-data/02-load-db/kafka.md

Lines changed: 3 additions & 61 deletions

````diff
@@ -28,65 +28,7 @@ To download databend-kafka-connect and learn more about the plugin, visit the [G
 
 To download bend-ingest-kafka and learn more about the tool, visit the [GitHub repository](https://github.com/databendcloud/bend-ingest-kafka) and refer to the README for detailed instructions.
 
-## Examples
+## Tutorials
 
-This example assumes your data in Kafka appears as follows and explains how to load data into Databend using the tool bend-ingest-kafka.
-
-```json
-{
-  "employee_id": 10,
-  "salary": 30000,
-  "rating": 4.8,
-  "name": "Eric",
-  "address": "123 King Street",
-  "skills": ["Java", "Python"],
-  "projects": ["Project A", "Project B"],
-  "hire_date": "2011-03-06",
-  "last_update": "2016-04-04 11:30:00"
-}
-```
-
-### Step 1. Create Table in Databend
-
-Before ingesting data, you need to create a table in Databend that matches the structure of your Kafka data.
-
-```sql
-CREATE TABLE employee_data (
-  employee_id Int64,
-  salary UInt64,
-  rating Float64,
-  name String,
-  address String,
-  skills Array(String),
-  projects Array(String),
-  hire_date Date,
-  last_update DateTime
-);
-```
-
-### Step 2. Run bend-ingest-kafka
-
-Once the table is created, execute the bend-ingest-kafka command with the required parameters to initiate the data loading process. The command will start the data ingester, which continuously monitors your Kafka topic, consumes the data, and inserts it into the specified table in Databend.
-
-```bash
-bend-ingest-kafka \
-  --kafka-bootstrap-servers="127.0.0.1:9092,127.0.0.2:9092" \
-  --kafka-topic="Your Topic" \
-  --kafka-consumer-group="Consumer Group" \
-  --databend-dsn="http://root:[email protected]:8000" \
-  --databend-table="default.employee_data" \
-  --data-format="json" \
-  --batch-size=100000 \
-  --batch-max-interval=300s
-```
-
-| Parameter                 | Description                                                                                          |
-|---------------------------|------------------------------------------------------------------------------------------------------|
-| --kafka-bootstrap-servers | Comma-separated list of Kafka bootstrap servers to connect to.                                       |
-| --kafka-topic             | The Kafka topic from which the data will be ingested.                                                |
-| --kafka-consumer-group    | The consumer group for the Kafka consumer to join.                                                   |
-| --databend-dsn            | The Data Source Name (DSN) to connect to Databend. Format: `http(s)://username:password@host:port`.  |
-| --databend-table          | The target Databend table where the data will be inserted.                                           |
-| --data-format             | The format of the data being ingested.                                                               |
-| --batch-size              | The number of records per batch during ingestion.                                                    |
-| --batch-max-interval      | The maximum interval (in seconds) to wait before flushing a batch.                                   |
+- [Loading from Kafka with bend-ingest-kafka](/tutorials/load/kafka-bend-ingest-kafka)
+- [Loading from Kafka with databend-kafka-connect](/tutorials/load/kafka-databend-kafka-connect)
````

docs/en/tutorials/load/kafka-bend-ingest-kafka.md (new file)

Lines changed: 104 additions & 0 deletions

````diff
@@ -0,0 +1,104 @@
+---
+title: Loading from Kafka with bend-ingest-kafka
+---
+
+In this tutorial, we'll guide you through setting up a Kafka environment using Docker and loading messages from Kafka into Databend Cloud with [bend-ingest-kafka](https://github.com/databendcloud/bend-ingest-kafka).
+
+### Step 1: Setting up Kafka Environment
+
+Run the Apache Kafka Docker container on port 9092:
+
+```shell
+MacBook-Air:~ eric$ docker run -d \
+> --name kafka \
+> -p 9092:9092 \
+> apache/kafka:latest
+Unable to find image 'apache/kafka:latest' locally
+latest: Pulling from apache/kafka
+690e87867337: Pull complete
+5dddb19fae62: Pull complete
+86caa4220d9f: Pull complete
+7802c028acb4: Pull complete
+16a3d1421c02: Pull complete
+ab648c7f18ee: Pull complete
+a917a90b7df6: Pull complete
+4e446fc89158: Pull complete
+f800ce0fc22f: Pull complete
+a2e5e46262c3: Pull complete
+Digest: sha256:c89f315cff967322c5d2021434b32271393cb193aa7ec1d43e97341924e57069
+Status: Downloaded newer image for apache/kafka:latest
+0261b8f3d5fde74f5f20340b58cb85d29d9b40ee4f48f1df2c41a68b616d22dc
+```
+
+### Step 2: Create Topic & Produce Messages
+
+1. Access the Kafka container:
+
+```shell
+MacBook-Air:~ eric$ docker exec --workdir /opt/kafka/bin/ -it kafka sh
+```
+
+2. Create a new Kafka topic named `test-topic`:
+
+```shell
+/opt/kafka/bin $ ./kafka-topics.sh --bootstrap-server localhost:9092 --create --topic test-topic
+Created topic test-topic.
+```
+
+3. Produce messages to the test-topic using the Kafka console producer:
+
+```shell
+/opt/kafka/bin $ ./kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic
+```
+
+4. Enter messages in JSON format:
+
+```json
+{"id": 1, "name": "Alice", "age": 30}
+{"id": 2, "name": "Bob", "age": 25}
+```
+
+5. Stop the producer with Ctrl+C once done.
+
+### Step 3: Create Table in Databend Cloud
+
+Create the target table in Databend Cloud:
+
+```sql
+CREATE DATABASE doc;
+
+CREATE TABLE databend_topic (
+    id INT NOT NULL,
+    name VARCHAR NOT NULL,
+    age INT NOT NULL
+) ENGINE=FUSE;
+```
+
+### Step 4: Install & Run bend-ingest-kafka
+
+1. Install the bend-ingest-kafka tool by running the following command:
+
+```shell
+go install github.com/databendcloud/bend-ingest-kafka@latest
+```
+
+2. Run the following command to ingest messages from the `test-topic` Kafka topic into the target table in Databend Cloud:
+
+```shell
+MacBook-Air:~ eric$ bend-ingest-kafka \
+> --kafka-bootstrap-servers="localhost:9092" \
+> --kafka-topic="test-topic" \
+> --databend-dsn="<your-dsn>" \
+> --databend-table="doc.databend_topic" \
+> --data-format="json"
+INFO[0000] Starting worker worker-0
+WARN[0072] Failed to read message from Kafka: context deadline exceeded  kafka_batch_reader=ReadBatch
+2024/08/20 15:10:15 ingest 2 rows (1.225576 rows/s), 75 bytes (45.959100 bytes/s)
+```
+
+3. In Databend Cloud, verify that the data has been successfully loaded:
+
+![alt text](../../../../static/img/documents/tutorials/kafka-6.png)
````
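
For the final verification step in the new tutorial, a plain query works as well as the screenshot. A minimal sketch, assuming the `doc.databend_topic` table from Step 3 and the two messages produced in Step 2:

```sql
-- Confirm that both Kafka messages were ingested into the target table
SELECT * FROM doc.databend_topic ORDER BY id;

-- Expected result, based on the two JSON messages from Step 2:
-- 1  Alice  30
-- 2  Bob    25
```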

docs/en/tutorials/load/kafka-plugin.md renamed to docs/en/tutorials/load/kafka-databend-kafka-connect.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -1,5 +1,5 @@
 ---
-title: Loading Data from Kafka
+title: Loading from Kafka with databend-kafka-connect
 ---
 
 In this tutorial, we'll establish a connection between Kafka in Confluent Cloud and Databend Cloud using the Kafka Connect sink connector plugin, [databend-kafka-connect](https://github.com/databendcloud/databend-kafka-connect). Then, we'll demonstrate how to produce messages and load them into Databend Cloud.
@@ -10,7 +10,7 @@ Before you begin, ensure that your Kafka environment is properly set up in Confl
 
 1. Sign up for a free Confluent Cloud account. Once you've registered and created your account, [sign in](https://confluent.cloud/login) to your Confluent Cloud account.
 
-2. Follow the [Confluent Quick Start](https://docs.confluent.io/cloud/current/get-started/index.html#step-1-create-a-ak-cluster-in-ccloud) create and launch a basic Kafka cluster inside your default environment.
+2. Follow the [Confluent Quick Start](https://docs.confluent.io/cloud/current/get-started/index.html#step-1-create-a-ak-cluster-in-ccloud) to create and launch a basic Kafka cluster inside your default environment.
 
 3. Follow the [Install Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html) guide to install the Confluent CLI on your local machine. After installation, log in to your Confluent Cloud account to connect to Confluent Cloud:
````
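
The hunk above ends just before the login command itself, which is not shown in this diff. For readers following along, the Confluent CLI login step is typically run as below (a sketch, assuming a standard Confluent CLI installation; the exact invocation in the doc may differ):

```shell
# Log in to Confluent Cloud; --save stores credentials locally for later CLI calls
confluent login --save
```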

static/img/documents/tutorials/kafka-6.png (binary image, 219 KB)
