Commit ebf47fb

Merge pull request #10 from vectordotdev/csv-enrichment-schema
Add CSV enrichment example with schema
2 parents b2d7c25 + 5159fb3 commit ebf47fb

3 files changed: 66 additions and 15 deletions

csv-enrichment/README.md

Lines changed: 12 additions & 8 deletions
@@ -1,20 +1,24 @@
 # CSV enrichment in Vector
 
-**Note:** This feature is unreleased and still under active development. However, if you want to try it out, we would
-love any feedback!
+> **Note:** This feature is unreleased and still under active development. However, if you want to try it out, we would
+> love any feedback!
 
 ## How to start it
 
-`docker-compose up`
+```shell
+docker-compose up
+```
 
 ## What it does
 
-This demo shows an example of using CSV enrichment in Vector to enrich events with data from a CSV file.
+This demo shows an example of using CSV enrichment in Vector to enrich events with data from two CSV files.
 
-The CSV file ([`./data/users.csv`](./data/users.csv) contains a list of users with their phone numbers and addresses.
+The [`users.csv`](./data/users.csv) file contains a list of users with their phone numbers and addresses, while the
+[`codes.csv`](./data/codes.csv) file contains a list of common Unix exit codes (with code, tag, and message info).
 
-The `vector.toml` configuration contains:
+The [`vector.toml`](./vector.toml) configuration contains:
 
-* A `random` source in the `vector.toml` that simply generates some fake events using names that match users in the CSV file
-* A `remap` transform then parses these events and then looks up the corresponding record from the CSV file to enrich the event with more metadata
+* Two `generator` sources in the that generate fake events that can be enriched using the CSV files (user data and coded error messages
+respectively)
+* Two `remap` transforms that parse the events and then look up the corresponding records from the CSV files to enrich the events with more metadata
 * A `console` sink to print out the events

csv-enrichment/data/codes.csv

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+code,tag,message
+1,"EPERM","Operation not permitted"
+2,"ENOENT","No such file or directory"
+3,"ESRCH","No such process"
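The rows above are plain CSV, so each field arrives as a string unless a schema says otherwise; the `vector.toml` change in this commit declares `code` as an integer for that reason. The following Python sketch illustrates the effect of such a schema on the loaded rows. It is not Vector's implementation: `load_table`, `SCHEMA`, and `CODES_CSV` are hypothetical names used only for illustration.

```python
import csv
import io

# Contents of codes.csv as added in this commit.
CODES_CSV = '''code,tag,message
1,"EPERM","Operation not permitted"
2,"ENOENT","No such file or directory"
3,"ESRCH","No such process"
'''

# Hypothetical stand-in for the [enrichment_tables.codes.schema] section.
SCHEMA = {"code": int, "tag": str, "message": str}


def load_table(text, schema=None):
    """Parse CSV text into row dicts, converting columns named in `schema`."""
    converters = schema or {}
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Columns without a schema entry stay strings.
        rows.append({key: converters.get(key, str)(value) for key, value in raw.items()})
    return rows


untyped = load_table(CODES_CSV)        # every field is a string
typed = load_table(CODES_CSV, SCHEMA)  # `code` becomes an int
```

With the schema applied, a lookup keyed on the integer `1` from a parsed JSON event can match the first row. Without it, the stored value would be the string `"1"`.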

csv-enrichment/vector.toml

Lines changed: 50 additions & 7 deletions
@@ -1,9 +1,26 @@
+# User info table (all fields are string, thus no schema)
 [enrichment_tables.users]
 type = "file"
-file.path = "/var/lib/vector/data/users.csv"
-file.encoding.type = "csv"
 
-[sources.random]
+[enrichment_tables.users.file]
+path = "/var/lib/vector/data/users.csv"
+encoding = { type = "csv" }
+
+# Error codes table (with specified schema)
+[enrichment_tables.codes]
+type = "file"
+
+[enrichment_tables.codes.file]
+path = "/var/lib/vector/data/codes.csv"
+encoding = { type = "csv" }
+
+[enrichment_tables.codes.schema]
+code = "integer"
+tag = "string"
+message = "string"
+
+# Generate user info messages
+[sources.user_info_messages]
 type = "generator"
 format = "shuffle"
 lines = [
@@ -15,15 +32,41 @@ lines = [
 ]
 interval = 2
 
-[transforms.remap]
+# Generate coded error messages
+[sources.coded_error_messages]
+type = "generator"
+format = "shuffle"
+lines = [
+'{"code":1,"device_id":"e5ad503d","timestamp":"2021-10-18T15:35:09.158139Z"}',
+'{"code":2,"device_id":"a5b2401e","timestamp":"2021-10-18T15:35:28.517210Z"}',
+'{"code":3,"device_id":"b48f41aa","timestamp":"2021-10-18T15:35:37.846783Z"}'
+]
+interval = 2
+
+[transforms.remap_user_info]
 type = "remap"
-inputs = ["random"]
+inputs = ["user_info_messages"]
 source = """
 . = parse_json(.message) ?? {}
 . |= get_enrichment_table_record("users", { "last_name": .last_name, "first_name": .first_name }) ?? {}
 """
 
+[transforms.remap_coded_errors]
+type = "remap"
+inputs = ["coded_error_messages"]
+source = """
+. = parse_json!(.message)
+
+row, err = get_enrichment_table_record("codes", { "code": del(.code) })
+
+if err != null {
+log(err, level: "error")
+} else {
+. |= merge(., {"message": row.message, "tag": row.tag})
+}
+"""
+
 [sinks.console]
 type = "console"
-inputs = ["remap"]
-encoding.codec = "json"
+inputs = ["remap_*"]
+encoding = { codec = "json" }
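To make the control flow of the new `remap_coded_errors` transform easier to follow, here is a rough Python analogue of its parse, lookup, and merge steps. It is a sketch under assumptions, not Vector's behavior verbatim: `CODES`, `find_record`, and `enrich` are hypothetical names, and the "exactly one matching row" rule reflects my understanding of `get_enrichment_table_record` rather than anything stated in this commit.

```python
import json
import sys

# Hypothetical in-memory copy of the "codes" table, with `code` typed as an integer.
CODES = [
    {"code": 1, "tag": "EPERM", "message": "Operation not permitted"},
    {"code": 2, "tag": "ENOENT", "message": "No such file or directory"},
    {"code": 3, "tag": "ESRCH", "message": "No such process"},
]


def find_record(table, condition):
    """Return the single row whose fields equal `condition`, else raise LookupError."""
    matches = [row for row in table if all(row.get(k) == v for k, v in condition.items())]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one matching row, found {len(matches)}")
    return matches[0]


def enrich(line):
    """Parse a JSON event, look up its code, and merge in the tag and message."""
    event = json.loads(line)
    code = event.pop("code", None)  # analogous to del(.code): remove the field, keep its value
    try:
        row = find_record(CODES, {"code": code})
    except LookupError as err:
        # Analogous to log(err, level: "error"): report and pass the event through.
        print(f"error: {err}", file=sys.stderr)
        return event
    event.update({"message": row["message"], "tag": row["tag"]})
    return event
```

In this sketch, an event carrying `"code": 2` comes back with `tag` and `message` added and `code` removed. An event whose code has no matching row, or whose code is the string `"1"` rather than the integer `1`, is passed through without enrichment.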
