Commit bfcffe5

docs: simplify index refresh guidance
1 parent abcd5bf commit bfcffe5

9 files changed, +76 -53 lines changed

docs/cn/sql-reference/10-sql-commands/00-ddl/01-table/90-alter-table.md

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
 ---
 title: ALTER TABLE
 sidebar_position: 4
-slug: /cn/sql/sql-commands/ddl/table/alter-table
 ---
 
 import FunctionDescription from '@site/src/components/FunctionDescription';

docs/cn/sql-reference/10-sql-commands/00-ddl/07-aggregating-index/create-aggregating-index.md

Lines changed: 1 addition & 3 deletions
@@ -12,11 +12,9 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 ## Syntax
 
 ```sql
-CREATE [ OR REPLACE ] [ ASYNC ] AGGREGATING INDEX <index_name> AS SELECT ...
+CREATE [ OR REPLACE ] AGGREGATING INDEX <index_name> AS SELECT ...
 ```
 
-- `ASYNC` option: Adding ASYNC is optional. It allows the index to be created asynchronously, meaning the index is not built right away. To build it later, use the [REFRESH AGGREGATING INDEX](refresh-aggregating-index.md) command.
-
 - When creating an aggregating index, limit its usage to standard [aggregate functions](../../../20-sql-functions/07-aggregate-functions/index.md) (for example AVG, SUM, MIN, MAX, COUNT, and GROUP BY). Note that GROUPING SETS, [window functions](../../../20-sql-functions/08-window-functions/index.md), [LIMIT](../../20-query-syntax/01-query-select.md#limit-clause), and [ORDER BY](../../20-query-syntax/01-query-select.md#order-by-clause) are not supported; otherwise you will get the error: `Currently create aggregating index just support simple query, like: SELECT ... FROM ... WHERE ... GROUP BY ...`
 
 - The filter scope of the query defined when creating an aggregating index should match or encompass the scope of your actual queries.

docs/cn/sql-reference/10-sql-commands/00-ddl/07-aggregating-index/refresh-aggregating-index.md

Lines changed: 11 additions & 15 deletions
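@@ -7,34 +7,30 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 
 <FunctionDescription description="Introduced or updated: v1.2.151"/>
 
+By default, Databend maintains aggregating indexes automatically in `SYNC` mode. You only need to run `REFRESH AGGREGATING INDEX` when the table already contained data and the aggregating index was created afterward, to backfill those historical rows.
+
 ## Syntax
 
 ```sql
-REFRESH AGGREGATING INDEX <index_name> [ LIMIT <limit> ]
+REFRESH AGGREGATING INDEX <index_name>
 ```
 
-The "LIMIT" parameter lets you control the maximum number of blocks that each refresh operation can update. Using this parameter with a defined limit is strongly recommended to optimize memory usage. Note that setting a limit may result in partial data updates. For example, if you have 100 blocks but set a limit of 10, a single refresh might not update the most recent data and may leave some blocks unrefreshed. You may need to run multiple refresh operations to ensure a complete update.
-
-## When to Use REFRESH AGGREGATING INDEX
-
-- **When automatic updates fail:** If the default automatic update (`SYNC` mode) does not work properly, use `REFRESH AGGREGATING INDEX` to include any missed data in the index.
-- **For ASYNC indexes:** If an aggregating index is created with the `ASYNC` option, it is not updated automatically. You need to refresh it manually with `REFRESH AGGREGATING INDEX`.
-
 ## Examples
 
-This example creates and refreshes an aggregating index named *my_agg_index*
+The following example shows a table that is populated first, followed by creating an aggregating index and backfilling it with `REFRESH`
 
 ```sql
--- Prepare data
+-- Create the table and write data before the index is created
 CREATE TABLE agg(a int, b int, c int);
 INSERT INTO agg VALUES (1,1,4), (1,2,1), (1,2,4);
 
--- Create an aggregating index
+-- Declare the aggregating index (existing data is not indexed yet)
 CREATE AGGREGATING INDEX my_agg_index AS SELECT MIN(a), MAX(c) FROM agg;
 
--- Insert new data
-INSERT INTO agg VALUES (2,2,5);
-
--- Refresh the aggregating index
+-- Backfill historical data
 REFRESH AGGREGATING INDEX my_agg_index;
+
+-- Data written after the index is created is synchronized automatically
+INSERT INTO agg VALUES (2,2,5);
+-- SYNC mode keeps the index up to date automatically
 ```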

docs/cn/sql-reference/10-sql-commands/00-ddl/07-inverted-index/refresh-inverted-index.md

Lines changed: 13 additions & 5 deletions
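@@ -7,10 +7,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 
 <FunctionDescription description="Introduced or updated: v1.2.405"/>
 
-Refreshes an inverted index in Databend. An inverted index requires a refresh in the following scenarios:
-
-- When data is inserted into the table before the inverted index is created, the inverted index must be refreshed manually after creation so that the inserted data is indexed effectively.
-- When the inverted index encounters issues or becomes corrupted, it needs to be refreshed. If the inverted index breaks because the inverted index files of some blocks are corrupted, a query such as `where match(body, 'wiki')` returns an error. In that case, refresh the inverted index to resolve the issue.
+In the default `SYNC` mode, inverted indexes are refreshed automatically as new data is written. You only need to run `REFRESH INVERTED INDEX` when the table already contained data before the index was created and those historical rows need to be backfilled
 
 ## Syntax
 
@@ -25,6 +22,17 @@ REFRESH INVERTED INDEX <index> ON [<database>.]<table> [LIMIT <limit>]
 ## Examples
 
 ```sql
--- Refresh the inverted index named "customer_feedback_idx" on the table "customer_feedback"
+-- The table already holds data written before the index was created
+CREATE TABLE IF NOT EXISTS customer_feedback(id INT, body STRING);
+INSERT INTO customer_feedback VALUES
+(1, 'Great coffee beans'),
+(2, 'Needs fresh roasting');
+
+-- The inverted index is created only afterward
+CREATE INVERTED INDEX customer_feedback_idx ON customer_feedback(body);
+
+-- Backfill historical data with REFRESH
 REFRESH INVERTED INDEX customer_feedback_idx ON customer_feedback;
+
+-- Subsequent writes are refreshed automatically in SYNC mode
 ```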

docs/cn/sql-reference/10-sql-commands/00-ddl/07-ngram-index/refresh-ngram-index.md

Lines changed: 13 additions & 3 deletions
@@ -7,7 +7,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 
 <FunctionDescription description="Introduced or updated: v1.2.726"/>
 
-Refreshes an existing NGRAM index on a table
+By default, NGRAM indexes are refreshed automatically when data is written. You only need to run `REFRESH NGRAM INDEX` when the index must be backfilled for historical data
 
 ## Syntax
 
@@ -18,8 +18,18 @@ ON [<database>.]<table_name>;
 
 ## Examples
 
-The following example refreshes the `idx1` index on the `amazon_reviews_ngram` table:
-
 ```sql
+-- The table already holds data written before the index was created
+CREATE TABLE IF NOT EXISTS amazon_reviews_ngram(review_id INT, review STRING);
+INSERT INTO amazon_reviews_ngram VALUES
+(1, 'coffee beans from Colombia'),
+(2, 'best roasting kit');
+
+-- The NGRAM index is declared only afterward
+CREATE NGRAM INDEX idx1 ON amazon_reviews_ngram(review) WITH (ngram_size = 3);
+
+-- Refresh to backfill historical data
 REFRESH NGRAM INDEX idx1 ON amazon_reviews_ngram;
+
+-- Subsequent writes are refreshed automatically in SYNC mode
 ```

docs/en/sql-reference/10-sql-commands/00-ddl/07-aggregating-index/create-aggregating-index.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,11 +12,9 @@ Create a new aggregating index in Databend.
1212
## Syntax
1313

1414
```sql
15-
CREATE [ OR REPLACE ] [ ASYNC ] AGGREGATING INDEX <index_name> AS SELECT ...
15+
CREATE [ OR REPLACE ] AGGREGATING INDEX <index_name> AS SELECT ...
1616
```
1717

18-
- `ASYNC` Option: Adding ASYNC is optional. It allows the index to be created asynchronously. This means the index isn't built right away. To build it later, use the [REFRESH AGGREGATING INDEX](refresh-aggregating-index.md) command.
19-
2018
- When creating aggregating indexes, limit their usage to standard [Aggregate Functions](../../../20-sql-functions/07-aggregate-functions/index.md) (e.g., AVG, SUM, MIN, MAX, COUNT and GROUP BY), while keeping in mind that GROUPING SETS, [Window Functions](../../../20-sql-functions/08-window-functions/index.md), [LIMIT](../../20-query-syntax/01-query-select.md#limit-clause), and [ORDER BY](../../20-query-syntax/01-query-select.md#order-by-clause) are not accepted, or you will get an error: `Currently create aggregating index just support simple query, like: SELECT ... FROM ... WHERE ... GROUP BY ...`.
2119

2220
- The query filter scope defined when creating aggregating indexes should either match or encompass the scope of your actual queries.
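
Review note: the restriction in the bullets above can be illustrated with a minimal sketch. It assumes the `agg` table from the refresh example in this commit; the index names `ok_idx` and `bad_idx` are hypothetical and do not appear in the changed files.

```sql
-- Assumed setup: same table shape as the refresh example in this commit.
CREATE TABLE agg(a int, b int, c int);

-- Simple aggregates with GROUP BY: the shape the docs describe as supported.
-- (ok_idx is a hypothetical name.)
CREATE AGGREGATING INDEX ok_idx AS SELECT b, SUM(a), MAX(c) FROM agg GROUP BY b;

-- ORDER BY is listed as not accepted, so this statement should be rejected
-- with the "just support simple query" error quoted above.
-- (bad_idx is a hypothetical name.)
CREATE AGGREGATING INDEX bad_idx AS SELECT b, SUM(a) FROM agg GROUP BY b ORDER BY b;
```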

docs/en/sql-reference/10-sql-commands/00-ddl/07-aggregating-index/refresh-aggregating-index.md

Lines changed: 11 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -7,34 +7,30 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
77

88
<FunctionDescription description="Introduced or updated: v1.2.151"/>
99

10+
Databend automatically maintains aggregating indexes in `SYNC` mode as new data is ingested. Run `REFRESH AGGREGATING INDEX` when you introduce an index on a table that already contains data so earlier rows are backfilled.
11+
1012
## Syntax
1113

1214
```sql
13-
REFRESH AGGREGATING INDEX <index_name> [ LIMIT <limit> ]
15+
REFRESH AGGREGATING INDEX <index_name>
1416
```
1517

16-
The "LIMIT" parameter allows you to control the maximum number of blocks that can be updated with each refresh action. It is strongly recommended to use this parameter with a defined limit to optimize memory usage. Please also note that setting a limit may result in partial data updates. For example, if you have 100 blocks but set a limit of 10, a single refresh might not update the most recent data, potentially leaving some blocks unrefreshed. You may need to execute multiple refresh actions to ensure a complete update.
17-
18-
## When to Use REFRESH AGGREGATING INDEX
19-
20-
- **When Automatic Updates Fail:** In cases where the default automatic updates (`SYNC` mode) do not work properly, use `REFRESH AGGREGATING INDEX` to include any missed data in the index.
21-
- **For ASYNC Indexes:** If aggregating index is created with the `ASYNC` option, it won't update automatically. You need to manually refresh it using `REFRESH AGGREGATING INDEX`.
22-
2318
## Examples
2419

25-
This example creates and refreshes an aggregating index named *my_agg_index*:
20+
This example creates an aggregating index on a table that already contains data, then runs `REFRESH` once to backfill those rows:
2621

2722
```sql
28-
-- Prepare data
23+
-- Prepare a table and load data before the index exists
2924
CREATE TABLE agg(a int, b int, c int);
3025
INSERT INTO agg VALUES (1,1,4), (1,2,1), (1,2,4);
3126

32-
-- Create an aggregating index
27+
-- Declare the aggregating index (existing rows are not indexed yet)
3328
CREATE AGGREGATING INDEX my_agg_index AS SELECT MIN(a), MAX(c) FROM agg;
3429

35-
-- Insert new data
36-
INSERT INTO agg VALUES (2,2,5);
37-
38-
-- Refresh the aggregating index
30+
-- Backfill previously inserted rows
3931
REFRESH AGGREGATING INDEX my_agg_index;
32+
33+
-- Insert new data after the index exists (no manual refresh needed)
34+
INSERT INTO agg VALUES (2,2,5);
35+
-- SYNC mode keeps the index current automatically
4036
```
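
Review note: one possible way to check the backfill after running the example above is to issue a query with the same shape as the index definition and inspect its plan. This is a sketch only: it assumes the `agg` table and `my_agg_index` from the example, and how (or whether) the plan reports index usage depends on the Databend version and edition.

```sql
-- Query whose shape matches the index definition (MIN(a), MAX(c) over agg).
SELECT MIN(a), MAX(c) FROM agg;

-- Inspect the plan to see whether the aggregating index is considered.
-- The exact plan text is version-dependent, so no expected output is shown.
EXPLAIN SELECT MIN(a), MAX(c) FROM agg;
```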

docs/en/sql-reference/10-sql-commands/00-ddl/07-inverted-index/refresh-inverted-index.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
77

88
<FunctionDescription description="Introduced or updated: v1.2.405"/>
99

10-
Refreshes an inverted index in Databend. An inverted index requires refresh in the following scenarios:
11-
12-
- When data is inserted into the table before creating the inverted index, manual refreshing of the inverted index is necessary post-creation to effectively index the inserted data.
13-
- When the inverted index encounters issues or becomes corrupted, it needs to be refreshed. If the inverted index breaks due to certain blocks' inverted index files being corrupted, a query such as `where match(body, 'wiki')` will return an error. In such instances, you need to refresh the inverted index to fix the issue.
10+
Databend automatically refreshes inverted indexes in `SYNC` mode whenever new data is written. Use `REFRESH INVERTED INDEX` primarily to backfill rows that existed before the index was declared.
1411

1512
## Syntax
1613

@@ -25,6 +22,17 @@ REFRESH INVERTED INDEX <index> ON [<database>.]<table> [LIMIT <limit>]
2522
## Examples
2623

2724
```sql
28-
-- Refresh an inverted index named "customer_feedback_idx" for the table "customer_feedback"
25+
-- Existing table with data loaded before the index was declared
26+
CREATE TABLE IF NOT EXISTS customer_feedback(id INT, body STRING);
27+
INSERT INTO customer_feedback VALUES
28+
(1, 'Great coffee beans'),
29+
(2, 'Needs fresh roasting');
30+
31+
-- Create the inverted index afterward
32+
CREATE INVERTED INDEX customer_feedback_idx ON customer_feedback(body);
33+
34+
-- Backfill historical rows so the index covers earlier inserts
2935
REFRESH INVERTED INDEX customer_feedback_idx ON customer_feedback;
36+
37+
-- Future inserts refresh automatically in SYNC mode
3038
```
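
Review note: after the backfill, a full-text query over the indexed column is the natural follow-up. The sketch below reuses the `match(body, ...)` form that the removed text of this file already mentions; the search term is an assumption chosen to correspond to one of the sample rows, and tokenizer behavior is left to Databend's defaults.

```sql
-- Full-text search over the backfilled rows of customer_feedback.
-- 'coffee' is an illustrative term taken from the sample data above.
SELECT id, body
FROM customer_feedback
WHERE MATCH(body, 'coffee');
```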

docs/en/sql-reference/10-sql-commands/00-ddl/07-ngram-index/refresh-ngram-index.md

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
77

88
<FunctionDescription description="Introduced or updated: v1.2.726"/>
99

10-
Refresh an existing NGRAM index from a table.
10+
Databend automatically refreshes NGRAM indexes when data is ingested. Use `REFRESH NGRAM INDEX` when you need to backfill data that existed before the index was defined.
1111

1212
## Syntax
1313

@@ -18,8 +18,18 @@ ON [<database>.]<table_name>;
1818

1919
## Examples
2020

21-
The following example refreshes the `idx1` index from the `amazon_reviews_ngram` table:
22-
2321
```sql
22+
-- Table already populated before the NGRAM index exists
23+
CREATE TABLE IF NOT EXISTS amazon_reviews_ngram(review_id INT, review STRING);
24+
INSERT INTO amazon_reviews_ngram VALUES
25+
(1, 'coffee beans from Colombia'),
26+
(2, 'best roasting kit');
27+
28+
-- Declare the NGRAM index afterward
29+
CREATE NGRAM INDEX idx1 ON amazon_reviews_ngram(review) WITH (ngram_size = 3);
30+
31+
-- Refresh so the pre-existing rows are indexed
2432
REFRESH NGRAM INDEX idx1 ON amazon_reviews_ngram;
33+
34+
-- Subsequent inserts refresh automatically in SYNC mode
2535
```
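
Review note: a substring search is the kind of query an NGRAM index is meant to help with, so a follow-up like the sketch below could round out the example. It assumes the table and index from the example above; that the search pattern should be at least as long as the configured n-gram size for the index to apply is an assumption, not something this commit states.

```sql
-- Substring search over the backfilled rows.
-- The pattern 'coffee' (6 characters) is longer than the n-gram size of 3
-- used in the example; whether the index is actually used is not shown here.
SELECT review_id, review
FROM amazon_reviews_ngram
WHERE review LIKE '%coffee%';
```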
