Commit 159aa43

AndreKurait, kolchfa-aws, and natebower authored
Add built-in Migration Assistant field data type transformation documentation (#10649)
* Add flattened type handling documentation for Migration Assistant Signed-off-by: Andre Kurait <[email protected]> * Add flattened type conversion link for MA Signed-off-by: Andre Kurait <[email protected]> * Add string type deprecation page for MA Signed-off-by: Andre Kurait <[email protected]> * Ad dense_vector to knn transformation Signed-off-by: Andre Kurait <[email protected]> * Update for dense_vector support Signed-off-by: Andre Kurait <[email protected]> * Cleanup vale comments Signed-off-by: Andre Kurait <[email protected]> * Update for vale Signed-off-by: Andre Kurait <[email protected]> * Add section on identifying if cluster has field type Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update 
_migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/index.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Update _migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector.md Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Apply suggestions from code review Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Improve documentation on KNN plugin for MA Transform Signed-off-by: Andre Kurait <[email protected]> * Improve MA documentation by linking to field type documentation in transforms documentation Signed-off-by: Andre Kurait <[email protected]> * Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: Andre Kurait <[email protected]> * Adjust wording on metadata migrations Signed-off-by: Andre Kurait <[email protected]> * Update title Transform string fields to text/keyword Signed-off-by: Andre Kurait <[email protected]> * Cleanup string to text logic Signed-off-by: Andre Kurait <[email protected]> --------- Signed-off-by: Andre Kurait <[email protected]> Signed-off-by: Andre Kurait <[email protected]> Co-authored-by: kolchfa-aws <[email protected]> Co-authored-by: Nathan Bower <[email protected]>
1 parent a212360 commit 159aa43

6 files changed

+551
-0
lines changed

_data/migration-scenarios.yml

Lines changed: 18 additions & 0 deletions
@@ -21,6 +21,12 @@ scenarios:
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-type-mapping-deprecation/"
   - title: "Handling breaking changes in field types"
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/"
+  - title: "Transform flattened to flat_object"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/"
+  - title: "Transform string to text/keyword"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword/"
+  - title: "Transform dense_vector to knn_vector"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/"
   - title: "Backfill"
     url: "/migration-assistant/migration-phases/backfill/"
   - title: "Teardown"
@@ -48,6 +54,12 @@ scenarios:
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-type-mapping-deprecation/"
   - title: "Handling breaking changes in field types"
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/"
+  - title: "Transform flattened to flat_object"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/"
+  - title: "Transform string to text/keyword"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword/"
+  - title: "Transform dense_vector to knn_vector"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/"
   - title: "Replay captured traffic"
     url: "/migration-assistant/migration-phases/replay-captured-traffic/"
   - title: "Reroute traffic from the Capture Proxy to the target"
@@ -81,6 +93,12 @@ scenarios:
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-type-mapping-deprecation/"
   - title: "Handling breaking changes in field types"
     url: "/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/"
+  - title: "Transform flattened to flat_object"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/"
+  - title: "Transform string to text/keyword"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword/"
+  - title: "Transform dense_vector to knn_vector"
+    url: "/migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/"
   - title: "Backfill"
     url: "/migration-assistant/migration-phases/backfill/"
   - title: "Replay captured traffic"

_migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes.md

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ The following script demonstrates how to perform common field type conversions,
7474
* Replacing the deprecated `string` type with `text`.
7575
* Converting `flattened` to `flat_object` and removing the `index` property if present.
7676

77+
7778
```javascript
7879
function main(context) {
7980
const rules = [

_migration-assistant/migration-phases/migrate-metadata/index.md

Lines changed: 7 additions & 0 deletions
@@ -175,6 +175,13 @@ There might be an error about being unable to update an ES 7.10.2 cluster, this
175175

176176
Metadata migration requires modifying data from the source to the target versions to recreate items. Sometimes these features are no longer supported and have been removed from the target version. Sometimes these features are not available in the target version, which is especially true when downgrading. While this tool is meant to make this process easier, it is not exhaustive in its support. When encountering a compatibility issue or an important feature gap for your migration, [search the issues and comment on the existing issue](https://github.com/opensearch-project/opensearch-migrations/issues) or [create a new](https://github.com/opensearch-project/opensearch-migrations/issues/new/choose) issue if one cannot be found.
177177

178+
For information about handling specific field type compatibility issues, see:
179+
- [Transform type mappings]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-type-mapping-deprecation/) -- Handle deprecated mapping types from Elasticsearch 6.x.
180+
- [Transform field types]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) -- Configure custom field type transformations.
181+
- [Transform `flattened` to `flat_object` fields]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/) -- Automatically transform `flattened` to `flat_object` fields.
182+
- [Transform `string` to `text`/`keyword` fields]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/transform-string-text-keyword/) -- Automatically transform `string` to `text`/`keyword` fields.
183+
- [Transform `dense_vector` to `knn_vector` fields]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/) -- Automatically transform `dense_vector` to `knn_vector` fields.
184+
178185
#### Deprecation of Mapping Types
179186

180187
In Elasticsearch 6.8 the mapping types feature was discontinued in Elasticsearch 7.0+ which has created complexity in migrating to newer versions of Elasticsearch and OpenSearch, [learn more](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html) ↗.

_migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector.md

Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
1+
---
2+
layout: default
3+
title: Transform dense_vector fields to knn_vector
4+
nav_order: 5
5+
parent: Migrate metadata
6+
grand_parent: Migration phases
7+
permalink: /migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/
8+
---
9+
10+
# Transform dense_vector fields to knn_vector
11+
12+
13+
This guide explains how Migration Assistant automatically handles the transformation of Elasticsearch's `dense_vector` field type to OpenSearch's `knn_vector` field type during migration.
14+
15+
## Overview
16+
17+
The `dense_vector` field type was introduced in Elasticsearch 7.x for storing dense vectors used in machine learning and similarity search applications. When migrating from Elasticsearch 7.x to OpenSearch, Migration Assistant automatically converts `dense_vector` fields to OpenSearch's equivalent `knn_vector` type.
18+
19+
This transformation includes mapping the vector configuration parameters and enabling the necessary OpenSearch k-NN plugin settings.
20+
21+
To determine whether an Elasticsearch cluster uses `dense_vector` field types, make a call to your source cluster's `GET /_mapping` API. In the migration console, run `console clusters curl source_cluster "/_mapping"`. If you see `"type":"dense_vector"`, then this transformation is applicable and these fields will be automatically transformed during migration.
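
The check described above can also be scripted against a saved `_mapping` response. The following is a minimal sketch, not part of Migration Assistant: `findFieldsByType` and `sampleMapping` are hypothetical names, and the code assumes the typeless mapping layout returned by Elasticsearch 7.x (`index` → `mappings` → `properties`).

```javascript
// Hypothetical helper: list every field of a given type in a GET /_mapping response.
function findFieldsByType(mappingResponse, fieldType) {
  const matches = [];

  // Recursively walk a "properties" object, tracking the dotted field path.
  function walk(properties, indexName, prefix) {
    for (const [name, definition] of Object.entries(properties || {})) {
      const path = prefix ? `${prefix}.${name}` : name;
      if (definition.type === fieldType) {
        matches.push({ index: indexName, field: path });
      }
      // Object and nested fields carry "properties"; multi-fields carry "fields".
      walk(definition.properties, indexName, path);
      walk(definition.fields, indexName, path);
    }
  }

  for (const [indexName, body] of Object.entries(mappingResponse)) {
    // Assumption: typeless mappings, as returned by Elasticsearch 7.x.
    walk(body.mappings && body.mappings.properties, indexName, "");
  }
  return matches;
}

// Sample response, made up for illustration.
const sampleMapping = {
  products: {
    mappings: {
      properties: {
        title: { type: "text" },
        embedding: { type: "dense_vector", dims: 128 },
        meta: { properties: { thumb: { type: "dense_vector", dims: 64 } } }
      }
    }
  }
};

console.log(findFieldsByType(sampleMapping, "dense_vector"));
```

Passing a different type name, such as `flattened`, reuses the same walk for the other built-in transformations.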
22+
23+
## Compatibility
24+
25+
The `dense_vector` to `knn_vector` transformation applies to:
26+
- **Source clusters**: Elasticsearch 7.x+
27+
- **Target clusters**: OpenSearch 1.x+
28+
- **Automatic conversion**: No configuration required
29+
30+
## Automatic conversion logic
31+
32+
Migration Assistant performs the following transformations when converting `dense_vector` to `knn_vector` fields.
33+
34+
### Field type transformation
35+
- Changes `type: "dense_vector"` to `type: "knn_vector"`
36+
- Maps `dims` parameter to `dimension`
37+
- Converts similarity metrics to OpenSearch space types
38+
- Configures the Hierarchical Navigable Small World (HNSW) algorithm with the Lucene engine
39+
40+
### Similarity mapping
41+
The transformation maps Elasticsearch similarity functions to OpenSearch space types:
42+
- `cosine` → `cosinesimil`
43+
- `dot_product` → `innerproduct`
44+
- `l2` (default) → `l2`
45+
46+
### Index settings
47+
When `dense_vector` fields are converted, Migration Assistant automatically performs the following operations:
48+
- Enables the k-NN plugin by setting `index.knn: true`
49+
- Ensures proper index configuration for vector search
50+
51+
## Migration output
52+
53+
During the migration process, you'll see this transformation in the output:
54+
55+
```
Transformations:
  dense_vector to knn_vector:
    Convert field data type dense_vector to OpenSearch knn_vector
```
60+
61+
## Transformation behavior
62+
63+
<table style="border-collapse: collapse; border: 1px solid #ddd;">
<thead>
<tr>
<th style="border: 1px solid #ddd; padding: 8px;">Source field type</th>
<th style="border: 1px solid #ddd; padding: 8px;">Target field type</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid #ddd; padding: 8px;">
<pre><code>{
  "properties": {
    "embedding": {
      "type": "dense_vector",
      "dims": 128,
      "similarity": "cosine"
    }
  }
}</code></pre>
</td>
<td style="border: 1px solid #ddd; padding: 8px;">
<pre><code>{
  "properties": {
    "embedding": {
      "type": "knn_vector",
      "dimension": 128,
      "method": {
        "name": "hnsw",
        "engine": "lucene",
        "space_type": "cosinesimil",
        "parameters": {
          "encoder": {
            "name": "sq"
          }
        }
      }
    }
  }
}</code></pre>
</td>
</tr>
</tbody>
</table>
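
To make the mapping in the table concrete, the following sketch reproduces the same conversion for a single field definition. It is an illustration only and not Migration Assistant's implementation: `toKnnVector` and `SPACE_TYPES` are hypothetical names, and the similarity pairs are copied from the similarity mapping list in this guide.

```javascript
// Similarity-to-space-type pairs taken from the list in this guide.
const SPACE_TYPES = new Map([
  ["cosine", "cosinesimil"],
  ["dot_product", "innerproduct"],
  ["l2", "l2"]
]);

// Hypothetical sketch of the field-level conversion shown in the table.
function toKnnVector(field) {
  if (field.type !== "dense_vector") {
    return field; // leave other field types untouched
  }
  const similarity = field.similarity || "l2"; // l2 is listed as the default
  if (!SPACE_TYPES.has(similarity)) {
    throw new Error(`Unsupported similarity: ${similarity}`);
  }
  return {
    type: "knn_vector",
    dimension: field.dims, // dims is renamed to dimension
    method: {
      name: "hnsw",
      engine: "lucene",
      space_type: SPACE_TYPES.get(similarity),
      parameters: { encoder: { name: "sq" } }
    }
  };
}

const sourceField = { type: "dense_vector", dims: 128, similarity: "cosine" };
console.log(JSON.stringify(toKnnVector(sourceField), null, 2));
```

Throwing on an unknown similarity is a choice made for this sketch; how Migration Assistant treats values outside the list is not stated here.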
106+
107+
### HNSW algorithm parameters
108+
109+
The transformation automatically configures the HNSW algorithm with the following options:
110+
- `engine`: `lucene` (OpenSearch default)
111+
- `encoder`: `sq` (scalar quantization for memory efficiency)
112+
- `method`: `hnsw` (approximate nearest neighbor search)
113+
114+
### Index options mapping
115+
116+
Elasticsearch `index_options` are mapped to OpenSearch HNSW parameters:
117+
- `m` → `m` (maximum number of connections per node)
118+
- `ef_construction` → `ef_construction` (size of dynamic candidate list)
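
As a rough illustration of the list above, the sketch below copies the two tuning values from a `dense_vector` field's `index_options` into an HNSW `parameters` object. The helper name `mapIndexOptions` is hypothetical, and placing the values under `method.parameters` is an assumption based on the OpenSearch `knn_vector` method definition, not a statement about Migration Assistant's internals.

```javascript
// Hypothetical helper: carry HNSW tuning values from dense_vector index_options
// into the parameters object of a knn_vector method definition.
function mapIndexOptions(indexOptions, parameters) {
  const result = { ...parameters }; // do not mutate the caller's object
  for (const key of ["m", "ef_construction"]) {
    if (indexOptions && indexOptions[key] !== undefined) {
      result[key] = indexOptions[key];
    }
  }
  return result;
}

// Example values, made up for illustration.
const indexOptions = { type: "hnsw", m: 32, ef_construction: 200 };
const hnswParameters = mapIndexOptions(indexOptions, { encoder: { name: "sq" } });
console.log(hnswParameters);
```

Fields without `index_options` keep whatever parameters were already set, so the engine's own defaults apply.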
119+
120+
### Index settings
121+
122+
When any `dense_vector` fields are converted, the following index setting is automatically added:
123+
124+
```json
{
  "settings": {
    "index.knn": true
  }
}
```
131+
132+
## Behavior differences
133+
134+
Migration Assistant automatically transforms all `dense_vector` fields during metadata migration. The k-NN plugin must be installed and enabled on the target OpenSearch cluster. Most OpenSearch distributions include the k-NN plugin, in which case no action is needed.
135+
136+
### Query compatibility
137+
138+
After migration, vector search queries need to be updated:
139+
- Elasticsearch uses `script_score` queries with vector functions.
140+
- OpenSearch uses native `knn` query syntax.
141+
142+
**Elasticsearch query example**:
143+
```json
{
  "query": {
    "script_score": {
      "query": {"match_all": {}},
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
        "params": {"query_vector": [0.1, 0.2, 0.3]}
      }
    }
  }
}
```
156+
157+
**OpenSearch query example**:
158+
```json
{
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.1, 0.2, 0.3],
        "k": 10
      }
    }
  }
}
```
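
When many client queries need the same rewrite, a small helper can build the `knn` body from the vector already present in a `script_score` request. This is a sketch under assumptions: `buildKnnQuery` is a hypothetical name, and the caller must choose `k`, which has no counterpart in the `script_score` form.

```javascript
// Hypothetical helper: build an OpenSearch knn query body for one vector field.
function buildKnnQuery(field, vector, k) {
  if (!Array.isArray(vector) || vector.length === 0) {
    throw new Error("vector must be a non-empty array of numbers");
  }
  if (!Number.isInteger(k) || k < 1) {
    throw new Error("k must be a positive integer");
  }
  return { query: { knn: { [field]: { vector: vector, k: k } } } };
}

// Reuse the query vector from the Elasticsearch example in this guide.
const elasticsearchQuery = {
  query: {
    script_score: {
      query: { match_all: {} },
      script: {
        source: "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
        params: { query_vector: [0.1, 0.2, 0.3] }
      }
    }
  }
};
const queryVector = elasticsearchQuery.query.script_score.script.params.query_vector;

console.log(JSON.stringify(buildKnnQuery("embedding", queryVector, 10), null, 2));
```

Because `script_score` scores every matching document exactly while HNSW-based `knn` search is approximate, result sets and scores may differ between the two forms.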
170+
171+
## Troubleshooting
172+
173+
If you encounter issues with `dense_vector` conversion:
174+
175+
1. **Verify the k-NN plugin** -- Ensure the k-NN plugin is installed and enabled on your target OpenSearch cluster:
176+
   ```json
   GET /_cat/plugins
   ```
179+
180+
2. **Check migration logs** -- Review the detailed migration logs for any warnings or errors:
181+
   ```bash
   tail /shared-logs-output/migration-console-default/*/metadata/*.log
   ```
184+
185+
3. **Validate mappings** -- After migration, verify that the field types have been correctly converted:
186+
   ```json
   GET /your-index/_mapping
   ```
189+
190+
4. **Test vector search** -- Verify that vector search functionality works with sample queries:
191+
   ```json
   POST /your-index/_search
   {
     "query": {
       "knn": {
         "embedding": {
           "vector": [0.1, 0.2, 0.3],
           "k": 5
         }
       }
     }
   }
   ```
204+
205+
5. **Monitor performance** -- Vector search performance may differ between Elasticsearch and OpenSearch. Monitor query performance and adjust HNSW parameters if needed.
206+
207+
## Related documentation
208+
209+
- [Transform field types documentation]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) -- Configure custom field type transformations.
210+
- [k-NN documentation]({{site.url}}{{site.baseurl}}/vector-search/vector-search-techniques/approximate-knn/) -- Approximate k-NN search documentation.

_migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
1+
---
2+
layout: default
3+
title: Transform flattened fields to flat_object
4+
nav_order: 3
5+
parent: Migrate metadata
6+
grand_parent: Migration phases
7+
permalink: /migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/
8+
---
9+
10+
# Transform flattened fields to flat_object
11+
12+
This guide explains how Migration Assistant automatically transforms the `flattened` field type during migration to OpenSearch.
13+
14+
## Overview
15+
16+
The `flattened` field type was introduced in Elasticsearch 7.3 as an X-Pack feature. It allows you to store an entire JSON object as a single field value, which can be useful for objects with a large or unknown number of unique keys.
17+
18+
When migrating to OpenSearch 2.7 or later, Migration Assistant automatically converts `flattened` field types to OpenSearch's equivalent `flat_object` type. This transformation requires no configuration or user intervention.
19+
20+
To determine whether an Elasticsearch cluster uses `flattened` field types, make a call to your source cluster's `GET /_mapping` API. In the migration console, run `console clusters curl source_cluster "/_mapping"`. If you see `"type":"flattened"`, then this transformation is applicable and these fields will be automatically transformed during migration.
21+
22+
## Compatibility
23+
24+
The `flattened` to `flat_object` field type transformation applies to:
25+
- **Source clusters**: Elasticsearch 7.3+
26+
- **Target clusters**: OpenSearch 2.7+
27+
- **Automatic conversion**: No configuration required during metadata migration
28+
29+
## Automatic migration
30+
31+
When migrating to OpenSearch 2.7 or later, Migration Assistant automatically detects `flattened` field types and converts them to `flat_object` fields. During the migration process, you'll see this transformation in the output:
32+
33+
```
Transformations:
  flattened to flat_object:
    Convert field data type flattened to OpenSearch flat_object
```
38+
39+
### Example transformation
40+
41+
<table style="border-collapse: collapse; border: 1px solid #ddd;">
<thead>
<tr>
<th style="border: 1px solid #ddd; padding: 8px;">Source field type</th>
<th style="border: 1px solid #ddd; padding: 8px;">Target field type</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid #ddd; padding: 8px;">
<pre><code>{
  "properties": {
    "labels": {
      "type": "flattened"
    },
    "title": {
      "type": "text"
    }
  }
}</code></pre>
</td>
<td style="border: 1px solid #ddd; padding: 8px;">
<pre><code>{
  "properties": {
    "labels": {
      "type": "flat_object"
    },
    "title": {
      "type": "text"
    }
  }
}</code></pre>
</td>
</tr>
</tbody>
</table>
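
The rewrite in the table can be sketched as a recursive walk over a mapping's `properties`. This is an illustration rather than Migration Assistant's code: `convertFlattened` is a hypothetical name, and dropping the `index` property follows the custom script described in the field type transformation guide, not a confirmed detail of the built-in transformation.

```javascript
// Hypothetical sketch: rewrite every flattened field in a mapping's properties.
function convertFlattened(properties) {
  const converted = {};
  for (const [name, definition] of Object.entries(properties)) {
    if (definition.type === "flattened") {
      // Assumption: drop "index", as the custom transformation script does.
      const { index, ...rest } = definition;
      converted[name] = { ...rest, type: "flat_object" };
    } else if (definition.properties) {
      // Recurse into object fields so nested flattened fields are also rewritten.
      converted[name] = {
        ...definition,
        properties: convertFlattened(definition.properties)
      };
    } else {
      converted[name] = definition; // other field types pass through unchanged
    }
  }
  return converted;
}

const sourceProperties = {
  labels: { type: "flattened" },
  title: { type: "text" }
};
console.log(JSON.stringify(convertFlattened(sourceProperties), null, 2));
```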
77+
78+
## Transformation behavior across versions
79+
80+
Migration Assistant automatically converts all `flattened` fields to `flat_object` fields. No additional configuration is required.
81+
82+
If you're migrating to an OpenSearch version earlier than 2.7, indexes containing `flattened` field types will fail to migrate. In that case, you have two options:
83+
84+
1. **Upgrade target cluster**: Upgrade your target OpenSearch cluster to version 2.7 or later to support the automatic conversion.
85+
86+
2. **Custom transformation**: Use the [field type transformation framework]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) to convert `flattened` to another supported type (for example, `object` or `nested`).
87+
88+
## Differences between flattened and flat_object
89+
90+
The `flat_object` type in OpenSearch provides functionality similar to Elasticsearch's `flattened` type, though minor behavioral differences may exist. The two types compare as follows:
91+
92+
- **Query syntax**: Both support dot notation for accessing nested fields.
93+
- **Performance**: Similar performance characteristics for indexing and searching.
94+
- **Storage**: Both store the entire object as a single Lucene field.
95+
- **Limitations**: Both have similar limitations on aggregations and sorting.
96+
97+
## Troubleshooting
98+
99+
If you encounter issues with `flattened` field migration:
100+
101+
1. **Verify target version** -- Ensure your target OpenSearch cluster is running version 2.7 or later.
102+
103+
2. **Check migration logs** -- Review the detailed migration logs for any warnings or errors:
104+
   ```bash
   cat /shared-logs-output/migration-console-default/*/metadata/*.log
   ```
107+
108+
3. **Validate mappings** -- After migration, verify that the field types have been correctly converted:
109+
   ```json
   GET /your-index/_mapping
   ```
112+
113+
## Related documentation
114+
115+
- [Transform field types documentation]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) -- Configure custom field type transformations.
116+
- [flat_object field type documentation]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/flat-object/) -- Learn about the `flat_object` field type.
