How to report additional info in bulk API?

In #67, we decided *not* to allow additional fields in serialization.
This means that any additional data we want to transmit needs to be _around_ a chunk, like:

```json
{
  "chunk": {
    "serializationFormatVersion": "2023.1",
    "languages": [ ... ],
    "nodes": [ ... ]
  },
  "other": "data",
  "more": true
}
```

We probably want to report additional data in the bulk API. Taking _retrieve_ as an example, messages might include "you've asked for an unknown node id" or "invalid depthLimit".
For _createPartitions_, we might send messages like "Partition node id already exists" or "Partition node id not reserved for this client".

How do we want to encode such messages?

## Option A: Just text messages
We just send plain text messages without any additional structure.

```json
{
  "chunk": { ... },
  "messages": [
    "requested node id unknown: 2134",
    "node 123 mentions child abc, but node abc mentions parent 456",
    "node 23 contains property {myLang@23|visible} several times (values: 'hello', 'true')"
  ],
  "resultCode": 123
}
```

Pro:
* Very simple
* Flexible
* Such messages usually stem from an implementation bug, so they're only useful for programmers.
   Programmers understand text; automatic processing is not necessary.

Con:
* Hard to process automatically
* How to distinguish "severity"? Asking for an unknown node id is ok, but trying to create a partition with an existing node id is a real issue.
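
To illustrate the "hard to process automatically" point: a client that wants to react to one particular message has to pattern-match on its wording. The following is a minimal TypeScript sketch, assuming the response shape from the example above. The names `TextMessageResponse` and `extractUnknownNodeIds` are made up for illustration and not part of any spec.

```typescript
// Hypothetical shape of an Option A response; `chunk` is left opaque here.
interface TextMessageResponse {
  chunk: unknown;
  messages: string[];
  resultCode: number;
}

// Fragile on purpose: this only works while the server keeps the exact wording.
function extractUnknownNodeIds(response: TextMessageResponse): string[] {
  const pattern = /^requested node id unknown: (\S+)$/;
  const ids: string[] = [];
  for (const message of response.messages) {
    const match = pattern.exec(message);
    const id = match === null ? undefined : match[1];
    if (id !== undefined) {
      ids.push(id);
    }
  }
  return ids;
}
```

Any rewording on the server side (or a localized message) silently breaks such a client, which is the core of this con.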

## Option B: Generic structured messages
The wire format has a very generic structure, and clients need to make sense of it.

```json
{
  "chunk": { ... },
  "success": true,
  "messages": [
    {
      "kind": "unknownNodeId",
      "message": "requested node id unknown: 2134",
      "data": [
        "2134"
      ]
    },
    {
      "kind": "invalidTree",
      "message": "node 123 mentions child abc, but node abc mentions parent 456",
      "data": {
        "parentId": "123",
        "childId": "abc",
        "child-parentId": "456"
      }
    },
    {
      "kind": "duplicateProperty",
      "message": "node 23 contains property {myLang@23|visible} several times (values: 'hello', 'true')",
      "data": [
        "23",
        "myLang",
        "23",
        "visible",
        "hello",
        "true"
      ]
    }
  ]
}
```

Pro:
* Wire format not too complex, can be verified generically
* Seems to work for [EMF](https://download.eclipse.org/modeling/emf/emf/javadoc/2.10.0/org/eclipse/emf/common/util/BasicDiagnostic.html)
* Allows automatic processing, somewhat independent from implementation

Con:
* Need to know each _kind_ to make sense of _data_
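
Both the "can be verified generically" pro and this con can be sketched in TypeScript. The sketch assumes that `data` is always either an array of strings or a flat string-to-string object, as in the example above; all type and function names are hypothetical.

```typescript
// Assumption: `data` is either a list of strings or a flat string map.
type MessageData = string[] | Record<string, string>;

interface GenericMessage {
  kind: string;
  message: string;
  data: MessageData;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isStringRecord(value: unknown): value is Record<string, string> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return Object.keys(record).every((key) => typeof record[key] === "string");
}

// Generic verification: works without knowing any concrete `kind`.
function isGenericMessage(value: unknown): value is GenericMessage {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["kind"] === "string" &&
    typeof candidate["message"] === "string" &&
    (isStringArray(candidate["data"]) || isStringRecord(candidate["data"]))
  );
}

// Interpretation: only possible if the client knows what `data` means for this `kind`.
function unknownNodeIds(messages: GenericMessage[]): string[] {
  const ids: string[] = [];
  for (const m of messages) {
    if (m.kind === "unknownNodeId" && Array.isArray(m.data)) {
      for (const id of m.data) {
        ids.push(id);
      }
    }
  }
  return ids;
}
```

A client that does not know a _kind_ can still validate the message and show its `message` text; it just cannot do anything useful with `data`.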

## Option C: Specifically structured messages
The wire format specifies every possible message in detail, with appropriate structure.

```json
{
  "chunk": { ... },
  "messages": [
    {
      "kind": "unknownNodeId",
      "message": "requested node id unknown: 2134",
      "nodeId": "2134"
    },
    {
      "kind": "invalidTree",
      "message": "node 123 mentions child abc, but node abc mentions parent 456",
      "parent-nodeId": "123",
      "parent-childId": "abc",
      "child-parentId": "456"
    },
    {
      "kind": "duplicateProperty",
      "message": "node 23 contains property {myLang@23|visible} several times (values: 'hello', 'true')",
      "nodeId": "23",
      "metaPointer": {
        "language": "myLang",
        "version": "23",
        "key": "visible"
      },
      "values": [
        "hello",
        "true"
      ]
    }
  ]
}
```

Pro:
* Easy to interpret
* Little ambiguity

Con:
* Verbose
* Inflates protocol (as we have to specify each message)
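
On the client side, this option maps naturally to a discriminated union. A minimal TypeScript sketch, assuming exactly the three message kinds from the example above; the type names and the `affectedNodeIds` helper are hypothetical.

```typescript
interface MetaPointer {
  language: string;
  version: string;
  key: string;
}

interface UnknownNodeIdMessage {
  kind: "unknownNodeId";
  message: string;
  nodeId: string;
}

interface InvalidTreeMessage {
  kind: "invalidTree";
  message: string;
  "parent-nodeId": string;
  "parent-childId": string;
  "child-parentId": string;
}

interface DuplicatePropertyMessage {
  kind: "duplicateProperty";
  message: string;
  nodeId: string;
  metaPointer: MetaPointer;
  values: string[];
}

// Every message the protocol specifies needs its own entry here.
type SpecificMessage =
  | UnknownNodeIdMessage
  | InvalidTreeMessage
  | DuplicatePropertyMessage;

// The compiler narrows `message` per case, so every field access is checked.
function affectedNodeIds(message: SpecificMessage): string[] {
  switch (message.kind) {
    case "unknownNodeId":
      return [message.nodeId];
    case "invalidTree":
      return [message["parent-nodeId"], message["parent-childId"]];
    case "duplicateProperty":
      return [message.nodeId];
    default: {
      // Compile-time exhaustiveness check: fails to compile if a kind is missing above.
      const unhandled: never = message;
      throw new Error("unhandled message: " + JSON.stringify(unhandled));
    }
  }
}
```

The union also shows the cost named above: each new message kind means a protocol change plus a new type and a new `case` in every client.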

## Option D: Use validation findings
We send a second chunk that contains validation finding nodes.

_Note:_ This chunk omits the nodes `hello-node-id` and `true-node-id` for brevity.

```json
{
  "chunk": { ... },
  "findings": {
    "serializationFormatVersion": "2023.1",
    "languages": [ ... ],
    "nodes": [
      {
        "id": "aaa",
        "classifier": {
          "language": "bulkApiLanguage",
          "version": "2024.1",
          "key": "unknownNodeId"
        },
        "properties": [
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "message"
            },
            "value": "requested node id unknown: 2134"
          },
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "nodeId"
            },
            "value": "2134"
          }
        ],
        "containments": [],
        "references": [],
        "annotations": [],
        "parent": "someId"
      },
      {
        "id": "bbb",
        "classifier": {
          "language": "bulkApiLanguage",
          "version": "2024.1",
          "key": "invalidTree"
        },
        "properties": [
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "message"
            },
            "value": "node 123 mentions child abc, but node abc mentions parent 456"
          },
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "parent-nodeId"
            },
            "value": "123"
          },
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "parent-childId"
            },
            "value": "abc"
          },
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "child-parentId"
            },
            "value": "456"
          }
        ],
        "containments": [],
        "references": [],
        "annotations": [],
        "parent": "someId"
      },
      {
        "id": "ccc",
        "classifier": {
          "language": "bulkApiLanguage",
          "version": "2024.1",
          "key": "duplicateProperty"
        },
        "properties": [
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "message"
            },
            "value": "node 23 contains property {myLang@23|visible} several times (values: 'hello', 'true')"
          },
          {
            "property": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "nodeId"
            },
            "value": "23"
          }
        ],
        "containments": [
          {
            "containment": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "duplicateValues"
            },
            "children": [
              "hello-node-id",
              "true-node-id"
            ]
          }
        ],
        "references": [
          {
            "reference": {
              "language": "bulkApiLanguage",
              "version": "2024.1",
              "key": "feature"
            },
            "targets": [
              {
                "resolveInfo": "visible",
                "reference": "id-visible"
              }
            ]
          }
        ],
        "annotations": [],
        "parent": "someId"
      }
    ]
  }
}
```

Pro:
* Nicely integrates with serialization findings language (https://github.com/LionWeb-io/lionweb-mps/pull/89)
* No additional data structure

Con:
* _very_ verbose
* We probably don't need lots of node features (e.g. unique ids, tree shape, annotations)
* Danger of XML (just because we _can_ express everything in our format doesn't mean we _should_)
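
To give a feeling for the client side of this option: even reading the message text means walking the serialized node's properties. A minimal TypeScript sketch, assuming only the node fields shown in the example above; the type names and helpers are hypothetical, and `bulkApiLanguage` is the placeholder language from the example.

```typescript
interface FindingMetaPointer {
  language: string;
  version: string;
  key: string;
}

interface FindingProperty {
  property: FindingMetaPointer;
  value: string | null;
}

// Only the parts of a serialized node that this sketch needs.
interface FindingNode {
  id: string;
  classifier: FindingMetaPointer;
  properties: FindingProperty[];
}

// Looks up a property value by the key of its meta-pointer.
function propertyValue(node: FindingNode, key: string): string | undefined {
  for (const p of node.properties) {
    if (p.property.key === key && p.value !== null) {
      return p.value;
    }
  }
  return undefined;
}

// Produces one "<kind>: <message>" line per finding node.
function summarizeFindings(nodes: FindingNode[]): string[] {
  return nodes.map((node) => {
    const text = propertyValue(node, "message");
    return node.classifier.key + ": " + (text === undefined ? "(no message)" : text);
  });
}
```

A client that already has a full deserializer would reuse it here instead, which is the main appeal of this option.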

