feat(input_schema): Enable sub-schemas in input-schema (#519)
# Input sub-schemas
> ~~**Note**: This is a proposal, not a final implementation intended
for immediate merging in this state. The purpose of this PR is to
present a suggested solution for discussion and approval, not the
completed implementation.~~
The PR is ready to be merged. A few changes from the original proposal
were made based on the discussion here:
- validation that the sub-schema is compatible with the editor (e.g. a
sub-schema for the `keyValue` editor must define an object with two
string properties, `key` and `value`, and nothing else)
- the `schemaBased` editor can be used only in root properties
## 🎯 Goal
The goal of this proposal is to enable creators to define sub-schemas
within an Actor's input schema for fields of type `array` and `object`.
These field types would support specifying their inner structure, which
would be used both for input validation and for rendering the
corresponding (sub)fields in the input UI form.
## 📝 Solution
The proposed solution leverages "native" features of JSON Schema,
utilizing its properties such as `properties`, `required`,
`additionalProperties` (for object) and `items` (for array).
As a result, creators would be able to define an input schema like
this:
```
{
"title": "Apify Actor input schema example",
"type": "object",
"schemaVersion": 1,
"properties": {
"my-object": {
"type": "object",
"title": "My object",
"editor": "schemaBased",
"properties": {
"key1": {
"type": "string",
"title": "Key 1",
"description": "Description",
"editor": "textfield"
},
"key2": {
"type": "integer",
"title": "Key 2",
"description": "Description",
"editor": "integer"
}
},
"required": ["key1"],
"additionalProperties": false
},
"my-array": {
"type": "array",
"title": "My array",
"editor": "json",
"items": {
"type": "object",
"properties": {
"key1": {
"type": "string",
"title": "Key 1",
"description": "Description",
"editor": "textfield"
},
"key2": {
"type": "integer",
"title": "Key 2",
"description": "Description",
"editor": "integer"
}
},
"required": ["key1"],
"additionalProperties": false
}
}
},
"required": []
}
```
An Actor with a schema like this would then accept input like:
```
{
"my-object": {
"key1": "test",
"key2": 123
},
"my-array": [
{
"key1": "test",
"key2": 123
},
{
"key1": "test"
}
]
}
```
### Recursiveness
The schema should support recursion, allowing creators to define nested
objects within other objects. This enables complex and deeply structured
input definitions as needed.
```
{
"title": "Apify Actor input schema example",
"type": "object",
"schemaVersion": 1,
"properties": {
"my-object": {
"type": "object",
"title": "My object",
"editor": "schemaBased",
"properties": {
"key1": {
"type": "string",
"title": "Key 1",
"description": "Description",
"editor": "textfield"
},
"key2": {
"type": "object",
"title": "Key 2",
"description": "Description",
"editor": "schemaBased",
"properties": {
"subKey": {
"type": "string",
"title": "SubKey",
"description": "Description",
"editor": "textfield"
}
}
}
},
"required": ["key1"],
"additionalProperties": false
}
},
"required": []
}
```
The same goes for arrays:
```
{
"title": "Apify Actor input schema example",
"type": "object",
"schemaVersion": 1,
"properties": {
"my-array": {
"type": "array",
"title": "My array",
"editor": "schemaBased",
"items": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"required": []
}
```
## 👨‍💻 Implementation
### JSON Schema
At the JSON Schema level, the implementation is relatively
straightforward and can basically follow the approach used in this
PR. The proposed changes include:
- **Creating new definitions for each property type** - some properties
used in the root schema don’t make sense within a sub-schema context.
Therefore, instead of reusing the root definitions with complex
conditions, it’s simpler to create tailored definitions for sub-schemas.
- **Extending the `object` and `array` definitions with new
properties**:
  - The `object` type can include:
    - `properties` - defines the internal structure of the object. It
    supports all property types available at the root input schema level
    (with the restrictions mentioned above).
    - `additionalProperties` - specifies whether properties not listed in
    `properties` are allowed.
    - `required` - lists which of the properties defined in `properties`
    are required.

    ```
    {
      "type": "object",
      "properties": {
        "key": {
          ...
        }
      },
      "additionalProperties": false,
      "required": ["key"]
    }
    ```
  - The `array` type can include:
    - `items` - defines the type (and optionally the shape) of array items.

    ```
    {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {...}
      }
    }
    ```
### Validation
#### Actor's Input Schema
Validation would almost work out of the box with the updated
`schema.json` file, but a few additional steps are required:
- Since we validate all properties one by one against the schema (using
the `validateProperties` function), we also need to validate
sub-properties. Root-level properties and sub-properties have to be
validated against different sets of definitions. This should be
straightforward to implement.
- The logic in `parseAjvError` needs to be updated to correctly display
the path to the relevant (sub)property in validation errors. Again, this
is not expected to be complex.
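To make the path handling concrete, here is a minimal sketch of how an error location inside a nested input schema could be turned into a readable field path. It assumes Ajv v8-style errors, where `instancePath` is a JSON Pointer into the validated document (here, the Actor's input schema). The helper name `describeSchemaErrorPath` and the output format are hypothetical and are not taken from the existing `parseAjvError` code.

```typescript
// Hypothetical helper: splits a JSON Pointer such as
// "/properties/my-object/properties/key1/editor" into the dotted path of the
// (sub)property and the schema keyword the error refers to.
function describeSchemaErrorPath(instancePath: string): { fieldPath: string; keyword: string } {
    const segments = instancePath
        .split("/")
        .slice(1) // drop the empty string before the leading "/"
        .map((s) => s.replace(/~1/g, "/").replace(/~0/g, "~")); // JSON Pointer unescaping

    const field: string[] = [];
    let i = 0;
    while (i < segments.length) {
        const segment = segments[i];
        const next = segments[i + 1];
        if (segment === "properties" && next !== undefined) {
            // "properties/<name>" descends into a named (sub)property
            field.push(next);
            i += 2;
        } else if (segment === "items") {
            // "items" descends into the array item sub-schema
            field.push("items");
            i += 1;
        } else {
            break; // the rest is the keyword inside the property definition
        }
    }
    return { fieldPath: field.join("."), keyword: segments.slice(i).join(".") };
}

const location = describeSchemaErrorPath(
    "/properties/my-object/properties/key2/properties/subKey/editor",
);
// location → { fieldPath: "my-object.key2.subKey", keyword: "editor" }
```

Consuming the name right after each `properties` segment keeps a sub-property that is itself called `properties` or `items` from being mistaken for a keyword.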
#### Input
Because all newly added properties (`properties`,
`additionalProperties`, `required` and `items`) are native features of
JSON Schema, input validation against the Actor's input schema will work
entirely out of the box.
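For readers less familiar with these keywords, the toy validator below illustrates what they mean for the example schema from the Solution section. It is only a sketch: real input validation is delegated to the JSON Schema validator, and the `SubSchema` type, the `validateValue` function and the error messages are all made up for this illustration.

```typescript
// Toy validator covering only `type`, `properties`, `required`,
// `additionalProperties` and `items`. Illustration only, not production code.
type SubSchema = {
    type: "string" | "integer" | "object" | "array";
    properties?: Record<string, SubSchema>;
    required?: string[];
    additionalProperties?: boolean;
    items?: SubSchema;
};

function validateValue(value: unknown, schema: SubSchema, path: string): string[] {
    if (schema.type === "string") {
        return typeof value === "string" ? [] : [`${path} must be a string`];
    }
    if (schema.type === "integer") {
        return typeof value === "number" && Number.isInteger(value) ? [] : [`${path} must be an integer`];
    }
    if (schema.type === "array") {
        if (!Array.isArray(value)) return [`${path} must be an array`];
        const itemErrors: string[] = [];
        const itemSchema = schema.items;
        if (itemSchema) {
            // Every item is checked against the same `items` sub-schema
            value.forEach((item: unknown, index: number) => {
                itemErrors.push(...validateValue(item, itemSchema, `${path}.${index}`));
            });
        }
        return itemErrors;
    }
    // Remaining case: schema.type === "object"
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
        return [`${path} must be an object`];
    }
    const obj = value as Record<string, unknown>;
    const properties: Record<string, SubSchema> = schema.properties ?? {};
    const errors: string[] = [];
    for (const key of schema.required ?? []) {
        if (!(key in obj)) errors.push(`${path}.${key} is required`);
    }
    for (const key of Object.keys(obj)) {
        const propertySchema = properties[key];
        if (propertySchema) {
            errors.push(...validateValue(obj[key], propertySchema, `${path}.${key}`));
        } else if (schema.additionalProperties === false) {
            errors.push(`${path}.${key} is not allowed`);
        }
    }
    return errors;
}

// The `my-array` sub-schema from the example above (UI-only keys omitted)
const myArraySchema: SubSchema = {
    type: "array",
    items: {
        type: "object",
        properties: { key1: { type: "string" }, key2: { type: "integer" } },
        required: ["key1"],
        additionalProperties: false,
    },
};

const validErrors = validateValue([{ key1: "test", key2: 123 }, { key1: "test" }], myArraySchema, "my-array");
const invalidErrors = validateValue([{ key2: "x", extra: true }], myArraySchema, "my-array");
// validErrors is empty; invalidErrors reports the missing `key1`,
// the non-integer `key2` and the disallowed `extra` property.
```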
### Input UI
In the Input UI, we want to give creators the flexibility to render each
sub-field of a complex input field independently.
A proof of concept has already been implemented (currently only for
object fields), and there are no major blockers to a full
implementation. You can see the draft PR here:
apify/apify-core#21564
> Note: The code in the PR is intentionally minimal, not optimal and not
production-ready. Its purpose is to validate the approach.
Creators should have the option to choose how a field is rendered:
- Use an existing editor (e.g. `json`), in which case the sub-schema is
used solely for validation.
- Or render each sub-property as an individual input field based on the
sub-schema.
To support the latter, we need to introduce a new editor type that
signals sub-schema-based rendering. I’ve tentatively called this editor
`schemaBased`, but the name is open for discussion.
For arrays using sub-schemas with the `schemaBased` editor, we’ll need a
UI component that includes controls to add, remove, and optionally
reorder items.
**Note**: Based on the discussion below, we decided to limit the
`schemaBased` editor to root-level properties.
**Technical Implementation Notes**
- The main change in the Input UI will be to recursively render
sub-fields when the `schemaBased` editor is used.
- We’ll use dot notation (e.g. `field.subField`) for Formik field names to
ensure proper binding. Formik handles this automatically.
- We'll also need to support labels, descriptions, and other metadata for
sub-fields, but this should be relatively straightforward.
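The notes above can be sketched as follows: a recursive walk that derives dot-notation field names from a sub-schema, plus a small lookup that resolves such a name against the form values. This follows the original recursive proposal and is only an illustration. `FieldSchema`, `collectFieldNames` and `getByPath` are hypothetical names, and in the real UI the value lookup would be left to Formik.

```typescript
// Hypothetical, trimmed-down shape of a property definition
type FieldSchema = {
    type: string;
    editor?: string;
    properties?: Record<string, FieldSchema>;
};

// Returns the dot-notation names of the fields that would be rendered.
// Objects using the `schemaBased` editor are expanded into their sub-fields;
// everything else is rendered as a single field.
function collectFieldNames(name: string, schema: FieldSchema): string[] {
    const properties = schema.properties;
    if (schema.editor !== "schemaBased" || !properties) return [name];
    const names: string[] = [];
    for (const key of Object.keys(properties)) {
        const child = properties[key];
        if (child) names.push(...collectFieldNames(`${name}.${key}`, child));
    }
    return names;
}

// Resolves a dot-notation name against nested form values
function getByPath(values: unknown, path: string): unknown {
    let current: unknown = values;
    for (const key of path.split(".")) {
        if (typeof current !== "object" || current === null) return undefined;
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

const myObjectSchema: FieldSchema = {
    type: "object",
    editor: "schemaBased",
    properties: {
        key1: { type: "string", editor: "textfield" },
        key2: {
            type: "object",
            editor: "schemaBased",
            properties: { subKey: { type: "string", editor: "textfield" } },
        },
    },
};

const fieldNames = collectFieldNames("my-object", myObjectSchema);
// fieldNames → ["my-object.key1", "my-object.key2.subKey"]
const subKeyValue = getByPath({ "my-object": { key1: "test", key2: { subKey: "x" } } }, "my-object.key2.subKey");
// subKeyValue → "x"
```

One caveat of this naming scheme is that a property key containing a literal `.` would be ambiguous and would need separate handling.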
## ❓ Limitations and open questions
### Root-level properties
We are effectively reusing the existing "root-level" property
definitions within sub-schemas. However, not all root-level properties
make sense in the context of a sub-schema. Specifically, the following
properties are questionable:
- `default`, `prefill` and `example` - These are better suited for the
root property that contains the entire object or array. Applying them to
individual sub-fields could lead to unexpected or inconsistent behavior.
- `sectionCaption` and `sectionDescription` - These introduce structural
elements (sections) and may not make sense inside nested sub-schemas. We
should consider either removing them entirely from sub-schemas or
revisiting their design from a UI/UX perspective (e.g. nested sections).
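If these keys end up being disallowed in sub-schemas, a check along the following lines could report where they are used. This is a sketch under that assumption; the function name `findRootOnlyKeys` and the reported path format are hypothetical, and the tailored sub-schema definitions in `schema.json` could enforce the same rule declaratively.

```typescript
// Keys listed above as questionable inside sub-schemas
const ROOT_ONLY_KEYS = ["default", "prefill", "example", "sectionCaption", "sectionDescription"];

type SchemaNode = Record<string, unknown>;

function isSchemaNode(value: unknown): value is SchemaNode {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Walks a root property definition and returns the paths of root-only keys
// that appear on nested sub-properties or on array item definitions.
function findRootOnlyKeys(node: SchemaNode, path: string, isRootProperty = true): string[] {
    const found: string[] = [];
    if (!isRootProperty) {
        for (const key of ROOT_ONLY_KEYS) {
            if (key in node) found.push(`${path}.${key}`);
        }
    }
    const properties = node["properties"];
    if (isSchemaNode(properties)) {
        for (const name of Object.keys(properties)) {
            const child = properties[name];
            if (isSchemaNode(child)) found.push(...findRootOnlyKeys(child, `${path}.${name}`, false));
        }
    }
    const items = node["items"];
    if (isSchemaNode(items)) found.push(...findRootOnlyKeys(items, `${path}.items`, false));
    return found;
}

const misplacedKeys = findRootOnlyKeys(
    {
        type: "object",
        editor: "schemaBased",
        default: {}, // allowed: this is the root property itself
        properties: {
            key1: { type: "string", editor: "textfield", prefill: "x" },
            key2: { type: "integer", editor: "integer" },
        },
    },
    "my-object",
);
// misplacedKeys → ["my-object.key1.prefill"]
```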
### `schemaBased` editor
As mentioned in the Input UI section, there is a need to introduce a new
editor type that signals that each sub-property of a complex object or
array should be rendered as a standalone field using its sub-schema
definition.
I’ve proposed the name `schemaBased` for this new editor, but the name
is open for discussion.
### Compatibility between editor and sub-schema
A key question is whether we should validate the compatibility between
the `editor` and the defined sub-schema within the Actor's input schema,
or leave this responsibility to the Actor developer.
#### Example scenario
A developer defines a property of type `array` with `editor:
stringList`, but also provides a sub-schema specifying object-type
items. The input UI would generate a list of strings, while the
validation would expect objects, resulting in invalid input.
Possible Approaches:
1. **No restrictions (responsibility on creator)**
Allow any combination of `editor` and sub-schema, and assume the creator
understands which combinations are valid. This offers maximum
flexibility but increases the risk of misconfiguration.
2. **Restrict available editors when a sub-schema is defined**
The allowed editors would be `schemaBased`, `json` and `hidden`.
3. **Strict validation based on editor type**
Enforce that sub-schemas match the structures expected by specific
editors. For example, for the `stringList` editor the sub-schema can only
define string-typed items, and for `requestListSources` it must be an
object with strictly defined properties. However, this would make the
JSON Schema much more complicated, with lots of if-else branches and
duplicated property definitions.
> Note: We currently validate the structure of the input for some
editors (for example `requestListSources`) manually in
`validateInputUsingValidator`. So in this case the `editor` is not used
just for the UI but also influences the validation.
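As an illustration of the strict approach, the sketch below checks two editors mentioned in this PR: `stringList` (string items only) and `keyValue` (an object with exactly the string properties `key` and `value`). It assumes both rules apply to the `items` sub-schema of an array property. The types, the function name `checkEditorCompatibility` and the messages are hypothetical, and the merged implementation may express these rules in the JSON Schema definitions instead.

```typescript
// Hypothetical, trimmed-down shapes used only for this sketch
type SubSchemaNode = {
    type?: string;
    properties?: Record<string, SubSchemaNode>;
    items?: SubSchemaNode;
};
type ArrayProperty = { type: "array"; editor: string; items?: SubSchemaNode };

// Returns an error message when the `items` sub-schema cannot work with the
// chosen editor, or null when the combination is acceptable (or not checked).
function checkEditorCompatibility(property: ArrayProperty): string | null {
    const items = property.items;
    if (!items) return null; // no sub-schema, nothing to check

    if (property.editor === "stringList") {
        return items.type === "string" ? null : "`stringList` expects items of type `string`";
    }
    if (property.editor === "keyValue") {
        const properties: Record<string, SubSchemaNode> = items.properties ?? {};
        const names = Object.keys(properties).sort().join(",");
        const compatible =
            items.type === "object" &&
            names === "key,value" && // exactly these two properties, nothing else
            properties["key"]?.type === "string" &&
            properties["value"]?.type === "string";
        return compatible ? null : "`keyValue` expects object items with string properties `key` and `value` only";
    }
    return null; // other editors are not covered by this sketch
}

// The example scenario above: a string list editor with object-typed items
const scenarioError = checkEditorCompatibility({
    type: "array",
    editor: "stringList",
    items: { type: "object", properties: { key1: { type: "string" } } },
});
// scenarioError is a non-null message, so the schema would be rejected
```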