
KeyError: 'annotation' in fit() when handling /webhook events without annotation payload #768

@xiaoyao9184

Description


When receiving a /webhook POST from Label Studio, the ML backend crashes with a KeyError: 'annotation' in examples/bert_classifier/model.py, specifically inside the fit() method:

[2025-05-14 10:33:06,753] [WARNING] [werkzeug::_log::97]  * Debugger is active!
[2025-05-14 10:33:06,754] [INFO] [werkzeug::_log::97]  * Debugger PIN: 485-577-215
[2025-05-14 10:33:18,338] [INFO] [werkzeug::_log::97] 172.16.22.2 - - [14/May/2025 10:33:18] "POST /webhook HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 2213, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 2193, in wsgi_app
    response = self.handle_exception(e)
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/ubuntu/.conda/envs/label-studio/lib/python3.10/site-packages/label_studio_ml/api.py", line 126, in webhook
    result = model.fit(event, data)
  File "/home/ubuntu/code/label-studio-ml-backend/label_studio_ml/examples/bert_classifier/model.py", line 125, in fit
    project_id = data['annotation']['project']
KeyError: 'annotation'

The failing line is:

project_id = data['annotation']['project']

However, according to the surrounding code in the /webhook handler, project_id has already been passed to the model as a constructor argument:

@_server.route('/webhook', methods=['POST'])
def webhook():
    data = request.json
    event = data.pop('action')
    if event not in TRAIN_EVENTS:
        return jsonify({'status': 'Unknown event'}), 200
    project_id = str(data['project']['id'])
    label_config = data['project']['label_config']
    model = MODEL_CLASS(project_id, label_config=label_config)

I've forked the repository and updated the following example models to use self.project_id instead:

  • bert_classifier
  • huggingface_ner
  • sklearn_text_classifier
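
For illustration, here is a minimal sketch of the change applied in each of these models. ExampleModel is a hypothetical stand-in, not the real base class from label_studio_ml, and the payload shape and the event name 'PROJECT_UPDATED' are assumptions based on the handler above rather than a captured request. The sketch only assumes that the constructor stores project_id on the instance, as the webhook code suggests.

```python
class ExampleModel:
    """Hypothetical stand-in for an example model (not the real base class)."""

    def __init__(self, project_id, label_config=None):
        # The /webhook handler passes project_id to the constructor.
        self.project_id = project_id
        self.label_config = label_config

    def fit(self, event, data):
        # Before: project_id = data['annotation']['project']
        # That lookup raises KeyError when the payload has no 'annotation' key.
        project_id = self.project_id
        return {'event': event, 'project_id': project_id}


# Assumed payload for a project-level event: it carries no 'annotation' key.
payload = {'project': {'id': 1, 'label_config': '<View></View>'}}

model = ExampleModel(
    str(payload['project']['id']),
    label_config=payload['project']['label_config'],
)
result = model.fit('PROJECT_UPDATED', payload)
print(result)
```

With this change, fit() no longer depends on the shape of the event payload to find the project, so events without an annotation are handled the same way as annotation events.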

However, I have not yet run the full test suite or studied the full contribution guidelines, as I’m not deeply familiar with this codebase yet.

Would it be acceptable to open a PR with this minimal fix and receive further guidance from maintainers? Or would you prefer I do more in-depth validation first?
