fix: Replace surrogate characters before rendering #7871

Open: wants to merge 1 commit into base: develop

@SwiftyJunnos SwiftyJunnos commented Jun 30, 2025

This PR fixes a crash when rendering base.html that occurs if a git commit message contains characters represented as surrogate pairs, such as emojis (e.g., 🐛).
The root cause was traced to the json_dumps_ensure_ascii function, which does not properly handle surrogate pairs. The rendered page then contains surrogates, and Django raises a UnicodeEncodeError when it encodes the response as UTF-8:

[2025-06-30 00:32:51,740] [django.request::log_response::253] [ERROR] Internal Server Error: /labelstudio/
Traceback (most recent call last):
  File "/label-studio/.venv/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/sentry_sdk/integrations/django/views.py", line 90, in sentry_wrapped_callback
    return callback(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/label_studio/core/views.py", line 57, in main
    return render(request, 'home/home.html')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/sentry_sdk/utils.py", line 1796, in runner
    return sentry_patched_function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/sentry_sdk/integrations/django/templates.py", line 105, in render
    return real_render(request, template_name, context, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/django/shortcuts.py", line 26, in render
    return HttpResponse(content, content_type, status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/django/http/response.py", line 376, in __init__
    self.content = content
    ^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/django/http/response.py", line 408, in content
    content = self.make_bytes(value)
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/label-studio/.venv/lib/python3.12/site-packages/django/http/response.py", line 317, in make_bytes
    return bytes(value.encode(self.charset))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 3038-3039: surrogates not allowed
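
The last frame of the traceback can be reproduced in isolation. The snippet below is a minimal sketch, not code from Label Studio: the commit message is made up, and it assumes the emoji reached the rendered string as two separate surrogate code points rather than as the single code point U+1F41B.

```python
# A str that holds a UTF-16 surrogate pair as two separate code points
# cannot be encoded as UTF-8. The message text is a made-up example.
broken = "fix: \ud83d\udc1b squash bug"

try:
    broken.encode("utf-8")
    error_reason = None
except UnicodeEncodeError as exc:
    # Same reason string as in the traceback above.
    error_reason = exc.reason

# The same emoji stored as one code point encodes without error.
ok_bytes = "fix: \U0001F41B squash bug".encode("utf-8")
```

This matches the two-character range (position 3038-3039) reported in the error: one unencodable code point for each half of the pair.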

To fix this, I added a preprocessing step that replaces surrogate characters before they reach the JSON encoder, so the response can always be encoded as UTF-8.
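
A preprocessing step of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `replace_surrogates` is a hypothetical helper name, the `version_info` payload is made up, and using U+FFFD as the replacement character is an arbitrary choice.

```python
import json
import re

# Matches any single surrogate code point (U+D800 to U+DFFF).
_SURROGATE_RE = re.compile(r"[\ud800-\udfff]")


def replace_surrogates(value, replacement="\ufffd"):
    """Recursively replace surrogate code points in strings.

    Hypothetical helper for illustration; not part of Label Studio.
    """
    if isinstance(value, str):
        return _SURROGATE_RE.sub(replacement, value)
    if isinstance(value, dict):
        return {
            replace_surrogates(key, replacement): replace_surrogates(item, replacement)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [replace_surrogates(item, replacement) for item in value]
    return value


# Made-up payload containing a surrogate pair stored as two code points.
version_info = {"message": "fix: \ud83d\udc1b squash bug"}

payload = json.dumps(replace_surrogates(version_info), ensure_ascii=False)
encoded = payload.encode("utf-8")  # encodes cleanly after sanitizing
```

Replacing each surrogate loses the original emoji but guarantees that the result is encodable. Well-formed characters such as "\U0001F41B" are outside the surrogate range and pass through unchanged.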

Although emojis are not currently used in commit messages, this fix prevents errors if such characters appear in the future.

netlify bot commented Jun 30, 2025

👷 Deploy request for heartex-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit d9a727f

netlify bot commented Jun 30, 2025

👷 Deploy request for label-studio-docs-new-theme pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit d9a727f

netlify bot commented Jun 30, 2025

Deploy Preview for label-studio-storybook canceled.

Name Link
🔨 Latest commit d9a727f
🔍 Latest deploy log https://app.netlify.com/projects/label-studio-storybook/deploys/6861ee76ea6ff1000809c62d

netlify bot commented Jun 30, 2025

Deploy Preview for label-studio-playground canceled.

Name Link
🔨 Latest commit d9a727f
🔍 Latest deploy log https://app.netlify.com/projects/label-studio-playground/deploys/6861ee764d09d70008faeb95

@github-actions github-actions bot added the fix label Jun 30, 2025