Skip to content

Conversation

anoadragon453
Copy link
Member

@anoadragon453 anoadragon453 commented Oct 7, 2025

Fixes an edge case where a client could set device_keys, but set it to null, as shown in the following traceback:

2025-10-07T13:41:18.638553764Z stderr F 2025-10-07 13:41:18,637 - synapse.http.server - 149 - ERROR - POST-473 - Failed handle request via 'KeyUploadServlet': <XForwardedForRequest at 0x7f0082c5b990 method='POST' uri='/_matrix/client/v3/keys/upload?redacted...' clientproto='HTTP/1.1' site='8008'>
2025-10-07T13:41:18.638577794Z stderr F Traceback (most recent call last):
2025-10-07T13:41:18.638583004Z stderr F   File "/usr/local/lib/python3.12/site-packages/synapse/http/server.py", line 337, in _async_render_wrapper
2025-10-07T13:41:18.638586915Z stderr F     callback_return = await self._async_render(request)
2025-10-07T13:41:18.638589115Z stderr F                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-07T13:41:18.638591544Z stderr F   File "/usr/local/lib/python3.12/site-packages/synapse/http/server.py", line 562, in _async_render
2025-10-07T13:41:18.638593705Z stderr F     callback_return = await raw_callback_return
2025-10-07T13:41:18.638595845Z stderr F                       ^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-07T13:41:18.638598045Z stderr F   File "/usr/local/lib/python3.12/site-packages/synapse/rest/client/keys.py", line 281, in on_POST
2025-10-07T13:41:18.638600255Z stderr F     if body["device_keys"]["user_id"] != user_id:
2025-10-07T13:41:18.638602435Z stderr F        ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
2025-10-07T13:41:18.638604575Z stderr F TypeError: 'NoneType' object is not subscriptable

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

…eys: null`

Fixes an edge case where a client could set `device_keys`, but set it to `null`.
@anoadragon453 anoadragon453 marked this pull request as ready for review October 7, 2025 15:00
@anoadragon453 anoadragon453 requested a review from a team as a code owner October 7, 2025 15:00
@anoadragon453 anoadragon453 enabled auto-merge (squash) October 7, 2025 15:15
Confirmed that this currently fails on `develop`.
@anoadragon453
Copy link
Member Author

A regression unit test was added as per @devonh's suggestion.

@anoadragon453 anoadragon453 disabled auto-merge October 7, 2025 15:23
@anoadragon453 anoadragon453 merged commit 2443760 into develop Oct 7, 2025
9 checks passed
@anoadragon453 anoadragon453 deleted the anoa/handle_device_keys_edge_case branch October 7, 2025 15:23
@richvdh
Copy link
Member

richvdh commented Oct 7, 2025

👍 to a quick fix to get the release out today, but there is no question that this is client-side misbehaviour which Synapse should reject, otherwise all other server implementations end up having to allow device_keys: null, and it becomes something we have to spec.

Please could you plan to revert this change in due course (say, three months)? I'd recommend adding a warning in the meantime, so that we can get an idea of when it will be safe to revert.

@richvdh
Copy link
Member

richvdh commented Oct 7, 2025

For links: this was followup to #17097

@anoadragon453
Copy link
Member Author

@richvdh Good point. I think we actually need a way in Pydantic to differentiate between a field not being set, and said field being set to null. Looks like we can do that by adding something like:

    # Reject explicit `null` for any field that opts in with Field(..., forbid_null=True)
    @validator('*', pre=True)
    def _forbid_explicit_nulls(cls, v, values, field):
        if v is None and field.field_info.extra.get('forbid_null', True):
            raise ValueError(f"{field.name} cannot be null")
        return v

to RequestBodyModel. Then any field that is explicitly null is denied by default.

I'll create a proper issue for this soon.

@richvdh
Copy link
Member

richvdh commented Oct 8, 2025

@richvdh Good point. I think we actually need a way in Pydantic to differentiate between a field not being set, and said field being set to null. Looks like we can do that by adding something like ... Then any field that is explicitly null is denied by default.

I'm not entirely following you. I thought that the problem that this PR solved was that a field that was explicitly null was denied? So what does the code snippet above do that reverting this PR does not?

@anoadragon453
Copy link
Member Author

@richvdh Ideally we would return a 4xx error when a client sets a field to null instead of leaving it out entirely. That should be the default behavior. For special cases - like what we need over the next three months - we can allow a field to be explicitly set to null.

The code above supports this by introducing a forbid_null toggle, which lets us control whether a given field can be null.

Right now, any Optional[...] field can be explicitly set to null, and it will still pass Pydantic's validation. We've worked around this for one field using an if check in the servlet body, but other fields may still be problematic.

@anoadragon453
Copy link
Member Author

I thought that the problem that this PR solved was that a field that was explicitly null was denied?

Other way around. The field being null passed Pydantic validation, and then we had an exception raised when trying to use the field value:

2025-10-07T13:41:18.638600255Z stderr F     if body["device_keys"]["user_id"] != user_id:
2025-10-07T13:41:18.638602435Z stderr F        ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
2025-10-07T13:41:18.638604575Z stderr F TypeError: 'NoneType' object is not subscriptable

@richvdh
Copy link
Member

richvdh commented Oct 8, 2025

oh I seeee, the problem with #17097 wasn't the introduction of Pydantic validation, but rather some additional validation that was done outside Pydantic. With you now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants