@sulami sulami commented Aug 24, 2025

Does your PR solve an issue?

First of all, this is not mergeable as-is; it's a request for feedback and advice.

We're using sqlx with AWS Aurora, and have noticed an issue with Aurora's "zero-downtime" restarts, which wipe prepared statements but preserve TCP connections. In this scenario, all existing connections get effectively poisoned without notice, breaking queries until they age out eventually. This has forced us to disable query caching, incurring quite a big performance overhead.

I've had a look at how ActiveRecord, our other main stack, handles this: it wipes the client-side cache and retries when it gets back an error. While retrying right away would be nice to rescue the query, getting the connection back into a functional state for future queries is the next best thing, and rather easy.
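To make the two options concrete, here is a rough, self-contained sketch of "clear the cache, then re-prepare and retry once". None of these types are sqlx's: `FakeServer`, `Client`, and `ExecError` are hypothetical stand-ins, and `restart` only imitates what I assume a zero-downtime restart does (statements gone, connection kept). The change in this PR covers only the cache-clearing half; the retry is the ActiveRecord-style extension.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the server side: statement id -> SQL text.
#[derive(Default)]
struct FakeServer {
    next_id: u32,
    prepared: HashMap<u32, String>,
}

#[derive(Debug, PartialEq)]
enum ExecError {
    /// Models "Unknown prepared statement handler (<id>)".
    UnknownStatement(u32),
}

impl FakeServer {
    fn prepare(&mut self, sql: &str) -> u32 {
        self.next_id += 1;
        self.prepared.insert(self.next_id, sql.to_owned());
        self.next_id
    }

    fn execute(&self, id: u32) -> Result<String, ExecError> {
        self.prepared
            .get(&id)
            .map(|sql| format!("ok: {}", sql))
            .ok_or(ExecError::UnknownStatement(id))
    }

    /// Imitates a "zero-downtime" restart: statements vanish, the connection stays.
    fn restart(&mut self) {
        self.prepared.clear();
    }
}

/// Hypothetical client connection with a SQL -> statement id cache.
#[derive(Default)]
struct Client {
    cache: HashMap<String, u32>,
}

impl Client {
    fn run(&mut self, server: &mut FakeServer, sql: &str) -> Result<String, ExecError> {
        // Reuse a cached statement id, or prepare and cache a new one.
        let id = match self.cache.get(sql) {
            Some(&id) => id,
            None => {
                let id = server.prepare(sql);
                self.cache.insert(sql.to_owned(), id);
                id
            }
        };
        match server.execute(id) {
            Err(ExecError::UnknownStatement(_)) => {
                // The server forgot our statements, so every cached id is stale.
                self.cache.clear();
                // Re-prepare and retry exactly once; a second failure is returned as-is.
                let id = server.prepare(sql);
                self.cache.insert(sql.to_owned(), id);
                server.execute(id)
            }
            other => other,
        }
    }
}
```

Without the retry branch, the first query after a restart would still fail, but the cleared cache means every later query on that connection re-prepares and succeeds.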

The main problem is testing the change. I've included a second commit with a working but bad integration test setup. It's bad because it has to expose additional public interfaces so the test can delete the server-side prepared statements while leaving the client-side cache intact, something the library API should not allow, for consistency's sake. I feel like an integration test is the better way to test this scenario, but neither sqlx nor MySQL exposes a way to set up the required conditions. Specifically, statements prepared via the binary protocol are not named, and thus cannot be deallocated via DEALLOCATE PREPARE.

I would like to invite feedback & advice on:

  • the fix in the first place; I think it's a valid fix, albeit for a niche situation
  • a better way to test this
  • assuming the above are sorted, specifics of the fix

If this or a similar approach gets accepted, the same change might also be needed for Postgres, which I suspect exhibits the same issue.

Is this a breaking change?

Assuming all the issues get resolved, no, I don't think it would be breaking.

The worst case outcome I can think of is a false positive match on the error, which would wrongly throw away the statement cache and cause re-preparation. It might be nice to match more closely on the specific error, but given the state of this change, this is a functional PoC.
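As a sketch of what "matching more closely" could look like, the check below keys on the numeric error code and SQLSTATE rather than the message text. `ServerError` is a hypothetical, simplified view of an error packet, not sqlx's error type. I'm assuming Aurora reports stock MySQL's `ER_UNKNOWN_STMT_HANDLER` (error number 1243, SQLSTATE `HY000`); that would need verifying against a real Aurora restart.

```rust
/// Hypothetical, simplified view of a MySQL error packet (not sqlx's type).
struct ServerError<'a> {
    code: u16,
    sql_state: &'a str,
}

/// Stock MySQL's ER_UNKNOWN_STMT_HANDLER. Assumption: Aurora uses the same number.
const ER_UNKNOWN_STMT_HANDLER: u16 = 1243;

/// Returns true when the error indicates the server no longer knows the
/// statement id we sent. The message text is deliberately not inspected,
/// since server messages can be localized.
fn is_unknown_statement(err: &ServerError<'_>) -> bool {
    err.code == ER_UNKNOWN_STMT_HANDLER && err.sql_state == "HY000"
}
```

Keying on the code narrows the false-positive window to other errors that share number 1243, which should only be raised for stale or invalid statement ids anyway.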

sulami added 2 commits August 24, 2025 14:00
Some cloud offerings like AWS Aurora allow for "zero-downtime" restarts
for patches, which preserve existing TCP connections but wipe out a
lot of server state, including the statement cache. In that scenario,
trying to execute a previously prepared statement causes an error
response with

```
HY000 Unknown prepared statement handler (<id>) given to mysql_stmt_precheck
```

which appears to be the only way to detect this scenario.

To avoid subsequent errors for the same connection, we can clear the
statement cache on the client side, causing all queries to get
re-prepared. This is basically what ActiveRecord does,[0] which we can
confirm is not vulnerable to the same issue. An even better solution
would be to re-prepare and retry the query right there, like
ActiveRecord does.

[0]: https://github.com/rails/rails/blob/main/activerecord/lib/active_record/connection_adapters/mysql2/database_statements.rb#L66-L78

I think it's best to test this issue through integration tests, but
there seems to be no way to clear the server-side prepared statements
without access to the raw packet stream, as they're not named and thus
cannot be deleted via DEALLOCATE PREPARE.

To make the test work, additional interfaces have to be made available,
which should not be shipped as part of the library for regular use.
@sulami sulami force-pushed the sulami/push-pmrnxvsoyqty branch from 5f8f07a to f1fd80a Compare August 24, 2025 23:06
@sulami sulami changed the title fix(sqlx-mysql): RFC Handle missing server-side prepared statements fix(mysql): RFC Handle missing server-side prepared statements Aug 24, 2025