|  | 
|  | 1 | +# JetStream Read-after-Write | 
|  | 2 | + | 
|  | 3 | +| Metadata | Value                                                        | | 
|  | 4 | +|----------|--------------------------------------------------------------| | 
|  | 5 | +| Date     | 2025-07-11                                                   | | 
|  | 6 | +| Author   | @MauriceVanVeen                                              | | 
|  | 7 | +| Status   | Proposed                                                     | | 
|  | 8 | +| Tags     | jetstream, kv, objectstore, server, client, refinement, 2.12 | | 
|  | 9 | +| Updates  | ADR-8, ADR-17, ADR-20, ADR-31, ADR-37                        | | 
|  | 10 | + | 
|  | 11 | +| Revision | Date       | Author          | Info           | | 
|  | 12 | +|----------|------------|-----------------|----------------| | 
|  | 13 | +| 1        | 2025-07-11 | @MauriceVanVeen | Initial design | | 
|  | 14 | + | 
|  | 15 | +## Problem Statement | 
|  | 16 | + | 
|  | 17 | +JetStream does NOT support read-after-write or monotonic reads. This can be especially problematic when | 
|  | 18 | +using [ADR-8 JetStream based Key-Value Stores](ADR-8.md), primarily but not limited to the use of _Direct Get_. | 
|  | 19 | + | 
|  | 20 | +Specifically, we have no way to guarantee a write like `kv.Put` can be observed by a subsequent `kv.Get` or `kv.Watch`, | 
|  | 21 | +especially when the KV/stream is replicated or mirrored. | 
|  | 22 | + | 
|  | 23 | +## Context | 
|  | 24 | + | 
|  | 25 | +The topic of immediate consistency within NATS JetStream can sometimes be a bit confusing. In our docs we claim we | 
|  | 26 | +maintain immediate consistency (as opposed to eventual consistency) even in the face of failures. That is true, but, | 
|  | 27 | +as with anything, it depends. | 
|  | 28 | + | 
|  | 29 | +- **Monotonic writes**: all writes to a single stream (replicated or not) are monotonic. They are ordered by the | 
|  | 30 | +  stream sequence, regardless of publisher. | 
|  | 31 | +- **Monotonic reads**, if you're using consumers: all reads for a consumer (replicated or not) are monotonic. They are | 
|  | 32 | +  ordered by consumer delivery sequence. (Messages can be redelivered on failure, but this also depends on which | 
|  | 33 | +  settings are used.) | 
|  | 34 | + | 
|  | 35 | +Those paths are immediately consistent, but they are not immediately consistent with respect to each other. This is no | 
|  | 36 | +problem for publishers and consumers of a stream, because they observe all operations to be monotonic. | 
|  | 37 | +But, if you use the KV abstraction for example, you're more often going to use single message gets through `kv.Get`. | 
|  | 38 | +Since those rely on `DirectGet`, even followers can answer, which means we (by default) can't guarantee read-after-write | 
|  | 39 | +or even monotonic reads. Such message GET requests are served by a random server within the peer group (or even by a | 
|  | 40 | +mirror, if enabled). Those can't be made immediately consistent, since both replication and mirroring are | 
|  | 41 | +async. | 
|  | 42 | + | 
|  | 43 | +Also, when following up a `kv.Create` with `kv.Keys`, you might expect read-after-write such that the returned keys | 
|  | 44 | +contain the key you've just written to. This also requires read-after-write. | 
|  | 45 | + | 
|  | 46 | +## Design | 
|  | 47 | + | 
|  | 48 | +Before sharing the proposed design, let's look at an alternative. Read-after-write could be achieved by having reads (on | 
|  | 49 | +an opt-in basis) go through Raft replication first. This has several disadvantages: | 
|  | 50 | + | 
|  | 51 | +- Reads will become significantly slower, due to requiring replication first. | 
|  | 52 | +- Reads require quorum, due to replication, disallowing any reads when there's downtime or temporarily no leader. | 
|  | 53 | +- Only the stream leader can answer reads, as it is the first one to know that it can answer the request. (Followers | 
|  | 54 | +  replicate asynchronously, so letting them answer would make the response take even longer to return.) | 
|  | 55 | +- Mirrors can still answer `DirectGet` requests. Because mirrors answer read requests transparently, they will violate | 
|  | 56 | +  any read-after-write guarantees (as the client will not know). This would mean mirrors must not be enabled if this | 
|  | 57 | +  guarantee should be kept. | 
|  | 58 | +- Read-after-write guarantees could temporarily be violated when scaling streams up or down. | 
|  | 59 | +- This is not a compatible approach for consumers, meaning they could not have these guarantees based on this approach. | 
|  | 60 | + | 
|  | 61 | +Although having reads be served through Raft does (mostly) offer a strong guarantee of read-after-write and monotonic | 
|  | 62 | +reads, the disadvantages outweigh the advantages. Ideally, the solution has the following advantages: | 
|  | 63 | + | 
|  | 64 | +- It's explicitly defined, either in configuration or in code. | 
|  | 65 | +- Works for both replicated and non-replicated streams. (Scale up/down has no influence, and implementation is not | 
|  | 66 | +  replication-specific) | 
|  | 67 | +- Incurs no slowdown, just as fast as reads that don't guarantee read-after-write (no prior replication required). | 
|  | 68 | +- Let followers, and even mirrors, answer read requests as long as they can make the guarantee. | 
|  | 69 | +- Let followers, and mirrors, inform the client when they can't make the guarantee. The guarantee is always kept, but | 
|  | 70 | +  an error is returned that can be retried (to get a successful read). This can be tuned by disabling reads on mirrors | 
|  | 71 | +  or followers. | 
|  | 72 | + | 
|  | 73 | +Now, on to the proposed design which has the above advantages. | 
|  | 74 | + | 
|  | 75 | +The write and read paths remain eventually consistent, as they are now. But one can opt in to immediate consistency to | 
|  | 76 | +guarantee read-after-write and monotonic reads, for both direct/msg read requests as well as consumers. | 
|  | 77 | + | 
|  | 78 | +- **Read-after-write** is achieved because all writes through `js.Publish`, `kv.Put`, etc. return the sequence | 
|  | 79 | +  (inherently last sequence) of the stream. In `DirectGet` requests those observed last sequences can be used for read | 
|  | 80 | +  requests. | 
|  | 81 | +- **Monotonic reads** are achieved by collecting the highest sequence seen in read requests and using that sequence for | 
|  | 82 | +  subsequent read requests. | 
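On the client side, both properties reduce to remembering one number per stream: the highest sequence observed so far, whether it came back from a write or from an earlier read. The following is a minimal sketch of that bookkeeping, not part of any NATS client. `SeqTracker`, `Observe`, and `MinLastSeq` are hypothetical names chosen for illustration, and treating every observed sequence the same way is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// SeqTracker is a hypothetical helper (not a NATS client type) that remembers
// the highest stream sequence this client has observed.
type SeqTracker struct {
	mu      sync.Mutex
	highest uint64
}

// Observe records a sequence returned by a write or seen in a read response.
// Lower sequences are ignored so the tracked value never moves backwards.
func (t *SeqTracker) Observe(seq uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if seq > t.highest {
		t.highest = seq
	}
}

// MinLastSeq returns the value to attach to the next read request.
func (t *SeqTracker) MinLastSeq() uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.highest
}

func main() {
	var t SeqTracker
	t.Observe(42) // e.g. the sequence returned by a publish
	t.Observe(40) // an older sequence must not lower the floor
	fmt.Println(t.MinLastSeq()) // prints 42
}
```

Using the tracked value as the minimum for every subsequent read covers read-after-write (the write's sequence is included) and monotonic reads (the floor never decreases).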
|  | 83 | + | 
|  | 84 | +This can be implemented with an additional `MinLastSeq` field in `JSApiMsgGetRequest` and `ConsumerConfig`. | 
|  | 85 | + | 
|  | 86 | +- This ensures the server only replies with data if it can actually guarantee immediate consistency. This is done | 
|  | 87 | +  by confirming that the `LastSeq` it has for its local stream is at least the `MinLastSeq` specified. | 
|  | 88 | +- Side-note: although `MsgGet` is only answered by the leader, technically an old leader could still respond and serve | 
|  | 89 | +  stale reads. Although this shouldn't happen often in practice, until now we couldn't rule it out. With `MinLastSeq`, | 
|  | 90 | +  the error can be detected on the old leader, and it can delay the error response, allowing the real leader to send | 
|  | 91 | +  the actual answer. | 
|  | 92 | +- Followers/mirrors reject the read request if they can't satisfy the `MinLastSeq`, but can serve reads and share the | 
|  | 93 | +  load otherwise. | 
|  | 94 | +- Consumers don't start delivering messages until the `MinLastSeq` is reached. (To ensure `pending` counts are correct | 
|  | 95 | +  when following up `kv.Create` with `kv.Keys`, for example.) | 
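The server-side rule described in the list above is a single comparison between the local stream's last sequence and the requested minimum. Below is a minimal sketch of that check under stated assumptions: `checkMinLastSeq` and `ErrMinLastSeqNotReached` are hypothetical names (this ADR does not name the error), and treating a zero `minLastSeq` as "not opted in" is an assumption rather than something the ADR specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrMinLastSeqNotReached is a hypothetical error used for illustration only.
var ErrMinLastSeqNotReached = errors.New("min last sequence not reached")

// checkMinLastSeq decides whether a server whose local stream is at lastSeq
// may answer a read that carries minLastSeq.
// Assumption: minLastSeq == 0 means the client did not opt in.
func checkMinLastSeq(lastSeq, minLastSeq uint64) error {
	if minLastSeq > 0 && lastSeq < minLastSeq {
		// The local copy is behind what the client has already observed,
		// so answering could return stale data.
		return ErrMinLastSeqNotReached
	}
	return nil
}

func main() {
	fmt.Println(checkMinLastSeq(10, 0))  // no opt-in: serve the read
	fmt.Println(checkMinLastSeq(10, 10)) // caught up: serve the read
	fmt.Println(checkMinLastSeq(9, 10))  // lagging: reject so the client can retry
}
```

A follower or mirror that returns the error keeps the guarantee intact; the client can retry and have the request land on a server that has caught up.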
|  | 96 | + | 
|  | 97 | +In terms of API, it can look like this: | 
|  | 98 | + | 
|  | 99 | +```go | 
|  | 100 | +// Write | 
|  | 101 | +r, err := kv.Put(ctx, "key", []byte("value")) | 
|  | 102 | + | 
|  | 103 | +// Read request | 
|  | 104 | +kve, err := kv.Get(ctx, "key", jetstream.MinLastRevision(r)) | 
|  | 105 | + | 
|  | 106 | +// Watch/consumer | 
|  | 107 | +kl, err := kv.ListKeys(ctx, jetstream.MinLastRevision(r)) | 
|  | 108 | +``` | 
|  | 109 | + | 
|  | 110 | +By specifying the `MinLastRevision` (or `MinLastSequence` when using a stream normally), you can be sure your read | 
|  | 111 | +request will be rejected by a follower if it can't be satisfied, or the follower will wait to deliver you messages from | 
|  | 112 | +the consumer until it's up-to-date. | 
|  | 113 | + | 
|  | 114 | +This satisfies read-after-write and monotonic reads when combining the write and read paths. | 
|  | 115 | + | 
|  | 116 | +## Decision | 
|  | 117 | + | 
|  | 118 | +[Maybe this was just an architectural decision...] | 
|  | 119 | + | 
|  | 120 | +## Consequences | 
|  | 121 | + | 
|  | 122 | +Since this is opt-in per read request or per consumer creation, this is not a breaking change. Depending on client | 
|  | 123 | +implementation, this could be harder to implement, but given it's just another field in the `JSApiMsgGetRequest` and | 
|  | 124 | +`ConsumerConfig`, each client should have no trouble supporting it. | 