Skip to content

Conversation

@pschichtel
Copy link
Contributor

@pschichtel pschichtel commented Aug 15, 2025

Currently, and independent of the isLenient option, the JSON decoders will coerce Strings to other primitiv types, allowing e.g. "true" to be decoded as Boolean. We are currently transitioning from a very strict decoder setup based on Jackson over to kotlinx serialization for some of the nicer kotlin-specific functionality and the lack of reflection.

This PR is currently meant as an RFC to gauge if this is something that would be accepted upstream. As such there are some caveats:

  • There are no additional tests, however the existing tests show that the default behavior is unchanged.
  • No documentation has been written for the new functionality.
  • There are some TODO's which will require closer inspection for a final implementation and some dedicated tests.
  • The new option should probably be marked as @ExperimentalSerializationApi, but it is not currently.

I would obviously add those things, if there is a chance this would be pulled.

The implementation is rather simple: Introduce a boolean option and in and around the lexer prevent the use of quoted strings, where they wouldn't be necessary unless coercion is wanted. In some places the necessary functionality was already in place as separate methods (decodeBoolean vs decodeBooleanLenient), in others I was able to add minimal changes, usually as part of existing checks around string quotes. I tried to keep the interfaces stable.

Related issues:

consumeOther(allowQuoted = true)

fun consumeOther(allowQuoted: Boolean): String {
// TODO this string peeking stuff might break things...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code around peekedString I'm pretty sure this is not actually a problem, because this branch would only ever be hit when attempting to read a string. Right?

@pschichtel pschichtel changed the title RFC: Allow disabling coercion between primitives RFC: Allow disabling coercion between JSON primitives Aug 15, 2025
@pschichtel pschichtel force-pushed the feature/disable-coercion branch from 8cc0590 to 2e5185f Compare August 18, 2025 10:54
@pschichtel pschichtel changed the base branch from master to dev August 18, 2025 10:55
@pdvrieze
Copy link
Contributor

Looks like a good idea to me

@pschichtel pschichtel force-pushed the feature/disable-coercion branch from 2e5185f to fb21578 Compare September 19, 2025 08:24
@pschichtel
Copy link
Contributor Author

I just rebased this onto latest dev. @sandwwraith @Tapchicoma Is there any chance to get feedback on the "does this have a chance of getting pulled?" question? I've been using this in production in a cloud service, so it's seen quite a bit of traffic without any issues so far.

@pschichtel pschichtel force-pushed the feature/disable-coercion branch from fb21578 to 2184b02 Compare October 15, 2025 12:08
@pschichtel
Copy link
Contributor Author

I've rebased this again. @sandwwraith @fzhinkin Have I completely missed some process here? Any feedback would be appreciated.

@sandwwraith
Copy link
Member

Hi! Sorry, it took me a long time to check out your proposal.

In general, I like the idea of adding such a setting. It would indeed help with stricter requirements and migrating between frameworks, such as Jackson.

However, there are several key points that have to be addressed:

  • Conversion from JsonElement. You can take a look at AbstractJsonTreeDecoder to see how it's done. I think it would be straightforward to add it there, since we have JsonPrimitive.isString property.
  • What to do with things like Map<Int, Int> or Map<String, Boolean>. (see also Quoted booleans are being allowed even with isLenient = false #2749 for reference). In Json, keys can only be strings. So now both {"42": "42"} and {"42": 42} could be mapped to the Map<Int, Int>. I'm not sure whether this flag should prohibit the former. And it probably shouldn't be blocking key parsing at all. What do you think?
  • Since changes involve a Json parser, I would say this is very desirable not only to write tests, but also benchmarks for the new feature. You can take TwitterFeedBenchmark as an example.

What do you think of this? If you want to continue working, I'd happily review your contribution.

@pdvrieze
Copy link
Contributor

The choice of how to handle/serialize maps with non-string keys is ultimately a choice for the format to make (as there is no "native" mapping). The current behaviour to use quoted strings is probably the most elegant (although a strict parameter could prohibit it altogether).

Fundamentally I think that the behaviour of map serialization with non-string keys is a separate one (where configuration may specify an alternative approach as for complex keys). The case for non-string primitive keys is that (to support them at all) the format maps them to strings and serializes/deserializes them as strings.

For map values, they should follow the coercion rule (Map<Int, Int> would only allow {"42": 42 }

@pschichtel
Copy link
Contributor Author

@sandwwraith thanks for your feedback. Yes I'm still interested in working on this.

Conversion from JsonElement. You can take a look at AbstractJsonTreeDecoder to see how it's done. I think it would be straightforward to add it there, since we have JsonPrimitive.isString property.

What issue are you referring to? Decoding from JsonElement already works with my changes, doesn't it?

to do with things like Map<Int, Int> or Map<String, Boolean>. (see also Quoted booleans are being allowed even with isLenient = false #2749 for reference). In Json, keys can only be strings. So now both {"42": "42"} and {"42": 42} could be mapped to the Map<Int, Int>. I'm not sure whether this flag should prohibit the former. And it probably shouldn't be blocking key parsing at all. What do you think?

May intention wasn't to touch anything regarding that behavior in this PR. as @pdvrieze suggests I think this is a separate topic for a separate PR. My intention here was merely to make primitive decoding, where it happens, more strict. I think it's best to go in smaller steps here.

changes involve a Json parser, I would say this is very desirable not only to write tests, but also benchmarks for the new feature. You can take TwitterFeedBenchmark as an example.

True! I'll have a look at that.

I have a few lose ends to tie up before I can work on this again, but I definitely will. We currently use this via a fork in production, so there is a strong interest in getting this upstream :)

@sandwwraith
Copy link
Member

What issue are you referring to? Decoding from JsonElement already works with my changes, doesn't it?

You're right, I overlooked your changes there.

May intention wasn't to touch anything regarding that behavior in this PR.

Well, since code path for Map is the same, you're touching it in any case. In particular, this code throws a decoding exception now:

@Test
fun testStrictParsing() {
    val j = Json {
        allowPrimitiveCoercion = false
    }
    val input = """{"42": 42}"""
    println(j.decodeFromString<Map<Int, Int>>( input))
}

I do not mind this while the flag is experimental, but we would need to make a decision whether this behavior is desirable or if we should allow such deserialization. But I agree, it is possible to do it later.

I have a few loose ends to tie up before I can work on this again, but I definitely will.

Glad to hear that!

@Gama11
Copy link

Gama11 commented Oct 28, 2025

Well, since code path for Map is the same, you're touching it in any case. In particular, this code throws a decoding exception now

We ran into one such case in our codebase (actually it was a value class that wraps an int) with this branch and solved it by implementing a dedicated "map key serializer":

@Serializable
@JvmInline
value class IntWrapper(val value: Int)

private object IntWrapperKeySerializer : KSerializer<IntWrapper> {
    override val descriptor: SerialDescriptor = PrimitiveSerialDescriptor("IntWrapperMapKey", PrimitiveKind.STRING)

    override fun serialize(encoder: Encoder, value: IntWrapper) = encoder.encodeString(value.value.toString())
    override fun deserialize(decoder: Decoder) = IntWrapper(decoder.decodeString().toInt())
}

typealias IntWrapperStringMap = Map<
    @Serializable(with = IntWrapperKeySerializer::class)
    IntWrapper,
    String,
    >

@Serializable
data class Foo(
    val map: IntWrapperStringMap
)

Coming from Jackson, this didn't seem too surprising, since you also need dedicated map key (de)serializers there.

@pschichtel
Copy link
Contributor Author

@sandwwraith Ahh now I understand what you meant. Honestly I think it's the only logical consequence of enabling this flag, that keys must be decoded as strings, if necessary by using custom deserializer. With the great support for value classes this almost feels like a non-issue honestly. We should probably flesh out the documentation on the option to explicitly state this property, but I would argue that people looking for such a strict configuration are expecting this anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants