shift_jisx0213 truncates null-terminator #101828

@SunakazeKun

Bug report

When encoding a null-terminated string with shift_jisx0213, the null-terminator is sometimes dropped from the output. To add a null-terminator when encoding, I usually use (string + "\0").encode(encoding), which works with most encodings but not with this one.
As a workaround, I'm using string.encode(encoding) + "\0".encode(encoding), which produces the correct result here. However, this workaround gives the wrong result for utf-16, because the BOM would be included twice.
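
The utf-16 caveat can be checked with a short sketch. The sample text "hi" and the variable names are illustrative assumptions, not part of the original report; `codecs.BOM_UTF16` is the native-byte-order BOM that the plain "utf-16" codec writes.

```python
import codecs

text = "hi"  # illustrative sample, not taken from the report

# Terminator appended before encoding: a single BOM at the start.
direct = (text + "\0").encode("utf-16")
# Terminator encoded separately: its own BOM ends up in the middle of the data.
appended = text.encode("utf-16") + "\0".encode("utf-16")

bom = codecs.BOM_UTF16
print(direct.count(bom))    # 1
print(appended.count(bom))  # 2
```

So the concatenation workaround is only safe for codecs that do not emit a BOM or other per-call prefix.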

Consider the following sample script to check this for yourself.

strings: list[str] = [
    "hello world",
    "バルーンフルーツ",
    "バルーンフィッシュ",
    "ライフアップキノコ"
]

encoding = "shift_jisx0213"

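for string in strings:
    # Terminator appended before encoding: the trailing 00 is lost for some inputs.
    encoded_direct_null = (string + "\0").encode(encoding)
    # Workaround: encode the string and the terminator separately, then concatenate.
    encoded_append_null = string.encode(encoding) + "\0".encode(encoding)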

    print(repr(string))
    print(" - encoded_append_null (EXPECTED!):", encoded_append_null.hex())
    print(" - encoded_direct_null:            ", encoded_direct_null.hex())
    print()

This generates the following results. As you can see, the two results differ for the second and fourth strings, where the null-terminator is missing from the directly encoded output. I've tried this with utf-8 and shift_jis as well, and both yield the correct results.

'hello world'
 - encoded_append_null (EXPECTED!): 68656c6c6f20776f726c6400
 - encoded_direct_null:             68656c6c6f20776f726c6400

'バルーンフルーツ'
 - encoded_append_null (EXPECTED!): 836f838b815b83938374838b815b836300
 - encoded_direct_null:             836f838b815b83938374838b815b8363

'バルーンフィッシュ'
 - encoded_append_null (EXPECTED!): 836f838b815b83938374834283628356838500
 - encoded_direct_null:             836f838b815b83938374834283628356838500

'ライフアップキノコ'
 - encoded_append_null (EXPECTED!): 838983438374834183628376834c836d835200
 - encoded_direct_null:             838983438374834183628376834c836d8352
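
Until the codec is fixed, one possible way to combine both approaches is sketched below. The helper name `encode_with_terminator` is hypothetical and not part of any library. It assumes that a dropped terminator shows up as a failed round-trip through `decode`, and it only falls back to the concatenation workaround in that case, so encodings such as utf-16 keep a single BOM.

```python
def encode_with_terminator(text: str, encoding: str) -> bytes:
    """Encode text plus a trailing NUL, falling back if the NUL gets lost.

    Hypothetical helper for illustration; not part of the original report.
    """
    terminated = text + "\0"
    direct = terminated.encode(encoding)
    # If decoding returns the terminated string, the direct result is intact.
    if direct.decode(encoding) == terminated:
        return direct
    # Otherwise encode the terminator separately (the workaround from the report).
    return text.encode(encoding) + "\0".encode(encoding)


print(encode_with_terminator("バルーンフルーツ", "shift_jisx0213").hex())
```

On an interpreter without the bug the first branch is always taken, so the helper behaves like a plain (text + "\0").encode(encoding).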

Your environment

  • Python: 3.10.7
  • OS: Windows 10 Home

Labels

  • extension-modules: C modules in the Modules dir
  • type-bug: An unexpected behavior, bug, or error
