Skip to content

Conversation

@anabiman
Copy link
Contributor

This PR:

  • Fixes UTF-8 dataset conversion to stringified JSON in hdf5db.py
  • Updates code format with black
  • Automates version control with versioneer. Output files now correctly store the latest version (1.1.3) accessible via h5json.__version__

The bug referenced in the 1st point can be reproduced via:

import h5py

with h5py.File('foo.hdf5', "w") as fp:
    dt = h5py.string_dtype(encoding='utf-8')
    ds = fp.create_dataset("foo", data=["A", "B", "C"], dtype=dt)

Converting foo.hdf5 to JSON yields an error since the array of strings does not get decoded (it is passed as a list of bytes instead). Note that for encoding="ascii", the conversion works (since in this case Hdf5db.bytesArrayToList is called, which decodes the byte strings).

@ajelenak
Copy link
Contributor

Hi @andrew-abimansour, thank you very much for the PR! I somehow missed the notice, I apologize for such a long delay to merge.

@ajelenak ajelenak merged commit ed5e471 into HDFGroup:develop Jan 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants