
Lack of collection ownership in Python allows reusing collections from an existing frame #805

@jmcarcell


Discovered with @fredrikshaw: in Python it is possible to take a collection from a frame and write it to another (or even the same) frame. For example, this code:

from podio.root_io import Reader, Writer
import podio
import edm4hep

input_file = 'edm4hep.root'
output_file = 'test.root'

reader = Reader(input_file)
event = reader.get('events')[0]
new_frame = podio.Frame()

for name in event.getAvailableCollections():
    new_frame.put(event.get(name), name)

writer = Writer(output_file)
writer.write_frame(new_frame, 'events')

is allowed. With the file produced by https://github.com/key4hep/EDM4hep/blob/main/scripts/createEDM4hepFile.py, it shows several stack traces but does not stop, and it produces a readable file containing some of the collections. When writing to the original frame instead, it seems to fail for every put. This is not allowed in C++, since the collections that one gets from a frame are const; one could still bypass this with a const_cast, but at least that is explicit. Note also that in the example above, if one excludes the types for which put fails:

    if name in ['RecoMCParticleLinkCollection', 'RecDqdxCollection', 'TrackerHitSimTrackerHitLinkCollection', 'CaloHitSimCaloHitLinkCollection', 'ParticleIDCollection', 'VertexRecoParticleLinkCollection']:
        continue

it seems to work completely fine, which could mean that this code works most of the time unless one of the types above is reused. When it fails, the error that appears is:

/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/15.1.1/../../../../include/c++/15.1.1/bits/stl_deque.h:1433: reference std::deque<edm4hep::ReconstructedParticleObj *>::operator[](size_type) [_Tp = edm4hep::ReconstructedParticleObj *, _Alloc = std::allocator<edm4hep::ReconstructedParticleObj *>]: Assertion '__n < this->size()' failed.
 *** Break *** abort

I don't have an answer for this one. If we could tell whether a collection already belongs to a frame, we could check that in put and print a warning or raise an error.
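To make the idea of a check in put concrete, here is a minimal pure-Python sketch. It does not use podio at all: `OwnedCollection`, `CheckedFrame` and `OwnershipError` are hypothetical names invented for illustration, and how ownership could actually be tracked for the real collections and `podio.Frame` is an open question.

```python
class OwnershipError(RuntimeError):
    """Raised when a collection that already belongs to a frame is put again."""


class OwnedCollection:
    """Hypothetical stand-in for a collection that remembers its owning frame."""

    def __init__(self, items):
        self.items = list(items)
        self.owner = None  # set by the frame on a successful put


class CheckedFrame:
    """Hypothetical frame with an ownership check; not the podio.Frame API."""

    def __init__(self):
        self._collections = {}

    def put(self, collection, name):
        # Refuse collections that are already owned by any frame,
        # including this one.
        if collection.owner is not None:
            raise OwnershipError(
                f"collection '{name}' already belongs to a frame"
            )
        collection.owner = self
        self._collections[name] = collection

    def get(self, name):
        return self._collections[name]

    def getAvailableCollections(self):
        return tuple(self._collections)


# Usage: the loop from the example above now fails loudly on the first put.
event = CheckedFrame()
event.put(OwnedCollection([1, 2, 3]), "hits")

new_frame = CheckedFrame()
try:
    for name in event.getAvailableCollections():
        new_frame.put(event.get(name), name)
except OwnershipError as err:
    print(f"refused: {err}")
```

Raising an exception is only one option; the same check could print a warning instead if failing hard turns out to be too strict for existing scripts.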
