Additional metadata from BIDS events.tsv #744
moabb/paradigms/base.py
Outdated
This pipeline must return an ``np.ndarray``.
This pipeline must be "fixed" because it will not be trained,
i.e. no call to ``fit`` will be made.
additional_metadata: Literal["default", "all"] | list[str]
additional_metadata: Literal["default", "all"] | list[str]
additional_metadata: None | Literal["all"] | list[str]
I agree that `None` seems to be more in line with the other kwargs and their defaults. I will change it here and in the if statements accordingly.
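To make the suggested signature concrete, here is a minimal sketch of how the three accepted values (`None`, `"all"`, or a list of column names) could be resolved into a list of columns. The helper name `resolve_metadata_columns` and its `available` argument are hypothetical and not part of MOABB; raising on unknown columns is also just one possible choice.

```python
from __future__ import annotations

from typing import Literal


def resolve_metadata_columns(
    additional_metadata: None | Literal["all"] | list[str],
    available: list[str],
) -> list[str]:
    """Return the extra events.tsv columns to attach (hypothetical helper)."""
    # None is the default: no additional columns are requested.
    if additional_metadata is None:
        return []
    # "all" selects every additional column that the dataset offers.
    if additional_metadata == "all":
        return list(available)
    # Otherwise an explicit list is expected; reject names that do not exist.
    missing = [c for c in additional_metadata if c not in available]
    if missing:
        raise ValueError(f"Columns not found in events.tsv: {missing}")
    return list(additional_metadata)
```

With `None` as the default, the "no extra columns" case reads like the other kwargs, and the `if` statements only need to distinguish `"all"` from an explicit list.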
This pipeline must be "fixed" because it will not be trained,
i.e. no call to ``fit`` will be made.
additional_metadata: Literal["default", "all"] | list[str]
Additional metadata to be loaded if return_epochs=True.
The `get_data()` function returns a triplet `(obj, labels, metadata)`. `obj` contains the data and can be a `np.ndarray`, `mne.Epochs` or `mne.io.Raw`, depending on the `return_epochs` and `return_raws` parameters.
But we should always return some metadata, so the additional columns should always be set when `additional_metadata='all'`, whichever type `obj` has.
moabb/paradigms/base.py
Outdated
dm = load_bids_event_metadata(
    dataset, subject=subject, session=session, run=run
)
After discussion in the MOABB meeting, we think it would be useful for other datasets to have the option to get additional metadata columns (e.g. ERPCore). So instead of having a special case for `BaseBIDSDataset`, the idea would be to have a method in `BaseDataset`:

    class BaseDataset:
        def get_additional_metadata(self, subject, session, run) -> None | pd.DataFrame:
            return None

that would be overridden by the datasets that have additional metadata to pass.
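A minimal sketch of how such a hook could be used, under stated assumptions: to keep the example free of third-party dependencies, rows are represented as a list of dicts instead of the proposed `pd.DataFrame`, and `DatasetWithExtras` and `attach_additional_metadata` are hypothetical names that do not exist in MOABB.

```python
from typing import Any, Dict, List, Optional

# Stand-in for pd.DataFrame (assumption made only to keep the sketch stdlib-only).
Rows = List[Dict[str, Any]]


class BaseDataset:
    def get_additional_metadata(self, subject, session, run) -> Optional[Rows]:
        # Default: the dataset has no additional metadata to pass.
        return None


class DatasetWithExtras(BaseDataset):  # hypothetical dataset
    def get_additional_metadata(self, subject, session, run) -> Optional[Rows]:
        # Made-up values, one row per event of the requested run.
        return [{"value": 1, "sample": 128}, {"value": 2, "sample": 640}]


def attach_additional_metadata(dataset, metadata: Rows, subject, session, run) -> Rows:
    """Merge the dataset's extra columns into the metadata rows (hypothetical)."""
    extra = dataset.get_additional_metadata(subject, session, run)
    if extra is None:
        return metadata
    # Rows are matched by position, so both tables must have the same length.
    if len(extra) != len(metadata):
        raise ValueError("Additional metadata does not match the number of events")
    return [{**row, **extra_row} for row, extra_row in zip(metadata, extra)]
```

The paradigm then only calls the hook and never needs to know whether the dataset is BIDS-based.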
`load_bids_event_metadata` currently uses `_find_matching_sidecar` from `mne_bids.path`. I was wondering whether there is a dataset that is a `BaseDataset` but not a `BaseBIDSDataset`, for which this approach might break. But I am happy to implement this on `BaseDataset` and always return potential additional metadata.
The Dreyer dataset would be a super use case here.
(Sent by email in reply to Pierre Guetschel's review comment of Wed, 2 Apr 2025, 19:15, which appears in full above.)
Thanks @matthiasdold for starting this PR. There is one blocker with the current implementation: if we don't use all the events, the paradigm object will filter the epochs, and the rows in the additional metadata will no longer match the events used to create the epochs.
I think the proper way would be to use the
Also given the fails for the tests on
@bruAristimunha - thanks for pointing this out. I will add this to my local testing and see if it would make sense to include a similar mockup for the unit testing
You can use
But then every new dataset implementing additional metadata columns would also have to implement the events filtering. This seems redundant and prone to bugs...
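To illustrate the alignment problem raised in this thread, here is a minimal sketch of filtering the events and the additional metadata rows with one shared mask, in a single place, so that individual datasets would not each have to reimplement it. The function name and the list-of-dicts representation are assumptions for illustration only; this is not how the paradigm currently filters epochs.

```python
from typing import Any, Dict, List, Sequence, Tuple


def filter_events_with_metadata(
    trial_types: Sequence[str],
    extra_rows: Sequence[Dict[str, Any]],
    used_events: Sequence[str],
) -> Tuple[List[str], List[Dict[str, Any]]]:
    """Keep only the events used by the paradigm, plus their metadata rows."""
    if len(trial_types) != len(extra_rows):
        raise ValueError("Expected one metadata row per event")
    # A single boolean mask drives both selections, so rows cannot drift apart.
    keep = [t in used_events for t in trial_types]
    kept_types = [t for t, k in zip(trial_types, keep) if k]
    kept_rows = [r for r, k in zip(extra_rows, keep) if k]
    return kept_types, kept_rows
```

If the additional rows were attached without such a step, a paradigm that uses only a subset of the events would end up with more metadata rows than epochs.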
As discussed this morning, we would have potential filtering of metadata on two levels: on the See this pull request. @PierreGtch - how shall we deal with this? Wait for the PR to be merged and fix here afterwards, or replicate the proposed `_events_file_to_annotation_kwargs` here?
I think waiting for it to be merged is the best way. I would recommend going to the MNE office hours to make things easier; it's on Friday on Discord.
Edit: after thinking about it, I now agree with @matthiasdold. The problem is not only that it will take time before the feature makes it into a release of mne-bids, it's also that it will force us to depend on the latest mne-bids version.
Hey guys @matthiasdold and @PierreGtch, should we upgrade the mne version?
Hi @bruAristimunha, I talked with @PierreGtch and he is optimistic that mne-tools/mne-python#13228 would soon be ready to merge. Once that is in
Hi,
here is an idea how we could fetch additional metadata columns from `events.tsv` files for BIDS datasets.
E.g. for the `FakeData` set with
the `events.tsv` files would have the following columns:
While the `onset` and the `trial_type` are implicitly encoded in the epochs (by their names and how they are cut), the additional information, such as `value` or `sample`, would not be part of the epoch/metadata as extracted with:
This pull request adds an `additional_metadata: Literal["default", "all"] | list[str] = "default"` kwarg to the `paradigm.get_data` method, which allows fetching either all columns (`"all"`) or a selected list of columns from the `events.tsv` and attaching them to the metadata - also see the `TestMetadata`.
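As a rough illustration of the idea, the sketch below reads an `events.tsv`-style table with the standard library and keeps only the columns that are not already encoded in the epochs. The file content is made up, the `duration` column is an assumption based on the BIDS events format, and `read_extra_columns` is a hypothetical helper, not the `load_bids_event_metadata` function from this PR.

```python
import csv
import io

# Made-up example content; real events.tsv files are read from disk.
EVENTS_TSV = (
    "onset\tduration\ttrial_type\tvalue\tsample\n"
    "0.5\t1.0\tleft_hand\t1\t64\n"
    "3.0\t1.0\tright_hand\t2\t384\n"
)

# Columns assumed to be implicitly encoded in the epochs already.
IMPLICIT_COLUMNS = {"onset", "duration", "trial_type"}


def read_extra_columns(tsv_text):
    """Return one dict per event holding only the additional columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        {key: val for key, val in row.items() if key not in IMPLICIT_COLUMNS}
        for row in reader
    ]


extra = read_extra_columns(EVENTS_TSV)
```

Note that `csv` returns every value as a string; converting `value` and `sample` to numbers is left out of this sketch.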