Tracking pipeline statuses
This guide shows how to track the completion status of BIDSification and processing pipelines.
BIDSification pipelines
The nipoppy track-curation command can be used to track dataset curation stages (reorganization and BIDSification).
The command to create a curation status file from scratch is:
$ nipoppy track-curation --regenerate
Note
Without the --regenerate flag, nipoppy track-curation will only update the curation status for new participants in the manifest.
The above command creates or updates the curation status file at <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/curation_status.tsv.
A summary of curation statuses can be displayed by running the nipoppy status command, which outputs a table with participant counts at different curation stages, like this:
Participant counts by session at each Nipoppy checkpoint
╷ ╷ ╷ ╷
session_id │ in_manifest │ in_pre_reorg │ in_post_reorg │ in_bids
════════════╪═════════════╪══════════════╪═══════════════╪═════════
1 │ 2 │ 0 │ 0 │ 0
2 │ 2 │ 0 │ 0 │ 0
╵ ╵ ╵ ╵
Note
The in_pre_reorg and in_post_reorg columns will be collapsed if all participants in the manifest have been BIDSified.
For each curation stage, the status is determined based on the presence of files in expected directories:
Column | Relevant directory |
---|---|
 | |
 | |
 | |
Processing pipelines
The nipoppy track-processing command can be used to track the completion status of processing pipelines. The minimal command is:
$ nipoppy track-processing --pipeline <PIPELINE_NAME>
Tip
The pipeline version and step name can be optionally specified using the --pipeline-version and --pipeline-step arguments respectively. By default, the latest version and the first step are used.
It is also possible to restrict the run to a single participant and/or session by using the --participant-id and --session-id arguments respectively.
The above command creates or updates the processing status file at <NIPOPPY_PROJECT_ROOT>/derivatives/processing_status.tsv.
A summary of pipeline statuses can be displayed by running the nipoppy status command:
Participant counts by session at each Nipoppy checkpoint
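As an illustration of how the optional arguments combine, the sketch below assembles a track-processing command line and prints it without running it. The helper build_track_command is hypothetical (it is not part of Nipoppy), and the pipeline name, version and participant ID are placeholder values.

```python
import shlex


def build_track_command(pipeline, version=None, step=None,
                        participant_id=None, session_id=None):
    """Assemble a `nipoppy track-processing` command line (hypothetical helper)."""
    args = ["nipoppy", "track-processing", "--pipeline", pipeline]
    optional = [
        ("--pipeline-version", version),
        ("--pipeline-step", step),
        ("--participant-id", participant_id),
        ("--session-id", session_id),
    ]
    # Only add the flags that were actually given, so defaults still apply.
    for flag, value in optional:
        if value is not None:
            args += [flag, value]
    return args


# Placeholder values: track MRIQC 23.1.0 for a single participant.
command = build_track_command("mriqc", version="23.1.0", participant_id="001")
print(shlex.join(command))
```

Omitted arguments are left out of the command entirely, so the defaults described above (latest version, first step, all participants and sessions) still apply.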
╷ ╷ ╷
│ │ │ mriqc
│ │ │ 23.1.0
session_id │ in_manifest │ in_bids │ default
════════════╪═════════════╪═════════╪═════════
1 │ 2 │ 2 │ 2
2 │ 2 │ 2 │ 2
╵ ╵ ╵
Tip
The processing status file can also be uploaded to https://digest.neurobagel.org for filtering and interactive visualizations.
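Because the processing status file is a plain TSV file, per-session counts similar to those in the summary table can also be computed by hand. The following is a minimal sketch, assuming the file has the columns shown in the example table in this guide (participant_id, session_id, pipeline_name, pipeline_version, pipeline_step, status, and so on). The sample rows and the count_successes helper are made up for illustration, and this is not necessarily how nipoppy status computes its table.

```python
import csv
import io
from collections import Counter

# Illustrative content with the processing status columns (rows are made up).
SAMPLE_TSV = (
    "participant_id\tbids_participant_id\tsession_id\tbids_session_id\t"
    "pipeline_name\tpipeline_version\tpipeline_step\tstatus\n"
    "001\tsub-001\t1\tses-1\tmriqc\t23.1.0\tdefault\tSUCCESS\n"
    "002\tsub-002\t1\tses-1\tmriqc\t23.1.0\tdefault\tFAIL\n"
    "001\tsub-001\t2\tses-2\tmriqc\t23.1.0\tdefault\tSUCCESS\n"
)


def count_successes(tsv_file, pipeline_name, pipeline_version, pipeline_step):
    """Count rows with status SUCCESS per session for one pipeline step."""
    counts = Counter()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        if (
            row["pipeline_name"] == pipeline_name
            and row["pipeline_version"] == pipeline_version
            and row["pipeline_step"] == pipeline_step
            and row["status"] == "SUCCESS"
        ):
            counts[row["session_id"]] += 1
    return dict(counts)


counts = count_successes(io.StringIO(SAMPLE_TSV), "mriqc", "23.1.0", "default")
print(counts)
```

To run this against a real dataset, replace the io.StringIO object with a file opened from <NIPOPPY_PROJECT_ROOT>/derivatives/processing_status.tsv.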
Configuring a pipeline tracker
Pipeline completion criteria are defined through the tracker configuration file.
The name of the tracker configuration file can be found in the pipeline’s config file at <NIPOPPY_PROJECT_ROOT>/pipelines/processing/<PIPELINE_NAME>-<PIPELINE_VERSION>/config.json; by default it is called tracker.json:
"STEPS": [
{
"INVOCATION_FILE": "invocation.json",
"DESCRIPTOR_FILE": "descriptor.json",
"HPC_CONFIG_FILE": "hpc.json",
"TRACKER_CONFIG_FILE": "tracker.json"
}
],
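To check which tracker configuration file each step points to, the pipeline config file can be read with any JSON parser. The sketch below is illustrative only: it wraps the fragment above in braces so that it parses on its own (a real config.json contains other fields as well), and tracker_config_names is a hypothetical helper, not a Nipoppy function.

```python
import json

# The STEPS fragment from above, wrapped in braces so it is valid JSON by itself.
CONFIG_TEXT = """
{
    "STEPS": [
        {
            "INVOCATION_FILE": "invocation.json",
            "DESCRIPTOR_FILE": "descriptor.json",
            "HPC_CONFIG_FILE": "hpc.json",
            "TRACKER_CONFIG_FILE": "tracker.json"
        }
    ]
}
"""


def tracker_config_names(config):
    """Return the tracker config file name of each step (None if a step has none)."""
    return [step.get("TRACKER_CONFIG_FILE") for step in config.get("STEPS", [])]


names = tracker_config_names(json.loads(CONFIG_TEXT))
print(names)
```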
Importantly, pipeline completion status is not inferred from exit codes, as trackers are run independently of the pipeline runners. Instead, the status is determined by checking for the presence of expected output files.
Here is an example of a tracker configuration file for the MRIQC pipeline, version 23.1.0:
{
"PATHS": [
"[[NIPOPPY_BIDS_PARTICIPANT_ID]]/[[NIPOPPY_BIDS_SESSION_ID]]/anat/[[NIPOPPY_BIDS_PARTICIPANT_ID]]_[[NIPOPPY_BIDS_SESSION_ID]]*_T1w.json",
"[[NIPOPPY_BIDS_PARTICIPANT_ID]]_[[NIPOPPY_BIDS_SESSION_ID]]*_T1w.html"
]
}
These paths are expected to be relative to the <NIPOPPY_PROJECT_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/output directory.
Tip
“Glob” expressions (i.e., those that include *) are allowed in paths.
If at least one file matches an expression, that expression is considered found.
Note
The template strings [[NIPOPPY_<ATTRIBUTE_NAME>]] are replaced at runtime by appropriate values.
Available template strings are:
- [[NIPOPPY_PARTICIPANT_ID]]: the participant ID without the sub- prefix
- [[NIPOPPY_SESSION_ID]]: the session ID without the ses- prefix
- [[NIPOPPY_BIDS_PARTICIPANT_ID]]: the participant ID with the sub- prefix
- [[NIPOPPY_BIDS_SESSION_ID]]: the session ID with the ses- prefix
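The substitution described above can be sketched as plain string replacement. This is only an illustration of the idea, not Nipoppy's actual implementation; resolve_templates is a hypothetical helper, and the participant and session IDs are example values.

```python
def resolve_templates(pattern, participant_id, session_id):
    """Replace [[NIPOPPY_...]] template strings in a tracker path.

    participant_id and session_id are given without the sub-/ses- prefixes.
    """
    replacements = {
        "[[NIPOPPY_PARTICIPANT_ID]]": participant_id,
        "[[NIPOPPY_SESSION_ID]]": session_id,
        "[[NIPOPPY_BIDS_PARTICIPANT_ID]]": f"sub-{participant_id}",
        "[[NIPOPPY_BIDS_SESSION_ID]]": f"ses-{session_id}",
    }
    for template, value in replacements.items():
        pattern = pattern.replace(template, value)
    return pattern


# First path from the MRIQC tracker configuration shown earlier.
path = (
    "[[NIPOPPY_BIDS_PARTICIPANT_ID]]/[[NIPOPPY_BIDS_SESSION_ID]]/anat/"
    "[[NIPOPPY_BIDS_PARTICIPANT_ID]]_[[NIPOPPY_BIDS_SESSION_ID]]*_T1w.json"
)
resolved = resolve_templates(path, participant_id="001", session_id="1")
print(resolved)
```

The glob wildcard (*) is left untouched by the substitution, so the resolved path is still a pattern to be matched against the pipeline output directory.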
Given a dataset with the following content in <NIPOPPY_PROJECT_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/output:
Running the tracker with the above configuration will result in the processing status file showing:
participant_id | bids_participant_id | session_id | bids_session_id | pipeline_name | pipeline_version | pipeline_step | status |
---|---|---|---|---|---|---|---|
001 | sub-001 | 1 | ses-1 | mriqc | 23.1.0 | default | SUCCESS |
Note
If there is an existing processing status file, the rows relevant to the specific pipeline, participants, and sessions will be updated. Other rows will be left as-is.
The status column can have the following values:
- SUCCESS: all specified paths have been found
- FAIL: at least one of the paths has not been found
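The SUCCESS/FAIL rule can be sketched with glob matching from the Python standard library. This is a simplified illustration under stated assumptions, not Nipoppy's actual tracker code: tracker_status is a hypothetical helper, the template strings are assumed to be already resolved for participant 001 and session 1, and the file names created in the temporary directory are invented for the example.

```python
import tempfile
from pathlib import Path


def tracker_status(output_dir, patterns):
    """Return "SUCCESS" if every pattern matches at least one path, else "FAIL"."""
    output_dir = Path(output_dir)
    for pattern in patterns:
        # Path.glob yields matches lazily; any() stops at the first one found.
        if not any(output_dir.glob(pattern)):
            return "FAIL"
    return "SUCCESS"


# Tracker paths with templates already resolved (participant 001, session 1).
patterns = [
    "sub-001/ses-1/anat/sub-001_ses-1*_T1w.json",
    "sub-001_ses-1*_T1w.html",
]

with tempfile.TemporaryDirectory() as tmp:
    output_dir = Path(tmp)  # stands in for the pipeline output directory

    # Create only the JSON file: the HTML pattern has no match yet.
    json_file = output_dir / "sub-001" / "ses-1" / "anat" / "sub-001_ses-1_T1w.json"
    json_file.parent.mkdir(parents=True)
    json_file.touch()
    status_before = tracker_status(output_dir, patterns)

    # Once the HTML report exists too, every pattern has a match.
    (output_dir / "sub-001_ses-1_T1w.html").touch()
    status_after = tracker_status(output_dir, patterns)

print(status_before, status_after)
```

Since only file presence is checked, a pipeline run that exits without error but does not produce all expected outputs is still reported as FAIL.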