ddmra.workflows.run_analyses
- run_analyses(files, qc, out_dir='.', confounds=None, n_iters=10000, n_jobs=1, qc_thresh=0.2, window=1000, analyses=('qcrsfc', 'highlow', 'scrubbing'), verbose=False, pca_threshold=None, outlier_threshold=None, atlas='power_2011', sphere_radius=5.0, run_covariates=None, run_denoising_metrics=None, highlow_cut=0.5)
Run scrubbing, high-low motion, and QCRSFC analyses.
- Parameters:
files ((N,) list of nifti files) – List of 4D (X x Y x Z x T) images in MNI space.
qc ((N,) list of array_like) – List of 1D (T) numpy arrays with QC metric values per img (e.g., FD or respiration).
out_dir (str, optional) – Output directory. Default is current directory.
confounds (None or (N,) list of array-like, optional) – List of 2D numpy arrays (T rows, one column per confound) with confounds per img. Default is None (no confounds are removed).
n_iters (int, optional) – Number of iterations to run to generate null distributions. Default is 10000.
n_jobs (int, optional) – The number of CPUs to use to do the computation. -1 means ‘all CPUs’. Default is 1.
qc_thresh (float, optional) – Threshold for QC metric used in scrubbing analysis. Default is 0.2 (for FD).
window (int, optional) – Number of units (pairs of ROIs) to include when averaging to generate smoothing curve. Default is 1000.
analyses (tuple, optional) – The analyses to run. Must be one or more of “qcrsfc”, “highlow”, “scrubbing”. Default is all three.
verbose (bool, optional) – If verbose, write out the correlation coefficients used by the QC:RSFC and high-low analyses. Default is False.
pca_threshold (None or float or int, optional) – If None, do not perform outlier detection at all. If a float, perform PCA and retain the components that explain that proportion of the variance. If an int, perform PCA and retain that number of components.
outlier_threshold (None or float, optional) – If None, do not perform outlier detection at all. If a float, flag any run whose Mahalanobis distance p-value, computed from a chi-squared distribution, is below that value.
atlas (str or path-like, optional) – Atlas to use for ROI time series extraction. If a path to an existing file, the file is treated as a labels image and loaded with nilearn.maskers.NiftiLabelsMasker. Otherwise, the value is treated as the name of a coordinate atlas available through nilearn.datasets, such as "power_2011", "dosenbach_2010", or "seitzman_2018", and loaded with nilearn.maskers.NiftiSpheresMasker. Default is "power_2011".
sphere_radius (float, optional) – Radius in millimeters for sphere atlases. Ignored when atlas is a labels image. Default is 5.0.
run_covariates (None or pandas.DataFrame, optional) – Run-level covariates to adjust for in the QC:RSFC analysis. Rows must correspond to files in order. Numeric columns are used directly, and categorical columns are dummy-coded with one reference level. Default is None.
run_denoising_metrics (None or pandas.DataFrame, optional) – Run-level denoising and data-loss metrics to include in run_denoising_summary.tsv. Rows must correspond to files in order, and columns must be numeric. Useful columns include upstream retained-volume counts, censored-volume counts, or temporal degrees of freedom after denoising. Default is None.
highlow_cut (float, optional) – Fraction of runs assigned to each extreme QC group in the high-low analysis, in (0, 0.5]. 0.5 (default) is a median split; smaller values (e.g., 0.25 for top vs bottom quartiles) contrast the QC extremes and drop the middle runs. See ddmra.analysis.highlow_analysis().
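To make the highlow_cut description concrete, the sketch below splits runs into low-QC and high-QC groups from their mean QC values. It is an illustration only: split_extremes is a hypothetical helper, not part of ddmra, and the floor-based rounding of the group size is an assumption. ddmra.analysis.highlow_analysis() may handle rounding and ties differently.

```python
import numpy as np


def split_extremes(mean_qc, highlow_cut=0.5):
    """Return (low_idx, high_idx), the run indices in each extreme QC group.

    Hypothetical helper for illustration; not part of ddmra.
    """
    if not 0 < highlow_cut <= 0.5:
        raise ValueError("highlow_cut must be in (0, 0.5]")
    mean_qc = np.asarray(mean_qc, dtype=float)
    n_runs = mean_qc.size
    # Assumed rounding rule: floor of the requested fraction of runs.
    n_per_group = int(np.floor(highlow_cut * n_runs))
    order = np.argsort(mean_qc, kind="stable")  # lowest mean QC first
    low_idx = order[:n_per_group]
    high_idx = order[n_runs - n_per_group:]
    return low_idx, high_idx


# Eight runs with made-up mean FD values.
mean_qc = np.array([0.10, 0.45, 0.22, 0.31, 0.08, 0.60, 0.15, 0.27])

# Median split: every run lands in one of the two groups.
low_median, high_median = split_extremes(mean_qc, highlow_cut=0.5)

# Quartile split: two runs per group, the middle four are dropped.
low_quart, high_quart = split_extremes(mean_qc, highlow_cut=0.25)
```

With an odd number of runs, this rounding rule leaves the middle run out of a median split, which is one reason the group sizes reported by the package should be checked rather than inferred.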
Notes
At least MIN_SUBJECTS (10) runs must be retained for analysis, or a ValueError is raised. When the QC:RSFC or high-low analyses are requested and fewer than QCFC_STABILITY_N (30) runs are retained, a UserWarning is issued because QC-FC estimates are unstable in small samples (Parkes et al., 2018; Ciric et al., 2017); the analyses still run, but their intercept and slope summaries should be interpreted with caution.
This function writes out several files to out_dir:
analysis_values.tsv.gz: Raw analysis values for analyses. Has four columns: distance, qcrsfc, scrubbing, and highlow.
smoothing_curves.tsv.gz: Smoothing curve information for analyses. Has four columns: distance, qcrsfc, scrubbing, and highlow.
null_smoothing_curves.npz: Null smoothing curves from each analysis. Contains three 2D arrays, where the number of columns is the same size and order as the distance column in smoothing_curves.tsv.gz and the number of rows is the number of iterations for the permutation analysis. The three arrays’ keys are ‘qcrsfc’, ‘highlow’, and ‘scrubbing’.
ranks.tsv.gz: Diagnostic edgewise ranks of the observed analysis values against the edgewise null distributions. These ranks are not inferential p-values.
qcrsfc_summary.tsv (only when the qcrsfc analysis is requested): Descriptive QC-FC benchmark summaries, including the median absolute QC-FC correlation and the percentage of edges with a significant QC-FC correlation (Ciric et al., 2017; Parkes et al., 2018). These are diagnostics, not the package’s inferential result.
run_denoising_summary.tsv: Run-level volume, confound-regressor, retention, and optional user-provided tDOF/data-loss accounting.
[analysis]_analysis.png: Figure for each analysis.
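As a reading aid for the files above, the following sketch loads smoothing_curves.tsv.gz and null_smoothing_curves.npz and attaches a pointwise percentile band from the null curves to one observed curve. The file names, column names, and array keys come from the descriptions above. The helper null_envelope, its alpha argument, and the synthetic demo files are assumptions made for illustration, and the band is a descriptive summary rather than the package's inferential test.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def null_envelope(out_dir, analysis="qcrsfc", alpha=0.05):
    """Pair one observed smoothing curve with a pointwise null percentile band.

    Hypothetical helper for illustration; not part of ddmra.
    """
    out_dir = Path(out_dir)
    curves = pd.read_csv(out_dir / "smoothing_curves.tsv.gz", sep="\t")
    with np.load(out_dir / "null_smoothing_curves.npz") as nulls:
        null_curves = nulls[analysis]  # shape: (n_iters, n_distances)
    if null_curves.shape[1] != len(curves):
        raise ValueError("Null curves and smoothing curve differ in length.")
    lower, upper = np.percentile(
        null_curves, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0
    )
    return pd.DataFrame(
        {
            "distance": curves["distance"],
            "observed": curves[analysis],
            "null_lower": lower,
            "null_upper": upper,
        }
    )


# Demo on synthetic stand-ins for the real outputs (values are random noise).
rng = np.random.default_rng(0)
analysis_names = ("qcrsfc", "highlow", "scrubbing")
n_iters, n_distances = 200, 20
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    observed = {name: rng.normal(size=n_distances) for name in analysis_names}
    observed["distance"] = np.linspace(10, 150, n_distances)
    pd.DataFrame(observed).to_csv(
        tmp / "smoothing_curves.tsv.gz", sep="\t", index=False
    )
    np.savez(
        tmp / "null_smoothing_curves.npz",
        **{name: rng.normal(size=(n_iters, n_distances)) for name in analysis_names},
    )
    band = null_envelope(tmp, analysis="qcrsfc")
```

On real output, pass the out_dir given to run_analyses in place of the temporary directory.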
If verbose is True:
z_corrs.tsv.gz: Z-transformed correlation coefficients for the good files, used by QC:RSFC and high-low analyses.
mean_qcs.tsv.gz: Mean QC values for the good files, used by QC:RSFC and high-low analyses.
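The two verbose outputs are the ingredients of a QC:RSFC value: for each ROI pair, the correlation across runs between mean QC and z-transformed connectivity. The sketch below shows that computation on synthetic arrays. It assumes a runs-by-edges layout and a plain Pearson correlation with no run_covariates adjustment. The on-disk layout of z_corrs.tsv.gz is not described here, so the example does not read the files, and qcrsfc_values is a hypothetical helper rather than the package's implementation.

```python
import numpy as np


def qcrsfc_values(z_corrs, mean_qc):
    """Pearson correlation between mean QC and each edge, computed across runs.

    z_corrs : (n_runs, n_edges) z-transformed correlation coefficients.
    mean_qc : (n_runs,) mean QC value per run.
    Hypothetical helper for illustration; not part of ddmra.
    """
    z_corrs = np.asarray(z_corrs, dtype=float)
    mean_qc = np.asarray(mean_qc, dtype=float)
    if z_corrs.shape[0] != mean_qc.shape[0]:
        raise ValueError("z_corrs and mean_qc must have one row/value per run.")
    qc_centered = mean_qc - mean_qc.mean()
    z_centered = z_corrs - z_corrs.mean(axis=0)
    numerator = qc_centered @ z_centered
    denominator = np.sqrt((qc_centered**2).sum() * (z_centered**2).sum(axis=0))
    return numerator / denominator


# Synthetic data: 40 runs, 6 edges, with edge 0 made to depend on motion.
rng = np.random.default_rng(42)
n_runs, n_edges = 40, 6
mean_qc = rng.uniform(0.05, 0.5, size=n_runs)
z_corrs = rng.normal(size=(n_runs, n_edges))
z_corrs[:, 0] += 5.0 * mean_qc

qcrsfc = qcrsfc_values(z_corrs, mean_qc)
```

Each entry of qcrsfc corresponds to one edge; in the real analysis these values are paired with inter-ROI distance before smoothing.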