Overview¶
mnefun is designed to streamline ILABS data processing by automating and
standardizing data retrieval, remote machine processing (MaxFilter),
preprocessing steps, and inverse computation.
A critical idea is that, once an experiment is complete, you or another ILABS person should be able to run your analysis script once, from scratch, for all subjects, and end up with all of the basic preprocessed files (evoked, epochs, inverse, etc.) you will need for your downstream scripts used for publication (stats, etc.).
To achieve mnefun’s reproducibility goal, it is important
not to run your processing script by changing parameters for different
subjects as you process each of them, or by doing steps manually.
Where subject-specific values are needed, we can add functionality to allow
subject-specific values in the script itself, such as proj_nums (see below).
Note
The one step that might (somewhat routinely) need to be “worked around” is the data fetching step, which requires that the files on the acquisition machine be named properly. This might not always be the case, for example when:
- files are named incorrectly (typos, inconsistencies) during acquisition
- runs are re-executed and saved with a different name (e.g., _redo_raw.fif).
But this should ideally be the exception and not the rule.
Experiment parameters can be specified using a params = Params(...) call in
a script (old way), or by specifying a YAML file with the experiment
parameters (new way) and using mnefun.read_params() to load the
parameters. The processing pipeline steps and relationships are given below.
All YAML parameters are described in their appropriate sections.
Consider looking at the mnefun/examples/funloc directory for a canonical
example of how to process data using mnefun.
Flow chart¶
Running parameters¶
general: General options¶
- work_dirstr
Working directory, usually “.”.
- subjects_dirstr
Directory containing the structurals.
- subject_indiceslist
Which subjects to process.
- disp_filesbool
Display status.
Note
Anywhere a dict is supported as an option (e.g., mf_prebad or proj_nums),
a special entry '__default__' can be used to turn the dictionary into a
defaultdict instance.
This is useful in cases where a single set of values works for most
subjects, but a few need different ones. For example in YAML form:
proj_nums: {
__default__: [[2, 2, 0], [1, 1, 2], [0, 0, 0]],
subj_08: [[2, 2, 0], [1, 1, 3], [0, 0, 0]],
}
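The effect of the '__default__' entry can be sketched in plain Python. This is only an illustration of the behavior described above, not mnefun's actual implementation; the helper name to_defaultdict and the subject name subj_01 are hypothetical.

```python
from collections import defaultdict


def to_defaultdict(mapping):
    """Turn a dict with a '__default__' entry into a defaultdict.

    Hypothetical helper for illustration; not part of the mnefun API.
    """
    if '__default__' not in mapping:
        return dict(mapping)
    default = mapping['__default__']
    # Any missing key falls back to the '__default__' value
    out = defaultdict(lambda: default)
    out.update((k, v) for k, v in mapping.items() if k != '__default__')
    return out


proj_nums = to_defaultdict({
    '__default__': [[2, 2, 0], [1, 1, 2], [0, 0, 0]],
    'subj_08': [[2, 2, 0], [1, 1, 3], [0, 0, 0]],
})
print(proj_nums['subj_08'][1])  # explicit entry: [1, 1, 3]
print(proj_nums['subj_01'][1])  # not listed, uses the default: [1, 1, 2]
```

Only subj_08 needs to be listed; every other subject picks up the default values.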
1. fetch_raw¶
Fetch raw files from an acquisition machine.
fetch_raw: Raw fetching parameters¶
- subjectslist of str
Subject names.
- structuralslist of str
List of subject structurals.
- dateslist of tuple or None
Dates to use for anonymization. Use “None” to more fully anonymize.
- acq_sshstr
The acquisition machine SSH name.
- acq_dirlist of str
List of paths to search and fetch raw data.
- acq_portint
Acquisition port.
- acq_excludelist of str
Regular expressions to exclude when trying to find the correct remote directory. This can be useful for example if a subject was run more than once, or someone has done some preprocessing or made copies on the acquisition machine, e.g.:
['genz_proc', 'genz_[0-9]+_[0-9]+a']
which means “exclude anything with ‘genz_proc’, or anything with a substring that has ‘genz_’, followed by at least one number, followed by ‘_’, followed by at least one number, followed by ‘a’”. The latter is useful when subjects should be named genz100_9a but have some duplicate directories named genz_100_9a.
- run_nameslist of str
Run names for the paradigm.
- runs_emptylist of str
Empty room run names.
- subject_run_indiceslist of array-like | dict | None
Run indices to include for each subject. This can be a list (must be same length as params.subjects) or a dict (keys are subject strings, values are the run indices) where missing subjects get all runs. None is an alias for “all runs”.
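To see what the acq_exclude example patterns do, the following sketch applies them to a few made-up directory names using Python's re module. It assumes substring-style matching (re.search), as the description of acq_exclude suggests; the helper filter_dirs and the directory list are hypothetical, and mnefun's internal matching may differ.

```python
import re

acq_exclude = ['genz_proc', 'genz_[0-9]+_[0-9]+a']

# Hypothetical directory names found on the acquisition machine
remote_dirs = ['genz100_9a', 'genz_100_9a', 'genz_proc_100', 'genz101_9a']


def filter_dirs(dirs, exclude):
    """Drop every directory name that matches any exclusion pattern."""
    patterns = [re.compile(pattern) for pattern in exclude]
    return [d for d in dirs
            if not any(p.search(d) for p in patterns)]


kept = filter_dirs(remote_dirs, acq_exclude)
print(kept)  # ['genz100_9a', 'genz101_9a']
```

The correctly named genz100_9a survives, while the duplicate genz_100_9a and the preprocessing directory are skipped.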
2. do_score¶
Do the scoring. This converts TTL triggers to meaningful events.
scoring: Scoring parameters¶
- scorecallable | None
Scoring function used to slice data into trials.
- on_processcallable
Called at each processing step.
3. do_sss¶
Warning
Before running SSS, set params.mf_prebad[SUBJ] to a
list of bad MEG channels (str), or (old way) create
SUBJ/raw_fif/SUBJ_prebad.txt with a space-separated list of bad
MEG channel numbers (int).
Using p.mf_autobad=True can help fill in missed bad channels,
but is not as reliable as inspection by an experienced analyst.
Run SSS processing. This will:
- Copy each raw file to the SSS workstation.
- Automatically determine bad channels (only if mf_autobad=True).
- Estimate head positions (remotely if hp_type='maxfilter', otherwise locally), see preprocessing: head_position_estimation: Head position estimation parameters.
- Copy the head positions to the local machine.
- Delete generated files from the remote machine.
- Annotate bad segments automatically, see preprocessing: annotations: Annotation parameters.
- Add any custom annotations (e.g., for segments that operators want to manually mark as bad) that have been saved as FILENAME-custom-annot.fif.
- Run SSS processing locally using mne.preprocessing.maxwell_filter().
The addition of annotations before SSS ensures that tSSS operations are not
disrupted by bad segments of data, and also ensures that the output files
have the annotations (as they are preserved by mnefun).
preprocessing: multithreading: Multithreading parameters¶
- n_jobsint
Number of jobs to use in parallel operations.
- n_jobs_mklint
Number of jobs to spawn in parallel for operations that can make use of MKL threading. If Numpy/Scipy has been compiled with MKL support, it is best to leave this at 1 or 2 since MKL will automatically spawn threads. Otherwise, n_cpu is a good choice.
- n_jobs_firint | str
Number of threads to use for FIR filtering. Can also be ‘cuda’ if the system supports CUDA.
- n_jobs_resampleint | str
Number of threads to use for resampling. Can also be ‘cuda’ if the system supports CUDA.
preprocessing: pre-SSS bads: Automatic bad channel detection¶
- mf_prebaddict
Dict with subject keys, with each value being a list of str of bad MEG channels (e.g., ['MEG0121', 'MEG1743']).
- mf_autobadbool
Default False. If True use Maxwell-filtering-based automatic bad channel detection to mark bad channels prior to SSS.
- mf_autobad_typestr
Default ‘maxwell’. If ‘maxwell’, use MaxFilter to do automatic detection, if ‘python’ (preferred) use MNE-Python.
- mf_badlimitint
MaxFilter threshold for noisy channel detection (default is 7).
preprocessing: head_position_estimation: Head position estimation parameters¶
- coil_t_windowfloat | dict
Time window for coil position estimation.
- coil_t_step_minfloat | dict
Coil step min for head / cHPI coil position estimation.
- coil_dist_limitfloat | dict
Dist limit for coils.
- coil_gof_limitfloat | dict
Goodness of fit limit for coils.
preprocessing: annotations: Annotation parameters¶
- coil_bad_count_duration_limitfloat | dict
Remove segments with < 3 good coils for at least this many sec.
- rotation_limitfloat | dict
Rotation limit (deg/s) for annotating bad segments.
- translation_limitfloat | dict
Head translation limit (m/s) for annotating bad segments.
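As a rough illustration of what coil_bad_count_duration_limit expresses, the sketch below finds stretches where fewer than 3 coils are good for at least a given number of seconds. It is not mnefun's code: the function name, the sample data, and the choice to measure duration as the difference between the first and last bad sample times are all assumptions made for the example.

```python
def find_bad_coil_segments(times, n_good, duration_limit, min_good=3):
    """Return (onset, offset) pairs with too few good coils for too long.

    Hypothetical helper: ``times`` are sample times in seconds and
    ``n_good`` is the number of good coils at each of those times.
    """
    segments = []
    start = None
    last = None
    for idx, (t, n) in enumerate(zip(times, n_good)):
        if n < min_good:
            if start is None:
                start = t  # a new run of bad samples begins
            last = t
        run_ends = n >= min_good or idx == len(times) - 1
        if start is not None and run_ends:
            # Keep the run only if it lasted long enough
            if last - start >= duration_limit:
                segments.append((start, last))
            start = None
    return segments


times = [float(t) for t in range(10)]       # one sample per second
n_good = [5, 5, 2, 2, 2, 2, 5, 1, 5, 5]     # good coils per sample
print(find_bad_coil_segments(times, n_good, duration_limit=3.0))
# [(2.0, 5.0)]
```

The brief dropout at 7 s is ignored because it is shorter than the limit, while the 2 s to 5 s stretch would be marked.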
preprocessing: sss: SSS parameters¶
- movecompstr | None
Movement compensation to use. Can be ‘inter’ or None.
- hp_typestr
Head position estimation method. Must be either ‘maxfilter’ or ‘python’.
- sss_typestr
Signal space separation method. Must be either ‘maxfilter’ or ‘python’.
- int_orderint
Order of internal component of spherical expansion. Default is 8. A value of 6 is recommended for infant data.
- ext_orderint
Order of external component of spherical expansion. Default is 3.
- sss_regularizestr
SSS regularization, usually “in”.
- tsss_durfloat | None
Buffer length (in seconds) for spatiotemporal SSS. Default is 60; however, based on system specification, a shorter buffer may be appropriate. For data containing excessive head movements (e.g., young children), a buffer size of 4 s is recommended.
- st_correlationfloat
Correlation limit between inner and outer subspaces used to reject overlapping (intersecting) inner/outer signals during spatiotemporal SSS. Default is 0.98; however, a smaller value of 0.9 is recommended for infant/child data.
- filter_chpistr
Filter cHPI signals before SSS.
- filter_chpi_t_windowstr | float | None
If None, use coil_t_window. Otherwise, options are the same as coil_t_window.
- trans_tostr | array-like, (3,) | None
The destination location for the head. Can be:
- ‘median’ (default)
Median (across runs) of the starting head positions.
- ‘twa’
Time-weighted average head position.
- None
Will not change the head position.
- str
Path to a FIF file containing a MEG device to head transformation.
- array-like
First three elements are coordinates to translate to. An optional fourth element gives the x-axis rotation (e.g., -30 means a backward 30° rotation).
- sss_originarray-like, shape (3,) | str
Origin of internal and external multipolar moment space in meters. Default is center of sphere fit to digitized head points.
- dig_with_eegbool
If True, include EEG points in estimating the head origin.
- ct_filestr
Cross-talk file, usually “uw” to auto-load the UW file.
- cal_filestr
Calibration file, usually “uw” to auto-load the UW file.
- sss_formatstr
Deprecated. SSS numerical format when using MaxFilter.
- mf_argsstr
Deprecated. Extra arguments for MF SSS.
- cont_as_esssbool
If True (default False), use eSSS to improve the external basis estimate using continuous empty-room projectors (proj_nums[2]). Only supported when Python is used for SSS.
4. do_ch_fix¶
Fix EEG channel ordering, and also anonymize files.
5. gen_ssp¶
Warning
Before running SSP, examine SSS’ed files and make
SUBJ/bads/bad_ch_SUBJ_post-sss.txt; usually, this should only
contain EEG channels. Alternatively, you can use
params.auto_bad = some_float, see
preprocessing: post-SSS bads: Marking bad channels during SSP.
Generate SSP vectors. If additional projectors are required (e.g., to get
rid of muscle movement artifacts in a verbal response paradigm), you can use
p.proj_extra, which gets applied before any other projectors are computed
(e.g., ECG, blink).
preprocessing: filtering: Filtering parameters¶
- hp_cutfloat | None
Highpass cutoff in Hz. Use None for no highpassing.
- hp_transfloat
High-pass transition band.
- lp_cutfloat
Cutoff for lowpass filtering.
- lp_transfloat
Low-pass transition band.
- filter_lengthint | str
- fir_designstr
- fir_windowstr
- phasestr
preprocessing: post-SSS bads: Marking bad channels during SSP¶
- auto_badfloat | None
If not None, bad channels will be automatically excluded after SSS if they disqualify a proportion of events exceeding auto_bad. This does not require the autoreject module.
- auto_bad_rejectstr | dict | None
Default is None. Must be defined if using Autoreject module to compute noisy sensor rejection criteria. Set to ‘auto’ to compute criteria automatically, or dictionary of channel keys and amplitude values e.g., dict(grad=1500e-13, mag=5000e-15, eeg=150e-6) to define rejection threshold(s). See http://autoreject.github.io/ for details.
- auto_bad_flatdict | None
Flat threshold for auto bad.
- auto_bad_eeg_threshint | None
If more than this number of EEG channels is automatically marked bad, an error will be raised. This helps ensure that not too many channels are marked as bad.
- auto_bad_meg_threshint | None
Same as above but for MEG.
preprocessing: ssp: SSP creation parameters¶
- proj_numslist | dict
List of projector counts to use for ECG/blink/ERM/HEOG/VEOG; each list contains three values for grad/mag/eeg channels. Can be a dict that maps subject names to projector counts to use. The order of computation and application is empty-room, ECG, blink, HEOG, VEOG.
ECG, blink, and ERM are obligatory lists (though they can be lists of all zeros). Lists for HEOG and VEOG are optional. For example, if you want 1 blink, 2 HEOG, and 3 VEOG projectors (for a total of 6 EOG-related projectors) for each channel type, you would do:
[[...], [1, 1, 1], [...], [2, 2, 2], [3, 3, 3]]
If you want just blink and HEOG, you can use a list of 4 lists instead of 5 (or 3).
- proj_sfreqfloat | None
The sample freq to use for calculating projectors. Useful since time points are not independent following low-pass. Also saves computation to downsample.
- proj_megstr
Can be “separate” (default for backward compat) or “combined” (should be better for SSS’ed data).
- drop_threshfloat
The percentage threshold to use when deciding whether or not to plot Epochs drop_log.
- plot_rawbool
If True, plot the raw files with the ECG/EOG events overlaid.
- ssp_eog_rejectdict | None
Amplitude rejection criteria for EOG SSP computation. None will use the mne-python default.
- ssp_ecg_rejectdict | None
Amplitude rejection criteria for ECG SSP computation. None will use the mne-python default.
- eog_channelstr | dict | None
The channel to use to detect blink events. None will use EOG* channels. In lieu of an EOG recording, MEG1411 may work.
- heog_channelstr | dict | None
The channel to use to detect HEOG events. None will use EOG061. In lieu of an EOG recording, MEG1411 may work.
- veog_channelstr | dict | None
The channel to use to detect VEOG events. None will use EOG062.
- ecg_channelstr | dict | None
The channel to use to detect ECG events. None will use ECG063. In lieu of an ECG recording, MEG1531 may work. Can be a dict that maps subject names to channels.
- eog_t_limstuple | dict
The time limits for EOG calculation. Default (-0.25, 0.25).
- heog_t_limstuple | dict
The time limits for HEOG calculation. Default (-0.25, 0.25).
- veog_t_limstuple | dict
The time limits for VEOG calculation. Default (-0.25, 0.25).
- ecg_t_limstuple | dict
The time limits for ECG calculation. Default (-0.08, 0.08).
- eog_f_limstuple | dict
Band-pass limits for EOG detection and calculation. Default (0, 2).
- heog_f_limstuple | dict
Band-pass limits for HEOG detection and calculation. Default (0, 2).
- veog_f_limstuple | dict
Band-pass limits for VEOG detection and calculation. Default (0, 2).
- ecg_f_limstuple | dict
Band-pass limits for ECG detection and calculation. Default (5, 35).
- eog_threshfloat | dict | None
Threshold for EOG detection. Can vary per subject.
- heog_threshfloat | dict | None
Threshold for HEOG detection. Can vary per subject.
- veog_threshfloat | dict | None
Threshold for VEOG detection. Can vary per subject.
- proj_avebool
If True, average artifact epochs before computing proj.
- proj_extrastr | None
Extra projector filename to load for each subject, e.g. extra-proj.fif will load SUBJ/sss_pca_fif/extra-proj.fif.
- get_projs_fromlist of int | dict
Indices for runs to get projectors from.
- cont_hpfloat
Highpass to use for continuous ERM projectors (default None).
- cont_hp_transfloat | None
Highpass transition bandwidth to use for continuous ERM projectors (default 0.5).
- cont_lpfloat
Lowpass to use for continuous ERM projectors (default 5).
- cont_lp_transfloat | None
Lowpass transition bandwidth for continuous ERM projectors (default None).
- cont_rejectdict | None
Rejection parameters for continuous empty-room projection calculations. None (default) will use params.reject. This likely needs to be set when cont_as_esss=True.
- plot_drop_logsbool
If True, plot drop logs after preprocessing.
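The nested layout of proj_nums described above (ECG, blink, ERM, then optional HEOG and VEOG, each with grad/mag/eeg counts) can be made explicit with a small labelling sketch. The helper describe_proj_nums and the example counts are hypothetical and only illustrate the documented ordering; they are not part of mnefun.

```python
PROJ_KINDS = ('ecg', 'blink', 'erm', 'heog', 'veog')
CH_TYPES = ('grad', 'mag', 'eeg')


def describe_proj_nums(proj_nums):
    """Label each list in proj_nums with its projector kind and channel type."""
    if not 3 <= len(proj_nums) <= len(PROJ_KINDS):
        raise ValueError('proj_nums must contain 3 to 5 lists, got %d'
                         % len(proj_nums))
    out = {}
    for kind, counts in zip(PROJ_KINDS, proj_nums):
        if len(counts) != len(CH_TYPES):
            raise ValueError('%s needs one count per channel type' % kind)
        out[kind] = dict(zip(CH_TYPES, counts))
    return out


# 1 blink, 2 HEOG, and 3 VEOG projectors per channel type; the ECG and
# ERM counts here are made-up values for the example
labelled = describe_proj_nums(
    [[2, 2, 0], [1, 1, 1], [0, 0, 0], [2, 2, 2], [3, 3, 3]])
print(labelled['heog'])  # {'grad': 2, 'mag': 2, 'eeg': 2}

eog_total = sum(labelled[kind]['mag'] for kind in ('blink', 'heog', 'veog'))
print(eog_total)  # 6 EOG-related projectors for magnetometers
```

Passing only three or four lists simply leaves the trailing HEOG/VEOG entries out of the result.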
6. apply_ssp¶
Apply SSP vectors and filtering to the files.
7. write_epochs¶
Write epochs to disk.
epoching: Epoching parameters¶
- tminfloat
tmin for events.
- tmaxfloat
tmax for events.
- t_adjustfloat
Adjustment for delays (e.g., -4e-3 compensates for a 4 ms delay in the trigger).
- baselinetuple | None | str
Baseline to use. If “individual”, use params.bmin and params.bmax, otherwise pass as the baseline parameter to mne-python Epochs. params.bmin and params.bmax will always be used for covariance calculation. This is useful e.g. when using a high-pass filter and no baselining is desired (but evoked covariances should still be calculated from the baseline period).
- bminfloat
Lower limit for baseline compensation.
- bmaxfloat
Upper limit for baseline compensation.
- decimint | float | list
Amount to decimate the data after filtering when epoching data (e.g., a factor of 5 on 1000 Hz data yields 200 Hz data). If a float is used, it should be the destination sample rate (e.g., a value of 200. with 1000 Hz data will use decim=5).
- epochs_typestr | list
Can be ‘fif’, ‘mat’, or a list containing both.
- match_funcallable | None
If None, standard matching will be performed. If a function, must_match will be ignored, and match_fun will be called to equalize event counts.
- rejectdict
Rejection parameters for epochs.
- flatdict
Flat thresholds for epoch rejection.
- reject_tminfloat | None
Reject minimum time to use when epoching. None will use tmin.
- reject_tmaxfloat | None
Reject maximum time to use when epoching. None will use tmax.
- on_missingstring
Can set to ‘error’ | ‘warning’ | ‘ignore’. Default is ‘error’. Determine what to do if one or several event ids are not found in the recording during epoching. See mne.Epochs docstring for further details.
- autoreject_thresholdsbool | False
If True use autoreject module to compute global rejection thresholds for epoching. Make sure autoreject module is installed. See http://autoreject.github.io/ for instructions.
- autoreject_typestuple
Default is (‘mag’, ‘grad’, ‘eeg’). Can set to (‘mag’, ‘grad’, ‘eeg’, ‘eog’) to use EOG channel rejection criterion from autoreject module to reject trials on basis of EOG.
- reject_epochs_by_annotbool | str
If True, reject epochs by BAD annotations. If str, will reject epochs by annotations that match the given regular expression str.
- pick_events_autorejectcallable | string | None
Function for picking autoreject events, or the string “restrict” to limit events to those with an id in in_numbers.
- analyseslist of str
Lists of analyses of interest.
- in_nameslist of str
Names of input events.
- in_numberslist of list of int
Event numbers (in scored event files) associated with each name.
- out_nameslist of list of str
Event types to make out of old ones.
- out_numberslist of list of int
Event numbers to convert to (e.g., [[1, 1, 2, 3, 3], …] would create three event types, where the first two and last two event types from the original list get collapsed over).
- must_matchlist of int
Indices from the original in_names that must match in event counts before collapsing. Should eventually be expanded to allow for ratio-based collapsing.
- every_otherbool
If True, in addition to standard averages / evoked data, averages will be computed from every other trial, i.e., from even and odd trials separately. This can help assess the SNR of the data.
- epochs_projbool | ‘delayed’
The proj argument in mne.Epochs. Should be 'delayed' if you want the option of plotting sensor-space data with no projectors.
- allow_resamplebool
If True (default False), allow resampling raw instances (and events) to that of the first raw instance in the case that raws do not all have a matching sample rate. This is useful when recordings were errantly performed at different sample rates.
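Two of the epoching parameters above involve a little arithmetic: decim (an integer factor, or a float giving the destination sample rate) and out_numbers (which collapses input event types into output ones). The sketch below illustrates both under stated assumptions: the helper names and event names are hypothetical, out_numbers is handled for a single analysis only, and output numbers are assumed to run 1..n in the same order as out_names.

```python
def resolve_decim(decim, sfreq):
    """Return an integer decimation factor for data sampled at sfreq (Hz).

    An int is used as the factor directly; a float is interpreted as the
    destination sample rate.
    """
    if isinstance(decim, int):
        return decim
    factor = sfreq / float(decim)
    if abs(factor - round(factor)) > 1e-6:
        raise ValueError('sfreq %s is not an integer multiple of %s'
                         % (sfreq, decim))
    return int(round(factor))


def collapse_events(in_names, out_names, out_numbers):
    """Map each input event name to the output event type it collapses into."""
    if len(in_names) != len(out_numbers):
        raise ValueError('need one output number per input name')
    if len(set(out_numbers)) != len(out_names):
        raise ValueError('need one output name per distinct output number')
    return {name: out_names[number - 1]
            for name, number in zip(in_names, out_numbers)}


print(resolve_decim(5, 1000.))     # 5
print(resolve_decim(200., 1000.))  # 5

mapping = collapse_events(
    in_names=['a1', 'a2', 'b', 'c1', 'c2'],
    out_names=['A', 'B', 'C'],
    out_numbers=[1, 1, 2, 3, 3],
)
print(mapping['a2'], mapping['b'], mapping['c1'])  # A B C
```

With out_numbers of [1, 1, 2, 3, 3], the first two and last two input types are merged, giving three output event types.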
8. gen_covs¶
Generate covariances.
covariance: Covariance parameters¶
- cov_methodstr
Covariance calculation method.
- compute_rankbool
Default is False. Set to True to compute rank of the noise covariance matrix during inverse kernel computation.
- pick_events_covcallable | string | None
Function for picking covariance events, or the string “restrict” to limit events to those with an id in in_numbers.
- cov_rankstr | int
Cov rank to use, usually “auto”.
- cov_rank_methodstr
Can be “estimate_rank” to use mne.rank.estimate_rank, or “compute_rank” to use mne.compute_rank(). The latter seems to work better for custom tol values by not row-normalizing data.
- cov_rank_tolfloat | str
Tolerance for covariance rank computation. Can also be “auto” or “float32”, though these tend not to be very robust.
- force_erm_cov_rank_fullbool
If True, force the ERM cov to be full rank. Usually not needed, but might help when the empty-room data is short and/or there are a lot of head movements.
9. gen_fwd¶
Warning
Make SUBJ/trans/SUBJ-trans.fif using mne coreg.
Generate forward solutions (and source space if necessary).
forward: Forward parameters¶
- bem_typestr
Defaults to '5120-5120-5120', use '5120' for a single-layer BEM.
- srcstr | dict
Can be:
- ‘oct6’ to use a surface source space decimated using the 6th (or another integer) subdivision of an octahedron, or
- ‘vol5’ to use a volumetric grid source space with 5 mm (or another integer) spacing
- src_posfloat
Default is 7 mm. Defines source grid spacing for volumetric source space.
- fwd_mindistfloat
Minimum distance (mm) for sources in the brain from the skull in order for them to be included in the forward solution source space.
10. gen_inv¶
Generate inverses.
inverse: Inverse parameters¶
- inv_nameslist of str
Inverse names to use.
- inv_runslist of int
Runs to use for each inverse.
11. gen_report¶
Write mne.Report HTML of results to disk.
report_params: Report parameters¶
- pre_funcallable
Function to run before adding any Report sections. Must have the signature:
def pre_fun(report, p, subject, **kwargs): ...
The **kwargs is necessary for future compatibility.
- chpi_snrbool
cHPI SNR (default True).
- good_hpi_countbool
Number of good HPI coils (default True).
- head_movementbool
Head movement (default True).
- raw_segmentsbool
10 evenly spaced raw data segments (default True).
- psdbool
Raw PSDs, often slow (default True).
- ssp_topomapsbool
SSP topomaps (default True).
- source_alignmentbool
Source alignment (default True).
- drop_logbool
Plot the epochs drop log (default True).
- covariancebool
Covariance image and SVD plots.
- bembool
Plot the BEM.
- snrdict
SNR plots, with keys ‘analysis’, ‘name’, and ‘inv’.
- whiteningdict
Whitening plots, with keys ‘analysis’, ‘name’, and ‘cov’.
- sensordict
Sensor topomaps, with keys ‘analysis’, ‘name’, ‘times’, and ‘proj’. ‘proj’ can be True (default), False, or ‘reconstruct’. False and ‘reconstruct’ require epochs_proj='delayed'.
- sourcedict
Source plots, with keys ‘analysis’, ‘name’, ‘inv’, ‘times’, ‘views’, and ‘size’.
- post_funcallable
Function to run after adding all other Report sections. Must have the same signature as pre_fun above.
- preloadbool
If True (default False), load all raw data into memory before generating plots. Can help speed up computations like PSD estimates, but can also consume a large amount of memory.
Filename standardization¶
mnefun imposes custom standardized structure on filenames:
Preparing your machine for MaxFilter use¶
Warning
Head position estimation and bad channel detection are now
available using hp_type='python' and mf_autobad_type='python',
respectively.
These are the preferred processing methods going forward
(as of March 2020), and using MaxFilter should be considered
deprecated.
Parameters for remotely connecting to SSS workstation (‘sws’) can be set
by adding a file ~/.mnefun/mnefun.json with contents like:
$ mkdir ~/.mnefun
$ echo '{"sws_ssh":"kasga", "sws_dir":"/data06/larsoner/sss_work", "sws_port":22}' > ~/.mnefun/mnefun.json
This should be preferred to the old way, which was to set in each script when running on your machine:
params.sws_ssh = 'kasga'
params.sws_dir = '/data06/larsoner/sss_work'
Using per-machine config files rather than per-script variables should help increase portability of scripts without hurting reproducibility (assuming we all use the same version of MaxFilter, which should be a safe assumption).
To test that things are configured correctly, you can do:
$ python -c "import mnefun; mnefun.check_sws()"
On kasga: maxfilter -version (0 sec)
Output:
Revision: 2.2.15 Neuromag maxfilter Dec 11 2012 14:48:44
If you get an error:
- Ensure that your file is correctly set up in ~/.mnefun/mnefun.json. It needs to use standard quotation marks like ", not fancy ones like ”, so ensure that your text editor (if you used one) did not use fancy quotation marks.
- Ensure that maxfilter is accessible as a command on the remote machine. Log into the remote machine and do:
$ which maxfilter
/neuro/bin/util/maxfilter
If you get no output with this command, it means that MaxFilter is not available on your PATH on the remote machine. To fix this, consider adding the following line to the end of your ~/.bashrc on the remote machine:
export PATH=${PATH}:/neuro/bin/util:/neuro/bin/X11
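The quotation-mark problem mentioned above can be checked locally before involving the remote machine. The following sketch reads a config file, rejects typographic quotes, and confirms that the three keys from the example file are present. The function load_sws_config is hypothetical (it is not part of mnefun), treating all three keys as required is an assumption, and the demonstration writes to a temporary directory rather than the real ~/.mnefun.

```python
import json
import os
import tempfile

# Keys taken from the example config; treating them as required is an assumption
EXPECTED_KEYS = ('sws_ssh', 'sws_dir', 'sws_port')
FANCY_QUOTES = '\u201c\u201d\u2018\u2019'  # curly double and single quotes


def load_sws_config(path):
    """Read a JSON config file and run some basic sanity checks on it."""
    with open(path, encoding='utf-8') as fid:
        text = fid.read()
    if any(quote in text for quote in FANCY_QUOTES):
        raise ValueError('config contains typographic quotes; use plain " instead')
    config = json.loads(text)
    missing = [key for key in EXPECTED_KEYS if key not in config]
    if missing:
        raise KeyError('missing config entries: %s' % ', '.join(missing))
    return config


# Writing with json.dump always produces standard quotation marks
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'mnefun.json')
    with open(path, 'w', encoding='utf-8') as fid:
        json.dump({'sws_ssh': 'kasga',
                   'sws_dir': '/data06/larsoner/sss_work',
                   'sws_port': 22}, fid)
    config = load_sws_config(path)

print(config['sws_ssh'], config['sws_port'])  # kasga 22
```

Generating the file with json.dump, as in the demonstration, sidesteps editor-inserted curly quotes entirely.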