Overview ¶

mnefun is designed to streamline ILABS data processing by automating and standardizing data retrieval, remote machine processing (MaxFilter), preprocessing steps, and inverse computation.

A critical idea is that, once an experiment is complete, you or another ILABS person should be able to run your analysis script once, from scratch, for all subjects, and end up with all of the basic preprocessed files (evoked, epochs, inverse, etc.) you will need for your downstream scripts used for publication (stats, etc.).

To achieve mnefun’s reproducibility goal, it is important not to change parameters between subjects as you process each of them, and not to do steps manually. Where subject-specific values are needed, we can add functionality to allow subject-specific values in the script itself, such as proj_nums (see below).

Note

The one step that might (somewhat routinely) need to be “worked around” is the data fetching step, which requires that the files on the acquisition machine be named properly, which might not always be the case, for example when:

files are named incorrectly (typos, inconsistently) during acquisition
runs are re-executed and saved with a different name (e.g., _redo_raw.fif).

But this should ideally be the exception and not the rule.

Experiment parameters can be specified using a params = Params(...) call in a script (old way), or by writing a YAML file with the experiment parameters (new way) and loading it with mnefun.read_params(). The processing pipeline steps and their relationships are given below. All YAML parameters are described in their appropriate sections. See the mnefun/examples/funloc directory for a canonical example of how to process data using mnefun.

Flow chart ¶

Running parameters ¶

`general`: General options ¶

work_dir (str): Working directory, usually “.”.
subjects_dir (str): Directory containing the structurals.
subject_indices (list): Which subjects to process.
disp_files (bool): Display status.

Note

Anywhere a dict is supported as an option (e.g., mf_prebad or proj_nums), a special entry '__default__' can be used to turn the dictionary into a defaultdict instance. This is useful in cases where a single set of values works for most subjects, but a few need different ones. For example in YAML form:

proj_nums: {
  __default__: [[2, 2, 0], [1, 1, 2], [0, 0, 0]],
  subj_08: [[2, 2, 0], [1, 1, 3], [0, 0, 0]],
  }
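
The fallback behavior described in this note can be sketched in plain Python. This is only an illustration of how a '__default__' entry could map onto a defaultdict; the helper name expand_default is hypothetical and is not part of mnefun.

```python
from collections import defaultdict


def expand_default(mapping):
    """Turn a dict with a '__default__' entry into a defaultdict (hypothetical helper)."""
    if "__default__" not in mapping:
        return dict(mapping)
    default = mapping["__default__"]
    # Any subject without its own entry falls back to the default value.
    out = defaultdict(lambda: default)
    out.update({k: v for k, v in mapping.items() if k != "__default__"})
    return out


proj_nums = expand_default({
    "__default__": [[2, 2, 0], [1, 1, 2], [0, 0, 0]],
    "subj_08": [[2, 2, 0], [1, 1, 3], [0, 0, 0]],
})
print(proj_nums["subj_01"])  # no entry for this subject, so the default is used
print(proj_nums["subj_08"])  # subject-specific override
```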

1. fetch_raw ¶

Fetch raw files from an acquisition machine.

`fetch_raw`: Raw fetching parameters ¶

subjects (list of str)

Subject names.

structurals (list of str)

List of subject structurals.

dates (list of tuple or None)

Dates to use for anonymization. Use “None” to more fully anonymize.

acq_ssh (str)

The acquisition machine SSH name.

acq_dir (list of str)

List of paths to search and fetch raw data.

acq_port (int)

Acquisition port.

acq_exclude (list of str)

Regular expressions to exclude when trying to find the correct remote directory. This can be useful for example if a subject was run more than once, or someone has done some preprocessing or made copies on the acquisition machine, e.g.:

['genz_proc', 'genz_[0-9]+_[0-9]+a']

which means “exclude anything with ‘genz_proc’; or anything with a substring that has ‘genz_’, followed by at least one number, followed by ‘_’, followed by at least one number, followed by ‘a’” – the latter being useful when subjects should be named genz100_9a but have some duplicate directories named genz_100_9a.
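
To check a set of exclusion patterns before fetching, something like the following can be run locally. It is a sketch that assumes the patterns are matched anywhere in the directory name (re.search semantics, in line with the "anything with a substring" description above); the directory names are made up for illustration.

```python
import re

acq_exclude = ["genz_proc", "genz_[0-9]+_[0-9]+a"]
# Hypothetical directory names found on the acquisition machine.
candidates = ["genz100_9a", "genz_100_9a", "genz_proc_100_9a"]


def is_excluded(name, patterns):
    # re.search matches anywhere in the string, not only at the start.
    return any(re.search(pattern, name) for pattern in patterns)


kept = [name for name in candidates if not is_excluded(name, acq_exclude)]
print(kept)
```

Only the correctly named genz100_9a directory survives the two patterns.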

run_names (list of str)

Run names for the paradigm.

runs_empty (list of str)

Empty room run names.

subject_run_indices (list of array-like | dict | None)

Run indices to include for each subject. This can be a list (must be same length as params.subjects) or a dict (keys are subject strings, values are the run indices) where missing subjects get all runs. None is an alias for “all runs”.
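
The three accepted forms can be summarized with a small sketch. The function resolve_run_indices is a hypothetical helper written for illustration, not mnefun's own code; it only mirrors the rules stated above (list per subject, dict with missing subjects getting all runs, None meaning all runs).

```python
def resolve_run_indices(subjects, n_runs, subject_run_indices=None):
    """Return {subject: run indices} following the documented rules (hypothetical helper)."""
    all_runs = list(range(n_runs))
    if subject_run_indices is None:
        # None is an alias for "all runs" for every subject.
        return {subject: all_runs for subject in subjects}
    if isinstance(subject_run_indices, dict):
        # Subjects missing from the dict get all runs.
        return {
            subject: list(subject_run_indices.get(subject, all_runs))
            for subject in subjects
        }
    if len(subject_run_indices) != len(subjects):
        raise ValueError("the list form needs one entry per subject")
    return {
        subject: list(inds) for subject, inds in zip(subjects, subject_run_indices)
    }


subjects = ["subj_01", "subj_02"]
as_dict = resolve_run_indices(subjects, 3, {"subj_02": [0, 2]})
print(as_dict)
```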

2. do_score ¶

Do the scoring. This converts TTL triggers to meaningful events.

`scoring`: Scoring parameters ¶

score (callable | None): Scoring function used to slice data into trials.
on_process (callable): Called at each processing step.

3. do_sss ¶

Warning

Before running SSS, set params.mf_prebad[SUBJ] to a list of bad MEG channels (str), or (old way) create SUBJ/raw_fif/SUBJ_prebad.txt with space-separated list of bad MEG channel numbers (int). Using p.mf_autobad=True can help fill in missed bad channels, but is not as reliable as experienced analyst inspection.

Run SSS processing. This will:

Copy each raw file to the SSS workstation.
Automatically determine bad channels (only if mf_autobad=True)
Estimate head positions (remotely if hp_type='maxfilter', otherwise locally), see preprocessing: head_position_estimation: Head position estimation parameters.
Copy the head positions to the local machine.
Delete generated files from the remote machine.
Annotate bad segments automatically, see preprocessing: annotations: Annotation parameters.
Add any custom annotations (e.g., for segments that operators want to manually mark as bad) that have been saved as FILENAME-custom-annot.fif.
Run SSS processing locally using mne.preprocessing.maxwell_filter().

The addition of annotations before SSS ensures that tSSS operations are not disrupted by bad segments of data, and also ensures that the output files have the annotations (as they are preserved by mnefun).

`preprocessing: multithreading`: Multithreading parameters ¶

n_jobs (int): Number of jobs to use in parallel operations.
n_jobs_mkl (int): Number of jobs to spawn in parallel for operations that can make use of MKL threading. If Numpy/Scipy has been compiled with MKL support, it is best to leave this at 1 or 2 since MKL will automatically spawn threads. Otherwise, n_cpu is a good choice.
n_jobs_fir (int | str): Number of threads to use for FIR filtering. Can also be ‘cuda’ if the system supports CUDA.
n_jobs_resample (int | str): Number of threads to use for resampling. Can also be ‘cuda’ if the system supports CUDA.

`preprocessing: pre-SSS bads`: Automatic bad channel detection ¶

mf_prebad (dict): Dict with subject keys, with each value being a list of str of bad MEG channels (e.g., ['MEG0121', 'MEG1743']).
mf_autobad (bool): Default False. If True use Maxwell-filtering-based automatic bad channel detection to mark bad channels prior to SSS.
mf_autobad_type (str): Default ‘maxwell’. If ‘maxwell’, use MaxFilter to do automatic detection, if ‘python’ (preferred) use MNE-Python.
mf_badlimit (int): MaxFilter threshold for noisy channel detection (default is 7).

`preprocessing: head_position_estimation`: Head position estimation parameters ¶

coil_t_window (float | dict): Time window for coil position estimation.
coil_t_step_min (float | dict): Coil step min for head / cHPI coil position estimation.
coil_dist_limit (float | dict): Dist limit for coils.
coil_gof_limit (float | dict): Goodness of fit limit for coils.

`preprocessing: annotations`: Annotation parameters ¶

coil_bad_count_duration_limit (float | dict): Remove segments with < 3 good coils for at least this many sec.
rotation_limit (float | dict): Rotation limit (deg/s) for annotating bad segments.
translation_limit (float | dict): Head translation limit (m/s) for annotating bad segments.

`preprocessing: sss`: SSS parameters ¶

movecomp (str | None)

Movement compensation to use. Can be ‘inter’ or None.

hp_type (str)

Head position estimation method. Must be either ‘maxfilter’ or ‘python’.

sss_type (str)

Signal space separation method. Must be either ‘maxfilter’ or ‘python’.

int_order (int)

Order of internal component of spherical expansion. Default is 8. A value of 6 is recommended for infant data.

ext_order (int)

Order of external component of spherical expansion. Default is 3.

sss_regularize (str)

SSS regularization, usually “in”.

tsss_dur (float | None)

Buffer length (in seconds) for spatiotemporal SSS. Default is 60; however, a shorter buffer may be appropriate depending on system specifications. For data containing excessive head movements (e.g., young children), a buffer size of 4 s is recommended.

st_correlation (float)

Correlation limit between inner and outer subspaces used to reject overlapping inner/outer signals during spatiotemporal SSS. Default is 0.98; however, a smaller value of 0.9 is recommended for infant/child data.

filter_chpi (str)

Filter cHPI signals before SSS.

filter_chpi_t_window (str | float | None)

If None, use coil_t_window. Otherwise, options are the same as coil_t_window.

trans_to (str | array-like, (3,) | None)

The destination location for the head. Can be:

‘median’ (default)
Median (across runs) of the starting head positions.
‘twa’
Time-weighted average head position.
None
Will not change the head position.
str
Path to a FIF file containing a MEG device to head transformation.
array-like
First three elements are coordinates to translate to. An optional fourth element gives the x-axis rotation (e.g., -30 means a backward 30° rotation).

sss_origin (array-like, shape (3,) | str)

Origin of internal and external multipolar moment space in meters. Default is center of sphere fit to digitized head points.

dig_with_eeg (bool)

If True, include EEG points in estimating the head origin.

ct_file (str)

Cross-talk file, usually “uw” to auto-load the UW file.

cal_file (str)

Calibration file, usually “uw” to auto-load the UW file.

sss_format (str)

Deprecated. SSS numerical format when using MaxFilter.

mf_args (str)

Deprecated. Extra arguments for MF SSS.

cont_as_esss (bool)

If True (default False), use eSSS to improve the external basis estimate using continuous empty-room projectors (proj_nums[2]). Only supported when Python is used for SSS.

4. do_ch_fix ¶

Fix EEG channel ordering, and also anonymize files.

5. gen_ssp ¶

Warning

Before running SSP, examine SSS’ed files and make SUBJ/bads/bad_ch_SUBJ_post-sss.txt; usually, this should only contain EEG channels. Alternatively, you can use params.auto_bad = some_float, see preprocessing: post-SSS bads: Marking bad channels during SSP.

Generate SSP vectors. If additional projectors are required (e.g., to get rid of muscle movement artifacts in a verbal response paradigm), you can use p.proj_extra, which gets applied before any other projectors are computed (e.g., ECG, blink).

`preprocessing: filtering`: Filtering parameters ¶

hp_cut (float | None): Highpass cutoff in Hz. Use None for no highpassing.
hp_trans (float): High-pass transition band.
lp_cut (float): Cutoff for lowpass filtering.
lp_trans (float): Low-pass transition band.
filter_length (int | str): See mne.filter.create_filter().
fir_design (str): See mne.filter.create_filter().
fir_window (str): See mne.filter.create_filter().
phase (str): See mne.filter.create_filter().

`preprocessing: post-SSS bads`: Marking bad channels during SSP ¶

auto_bad (float | None): If not None, bad channels will be automatically excluded after SSS if they disqualify a proportion of events exceeding auto_bad. This does not require the autoreject module.
auto_bad_reject (str | dict | None): Default is None. Must be defined if using Autoreject module to compute noisy sensor rejection criteria. Set to ‘auto’ to compute criteria automatically, or dictionary of channel keys and amplitude values e.g., dict(grad=1500e-13, mag=5000e-15, eeg=150e-6) to define rejection threshold(s). See http://autoreject.github.io/ for details.
auto_bad_flat (dict | None): Flat threshold for auto bad.
auto_bad_eeg_thresh (int | None): If more than this number of EEG channels is automatically marked bad, an error will be raised. This helps ensure that not too many channels are marked as bad.
auto_bad_meg_thresh (int | None): Same as above but for MEG.
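
The proportion rule behind auto_bad can be illustrated as follows. This is a simplified sketch under stated assumptions, not mnefun's implementation: it assumes we already know, for each epoch, which channels exceeded the rejection threshold, and the helper name find_auto_bads and the channel names are hypothetical.

```python
from collections import Counter


def find_auto_bads(offenders_per_epoch, auto_bad):
    """Channels that disqualify more than a proportion `auto_bad` of epochs."""
    n_epochs = len(offenders_per_epoch)
    # Count each channel at most once per epoch.
    counts = Counter(
        ch for offenders in offenders_per_epoch for ch in set(offenders)
    )
    return sorted(ch for ch, count in counts.items() if count / n_epochs > auto_bad)


# Hypothetical data: for each of 5 epochs, the channels exceeding the threshold.
offenders_per_epoch = [["EEG012"], ["EEG012", "EEG040"], [], ["EEG012"], []]
bads = find_auto_bads(offenders_per_epoch, auto_bad=0.5)
print(bads)
```

Here EEG012 disqualifies 3 of 5 epochs (0.6 > 0.5) and is marked bad, while EEG040 (0.2) is kept.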

`preprocessing: ssp`: SSP creation parameters ¶

proj_nums (list | dict)

List of projector counts to use for ECG/blink/ERM/HEOG/VEOG; each list contains three values for grad/mag/eeg channels. Can be a dict that maps subject names to projector counts to use. The order of computation and application is empty-room, ECG, blink, HEOG, VEOG.

ECG, blink, and ERM are obligatory lists (though they can be lists of all zeros). Lists for HEOG and VEOG are optional. For example, if you want 1 blink, 2 HEOG, and 3 VEOG projectors (for a total of 6 EOG-related projectors) for each channel type, you would do:

[[...],
 [1, 1, 1],
 [...],
 [2, 2, 2],
 [3, 3, 3]]

If you want just blink and HEOG, you can use a list of 4 lists instead of 5 (or 3).
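
As a reading aid, the row layout can be made explicit with a small helper. This is a sketch only: describe_proj_nums is a hypothetical function, and the row order (ECG, blink, ERM, HEOG, VEOG) and column order (grad, mag, eeg) are taken from the description above.

```python
KINDS = ("ECG", "blink", "ERM", "HEOG", "VEOG")
CH_TYPES = ("grad", "mag", "eeg")


def describe_proj_nums(proj_nums):
    """Label each row of a proj_nums list (hypothetical helper, not mnefun API)."""
    if not 3 <= len(proj_nums) <= len(KINDS):
        raise ValueError("proj_nums needs 3 to 5 rows")
    # zip stops at the shorter input, so omitted HEOG/VEOG rows are simply absent.
    return {kind: dict(zip(CH_TYPES, row)) for kind, row in zip(KINDS, proj_nums)}


# 2 ECG, 1 blink, no empty-room, and 2 HEOG projectors per channel type (no VEOG row).
labeled = describe_proj_nums([[2, 2, 2], [1, 1, 1], [0, 0, 0], [2, 2, 2]])
print(labeled["HEOG"])
```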

proj_sfreq (float | None)

The sample freq to use for calculating projectors. Useful since time points are not independent following low-pass. Also saves computation to downsample.

proj_meg (str)

Can be “separate” (default for backward compat) or “combined” (should be better for SSS’ed data).

drop_thresh (float)

The percentage threshold to use when deciding whether or not to plot Epochs drop_log.

plot_raw (bool)

If True, plot the raw files with the ECG/EOG events overlaid.

ssp_eog_reject (dict | None)

Amplitude rejection criteria for EOG SSP computation. None will use the mne-python default.

ssp_ecg_reject (dict | None)

Amplitude rejection criteria for ECG SSP computation. None will use the mne-python default.

eog_channel (str | dict | None)

The channel to use to detect blink events. None will use EOG* channels. In lieu of an EOG recording, MEG1411 may work.

heog_channel (str | dict | None)

The channel to use to detect HEOG events. None will use EOG061. In lieu of an EOG recording, MEG1411 may work.

veog_channel (str | dict | None)

The channel to use to detect VEOG events. None will use EOG062.

ecg_channel (str | dict | None)

The channel to use to detect ECG events. None will use ECG063. In lieu of an ECG recording, MEG1531 may work. Can be a dict that maps subject names to channels.

eog_t_lims (tuple | dict)

The time limits for EOG calculation. Default (-0.25, 0.25).

heog_t_lims (tuple | dict)

The time limits for HEOG calculation. Default (-0.25, 0.25).

veog_t_lims (tuple | dict)

The time limits for VEOG calculation. Default (-0.25, 0.25).

ecg_t_lims (tuple | dict)

The time limits for ECG calculation. Default (-0.08, 0.08).

eog_f_lims (tuple | dict)

Band-pass limits for EOG detection and calculation. Default (0, 2).

heog_f_lims (tuple | dict)

Band-pass limits for HEOG detection and calculation. Default (0, 2).

veog_f_lims (tuple | dict)

Band-pass limits for VEOG detection and calculation. Default (0, 2).

ecg_f_lims (tuple | dict)

Band-pass limits for ECG detection and calculation. Default (5, 35).

eog_thresh (float | dict | None)

Threshold for EOG detection. Can vary per subject.

heog_thresh (float | dict | None)

Threshold for HEOG detection. Can vary per subject.

veog_thresh (float | dict | None)

Threshold for VEOG detection. Can vary per subject.

proj_ave (bool)

If True, average artifact epochs before computing proj.

proj_extra (str | None)

Extra projector filename to load for each subject, e.g. extra-proj.fif will load SUBJ/sss_pca_fif/extra-proj.fif.

get_projs_from (list of int | dict)

Indices for runs to get projectors from.

cont_hp (float)

Highpass to use for continuous ERM projectors (default None).

cont_hp_trans (float | None)

Highpass transition bandwidth to use for continuous ERM projectors (default 0.5).

cont_lp (float)

Lowpass to use for continuous ERM projectors (default 5).

cont_lp_trans (float | None)

Lowpass transition bandwidth for continuous ERM projectors (default None).

cont_reject (dict | None)

Rejection parameters for continuous empty-room projection calculations. None (default) will use params.reject. This likely needs to be set when cont_as_esss=True.

plot_drop_logs (bool)

If True, plot drop logs after preprocessing.

6. apply_ssp ¶

Apply SSP vectors and filtering to the files.

7. write_epochs ¶

Write epochs to disk.

`epoching`: Epoching parameters ¶

tmin (float): tmin for events.
tmax (float): tmax for events.
t_adjust (float): Adjustment for delays (e.g., -4e-3 compensates for a 4 ms delay in the trigger).
baseline (tuple | None | str): Baseline to use. If “individual”, use params.bmin and params.bmax, otherwise pass as the baseline parameter to mne-python Epochs. params.bmin and params.bmax will always be used for covariance calculation. This is useful e.g. when using a high-pass filter and no baselining is desired (but evoked covariances should still be calculated from the baseline period).
bmin (float): Lower limit for baseline compensation.
bmax (float): Upper limit for baseline compensation.
decim (int | float | list): Amount to decimate the data after filtering when epoching data (e.g., a factor of 5 on 1000 Hz data yields 200 Hz data). If a float is used, it should be the destination sample rate (e.g., a value of 200. with 1000 Hz data will use decim=5).
epochs_type (str | list): Can be ‘fif’, ‘mat’, or a list containing both.
match_fun (callable | None): If None, standard matching will be performed. If a function, must_match will be ignored, and match_fun will be called to equalize event counts.
reject (dict): Rejection parameters for epochs.
flat (dict): Flat thresholds for epoch rejection.
reject_tmin (float | None): Reject minimum time to use when epoching. None will use tmin.
reject_tmax (float | None): Reject maximum time to use when epoching. None will use tmax.
on_missing (str): Can be ‘error’, ‘warning’, or ‘ignore’. Default is ‘error’. Determines what to do if one or several event ids are not found in the recording during epoching. See mne.Epochs docstring for further details.
autoreject_thresholds (bool): If True (default False), use the autoreject module to compute global rejection thresholds for epoching. Make sure the autoreject module is installed. See http://autoreject.github.io/ for instructions.
autoreject_types (tuple): Default is (‘mag’, ‘grad’, ‘eeg’). Can set to (‘mag’, ‘grad’, ‘eeg’, ‘eog’) to use the EOG channel rejection criterion from the autoreject module to reject trials on the basis of EOG.
reject_epochs_by_annot (bool | str): If True, reject epochs by BAD annotations. If str, will reject epochs by annotations that match the given regular expression str.
pick_events_autoreject (callable | str | None): Function for picking autoreject events, or the string “restrict” to limit events to those with an id in in_numbers.
analyses (list of str): Lists of analyses of interest.
in_names (list of str): Names of input events.
in_numbers (list of list of int): Event numbers (in scored event files) associated with each name.
out_names (list of list of str): Event types to make out of old ones.
out_numbers (list of list of int): Event numbers to convert to (e.g., [[1, 1, 2, 3, 3], …] would create three event types, where the first two and last two event types from the original list get collapsed over).
must_match (list of int): Indices from the original in_names that must match in event counts before collapsing. Should eventually be expanded to allow for ratio-based collapsing.
every_other (bool): If True, in addition to standard averages / evoked data, averages will be computed from every other trial, i.e., from even and odd trials separately. This can help assess the SNR of the data.
epochs_proj (bool | ‘delayed’): The proj argument in mne.Epochs. Should be 'delayed' if you want the option of plotting sensor-space data with no projectors.
allow_resample (bool): If True (default False), allow resampling raw instances (and events) to that of the first raw instance in the case that raws do not all have a matching sample rate. This is useful when recordings were errantly performed at different sample rates.
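
Two of the parameters above involve simple arithmetic that is easy to get wrong: collapsing event types via out_numbers, and giving decim as a destination sample rate. The sketch below illustrates both under simplifying assumptions: one event number per input name, and a destination rate that evenly divides the original rate. Both helper names and the event numbers are hypothetical, not mnefun API.

```python
def collapse_events(event_ids, in_numbers, out_numbers):
    """Map each original event number to its collapsed number (hypothetical helper)."""
    mapping = dict(zip(in_numbers, out_numbers))
    return [mapping[event_id] for event_id in event_ids]


def decim_factor(sfreq, decim):
    """Treat an int as a decimation factor and a float as the destination rate."""
    if isinstance(decim, int):
        return decim
    factor = sfreq / decim
    # Assumption for this sketch: the destination rate must divide sfreq evenly.
    if abs(factor - round(factor)) > 1e-6:
        raise ValueError("destination rate must evenly divide the sample rate")
    return int(round(factor))


# Hypothetical event numbers for five input event types.
in_numbers = [10, 11, 20, 30, 31]
out_numbers = [1, 1, 2, 3, 3]  # one row of out_numbers: three output event types
collapsed = collapse_events([10, 11, 20, 31], in_numbers, out_numbers)
print(collapsed)
print(decim_factor(1000.0, 200.0))
```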

8. gen_covs ¶

Generate covariances.

`covariance`: Covariance parameters ¶

cov_method (str): Covariance calculation method.
compute_rank (bool): Default is False. Set to True to compute rank of the noise covariance matrix during inverse kernel computation.
pick_events_cov (callable | str | None): Function for picking covariance events, or the string “restrict” to limit events to those with an id in in_numbers.
cov_rank (str | int): Cov rank to use, usually “auto”.
cov_rank_method (str): Can be “estimate_rank” to use mne.rank.estimate_rank, or “compute_rank” to use mne.compute_rank(). The latter seems to work better for custom tol values by not row-normalizing data.
cov_rank_tol (float | str): Tolerance for covariance rank computation. Can also be “auto” or “float32”, though these tend not to be very robust.
force_erm_cov_rank_full (bool): If True, force the ERM cov to be full rank. Usually not needed, but might help when the empty-room data is short and/or there are a lot of head movements.

9. gen_fwd ¶

Warning

Make SUBJ/trans/SUBJ-trans.fif using mne coreg.

Generate forward solutions (and source space if necessary).

`forward`: Forward parameters ¶

bem_type (str)

Defaults to '5120-5120-5120', use '5120' for a single-layer BEM.

src (str | dict)

Can be:

‘oct6’ to use a surface source space decimated using the 6th (or another integer) subdivision of an octahedron, or
‘vol5’ to use a volumetric grid source space with 5 mm (or another integer) spacing

src_pos (float)

Default is 7 mm. Defines source grid spacing for volumetric source space.

fwd_mindist (float)

Minimum distance (mm) for sources in the brain from the skull in order for them to be included in the forward solution source space.

10. gen_inv ¶

Generate inverses.

`inverse`: Inverse parameters ¶

inv_names (list of str): Inverse names to use.
inv_runs (list of int): Runs to use for each inverse.

11. gen_report ¶

Write mne.Report HTML of results to disk.

`report_params`: Report parameters ¶

pre_fun (callable)

Function to run before adding any Report sections. Must have the signature:

def pre_fun(report, p, subject, **kwargs):
    ...

The **kwargs is necessary for future compatibility.
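
A minimal sketch of a conforming hook is shown below. It deliberately avoids touching the report object, since what a real hook adds depends on the experiment; the extra keyword argument in the call is hypothetical and only demonstrates why accepting **kwargs keeps the function usable if more arguments are passed later.

```python
handled = []


def pre_fun(report, p, subject, **kwargs):
    """Example hook that only uses the documented arguments."""
    # Keyword arguments added by later versions land in kwargs and are ignored.
    handled.append(subject)


# A call with a hypothetical extra keyword argument still succeeds.
pre_fun(None, None, "subj_01", some_future_option=True)
print(handled)
```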

chpi_snr (bool)

cHPI SNR (default True).

good_hpi_count (bool)

Number of good HPI coils (default True).

head_movement (bool)

Head movement (default True).

raw_segments (bool)

10 evenly spaced raw data segments (default True).

psd (bool)

Raw PSDs, often slow (default True).

ssp_topomaps (bool)

SSP topomaps (default True).

source_alignment (bool)

Source alignment (default True).

drop_log (bool)

Plot the epochs drop log (default True).

covariance (bool)

Covariance image and SVD plots.

bem (bool)

Plot the BEM.

snr (dict)

SNR plots, with keys ‘analysis’, ‘name’, and ‘inv’.

whitening (dict)

Whitening plots, with keys ‘analysis’, ‘name’, and ‘cov’.

sensor (dict)

Sensor topomaps, with keys ‘analysis’, ‘name’, ‘times’, and ‘proj’. ‘proj’ can be True (default), False, or ‘reconstruct’. False and ‘reconstruct’ require epochs_proj='delayed'.

source (dict)

Source plots, with keys ‘analysis’, ‘name’, ‘inv’, ‘times’, ‘views’, and ‘size’.

post_fun (callable)

Function to run after adding all other Report sections. Must have the same signature as pre_fun above.

preload (bool)

If True (default False), load all raw data into memory before generating plots. Can help speed up computations like PSD estimates, but can also consume a large amount of memory.

Filename standardization ¶

mnefun imposes custom standardized structure on filenames:

`naming`: File naming tags and folders ¶

list_dir (str): Directory for event lists, usually “lists”.
bad_dir (str): Directory to use for bad channels, usually “bads”.
bad_tag (str): Tag for bad channel filename, usually “_post-sss.txt”.
raw_dir (str): Raw directory, usually “raw_fif”.
keep_orig (bool): Keep original files after anonymization.
raw_fif_tag (str): File tag for raw data, usually “_raw.fif”.
sss_fif_tag (str): File tag for SSS-processed files, usually “_raw_sss.fif”.
sss_dir (str): Directory to use for SSS processed files, usually “sss_fif”.
pca_dir (str): Directory for processed files, usually “sss_pca_fif”.
epochs_dir (str): Directory for epochs, usually “epochs”.
epochs_prefix (str): The prefix to use for the -epo.fif file.
epochs_tag (str): Tag for epochs, usually ‘-epo’.
eq_tag (str): Tag for equalized data, usually “eq”.
cov_dir (str): Directory to use for covariances, usually “covariance”.
forward_dir (str): Directory for forward solutions, usually “forward”.
trans_dir (str): Directory to use for trans files, usually “trans”.
inverse_dir (str): Directory for storing inverses, usually “inverse”.
inv_tag (str): Tag for all inverses, usually “-sss”.
inv_erm_tag (str): Tag for ERM inverse, usually “-erm”.
inv_fixed_tag (str): Tag for fixed inverse, usually “-fixed”.
inv_loose_tag (str): Tag for loose inverse, usually “”.
inv_free_tag (str): Tag for free orientation inverse, usually “-free”.

Preparing your machine for MaxFilter use ¶

Warning

Head position estimation and bad channel detection are now available using hp_type='python' and mf_autobad_type='python', respectively. These are the preferred processing methods going forward (as of March 2020), and using MaxFilter should be considered deprecated.

Parameters for remotely connecting to SSS workstation (‘sws’) can be set by adding a file ~/.mnefun/mnefun.json with contents like:

$ mkdir ~/.mnefun
$ echo '{"sws_ssh":"kasga", "sws_dir":"/data06/larsoner/sss_work", "sws_port":22}' > ~/.mnefun/mnefun.json

This should be preferred to the old way, which was to set in each script when running on your machine:

params.sws_ssh = 'kasga'
params.sws_dir = '/data06/larsoner/sss_work'

Using per-machine config files rather than per-script variables should help increase portability of scripts without hurting reproducibility (assuming we all use the same version of MaxFilter, which should be a safe assumption).
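
If you prefer to generate the config file from Python rather than with echo, a sketch along these lines avoids quoting mistakes, because json.dumps always emits standard double quotes. It writes to a temporary directory as a stand-in for ~/.mnefun so nothing real is overwritten; the values are the same example values used above.

```python
import json
import tempfile
from pathlib import Path

config = {"sws_ssh": "kasga", "sws_dir": "/data06/larsoner/sss_work", "sws_port": 22}

# A temporary directory stands in for ~/.mnefun in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mnefun.json"
    path.write_text(json.dumps(config))
    loaded = json.loads(path.read_text())

# Curly quotes pasted from a word processor are not valid JSON.
try:
    json.loads("{“sws_ssh”: “kasga”}")
    curly_ok = True
except json.JSONDecodeError:
    curly_ok = False

print(loaded == config, curly_ok)
```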

To test that things are configured correctly, you can do:

$ python -c "import mnefun; mnefun.check_sws()"
On kasga: maxfilter -version (0 sec)
Output:
Revision: 2.2.15 Neuromag maxfilter Dec 11 2012 14:48:44

If you get an error:

Ensure that your file is correctly set up in ~/.mnefun/mnefun.json. It needs to use standard quotation marks like ", not fancy ones like ” so ensure that your text editor (if you used one) did not use fancy quotation marks.
Ensure that maxfilter is accessible as a command on the remote machine. Log into the remote machine and do:
```
$ which maxfilter
/neuro/bin/util/maxfilter
```
If you get no output with this command, it means that MaxFilter is not available on your PATH on the remote machine. To fix this, consider adding the following line to the end of your ~/.bashrc on the remote machine:
```
export PATH=${PATH}:/neuro/bin/util:/neuro/bin/X11
```