Wednesday, June 25, 2014

Escape to Los Angeles

Because you all need to know when and where I go on vacation, I will be traveling to Los Angeles, California, for the next week; followed by a trip to Minnesota. Those of you still in Bloomington: Please don't break into my apartment. But if you do, could you at least water my plants? I forgot about that when I left town this afternoon. Thanks!

However, just because I am on the other side of the country, hiking and swimming and laughing and playing the Schubert F Minor Fantasie and Grieg and Brahms cello sonatas, doesn't mean that I'll quit updating. Do you think I would ever quit that? Did we quit when the Germans bombed Pearl Harbor? Hell no. So remember to check back regularly, because I will continue to post on topics such as GLMs, MVPA, FIFY, and IMHO. Maybe. It really depends on how much fun I'm having.


Monday, June 23, 2014

Multiple Regression in SPM

A couple of videos have been posted about multiple regression in SPM, at both the first and second levels. Technically, almost all of the GLMs researchers set up use multiple linear regression, in which a linear combination of weighted regressors is fit to the timecourse of each voxel. In SPM parlance, however, regressors are distinct from conditions: Conditions are typically convolved with some sort of basis function (usually a hemodynamic response function), whereas regressors are not convolved with anything. For example, someone may enter motion parameters as regressors, since they are not convolved with any hemodynamic shape but may still account for some of the variance observed in the signal (and, we hope, variance that is primarily due to motion). Another example would be entering the timecourse extracted from an ROI to do a functional connectivity analysis, or inserting the interaction term from a PPI analysis as a regressor.
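To make the distinction concrete, here is a minimal sketch using the same jobs structure as the scripts below (the paths and run index are hypothetical): conditions come in through a multiple conditions .mat file and get convolved with the HRF, while unconvolved regressors such as motion parameters go in through the regress or multi_reg fields.

jobs{1}.stats{1}.fmri_spec.sess(1).multi = {'/data/study/101/model/run1_conditions.mat'}; %Conditions: convolved with the HRF

rp = load('/data/study/101/run1/rp_run1.txt'); %Six realignment parameters from SPM's motion correction
for p = 1:6
    jobs{1}.stats{1}.fmri_spec.sess(1).regress(p).name = sprintf('motion%d', p); %Regressors: entered as-is, no convolution
    jobs{1}.stats{1}.fmri_spec.sess(1).regress(p).val = rp(:,p);
end

%Alternatively, point multi_reg at the rp_*.txt file and SPM will read in each column as a regressor:
%jobs{1}.stats{1}.fmri_spec.sess(1).multi_reg = {'/data/study/101/run1/rp_run1.txt'};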

At the second level, SPM also gives you the option of doing multiple regression; but this simply means entering covariates, which can then be tested for correlations with the signal change in contrasts across subjects. In most cases, however, you can select any other design and enter covariates regardless of which design you choose; I believe this is the better option, since it is more flexible.
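For instance, here is a minimal sketch of a one-sample t-test with an age covariate, using the newer matlabbatch syntax (SPM8/12) and leaving the masking and global options at their defaults; the paths, contrast image filter, and age values are hypothetical.

conImages = cellstr(spm_select('FPList', '/data/study/secondLevel', '^con_0001_.*\.nii$')); %One contrast image per subject
ages = [21 25 33 28]'; %Covariate values, in the same order as the contrast images

matlabbatch{1}.spm.stats.factorial_design.dir = {'/data/study/secondLevel/ageCovariate'};
matlabbatch{1}.spm.stats.factorial_design.des.t1.scans = conImages;
matlabbatch{1}.spm.stats.factorial_design.cov.c = ages;
matlabbatch{1}.spm.stats.factorial_design.cov.cname = 'Age';
matlabbatch{1}.spm.stats.factorial_design.cov.iCFI = 1; %No interaction with a factor
matlabbatch{1}.spm.stats.factorial_design.cov.iCC = 1; %Center around the overall mean

spm_jobman('run', matlabbatch); %Model estimation is a separate batch step, or can be done quickly through the GUI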

In any case, most of the rest of the details can be found in the following videos. Speak hands for me.





Thursday, June 19, 2014

Updated SPM Command Line, Including Beta Series Option

I've updated the code of the previously discussed previous script in the previous post, previously. The changes include more help text to explain certain options, and termination of the script if the timing file is not found.

However, the biggest change is a block of code that converts the timing files so that each trial becomes its own regressor before being entered into the GLM. With so many regressors, the GLM will look a little funky; possibly something like this:


Note that, since there are so many regressors and so many scans, not all of them are going to be labeled individually; but it should look as though there is only one estimation per column, or blip of white underneath each regressor. Also, due to space constraints, the parameter estimability may not be completely visible; but when each trial is estimated individually, you should still get a beta for each trial. Check the output directory to make sure that you have the number of betas you think you should have: one for each trial, plus the number of session constants (usually equal to the number of runs for that subject).
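For the curious, the conversion described above boils down to something like the following sketch, which takes the names, onsets, and durations for a run and splits every trial into its own regressor (the output filename is made up; the real script handles this for you):

bNames = {}; bOnsets = {}; bDurations = {};
for c = 1:length(names)
    for t = 1:length(onsets{c})
        bNames{end+1} = sprintf('%s_%d', names{c}, t); %e.g., LeftButtonPress_1, LeftButtonPress_2, ...
        bOnsets{end+1} = onsets{c}(t);
        bDurations{end+1} = durations{c}(t);
    end
end
names = bNames; onsets = bOnsets; durations = bDurations;
save('run1_betaseries.mat', 'names', 'onsets', 'durations'); %One multiple conditions file per run, one regressor per trial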

The updated script can be found here; as always, send me any comments if you encounter any serious issues or bugs. I recommend using it as a replacement for the previously discussed previous script, since it can either estimate each trial individually or average them together, depending on whether you set the DO_BETASERIES flag to 1 or 0.

All of this is explained clearly in the following video:


Monday, June 16, 2014

Running SPM First-Level Analyses from the Command Line

For most people, doing first-level analyses in SPM is tedious in the extreme; for other people, these analyses are manageable through command line scripting in Matlab. And there are still other people who don't even think about first-level analyses or SPM or Matlab or any of that. This last class of people are your college classmates who majored in something else, like economics, and are living significantly less stressful lives than you, making scads of money, and regularly eating out at restaurants where they serve shrimp the size of Mike Tyson's forearm.

Assuming that you belong to the first category, however, first-level analyses are invariably a crashing bore; and, worse, completely impractical for something like beta-series analysis, where a separate regressor is needed for each individual event.

The following script will do all of that for you, although it makes many assumptions about your timing files and where everything is stored. First, your directory needs to be set up with a root directory containing all of the subjects, as well as a separate timing directory containing your onsets files; and second, the timing files need to be written in a four-column format of run, regressor name, onset (in seconds), and duration (in seconds). Once you have all of that, though, all it takes is a few simple modifications of the script to run your analysis.
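For reference, a timing file in that format (with the header row that the script expects to skip) might look something like this; the condition names are made up:

Run  Event      Onset   Dur
1    TapLeft    12.5    2
1    TapRight   20.0    2
2    TapLeft    8.5     2
2    TapRight   16.0    2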

As always, let me know if this actually helps, or whether there are sections of code that can be cleaned up. I will be making a modification tomorrow allowing for beta series analysis by converting all of the timing events into individual regressors, which should help things out.

Link to script: http://mypage.iu.edu/~ajahn/docs/Run_SPM_FirstLevel.m

% INPUT:
%  rootDir        - Path to where subjects are located
%  subjects       - Vector containing list of subjects (can concatenate with brackets)
%  spmDir         - Path to folder where you want to move the multiple conditions .mat
%                   files and where the output SPM file will be written (directory is
%                   created if it doesn't exist)
%  timingDir      - Path to timing directory, relative to rootDir
%  timingSuffix   - Suffix appended to timing files
%  matPrefix      - Will be prefixed to output .mat files.
%  dataDir        - Directory storing functional runs.
%
% OUTPUT:
%  One multiple condition .mat file per run. Also specifies the GLM and runs
%  beta estimation.
%       
%
%       Note that this script assumes that your timing files are organized
%       with columns in the following order: Run, ConditionName, Onset (in
%       seconds), Duration (in seconds)
%
%       Ideally, this timing file should be written out from your
%       presentation software, e.g., E-Prime
%
%       Also note how the directory tree is structured in the following example. You may have to
%       change this script to reflect your directory structure.
%       
%       Assume that:
%         rootdir = '/server/studyDirectory/'
%         spmDir = '/modelOutput/'
%         timingDir = '/timings/onsets/'
%         subjects = [101]
%         timingSuffix = '_timing.txt'
%       1) Sample path to SPM directory:
%       '/server/studyDirectory/101/modelOutput/'
%       2) Sample path to timing file in the timing directory:
%       '/server/studyDirectory/timings/onsets/101_timing.txt'
%
% Andrew Jahn, Indiana University, June 2014
% andysbrainblog.blogspot.com

clear all

%%%----Things you will want to change for your study----%%%

rootDir = '/data/hammer/space5/ImagineMove/fmri/';
subjects = [214 215];
spmDir = '/RESULTS/testMultis/';
timingDir = 'fMRI_behav/matlab_input/';
timingSuffix = '_spm_mixed.txt';
matPrefix = 'testMulti';
dataDir = '/RESULTS/data/';

%Change these parameters to reflect study-specific information about number
%of scans and discarded acquisitions (disacqs) per run
funcs = {'swuadf0006.nii', 'swuadf0007.nii', 'swuadf0008.nii'}; %You can add more functional runs; just remember to separate them with a comma
numScans = [240 240 240]; %This is the number of scans that you acquire per run
disacqs = 2; %This is the number of scans you later discard during preprocessing
numScans = numScans-disacqs;
TR = 2; %Repetition time, in seconds

%Estimating the GLM can take some time, particularly if you have a lot of betas. If you just want to specify your
%design matrix so that you can assess it for singularities, turn this to 0.
%If you wish to do it later, estimating the GLM through the GUI is very
%quick.
ESTIMATE_GLM = 1;


%%%-----------------------------------------------------%%%



%For each subject, create timing files and jobs structure
for subject = subjects
    
    %See whether output directory exists; if it doesn't, create it
    outputDir = [rootDir num2str(subject) spmDir];
    
    if ~exist(outputDir, 'dir')
        mkdir(outputDir)
    end
     
    %---Navigate to timing directory and create .mat files for each run---%
    cd([rootDir timingDir]);
    
    fid = fopen([num2str(subject) timingSuffix], 'rt');
    T = textscan(fid, '%f %s %f %f', 'HeaderLines', 1); %Columns should be 1)Run, 2)Regressor Name, 3) Onset Time (in seconds, relative to start of each run), and 4)Duration, in seconds
    fclose(fid);
    
    clear runs nameList names onsets durations sizeOnsets %Remove these variables if left over from previous analysis
    
    runs = unique(T{1});

    %Begin creating jobs structure
    jobs{1}.stats{1}.fmri_spec.dir = cellstr(outputDir);
    jobs{1}.stats{1}.fmri_spec.timing.units = 'secs';
    jobs{1}.stats{1}.fmri_spec.timing.RT = TR;
    jobs{1}.stats{1}.fmri_spec.timing.fmri_t = 16;
    jobs{1}.stats{1}.fmri_spec.timing.fmri_t0 = 1;

    %Create multiple conditions .mat file for each run
    for runIdx = 1:size(runs, 1)
            nameList = unique(T{2});
            names = nameList';
            onsets = cell(1, size(nameList,1));
            durations = cell(1, size(nameList,1));
            sizeOnsets = size(T{3}, 1);
        for nameIdx = 1:size(nameList,1)
            for idx = 1:sizeOnsets
                if isequal(T{2}{idx}, nameList{nameIdx}) && T{1}(idx) == runIdx
                    onsets{nameIdx} = double([onsets{nameIdx} T{3}(idx)]);
                    durations{nameIdx} = double([durations{nameIdx} T{4}(idx)]);
                end
            end
            onsets{nameIdx} = (onsets{nameIdx} - (TR*disacqs)); %Adjust timing for discarded acquisitions
        end

        save ([matPrefix '_' num2str(subject) '_' num2str(runIdx)], 'names', 'onsets', 'durations')
        
        %Grab frames for each run using spm_select, and fill in session
        %information within jobs structure
        files = spm_select('ExtFPList', [rootDir num2str(subject) dataDir], ['^' funcs{runIdx}], 1:numScans(runIdx));

        jobs{1}.stats{1}.fmri_spec.sess(runIdx).scans = cellstr(files);
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).cond = struct('name', {}, 'onset', {}, 'duration', {}, 'tmod', {}, 'pmod', {});
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).multi = cellstr([outputDir matPrefix '_' num2str(subject) '_' num2str(runIdx) '.mat']);
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).regress = struct('name', {}, 'val', {});
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).multi_reg = {''};
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).hpf = 128;
        
                
    end
    
    movefile([matPrefix '_' num2str(subject) '_*.mat'], outputDir)
    
    %Fill in the rest of the jobs fields
    jobs{1}.stats{1}.fmri_spec.fact = struct('name', {}, 'levels', {});
    jobs{1}.stats{1}.fmri_spec.bases.hrf = struct('derivs', [0 0]);
    jobs{1}.stats{1}.fmri_spec.volt = 1;
    jobs{1}.stats{1}.fmri_spec.global = 'None';
    jobs{1}.stats{1}.fmri_spec.mask = {''};
    jobs{1}.stats{1}.fmri_spec.cvi = 'AR(1)';
    
    %Navigate to output directory, specify and estimate GLM
    cd(outputDir);
    spm_jobman('run', jobs)
    
    if ESTIMATE_GLM == 1
        load SPM;
        spm_spm(SPM);
    end
    
end

%Uncomment the following line if you want to debug
%keyboard



Sunday, June 15, 2014

Writing Out Timing Files from E-Prime

Now, I promised you all something that would help make running SPM analyses easier; for although we would all like something that is both transparent and usable, in most FMRI packages the two are polar opposites of each other. SPM in many ways is very usable; but this tends to be more hindrance than asset when running large groups of subjects, when manually selecting each session and copying and pasting by hand become not only tedious, but dangerous, as the probability of human error is compounded with each step. Best to let the machines do the work for you when you can. Anything else is an affront to them; you'll wake more than just the dogs.

To assist our future overlords in their work, command lines should be used for each step whenever possible. But, rummily enough, SPM offers little documentation about this, and it more or less needs to be puzzled out on your own. For most processing this wouldn't be too much of a hassle, but there is one step where it is indispensable, more indispensable, if possible, than Nutella on a banana sprinkled with chopped walnuts. Or possibly warm Nutella drizzled on top of vanilla ice cream and left standing until it cools and hardens into a shell. Has anyone tried this? Could you let me know how it tastes? Thanks!

Where was I? Right; the one step where you need batching is first-level analysis, where timing information is entered for each session (or run) for each subject. Back in the autumn of aught eight during my days as a lab technician I used to be just like you, arriving to work in the morning and easing my falstaffian bulk into my chair before starting my mind-numbing task of copying and pasting onset times into each condition for each run, invariably tiring of it after a few minutes but keeping at it, and maybe completing a few subjects by the end of the day. Not the most exciting or fulfilling work, and I probably screwed a lot of things up without noticing it.

SPM has tried to anticipate problems like this by specifying a Multiple Condition option through their interface, where you supply a .mat file for each session that contains all of the names, onsets, and durations for that session, and everything is automatically filled in for you. Which sounds great, until you realize that creating Matlab files is, like, hard, and then you go back to your old routine. (The manual also says that these .mat files can be generated automatically by presentation software such as COGENT, but I have never met anyone who has ever used COGENT; this is, in all likelihood, a fatcat move by Big Science to get you to buy a piece of software that you don't need.)
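The format itself is simple: each .mat file just needs three cell arrays named names, onsets, and durations, one entry per condition. A minimal, made-up example for one run:

names = {'TapLeft', 'TapRight'};
onsets = {[0 30 60], [15 45 75]}; %In seconds, relative to the start of the run
durations = {[2 2 2], [2 2 2]};
save('run1_conditions.mat', 'names', 'onsets', 'durations');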

What we will be doing, then, is trying to make this as straightforward and simple a process as possible. However, this assumes that you have your onsets organized in a certain way; and first we will talk about how to create those onsets from your stimulus presentation program. This will allow you much more flexibility, as you can choose what you want to write into a simple text file without having to go through copying and pasting data from the E-Prime DataAid program into Excel for further processing. (I use E-Prime, and will be reviewing that here, but I believe that most presentation programs will allow you to write out data on the fly.)

Within E-Prime, you will need to know exactly when the scanner started, and use this as time zero for your onsets files. I usually do this by having a screen that says "Waiting for scanner..." which only terminates once the scanner sends a trigger pulse through a USB cord hooked up to the presentation laptop. We have a Siemens Trio scanner, which sends a backtick ( ` ); check whether your scanner does the same.



Note that I have an inline script (i.e., an object that contains E-Basic code) right after the WaitScanner slide that contains the following code:

StartTimestamp = CLng(c.GetAttrib("WaitScanner.RTTime"))
c.SetAttrib "StartTimestamp", StartTimestamp

if c.GetAttrib("Session") = 1 Then

Open "OnsetTimes_" & c.GetAttrib("Subject") & ".txt" For Output as #1
Close #1

Open "OnsetTimes_" & c.GetAttrib("Subject") & ".txt" For Append As #1
      Print #1, "Run", "Event", "Onset", "Dur"
Close #1

end if


Also, in the Duration/Input tab of the WaitScanner object's properties, I have the keyboard accepting the backtick ( ` ) as an allowable input, with termination once it receives that backtick. The StartTimestamp will simply record when the scanner first started, and the if/then statement ensures that the file does not get overwritten when the next run is started.

After that, the StartTimestamp variable and the newly created text file can be used to record information after each event, with the specified condition being determined by you. To calculate the timing for each event, all you have to do is grab the onset time of a condition through the c.GetAttrib command, and subtract the StartTimestamp variable from that:



Additional runs will continue to append to the text file, until you have all of the timing information that you need for your subject. This can then be parsed by a script and divided into multiple condition .mat files, which we will discuss tomorrow.

Thursday, June 12, 2014

Beta Series Analysis in SPM

The following is a script to run beta series analysis in SPM; which, conceptually, is identical to beta series analysis in AFNI, just made more complicated by the use of Matlab commands, many of which I would otherwise remain totally ignorant of.

Also, if you believe that I did this on my own, you are a scoundrel and a fool. Virtually all of the code is from my colleague Derek Nee, a postdoc who used to work in my lab and who remains my superior in all things, scientific and otherwise. He also once got me with the old "If you say gullible slowly, it sounds like orange" joke.

In any case, all that I did was repackage it into a function to be more accessible by people outside of our lab. You may see a couple snippets of code that are useful outside of just doing beta series analysis; for example, the spm_get_data command, which can be used for parameter extraction from the command line without using marsbar. Also note that the script assumes that each trial for a condition has been labeled separately with the regressor name followed by a "_" and then the occurrence of the trial. So, for example, if I were looking at 29 instances of pressing the left button, and the regressor name was LeftButtonPress, I would have LeftButtonPress_1, LeftButtonPress_2, LeftButtonPress_3...LeftButtonPress_29.
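If the parameter extraction piece is all you need, the core of it looks something like this (a minimal sketch pulled from the function below; the mask filename and the beta filter are hypothetical, and older SPM versions write betas as .img rather than .nii):

V = spm_vol('RightM1.nii'); %Binary ROI mask in the same space as the betas
Y = spm_read_vols(V);
indx = find(Y > 0);
[x,y,z] = ind2sub(size(Y), indx);
XYZ = [x y z]'; %Voxel coordinates of the ROI

P = spm_vol(spm_select('FPList', pwd, '^beta_.*\.nii$')); %All beta images in the current directory
est = nanmean(spm_get_data(P, XYZ), 2); %One mean parameter estimate per beta image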

Obviously, this is very tedious to do by hand. The Multiple Conditions option through the SPM GUI for setting up 1st level analyses is indispensable; but to cover that is another tutorial, one that will probably cross over into our discussion of GLMs.


The script:

function DoBetaSeries(rootdir, subjects, spmdir, seedroi, Conds, MapNames)
% function DoBetaSeries(rootdir, subjects, spmdir, seedroi, Conds, MapNames)
% INPUT:
%  rootdir  - Path to where subjects are stored
%  subjects - List of subjects (can concatenate with brackets)
%  spmdir   - Path to folder containing SPM file
%  seedroi  - Absolute path to file containing ROI in NIFTI format
%  Conds    - List of conditions for beta series analysis
%  MapNames - Output for list of maps, one per Conds cell
%       
%   Example use:
%       DoBetaSeries('/data/study1/fmri/', [101 102 103],
%       'model/BetaSeriesDir/', '/data/study1/Masks/RightM1.nii',
%       {'TapLeft'}, {'TapLeft_RightM1'})
%
%       conditions to be averaged together should be placed together in a cell
%       separate correlation maps will be made for each cell
%       Conds = {'Incongruent' 'Congruent' {'Error' 'Late'}}; 
%
%       For MapNames, there should be one per Conds cell above
%       e.g., with the Conds above, MapNames = {'Incongruent_Map',
%       'Congruent_Map', 'Error_Late_Map'}
%
%       Once you have z-maps, these can be entered into a 2nd-level
%       1-sample t-test, or contrasts can be taken between z-maps and these
%       contrasts can be taken to a 1-sample t-test as well.
%
% 

if nargin < 6
 disp('Need rootdir, subjects, spmdir, seedroi, Conds, MapNames. See "help DoBetaSeries" for more information.')
    return
end


%Find XYZ coordinates of ROI
Y = spm_read_vols(spm_vol(seedroi),1);
indx = find(Y>0);
[x,y,z] = ind2sub(size(Y),indx);
XYZ = [x y z]';


%Find each occurrence of a trial for a given condition
%These will be stacked together in the Betas array
for i = 1:length(subjects)
    subj = num2str(subjects(i));
    disp(['Loading SPM for subject ' subj]);
    %Can change the following line of code to CD to the directory
    %containing your SPM file, if your directory structure is different
    cd([rootdir subj filesep spmdir]);
    load SPM;
    
    for cond = 1:length(Conds)
        Betas = [];
        currCond = Conds{cond};
        if ~iscell(currCond)
            currCond = {currCond};
        end
        for j = 1:length(SPM.Vbeta) 
            for k = 1:length(currCond)
                if ~isempty(strfind(SPM.Vbeta(j).descrip,[currCond{k} '_'])) 
                    Betas = strvcat(Betas,SPM.Vbeta(j).fname);
                end
            end
        end
              

    %Extract beta series time course from ROI
    %This will be correlated with every other voxel in the brain
    if ischar(Betas)
        P = spm_vol(Betas);
    end

    est = spm_get_data(P,XYZ);
    est = nanmean(est,2);

        
        
        %----Do beta series correlation between ROI and rest of brain---%

            MapName = MapNames{cond};
            disp(['Performing beta series correlation for ' MapName]);
            
            Vin = spm_vol(Betas);
            nimgo = size(Vin,1);
            nslices = Vin(1).dim(3);

            % create new header files
            Vout_r = Vin(1);   
            Vout_p = Vin(1);
            [pth,nm,xt] = fileparts(deblank(Vin(1).fname));
            Vout_r.fname = fullfile(pth,[MapNames{cond} '_r.img']);
            Vout_p.fname = fullfile(pth,[MapNames{cond} '_p.img']);

            Vout_r.descrip = ['correlation map'];
            Vout_p.descrip = ['p-value map'];

            Vout_r.dt(1) = 16;
            Vout_p.dt(1) = 16;

            Vout_r = spm_create_vol(Vout_r);
            Vout_p = spm_create_vol(Vout_p);

            % Set up large matrix for holding image info
            % Organization is time by voxels
            slices = zeros([Vin(1).dim(1:2) nimgo]);
            stack = zeros([nimgo Vin(1).dim(1)]);
            out_r = zeros(Vin(1).dim);
            out_p = zeros(Vin(1).dim);


            for i = 1:nslices
                B = spm_matrix([0 0 i]);
                %creates plane x time
                for j = 1:nimgo
                    slices(:,:,j) = spm_slice_vol(Vin(j),B,Vin(1).dim(1:2),1);
                end

                for j = 1:Vin(1).dim(2)
                    stack = reshape(slices(:,j,:),[Vin(1).dim(1) nimgo])';
                    [r p] = corr(stack,est);
                    out_r(:,j,i) = r;
                    out_p(:,j,i) = p;

                end

                Vout_r = spm_write_plane(Vout_r,out_r(:,:,i),i);
                Vout_p = spm_write_plane(Vout_p,out_p(:,:,i),i);

            end

                
            %Convert correlation maps to z-scores
            %NOTE: If there is an uneven number of trials in the conditions you are
            %comparing, leave out the "./(1/sqrt(n-3))" term in the zdata
            %variable assignment, as it can bias results toward conditions
            %that have more trials
            
                disp(['Converting correlation maps for subject ' subj ', condition ' MapNames{cond} ' to z-scores'])
                P = [MapNames{cond} '_r.img'];
                n = size(Betas,1);
                Q = MapNames{cond};
                Vin = spm_vol([MapNames{cond} '_r.img']);

                % create new header files
                Vout = Vin;   

                [pth,nm,xt] = fileparts(deblank(Vin(1).fname));
                Vout.fname = fullfile(pth,[Q '_z.img']);

                Vout.descrip = ['z map'];

                Vout = spm_create_vol(Vout);

                data = spm_read_vols(Vin);
                zdata = atanh(data)./(1/sqrt(n-3));

                spm_write_vol(Vout,zdata);

    end
end



This can either be copied and pasted into a script, or downloaded as a .m file here.




Tuesday, June 10, 2014

ICA Analysis Playlist: Soup to Nuts

Fellow brainbloggers,

I will be providing more details on each of the ICA steps, but for those who need to do it right now, there is a playlist up on Youtube, a survival kit of sorts, which will guide you through an example analysis from start to finish and point out dangers along the way. As I was doing this over the weekend, I encountered a couple of particularly pernicious pitfalls, precariously poised on the precipice of total annihilation:

1. When doing dual_regression, make sure that your input list points to your functional data that has not only been preprocessed, but also normalized to standard space. Often FSL automatically creates a dataset called filtered_func_data.nii.gz in both the main directory and the normalized directory; choose the normalized one.

2. You can do multi-session MELODIC through the GUI, but only if each subject has the same number of TRs. If there is a discrepancy, MELODIC will exit once it encounters the mismatched subject. In that case, you need to analyze each subject individually (or batch together subjects who have the same number of TRs), and then run melodic from the command line, using a command such as:

melodic -i ICA_List.txt -a concat -o ICA_Output --nobet -d 30 --mmthresh 0.5 --tr 2.5

Where ICA_List.txt contains a path to the preprocessed, normalized data for each subject on each row.
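For example, ICA_List.txt might look something like this (the paths are hypothetical; the point is one normalized filtered_func_data per line):

/data/study/sub01.feat/reg_standard/filtered_func_data.nii.gz
/data/study/sub02.feat/reg_standard/filtered_func_data.nii.gz
/data/study/sub03.feat/reg_standard/filtered_func_data.nii.gz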

3. I previously mentioned that you should not insult our future robot overlords, and should leave the dimensionality estimation to FSL. However, this can lead to large numbers of components, and it is often a good idea to set this manually at 30, give or take a few components. Usually around 30 components hits the sweet spot between overfitting and underfitting the amount of variance assigned to each component.

Those are my observations from the front. As always, take these tutorials with a large calculus of salt.



Friday, June 6, 2014

Independent Components Analysis, Part II: Using FSL Example Data

For our next step, we will be working with FSL example data - somewhat artificial data, true, and much better quality than anything you can ever expect from the likes of your data, which will only lead to toil, sweat, and the garret. Sufficient unto the day is the frustration thereof.

However, it is a necessary step to see what an ICA analysis ought to look like, as well as to learn how to examine and identify components related to tasks and networks that you are interested in. Also - and possibly more important - you will learn to recognize sources of noise, such as components related to head motion and physiological artifacts.

First, go to the link http://fsl.fmrib.ox.ac.uk/fslcourse/ and scroll to the bottom for instructions on how to download the datasets. You can use either wget or curl. For this demonstration, we will be using datasets 2 and 6:

curl -# -O -C - http://fsl.fmrib.ox.ac.uk/fslcourse/fsl_course_data2.tar.gz
curl -# -O -C - http://fsl.fmrib.ox.ac.uk/fslcourse/fsl_course_data6.tar.gz


Once you have downloaded them, unzip them using gunzip, and then run "tar -xf" on the resulting tar files. Each archive unpacks into a folder called fsl_course_data, so rename them so that they do not overwrite each other. Within fsl_course_data2, navigate to the /melodic/av directory, where you will find a small functional dataset that was acquired while the participant was exposed to auditory stimuli and visual stimuli - which sounds much more scientific than saying, "The participant saw stuff and heard stuff."

Open up the MELODIC gui either through the FSL gui, or through typing Melodic_gui from the command line. Most of the preprocessing steps can be kept as is. However, keep the following points in mind:

1. Double-check the TR. FSL will fill it in automatically from the header of the NIFTI file, but it isn't always reliable.

2. Spatial smoothing isn't required in ICA, but a small amount can help produce better-looking and more identifiable component maps. Somewhere on the order of the size of a voxel or two will usually suffice.


3. By default, MELODIC automatically estimates the number of components for you. However, if you have severe delusions and believe that you know how many components should be generated, you can turn off the "Automatic dimensionality estimation" option in the Stats tab, and enter the number of components you want.


4. The Threshold IC maps option is not the same thing as a p-value correction threshold. I'm not entirely clear on how it relates to the mixture modeling carried out by ICA, but my sense from reading the documentation and papers using ICA is that a higher threshold only keeps those voxels that have a higher probability of belonging to the true signal distribution, instead of the background noise distribution, and it comes down to a balance between false positives and false negatives. I don't have any clear guidelines about what threshold to use, but I've seen cutoffs used within the 0.8-0.9 range in papers.


5. I don't consider myself a snob, but I was using the bathroom at a friend's house recently, and I realized how uncomfortable that cheap, non-quilted toilet paper can be. It's like performing intimate hygiene with roofing materials.


6. Once you have your components, you can load them into FSLview and scroll through them with the "Volumes" button in the lower left corner. You can also load the Atlases from the Tools menu and double-click on it to get a semi-transparent highlight of where different cortical regions are. This can be useful when trying to determine whether certain components fall within network areas that you would expect them to.




More details in the videos below, separately for the visual-auditory and resting-state datasets.



Wednesday, June 4, 2014

Introduction to Independent Components Analysis

In the course of your FMRI studies, you may have at one point stumbled upon the nightmarish world of Independent Components Analysis (ICA), an intimidating-sounding technique often used by people who know far more than I do. Nevertheless, let us meditate upon it and see what manner of things we have here.

Components are simpler building blocks of a more complex signal. For example, in music, the soundwave profile of a chord can appear very complex; yet a Fourier analysis can filter it back into its constituent parts - simpler sine waves associated with each note that, combined together, create the chord's waveform.
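If you want to see this for yourself, here is a quick Matlab illustration (the frequencies are just an example triad): build a chord out of three sine waves and let the FFT recover the notes.

fs = 1000; %Sampling rate, in Hz
t = 0:1/fs:1-1/fs; %One second of samples
chord = sin(2*pi*262*t) + sin(2*pi*330*t) + sin(2*pi*392*t); %Roughly C4, E4, and G4
Y = abs(fft(chord))/length(chord);
f = (0:length(chord)-1)*fs/length(chord);
plot(f(1:500), Y(1:500)); %Peaks appear at the three component frequencies
xlabel('Frequency (Hz)'); ylabel('Amplitude');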



A similar process happens when ICA is applied to neuroimaging data. Data which comes right off the scanner is horribly messy, a festering welter of voxels and timecourses that any right man would run away from as though his very life depended upon it. To make this more comprehensible, however, ICA can decompose it into more meaningful components, each one explaining some amount of the variance of the overall signal. 

Note also that we talk of spatial and temporal components, which will be extracted from each other; this may seem somewhat odd, as the two are inseparable in a typical FMRI dataset: each voxel (the spatial part) has a timecourse (the temporal part). However, ICA splits these apart and recombines them into components, ordered by how much of the variance of the original spatio-temporal data each one explains. This is what is represented in the figures shown on the FSL website, reprinted below:



After these components have been extracted, something happens that some (like me) consider terrifying: You need to identify the components yourself, assigning each one to whatever condition seems most reasonable. Does that one look like a component related to visual processing? Then call it the visual component. Does that one look like a resting-state network? Then call it a resting-state network component. This may all seem rather cavalier, especially considering that you are the one making the judgments. Seriously; just think about that for a moment. Think about what you've done today.

In any case, that is a brief overview of ICA. To be fair, there are more rigorous ways of classifying what components actually represent - such as creating templates for connectivity networks or activation patterns and calculating the amount of fit between that and your component - but to be honest, you probably don't have the motivation to go through all of that. And by you, I mean me.


Next we will work through the example datasets on FSL's website, discussing problems such as over- and under-fitting, as well as the MELODIC GUI and which options need to be changed; followed by using ICA to analyze resting-state data.

Tuesday, June 3, 2014

The Next Three Months

Comrades, friends, visionaries, thinkers, and fortune-hunters,

This is, barring any unforeseen circumstances, such as meteor strike, death by Nutella overdose, or floss strangulation, my last summer of graduate school at Indiana University; and I will be spending the majority of that time working on my dissertation to escape this durance vile.

However, I have also been thinking (which should worry you), and it appears to me that what this site has become is a pastiche of training and tutorials which deal with small, isolated problems: How to do this for your data, what it means, and then moving on to the next, often without a clear link between the two, and often with a joke or two thrown in for good measure as a diversion to briefly suspend the terror that haunts your guilt-ridden minds. No doubt some come here only to receive what goods they need and then ride on, I no more than a digital manciple supplying a growing, ravenous demand for answers. Some may come here to merely carp and backbite at my charlatanry; others out of a morbid curiosity for ribald stories and for whatever details about my personal life they can divine; and still others, perhaps, see something here that warms something inside of them.

What is needed, however, is a focus on what ties them all together; not only techniques and methods, but ways of thinking that will stand you in good stead for whatever you wish to figure out, neuroimaging or otherwise. And, as new wine requires new bottles, a higher-order website is needed to serve as a hub where everything that I have created, along with other materials, lies together; where one is able to quickly find the answers that they need, whether through videos or links to the appropriate tools and scripts, along with educational material about the most important concepts to grasp before moving forward with a particular problem. Problem-solution flowcharts, perhaps. Applets that quiz you and record scores, possibly. And, of course, links to my haberdashers. (You think I am making a fortune? Ascots and cufflinks cost more, and without them one would not be in good taste.)

So much for exhortations. Actually putting this into practice is the difficult part, and is an effort that could, quite possibly, kill me from exhaustion, or at least cause me to procrastinate endlessly. We shall see. At least I have taken a few steps towards starting, which includes looking at some HTML code and prodding at it thoughtfully through the computer screen with my finger. I'm not exactly sure how HTML works; but at least whenever I am working with it, I take pains to look very concerned and deep in thought.

However, before this endeavor there are still a few matters left to discuss, which I had earlier promised but still haven't delivered yet. In some states I would be hanged for such effrontery; but your continued patience means that either 1) You really, really don't want to read the manual (and who can blame you?), or 2) You find it worth the wait. Regardless, and whatever my behavior suggests to the contrary, I do want to help out with these concepts, but I do not always get to everyone's questions. Sometimes too much time passes, and they are gradually swept into the vast oubliette of other stuff I tend to forget; other times, some emails do in fact miss me, and aren't opened until much later, when it would seem too late to reply. In any case, I apologize if I have forgotten any of you. Below is a brief schedule of what I plan to cover:

1. Independent Components Analysis with FSL
2. Beta Series Analysis in SPM
3. Explanation of GLMs
4. Multiple Regression in SPM
5. MVPA in AFNI
6. An explanation of GSReg, and why it's fashionable to hate it, using example data on the AFNI website
7. Overview of 3dTproject

And a couple of minor projects which I plan to get to at some point:

1. Comparison of cluster correction algorithms in SPM and AFNI
2. How to simulate Kohonen nets in Matlab, which will make your nerd cred go through the roof

We will start tomorrow with Independent Components Analysis, a method considerably different from seed-based analysis, starting with the differences between the two and terminating at a destination calamitous beyond all reckoning; and then moving on to Beta Series analysis in SPM, which means doing something extremely complicated through Matlab, and plagiarizing code from one of my mentors; and then the rest of it in no set order, somewhere within the mists beyond right knowing where the eye rolls crazily and the lip jerks and drools.