Thursday, July 19, 2012

FSL Tutorial 2: FEAT (Part 2): The Reckoning

(Note: Hit the fullscreen option and play at a higher resolution for better viewing)


Now things get serious. I am talking more serious than the projected peak Nutella in 2020, after which our Nutella resources will slow down to a trickle and then simply evaporate. This video goes into detail about the FEAT stats tab, specifically what you should and shouldn't do (which pretty much means just leaving most of the defaults as is), lest you screw everything up, which, let's face it, you probably will anyway. People will tell you that's okay, but it's not.

I've tried to compress these tutorials into a shorter amount of time, because I usually take one look at the duration of an online video and don't even bother with anything longer than three or four minutes (unless it happens to be a Starcraft 2 replay). But there's no getting around the fact that this stuff takes a while to explain, so the walkthroughs will probably remain in the ten-minute range.

To supplement the tutorial, it may help to flesh out a few concepts, particularly if this is your first time doing stuff in FSL. The most important part of an fMRI experiment - besides the fact that it should be well-planned, make sense to the subject, and be designed to compare hypotheses against each other - is the timing. In other words, knowing what happened when. If you don't know that, you're up a creek without a paddle (and without Nutella). There's no way to salvage your experiment if the timing is off or unreliable.

The documentation on FSL's website isn't very good when demonstrating how to make timing files, and I'm surprised that the default option is a square waveform to be convolved with a canonical Hemodynamic Response Function (HRF). What almost every researcher will want is the Custom 3 Column format, which specifies the onset of each condition, how long it lasted, and any auxiliary parametric information you have reason to believe may modulate the amplitude of the Blood Oxygenation Level Dependent (BOLD) response. This auxiliary parametric information could be anything about that particular trial of the condition; for example, if you are showing the subject one of those messed-up IAPS photos, and you have a rating about how messed-up it is, this can be entered into the third column of the timing file. If you have no reason to believe that one trial of a condition should be different from any other in that condition, you can set every instance to 1.

Here is a sample timing file to be read into FSL (they really should post an example of one somewhere in their documentation; I haven't been able to find one yet, although they do provide a textual walkthrough of how to do it under the EVs part of the Stats section here):

10  1  1
18  1  1
25  1  1
30  1  1


Translated, this file means that the condition occurred at 10, 18, 25, and 30 seconds relative to the start of the run; that each trial of the condition lasted one second; and that there was no parametric modulation.
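If you're generating these files from a script, a minimal sketch in Python might look like the following (the onsets and duration are the example values above; the filename is hypothetical):

```python
# Minimal sketch: write an FSL "Custom 3 Column" timing file.
# Columns are onset (s), duration (s), and parametric modulation value.
# The onsets and duration below are the example values from the text.

onsets = [10, 18, 25, 30]   # seconds relative to the start of the run
duration = 1                # each trial lasted one second
modulation = 1              # no parametric modulation

with open("condition_A.txt", "w") as f:
    for onset in onsets:
        f.write(f"{onset}\t{duration}\t{modulation}\n")
```

One file like this per condition, and you're set.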

A couple of other things to keep in mind:

1) When setting up contrasts, also make a simple effect (i.e., just estimate a beta weight) for each condition. This is because if everything is set up as a contrast of one beta weight minus another, you can lose valuable information about what is actually driving that contrast.

As an example of why this might be important, look at this graph. Just look at it!

Proof that fMRI data is (mostly) crap


These are timecourses extracted from two different conditions, helpfully labeled "pred2corr0" and "pred2corr1". The contrast of pred2corr0 minus pred2corr1 came out positive. However, here's what actually happened: the peaks of the HRF for both conditions were negative. (The peaks are the timepoints under the "3" on the x-axis; at 2 seconds per scan, 3 scans translates into 6 seconds, the typical peak of the HRF after the onset of a stimulus.) The peak for the pred2corr1 condition just happened to be more negative than that of pred2corr0, hence the positive contrast value.

2) If you have selected "Temporal Derivative" for all of your regressors, then every other column will represent an estimate of what the temporal derivative should look like. Adding a temporal derivative has the advantage of accounting for any potential lags in the onset of the HRF, but comes at the cost of a degree of freedom, since you have something extra to estimate.
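To see why the derivative costs an extra parameter, here is a rough sketch that builds one convolved regressor and its temporal derivative as two design-matrix columns. The HRF here is an illustrative double-gamma, not FSL's exact kernel, and the onsets are the example values from the timing file above:

```python
import numpy as np
from math import gamma

# Sketch: why adding a temporal derivative costs a degree of freedom.
# The derivative becomes a second design-matrix column, so one extra
# parameter is estimated per original regressor.

tr = 2.0
n_scans = 100

# Boxcar for the example onsets (10, 18, 25, 30 s), 1 s duration
boxcar = np.zeros(n_scans)
for onset in (10, 18, 25, 30):
    boxcar[int(onset // tr)] = 1.0

# Illustrative double-gamma HRF (peak ~5-6 s, small undershoot);
# not FSL's exact kernel
hrf_t = np.arange(0, 32, tr)
hrf = (hrf_t**5 * np.exp(-hrf_t) / gamma(6)
       - hrf_t**15 * np.exp(-hrf_t) / (6 * gamma(16)))

regressor = np.convolve(boxcar, hrf)[:n_scans]
derivative = np.gradient(regressor, tr)  # lags in HRF onset load on this column

design = np.column_stack([regressor, derivative])
print(design.shape)  # (100, 2): two parameters instead of one
```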

3) After the model is set up and you click on the "Efficiency" tab, you will see two sections. The section on the left represents the correlation between regressors, and the section on the right represents the singular value decomposition eigenvalue for each condition.

What is an eigenvalue? Don't ask me; I'm just a mere cognitive neuroscientist.
For the correlations, brighter intensities represent higher correlations. So, it makes sense that the diagonal is all white, since each condition correlates with itself perfectly. However, it is the off-diagonal squares that you need to pay attention to, and if any of them are overly bright, you have a problem. A big one. Bigger than finding someone you loved and trusted just ate your Nu-...but let's not go there.
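The check FEAT displays visually can also be sketched numerically: given a design matrix with one column per regressor, compute the pairwise correlations and flag any bright off-diagonals. The matrix below is simulated, with one pair of columns made deliberately collinear:

```python
import numpy as np

# Sketch of the check behind FEAT's correlation display: pairwise
# correlations between design-matrix columns, flagging large off-diagonals.
# The design matrix here is simulated; columns 0 and 3 are nearly identical.

rng = np.random.default_rng(0)
design = rng.normal(size=(200, 4))                        # stand-in design matrix
design[:, 3] = design[:, 0] + 0.1 * rng.normal(size=200)  # collinear pair

corr = np.corrcoef(design, rowvar=False)                  # 4 x 4 correlation matrix

# Any off-diagonal correlation above an (arbitrary) threshold spells trouble
flags = [(i, j, corr[i, j])
         for i in range(4) for j in range(i + 1, 4)
         if abs(corr[i, j]) > 0.8]
print(flags)  # columns 0 and 3 should be flagged
```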

As for the eigenvalues on the right, I have yet to find out which ranges of values represent safety and which represent danger. I will keep looking, but for now it is probably a better bet to estimate design efficiency with a package like AFNI to get a good idea of how your design will hold up under analysis.


That's it. A few more videos will be uploaded, and then the beginning user should have everything they need to get started.

27 comments:

  1. Thanks for the awesome blog and tutorials! A great supplement to the FSL documentation!

    1. Thanks for the kind words! You're very welcome!

  2. thank you you saved my life

  3. You are the man!

  4. Hello all,
    first thank you a lot (like a lot) for this awesome blog.

I've encountered a problem that I need help with. I have done an event-related analysis on a subject in FSL which has given us good results. Now I am interested in doing this in SPM. I want to use the preprocessed 4D NIfTI file (filtered_func_data.nii.gz) and do the first-level analysis in SPM. The results are totally different; do you have any idea why? I have used exactly the same contrasts. The design matrices (design.mat in FSL and SPM.xX.X in SPM) are way different. Do you have any suggestions?

    Thanks

    1. Hi Ashkan,

      If the design matrices are radically different, then that is probably the cause of the differences you see in the contrast maps. There may be something going on with SPM filtering the data again, but I'm not sure. How exactly are you setting up the design matrix in SPM? Are you converting the onsets to SPM format? That is the first thing that comes to my mind.

      -Andy

  5. Hi Andy,
    Thanks very much for all of your tutorials. They've been incredibly helpful. I was wondering if you could clarify one of the examples you provided in this tutorial, as I think it might be almost identical to what I'm now trying to do (I've had trouble finding detailed documentation regarding this on the FSL website). If you entered in the rating of how "messed up" each of your IAPS images are in the third column of your timing file, and then ran the first level analysis, you would basically be getting information regarding regions for which the BOLD response is modulated by degree of..."messed up image," right? If you also had two different categories of images, would you be able to create an EV for each type of image, and then ultimately create contrasts that would show you brain regions for which the modulation is significantly greater for one category than the other? (e.g. degree of messed up images modulates activity in the left IFG to a greater extent when images depict people as compared to animals)? When you enter this auxiliary parametric information in your timing file, can you simply enter in the raw values, or do they need to be demeaned? Thanks very much for your help; I really appreciate it!

    -Avery

    1. Hi Avery,

      You should demean your parametric modulators, since you want to see how well a deflection from the mean (i.e., an unmodulated response) captures the variability in the BOLD response.

      It's been a while since I've done this kind of analysis in FSL, but I know that at least in SPM and AFNI you get two regressors: One for the unmodulated, "normal" regressor, and another regressor inserted into the model capturing any additional variability in the BOLD response modeled by your parametric modulator. These parametric modulation maps can then be used for second-level t-tests, just like any other regressor or contrast.
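As a sketch, demeaning (and then rescaling, since FSL appears to require values between -1 and 1, as a later comment notes) might look like this, using hypothetical per-trial ratings:

```python
import numpy as np

# Sketch: demean a parametric modulator, then rescale it so the values
# stay within [-1, 1]. The ratings below are hypothetical.

ratings = np.array([3.0, 7.0, 5.0, 9.0])     # per-trial "messed-up" ratings

demeaned = ratings - ratings.mean()          # center on zero
scaled = demeaned / np.abs(demeaned).max()   # bound magnitudes at 1

print(scaled)  # mean ~0, largest magnitude exactly 1
```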


      Best,

      -Andy

    2. Hi Andy,
Thanks for your quick response! I'll try this out and see what happens. It looks like FSL is also going to require that I normalize the demeaned values - I think I can only enter in numbers between -1 and 1.

  6. This comment has been removed by the author.

  7. Hi Andrew - thank you for all of your helpful tutorials! Do you know if the onset files for FSL need to be in order from earliest timepoint to latest timepoint or is it possible for them to be "out of order"?

    1. Hi Sarah,

      FSL will rearrange the onsets from earliest to latest, no matter how they are entered into the timing file. You can test this yourself by creating a timing file with entries "1 1 1; 50 1 1; 100 1 1" and another file with the entries "100 1 1; 50 1 1; 1 1 1". They should be identical when you click on the "View Design" button in the GLM tab.

      Best,

      -Andy

    2. Thank you, Andy! So helpful!

  8. Hi Andy,

    Thanks a lot for all your help.

    Regarding the sample text file:

    10 1 1
    18 1 1
    25 1 1
    30 1 1

    Do they refer to one event, in four time points?

    I was wondering how a sample time file looks like when there are 3 events of different types (first event (a), second event (b), third event (c) following each other as:

    a
    a
    b
    a
    b
    b
    c
    a
    b
    c

    and how in the analysis it becomes clear which is which?

    Best,
    Simin

    1. Hi Simin,

      That text file represents one condition with 4 events (or trials) occurring at timepoints 10, 18, 25, and 30.

      If you have multiple conditions, you will need to make a separate text file for each one, following the same format as the sample text file I show above.

      -Andy

    2. Dear Andy, thanks a lot for your reply.
      I have also another question
Is it possible to analyze a blocked-design fMRI experiment by using the Full Model Setup under the Stats tab with Basic Shape: Custom (one entry per volume) and Convolution: Double-Gamma HRF, but with a timing text file that, instead of listing the onset times of the stimuli, has one row per volume, with zeros and ones indicating which volumes were rest and which were task, as:
      0
      0
      0
      0
      1
      1
      1
      1
      1
      1
      0
      0
      0
      0
      Does it make sense?

    3. Dear Andy,
      I still need your reply.

      Best,

    4. Hey Simin,

      I would instead use the "Custom 3-Column File" option, since you can specify the duration of the task. The 1-Column File option assumes that there is a distinct task event at each "1" that you specify; in your example, that there are 6 separate task events.

      You can run a simulation by creating 2 text files: one with the timing you have, and another one with one line that says "12 18 1"; and then doing the following steps:

      1. Open up FEAT
      2. Under the Data tab, specify that there are 14 volumes; leave the TR as 3.0s
      3. Select the Stats tab, then Full Model Setup
      4. Create 3 EVs
      5. For EV1, select Basic Shape: Custom (3 Column Format) and select your 3-column text file
      6. For EV2, select Basic Shape: Custom (1 Column Format) and select your 1-column text file
      7. For EV3, select Basic shape: Square, and specify Off as 12s, and On as 18s
      8. Select Convolution: Double-Gamma for each EV, and unselect Add temporal derivative

      When you click on View Design, you should see that the 3-column format and the boxcar function (EVs 1 and 3) are nearly identical, and that there is a difference between the estimated shape of EVs 2 and 3.


      Best,

      -Andy

    5. Hi Andy,
      Thanks a lot.

      Best,
      Simin

  9. Hi Andy,

    I am a medical student doing research in neuroimaging, and I had a few quick questions for you. My research adviser has me doing both 3 and 1 column format EVs for the Feat analysis of a study we are doing. What exactly is the difference between 3 and 1 column? And how does FSL model these two types differently? If you could help me with this I would really appreciate it.

    Also I just wanted to thank you for your videos and tutorials. You're a life saver.

    -Nick

    1. Hey Nick,

      1-column format specifies whether the condition occurred at a timepoint (i.e., a TR) or not, by entering either a 0 or a 1 for each timepoint. 3-column format gives you more flexibility by allowing you to specify onsets not occurring on a TR, and by allowing you to specify duration and parametric modulation. In either 1-column or 3-column format, the onset is convolved with the basis function that you specify.

      For most cases I prefer 3-column format since it can do everything 1-column format can do, and more.
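As a sketch of the relationship between the two formats, here is one way to convert a per-volume 0/1 specification into 3-column rows, assuming a hypothetical TR of 2.0 s. Note that consecutive 1s are merged into a single block here, which is exactly the interpretation the 1-column format cannot express, since FSL treats each 1 as its own event:

```python
# Sketch: convert a 1-column (one 0/1 flag per volume) specification into
# 3-column (onset, duration, modulation) rows, assuming a hypothetical
# TR of 2.0 s. Runs of consecutive 1s become single blocks.

tr = 2.0
one_column = [0, 0, 1, 1, 1, 0, 0, 1, 0]   # hypothetical per-volume flags

events = []       # (onset, duration, modulation) rows
start = None
for i, flag in enumerate(one_column + [0]):  # sentinel 0 closes a trailing run
    if flag and start is None:
        start = i                            # run of 1s begins at this volume
    elif not flag and start is not None:
        events.append((start * tr, (i - start) * tr, 1))
        start = None

print(events)  # [(4.0, 6.0, 1), (14.0, 2.0, 1)]
```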

      -Andy

  10. Hi Andy,
    Firstly- thanks for this post (and your blog). It's so very helpful!
    I have different timing files for each participant because I want to look at correct conditions together. Thus, based on behavioural performance, the number of events and timing is different across subjects.
    In this video you loaded in one timing file per condition. Is there an option to load in one timing file per participant per condition? Or should I use the third column in the timing file to note whether the trial was correct or not (for example '2' for correct trials versus '1' for incorrect)?
    Many thanks!

  11. Hi Andy,

    I know you mentioned that the image on the right (when you click on efficiency) represents the singular value decomposition eigenvalue for each condition, and if any of the diagonals are dark or black, it's a problem. In the example that you have pictured above, the 3 bottom right diagonals look dark/black. Would this be a problem?

    If so, what does the problem mean/how do you interpret it?

Also, what can you do to address the problem? Is it that the effect size you need to see a difference is just higher, so you have lower power, or is there something inherently wrong/incorrect/flawed about the EVs and contrasts you've used for your model/analysis?

    Thank you for your help in advance!

  12. Hi Andy, firstly, thank you for the blog! I'm analysing fMRI data for the first time, practically by myself, and it's been very helpful. I got stuck at the contrasts bit, however, and thought I'd ask if you could help me as the deadline for my thesis is approaching and I have nobody else to turn to (my supervisor doesn't know FSL..). It would save my life!

    Basically, I just need an experienced opinion to check that my contrasts actually reflect my hypotheses, before I press go.

I have 6 EVs - 4 emotional facial expressions (EV1 anger, EV2 disgust, EV3 surprise, EV4 happiness) and control (EV5 neutral, EV6 moving object).

    I want to find:
    a) areas more active (and positive) for a given emotion than for control
    b) areas more active (and positive) for at least one emotion than for control
    c) areas more active (and positive) for all emotions than for control
    d) focus on specific ROIs and find out which emotions activate them the most (> than all other emotions and control, and positive)

    FIRST-LEVEL CONTRASTS
    OC1 1 0 0 0 0 0
    OC2 0 1 0 0 0 0
    OC3 0 0 1 0 0 0
    OC4 0 0 0 1 0 0
    OC5 0 0 0 0 1 0
    OC6 0 0 0 0 0 1
    OC7 1 0 0 0 -1 -1
    OC8 0 1 0 0 -1 -1
    OC9 0 0 1 0 -1 -1
    OC10 0 0 0 1 -1 -1
    OC11 1 1 1 1 0 0
    OC12 1 -1 -1 -1 -1 -1
    OC13 -1 1 -1 -1 -1 -1
    OC14 -1 -1 1 -1 -1 -1
    OC15 -1 -1 -1 1 -1 -1

    F1: OC1, OC2, OC3, OC4
    F2: OC7, OC8, OC9, OC10

    From my understanding:
    a) (for anger): OC7 masked by OC1; (for disgust): OC8 masked by OC2 etc.
    b) F2 masked by F1
    c) OC11 masked by OC7 (already masked by OC1) and OC8, OC9, OC10
    d) OC12 masked by OC1; OC13 masked by OC2 etc.
    ...while masking using Z>0 instead of Z stats pass thresholding.

    Specific things I'm not sure about:
    - should the numbers in a row add up to 0? I.e. should my OC7 actually be 1 0 0 0 -0.5 -0.5?
- does the maskception actually work in the way I imagine ("OC11 masked by OC7 (already masked by OC1)"), or would "OC11 masked by OC7" simply mean OC11 masked by 1 0 0 0 -1 -1?

    Thank you in advance, your help will be greatly appreciated!!

    1. Hi Luc,

      Sorry for getting back to you so late; I haven't been checking this site as regularly!

      As for your questions, yes, the positive weights should add to +1, and the negative weights should add to -1. For example, with OC7, you want the average control activation; if you don't make the weights sum to -1 in that case, it will be biased towards the negative weights simply because there are more regressors that are weighted negatively.

      As for your masking question, be careful about defining an ROI with a contrast and then extracting that same parameter from it. This is known as circular analysis, or double dipping, and it artificially inflates your ROI estimates.

For example, if you mask OC11 by OC7, the ROI defined by OC7 includes any voxels where the anger EV is higher than the control EVs (or where the control EVs are lower than the anger EV). If you then look for active voxels in that region for OC11, OC11 already includes the anger EV, so it's biased toward finding a significant result there. In any case, it would not be a valid inferential test. I would recommend using an unbiased ROI, such as an anatomical ROI or an ROI based on coordinates from related studies (i.e., an independent dataset).

      Feel free to reply to this; I'll be checking for an answer, and will respond sooner.


      Best,

      -Andy

  13. Hi Andy!

    Thanks for your videos! They saved me a lot of times!

    Now I would like to ask you a few questions on the EV design (custom 3 columns format).
    I have a dataset, 6 runs 30 stimuli, 2x2 i.e. 4 per each run.
    My question is: how can I understand when each event was on and for how long, in order to make to file to upload?

    I hope my question is clear, thank you a lot!
    Carlotta.
