Sunday, July 20, 2014

Quick and Efficient ROI Analysis Using spm_get_data

For any young cognitive neuroscientist out for the main chance, ROI analyses are an indispensable part of the trade, along with having a cool, marketable name, such as Moon Unit or Dweezil Zappa. Therefore, you will find yourself doing a lot of such ROI analyses; and the quicker and more efficient you can do them, with a minimum of error, will allow you to succeed wildly and be able to embark upon an incredible, interesting career for the next four or five decades of your life before you die.

Most ROI analyses in SPM are carried out through the Marsbar toolbox, and for most researchers, that is all they will ever need. However, for those who feel more comfortable with the command line, there is a simple command within the SPM library - spm_get_data - that will make all of your ROI fantasies come true. All the command needs is an array of paths leading to the images you wish to extract data from, along with a matrix of coordinates representing the ROI.

First, the ROI coordinates can be gathered by loading up an arbitrary contrast and selecting an ROI created through, say, Marsbar or wfu_pickatlas. Next, set your corrected p-value threshold to 1; this will guarantee that every voxel in that ROI is "active," which will be recorded by a structure automatically generated each time the Results GUI is opened, a structure called xSPM. One of the fields, xSPM.XYZ, contains the coordinates for each voxel within that ROI. This can then be assigned to a variable, and the same procedure done for however many ROIs you have. The best part, however, is that you only need to do this once for each ROI; once you have assigned it to a variable, you can simply store it in a new .mat file with the "save" command (e.g., save('myROIs.mat', 'M1', 'ACC', 'V1')). These can then be restored to the Matlab workspace at any time by loading that .mat file.

Note: An alternative way to get these coordinates just from the command line would be the following:

Y = spm_read_vols(spm_vol(seedroi),1);
indx = find(Y>0);
[x,y,z] = ind2sub(size(Y),indx);

XYZ = [x y z]';

Where the variable "seedroi" can be looped over a cell containing the paths to each of your ROIs.

The next step is to create your array of images you wish to extract data from - which, conveniently, is stored within the SPM.mat file that is created any time you run a second-level analysis. For example, let's say that I carried out a couple of 2nd-level t-tests, one for left button presses, the other for right button presses. If I go into the folder for left button presses that has already estimated an run a second-level analysis, all of the contrast images that went into that analysis are now stored in SPM.xY.P, which is available in your workspace after simply navigating to the directory containing your SPM.mat file and typing "load SPM".

Lastly, spm_get_data is called to do the ROI analysis by extracting data from each voxel in the ROI for each subject for each contrast, and these are averaged across all of the voxels in that ROI using the "mean" function. Sticking with the current example, let's say have a left M1 region and a right M1 region, the coordinates of which have been extracted using the procedure detailed above, and which have been saved into variables called left_M1 and right_M1, respectively. I then navigate to the left button presses second-level directory, load SPM, and type the following command:

right_M1_leftButtonPress = mean(spm_get_data(SPM.xY.P, right_M1),2)

which returns an array of one value per subject for that contrast averaged across the ROI. You can then easily navigate to another second-level directory - say, right button presses - and, after loading the SPM.mat file, do the same thing:

right_M1_rightButtonPress = mean(spm_get_data(SPM.xY.P, right_M1),2)

T-tests can then be carried out between the sets of parameter or contrast estimates with the t_test function:

[h, p, ci, stats] = ttest(right_M1_leftButtonPress, right_M1_rightButtonPress)

which will return the p-value, confidence interval, and t-value that you would then report in your results.


  1. Hi Andy,

    Thank you for your post. I had assumed that using marsbar and spm_get_data to extract data would give identical results but to my surprise, that is not the case (or maybe I did something wrong).

    For instance, for a particular ROI, in marsbar GUI, I'd use the following steps would be: Design --> Set design from file. Data--> Extract ROI data (default). Data-->Export data. Export what--> Summary time course(s) for region(s). export to--> Matlab workspace. Name data1

    For the same ROI, using spm_get_data like your example:
    data2= mean(spm_get_data(SPM.xY.P, roi_coordinates),2);

    Here, I get two different set of data for data1 and data2. Do you happen to know what differs between the two methods of extraction?

    Many thanks.

    Sam A.

    1. Hi Sam,

      It looks as though you are doing the two approaches right, so I'm unsure of where the disconnect is; I would check SPM.xY.P to make sure that it is pointing to the contrast files you want to work with, and also that the roi_coordinates is the same size as the cluster you select in the SPM results GUI. Other than that, I may need to see the actual data you are working with to debug it.



  2. Hi Andy,

    Thank you for the response. I think the spm_get_data gives you the raw data while the marsbar approaches gives you the timeseries after baseline correction (baseline=average signal of all voxels within ROI). Function getdata.m from marsbar is likely to be equivalent to spm_get_data but I'm not 100% certain.



  3. Hi Andy,

    Thank you for sharing the knowledge! You're blog is great - very useful and easy to follow! I have tried this analysis though, and I actually get different coordinates when selecting them from the SPM GUI, setting the threshold to 1, as you mention, and from the code lines you have above (using spm_read_vols). Do you have any idea why this could be the case?


    1. Hi Catarina,

      That is probably because the mask is at a different resolution than your contrast images; for example, if you load up the Results screen and look at the "Voxel size" field in the lower right corner, you will see something like 3.0 3.0 3.0 mm mm mm, which means your images have a resolution of 3x3x3mm. If your ROI mask has a resolution of, say, 2x2x2mm, then when you load it as a mask in the Results screen, it will be resampled to have the same resolution as the functional images. If you take the original mask and resample it to your functional image resolution (using a program like AFNI's 3dresample), then you should get nearly identical results with each procedure.

      There's a new way to extract beta weights - at least, in SPM12. You can just right click on the results in the glass brain, and select Extract data -> betas -> This cluster. That makes everything I've written about this obsolete :-(


    2. Hi Andy,

      I am actually having the same problem as Catarina. so my question is: in that case, using the SPM GUI and setting the threshold to 1 would be the right way to handle different resolutions?

      if yes, what should I do in cases where I get no results at all in the GUI?

      thank you very much,


    3. This comment has been removed by the author.

    4. Hi again (:

      another problem I am having is that I get slightly different coordinates for different subjects. any idea why? since I only extract the XYZ of the ROI which is the same, I would expect identical results.

      thank you very much, your blog is the best!


  4. Hi Andy,
    Your blog is wonderful and has helped me many a time, thanks!

    I am conducting ROI analysis at the first-level because I need to anatomically define my ROIs in each participant. Using spm_get_data I have extracted parameters from for each voxel in my ROI. These numbers are large (around 600-700) and there is a value perscan for each voxel within the ROI.

    I have 2 questions:
    1. When averaging, I'm assuming I should average across voxels for each scan? And not vice-versa

    2. What's the next step? Can these values be inputed directly into second-level analysis?

    Many thanks,

  5. Hi Andy, many thanks for your blog which is a valuable resource for me. I'm sure you have this info somewhere in your blog, but I wanted to add it to this thread as it is relevant for those interested in doing this.,_standard_error_and_confidence_interval_for_a_coordinate