Wednesday, December 17, 2014


Fellow brainbloggers,

Like a neglectful parent, I have been away far too long - although I did leave twenty dollars on the kitchen table for you to order pizza - and while I could recite the usual litany of excuses, including job searching, teaching, writing papers, sorting the recycling, learning glassblowing, and recovering from gout, none of those warrants such an excessive absence; especially given the supportive, heartwarming messages I have received from some of you describing how the blog has helped save time, money, and, I like to think, relationships. Maybe. If you've learned nothing else, just remember that the most important part of a relationship is winning every argument.

However, as the end of my graduate school years comes closer, I have had to make difficult decisions resulting in the abandonment of the mistress of my youth, blogging, for the wife of mine age, dissertation writing. I never thought I would end up like all the others at this stage, finding myself using all of my free time to work on my dissertation. Partly it is because using the excuse of working on your dissertation is incredibly effective at getting out of any undesirable obligations. For example:

Injured Friend: Andy, I need to stanch the bleeding caused by a freak accident involving a pair of nosehair trimmers. Please help.
Me: You know I would, but I have to work on my dissertation.
Injured Friend: Of course; how selfish of me.

Girlfriend: Andy, we've been in a long-distance relationship for three years now, but during that time I have only seen you once on Skype - and that was just to tell me the joke about who they found in Jeffrey Dahmer's freezer*. I need more commitment from you.
Me: Sorry, Schmoopy, but I have to work on my dissertation.
Girlfriend: Whenever you say "I have to work on my dissertation," you sound so incredibly confident, productive, and sexy. I completely understand, and apologize for any inconvenience I may have caused.

Advisor: Andy, you need to spend less time screwing around, and more time working on your dissertation.
Me: Maybe, but I should probably work on my dissertation instead.
Advisor: Holy mackerel, you need to get on that.

The manifest absurdity of such statements doesn't seem to strike anybody as particularly odd - after all, you are slogging away for long hours on something productive and good, and not, say, out in the streets mugging anybody, or committing cyberterrorism - and this fills many graduate students with a satisfying sense of self-righteousness.

In any case, if you have written regarding any questions or issues, I will get to it eventually - there is a bit of a backlog at the moment, and I've had to do some triage. Although you should probably know that, again like a neglectful parent, I will be going to Spain for the holidays. That's right, I said Spain. And although I realize it may not always be the wisest thing to advertise when I am away from my apartment, please don't break into it while I'm gone. (Although, if you do, there are hotpockets in the freezer. Just saying.)

*Ben and Jerry.

Wednesday, October 8, 2014

How to Secure a Job in Academia

Ending graduate school and going on the job market is a terrifying prospect, especially for those nursing at the teat of a graduate student stipend. Sure, it's not an especially large amount of money, but it gets you by, pays for rent, pays for the food, and possibly pays for Netflix. The only reason you would leave it is for the more attractive teat of a professor's salary, which, if you hit the jackpot and receive tenure, you will get for the rest of your life. That is, unless you screw up bigtime by neglecting your teaching and research duties, have destructive affairs with your students, and in general completely abuse the purpose of tenure.

I am, of course, joking. There's no way you would ever lose tenure. That's why it's so popular: You can act however you want and nobody can do anything to stop you. Seriously. The U.S. military is currently experimenting with granting soldiers tenure, complete with sabbaticals every three years, and finding that they become invincible on the battlefield.

Obviously, then, securing a tenure-track job is important. If nothing else, you will need something to do for the next few decades of your life before you begin to decay and die. The rest of your friends have jobs, some of them on Wall Street. You're never quite sure what it is that they do, since most of the pictures you see of them, from what you can make out, involve cocaine-fueled orgies with celebrities. Still, they have jobs. They have purpose. The purpose of life, actually - and this is what everyone, deep down, believes in their core - is to have a secure job that pays well and that everyone else admires, even envies. The best jobs (and this is especially true in modern democracies) will dole out prizes regularly, and, ideally, you will get those prizes. 

This is the meaning of life. Anyone who tells you otherwise is wrong. I am right. The notion that there could be anything more to life is pernicious, even hateful, and you will remove it from your mind. I permit you to find the leisure time to read books, go to the opera, appreciate art, take up yoga, become politically involved, choose to become religious or to become an atheist, determine what your values are, form meaningful relationships. These activities will make you feel like a swell person, like an authentic human being, and you will derive much pleasure from comparing how well-rounded and how thoughtful you are to others. But one must never lose sight of what matters.

That is why I recommend using the website The Professor Is In to build your job application. The website is managed by Dr. Karen Kelsky, who has had literally oodles of experience reviewing job applications and has a nose for what works and what does not. Everybody uses her site. Everybody. If you do not use her site, you will fail. Failure means not getting the job, which means you will not have purpose in your life.

You should be terrified at this prospect. You may think there are alternatives. There are no alternatives. The most successful tyranny does not suppress alternatives, it removes awareness of alternatives. This is me establishing a tyranny over you. You will obey. This is easy, since you have already been conditioned to feel this way by graduate school. You love jobs, prizes, and the acclaim of your peers; you are horrified by poverty, debt, shame. It is natural. Everyone feels it, no matter what they do. I have known scores of individuals desperately trying to lead bohemian existences, but in the end they all came back to the importance of a good job. Even those who most fervently preach the ideal of nonconformity, sincerity, and independence of mind, are those who, underneath their outrageous behavior and wild external adornments, lead the most traditional and safest of lives. For all of the exotic places they travel to, for all the louche connections they boast about, their internal lives are flat, their sexual lives withered. It is not the divine madness Socrates praised, nor is it the direct, immediate, nonintellectual perception of reality so highly prized by Lawrence. It is a stopgap, a rearguard action; merely something to fill up the vile lacuna in the middle of their existence.

But I digress. What I mean to say is that you should follow my orders for getting a job. Following my orders is not weakness. It is rational. You will want to get the job, so you will go to the website I just gave you. You will follow its instructions. You will both smile and cringe at the anecdotes which hit close to home for you. You will compare its examples with what you have written, and find out where you are wrong and she is right.

The reason for all of this is to be secure. There was a time when desiring this above all else was considered cowardly, pusillanimous, and shameful, but that was then. This is now. You may sneer at all of this, but you know that I am right. You may have faint stirrings of indignation that rebel against everything I have said, but you will still do what I say. Do this, and you will be happy. The notion that happiness consists in anything else is laughable. Happiness is promised by health, security, and a sound mind; not by Plato, Dickens, and Hemingway. Give men bread, and then ask of them virtue.

Tuesday, September 23, 2014

Andy's Brain Blog Needs Your Help!

To be more specific, I (Andy) need your help; but what's good for Andy's Brain Blog is good for America - and you're all patriots, right?

As I mentioned before, I am currently applying for jobs and putting together my research and teaching portfolios, playing up all the sexy studies currently in the works, and what I plan to do for the next few years; how I can attract students to the university, students to the department, secure money, funding, recognition, and all that good stuff necessary for the vitality of the university.

However, as all of you know, this right here is one of my dearest, most ambitious projects - to make statistics, neuroimaging, and computational modeling available to everyone in straightforward, simple terms. To use online repositories to get at data unavailable to the majority of smaller, liberal arts institutions, so that students from all parts of the globe, researchers anywhere, and quite literally anyone with a computer can get a piece of the action. To make the information in dense, unreadable technical manuals accessible and easy to understand through hands-on, no-nonsense tutorials. And - perhaps most importantly - I wear a suit when I do it.

I want to continue doing this. I want to continue building, expanding, and teaching, both here and in the classroom. I will not rest: I will drink life to the lees. Heck, maybe I'll even drink the lees too.

But to do that I need your help.

Through the comments here, the YouTube channel, and private correspondence, I've talked with many researchers, students, and professors around the country and around the world. Most of you I've never seen, but I've had the privilege to help out professors and scholars all the way from Italy to China; I've freelanced for PET researchers at Michigan State and schizophrenia experimenters at Indiana, and designed experiments for primates in New York. The AFNI tutorials created here have been used as class material at the University of Pittsburgh, and my code for Hodgkin-Huxley simulations has been used for demonstrations at Claremont McKenna College in California. My recipe for homemade granola is used by several hungry researchers to keep them going throughout those long afternoons. The list could go on.

What I ask is this: if you have ever used the materials here in an educational setting, be it for the researchers in your lab or the students in your classroom, please let me know by sending an email to ajahn [at] indiana [dot] edu. I am trying to compile a list of where these materials are used, to demonstrate their reach and effectiveness.

Lastly - allow me to get real here for a moment - I've thought of all of you, each person that I've conversed with or replied to or Skyped with, as my children. And not as "children" in the sense of putting you all down on my tax returns as dependents; although, if the IRS isn't too strict about metaphors, I think I could get away with that. No, I've thought of you as real children: Shorter than I am, a little incoherent at times maybe, often trying to get my attention, and playing with Legos and Gak.

Regardless, it's been one of my great pleasures to help you all out. Now get away from that electrical socket.

Sunday, September 21, 2014

Bayesian Inference, Step 2: Running Bayesian Inference with Your Data

If you've had the opportunity to install RStudio and JAGS, and to download the associated programs needed to run Bayesian inference, it is a small step to actually getting your parameter estimates. Two of the programs from the BEST folder - BESTExample.R and BEST1G.R - allow you to run independent-samples and one-sample t-tests, respectively. All that's required is replacing the input in the "y1" and "y2" strings with your data, separating each observation with a comma, and then running the program. Other options can be changed, such as which value you are comparing against in the one-sample t-test, but everything else can essentially remain the same.

I realize it's been a while between posts, but right now I'm currently in the process of applying for jobs; this should start to pick up again in mid-October once the deadlines pass, but in the meantime, wish me luck!

Tuesday, September 2, 2014

Bayesian Inference, Step 1: Installing JAGS On Your Machine

Common complaint: "Bayesian analysis is too hard! Also, I have kidney stones."
Solution: Make Bayesian analysis accessible and efficient through freeware that anyone can use!

These days, advances in technology, computers, and lithotripsy have made Bayesian analysis easy to implement on any personal computer. All it requires is a couple of programs and a library of scripts to run the actual process of Bayesian inference; all that needs to be supplied by you, the user, is the data you have collected. Conceptually, this is no more difficult than entering data into SAS or SPSS, and, I would argue, is easier in practice.

This can be done in R, statistical software that can interface with a variety of user-created packages. You can download one such program, JAGS, to do the MCMC sampling for building up distributions of parameter estimates, and then use those parameter estimates to brag to your friends about how you've "Gone Bayes."

All of the software and steps you need to install R, JAGS, and rjags (a package allowing JAGS to talk to R) can be found on John Kruschke's website. Once you have that, it's simply a matter of entering your own data, and letting the program do the nitty-gritty for you.

Tuesday, August 19, 2014

Gone for a Few Days

Going to be out of the state for a few days; a few of you have written questions to me about technical issues or methods, and I'll try to get back to you as soon as I can!

In the meantime, as is custom whenever I go on vacation, here is another late masterwork by Brahms. Always a mysterious figure, one of the greatest enigmas surrounding Brahms's life was his relationship with Clara Schumann, piano virtuoso and widow of the famous composer Robert Schumann. Scholars have speculated for decades what happened between them - because, obviously, knowing whether they hooked up or not is essential for how we listen to and appreciate the music of Brahms, and such a line of inquiry is a valid funding opportunity for government grants.

Fortunately, however, I recently came across one of Brahms's lost journals in a second-hand bookstore; and the secrets disclosed within will finally give us an answer. Let's take a look:

November 4th, 1876: Cloudy today. Composed first symphony.

August 9th, 1879: Went clubbing with Liszt and Wagner. Hooked up with Clara Schumann. No doubt about it.

October 24th, 1884: Note to self: Find out why I'm writing my journal in English.

April 3rd, 1897: Died from cancer.

Monday, August 18, 2014

Bayesian Inference Web App That Explains (Nearly) Everything

For those still struggling to understand the concepts of Bayesian inference - in other words, all of you - there is a web app developed by Rasmus Baath which allows one to see the process unfolding in real time. As with an independent-samples t-test, we are trying to estimate the mean difference between the populations that the samples were drawn from; however, the Bayesian approach offers much richer information: an entire distribution of parameter values, showing which ones are more credible than others, given the data that has been collected.

The web app is simple to use: You input data values for your two groups, specify the number of samples and burn-in samples (although the defaults for these are fine), and then hit the Start button. The MCMC chain begins sampling the posterior distribution, building up a distribution of credible parameter values, and the 95% of the distribution containing the most credible values is labeled the highest density interval, or HDI. The same procedure is applied to all of the other parameters informed by the data, including normality, effect size, and individual group means and standard deviations, which provides a much richer set of information than null hypothesis significance testing (NHST).
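If you want to see the machinery without installing anything, here is a toy sketch in pure Python - my own illustration, not Baath's code - of a Metropolis sampler estimating a group mean, plus the 95% HDI of the resulting chain. It assumes a flat prior and a known standard deviation, which the real app does not; the point is only to show how a chain builds up a posterior and how the HDI is read off it.

```python
import math
import random
import statistics

def metropolis_mean(data, n_samples=20000, burn_in=2000, proposal_sd=0.5, seed=0):
    """Toy Metropolis sampler for the mean of `data`,
    treating the SD as known and the prior as flat."""
    rng = random.Random(seed)
    sd = statistics.stdev(data)

    def log_lik(mu):
        return -sum((x - mu) ** 2 for x in data) / (2 * sd ** 2)

    mu = statistics.mean(data)  # start the chain at the sample mean
    chain = []
    for i in range(n_samples):
        proposal = mu + rng.gauss(0, proposal_sd)
        # accept with probability min(1, posterior ratio)
        if math.log(rng.random()) < log_lik(proposal) - log_lik(mu):
            mu = proposal
        if i >= burn_in:  # discard the burn-in samples
            chain.append(mu)
    return chain

def hdi(chain, mass=0.95):
    """Narrowest interval containing `mass` of the samples: the highest density interval."""
    s = sorted(chain)
    k = int(len(s) * mass)
    width, i = min((s[j + k] - s[j], j) for j in range(len(s) - k))
    return s[i], s[i + k]
```

Run this on each group separately and difference the chains, and you have a posterior over the mean difference - essentially what the app displays, minus its t-distribution likelihood and its priors on normality and the SDs.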

Because of this I plan to start incorporating more Bayesian statistics into my posts, and also because I believe it will overtake, replace, and destroy NHST as the dominant statistical method in the next ten years, burning its crops, sowing salt in its fields, looting its stores, stampeding its women, and ravishing its cattle. All of this, coming from someone slated to teach the traditional NHST approach next semester; which, understandably, has me conflicted. On the one hand, I feel pressured to do whatever is most popular at the moment; but on the other, I am an opportunistic coward.

In any case, Bayesian inference is becoming a more tractable technique, thanks to programs that interface with the statistics package R, such as JAGS. Using this to estimate parameters for region-of-interest data, I think, will be a good first step for introducing Bayesian methods to neuroimagers.

Friday, August 8, 2014

Parametric Modulation with SPM: Why Order Matters

Those who are the youngest children in the family are keenly aware of the disadvantages of being last in the pecking order; the quality of toys, clothes, and nutrition is often best for the eldest children in the family, and gradually tapers off as you reach the last kids, who are usually left with what are called, in parent euphemisms, "hand-me-downs."

In reality, these are things that should have been discarded several years ago due to being broken, unsafe, containing asbestos, or clearly being overused, such as pairs of underwear that have several holes in them so large that you're unsure of where exactly your legs go.

In SPM land, a similar thing happens to parametric modulators, with later modulators getting the dregs of the variance not wanted by the first modulators. Think of variance as toys and cake, and parametric modulators as children; everybody wants the most variance, but only the first modulators get the lion's share of it; the last modulators get whatever wasn't given to the first modulators.

This happens because, when the GLM is set up, SPM uses a command, spm_orth, to orthogonalize the modulators. The first modulator is essentially untouched, but all other modulators coming after it are orthogonalized with respect to the first; and this proceeds in a stepwise fashion, with the second modulator getting whatever variance was unassigned to the first, the third getting whatever was unassigned to the second, and so on.

(You can see this for yourself by doing a couple of quick simulations in Matlab. Type x1=rand(10,1), and x2=rand(10,1), and then combine the two into a matrix with xMat = [x1 x2]. Type spm_orth(xMat), and notice that the first column is unchanged, while the second is greatly altered. You can then flip it around by creating another matrix, xMatRev = [x2 x1], and using spm_orth again to see what effect this change has.)
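If you don't have Matlab or SPM handy, the same behavior can be reproduced in a few lines of plain Python. This is my own sketch of what successive orthogonalization does - regressing each column against all earlier ones - not SPM's actual implementation, which differs in its numerical details:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def successive_orth(cols):
    """Orthogonalize each column with respect to all earlier ones,
    i.e., regress out the variance already claimed by preceding modulators."""
    out = []
    for col in cols:
        resid = list(col)
        for earlier in out:
            beta = dot(resid, earlier) / dot(earlier, earlier)
            resid = [r - beta * e for r, e in zip(resid, earlier)]
        out.append(resid)
    return out
```

As in the Matlab simulation, the first column comes out untouched while the second is stripped of anything it shares with the first; reverse the order, and the roles reverse with it.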

Be aware of this, as this can dramatically alter your results depending on how your modulators were ordered. (If they were completely orthogonal by design, then the spm_orth command should have virtually no impact; but this scenario is rare.) This came to my attention when looking at a contrast of a parametric modulator, which was one of three assigned to a main regressor. In one GLM, the modulator was last, and correspondingly explained almost no variance in the data; while in another GLM, the modulator was first, and loaded on much more of the brain. The following two figures show the difference, using exactly the same contrast and correction threshold, but simply reversing the order of the modulators:

Results window when parametric modulator was the last one in order.

Results window when parametric modulator was the first one in order.

This can be dealt with in a couple of ways:
  1. Comment out the spm_orth call in two scripts used in setting up the GLM: line 229 in spm_get_ons.m, and lines 285-287 in spm_fMRI_design.m. However, this might affect people who actually do want the orthogonalization on, and also this increases the probability that you will screw something up. Also note that doing this means that the modulators will not get de-meaned when entered into the GLM.
  2. Create your own modulators by de-meaning the regressor values, inserting each value at the TR where the modulator occurred, then convolving this with an HRF (usually SPM's canonical one). If needed, downsample the output to match the original TR.
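To make option 2 concrete, here is a rough sketch in Python - my own stand-in, not SPM code. The double-gamma function below only approximates SPM's canonical HRF, and I build the regressor directly at TR resolution, so the downsampling step is skipped:

```python
import math

def simple_hrf(dt=1.0, length=32.0):
    """Crude double-gamma HRF: a peak minus a late undershoot.
    A stand-in for SPM's canonical HRF, not identical to it."""
    ts = [i * dt for i in range(int(length / dt))]
    return [t ** 5 * math.exp(-t) / math.gamma(6)
            - (1 / 6.0) * t ** 15 * math.exp(-t) / math.gamma(16)
            for t in ts]

def build_modulator(values, onsets_tr, n_trs, dt=1.0):
    """De-mean the parametric values, place them at their TRs,
    and convolve the resulting stick function with the HRF."""
    mean = sum(values) / len(values)
    stick = [0.0] * n_trs
    for v, tr in zip(values, onsets_tr):
        stick[tr] = v - mean
    h = simple_hrf(dt)
    out = [0.0] * n_trs
    for i, s in enumerate(stick):
        if s:
            for j, hv in enumerate(h):
                if i + j < n_trs:
                    out[i + j] += s * hv
    return out
```

The resulting vector can then be entered as its own regressor, bypassing spm_orth entirely.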

It is well to be aware of all this, not only for your own contrasts, but to see how this is done in other papers. I recall reading one a long time ago - I can't remember the name - where several different models were used, in particular different models where the modulators were entered in different orders. Not surprisingly, the results were vastly different, although why this was the case wasn't clearly explained. Just something to keep in mind, as both a reader and a reviewer.

I should also point out that this orthogonalization does not occur in AFNI; the order of modulators has no influence on how they are estimated. A couple of simulations should help illustrate this:

Design matrices from AFNI with simulated data (one regressor, two parametric modulators.) Left: PM1 entered first, PM2 second. Right: Order reversed. The values are the same in both windows, just transposed to illustrate which one was first in the model.

More information on the topic can be found here, and detailed instructions on how to construct your own modulators can be found on Tor Wager's lab website.

Wednesday, August 6, 2014

DTI Analysis: Soup to Nuts Playlist

Instead of going through each DTI analysis step individually, I've collated everything into a YouTube playlist down below. Just remember that we are using data from the FSL practical course here, and also remember that suing somebody for giving out bad advice, although it is admittedly an easy way to become fantastically wealthy, won't necessarily make you happier.

In any case, just to briefly go over the rest of the steps: After correcting for magnetic field distortions and eddy currents, tensors are fitted using the dtifit command (or simply going through the FDT menu in the FSL interface). Once this has been done for each subject, a series of TBSS tools are used, each one prefixed by "tbss"; for example, tbss_1_preproc, tbss_2_reg, and so on. (You can find all of these in the $FSLDIR/bin directory, and if you have a good handle on Unix programming, you can inspect the code yourself.) After you have run all of those for your dataset, you set up the appropriate experimental design and contrasts, and use the randomise tool to perform statistics in your tractography mask.
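If you prefer scripting the pipeline to clicking through it, the ordering of those steps can be sketched like this - a dry-run wrapper of my own; the flags shown are the common defaults from the FSL course material, so check the TBSS documentation before trusting them for your own data:

```python
import subprocess

def tbss_commands(fa_images, skeleton_threshold="0.2"):
    """The standard TBSS scripts, in the order they must be run."""
    return [
        ["tbss_1_preproc"] + list(fa_images),     # erode FA images and move them into FA/
        ["tbss_2_reg", "-T"],                     # -T: register to the FMRIB58_FA target
        ["tbss_3_postreg", "-S"],                 # -S: derive the mean FA skeleton
        ["tbss_4_prestats", skeleton_threshold],  # project FA onto the skeleton
    ]

def run_pipeline(commands, dry_run=True):
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:  # only execute when FSL is actually on your PATH
            subprocess.check_call(cmd)
```

After tbss_4_prestats, you would set up your design and contrasts and hand the skeletonized data to randomise, as described above.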

Keep in mind that this is just beginner's material; if you were to compare your DTI competence to dating, it would be like you were still in that awkward teenager phase, unable to talk to anybody or make eye contact.

However, much of this material has already been covered in other blogs and online resources, provided by several other highly talented scientists and researchers, and - as much as I constantly fantasize about establishing a monopoly over neuroimaging information - there is no practical way to destroy them all.

Therefore, instead of posting redundant information, I highly recommend checking out an ongoing step-by-step series on TORTOISE by Peter Molfese, which you can compare with your results from FSL, as well as another blog dedicated to diffusion imaging; the latter covers virtually every piece of diffusion imaging software and pipeline out there, and is a good place to start.

Monday, August 4, 2014

DTI Analysis, Steps 1 & 2: Distortion Correction and Eddy Correction

DTI data comes into this world just as we do - raw, unprocessed, unthresholded, and disgusting. Today's parents are much too easy on their children; back in my day, boys and girls were expected to correct their own erratic motion artifacts, to strip their own skulls when too much cranium started sprouting around their cortex, and reach significance through their own diligence and hard work. "No dinner for you until you've gotten rid of those filthy distortion currents on your hands, young man!" was a common saying around my household.

Unfortunately, these days we live in a society that increasingly avoids taking responsibility for its own actions, filled with people more and more unwilling to strip their own skulls. (Even though that would save neuroimaging researchers countless hours of preprocessing.) This trend is the result of several complex cultural and sociological factors, by which I mean: Lawyers. Things have gotten so bad that you could, to take a random example, potentially sue someone posting advice on an unlicensed website if following those instructions somehow messed up your data.

This actually happened to a friend of mine who publishes White Man's Secret, a nutrition blog tailored for Caucasian males. After one of his readers formed kidney stones as a result of following advice to drink a two-liter of Mello Yello every day, the author was sued for - and I quote from the court petition - "Twelve Bazillion Dollars."

In any case, DTI data takes some time to clean up before you can do any statistics on it. However, the preprocessing steps are relatively few, and are analogous to what is done in a typical FMRI preprocessing pipeline. The steps are to identify and correct any distortions during the acquisition, and then to correct for eddy currents.

The first step, distortion correction, can be done with FSL's topup command. Note that this will require two diffusion-weighted images, one encoded in the anterior-to-posterior (AP) direction and one encoded in the posterior-to-anterior (PA) direction. You will also need a text file containing information about the encoding directions and the total readout time (roughly the amount of time between the first echo and the last echo); for example, let's say we have a text file, acqparams.txt, containing the following rows:

0 -1 0 0.0665
0 1 0 0.0665

Here a -1 means AP, and 1 means PA. The last column is the readout time. If you have any questions about this, ask your scanner technician. "What if he's kind of creepy?" I don't care, girl; you gotta find out somehow.

While you're at it, also ask which image from the scan was the AP scan, and which was the PA scan. Once you have identified each, you should combine the two using fslmerge:

fslmerge -t AP_PA_scan AP_scan PA_scan 

Once you have that, you have all you need for topup:

topup --imain=AP_PA_scan --datain=acqparams.txt --out=topup_AP_PA_scan

This will produce an image, topup_AP_PA_scan, of where the distortions are located in your DTI scan. This will then be called by the eddy tool to undo any distortions at the same time that it does eddy correction.

The only other preparation you need for the eddy correction is an index file specifying which row of acqparams.txt applies to each volume (usually just a 1 for every volume), and a mask excluding any non-brain material, which you can create with bet:

bet dwi_image betted_dwi_image -m -f 0.2
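The index file itself is trivial to generate; here is a minimal sketch of my own (assuming, as is typical, that every volume uses the first row of acqparams.txt - adjust the entries if your acquisition was more exotic):

```python
def write_index_file(n_volumes, path="index.txt"):
    """One entry per diffusion volume, each pointing at a row of acqparams.txt
    (row 1 for every volume in the usual single-phase-encoding case)."""
    entries = ["1"] * n_volumes
    with open(path, "w") as f:
        f.write("\n".join(entries) + "\n")
    return entries
```

For a scan with 65 volumes, write_index_file(65) produces the file you then hand to eddy's --index argument.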

Then feed this into the eddy command (note that bvecs and bvals files should have been automatically generated if you imported your data using a tool such as dcm2niigui):

eddy --imain=dwi_image --mask=betted_dwi_image --index=index.txt --acqp=acqparams.txt --bvecs=bvecs --bvals=bvals --fwhm=0 --topup=topup_AP_PA_scan --flm=quadratic --out=eddy_unwarped_images

Note that the --fwhm and --flm arguments I chose here are the defaults that FSL recommends.

The differences before and after can be seen in fslview.

Of course, the distortion correction step isn't strictly necessary; and you can still do a much simpler version of eddy correction by simply selecting FDT Diffusion from the FSL menu and then selecting the Eddy Correction tab, which only requires input and output arguments. All I'm saying is that, if you're feeling particularly litigious, you should probably be safe and do both steps.

Friday, August 1, 2014

Blistering Fast Functional Connectivity Analysis with 3dTproject

Because speed is associated with freedom, happiness, and the American way - we want results, we want visible results, and we want them now, darn it! - it is no surprise that neuroimaging analysis is becoming exponentially faster and more efficient. Gone are the days when you could run a slice-timing step, go to bed, and wake up eight hours later when it finally finished. Most likely it crashed, meaning you had to run the entire thing all over again, but that didn't matter - the point was that you at least felt as though you were doing something during that time.

Regardless, we are approaching a radically different era now; and one of the harbingers of that era is AFNI's 3dTproject. Released a few months ago, this tool is now the default in both and when you do resting-state analyses. It's quicker, more efficient, and allows fewer chances to mess things up, which is a good thing.

To include 3dTproject in your analysis pipeline, simply select "rest" on the analysis initialization screen when running, or copy example #9 from the help documentation of Under no circumstances should you try running 3dTproject manually, since you, possessing the hand-eye coordination of a cheese burrito, will inevitably make a typo and screw something up. Nevertheless, if you insist on doing it yourself, I recommend using the following code that is automatically generated by the .py scripts:

3dTproject -polort 0 -input pb04.$subj.r*.blur+tlrc.HEAD -censor motion_${subj}_censor.1D -cenmode ZERO -ort X.nocensor.xmat.1D -prefix errts.${subj}.tproject

Needless to say, you will have to have already run your preprocessing steps and have run 3dDeconvolve with the -x1D_stop option to generate the necessary matrices.

Wednesday, July 30, 2014

Introduction to Diffusion Tensor Imaging: From Hospital Horror Story to Neuroimaging

It is well known that one of the deepest human fears is to be a patient in a hospital late at night, all alone, while a psychotic nurse injects you with a paralyzing agent, opens up your skull with a bone saw, and begins peeling away layers of your cortex while you are still awake.*

As horrifying as this nightmare scenario may be, it also lends an important insight into an increasingly popular neuroimaging method, diffusion tensor imaging (DTI; pronounced "diddy"). To be more gruesomely specific, the psychotic nurse is able to peel away strips of your brain because it has preferred tear directions - just like string cheese. These strips follow the general direction of fascicles, or bundles of nerves, comprising grey and white matter pathways; and of these pathways, it is white matter that tends to exhibit a curious phenomenon called anisotropy.

Imagine, for example, that I release a gas - such as, let's say, deadly neurotoxin - into a spherical compartment, such as a balloon. The gas, through a process called Brownian motion (not to be confused with the Dr. Will Brown frequently mentioned here) will begin to diffuse randomly in all directions; in other words, as though it is unconstrained.

However, release the same gas into a cylindrical or tube-shaped compartment, such as one of those cardboard tubes you used to fight with when you were a kid,** and the gas particles will tend to move along the direction of the tube. This is what is known as anisotropy, meaning that the direction of the diffusion tends to go in a particular direction, as opposed to isotropy, where diffusion occurs in all directions with equal probability.
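The difference is easy to simulate. Below is a toy random-walk sketch in Python - entirely my own illustration, nothing to do with any DTI software. Particles in a "tube" spread freely along its axis but bounce off the walls, so the endpoint variance ends up far larger along x than along y: anisotropy in miniature.

```python
import math
import random

def random_walk(n_steps, tube_halfwidth=None, seed=0):
    """2-D Gaussian random walk; if tube_halfwidth is set, y is reflected
    at the walls, mimicking diffusion inside an axon-like cylinder along x."""
    rng = random.Random(seed)
    x = y = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0, 1)
        y += rng.gauss(0, 1)
        if tube_halfwidth is not None:
            while abs(y) > tube_halfwidth:  # bounce off the tube wall
                y = math.copysign(2 * tube_halfwidth, y) - y
    return x, y

def endpoint_variances(n_walks=200, **kwargs):
    """Variance of the walk endpoints along each axis, over many walks."""
    xs, ys = zip(*(random_walk(500, seed=i, **kwargs) for i in range(n_walks)))
    var = lambda v: sum(t * t for t in v) / len(v)
    return var(xs), var(ys)
```

In the unconstrained case the two variances come out roughly equal (isotropy); inside the tube they differ by orders of magnitude, which is exactly the sort of directional difference the eigenvalues of the diffusion tensor quantify.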

[Figure: Left two panels: ellipsoids showing different amounts of anisotropy, with lambda 1, 2, and 3 symbolizing eigenvalues. Eigenvalues represent the amount of diffusion along a particular direction. Right: a sphere representing isotropy, where diffusion occurs with equal probability in all directions.]
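To make "amount of anisotropy" concrete: fractional anisotropy (FA) is computed from those three eigenvalues. A minimal Matlab sketch, using made-up but plausible white-matter eigenvalues:

```matlab
lambda = [1.7e-3 0.3e-3 0.2e-3]; %Example eigenvalues, in mm^2/s (made up)
MD = mean(lambda);               %Mean diffusivity
FA = sqrt(3/2) * norm(lambda - MD) / norm(lambda);
%FA approaches 1 for a long, thin ellipsoid (strong anisotropy)
%and 0 for a perfect sphere (isotropy)
```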

This diffusion can be measured in DTI scans, which in turn can be used to indirectly measure white matter integrity and structural connectivity between different areas of the brain. A common use of DTI is to compare different populations, such as young and old, and to observe where fractional anisotropy (FA) differs between groups, possibly with the assumption that less FA can be indicative of less efficient communication between cortical regions. There are other applications as well, but this is the one we will focus on for the remainder of the tutorials.

The data that we will be using can be found on the FSL course website, after scrolling down to Data Files and downloading File 2 (melodic and diffusion). I haven't been able to find any good online repositories for DTI data, so we'll be working with a relatively small sample of three subjects in one group, and three subjects in the other. Also note that while we will focus on FSL, there are many other tools that process DTI data, including some new commands in AFNI, and also a program called TORTOISE. As with most things I post about, these methods have already been covered in detail by others; and in particular I recommend a blog which covers both the AFNI and TORTOISE approaches to DTI, along with other tutorials that I thought I had been the first to cover, but which have actually already been discussed in detail. I encourage you to check it out - but also to come back, eventually.

*Or maybe that's just me.
**And maybe you still do!

Sunday, July 20, 2014

Quick and Efficient ROI Analysis Using spm_get_data

For any young cognitive neuroscientist out for the main chance, ROI analyses are an indispensable part of the trade, along with having a cool, marketable name, such as Moon Unit or Dweezil Zappa. Therefore, you will find yourself doing a lot of such ROI analyses; and the more quickly and efficiently you can do them, with a minimum of error, the more wildly you will succeed, and the better you will be able to embark upon an incredible, interesting career for the next four or five decades of your life before you die.

Most ROI analyses in SPM are carried out through the Marsbar toolbox, and for most researchers, that is all they will ever need. However, for those who feel more comfortable with the command line, there is a simple command within the SPM library - spm_get_data - that will make all of your ROI fantasies come true. All the command needs is an array of paths leading to the images you wish to extract data from, along with a matrix of coordinates representing the ROI.

First, the ROI coordinates can be gathered by loading up an arbitrary contrast and selecting an ROI created through, say, Marsbar or wfu_pickatlas. Next, set your corrected p-value threshold to 1; this will guarantee that every voxel in that ROI is "active," which will be recorded by a structure called xSPM that is automatically generated each time the Results GUI is opened. One of its fields, xSPM.XYZ, contains the coordinates for each voxel within that ROI. This can then be assigned to a variable, and the same procedure done for however many ROIs you have. The best part, however, is that you only need to do this once for each ROI; once you have assigned it to a variable, you can simply store it in a new .mat file with the "save" command (e.g., save('myROIs.mat', 'M1', 'ACC', 'V1')). These can then be restored to the Matlab workspace at any time by loading that .mat file.

Note: An alternative way to get these coordinates just from the command line would be the following:

Y = spm_read_vols(spm_vol(seedroi),1);
indx = find(Y>0);
[x,y,z] = ind2sub(size(Y),indx);

XYZ = [x y z]';

Where the variable "seedroi" can be looped over a cell containing the paths to each of your ROIs.
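For example, a sketch of that loop (the ROI paths here are hypothetical):

```matlab
roiList = {'/path/to/left_M1.nii', '/path/to/right_M1.nii'}; %Hypothetical paths
ROIcoords = cell(1, length(roiList));
for r = 1:length(roiList)
    seedroi = roiList{r};
    Y = spm_read_vols(spm_vol(seedroi),1); %Load the ROI mask
    indx = find(Y>0);                      %Find nonzero (in-mask) voxels
    [x,y,z] = ind2sub(size(Y),indx);       %Convert to subscript coordinates
    ROIcoords{r} = [x y z]';               %One coordinate matrix per ROI
end
```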

The next step is to create your array of images you wish to extract data from - which, conveniently, is stored within the SPM.mat file that is created any time you run a second-level analysis. For example, let's say that I carried out a couple of 2nd-level t-tests, one for left button presses, the other for right button presses. If I go into the folder for left button presses where I have already estimated a second-level analysis, all of the contrast images that went into that analysis are stored in SPM.xY.P, which is available in your workspace after simply navigating to the directory containing your SPM.mat file and typing "load SPM".

Lastly, spm_get_data is called to do the ROI analysis by extracting data from each voxel in the ROI for each subject for each contrast, and these are averaged across all of the voxels in that ROI using the "mean" function. Sticking with the current example, let's say I have a left M1 region and a right M1 region, the coordinates of which have been extracted using the procedure detailed above and saved into variables called left_M1 and right_M1, respectively. I then navigate to the left button presses second-level directory, load SPM, and type the following command:

right_M1_leftButtonPress = mean(spm_get_data(SPM.xY.P, right_M1),2)

which returns an array of one value per subject for that contrast averaged across the ROI. You can then easily navigate to another second-level directory - say, right button presses - and, after loading the SPM.mat file, do the same thing:

right_M1_rightButtonPress = mean(spm_get_data(SPM.xY.P, right_M1),2)

T-tests can then be carried out between the sets of parameter or contrast estimates with Matlab's ttest function:

[h, p, ci, stats] = ttest(right_M1_leftButtonPress, right_M1_rightButtonPress)

which will return the p-value, confidence interval, and t-value that you would then report in your results.

Thursday, July 17, 2014

Deep Thoughts

During my recent vacation I had the misfortune of witnessing two sublime musical events: One was seeing Natasha Paremski play Rachmaninoff's Third Piano Concerto (immediately dubbed "Rock's Turd" by quick-thinking music critics), and the other was attending a live performance of Mozart's Requiem in D Minor. Both were profound, noble, beautiful beyond compare; but hearing such masterworks also compelled me to evaluate my own life, an exercise I detest. Instead of being able to muse upon my own greatness and simply say to myself whatever words came into my mind - activities which I enjoy heartily - I was instead stunned into silence, and slowly, against my will, forced into the odious task of self-reflection. What have I done so far with my life, I thought, and what will I do with the usual biblical threescore and ten? Mozart and Mendelssohn were composing deathless music while still in their teens; Goethe began incubating the idea for Faust while only a boy; Liszt wrote the first version of his finger-breaking études when he was only fifteen years old. What was I doing around then? Obsessively working on my Starcraft win/loss ratio, and worrying whether my acne and alarming profusion of nasal hair were diagnostic of a thyroid disorder.

The way to happiness, as I am continually reminded, is to actively avoid any contact with anyone greater than you are; and if any acquaintance with a masterwork must be made, if only to be able to talk about it with an air of superficial sophistication, better to read a summary or criticism of it and find out what others have said about it, rather than experience the real thing. Or, if you dare, come into contact with it, but do not linger over it, or let it become part of your internal world, lest you realize how depressingly unfurnished and impoverished it really is. Read Madame Bovary quickly, for example, without pausing to observe how much craftsmanship and precision goes into every sentence, and how carefully weighed is each word; listen to the Jupiter Symphony only to become familiar with the musical themes, without thinking for a moment what kind of miracle this music is, and how no one will ever again write quintuple invertible counterpoint with such elegance and ease.

But should you ever chance to happen upon someone who makes you realize that this person thinks better than you, feels deeper than you, and has a vision of life rising above the paltry, the partial, and the self-defensive - horrors! - merely remind yourself that you would have been the same way, if only you had the advantages this person had while they were growing up, and only if you possessed the same temperament as they do - which, when you think about it, is really just a random occurrence. In this way you can begin to feel a bit better about having contributed so humiliatingly little; in this way, everyone can fancy themselves an embarrassed genius.


In any case, I've gotten a couple of comments and have made a couple of small changes to the SPM automated first-level script, such as catching whether certain files exist or not, and also dealing with runs that do not contain a particular regressor, which may be present in other runs. According to my empirical tests, it should be able to handle anything you throw at it. Maybe.

Wednesday, June 25, 2014

Escape to Los Angeles

Because you all need to know when and where I go on vacation, I will be traveling to Los Angeles, California, for the next week; followed by a trip to Minnesota. Those of you still in Bloomington: Please don't break into my apartment. But if you do, could you at least water my plants? I forgot about that when I left town this afternoon. Thanks!

However, just because I am on the other side of the country, hiking and swimming and laughing and playing the Schubert F Minor Fantasie and Grieg and Brahms cello sonatas, doesn't mean that I'll quit updating. Do you think I would ever quit that? Did we quit when the Germans bombed Pearl Harbor? Hell no. So remember to check back regularly, because I will continue to post on topics such as GLMs, MVPA, FIFY, and IMHO. Maybe. It really depends on how much fun I'm having.

Monday, June 23, 2014

Multiple Regression in SPM

A couple of videos have been posted about multiple regression in SPM, both at the first level and second level. Technically, almost all of the GLMs researchers usually set up use multiple linear regression, where a linear combination of weighted regressors is fit to the timecourse of each voxel. However, in SPM parlance, regressors are distinct from conditions: Conditions are typically convolved with some sort of basis function (usually a hemodynamic response function), whereas regressors are not convolved with anything. For example, someone may enter motion parameters as regressors, since they are not convolved with any hemodynamic shape, but still may account for some of the variance observed in the signal (and, we hope, the signal that is primarily due to motion). Another example would be entering in the timecourse extracted from an ROI to do a functional connectivity analysis, or inserting the interaction term from a PPI analysis as a regressor.

At the second level SPM also gives the option for doing multiple regression; but this is simply the entering of covariates which can be tested for correlations with the signal change in contrasts across subjects. In most cases, however, you can select any other design and enter covariates regardless of which design you choose; I believe this to be the better option, since it is more flexible.

In any case, most of the rest of the details can be found in the following videos. Speak hands for me.

Thursday, June 19, 2014

Updated SPM Command Line, Including Beta Series Option

I've updated the code of the previously discussed previous script in the previous post, previously, including the addition of more help text to explain certain options and termination of the script if the timing file is not found.

However, the biggest change is a block of code to convert the timings files so that each trial is now a regressor, before entering them into the GLM. With so many regressors, the GLM will look a little funky; possibly something like this:

Note that, since there are so many regressors and so many scans, not all of them are going to be labeled individually; but it should look as though there is only one estimation per column, or blip of white underneath each regressor. Also, due to space constraints the parameter estimability may not be completely visible; but when each trial is being estimated individually, you should still get a beta for each trial. Check the output directory to make sure that you have the number of betas you think you should have: one for each trial, plus the number of session constants (usually equal to the number of runs for the subject).
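A quick sanity check along those lines might look like this (run from the output directory; the trial and run counts are hypothetical, and older SPM versions write .img betas while newer ones write .nii):

```matlab
betas = dir('beta_*.img');   %Use 'beta_*.nii' for newer SPM versions
numTrials = 90;              %Hypothetical: total trials across all runs
numRuns = 3;                 %Hypothetical: one session constant per run
fprintf('Found %d betas; expected %d\n', length(betas), numTrials + numRuns);
```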

The updated script can be found here; as always, send me any comments if you encounter any serious issues or bugs. I recommend using it as a replacement for the previously discussed previous script, since it can either estimate each trial individually or average them together, depending on whether you set the DO_BETASERIES flag to 1 or 0.

All of this is explained clearly in the following video:

Monday, June 16, 2014

Running SPM First-Level Analyses from the Command Line

For most people, doing first-level analyses in SPM is tedious in the extreme; for other people, these analyses are manageable through command line scripting in Matlab. And there are still other people who don't even think about first-level analyses or SPM or Matlab or any of that. This last class of people are your college classmates who majored in something else, like economics, and are living significantly less stressful lives than you, making scads of money, and regularly eating out at restaurants where they serve shrimp the size of Mike Tyson's forearm.

Assuming that you belong to the first category, however, first-level analyses are invariably a crashing bore; and, worse, completely impractical for something like beta-series analysis, where a separate regressor is needed for individual events.

The following script will do all of that for you, although it makes many assumptions about your timing files and where everything is stored. First, your directory needs to be set up with a root directory containing all of the subjects, as well as a separate timing directory containing your onsets files; and second, the timing files must be written in a four-column format of run, regressor, onset (in seconds), and duration. Once you have all of that, though, all it takes is a few simple modifications of the script to run your analysis.

As always, let me know if this actually helps, or whether there are sections of code that can be cleaned up. I will be making a modification tomorrow allowing for beta series analysis by converting all of the timing events into individual regressors, which should help things out.

Link to script:

%  rootDir        - Path to where subjects are located
%  subjects       - Vector containing list of subjects (can concatenate with brackets)
%  spmDir         - Path to folder where you want to move the multiple conditions .mat files and where to write the output SPM file (directory is created if it doesn't exist)
%  timingDir      - Path to timing directory, relative to rootDir
%  timingSuffix   - Suffix appended to timing files
%  matPrefix      - Will be prefixed to output .mat files.
%  dataDir        - Directory storing functional runs.
%  This script creates one multiple condition .mat file per run. It also
%  specifies the GLM and runs beta estimation.
%       Note that this script assumes that your timing files are organized
%       with columns in the following order: Run, ConditionName, Onset (in
%       seconds), Duration (in seconds)
%       Ideally, this timing file should be written out from your
%       presentation software, e.g., E-Prime
%       Also note how the directory tree is structured in the following example. You may have to
%       change this script to reflect your directory structure.
%       Assume that:
%         rootdir = '/server/studyDirectory/'
%         spmDir = '/modelOutput/'
%         timingDir = '/timings/onsets/'
%         subjects = [101]
%         timingSuffix = '_timing.txt'
%       1) Sample path to SPM directory:
%       '/server/studyDirectory/101/modelOutput/'
%       2) Sample path to timing file in the timing directory:
%       '/server/studyDirectory/timings/onsets/101_timing.txt'
% Andrew Jahn, Indiana University, June 2014

clear all

%%%----Things you will want to change for your study----%%%

rootDir = '/data/hammer/space5/ImagineMove/fmri/';
subjects = [214 215];
spmDir = '/RESULTS/testMultis/';
timingDir = 'fMRI_behav/matlab_input/';
timingSuffix = '_spm_mixed.txt';
matPrefix = 'testMulti';
dataDir = '/RESULTS/data/';

%Change these parameters to reflect study-specific information about number
%of scans and discarded acquisitions (disacqs) per run
funcs = {'swuadf0006.nii', 'swuadf0007.nii', 'swuadf0008.nii'}; %You can add more functional runs; just remember to separate them with a comma
numScans = [240 240 240]; %This is the number of scans that you acquire per run
disacqs = 2; %This is the number of scans you later discard during preprocessing
numScans = numScans-disacqs;
TR = 2; %Repetition time, in seconds

%Estimating the GLM can take some time, particularly if you have a lot of betas. If you just want to specify your
%design matrix so that you can assess it for singularities, turn this to 0.
%If you wish to do it later, estimating the GLM through the GUI is
%straightforward.
ESTIMATE_GLM = 1;


%For each subject, create timing files and jobs structure
for subject = subjects

    %See whether output directory exists; if it doesn't, create it
    outputDir = [rootDir num2str(subject) spmDir];
    if ~exist(outputDir, 'dir')
        mkdir(outputDir)
    end

    %---Navigate to timing directory and create .mat files for each run---%
    cd([rootDir timingDir]);
    fid = fopen([num2str(subject) timingSuffix], 'rt');
    T = textscan(fid, '%f %s %f %f', 'HeaderLines', 1); %Columns should be 1)Run, 2)Regressor Name, 3)Onset Time (in seconds, relative to start of each run), and 4)Duration, in seconds
    fclose(fid);
    clear runs nameList names onsets durations sizeOnsets %Remove these variables if left over from previous analysis
    runs = unique(T{1});

    %Begin creating jobs structure
    jobs{1}.stats{1}.fmri_spec.dir = cellstr(outputDir);
    jobs{1}.stats{1}.fmri_spec.timing.units = 'secs';
    jobs{1}.stats{1}.fmri_spec.timing.RT = TR;
    jobs{1}.stats{1}.fmri_spec.timing.fmri_t = 16;
    jobs{1}.stats{1}.fmri_spec.timing.fmri_t0 = 1;

    %Create multiple conditions .mat file for each run
    for runIdx = 1:size(runs, 1)
        nameList = unique(T{2});
        names = nameList';
        onsets = cell(1, size(nameList,1));
        durations = cell(1, size(nameList,1));
        sizeOnsets = size(T{3}, 1);
        for nameIdx = 1:size(nameList,1)
            for idx = 1:sizeOnsets
                if isequal(T{2}{idx}, nameList{nameIdx}) && T{1}(idx) == runIdx
                    onsets{nameIdx} = double([onsets{nameIdx} T{3}(idx)]);
                    durations{nameIdx} = double([durations{nameIdx} T{4}(idx)]);
                end
            end
            onsets{nameIdx} = (onsets{nameIdx} - (TR*disacqs)); %Adjust timing for discarded acquisitions
        end

        save([matPrefix '_' num2str(subject) '_' num2str(runIdx)], 'names', 'onsets', 'durations')

        %Grab frames for each run using spm_select, and fill in session
        %information within jobs structure
        files = spm_select('ExtFPList', [rootDir num2str(subject) dataDir], ['^' funcs{runIdx}], 1:numScans(runIdx));

        jobs{1}.stats{1}.fmri_spec.sess(runIdx).scans = cellstr(files);
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).cond = struct('name', {}, 'onset', {}, 'duration', {}, 'tmod', {}, 'pmod', {});
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).multi = cellstr([outputDir matPrefix '_' num2str(subject) '_' num2str(runIdx) '.mat']);
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).regress = struct('name', {}, 'val', {});
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).multi_reg = {''};
        jobs{1}.stats{1}.fmri_spec.sess(runIdx).hpf = 128;
    end

    movefile([matPrefix '_' num2str(subject) '_*.mat'], outputDir)

    %Fill in the rest of the jobs fields
    jobs{1}.stats{1}.fmri_spec.fact = struct('name', {}, 'levels', {});
    jobs{1}.stats{1}.fmri_spec.bases.hrf = struct('derivs', [0 0]);
    jobs{1}.stats{1}.fmri_spec.volt = 1;
    jobs{1}.stats{1}.fmri_spec.global = 'None';
    jobs{1}.stats{1}.fmri_spec.mask = {''};
    jobs{1}.stats{1}.fmri_spec.cvi = 'AR(1)';

    %Navigate to output directory, specify and estimate GLM
    cd(outputDir);
    spm_jobman('run', jobs)

    if ESTIMATE_GLM == 1
        load SPM;
        spm_spm(SPM); %Estimate the GLM
    end

end

%Uncomment the following line if you want to debug
%keyboard

Sunday, June 15, 2014

Writing Out Timing Files from E-Prime

Now, I promised you all something that would help make running SPM analyses easier; for although we would all like something that is both transparent and usable, in most FMRI packages the two are polar opposites of each other. SPM in many ways is very usable; but this tends to be more hindrance than asset when running large groups of subjects, when manually selecting each session and copying and pasting by hand become not only tedious, but dangerous, as the probability of human error is compounded with each step. Best to let the machines do the work for you when you can. Anything else is an affront to them; you'll wake more than just the dogs.

To assist our future overlords in their work, command lines for each step should be used whenever possible. But, rummily enough, SPM offers little documentation about this, and it more or less needs to be puzzled out on your own. For most processing this wouldn't be too much of a hassle, but there is one step where it is indispensable, more indispensable, if possible, than Nutella on a banana sprinkled with chopped walnuts. Or possibly warm Nutella drizzled on top of vanilla ice cream and left standing until it cools and hardens into a shell. Has anyone tried this? Could you let me know how it tastes? Thanks!

Where was I? Right; the one step where you need batching is first-level analysis, where timing information is entered for each session (or run) for each subject. Back in the autumn of aught eight during my days as a lab technician I used to be just like you, arriving to work in the morning and easing my falstaffian bulk into my chair before starting my mind-numbing task of copying and pasting onset times into each condition for each run, invariably tiring of it after a few minutes but keeping at it, and maybe completing a few subjects by the end of the day. Not the most exciting or fulfilling work, and I probably screwed a lot of things up without noticing it.

SPM has tried to anticipate problems like this by specifying a Multiple Condition option through their interface, where you supply a .mat file for each session that contains all of the names, onsets, and durations for that session, and everything is automatically filled in for you. Which sounds great, until you realize that creating Matlab files is, like, hard, and then you go back to your old routine. (The manual also says that these .mat files can be generated automatically by presentation software such as COGENT, but I have never met anyone who has ever used COGENT; this is, in all likelihood, a fatcat move by Big Science to get you to buy a piece of software that you don't need.)

What we will be doing, then, is trying to make this as straightforward and simple of a process as possible. However, this assumes that you have your onsets organized in a certain way; and first we will talk about how to create those onsets from your stimulus presentation program. This will allow you much more flexibility, as you can choose what you want to write into a simple text file without having to go through copying and pasting data from the E-Prime DataAid program into Excel for further processing.  (I use E-Prime, and will be reviewing that here, but I believe that most presentation programs will allow you to write out data on the fly.)

Within E-Prime, you will need to know exactly when the scanner started, and use this as time zero for your onsets files. I usually do this by having a screen that says "Waiting for scanner..." which only terminates once the scanner sends a trigger pulse through a USB cord hooked up to the presentation laptop. We have a Siemens Trio scanner, which sends a backtick ( ` ); check whether your scanner does the same.

Note that I have an inline script (i.e., an object that contains E-Basic code) right after the WaitScanner slide that contains the following code:

StartTimestamp = CLng(c.GetAttrib("WaitScanner.RTTime"))
c.SetAttrib "StartTimestamp", StartTimestamp

if c.GetAttrib("Session") = 1 Then

Open "OnsetTimes_" & c.GetAttrib("Subject") & ".txt" For Output as #1
Close #1

Open "OnsetTimes_" & c.GetAttrib("Subject") & ".txt" For Append As #1
      Print #1, "Run", "Event", "Onset", "Dur"
Close #1

end if

Also, in the Duration/Input tab for the properties of WaitScanner, I have the keyboard accepting the backtick ( ` ) as an input, with termination when it receives that backtick. The StartTimestamp will simply record when the scanner first started, and the if/then statement ensures that the file does not get overwritten when the next run is started.

After that, the StartTimestamp variable and the newly created text file can be used to record information after each event, with the specified condition being determined by you. To calculate the timing for each event, all you have to do is grab the onset time of a condition through the c.GetAttrib command and subtract the StartTimestamp variable from it:
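As a sketch, an inline script after each trial might look something like the following (the slide name "Probe" and its attributes are hypothetical, as is the fixed duration of 2 seconds; substitute your own objects and values):

```
Dim ProbeOnset As Long
'Both OnsetTime and StartTimestamp are in milliseconds from E-Prime startup,
'so their difference is the onset relative to the scanner trigger
ProbeOnset = CLng(c.GetAttrib("Probe.OnsetTime")) - CLng(c.GetAttrib("StartTimestamp"))

Open "OnsetTimes_" & c.GetAttrib("Subject") & ".txt" For Append As #1
    'Columns: Run, Event, Onset (converted to seconds), Duration
    Print #1, c.GetAttrib("Session"), "Probe", ProbeOnset / 1000, 2
Close #1
```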

Additional runs will continue to append to the text file, until you have all of the timing information that you need for your subject. This can then be parsed by a script and divided into multiple condition .mat files, which we will discuss tomorrow.

Thursday, June 12, 2014

Beta Series Analysis in SPM

The following is a script to run beta series analysis in SPM; which, conceptually, is identical to beta series analysis in AFNI, just made more complicated by the use of Matlab commands, many of which I would otherwise remain totally ignorant of.

Also, if you believe that I did this on my own, you are a scoundrel and a fool. Virtually all of the code is from my colleague Derek Nee, a postdoc who used to work in my lab and who remains my superior in all things, scientific and otherwise. He also once got me with the old "If you say gullible slowly, it sounds like orange" joke.

In any case, all that I did was repackage it into a function to be more accessible by people outside of our lab. You may see a couple snippets of code that are useful outside of just doing beta series analysis; for example, the spm_get_data command, which can be used for parameter extraction from the command line without using marsbar. Also note that the script assumes that each trial for a condition has been labeled separately with the regressor name followed by a "_" and then the occurrence of the trial. So, for example, if I were looking at 29 instances of pressing the left button, and the regressor name was LeftButtonPress, I would have LeftButtonPress_1, LeftButtonPress_2, LeftButtonPress_3...LeftButtonPress_29.

Obviously, this is very tedious to do by hand. The Multiple Conditions option through the SPM GUI for setting up 1st level analyses is indispensable; but to cover that is another tutorial, one that will probably cross over into our discussion of GLMs.

The script:

function DoBetaSeries(rootdir, subjects, spmdir, seedroi, Conds, MapNames)
% function DoBetaSeries(rootdir, subjects, spmdir, seedroi, Conds, MapNames)
%  rootdir  - Path to where subjects are stored
%  subjects - List of subjects (can concatenate with brackets)
%  spmdir   - Path to folder containing SPM file
%  seedroi  - Absolute path to file containing ROI in NIFTI format
%  Conds    - List of conditions for beta series analysis
%  MapNames - Output for list of maps, one per Conds cell
%   Example use:
%       DoBetaSeries('/data/study1/fmri/', [101 102 103],
%       'model/BetaSeriesDir/', '/data/study1/Masks/RightM1.nii',
%       {'TapLeft'}, {'TapLeft_RightM1'})
%       conditions to be averaged together should be placed together in a cell
%       separate correlation maps will be made for each cell
%       Conds = {'Incongruent' 'Congruent' {'Error' 'Late'}}; 
%       For MapNames, there should be one per Conds cell above
%       e.g., with the Conds above, MapNames = {'Incongruent_Map',
%       'Congruent_Map', 'Error_Late_Map'}
%       Once you have z-maps, these can be entered into a 2nd-level
%       1-sample t-test, or contrasts can be taken between z-maps and these
%       contrasts can be taken to a 1-sample t-test as well.

if nargin < 6
    disp('Need rootdir, subjects, spmdir, seedroi, Conds, MapNames. See "help DoBetaSeries" for more information.')
    return
end

%Find XYZ coordinates of ROI
Y = spm_read_vols(spm_vol(seedroi),1);
indx = find(Y>0);
[x,y,z] = ind2sub(size(Y),indx);
XYZ = [x y z]';

%Find each occurrence of a trial for a given condition
%These will be stacked together in the Betas array
for i = 1:length(subjects)
    subj = num2str(subjects(i));
    disp(['Loading SPM for subject ' subj]);
    %Can change the following line of code to CD to the directory
    %containing your SPM file, if your directory structure is different
    cd([rootdir subj filesep spmdir]);
    load SPM;
    for cond = 1:length(Conds)
        Betas = [];
        currCond = Conds{cond};
        if ~iscell(currCond)
            currCond = {currCond};
        end
        for j = 1:length(SPM.Vbeta)
            for k = 1:length(currCond)
                if ~isempty(strfind(SPM.Vbeta(j).descrip,[currCond{k} '_']))
                    Betas = strvcat(Betas,SPM.Vbeta(j).fname);
                end
            end
        end

    %Extract beta series time course from ROI
    %This will be correlated with every other voxel in the brain
    if ischar(Betas)
        P = spm_vol(Betas);
    end

    est = spm_get_data(P,XYZ);
    est = nanmean(est,2);

        %----Do beta series correlation between ROI and rest of brain---%

            MapName = MapNames{cond};
            disp(['Performing beta series correlation for ' MapName]);
            Vin = spm_vol(Betas);
            nimgo = size(Vin,1);
            nslices = Vin(1).dim(3);

            % create new header files
            Vout_r = Vin(1);   
            Vout_p = Vin(1);
            [pth,nm,xt] = fileparts(deblank(Vin(1).fname));
            Vout_r.fname = fullfile(pth,[MapNames{cond} '_r.img']);
            Vout_p.fname = fullfile(pth,[MapNames{cond} '_p.img']);

            Vout_r.descrip = ['correlation map'];
            Vout_p.descrip = ['p-value map'];

            Vout_r.dt(1) = 16;
            Vout_p.dt(1) = 16;

            Vout_r = spm_create_vol(Vout_r);
            Vout_p = spm_create_vol(Vout_p);

            % Set up large matrix for holding image info
            % Organization is time by voxels
            slices = zeros([Vin(1).dim(1:2) nimgo]);
            stack = zeros([nimgo Vin(1).dim(1)]);
            out_r = zeros(Vin(1).dim);
            out_p = zeros(Vin(1).dim);

            for i = 1:nslices
                B = spm_matrix([0 0 i]);
                %creates plane x time
                for j = 1:nimgo
                    slices(:,:,j) = spm_slice_vol(Vin(j),B,Vin(1).dim(1:2),1);
                end

                for j = 1:Vin(1).dim(2)
                    stack = reshape(slices(:,j,:),[Vin(1).dim(1) nimgo])';
                    [r p] = corr(stack,est);
                    out_r(:,j,i) = r;
                    out_p(:,j,i) = p;
                end

                Vout_r = spm_write_plane(Vout_r,out_r(:,:,i),i);
                Vout_p = spm_write_plane(Vout_p,out_p(:,:,i),i);
            end


            %Convert correlation maps to z-scores
            %NOTE: If uneven number of trials in conditions you are
            %comparing, leave out the "./(1/sqrt(n-3)" term in the zdata
            %variable assignment, as this can bias towards conditions which
            %have more trials
                disp(['Converting correlation maps for subject ' subj ', condition ' MapNames{cond} ' to z-scores'])
                P = [MapNames{cond} '_r.img'];
                n = size(Betas,1);
                Q = MapNames{cond};
                Vin = spm_vol([MapNames{cond} '_r.img']);

                % create new header files
                Vout = Vin;   

                [pth,nm,xt] = fileparts(deblank(Vin(1).fname));
                Vout.fname = fullfile(pth,[Q '_z.img']);

                Vout.descrip = ['z map'];

                Vout = spm_create_vol(Vout);

                data = spm_read_vols(Vin);
                zdata = atanh(data)./(1/sqrt(n-3));



This can either be copied and pasted into a script, or downloaded as a .m file here.
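The final step of the script is just the Fisher r-to-z transform, z = atanh(r), scaled by sqrt(n-3) so the values are approximately standard normal. Here is a minimal Python sketch of the same arithmetic; the fisher_z name and the example values are mine, not part of the script:

```python
import math

def fisher_z(r, n=None):
    """Fisher r-to-z transform.

    z = atanh(r); when n (the number of trials in the beta series) is
    given, the result is divided by the standard error 1/sqrt(n - 3),
    matching the "./(1/sqrt(n-3))" term in the MATLAB script.
    """
    z = math.atanh(r)
    return z * math.sqrt(n - 3) if n is not None else z

print(round(fisher_z(0.5), 4))        # atanh(0.5) is roughly 0.5493
print(round(fisher_z(0.5, n=30), 4))  # the same value scaled by sqrt(27)
```

As the NOTE in the script warns, drop the sqrt(n-3) scaling when comparing conditions with different numbers of trials, since n would then differ between the maps.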

Tuesday, June 10, 2014

ICA Analysis Playlist: Soup to Nuts

Fellow brainbloggers,

I will be providing more details on each of the ICA steps, but for those who need to do it right now, there is a playlist up on YouTube, a survival kit of sorts, which will guide you through an example analysis from start to finish and point out dangers along the way. As I was doing this over the weekend, I encountered a couple of particularly pernicious pitfalls, precariously poised on the precipice of total annihilation:

1. When doing dual_regression, make sure that your input list points to your functional data that has not only been preprocessed, but also normalized to standard space. Often FSL automatically creates a dataset called filtered_func_data.nii.gz in both the main directory and the normalized directory; choose the normalized one.

2. You can do multi-session MELODIC through the GUI, but only if each subject has the same number of TRs. If there is a discrepancy, MELODIC will exit as soon as it encounters a subject with a different number of TRs. In that case, you need to analyze each subject individually (or batch together subjects who have the same number of TRs), and then run melodic from the command line, using a command such as:

melodic -i ICA_List.txt -a concat -o ICA_Output --nobet -d 30 --mmthresh 0.5 --tr 2.5

Where ICA_List.txt contains a path to the preprocessed, normalized data for each subject on each row.

3. I previously mentioned that you should not insult our future robot overlords, and should leave the dimensionality estimation to FSL. However, automatic estimation can produce a very large number of components, and it is often a good idea to set the dimensionality manually to around 30, give or take a few. Roughly 30 components usually hits the sweet spot between overfitting (splitting one source across many components) and underfitting (lumping several sources into one).
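For point 2, ICA_List.txt is easy to generate with a few lines of Python rather than by hand. A sketch, assuming a purely hypothetical layout of one .ica directory per subject with the normalized data under reg_standard; adjust the glob pattern to wherever your own normalized filtered_func_data files actually live:

```python
import glob
import os

# Hypothetical directory layout -- edit this pattern for your own study
pattern = os.path.join('subjects', 'sub*.ica', 'reg_standard',
                       'filtered_func_data.nii.gz')

paths = sorted(glob.glob(pattern))  # sorted so subjects stay in order
with open('ICA_List.txt', 'w') as f:
    for p in paths:
        f.write(p + '\n')           # one normalized dataset per row

print('Wrote %d paths to ICA_List.txt' % len(paths))
```

The resulting file can then be passed straight to melodic with -i ICA_List.txt.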

Those are my observations from the front. As always, take these tutorials with a large calculus of salt.

Friday, June 6, 2014

Independent Components Analysis, Part II: Using FSL Example Data

For our next step, we will be working with FSL example data - somewhat artificial data, true, and much better quality than anything you can ever expect from the likes of your data, which will only lead to toil, sweat, and the garret. Sufficient unto the day is the frustration thereof.

However, it is a necessary step to see what ICA analysis ought to look like, as well as learning how to examine and identify components related to tasks and networks that you are interested in. Also - and possibly more important - you will learn to recognize sources of noise, such as components related to head motion and physiological artifacts.

First, go to the link and scroll to the bottom for instructions on how to download the datasets. You can use either wget or curl. For this demonstration, we will be using datasets 2 and 6:

curl -# -O -C -
curl -# -O -C -

Once you have downloaded them, unzip them using gunzip, and then run "tar -xf" on the resulting tar files. Each archive unpacks to a folder called fsl_course_data, so rename the first one (e.g., to fsl_course_data2) before extracting the second, so that they do not overwrite each other. Within fsl_course_data2, navigate to the /melodic/av directory, where you will find a small functional dataset that was acquired while the participant was exposed to auditory and visual stimuli - which sounds much more scientific than saying, "The participant saw stuff and heard stuff."

Open up the MELODIC gui either through the FSL gui, or through typing Melodic_gui from the command line. Most of the preprocessing steps can be kept as is. However, keep the following points in mind:

1. Double-check the TR. FSL will fill it in automatically from the header of the NIFTI file, but it isn't always reliable.

2. Spatial smoothing isn't required in ICA, but a small amount can help produce better-looking and more identifiable component maps. Somewhere on the order of the size of a voxel or two will usually suffice.

3. By default, MELODIC automatically estimates the number of components for you. However, if you have severe delusions and believe that you know how many components should be generated, you can turn off the "Automatic dimensionality estimation" option in the Stats tab, and enter the number of components you want.

4. The Threshold IC maps option is not the same thing as a p-value correction threshold. It applies to the mixture model that MELODIC fits to each component map: as I understand it from the documentation and from papers using ICA, a higher threshold keeps only those voxels with a higher probability of belonging to the true signal distribution rather than the background noise distribution, so the choice comes down to a balance between false positives and false negatives. I don't have firm guidelines about what threshold to use, but I've seen cutoffs in the 0.8-0.9 range in papers.

5. I don't consider myself a snob, but I was using the bathroom at a friend's house recently, and I realized how uncomfortable that cheap, non-quilted toilet paper can be. It's like performing intimate hygiene with roofing materials.

6. Once you have your components, you can load them into FSLView and scroll through them with the "Volumes" button in the lower left corner. You can also load an atlas from the Tools menu and double-click on it to get a semi-transparent overlay showing where different cortical regions are. This can be useful when trying to determine whether certain components fall within the network areas you would expect them to.
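The TR check in point 1 can even be scripted: in a NIfTI-1 header the voxel dimensions live in pixdim, 8 floats starting at byte offset 76, and pixdim[4] holds the TR (assuming it is stored in seconds, as FSL writes it). Below is a small Python sketch that reads it directly; the read_tr name and the synthetic demo file are mine, and a real call would point at your own filtered_func_data:

```python
import gzip
import struct

def read_tr(nifti_path):
    """Read the repetition time (pixdim[4]) from a NIfTI-1 header.

    Works for single-file .nii or .nii.gz images. The TR field is only
    as reliable as whatever program wrote the file, which is exactly
    why it is worth checking against what the MELODIC GUI fills in.
    """
    opener = gzip.open if nifti_path.endswith('.gz') else open
    with opener(nifti_path, 'rb') as f:
        header = f.read(348)           # NIfTI-1 header is 348 bytes
    # sizeof_hdr (first int) reveals the byte order: 348 when correct
    endian = '<' if struct.unpack('<i', header[:4])[0] == 348 else '>'
    # pixdim is 8 floats at byte offset 76; pixdim[4] is the TR
    pixdim = struct.unpack(endian + '8f', header[76:76 + 32])
    return pixdim[4]

# Demo with a minimal synthetic header (a real call would be, e.g.,
# read_tr('filtered_func_data.nii.gz')):
hdr = bytearray(348)
hdr[0:4] = struct.pack('<i', 348)                           # sizeof_hdr
hdr[76:108] = struct.pack('<8f', 1, 2, 2, 2, 2.5, 0, 0, 0)  # TR = 2.5
with open('fake.nii', 'wb') as f:
    f.write(bytes(hdr))
print(read_tr('fake.nii'))   # 2.5
```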

More details in the videos below, separately for the visual-auditory and resting-state datasets.