Sunday, September 2, 2012

Optseq and Event-Related Designs

One question I frequently hear - besides "Why are you staring at me like that?" - is, "Should I optimize my experimental design?" This question is foolish. As well ask man what he thinks of Nutella. Optimal experimental designs are essential to consider before creating an experiment, in order to ensure that there is a balance between the number of trials, the duration of the experiment, and the estimation efficiency. Optseq is one such tool which allows the experimenter to maximize all three.

Optseq, a tool developed by the good people over at MGH, helps the user order the presentation of experimental conditions or stimuli in order to optimize the estimation of beta weights for a given condition or a given contrast. By providing a list of the experimental conditions, the length of the experiment, the TR, and how long and how frequent the experimental conditions last, optseq will generate schedules of stimuli presentation that maximizes the power of the design, both through reducing variance due to noise and increasing the efficiency of estimating beta weights. This is especially important in event-related designs, which are discussed in greater detail below.

Chapter 1: Slow Event-Related Designs

Slow event-related designs refer to experimental paradigms in which the hemodynamic response function (HRF) for a given condition does not overlap with another condition's HRF. For example, let us say that we have an experiment in which IAPS photos are presented to the subject at certain intervals. Let us also say that there are three classes of photos (or conditions): 1) Disgusting pictures (DisgustingPic); 2) Attractive pictures (AttractivePic); and neutral pictures (NeutralPic). The order in which these pictures are presented will be at a  long enough time interval so that the HRFs do not overlap, as shown in the following figure (Note: All of the following figures are taken from the AFNI website; all credit and honor is theirs. They may find these later and try to sue me, but that is a risk I am willing to take):



The interstimulus time interval (ITI) in this example is 15 seconds, allowing enough time for the entire HRF to rise and decay, and therefore allow the experimenter to estimate a beta weight for each condition individually without worrying about overlap in the HRFs. (The typical duration of an HRF, from onset to decay back to baseline, is roughly 12 seconds, with a long period of undershoot and recovery thereafter). However, slow event-related designs suffer from relatively long experimental durations, since the time after each stimulus presentation has to allow enough time for the HRF to decay to baseline levels of activity. Furthermore, the subject may become bored after such lengthy "null" periods, and may develop strategies (aka, "sets") when the timing is too predictable. Therefore, quicker stimulus presentation is desirable, even if it leads to overlap in the conditions's HRFs.


Chapter 2: Rapid (or Fast) Event-Related Designs

Rapid event-related designs refer to experimental paradigms in which the ISI between conditions is short enough so that the hemodynamic response function (HRF) for a given condition overlaps with another condition's HRF. Theoretically, given the assumption of linearity of overlapping HRFs, this shouldn't be a problem, and the contribution of each condition's HRF can be determined through a process known as deconvolution. However, this can be an issue when the ISI's are fixed and the presentation of stimuli is fixed as well:


In this example, each condition follows the other in a fixed pattern: A followed by B followed by C. However, the resulting signal arising from the combination of HRFs (shown in purple in the bottom graph) is impossible to disentangle; we have no way of knowing, at any given time point, what amount of the signal is due to condition A, B, or C. In order to resolve this, we must randomize the presentation of the stimuli and also the duration of the ISI's (a process known as "jittering"):


Now we have enough variability so that we can deconvolve the contribution of each condition to the overall signal, and estimate a beta weight for each condition. This is the type of experimental design that optseq deals with.


Chapter 3: Optseq

In order to use optseq, first download the software from the MGH website (or, just google "optseq" and click on the first hit that comes up). Put the shell script into a folder somewhere in your path (such as ~/abin, if you have already downloaded AFNI) so that it will execute from anywhere within the shell. After you have done this you may also want to type "rehash" at the command line in order to update the paths. Then, type "optseq2" to test whether the program executes or not (if it does, you should get the help output, which is generated when no argument are given).

Optseq needs four pieces of information: 1) The number of time points in a block of your experiment; 2) The TR; 3) A permissible range of the minimum and maximum amounts of time after presentation of a stimulus and the start of the next (aka, the post-stimulus delay, or PSD); and 4) How many conditions there are in your experiment, as well as how long they are per trial and how many trials there are per block. Optseq will then generate a series of schedules that randomize the order  of the ITI's and the stimuli in order to find the maximum design efficiency.

Using the IAPS photos example above, let us also assume that there are: 1) 160 time points in the experiment; 2) a TR of 2 seconds; 3) A post-stimulus window of 20 seconds to capture the hemodynamic response; 3) Jitters ranging from 2 to 8 seconds; and 4) The three conditions mentioned above, each last 2 seconds per trial, and with 20 instances of DisgustingPic, 15 instances of AttractivePic, and 30 instances of NeutralPic. Furthermore, let us say that we wish to keep the top three schedules generated by optseq, each prefixed by the word IAPS, and that we want to randomly generate 1000 schedules.

Lastly, let us say that we are interested in a specific contrast - for example, calculating the difference in activation between the conditions disgustingPic and attractivePic.

In order to do this, we would type the following:

optseq2 --ntp 160 --tr 2 --psdwin 0 20 2 --ev disgustingPic 2 20 --ev attractivePic 2 15 --ev neutalPic 2 30 --evc 1 -1 0 --nkeep 3 --o IAPS --tnullmin 2 --tnullmax 8 --nsearch 1000

This will generate stimuli presentation schedules and only keep the top three; examples of these schedules, as well as an overview of how the efficiencies are calculated, can be seen in the video.


Chapter 4: The Reckoning - Comparison of optseq to 1d_tool.py

Optseq is a great tool for generating templates for stimuli presentation. However, if only one schedule is generated and applied to all blocks and all subjects, there is the potential for ordering effects to be confounded with your design, however minimal that possibility may be. If optseq is used, I recommend that a unique schedule be generated for each block, and then counterbalance or randomize those blocks across subjects. Furthermore, you may also want to consider randomly sampling the stimuli and ITI's and simulate several runs of your experiment, and then run them through AFNI's 1d_tool.py in order to see how efficient they are. With a working knowledge of optseq and 1d_tool.py, in addition to a steady diet of HotPockets slathered with Nutella, you shall soon become invincible.

You shall be...Perfect.


36 comments:

  1. My imperfect understanding is that the efficiency optseq optimizes refers to an FIR-type estimation of the shape/timecourse of your response, when if fact most people are interested in detection efficiency for a contrast of interest. My further imperfect understanding is that optseq-style efficiency actually HURTS constrast/detection efficiency. So the question is not should you optimize your design, but rather are you hurting yourself by using optseq, if your interest is in detection/contrast efficiency? If correct, this seems like an important caveat to raise here.

    ReplyDelete
  2. Hi Anonymous,

    You are correct that optimizing efficiency in estimating beta weights for each condition does not imply efficient contrast estimation; however, you can use the --evc option to optimize one or more contrasts. (I mentioned this in the video, but not in the post; I will fix that.)

    Thanks for raising this important issue!

    -Andy

    ReplyDelete
    Replies
    1. Hi Andy,

      I'm a beginner Optseq user, and I'm trying to optimise multiple contrasts.
      In your post above you suggest that this is possible.

      However, ran Optseq2 asking it to generate also the "summary" file as output using the option --sum . After completion, this file contains a line which says:

      EV Contrast: 1.000 -1.000 -1.000 1.000 0.000

      and then, no other contrast is listed.
      This happens to be the *LAST* of the (eight) contrasts that I specified and would like to optimise.

      How can we be sure that Optseq2 optimizes *ALL* of the contrasts, and not just the last one??

      Many thanks in advance,
      Max

      Delete
    2. Hey Max,

      Apologies for the late reply - you would think that optseq could optimize multiple contrasts and not just one, but according to an email exchange between a user and Doug Greve (the creator of optseq), this is not possible.

      I'm not positive what the tradeoff is when optimizing one contrast at the expense of the rest, and whether this leads to huge differences compared to merely using optseq to optimize the beta weights.

      -Andy

      Delete
  3. Thanks Andy - This is a huge help. I'm having trouble figuring out how to model a more involved design...wondering if you have any insights:

    I have two conditions (milkshake and water).

    Participants get a squirt of milkshake (4s) then swallow (jittered?) then a rinse of water (no interest) then swallow (jitter again?).

    In the other condition, they just get a squirt of water and a swallow.

    In order to model all of this, would I need to put in every event that's taking up time as an event, and then just do an event contrast comparing the two conditions of interest?

    Or would I make the "events" be long enough to encompass all the actions that are taking place during that event (even though I'm just interested in the first 4s when the person is receiving a taste)?

    ReplyDelete
    Replies
    1. Hey John,

      Yes, you would need to model any events you are interested in for a contrast. In AFNI, any events you don't model are automatically put into the baseline model (which, for the F-statistics usually generated by AFNI, can be thought of as a null hypothesis model).

      When you ask about making the events long enough, are you talking about making a duration regressor as long as the entire condition? I would try to avoid that, as it would be difficult to determine what exactly is leading to the differences between the conditions. Furthermore, from your design it looks like one condition is longer than the other, which could potentially be a confound.

      Also, in case you were asking about the jitters, it looks fine to me where you have them right now. If you were modeling over the duration of the entire condition then they wouldn't matter as much, but if you are looking at each condition separately, you will need the jitters to be able to estimate any parameters associated with those events.


      Hope this helps!

      -Andy

      Delete
  4. Very helpful blog, thanks! I am trying to figure out what the efficiency numbers mean. I can see that higher numbers are "better", but is there a heuristic for what range this should be in? For example, one study of mine yielded values near .11, another near .58, and your example has values near 1.x. I am wondering if low values warrant changing the design e.g., number of conditions.

    ReplyDelete
    Replies
    1. Hi Anonymous,

      From what I can tell, the efficiency estimates do not have any magnitude associated with them, but rather are relative measures; in other words, you could try a few different designs and timing schedules, and then see which one is more efficient than the other. Also remember that efficiency is only one slice of the pie; you should also take into account the trial yield for each design and how amenable it will be to the subjects, among other factors.

      All that aside, the short answer is that efficiency is relative - all things being equal, choose the design with the highest efficiency. Just because someone else's design for a completely different type of experiment has better efficiency for their regressors does not mean that their experiment is necessarily better; it's all relative. And that's a good thing. I think.

      Hope this helps,

      -Andy

      Delete
  5. Hi Andy,

    I'm a beginner Optseq user, and I'm trying to optimise multiple contrasts.
    In your post above (dated October 2, 2012) you suggest that this is possible.

    However, I ran Optseq2 asking it to generate also the "summary" file as output using the option --sum . After completion, this file contains a line which says:

    EV Contrast: 1.000 -1.000 -1.000 1.000 0.000

    and then, no other contrast is listed.
    This happens to be the *LAST* of the (eight) contrasts that I specified and would like to optimise.

    How can we be sure that Optseq2 optimizes *ALL* of the contrasts, and not just the last one??

    Many thanks in advance,
    Max

    ReplyDelete
  6. Hi Andy,

    When I identify the min null duration as 2 seconds I still get few trials with null duration= 0. Do you have any suggestions for how I can fix this problem?

    ReplyDelete
    Replies
    1. Hey there, can you post the code you used for the optseq command? I haven't used optseq in a while, but maybe I can spot the problem.

      -Andy

      Delete
  7. Hello,

    I'm a little confused about what determines the length of the "NULL" periods in the schedules generated by optseq. Originally, I thought you could set these parameters through the min/max PSD, but when I did so I got an output with a few NULL trials longer than the 8 second PSD max that I set. Here's the code I used:

    optseq2 --ntp 420 --tr 2 --psdwin 2 8 2 --ev OFR 8 8 --ev OFI 8 8 --ev ONR 8 8 --ev ONI 8 8 --ev new 8 32 --ev different 8 8 --ev same 8 8 --evc 0.25 0.25 0.25 0.25 1 0 0 --nkeep 3 --o OSC_new --nsearch 1000


    Some googling told me that one can also add --tnullmin and --tnullmax to the code, so now I'm also confused about the difference between PSD and the length of the "NULL" periods.

    Any help would be appreciated, thank you in advance!

    ReplyDelete
  8. Dear Andy
    Thank you for the insightful post
    An embarrassingly technical question - I'm trying to run optseq2 using cygwin (default instillation) and just can't seem to run it correctly
    I've added the directory to the PATH (and verified using which)
    ran chmod a+x optseq2
    and nothing - is there something basic I'm missing?

    ReplyDelete
    Replies
    1. Hi there,

      What kind of error are you getting? I've found that sometimes you need to run chmod 777 optseq2 to get it to work. If that doesn't fix it, I'm not sure what it is; it might be something specific to Cygwin.


      Good luck!

      -Andy

      Delete
  9. Hi Andy
    Thanks for your suggestion.
    No matter what I did Cygwin just wouldn't run Optseq
    I switched to Ubuntu + virtual box and it worked perfectly
    Best,
    Greg

    ReplyDelete
    Replies
    1. Good to know! Maybe Cygwin just doesn't work with Optseq.

      -Andy

      Delete
  10. Dear Andy,

    thank you for your very useful post. My question regards using optseq2 with two levels of events - one to be optimized but not randomized (encoding always followed by recognition, but with jittered ISI) and second: optimized and randomized (3 types of stimuli to be encoded and recognized). Is it possible to do it with optseq?

    Best regards,
    Barbara

    ReplyDelete
  11. I'm desperately looking for optseq2 but the file I downloaded on the official website seems to be corrupted.
    Does anyone could send it to me, please ? (I'm on Mac)
    Alex

    ReplyDelete
    Replies
    1. Hi Alex,

      How is it corrupted? Is it just not running? In that case, I've found that it is usually because of permission settings. Try going to the directory where optseq2 is located and doing chmod 755 optseq2.

      -Andy

      Delete
    2. Hi Andy,
      I finally found a way to obtain optseq2: I downloaded freesurfer...

      Delete
  12. Hi Andy, thanks for the information on this. Would it be correct to say that you cannot enter both multiple events and multiple conditions per event in optseq when defining your design? For the task I am using, each trial has 3 events. The first two events have 4 levels each, the second has two levels. I'm interested in what ISI ranges I should use between the events. I do plan to jitter.
    -Jennifer

    ReplyDelete
  13. Greetings,
    Has anyone found an easy way (relatively) to show the results in a .par file graphically?

    Cheers,
    Jason

    ReplyDelete
    Replies
    1. Hey Jason, try a program such as AFNI's 1dplot; it's easy to use, and has a -sepscl option to present each regressor on its own scale.

      -Andy

      Delete
  14. Greetings,

    I've tried but cannot figure out how to use optseq2 to develop a schedule with the following parameters: TR=2.5sec, stimulus presentation time 1second or less. Does anyone have any suggestions?
    Cheers,
    Jason

    ReplyDelete
    Replies
    1. Hey Jason, optseq will restrict you to a tr that is a multiple of your event duration; so, you could have a tr of 2.5 seconds and an event duration of 0.5 seconds, but not a tr of 2.5 seconds and an event duration of 1 second.

      Also make sure that the dPSD is a multiple of your event duration. E.g.:

      optseq2 --tprescan -6 --ntp 100 --tr 2.5 --psdwin 0 20 0.5 --ev evt1 0.5 30 --ev evt2 0.5 30 --nkeep 3 --o ex3 --nsearch 10000


      -Andy

      Delete
  15. Greetings,
    I am using optseq2 to estimate fMRI runs and cannot figure out how to get optseq2 to make at least the first time period in the schedule to be a null event. Does anyone know how to accomplish this task?
    Cheers,
    Jason

    ReplyDelete
    Replies
    1. Hi Jason,

      Try using the --tprescan option, and setting it to a negative number. For example, if I want the first event to occur six seconds after scanning starts, then I could use a command like the following:

      optseq2 --tprescan -6 --ntp 100 --tr 2 --psdwin 0 20 --ev evt1 2 30 --ev evt2 2 30 --nkeep 3 --o ex3 --nsearch 10000


      Best,

      -Andy

      Delete
    2. Hi,

      Using a negative number for --tprescan caused a spectacular fail for me (macOS):

      optseq2 --tprescan -1 --tnullmin 1 --ntp 1800 --tr 1 --psdwin 0 20 --ev a 4 16 --ev b 4 16 --ev c 4 16 --ev d 4 16 --ev e 4 16 --ev f 4 16 --ev g 4 16 --ev h 4 16 --ev i 4 16 --ev j 4 16 --nkeep 30 --o FDMT --nsearch 1000
      INFO: Setting srand48() seed to 963450
      optseq2
      $Id: optseq2.c,v 2.15 2009/05/26 18:13:45 greve Exp $
      NoSearch = 0
      nSearch = 1000
      nKeep = 30
      PctUpdate = 10.000000
      nCB1Opt = 0
      seed = 963450
      Ntp = 1800
      TR = 1
      TPreScan = -1
      PSD Window = 0 20 1
      nEvTypes = 10
      EvNo Label Duration nRepsNom
      1 a 4.000 16
      2 b 4.000 16
      3 c 4.000 16
      4 d 4.000 16
      5 e 4.000 16
      6 f 4.000 16
      7 g 4.000 16
      8 h 4.000 16
      9 i 4.000 16
      10 j 4.000 16
      PctVarEvReps = 0
      VarEvRepsPerCond = 0
      PolyOrder = -1
      tNullMax = -1
      tNullMin = 1
      outstem = FDMT
      AR1 = 0
      No refractory penalty
      Cost = eff
      OutStem = FDMT
      Summary File = FDMT.sum
      nTaskAvgs = 200
      INFO: LogFile is FDMT.log
      outstem = FDMT
      0 7 0.0 0.0721536 0.0721536 0.6538 14.4367 0.293691 13.625 15.3177 1.69273 6550.52 0
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91
      /usr/pubsw/packages/vxl/1.9.0/src/core/vnl/algo/vnl_qr.txx: vnl_qr::solve() : matrix is rank-deficient by 91

      It could be because I have multiple events? Or some other gremlin.

      Fortunately, if you just want an offset at the beginning of your experiment all you'd need to do is add an offset (e.g. 12 sec) to all the onset times provided by optseq. One thing to note is that you should probably use an offset that is a multiple of your TR (e.g. 4 * TR3s = 12sec) to avoid messing up the jittering that optseq has introduced.

      Cheers, Jon

      Delete
  16. Hi Andy

    I have a question about jittering. Is it a big problem to jitter the length of events rather than the ISI between the events? Here is an example of this type of design:
    http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001684

    The reason I am interested in this is that it allows me to separate the events that I am interested in without having to include lots of ISI's between events. This way I can include more trials. Plus subjects won't have their working memories strained too much.

    I have tried to find more info on this type of design but to no avail, so any guidance on this would be great!

    Thanks
    David


    ReplyDelete
    Replies
    1. Hi David,

      Both methods do the same thing: Vary the onset of different trials relative to each other, so that different points of the HRF are sampled for each condition. But why add the extra time to your condition when you can assign that time to an ISI? I can see how it might reduce working memory demands, but I don't see how it allows you to add more trials.

      Also keep in mind that longer durations can lead to BOLD effects that may not be captured by simply using duration regressors. This all depends on your experimental design: Is there any reason to believe that the longer presentation of a stimulus (or cognitive process) would interact with any of the other conditions, as opposed to keeping everything the same? I don't have any examples at hand of how this might happen, but it's something to keep in mind.

      -Andy

      Delete
    2. That's good to know. Thanks very much, Andy. I can shave off a few seconds in each trial by getting rid of the ISI and incorporating the jitter into the event. For example, an event of 2s plus a jitter averaging 3.5s (av length 5.5s) is longer that just the jittered event (av length 3.5). Considering that I want to do this twice within each trial, this should give me considerably more room across the whole task to play with.

      I am a bit concerned about the effect of the variable lengths but I'm given some confidence by the fact that the same kind of task has been done like this in the previous literature.

      Once again, thanks a lot for your advice :)

      David

      Delete
  17. Hi Andy, Thanks so much for your incredibly useful blog and videos!
    I have a question about optseq2. I am looking for a jitter of between 0.5 and 4 secs to be generated. Thus my command was

    optseq2 --ntp 300 --tr 2 --psdwin 0 4 .5 --ev common 2 42 --ev rare 2 18 --o Aliens3_outstem --nsearch 10000 --nkeep 4.

    However, it ends up generating ITIs of from 0.5 up to 32! Any idea why it would do this?

    Thanks,
    Maria

    ReplyDelete
    Replies
    1. Hey Maria,

      psdwin actually is to capture the full range of the hemodynamic response (which can be from about 20 to 30 seconds, depending on if you want to capture the undershoot as well). If you want to limit the range of jitters, you can use the options --tnullmin and --tnullmax. Note that you will need to have enough trials to make sure the jitters stay within a certain range; else, the extra time will be filled up with longer jitters.

      -Andy

      Delete
    2. Hi Andy,

      Can you clear up some confusion I have with optseq? When you specify the psdwin with 0 and 4 (as above) I take this to mean that you are implying that the BOLD signal associated with an event will last (on average) between 0 and 4 seconds from the onset of a single event? I understand that this is more related to BOLD modelling than considerations for jittering (although the two are intertwined), the question I have is about the 0.5 at the end of Maria's psdwin argument, what purpose does this serve? Shouldn't this always be equal to the TR? Any help getting to grips with this would be much appreciated!

      Cheers, Jon

      Delete