Specifically, the effect of relative uncertainty in right RLPFC was reliable for the explore participants [t(7) = 4.5, p < 0.005] but not the nonexplore participants [t(6) = 1.2], and the direct comparison between groups was significant [t(13) > 4.4, p < 0.005]. Further ROI analysis also demonstrated these effects using ROIs in RLPFC defined based on coordinates from prior studies of exploration (i.e., Daw et al., 2006 and Boorman et al., 2009; see Supplemental Information).

The primary model of learning and decision making in this task was drawn directly from prior work (Frank et al., 2009) to permit consistency and comparability between studies. However, we next sought to establish that the effects of relative uncertainty observed in RLPFC were not wholly dependent on specific choices made in constructing the computational model itself. Thus, we constructed three alternative models that relied on the same relative uncertainty computation as the primary model but differed in other details of their implementation that may affect which specific subjects are identified as explorers (see Supplemental Information for modeling details).

First, we eased the constraint that ε be greater than or equal to 0. In the primary model, we added this constraint so that model fits could not leverage this parameter to account for variance related to perseveration, particularly on exploit trials. However, in certain task contexts some individuals may consistently avoid uncertain choices (i.e., uncertainty aversion; Payzan-LeNestour and Bossaerts, 2011 and Strauss et al., 2011). It follows, then, that these individuals might track uncertainty in order to avoid it, perhaps reflected by a negative ε parameter. Alternatively, ε may attain negative values if participants simply exploit on the majority of trials, such that the exploitative option is selected most often and hence has the most certain reward statistics (assuming that value-based exploitation is not perfectly captured by the model). Thus, a negative ε need not imply uncertainty aversion, and it could be that the smaller proportion of exploratory trials is still guided toward uncertainty. We therefore conducted three simulations in which ε was unconstrained (see also the earlier model of RT swings). In an initial simulation, we categorized responses as exploratory or not, where exploration is defined by selecting responses with lower expected value (Sutton and Barto, 1998 and Daw et al., 2006). While we fit the remaining model parameters across all trials, we fixed ε = 0 on all exploitation trials and allowed it to vary only in trials defined as exploratory.
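The trial categorization used in this initial simulation can be illustrated with a minimal sketch. It is not the fitting code used in the study: the function names, the two-response setup, and the numerical values are hypothetical, and the uncertainty term is assumed here to be a simple product of ε and a relative uncertainty value. The sketch only shows how ε is applied on exploratory trials and fixed to 0 on exploitation trials.

```python
def is_exploratory(chosen_value, other_value):
    """Label a trial exploratory when the chosen response has the lower expected value."""
    return chosen_value < other_value


def uncertainty_term(epsilon, rel_uncertainty, exploratory):
    """Scale relative uncertainty by epsilon on exploratory trials only.

    On exploitation trials epsilon is fixed to 0, so the term vanishes.
    Epsilon is unconstrained here and may be negative.
    """
    effective_epsilon = epsilon if exploratory else 0.0
    return effective_epsilon * rel_uncertainty


# Hypothetical trials: (expected value of chosen response,
#                       expected value of the alternative,
#                       relative uncertainty)
trials = [
    (0.4, 0.7, 0.5),  # lower-valued response chosen: exploratory
    (0.8, 0.3, 0.5),  # higher-valued response chosen: exploitation
]

epsilon = -2.0  # illustrative value only, not a fitted parameter
terms = [
    uncertainty_term(epsilon, ru, is_exploratory(chosen, other))
    for chosen, other, ru in trials
]
```

Under these assumptions, only the first trial contributes an uncertainty term, so a fitted ε reflects behavior on exploratory trials alone.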
