Figure 1.
(A) Average mouse weight before and during training in the operant conditioning task. On day 0, mice are weighed, and their home cage water is replaced with 2% citric acid water. On days 0, 1, and 2, mice are handled to habituate them to the operator. Following this habituation period, mice undergo the three phases of the operant conditioning task (days 3 to 20). Mice exhibit a slight (~10%) weight loss due to reduced water intake, but their weight remains stable for the rest of the experiment, indicating that daily sucrose consumption during the task does not affect their weight. (B) Raw sample fluorescence traces recorded during baseline with excitation at 405 nm (bottom trace, blue), the GCaMP6f isosbestic point, compared with the calcium signal at 465 nm (top trace, green). (C) Sample trace of a trained mouse under anesthesia. Playing the tone while the mouse was under anesthesia did not evoke any direct sensory-related response, nor did a toe pinch, suggesting that dSPNs in the VS are not responsive to sensory stimulation.
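The legend does not state how the 405 nm control is used to correct the 465 nm signal. One common approach, shown here only as a minimal sketch and not as the authors' pipeline, fits the isosbestic trace to the calcium trace by least squares and expresses the signal as a fractional change from that fit. The function name and input arrays are hypothetical.

```python
import numpy as np

def isosbestic_dff(signal_465, control_405):
    """Return dF/F of the 465 nm trace after regressing out the 405 nm control.

    Assumption: a linear least-squares fit is used; the source does not
    specify the correction method.
    """
    signal_465 = np.asarray(signal_465, dtype=float)
    control_405 = np.asarray(control_405, dtype=float)
    # Straight line that maps the control channel onto the signal channel.
    slope, intercept = np.polyfit(control_405, signal_465, 1)
    fitted_control = slope * control_405 + intercept
    # Fractional change relative to the fitted, calcium-independent baseline.
    return (signal_465 - fitted_control) / fitted_control
```

Because both channels share slow artifacts such as photobleaching, the fitted control tracks the drift, and what remains in the output is mostly the calcium-dependent component.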
![Biomedicines 12 02755 g001]()
Figure 2.
(A) Sample micrographs of coronal sections of the VS of D1-cre animals injected with AAV vectors containing Flex-GCaMP6f. GCaMP6f expression is confirmed by GFP immunolabeling (cyan) co-stained with TH antibody (yellow) to visualize the striatum. (B) Schematic of the location of the optic fibers in mice implanted in the VS (figure from Allen Brain Atlas).
Figure 3.
Schematic of the experimental setup. (A) Mice were placed in a rectangular box made of infrared-transmitting acrylic. An Arduino controlled various aspects of the experiment, including spout availability, reward delivery, lever presses, the force required to activate the lever, tone and light cues, and initiation of calcium imaging via a Doric Fiber Photometry system. (B) A camera was positioned below the box and another to the side, providing simultaneous views from both the bottom and side of the setup.
Figure 4.
(A–C) Schematic of the behavioral protocol. (A) In Phase I, mice are placed in the operant box (step 1), where a spout (center of the wall at the top of the image) is always accessible. However, the spout delivers a sucrose reward only when the mouse licks it while a 3 kHz tone is playing (step 2). Mice learn to associate the tone with reward availability (classical conditioning). The tone lasts for 5 s, followed by a 15 s interval during which the spout does not deliver a reward, even when licked. This sequence repeats for multiple trials (each tone signals the start of a trial) daily, over 8 consecutive days. Two tests were performed each day. There were 5 to 10 trials in each test, depending on the success rate of each individual mouse and to avoid satiation. (B) In Phase II, the tone is not played. The spout is only activated if the mouse presses a lever (operant conditioning; step 1). Each lever press initiates a trial, and the spout is released, making the reward available (step 2); this process continues for multiple trials daily over 6 consecutive days, with two tests each day. Within each test, trials were counted by each lever press event and lasted 10 s. Mice could press the lever as many times as they wished within the 5 min test. (C) In Phase III, the mouse is placed in the box (step 1) and the lever can only be pressed during the tone (step 2), and only then does it release the spout for the mouse to access the reward (step 3). The tone starts each trial, which repeats multiple times daily for 5 consecutive days. During this phase, there were two tests each day and 5 to 10 trials within each test. Behavior is recorded using an infrared camera placed beneath the box. DeepLabCut is used to track the positions of the mouse’s body parts, and an LED light signals the start of each trial (triggered by the tone in Phases I and III, or by the lever press in Phase II). (D–I) Schematic of DeepLabCut results.
The legend on the right shows the color code for the different body parts. Each point represents the position of a body part in one frame, and all frames from a session are combined to create a 2D spatial representation. Samples from the same mouse across the three phases are shown. (D) Day 1 of Phase I: The mouse explored the entire box. (E) By day 8, the mouse spent most of its time near the spout, ignoring the rest of the box. (F) Day 1 of Phase II: The mouse was mostly near the spout but, since the spout only appears after pressing the lever, the mouse continued to explore the box. (G) Once the mouse learned that pressing the lever releases the spout, it spent more time around the lever and spout. (H,I) In Phase III, the mouse has already associated both the lever and spout with the reward and spent most of its time around these two objects. (J) Labeled frame from one mouse. The arrows indicate the LED light, the lever, and the spout.
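Panels (D–I) pool the tracked position from every frame of a session into a 2D spatial representation. As a rough illustration of that idea, and assuming the tracked coordinates have already been exported as plain arrays, the sketch below bins positions into an occupancy map and measures the fraction of frames spent near a target such as the spout. The function names, box dimensions, and radius are hypothetical and are not taken from the study.

```python
import numpy as np

def occupancy_map(x, y, box_width, box_height, bins=(20, 20)):
    """Fraction of frames a tracked body part spent in each spatial bin."""
    counts, _, _ = np.histogram2d(
        x, y, bins=bins, range=[[0, box_width], [0, box_height]]
    )
    total = counts.sum()
    # Normalize so that the map sums to 1 when any frame falls in the box.
    return counts / total if total > 0 else counts

def fraction_near(x, y, target_xy, radius):
    """Fraction of frames within `radius` of a target (e.g., spout or lever)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dist = np.hypot(x - target_xy[0], y - target_xy[1])
    return float(np.mean(dist <= radius))
```

Comparing `fraction_near` for the spout on the first and last day of a phase would give a single number for the shift from exploration to spout-centered behavior that the panels show qualitatively.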
![Biomedicines 12 02755 g004]()
Figure 5.
Behavior data of mice tested with our operant conditioning paradigm. For each graph, data are organized by day and test (x-axis, two tests per day, totaling 16 time points for Phase I, 12 for Phase II, and 10 for Phase III), and each data point is an average of all the trials within each test. The averages for each individual animal are shown in gray (n = 8), and orange indicates the average of all mice. During Phase I, mice learned to reach the spout more rapidly at the beginning of the tone ((A), latency to the first lick; RM 1-way ANOVA Day p < 0.0001, F(3.768, 25.88) = 22.05, Dunnett’s multiple comparisons test p < 0.001 from day 4 to day 8) and made a higher number of licks ((B), # of licks; RM 1-way ANOVA Day p < 0.0001, F(3.796, 26.57) = 19.02, Dunnett’s multiple comparisons test p < 0.001 from day 4 to day 8). Overall, animals completed more successful trials as training progressed ((C), % of successful trials; RM 1-way ANOVA Day p < 0.001, F(3.153, 22.07) = 64.69, Dunnett’s multiple comparisons test p < 0.01 on day 3, p < 0.001 on day 4 to 8). During Phase II, mice learned to press the lever more rapidly ((D), latency to press; RM 1-way ANOVA Day p < 0.01, F(3.390, 24.96) = 6.262, Dunnett’s multiple comparisons test p < 0.01 from day 4 to day 6) and increased the total number of lever presses ((E), # of presses; RM 1-way ANOVA Day p < 0.05, F(2.938, 21.90) = 3.697, Dunnett’s multiple comparisons test did not show any significance). There was a significant reduction in the time it took the animal to reach the spout after pressing the lever, demonstrating a learned association between the operant action and the reward consumption ((F), latency to the first lick; RM 1-way ANOVA Day p < 0.05, F(3.399, 22.25) = 22.05, Dunnett’s multiple comparisons test p < 0.05 from day 4 to day 6).
During Phase III, mice already exhibited a low latency to press the lever from the beginning of the tone and a low latency to first lick after the lever press ((G), latency to press, and (H), latency to first lick from lever press; neither is significant by RM 1-way ANOVA). There was, however, an increase in the number of successful trials, as indicated by the ratio of trials in which the mouse successfully pressed the lever during the tone and successfully consumed the reward after the lever press vs. the trials in which one or both steps were missing ((I), % of successful trials; RM 1-way ANOVA Day p < 0.05, F(2.793, 19.55) = 4.978, Dunnett’s multiple comparisons test p < 0.05 on day 5).
![Biomedicines 12 02755 g005]()
Figure 6.
Phase I: Classical Conditioning. Mice have access to the spout, but the reward is only delivered if the mouse licks the spout during the tone. (A,B) Sample calcium traces of dSPNs in the VS from a non-rewarded trial (A) and a rewarded trial (B) show a peak in calcium activity at the start of each licking bout, followed by a decrease until the licking ends, with no significant difference in the amplitude of each licking bout action (whether leading to reward or no-reward). This suggests the activity is associated with the initiation of licking, rather than the cue itself (blue-shaded area). (C,D) Averaged calcium traces from dSPNs recorded during Phase I. Trials where the mouse successfully receives the reward are labeled as rewarded (orange), while those where the mouse licks outside the tone window and does not receive the reward are labeled as non-rewarded (magenta). Trials with no licks are labeled as no licks (grey). Calcium activity is significantly higher at the beginning of the tone during rewarded trials ((C), 2-way ANOVA Group p < 0.0001, Tukey HSD post hoc test * p < 0.05 for rewarded vs. non-rewarded and rewarded vs. no licks). The peak amplitude during the tone (max peak within the first 2 s of tone) was also significantly higher for rewarded trials ((D), 1-way ANOVA * p < 0.05, F(1.892, 17.02) = 5.941, Dunnett’s multiple comparisons test * p < 0.05 for rewarded vs. non-rewarded and rewarded vs. no licks).
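The trial labels and the peak measure described in this legend can be written down compactly. The sketch below is only an illustration under stated assumptions: lick times and tone onsets are taken to be available in seconds, the 5 s tone duration comes from the Figure 4 legend, and the function names are hypothetical rather than the authors' analysis code.

```python
import numpy as np

TONE_DURATION_S = 5.0  # tone length given in the Figure 4 legend
PEAK_WINDOW_S = 2.0    # "max peak within the first 2 s of tone"

def classify_trial(lick_times, tone_onset, trial_end):
    """Label a Phase I trial as 'rewarded', 'non-rewarded', or 'no licks'."""
    licks = np.asarray(lick_times, dtype=float)
    in_trial = licks[(licks >= tone_onset) & (licks < trial_end)]
    if in_trial.size == 0:
        return "no licks"
    # A lick during the tone is what triggers reward delivery.
    in_tone = in_trial < tone_onset + TONE_DURATION_S
    return "rewarded" if in_tone.any() else "non-rewarded"

def peak_after_tone(time, dff, tone_onset, window=PEAK_WINDOW_S):
    """Maximum of the calcium trace in the window after tone onset."""
    time = np.asarray(time, dtype=float)
    dff = np.asarray(dff, dtype=float)
    mask = (time >= tone_onset) & (time <= tone_onset + window)
    # Return NaN rather than raising if the window holds no samples.
    return float(dff[mask].max()) if mask.any() else float("nan")
```

With the 5 s tone and 15 s interval described for Phase I, `trial_end` would be `tone_onset + 20.0`.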
![Biomedicines 12 02755 g006]()
Figure 7.
Phase II: Operant conditioning. Averaged dSPN calcium traces recorded during Phase II. Mice must press the lever to make the spout available for reward consumption. Trials start at the pressing of the lever but are considered rewarded (orange) only if the mouse actively consumes the reward (licks are detected after the lever press); otherwise, the trial is considered non-rewarded (magenta). (A) Mice show a peak of calcium activity at the initiation of the lever press (dotted line; reward availability labeled with pink-shaded area), although peak amplitude is on average greater if the mouse moves to consume the reward (rewarded trials, orange) compared to trials where the mouse presses the lever but does not approach and lick the spout (non-rewarded trials, blue; 2-way ANOVA Group p < 0.0001, Tukey HSD post hoc test * p < 0.05 for rewarded vs. no licks). This suggests that dSPNs in the VS have higher activity when the mouse intends to press the lever and then lick the spout, compared to when the mouse accidentally presses the lever but does not intend to lick the spout. (B) Quantification of the amplitude of the maximum peak during the trial shows a much higher amplitude for trials in which the reward is consumed than for trials that are not rewarded (2-tailed t-test * p < 0.05). (C) Average amplitude of peaks at the lever press (light orange) or at the first lick in a bout (dark orange), plotted by day of training. Interestingly, calcium peak amplitude increases over training, as mice become faster at pressing the lever and then reaching and consuming the reward (2-way ANOVA Group * p < 0.001, Tukey HSD post hoc test * p < 0.05, F(1,14) = 24.21 for presses vs. licks, Dunnett’s multiple comparisons test * p < 0.05 for day and day 5).
![Biomedicines 12 02755 g007]()
Figure 8.
Phase III: Cued-Operant Conditioning. Averaged dSPN calcium traces recorded during Phase III, where mice were required to press the lever during the tone to make the spout available for reward consumption. (A) Analyzed sample trace of dSPN calcium signals recorded in the VS of a mouse performing Phase III of the operant conditioning task. The image indicates the start of the auditory cue (black dotted line), the initiation of the lever press (red dotted line), showing the increase in activity of the motor action (as explained in Figure 4), and the beginning of the lick event (each lick is labeled by an orange tick at the top of the image). Calcium activity is reduced for the duration of the lick event and only returns once the mouse is finished licking (when the spout retracts after the end of the reward availability period, 5 s from the lever press). (B) Trials began at the tone onset and were classified as rewarded (orange) if the mouse pressed the lever within the tone’s time window and consumed the reward (licks detected after the lever press); otherwise, the trial was categorized as a no-presses (magenta) trial. Mice displayed a peak in calcium activity when the lever was pressed while the tone was playing (blue-shaded area), with calcium signals showing a depression during reward consumption, similar to the activity observed in Phase II. The dotted line indicates the onset of the tone, but no specific tone response was detected during Phase III. (C) When signals were aligned by event onset, the data showed a significant difference between the peak at the initiation of the lever press (light orange) and the activity during reward consumption (dark orange; 2-way ANOVA Group, p < 0.0001, Tukey HSD post hoc test * p < 0.05 for lever presses vs. licks). (D) Calcium signals were recorded in the mouse home cage before training to detect baseline and acclimatize the animal to the optic fiber. Signals were then recorded after the mouse had completed the three phases of the experiment. Average peak amplitudes of both naïve (light grey) and trained animals (light brown) were comparable to signals detected during training (in-training values detected from day 3 of Phase II), indicating that there is no loss in signal after multiple days of recording. Nevertheless, the average peak amplitudes were higher for the peaks at lever press (1-way ANOVA Group, ** p < 0.01, F(3,28) = 5.784, Tukey HSD post hoc test * p < 0.05 and ** p < 0.01 as indicated in the graph). (E) The average amplitude of calcium peaks during lever presses or at the beginning of the first lick bout, plotted by day of training, revealed that neither the calcium peak amplitude during lever presses (light orange) nor the activity at lick events (dark orange) increased over time. This result suggests that there was no significant learning progression in Phase III, as the animals were already proficient in the task from day 1 of Phase III and did not require an additional 4 days to improve their performance.
![Biomedicines 12 02755 g008]()
Figure 9.
(A) Analyzed average sample traces of dSPN calcium signals during rewarded trials from each individual mouse recorded during Phase III of the operant conditioning task. The image indicates the start of the auditory cue (black dotted line and blue-shaded area), showing a consistent increase in activity (as explained in Figure 8). Peak timing differed between mice, probably due to differences in the latency to approach the lever from the beginning of the tone and in engagement with the lever. Nevertheless, all mice showed a peak in activity after the tone when they pressed the lever. (B,C) All calcium traces from non-rewarded (B) and rewarded (C) trials from one mouse during Phase III. Trials are aligned at time 0 (beginning of the trial) and plotted with the last 2 s of the previous trial. For the first trial of each test each day, there is no previous trial, so there are no data from the 2 s prior to the beginning of the trial. Yellow indicates high calcium activity, while blue indicates low calcium activity. (B) Data show no clear alignment of activity around the beginning of the tone (red dotted line). (C) There is a clear peak in activity immediately after the start of the tone, indicating when the mouse is pressing the lever. There is also a consistent depression in activity after the lever press, probably reflecting the depression during reward consumption, consistent with the averaged traces in Figure 8.
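The trial-by-trial layout in panels (B,C), with each trial aligned at time 0 and preceded by 2 s of earlier signal where available, could be assembled roughly as follows. This is a sketch under assumptions: the trace is a single continuous array sampled at a known rate, trial start times are given in seconds, and the function and parameter names are hypothetical rather than the authors' code.

```python
import numpy as np

def align_trials(dff, fs, trial_starts_s, trial_len_s, pre_s=2.0, no_pre=()):
    """Stack trials into an (n_trials, n_samples) matrix aligned to trial start.

    Samples that fall outside the recording, or in the pre-trial window of
    trials listed in `no_pre` (e.g., the first trial of a test), stay NaN.
    """
    dff = np.asarray(dff, dtype=float)
    n_pre = int(round(pre_s * fs))
    n_post = int(round(trial_len_s * fs))
    out = np.full((len(trial_starts_s), n_pre + n_post), np.nan)
    for row, start_s in enumerate(trial_starts_s):
        start = int(round(start_s * fs))
        offset = start - n_pre  # sample index that maps to column 0
        lo = start if row in no_pre else offset
        lo = max(lo, 0)
        hi = min(start + n_post, dff.size)
        if hi > lo:
            out[row, lo - offset:hi - offset] = dff[lo:hi]
    return out
```

The resulting matrix can be passed to an image-style plot, with one row per trial and time 0 at column `n_pre`; NaN entries then appear as gaps, matching the missing pre-trial data for the first trial of each test.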
![Biomedicines 12 02755 g009]()
Figure 10.
Working model. The schematic illustrates distinct components of dSPN activity associated with different goal-directed actions across learning phases, rather than changes in peak dynamics or kinetics, which can be variable depending on the timing of the actions. In Phase I, dSPNs show activity related to reward consumption following lick initiation. In Phase II, dSPNs are prominently active during lever pressing in late training, indicating their involvement in learned motor actions critical for reward acquisition. This activity pattern suggests that dSPNs are integral to reinforcing motor behaviors linked to obtaining rewards. By Phase III, after the task is learned, dSPNs display activity related to both lever pressing and reward consumption (wider peak), though the intensity of this activity is generally lower than the lever press activity during Phase II. This reduced activity implies that while dSPNs continue to support goal-directed actions, their role lessens as the behavior becomes more habitual. Overall, these findings indicate that dSPNs are essential for encoding the action–reward relationship during the initial learning stages, but their involvement remains constant once the task is habitual.
![Biomedicines 12 02755 g010]()