Basic Rescorla-Wagner Model Tutorial
This tutorial introduces SPICE using a simple Rescorla-Wagner learning model. You’ll learn how to:
- Set up a basic SPICE model
- Train it on simulated data
- Extract and interpret the discovered equations
Prerequisites
Before starting this tutorial, make sure you have:
- SPICE installed (pip install autospice)
- Basic understanding of reinforcement learning
- Familiarity with Python and NumPy
The Rescorla-Wagner Model
The Rescorla-Wagner model is a fundamental model of associative learning that describes how associations between stimuli and outcomes are learned through experience. The basic equation is:
ΔV = α(λ - V)
where:
- V is the associative strength
- α is the learning rate
- λ is the maximum possible associative strength
- ΔV is the change in associative strength
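The term (λ - V) is the prediction error: learning is driven by the gap between the outcome and the current expectation, so updates shrink as V approaches λ. As a minimal illustration (plain NumPy, independent of the SPICE API), the update rule can be simulated directly:

import numpy as np

alpha, lam = 0.3, 1.0              # learning rate and asymptote
V = 0.0                            # initial associative strength
history = []
for trial in range(20):
    V = V + alpha * (lam - V)      # Rescorla-Wagner update: ΔV = α(λ - V)
    history.append(V)
print(np.round(history, 3))        # V approaches λ, with ever smaller steps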
Tutorial Contents
- Setting up the environment
- Creating simulated data
- Training the SPICE model
- Analyzing the results
- Interpreting the equations
Interactive Version
This is the static web version of the tutorial. For an interactive version:
- Go to the SPICE repository
- Navigate to tutorials/1_rescorla_wagner.ipynb
- Run the notebook in Jupyter
Full Tutorial
View or download the complete notebook
Step-by-Step Guide
1. Setup
First, let’s import the necessary modules:
from spice.estimator import SpiceEstimator
from spice.precoded import RescorlaWagnerRNN, RESCOLA_WAGNER_CONFIG
from spice.resources.bandits import BanditsDrift, AgentQ, create_dataset
import numpy as np
2. Create the Environment
We’ll create a two-armed bandit environment:
environment = BanditsDrift(
    sigma=0.2,       # Noise level
    n_actions=2      # Number of arms
)
3. Create the Agent
Set up a Q-learning agent with specific learning parameters:
agent = AgentQ(
    n_actions=2,
    alpha_reward=0.6,    # Learning rate for rewards
    alpha_penalty=0.6,   # Learning rate for penalties
    forget_rate=0.3,     # Rate of forgetting
)
4. Generate Data
Create a synthetic dataset for training:
dataset, _, _ = create_dataset(
    agent=agent,
    environment=environment,
    n_trials=200,       # Trials per session
    n_sessions=256,     # Number of sessions
)
5. Create and Train SPICE Model
Set up and train the SPICE model:
spice_estimator = SpiceEstimator(
    rnn_class=RescorlaWagnerRNN,
    spice_config=RESCOLA_WAGNER_CONFIG,
    hidden_size=8,
    learning_rate=5e-3,
    epochs=16,
    verbose=True
)
spice_estimator.fit(dataset.xs, dataset.ys)
6. Extract Learned Features
Examine what SPICE has learned:
features = spice_estimator.spice_agent.get_spice_features()
for agent_id, feat in features.items():
    print(f"\nAgent {agent_id}:")
    for model_name, (feat_names, coeffs) in feat.items():
        print(f"  {model_name}:")
        for name, coeff in zip(feat_names, coeffs):
            print(f"    {name}: {coeff}")
7. Make Predictions
Use the trained model to make predictions:
pred_rnn, pred_spice = spice_estimator.predict(dataset.xs)
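Here pred_rnn comes from the trained RNN and pred_spice from the discovered equations. A quick sanity check, assuming both outputs are arrays of action probabilities with matching shapes (verify this in your own run), is to compare them directly:

rnn_probs = np.asarray(pred_rnn)
spice_probs = np.asarray(pred_spice)
print("RNN predictions:  ", rnn_probs.shape)
print("SPICE predictions:", spice_probs.shape)
print("mean |RNN - SPICE|:", np.mean(np.abs(rnn_probs - spice_probs)))  # small values mean the equations closely mimic the RNN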
Understanding the Results
The SPICE model should discover equations similar to the Rescorla-Wagner update rule. Key things to look for:
- The relationship between reward prediction error and value updates
- The learning rate parameter
- How well the discovered equations match the original agent’s parameters
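One way to make the last check concrete is to hold the ground-truth parameters used to generate the data next to the coefficients printed in step 6. The feature names SPICE reports depend on the configuration, so the snippet below is only a template for a manual comparison:

true_params = {
    "alpha_reward": 0.6,     # learning rate used by AgentQ above
    "alpha_penalty": 0.6,
    "forget_rate": 0.3,
}
for name, value in true_params.items():
    print(f"{name}: ground truth = {value}  (compare with the matching SPICE coefficient)")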
Next Steps
After completing this tutorial, you can:
- Experiment with different parameter values
- Try more complex environments
- Move on to the Rescorla-Wagner with Forgetting tutorial
Common Issues and Solutions
- Poor Convergence: Try increasing the number of epochs or adjusting the learning rate
- Overfitting: Reduce the hidden size or increase the dataset size
- Unstable Training: Adjust the optimizer parameters or reduce the learning rate
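For example, if training looks unstable or fails to converge, a more conservative configuration reuses the constructor arguments from step 5; the values below are only a starting point, not recommended settings:

spice_estimator = SpiceEstimator(
    rnn_class=RescorlaWagnerRNN,
    spice_config=RESCOLA_WAGNER_CONFIG,
    hidden_size=8,
    learning_rate=1e-3,    # lower learning rate for more stable updates
    epochs=32,             # more epochs to compensate for slower learning
    verbose=True
)
spice_estimator.fit(dataset.xs, dataset.ys)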