The ability to process events in their temporal and sequential context is a fundamental skill, made necessary by constant interaction with a dynamic environment. Sequence learning studies have demonstrated that subjects exhibit detailed, and often implicit, sensitivity to the sequential structure of streams of stimuli. Current connectionist models of performance in the Serial Reaction Time (SRT) task, however, fail to capture the fact that sequence learning can be based not only on sensitivity to the sequential associations between successive stimuli, but also on sensitivity to the associations between successive responses, and on the predictive relationships that exist between these sequences of responses and their effects in the environment. In this paper, we offer an initial exploration of an alternative architecture for sequence learning, based on the principles of Forward Models.