GECCO 2003 - LNCS 2723

Limits in Long Path Learning with XCS

Alwyn Barry

Department of Computer Science
University of Bath
Claverton Down
Bath, BA2 7AY, UK
A.M.Barry@bath.ac.uk

Abstract. The development of the XCS Learning Classifier System [26] has produced a stable implementation, able to consistently identify the accurate and optimally general population of classifiers mapping a given reward landscape [15,16,29]. XCS is particularly powerful within direct-reward environments, and notably within problems suitable for commercial application [3]. The application of XCS within delayed-reward environments has also shown promise, although early investigations were within environments with a comparatively short delay to reward (e.g. [28,19]). Subsequent systematic investigations [19,20,1,2] have suggested that XCS has difficulty accurately mapping and exploiting even simple environments with moderate reward delays. This paper summarises these results and presents new results that identify some limits and their implications. A modification to the error computation within XCS is introduced that allows the minimum error parameter to be applied relative to the magnitude of the payoff to each classifier. First results demonstrate that this modification enables XCS to successfully map longer delayed-reward environments.

LNCS 2724, p. 1832 ff.
