The sequence was acquired on-board under normal urban driving conditions. The images are monochrome and of 480X960 pixels. We used a 4mm focal length lens, so providing a wide field of view. We drove during 30 minutes approximately, giving rise to a sequence of around 60,000 frames. Then, using steps of 10 frames we annotated all the pedestrians. This turns out in 7,900 annotated pedestrians, 5,400 reasonable and non occluded. We have divided the video sequence into three sequential parts, the first one for training, the last one for testing, in the middle we have leaved a gap for avoiding testing and training with the same persons. Overall we train with 3,600 reasonable pedestrians, and test on 1,300 reasonable ones.