We address the estimation of passenger flows boarding or alighting from buses. To perform this measurement, we propose a counting system based on stereo vision. To extract three-dimensional information reliably, we use a dense stereo-matching procedure in which a winner-takes-all strategy selects, for each pixel, the disparity that minimizes a correlation score. This score is an improved version of the sum of absolute differences that incorporates several similarity criteria computed on the pixels or regions to be matched. After a disparity map has been computed for each image, morphological operations and binarization with multiple thresholds are used to localize the heads of people passing under the sensor. The markers describing the heads of passengers getting on or off the bus are then tracked through the image sequence to reconstruct their trajectories. Finally, people are counted from these reconstructed trajectories. The proposed technique was validated in several experiments under realistic conditions. We show that counting accuracies of 99% and 97% can be obtained on two large data sets of image sequences showing realistic scenarios.
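As an illustration of the matching step described above, the following minimal sketch computes a dense disparity map by winner-takes-all selection over a plain sum-of-absolute-differences (SAD) score. It does not reproduce the improved score with additional similarity criteria used in this work. The function names (`sad_cost`, `disparity_map`, `synthetic_pair`), the window half-size `half`, and the search range `max_disp` are assumptions made for this example, and the images are assumed to be rectified grayscale arrays stored as lists of rows.

```python
def sad_cost(left, right, row, col, d, half):
    """SAD between the window centred at (row, col) in the left image
    and the window centred at (row, col - d) in the right image."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc] - right[row + dr][col + dc - d])
    return total


def disparity_map(left, right, max_disp, half=1):
    """Winner-takes-all: keep, per pixel, the disparity with the lowest SAD.
    Border pixels that cannot hold a full window are left at 0."""
    rows, cols = len(left), len(left[0])
    disp = [[0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            best_d, best_cost = 0, None
            # Only test disparities that keep the right window inside the image.
            for d in range(0, min(max_disp, c - half) + 1):
                cost = sad_cost(left, right, r, c, d, half)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            disp[r][c] = best_d
    return disp


def synthetic_pair(rows, cols, shift):
    """Hypothetical test data: the left image is the right image shifted
    by `shift` columns, so the true disparity is `shift` everywhere."""
    right = [[c * c + r for c in range(cols)] for r in range(rows)]
    left = [[right[r][c - shift] if c >= shift else 0 for c in range(cols)]
            for r in range(rows)]
    return left, right


left, right = synthetic_pair(rows=5, cols=10, shift=2)
disp = disparity_map(left, right, max_disp=4)
```

On this synthetic pair, every pixel whose window lies entirely in the shifted region is assigned the known shift. In a counting system of the kind described, the resulting disparity map would then feed the morphological and thresholding stage that localizes heads.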