
Multi-agent system for people detection and tracking using stereo vision in mobile robots

Published online by Cambridge University Press:  30 September 2008

R. Muñoz-Salinas*
Department of Computing and Numerical Analysis, University of Córdoba, 14071 Córdoba, Spain.
E. Aguirre
Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain.
M. García-Silvente
Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain.
A. Ayesh
Intelligent Mobile Robots and Creative Computing Research Group, Computer Engineering Division - School of Computing, De Montfort University, Leicester, UK.
M. Góngora
Intelligent Mobile Robots and Creative Computing Research Group, Computer Engineering Division - School of Computing, De Montfort University, Leicester, UK.
*Corresponding author. E-mail:


People detection and tracking are essential capabilities for achieving natural human–robot interaction. Much of the research in this area has focused on monocular techniques, but the use of stereo vision for these purposes is now attracting considerable interest. This paper presents a multi-agent system that implements a basic set of perceptual-motor skills, providing mobile robots with primitive interaction capabilities. The skills use stereo and ultrasound information to enable mobile robots to (i) detect a user who wishes to interact with the robot, (ii) keep track of the user as they move through the environment without confusing them with other people, and (iii) follow the user through the environment while avoiding obstacles in the way. The system has been evaluated in several real-life experiments, achieving good results and real-time performance on modest computers.

Copyright © Cambridge University Press 2008


