
Robot-to-human feedback and automatic object grasping using an RGB-D camera–projector system

  • Jinglin Shen and Nicholas Gans

This paper presents a novel system for human–robot interaction in object-grasping applications. Consisting of an RGB-D camera, a projector and a robot manipulator, the proposed system provides intuitive information to the human by analyzing the scene, detecting graspable objects and directly projecting numbers or symbols in front of objects. Objects are detected using a visual attention model that incorporates color, shape and depth information. The positions and orientations of the projected numbers are based on the shapes, positions and orientations of the corresponding objects. Users select a grasping target by indicating the corresponding number. Projected arrows are then created on the fly to guide a robotic arm to grasp the selected object using visual servoing and deliver the object to the human user. Experimental results are presented to demonstrate how the system is used in robot grasping tasks.
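
The abstract states that each projected number is positioned and oriented according to the shape, position and orientation of its object, but it does not give the computation. The following is a minimal sketch of one plausible way to do this, not the authors' method. It assumes a binary segmentation mask per object is already available, takes the principal axis of the mask pixels as the object orientation, and treats "in front of the object" as "just below the object in the image". The function name `label_pose` and the `offset` parameter are hypothetical.

```python
import numpy as np


def label_pose(mask, offset=10.0):
    """Return (anchor_xy, angle_rad) for a label drawn in front of an object.

    mask   : 2-D boolean array, True on object pixels (assumed input).
    offset : gap in pixels between the object and the label (assumed value).
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask contains no object pixels")

    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)

    if xs.size < 2:
        # A single pixel has no meaningful orientation.
        angle = 0.0
    else:
        # Principal axis = eigenvector of the coordinate covariance
        # with the largest eigenvalue.
        cov = np.cov((pts - centroid).T)
        eigvals, eigvecs = np.linalg.eigh(cov)
        major = eigvecs[:, np.argmax(eigvals)]
        angle = float(np.arctan2(major[1], major[0]))
        # An axis has no direction, so fold the angle into (-pi/2, pi/2]
        # to keep the label from being drawn upside down.
        if angle > np.pi / 2:
            angle -= np.pi
        elif angle <= -np.pi / 2:
            angle += np.pi

    # Assumption: a larger image row is closer to the viewer, so the
    # label sits below the lowest object pixel, centred horizontally.
    anchor = np.array([centroid[0], ys.max() + offset])
    return anchor, angle


if __name__ == "__main__":
    demo = np.zeros((20, 40), dtype=bool)
    demo[5:9, 10:30] = True  # a wide, axis-aligned rectangle
    print(label_pose(demo))
```

In a camera–projector setup the resulting image-space anchor and angle would still have to be mapped into projector coordinates through the system calibration, which this sketch leaves out.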

  • ISSN: 0263-5747
  • EISSN: 1469-8668
  • URL: /core/journals/robotica