Selection of trajectory parameters for dynamic pouring tasks based on exploitation-driven updates of local metamodels

  • Joshua D. Langsfeld, Krishnanand N. Kaipa and Satyandra K. Gupta

We present an approach that allows a robot to generate trajectories for a set of instances of a task using few physical trials. Specifically, we address manipulation tasks that are highly challenging to simulate because of their complex dynamics. Our approach allows a robot to create a model from initial exploratory experiments and subsequently improve it to find trajectory parameters that successfully perform a given task instance. First, in a model generation phase, local models are constructed in the vicinity of previously conducted experiments. These models explain both the behavior of the task function and the estimated divergence of the generated model from the true model when moving within the neighborhood of each experiment. Second, in an exploitation-driven updating phase, the generated models guide parameter selection for a desired task outcome, and the models are updated based on the actual outcome of the task execution. The local models are built within adaptively chosen neighborhoods, thereby allowing the algorithm to capture arbitrarily complex function landscapes. We first validate our approach on a synthetic non-linear function approximation problem, where we also analyze the benefit of the core features of the approach. We then show results with a physical robot performing a dynamic fluid pouring task. The results on the real robot show that the correct pouring parameters for a new pour volume can be learned rapidly, with a limited number of exploratory experiments.
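
The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single scalar trajectory parameter, replaces the physical trial with a hypothetical synthetic function (`run_trial`), fits a plain linear model to a fixed number of nearest experiments instead of using adaptively chosen neighborhoods, and omits the divergence estimate entirely. All function names and constants are illustrative assumptions.

```python
import numpy as np


def run_trial(x):
    """Hypothetical stand-in for one physical trial (assumption, not from the paper).

    Maps a scalar trajectory parameter to a task outcome, e.g. a poured volume.
    """
    return x + 0.3 * np.sin(2.0 * x)


def fit_local_linear(xs, ys, center, k=3):
    """Fit a line to the k experiments whose parameters are nearest to center."""
    nearest = np.argsort(np.abs(xs - center))[:k]
    slope, intercept = np.polyfit(xs[nearest], ys[nearest], 1)
    return slope, intercept


def select_parameter(xs, ys, target, lo=0.0, hi=2.0):
    """Exploitation step: invert the local model around the best experiment so far."""
    center = xs[np.argmin(np.abs(ys - target))]
    slope, intercept = fit_local_linear(xs, ys, center)
    if abs(slope) < 1e-9:
        # A flat local model cannot be inverted; fall back to the best known parameter.
        return float(center)
    return float(np.clip((target - intercept) / slope, lo, hi))


def learn_parameter(target, xs, ys, tol=1e-3, max_trials=20):
    """Alternate between selecting a parameter and updating the data with the outcome."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    for trial in range(1, max_trials + 1):
        x = select_parameter(xs, ys, target)
        y = float(run_trial(x))
        # Model update: the new experiment joins the data used for later local fits.
        xs = np.append(xs, x)
        ys = np.append(ys, y)
        if abs(y - target) < tol:
            return x, y, trial
    best = int(np.argmin(np.abs(ys - target)))
    return float(xs[best]), float(ys[best]), max_trials


# Model generation phase: a few exploratory experiments spread over the parameter range.
initial_x = np.linspace(0.0, 2.0, 5)
initial_y = run_trial(initial_x)

# Exploitation-driven updating phase for a new desired outcome.
x_star, y_star, n_trials = learn_parameter(1.0, initial_x, initial_y)
print(f"parameter={x_star:.4f} outcome={y_star:.4f} trials={n_trials}")
```

Because every executed trial is added to the data, later local fits are drawn from experiments clustered around the desired outcome, which is the intuition behind exploitation-driven updating. The synthetic function here is monotonic, so a single local inversion suffices; a non-monotonic task function would need the neighborhood selection and divergence estimates that the abstract describes.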

  • ISSN: 0263-5747
  • EISSN: 1469-8668
  • URL: /core/journals/robotica