Building machines that learn and think for themselves

Matthew Botvinick, David G. T. Barrett, Peter Battaglia, Nando de Freitas, Darshan Kumaran, Joel Z Leibo, Timothy Lillicrap, Joseph Modayil, Shakir Mohamed, Neil C. Rabinowitz, Danilo J. Rezende, Adam Santoro, Tom Schaul, Christopher Summerfield, Greg Wayne, Theophane Weber, Daan Wierstra, Shane Legg and Demis Hassabis...
Abstract

We agree with Lake and colleagues on their list of “key ingredients” for building human-like intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here, we survey several important examples of the progress that has been made toward building autonomous agents with human-like abilities, and highlight some outstanding challenges.

References
Andrychowicz M., Denil M., Gomez S., Hoffman M. W., Pfau D., Schaul T., Shillingford B. & de Freitas N. (2016). Learning to learn by gradient descent by gradient descent. Presented at the 2016 Neural Information Processing Systems conference, Barcelona, Spain, December 5–10, 2016. In: Advances in neural information processing systems 29 (NIPS 2016), ed. Lee D. D., Sugiyama M., Luxburg U. V., Guyon I. & Garnett R., pp. 3981–89. Neural Information Processing Systems.
Battaglia P., Pascanu R., Lai M. & Rezende D. J. (2016) Interaction networks for learning about objects, relations and physics. Presented at the 2016 Neural Information Processing Systems conference, Barcelona, Spain, December 5–10, 2016. In: Advances in neural information processing systems 29 (NIPS 2016), ed. Lee D. D., Sugiyama M., Luxburg U. V., Guyon I. & Garnett R., pp. 4502–10. Neural Information Processing Systems.
Bellemare M., Srinivasan S., Ostrovski G., Schaul T., Saxton D. & Munos R. (2016) Unifying count-based exploration and intrinsic motivation. Presented at the 2016 Neural Information Processing Systems conference, Barcelona, Spain, December 5–10, 2016. In: Advances in neural information processing systems 29 (NIPS 2016), ed. Lee D. D., Sugiyama M., Luxburg U. V., Guyon I. & Garnett R., pp. 1471–79. Neural Information Processing Systems.
Blundell C., Uria B., Pritzel A., Li Y., Ruderman A., Leibo J. Z., Rae J., Wierstra D. & Hassabis D. (2016) Model-free episodic control. arXiv preprint 1606.04460. Available at: https://arxiv.org/abs/1606.04460.
Botvinick M. M. & Cohen J. D. (2014) The computational and neural basis of cognitive control: Charted territory and new frontiers. Cognitive Science 38:1249–85.
Botvinick M., Weinstein A., Solway A. & Barto A. (2015) Reinforcement learning, efficient coding, and the statistics of natural tasks. Current Opinion in Behavioral Sciences 5:71–77.
Denil M., Agrawal P., Kulkarni T. D., Erez T., Battaglia P. & de Freitas N. (2016). Learning to perform physics experiments via deep reinforcement learning. arXiv preprint 1611.01843. Available at: https://arxiv.org/abs/1611.01843.
Duan Y., Schulman J., Chen X., Bartlett P. L., Sutskever I. & Abbeel P. (2016) RL2: Fast reinforcement learning via slow reinforcement learning. arXiv preprint 1611.02779. Available at: https://arxiv.org/abs/1611.02779.
Eslami S. M., Heess N., Weber T., Tassa Y., Kavukcuoglu K. & Hinton G. E. (2016) Attend, infer, repeat: Fast scene understanding with generative models. Presented at the 2016 Neural Information Processing Systems conference, Barcelona, Spain, December 5–10, 2016. In: Advances in Neural Information Processing Systems 29 (NIPS 2016), ed. Lee D. D., Sugiyama M., Luxburg U. V., Guyon I. & Garnett R., pp. 3225–33. Neural Information Processing Systems Foundation.
Graves A., Wayne G., Reynolds M., Harley T., Danihelka I., Grabska-Barwińska A., Colmenarejo S. G., Grefenstette E., Ramalho T., Agapiou J., Badia A. P., Hermann K. M., Zwols Y., Ostrovski G., Cain A., King H., Summerfield C., Blunsom P., Kavukcuoglu K. & Hassabis D. (2016) Hybrid computing using a neural network with dynamic external memory. Nature 538(7626):471–76.
Hamrick J. B., Ballard A. J., Pascanu R., Vinyals O., Heess N. & Battaglia P. W. (2017) Metacontrol for adaptive imagination-based optimization. In: Proceedings of the 5th International Conference on Learning Representations (ICLR).
Hochreiter S., Younger A. S. & Conwell P. R. (2001) Learning to learn using gradient descent. In: International Conference on Artificial Neural Networks (ICANN 2001), ed. Dorffner G., Bischoff H. & Hornik K., pp. 87–94. Springer.
Kahneman D. (2011) Thinking, fast and slow. Macmillan.
Krizhevsky A., Sutskever I. & Hinton G. E. (2012). ImageNet classification with deep convolutional neural networks. Presented at the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, December 3–6, 2012. In: Advances in Neural Information Processing Systems 25 (NIPS 2012), ed. Pereira F., Burges C. J. C., Bottou L. & Weinberger K. Q., pp. 1097–105. Neural Information Processing Systems Foundation.
Lake B. M., Lawrence N. D. & Tenenbaum J. B. (2016) The emergence of organizing structure in conceptual representation. arXiv preprint 1611.09384. Available at: http://arxiv.org/abs/1611.09384.
Lake B. M., Salakhutdinov R. & Tenenbaum J. B. (2015a) Human-level concept learning through probabilistic program induction. Science 350(6266):1332–38.
Mnih V., Kavukcuoglu K., Silver D., Rusu A. A., Veness J., Bellemare M. G., Graves A., Riedmiller M., Fidjeland A. K., Ostrovski G., Petersen S., Beattie C., Sadik A., Antonoglou I., King H., Kumaran D., Wierstra D. & Hassabis D. (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–33.
Ranzato M., Szlam A., Bruna J., Mathieu M., Collobert R. & Chopra S. (2016) Video (language) modeling: A baseline for generative models of natural videos. arXiv preprint 1412.6604. Available at: https://arxiv.org/abs/1412.6604.
Raposo D., Santoro A., Barrett D. G. T., Pascanu R., Lillicrap T. & Battaglia P. (2017) Discovering objects and their relations from entangled scene representations. Presented at the Workshop Track at the International Conference on Learning Representations, Toulon, France, April 24–26, 2017. arXiv preprint 1702.05068. Available at: https://openreview.net/pdf?id=Bk2TqVcxe.
Ravi S. & Larochelle H. (2017) Optimization as a model for few-shot learning. Presented at the International Conference on Learning Representations, Toulon, France, April 24–26, 2017. Available at: https://openreview.net/pdf?id=rJY0-Kcll.
Reed S. & de Freitas N. (2016) Neural programmer-interpreters. Presented at the 4th International Conference on Learning Representations (ICLR), San Juan, Puerto Rico, May 2–5, 2016. arXiv preprint 1511.06279. Available at: https://arxiv.org/abs/1511.06279.
Rezende D. J., Mohamed S., Danihelka I., Gregor K. & Wierstra D. (2016) One-shot generalization in deep generative models. Presented at the International Conference on Machine Learning, New York, NY, June 20–22, 2016. Proceedings of Machine Learning Research 48:1521–29.
Santoro A., Bartunov S., Botvinick M., Wierstra D. & Lillicrap T. (2016). Meta-learning with memory-augmented neural networks. Presented at the 33rd International Conference on Machine Learning, New York, NY, June 19–24, 2016. Proceedings of Machine Learning Research 48:1842–50.
Schaul T., Quan J., Antonoglou I. & Silver D. (2016) Prioritized experience replay. Presented at the International Conference on Learning Representations (ICLR), San Diego, CA, May 7–9, 2015. arXiv preprint 1511.05952. Available at: https://arxiv.org/abs/1511.05952.
Silver D., Huang A., Maddison C. J., Guez A., Sifre L., Driessche G. V. D., Schrittwieser J., Antonoglou I., Panneershelvam V., Lanctot M., Dieleman S., Grewe D., Nham J., Kalchbrenner N., Sutskever I., Lillicrap T., Leach M., Kavukcuoglu K., Graepel T. & Hassabis D. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7585):484–89.
Silver D., van Hasselt H., Hessel M., Schaul T., Guez A., Harley T., Dulac-Arnold G., Reichert D., Rabinowitz N., Barreto A. & Degris T. (2017) The predictron: End-to-end learning and planning. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, ed. Balcan M. F. & Weinberger K. Q.
van den Oord A., Kalchbrenner N. & Kavukcuoglu K. (2016). Pixel recurrent neural networks. Presented at the 33rd International Conference on Machine Learning, New York, NY. Proceedings of Machine Learning Research 48:1747–56.
Vinyals O., Blundell C., Lillicrap T., Kavukcuoglu K. & Wierstra D. (2016). Matching networks for one shot learning. Presented at the 2016 Neural Information Processing Systems conference, Barcelona, Spain, December 5–10, 2016. In: Advances in Neural Information Processing Systems 29 (NIPS 2016), ed. Lee D. D., Sugiyama M., Luxburg U. V., Guyon I. & Garnett R., pp. 3630–38. Neural Information Processing Systems Foundation.
Wang J. X., Kurth-Nelson Z., Tirumala D., Soyer H., Leibo J. Z., Munos R., Blundell C., Kumaran D. & Botvinick M. (2017). Learning to reinforcement learn. Presented at the 39th Annual Meeting of the Cognitive Science Society, London, July 26–29, 2017. arXiv preprint 1611.05763. Available at: https://arxiv.org/abs/1611.05763.
Behavioral and Brain Sciences
  • ISSN: 0140-525X
  • EISSN: 1469-1825
  • URL: /core/journals/behavioral-and-brain-sciences