
19 - Large-Scale FPGA-Based Convolutional Networks

from Part Four - Applications

Published online by Cambridge University Press: 05 February 2012

Clément Farabet, New York University
Yann LeCun, New York University
Koray Kavukcuoglu, NEC Labs America, Princeton, NJ, USA
Berin Martini, Yale University
Polina Akselrod, Yale University
Selcuk Talay, Yale University
Eugenio Culurciello, Yale University

Edited by Ron Bekkerman, LinkedIn Corporation, Mountain View, California; Mikhail Bilenko, Microsoft Research, Redmond, Washington; John Langford, Yahoo! Research, New York

Summary

Micro-robots, unmanned aerial vehicles, imaging sensor networks, wireless phones, and other embedded vision systems all require low-cost, high-speed implementations of synthetic vision systems capable of recognizing and categorizing objects in a scene.

Many successful object recognition systems use dense features extracted on regularly spaced patches over the input image. The majority of the feature extraction systems have a common structure composed of a filter bank (generally based on oriented edge detectors or 2D Gabor functions), a nonlinear operation (quantization, winner-take-all, sparsification, normalization, and/or pointwise saturation), and finally a pooling operation (max, average, or histogramming). For example, the scale-invariant feature transform (SIFT) (Lowe, 2004) operator applies oriented edge filters to a small patch and determines the dominant orientation through a winner-take-all operation. Finally, the resulting sparse vectors are added (pooled) over a larger patch to form a local orientation histogram. Some recognition systems use a single stage of feature extractors (Lazebnik, Schmid, and Ponce, 2006; Dalal and Triggs, 2005; Berg, Berg, and Malik, 2005; Pinto, Cox, and DiCarlo, 2008).
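
The pipeline just described (filter bank, pointwise nonlinearity, pooling) can be made concrete with a short sketch. The following Python/NumPy code is illustrative only and is not the implementation discussed in this chapter; the 3x3 oriented-edge filters, the tanh saturation, and the 4x4 average pooling are assumptions made for the example.

import numpy as np
from scipy.signal import convolve2d

def feature_extract(image, filters, pool=4):
    """One generic feature-extraction stage: filter bank -> nonlinearity -> pooling."""
    maps = []
    for f in filters:
        r = convolve2d(image, f, mode="valid")      # filter bank (e.g., oriented edge detectors)
        r = np.tanh(r)                              # pointwise saturating nonlinearity
        h, w = r.shape
        r = r[: h - h % pool, : w - w % pool]       # crop so the map tiles evenly
        r = r.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))  # average pooling
        maps.append(r)
    return np.stack(maps)                           # one pooled feature map per filter

# Example: a tiny bank of two oriented-edge filters applied to a random 32x32 image.
filters = [np.array([[1.0, 0.0, -1.0]] * 3),        # responds to vertical edges
           np.array([[1.0, 0.0, -1.0]] * 3).T]      # responds to horizontal edges
features = feature_extract(np.random.rand(32, 32), filters)
print(features.shape)                               # (2, 7, 7): two filters, 28x28 maps pooled 4x4

Swapping the average pooling for a max, or the tanh for a winner-take-all or normalization step, recovers the other variants mentioned above (including the SIFT-style orientation histogram) without changing the overall three-step structure.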

Other models, such as HMAX-type models (Serre, Wolf, and Poggio, 2005; Mutch and Lowe, 2006) and convolutional networks, use two or more layers of successive feature extractors. Different training algorithms have been used for learning the parameters of convolutional networks. In LeCun et al. (1998b) and Huang and LeCun (2006), pure supervised learning is used to update the parameters. However, recent works have focused on training with an auxiliary task (Ahmed et al., 2008) or using unsupervised objectives (Ranzato et al., 2007b; Kavukcuoglu et al., 2009; Jarrett et al., 2009; Lee et al., 2009).
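
As a companion to the sketch above, the following hedged example shows how two such stages can be stacked into a multi-stage feature extractor of the kind this paragraph refers to. The random filter banks and the summing of first-stage maps (in place of a learned connection table) are simplifying assumptions; in a convolutional network the filters would be learned, either by supervised gradient descent or with an unsupervised objective.

# Reuses feature_extract() from the previous sketch; filters here are random placeholders.
rng = np.random.default_rng(0)
stage1_filters = [rng.standard_normal((3, 3)) for _ in range(4)]
stage2_filters = [rng.standard_normal((3, 3)) for _ in range(8)]

x = rng.random((64, 64))                            # stand-in for an input image
s1 = feature_extract(x, stage1_filters, pool=2)     # stage 1: 4 pooled feature maps
s2 = feature_extract(s1.sum(axis=0), stage2_filters, pool=2)  # stage 2 on the combined maps
print(s1.shape, s2.shape)                           # (4, 31, 31) (8, 14, 14)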

Type: Chapter
Book: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 399–419
Publisher: Cambridge University Press
Print publication year: 2011


References

Adams, D. A. 1969. A Computation Model with Data Flow Sequencing. Ph.D. thesis, Stanford University.
Ahmed, A., Yu, K., Xu, W., Gong, Y., and Xing, E. 2008. Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks. In: ECCV. New York: Springer.
Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H. 2007. Greedy Layer-Wise Training of Deep Networks. In: NIPS.
Berg, A. C., Berg, T. L., and Malik, J. 2005. Shape Matching and Object Recognition Using Low Distortion Correspondences. In: CVPR.
Chellapilla, K., Shilman, M., and Simard, P. 2006. Optimally Combining a Cascade of Classifiers. In: Proceedings of Document Recognition and Retrieval 13, Electronic Imaging, 6067.
Cho, M. H., Cheng, C.-C., Kinsy, M., Suh, G. E., and Devadas, S. 2008. Diastolic Arrays: Throughput-Driven Reconfigurable Computing.
Coates, A., Baumstarck, P., Le, Q., and Ng, A. Y. 2009. Scalable Learning for Object Detection with GPU Hardware. Pages 4287–4293 of: Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.
Collobert, R. 2008. Torch. Presented at the Workshop on Machine Learning Open Source Software, NIPS.
Dalal, N., and Triggs, B. 2005. Histograms of Oriented Gradients for Human Detection. In: CVPR.
Delakis, M., and Garcia, C. 2008. Text Detection with Convolutional Neural Networks. In: International Conference on Computer Vision Theory and Applications (VISAPP 2008).
Dennis, J. B., and Misunas, D. P. 1974. A Preliminary Architecture for a Basic Data-Flow Processor. SIGARCH Computer Architecture News, 3(4), 126–132.
Farabet, C., Poulet, C., Han, J. Y., and LeCun, Y. 2009. CNP: An FPGA-Based Processor for Convolutional Networks. In: International Conference on Field Programmable Logic and Applications (FPL'09). Prague: IEEE.
Farabet, C., Martini, B., Akselrod, P., Talay, S., LeCun, Y., and Culurciello, E. 2010. Hardware Accelerated Convolutional Neural Networks for Synthetic Vision Systems. In: International Symposium on Circuits and Systems (ISCAS'10). Paris: IEEE.
Frome, A., Cheung, G., Abdulkader, A., Zennaro, M., Wu, B., Bissacco, A., Adam, H., Neven, H., and Vincent, L. 2009. Large-Scale Privacy Protection in Street-Level Imagery. In: ICCV'09.
Fukushima, K., and Miyake, S. 1982. Neocognitron: A New Algorithm for Pattern Recognition Tolerant of Deformations and Shifts in Position. Pattern Recognition, 15(6), 455–469.
Garcia, C., and Delakis, M. 2004. Convolutional Face Finder: A Neural Architecture for Fast and Robust Face Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Hadsell, R., Sermanet, P., Scoffier, M., Erkan, A., Kavukcuoglu, K., Muller, U., and LeCun, Y. 2009. Learning Long-Range Vision for Autonomous Off-Road Driving. Journal of Field Robotics, 26(2), 120–144.
Hicks, J., Chiou, D., Ang, B. S., and Arvind. 1993. Performance Studies of Id on the Monsoon Dataflow System.
Hinton, G. E., and Salakhutdinov, R. R. 2006. Reducing the Dimensionality of Data with Neural Networks. Science.
Huang, F.-J., and LeCun, Y. 2006. Large-Scale Learning with SVM and Convolutional Nets for Generic Object Categorization. In: Proceedings of Computer Vision and Pattern Recognition Conference (CVPR'06). IEEE.
Jain, V., and Seung, H. S. 2008. Natural Image Denoising with Convolutional Networks. In: Advances in Neural Information Processing Systems 21 (NIPS 2008). Cambridge, MA: MIT Press.
Jarrett, K., Kavukcuoglu, K., Ranzato, M. A., and LeCun, Y. 2009. What Is the Best Multi-Stage Architecture for Object Recognition? In: Proceedings of International Conference on Computer Vision (ICCV'09). IEEE.
Kavukcuoglu, K., Ranzato, M. A., and LeCun, Y. 2008. Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition. Technical Report CBLL-TR-2008-12-01.
Kavukcuoglu, K., Ranzato, M. A., Fergus, R., and LeCun, Y. 2009. Learning Invariant Features through Topographic Filter Maps. In: Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR'09). IEEE.
Kung, H. T. 1986. Why Systolic Architectures? 300–309.
Gaudiot, J. L., Bic, L., Dennis, J., and Dennis, J. B. 1994. Stream Data Types for Signal Processing. In: Advances in Dataflow Architecture and Multithreading. IEEE.
Lazebnik, S., Schmid, C., and Ponce, J. 2006. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Pages 2169–2178 of: Proceedings of Computer Vision and Pattern Recognition. IEEE.
LeCun, Y., and Bottou, L. 2002. Lush Reference Manual. Technical report; code available at http://lush.sourceforge.net.
LeCun, Y., and Cortes, C. 1998. MNIST Dataset. http://yann.lecun.com/exdb/mnist/.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., and Jackel, L. D. 1989. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., and Jackel, L. D. 1990. Handwritten Digit Recognition with a Back-Propagation Network. In: NIPS'89.
LeCun, Y., Bottou, L., Orr, G., and Muller, K. 1998a. Efficient BackProp. In: Orr, G., and Muller, K. (eds), Neural Networks: Tricks of the Trade. New York: Springer.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. 1998b. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11), 2278–2324.
LeCun, Y., Huang, F.-J., and Bottou, L. 2004. Learning Methods for Generic Object Recognition with Invariance to Pose and Lighting. In: Proceedings of CVPR'04. IEEE.
Lee, E. A., and Messerschmitt, D. G. 1987. Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing. IEEE Transactions on Computers, 36, 24–35.
Lee, H., Grosse, R., Ranganath, R., and Ng, A. Y. 2009. Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. In: Proceedings of the 26th International Conference on Machine Learning (ICML'09).
Lowe, D. G. 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision.
Lyu, S., and Simoncelli, E. P. 2008. Nonlinear Image Representation Using Divisive Normalization. In: CVPR.
Mozer, M. C. 1991. The Perception of Multiple Objects: A Connectionist Approach. Cambridge, MA: MIT Press.
Mutch, J., and Lowe, D. G. 2006. Multiclass Object Recognition with Sparse, Localized Features. In: CVPR.
Nasse, F., Thurau, C., and Fink, G. A. 2009. Face Detection Using GPU-Based Convolutional Neural Networks.
Ning, F., Delhomme, D., LeCun, Y., Piano, F., Bottou, L., and Barbano, P. 2005. Toward Automatic Phenotyping of Developing Embryos from Videos. IEEE Transactions on Image Processing, Special Issue on Molecular and Cellular Bioimaging.
Nowlan, S., and Platt, J. 1995. A Convolutional Neural Network Hand Tracker. Pages 901–908 of: Neural Information Processing Systems. San Mateo, CA: Morgan Kaufmann.
Olshausen, B. A., and Field, D. J. 1997. Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by V1? Vision Research.
Osadchy, M., LeCun, Y., and Miller, M. 2007. Synergistic Face Detection and Pose Estimation with Energy-Based Models. Journal of Machine Learning Research, 8(May), 1197–1215.
Pinto, N., Cox, D. D., and DiCarlo, J. J. 2008. Why Is Real-World Visual Object Recognition Hard? PLoS Computational Biology, 4(1), e27.
Ranzato, M. A., Boureau, Y.-L., and LeCun, Y. 2007a. Sparse Feature Learning for Deep Belief Networks. In: NIPS'07.
Ranzato, M. A., Huang, F.-J., Boureau, Y.-L., and LeCun, Y. 2007b. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition. In: Proceedings of Computer Vision and Pattern Recognition Conference (CVPR'07). IEEE.
Serre, T., Wolf, L., and Poggio, T. 2005. Object Recognition with Features Inspired by Visual Cortex. In: CVPR.
Simard, P. Y., Steinkraus, D., and Platt, J. C. 2003. Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis. In: ICDAR.
Vaillant, R., Monrocq, C., and LeCun, Y. 1994. Original Approach for the Localisation of Objects in Images. IEEE Proceedings on Vision, Image, and Signal Processing, 141(4), 245–250.
Weston, J., Ratle, F., and Collobert, R. 2008. Deep Learning via Semi-Supervised Embedding. In: ICML.
