Sensification of computing: adding natural sensing and perception capabilities to machines

Published online by Cambridge University Press:  18 January 2017

Achintya K. Bhowmik*
Affiliation:
Intel Corporation, Santa Clara, California, USA
*Corresponding author: A.K. Bhowmik, achintya.k.bhowmik@intel.com

Abstract

The world of intelligent and interactive systems is undergoing an era of unprecedented innovation and advanced development. With the rapid progress in natural sensing and perceptual computing technologies, devices and machines are increasingly being endowed with the abilities to sense and understand the world, navigate in the environment, and interact with humans in natural ways. Interfaces based on touch sensing and speech recognition are now ubiquitous, and the race is on to the next frontiers of machine intelligence and interactions based on three-dimensional (3D) sensing. In this paper, we review the recent progress in the development of real-time 3D-sensing technologies and their deployment in a new class of interactive and autonomous systems. As an example of a commercially available platform, we describe the Intel® RealSense technologies and their deployments in such systems and applications.

Information

Type
Industrial Technology Advances
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Authors, 2017

Fig. 1. Human visual system: left figure shows the construction of the human eye, right figure shows the binocular 3D imaging scheme. The depth or distance of the objects in the scene is discerned from the binocular disparity along with other visual cues such as motion parallax, occlusion, focus, etc. [1].

Fig. 2. The semi-circular canals in the inner ear, along with otoliths, form the sensors of the human vestibular sensing system. While the ears and the auditory cortex help us sense and understand sound, the vestibular system provides important cues on movement, orientation, and balance [1].

Fig. 3. Functional block diagram of an interactive system. The inputs and the actions modules orchestrate the interactions between the system and the world or user, while the signal processing and computing functions facilitate these interactions.

Fig. 4. Basics of the stereo-3D imaging method, illustrated with the simple case of a parallel, calibrated camera pair with optical centers at C1 and C2, separated by a baseline distance L. The point P in the 3D world is imaged at points A and B on the left and the right cameras, respectively. A′ on the right image plane corresponds to the point A on the left image plane. The distance between B and A′ along the epipolar line is called the binocular disparity, Δ, which can be shown to be inversely proportional to the distance of the point P from the baseline [1].
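
The inverse relation between disparity and depth described in Fig. 4 can be made concrete with a short sketch. For an idealized parallel, calibrated camera pair, similar triangles give Z = f·L/Δ, where f is the focal length expressed in pixels. The focal length, the function name, and the sample numbers below are assumptions introduced for illustration; they do not come from the caption.

```python
def depth_from_disparity(disparity_px: float,
                         focal_length_px: float,
                         baseline_m: float) -> float:
    """Depth Z (metres) of a point seen by an ideal parallel stereo pair.

    Uses Z = f * L / disparity. Parameter names are illustrative only.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical focal length (pixels) and baseline (metres).
    f_px, baseline = 600.0, 0.07
    for d in (10.0, 20.0, 40.0):
        z = depth_from_disparity(d, f_px, baseline)
        print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

Doubling the disparity halves the computed depth, which is why depth resolution in stereo systems degrades for distant points where disparities become small.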

Fig. 5. Principles of a projected structured-light 3D image capture method [4]. (a) An illumination pattern is projected onto the scene and the reflected image is captured by a camera. The depth of a point is determined from its relative displacement between the projected pattern and the captured image. (b) An illustrative example of a projected stripe pattern. In practical applications, IR light is typically used with more complex patterns. (c) The captured image of the stripe pattern reflected from the 3D object.

Fig. 6. Principles of 3D imaging using the time-of-flight-based range measurement technique [1]. The solid sinusoidal curve is the amplitude-modulated IR light that is emitted onto the scene by a source, and the dashed curve is the reflected signal that is detected by an imaging device. Note that the reflected signal is attenuated and phase-shifted by an angle ϕ relative to the emitted signal, and includes a background signal that is assumed to be constant. The distance or the depth map is determined using the phase shift and the modulation wavelength.
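
The relation between phase shift and distance in Fig. 6 can be sketched as follows. Because the light travels to the scene and back, a phase shift ϕ at modulation frequency f corresponds to a distance d = c·ϕ/(4π·f). The four-sample phase estimate shown here is one common demodulation scheme and is an assumption, not something stated in the caption; the function names, the signal model, and the 30 MHz example frequency are likewise illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def estimate_phase(s0: float, s1: float, s2: float, s3: float) -> float:
    """Phase shift (radians, in [0, 2*pi)) from four samples 90 degrees apart.

    Assumes samples of the form s_k = B + A*cos(phi - k*pi/2). The constant
    background B cancels in both differences, and the amplitude A cancels
    in the ratio taken by atan2.
    """
    return math.atan2(s1 - s3, s0 - s2) % (2.0 * math.pi)


def phase_to_distance(phase_rad: float, modulation_hz: float) -> float:
    """Distance in metres; 4*pi (not 2*pi) accounts for the round trip."""
    return C * phase_rad / (4.0 * math.pi * modulation_hz)


def unambiguous_range(modulation_hz: float) -> float:
    """Largest distance measurable before the phase wraps around."""
    return C / (2.0 * modulation_hz)


if __name__ == "__main__":
    f_mod = 30e6                                  # hypothetical modulation frequency
    true_phase, amp, background = 1.0, 0.5, 2.0   # simulated reflected signal
    samples = [background + amp * math.cos(true_phase - k * math.pi / 2)
               for k in range(4)]
    phi = estimate_phase(*samples)
    print(f"estimated phase: {phi:.4f} rad")
    print(f"distance: {phase_to_distance(phi, f_mod):.3f} m")
    print(f"unambiguous range: {unambiguous_range(f_mod):.3f} m")
```

A higher modulation frequency improves distance resolution per unit of phase but shortens the unambiguous range, so practical systems have to balance the two.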

Fig. 7. Intel® RealSense camera modules. The top figure shows the F200 version, based on the coded-light 3D-imaging technique, whereas the bottom figure shows the R200 version, based on the stereo-3D imaging technique. The imaging processors consist of power-efficient hardware for 3D computation and processing.

Fig. 8. A pair of RGB-D images captured with an Intel® RealSense camera. The left figure shows a color image, while the right figure shows the corresponding pseudo-colored depth image, where the nearer points are shown in bluer colors and farther points are shown in redder colors. The background objects that lie beyond the range of the depth sensor are shown in dark blue.
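
A pseudo-color encoding of the kind described in Fig. 8 can be sketched with a simple linear blue-to-red ramp. This is only an illustration of the idea: the actual colormap used for the figure is not specified, and the function name, the range limits, and the exact dark-blue value are assumptions. The sketch also treats readings nearer than the minimum range as out of range, which is an additional assumption.

```python
OUT_OF_RANGE_RGB = (0, 0, 64)  # dark blue; an arbitrary illustrative choice


def depth_to_rgb(depth_m: float, near_m: float, far_m: float) -> tuple:
    """Map a depth value to an (R, G, B) triple with components in 0..255.

    Nearer points come out bluer and farther points redder; values outside
    [near_m, far_m] are drawn in dark blue.
    """
    if not (near_m <= depth_m <= far_m):
        return OUT_OF_RANGE_RGB
    t = (depth_m - near_m) / (far_m - near_m)  # 0 at near limit, 1 at far limit
    return (int(255 * t), 0, int(255 * (1.0 - t)))


if __name__ == "__main__":
    # Hypothetical sensor range of 0.5 m to 3.5 m.
    for d in (0.5, 2.0, 3.5, 10.0):
        print(f"{d:4.1f} m -> {depth_to_rgb(d, 0.5, 3.5)}")
```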

Fig. 9. Examples of 3D computer vision middleware libraries included in the RealSense software development kit. Top left: 3D hand skeleton tracking; top right: face detection and tracking; bottom left: 3D background segmentation; bottom right: 3D scanning and reconstruction.

Fig. 10. Examples of commercially available computers with embedded RealSense technologies. Left: an interactive all-in-one desktop computer; right: a 2-in-1 laptop/tablet device.

Fig. 11. Dense 3D reconstruction of an office environment captured with a mobile device incorporating RealSense technology. Real-time depth imaging with a high-density point cloud allows rapid reconstruction of 3D spaces, objects, and humans.

Fig. 12. Real-time 3D spatial tracking with six degrees of freedom using visual-inertial odometry. The image in the middle shows the view from the fish-eye camera, and the image on the left shows the 2D view map traced while navigating within a 3D space. The image on the right shows large-scale 3D mapping and navigation spanning an entire floor of an office building.

Fig. 13. Examples of autonomous robots equipped with RealSense technology. Left: a hotel butler robot from Savioke; middle: a Segway personal transporter robot from Ninebot; right: a personal assistant home robot from Asus.

Fig. 14. The left image shows the Yuneec Typhoon H drone with an integrated RealSense device, as demonstrated at CES 2016. The right image shows a demonstration of real-time automatic collision avoidance as the drone follows a person biking through trees.

Fig. 15. The left image shows an interactive mixed-reality device incorporating RealSense and visual-inertial spatial motion tracking technology. The image on the right shows an example of the mixed-reality capability of the device, where 3D images of the user's hands, as well as of a person standing in front of the user, are brought into the virtual world. This capability also helps the user avoid colliding with physical objects.

Fig. 16. Augmentation of the real physical world with virtually rendered 3D objects using a device with an embedded RealSense module. Here a digitally rendered car is shown racing on a real kitchen table and colliding with a physical bowl, with realistic physical effects such as collision with real objects, correct occlusion, and shadows.