Visual servoing of a laser beam through a mirror

In this paper, we present a new approach to improving vocal fold access during phonomicrosurgery: the laser is shot through a mirror to reach the hidden parts of the vocal fold. A geometrical study of the laser path under the anatomical constraints of the vocal fold was conducted, followed by a conceptual design of the laser-shooting system. Control laws were developed, tested in simulation, and validated experimentally on a test bench in monocular and stereoscopic configurations. Simulation and experimental results are provided to demonstrate the effectiveness of the developed approach.


Introduction
The demand for better health care has driven much research, including research on phonomicrosurgery, which involves delicate surgical operations on the vocal fold and requires a highly skilled surgeon [1], [2]. Vocal fold surgery demands precision and accuracy because the tissue being resected is thin, fragile, and viscous, and its lesions may be smaller than 1 mm [3], [4]. The most common procedure for resecting those lesions relies on laser surgery. Systems for laser phonosurgery, such as Acupulse Duo by Lumenis [5], are based on a laser manipulator mounted on an external microscope. The patient is placed in extreme neck extension so that a rigid, straight laryngoscope can be inserted through the mouth and throat, giving a direct line of sight between the laser manipulator and the vocal fold in the larynx. However, with this placement certain portions of the vocal fold, such as the lateral and posterior sides, are inaccessible because the laser source is located outside the patient's body. The laser beam can only be moved over a small area inside the laryngoscope, which prevents the surgeon from operating on those portions. Another system, a flexible endoluminal robotic system, was developed during the European project µRALP [6]. That concept uses a miniaturized laser manipulator, with micro-cameras embedded at the endoscope tip, and shoots the laser from within the larynx. Nevertheless, because the laser source is placed above the vocal fold and light travels only along straight lines, the surgeon could operate on the anterior vocal fold but had hardly any access to the lateral sides and, to the best of our knowledge, no access to the posterior side. The µRALP project also proposed improving laser steering accuracy by automatically controlling the laser [7] to follow a surgeon-drawn path [8], rather than having the surgeon steer the laser beam manually with a poorly ergonomic joystick. This automatic control is achieved by visually servoing the laser spot observed by one [7] or two [9], [10] endoscopic cameras.
Several related works are reported in the literature, notably a visually guided laser ablation catheter [11], which was designed to let the operator directly visualize the target tissue and then deliver laser energy to perform point-to-point circumferential ablation. Velocity-independent visual path following for laser surgery was explored in [12], using nonholonomic control of a unicycle model and high-frequency path following to satisfy the constraints of laser-tissue interaction. Another example is reported in [13], which investigated a robotic system for skin photo-rejuvenation that uniformly delivers laser thermal stimulation to a subject's facial skin tissue. Yet, as far as we understand, none of those works automatically steered a laser along hidden paths.
Visual servoing techniques use visual information extracted from images to design a control law [14], [15], [16]. It is a systematic way to control a robot using the information provided by one or multiple cameras [17]. Standard stereo sensors used for visual servoing have a limited field of view, which limits their application range. Planar mirrors have therefore been used to enlarge the field of view of classic pinhole cameras [18] and for high-speed gaze control [19], [20]. A planar mirror is a mirror with a planar reflective surface: the image of an object in front of it appears to be behind the mirror plane. This image is equivalent to one captured by a virtual camera located behind the mirror, whose position is symmetric to that of the real camera. For instance, stereo images have been captured by using mirror reflections of a scene [21]. In our case, the camera's reflection in the planar mirror is used to track virtual feature points in the hidden scene of the vocal fold.
Tracking is the problem of estimating the trajectory of an object in the image plane as it moves around a scene. A tracker assigns consistent labels to the tracked objects in different video frames. In addition, depending on the tracking field, a tracker can provide object-centric information, such as the area or shape of an object. Simple video tracking algorithms rely on selecting, in the first frame, the region of interest associated with the moving objects. Tracking algorithms can be classified into three categories: point tracking [22], [23], kernel tracking [24], [25], and silhouette tracking [26].
Occlusion can significantly undermine the performance of object tracking algorithms. It often occurs when two or more objects come too close together and seemingly merge. Image processing systems with object tracking often track occluded objects wrongly [27]; after occlusion, the system wrongly identifies the initially tracked objects as new objects [28]. If the geometry and placement of static objects in the real surroundings are known, the so-called phantom model is a common approach for handling the occlusion of virtual objects. A method for detecting a dynamic occlusion in front of static backgrounds is described in [29]. This algorithm does not require any previous knowledge about the occluding objects but relies on a textured graphical model of planar elements in the scene. Some approaches solve the occlusion problem using depth information delivered by stereo matching [30]. In our approach to the occlusion problem, we use triangulation: pixels that are well reconstructed are kept, and those that are not well reconstructed are ignored.
This paper focuses on a conceptual method for servoing a laser to hidden parts of the vocal fold. Inspired by this clinical need for improved access and by those works on mirror reflections, we analyse the anatomical constraints of the vocal fold, devise a conceptual design, and formulate a controller, which is evaluated experimentally on a tabletop setup. The first contribution is a method to access parts of the vocal fold workspace that are not directly visible during phonomicrosurgery, for instance the posterior side of the vocal fold. An auxiliary mirror is used both to see beyond the limited field of view of the micro-cameras of a flexible endoscopy system, a capability that was missing in [7], and to shoot the surgical laser at those invisible parts of the vocal fold workspace. The second contribution is to derive the control equations for automatically steering the laser through the auxiliary mirror to the hidden parts of the vocal fold by updating the control in [9] and [10]. However, modelling, simulation, and experimentation show that the addition of the auxiliary mirror has no impact on the controller, which can therefore be used as it is. Nonetheless, we took the opportunity of this study to derive a variant of the control in [9] based on the geodesic error (cross-product) rather than on the linear error. Figure 1 shows a sample simulated image of the vocal fold.
The remainder of this paper is organized as follows. Section 2 gives a detailed description of the conceptual system design to access parts of the vocal fold. Section 3 deals with modelling the proposed system in both the monocular and stereoscopic cases to establish a controller. Section 4 focuses on the simulation results of the controller for both the monocular and stereoscopic cases. Section 5 presents the experimental validations performed on a tabletop setup.

System configuration
As illustrated in figure 2, the objective is to devise a method that improves access to hidden parts of the vocal fold. We propose a system with two cameras that give a stereo view and visual feedback of the scene, a laser source that provides the surgical laser needed for tissue ablation, illumination guidance, an auxiliary mirror guide, and an auxiliary mirror manipulator through which the laser is steered to hidden parts of the vocal fold. In practice, human tissues never come into contact with the designed micro-robotic device, only with the outer shell of the endoscope, which can readily be made sterilizable and biocompatible. The auxiliary mirror would be inserted at the beginning of the surgical procedure and remain stationary until the end of surgery. All those parts must be miniaturized and enclosed in a flexible endoscope during fabrication, which is outside the scope of this paper. However, details on packing all those miniaturized hardware components into an endoscope can be found in [31].
Figure 2. System configuration

System model for accessing hidden parts of vocal fold
Figure 3 enlarges the distal arrangement of the flexible endoscope from the system configuration above and focuses on how hidden features of the vocal fold are accessed. The system has a micro-robot, a tip/tilt actuated mirror, that steers the laser through an auxiliary mirror to reach hidden parts. The two cameras also observe the same hidden scene through a mirror reflection. This gives the surgeon a clear view of the scene in which to define a trajectory that the surgical laser then follows automatically in those hidden parts.

Enlarged anterior, lateral and posterior view for accessing hidden parts of vocal fold
The former configuration of figure 2 has a limited field of view, even on the anterior parts of the vocal fold, because of technical constraints (miniaturization for endoluminal systems, direct line of sight for extracorporeal systems). The proposed method for accessing hidden parts on the anterior side of the vocal fold is therefore shown in figure 4(). Depending on the orientation and position of the auxiliary mirror, the surgical laser can be steered to all parts of the anterior side of the vocal fold, which is the surface of the vocal fold that is visible from the larynx. For instance, the laser can reach parts outside the direct field of view (in yellow). In figure 4(), the surgical laser is first shot at the auxiliary mirror and then reflected towards the tissues located above the vocal fold in the larynx, such as the ventricular fold. As demonstrated in figure 4(), the rear portions of the vocal fold, that is, the surface visible from the trachea, can be accessed by opening the vocal lips with forceps, exposing the back side so that the auxiliary mirror can be pushed in and oriented.
Epipolar lines in right and left images, respectively

Mirror reflection
From a technical point of view, these control equations also differ from the already published ones [7], [9], [10] by the use of an alternative formulation of the perspective projection model and by the servoing of geodesic image errors instead of linear image errors. Consider the reflection of a point $\mathbf{X}$ into $\bar{\mathbf{X}}$ through the auxiliary mirror plane $\Pi = (\mathbf{n}^\top, d)^\top$, where $\mathbf{n}$ is the vector normal to the mirror plane and $d$ is the distance from the reference frame origin to the mirror plane. Using homogeneous coordinates for the points and following [32], one has $\bar{\mathbf{X}} = \mathbf{D}\,\mathbf{X}$, where $\mathbf{D}$ is the homogeneous reflection matrix built from $\mathbf{n}$ and $d$. The implementation of these equations depends on the chosen reference frame; they can thus be expressed either in the world frame or in a camera frame:
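As a minimal sketch of the reflection described above, the following Python snippet builds a homogeneous reflection matrix and applies it to a point. It assumes the common convention in which the mirror plane is the set of points satisfying n·X = d with a unit normal n, so that the matrix has the block form [I − 2nnᵀ, 2dn; 0ᵀ, 1]. This convention, and the helper names `reflection_matrix` and `reflect`, are illustrative assumptions rather than the exact formulation of [32].

```python
import math


def reflection_matrix(n, d):
    """Return a 4x4 homogeneous reflection matrix for the plane n . X = d.

    Assumed convention (not taken from the paper): n is the plane normal and
    d the signed distance from the frame origin to the plane along n.
    The normal is normalized here so that callers may pass any non-zero n.
    """
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]
    d = d / length
    rows = []
    for i in range(3):
        # Upper-left block: I - 2 n n^T, last column: 2 d n
        row = [(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
        row.append(2.0 * d * n[i])
        rows.append(row)
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def reflect(D, X):
    """Apply the homogeneous reflection D to a 3D point X and return a 3D point."""
    Xh = list(X) + [1.0]
    Yh = [sum(D[i][k] * Xh[k] for k in range(4)) for i in range(4)]
    return tuple(Yh[:3])
```

For a mirror lying in the plane z = 2, a point at height 0.5 is mapped to height 3.5, and reflecting twice returns the original point, which is the involution property expected of a planar mirror.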

Camera projection based on a cross-product concept
When a camera captures an image of a scene, depth information is lost because points in 3D space are mapped onto a 2D image plane. For the work in this paper, depth information is crucial: the scene must be reconstructed from the information provided by the 2D image in order to know the distance between the actuated mirror and the scene without prior knowledge of where they are. The perspective (pinhole) image projection $\mathbf{x}$ of a 3D point $\mathbf{X}$ is therefore stated as $\mathbf{x} \equiv \mathbf{K}\,\mathbf{I}_{3\times4}\,\mathbf{T}\,\mathbf{D}\,\mathbf{X}$, where $\mathbf{K}$ represents the calibrated intrinsic camera parameters, $\mathbf{I}_{3\times4}$ represents the canonical perspective projection in the form of a 3 × 4 identity matrix, and $\mathbf{T}$ represents the Euclidean 3D transformation (rotation and translation) between the camera and world coordinate systems through the mirror. The ≡ sign represents the loss of depth in the projection, that is, equality up to a scale factor. In practice, the ≡ sign can be removed through a division, which introduces non-linearity. Alternatively, the cross product turns the projection equation into a linear constraint, because the light ray emitted from the camera centre is aligned with the light ray coming from the 3D point. Treat these two rays as two 3D vectors. The cross product of two 3D vectors is another vector whose magnitude equals the area of the parallelogram formed by the two vectors, that is, the product of their norms and the sine of the angle between them, and whose direction is perpendicular to the plane containing them, as given by the right-hand rule. If the two vectors point in the same direction, as in our case, the angle between them is zero; since $\sin(0) = 0$, the magnitude of the cross product is zero and the result is the zero vector. Hence for notation simplicity, let; Consequently, (6) rewrites more simply as
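The cross-product idea above can be illustrated with a short Python sketch. It assumes the 3D point is already expressed in the camera frame (any rigid transformation and mirror reflection having been applied beforehand) and uses a hypothetical intrinsic matrix. The function names are illustrative and do not come from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return tuple(sum(row[k] * v[k] for k in range(len(v))) for row in M)


def project_by_division(K, X_cam):
    """Classical pinhole projection: divide by the third coordinate (non-linear)."""
    u, v, w = mat_vec(K, X_cam)
    return (u / w, v / w, 1.0)


def projection_residual(pixel_h, K, X_cam):
    """Cross-product form of the projection: pixel_h x (K X_cam).

    The residual is the zero vector whenever the homogeneous pixel and the
    projected ray are aligned, whatever the (unknown) depth scale.
    """
    return cross(pixel_h, mat_vec(K, X_cam))


# Hypothetical intrinsics: focal length 800 px, principal point (320, 240)
K_EXAMPLE = [[800.0, 0.0, 320.0],
             [0.0, 800.0, 240.0],
             [0.0, 0.0, 1.0]]
```

Scaling the 3D point along its line of sight leaves the residual at zero, which is exactly the depth ambiguity that the ≡ sign expresses, while a pixel that does not correspond to the point gives a non-zero residual.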

Laser spot kinematics
The time derivative of (9) is considered in order to servo the spot from its current position to the desired one while the camera and mirror remain stationary.
Since $\mathbf{x}$ is a unit vector (i.e. $\|\mathbf{x}\| = 1$), using (9) yields a relation in which the unknown depth along the line of sight passing through $\mathbf{x}$ is strictly positive; hence, (10) becomes
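The step above can be made explicit with a short worked derivation. It is a sketch under the assumption that the 3D point is written as the product of a positive depth and a unit line-of-sight vector. The depth symbol $\delta$ is a name chosen here for illustration, and the result is not claimed to match the paper's numbered equations term by term.

```latex
% Assumption: X = \delta x with \|x\| = 1 and \delta > 0 (\delta is an illustrative name).
\begin{align}
\mathbf{X} &= \delta\,\mathbf{x}, \qquad \|\mathbf{x}\| = 1,\quad \delta > 0 \\
\dot{\mathbf{X}} &= \dot{\delta}\,\mathbf{x} + \delta\,\dot{\mathbf{x}} \\
\mathbf{x}^\top \dot{\mathbf{x}} &= 0
  \quad\Longrightarrow\quad \dot{\delta} = \mathbf{x}^\top \dot{\mathbf{X}} \\
\dot{\mathbf{x}} &= \frac{1}{\delta}\left(\mathbf{I}_{3\times3} - \mathbf{x}\,\mathbf{x}^\top\right)\dot{\mathbf{X}}
\end{align}
```

The third line uses the fact that a unit vector can only change in directions orthogonal to itself, so the component of the 3D velocity along the line of sight changes the depth only and is invisible in the image.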

Scanning laser mirror as a virtual camera
The scanning mirror is treated as a virtual camera with a virtual image plane. The mathematical relationship between it and the 3D spot on the reflected vocal fold is therefore established in (13) below. Some parameters of (6) change; in particular, $\mathbf{K} = \mathbf{I}_{3\times3}$, because when the mirror is used as a camera, focal length, optical centre, and lens distortion are no longer a concern, so $\mathbf{K}$ is taken to be the identity.
In (13), the projected quantity is the virtual spot on the mirror virtual image plane, and $\mathbf{T}$ is the transformation matrix relating the micro-mirror frame with the world frame through the auxiliary mirror; hence $\mathbf{T}$ is constant. Differentiating (13) gives the velocity at which the laser is servoed from one point to another in the image. The resultant equation is;

To be virtual or not to be?
The overall static model for both the laser steering system through the auxiliary mirror and a camera observing the laser spot through the same mirror is given by the constraints in (6) and (13). This forms an implicit model of the geometry at play, from which one can, depending on what is known beforehand and what is needed, try to solve explicitly for the unknown values. The easiest task is to find the laser direction and its spot projection in the image from a known 3D location of the spot and the known 3D locations of the camera, the steering mirror, and the auxiliary mirror $\mathbf{D}$. In practice, however, one would like to "triangulate through the mirror" the 3D spot from the laser orientation and the spot image projection. Even more usefully, one would like to steer the laser (i.e., change its direction, and thus the 3D spot) from an image-based controller (i.e., from a desired motion of the spot in the image). The question is then whether one should explicitly reconstruct $\mathbf{X}$ or whether the controller can be derived without this explicit reconstruction.
A large part of the answer to that question lies in the auxiliary mirror location $\mathbf{D}$. If it is known, then triangulation can potentially be done, but this imposes strong practical constraints. However, looking closely at the above equations and figure 4, one can remark that there exists a virtual spot location, $\bar{\mathbf{X}} = \mathbf{D}\,\mathbf{X}$, which lies behind the mirror. Replacing $\mathbf{D}\,\mathbf{X}$ by $\bar{\mathbf{X}}$ in (6) and (13) yields a solution independent of the auxiliary mirror location.
Of course, this simplification is only valid when both the laser and the camera reflect through the same mirror, which requires the user to check that the laser spot is visible in the image. It also reduces the calibration burden to determining the relative location $\mathbf{T}$ between the steering mirror and the camera, since the steering mirror frame can arbitrarily be chosen as the world frame of the virtual scene. As a consequence, from a modelling point of view, working with the virtual scene reduces the problem to its core: As will be seen in the following sections, this makes it possible to derive a controller without making an explicit triangulation, that is, without necessarily having sensors for  .
Consequently, placing the problem in the virtual space allows for a simple solution that is independent of prior knowledge of the auxiliary mirror location. The mirror just needs to be held still during control so that the desired visual feature and the current one are geometrically consistent.

Geodesic error
The geodesic error differs from the linear error in that the error is reduced along the surface of the unit sphere, rather than within the image plane as in linear error minimization.
where $\mathbf{x}$ is the detected position of the laser spot in the image and $\mathbf{x}^*$ is the desired one, which is chosen arbitrarily by the user in the image; the geodesic error represents the shortest arc between the two points and defines the rotation vector orthogonal to the arc plane. Since $\mathbf{x}$ is a unit vector, its derivative takes the form $\dot{\mathbf{x}} = \boldsymbol{\omega} \times \mathbf{x}$, where $\boldsymbol{\omega}$ is a pseudo-control signal on the sphere. Replacing this in (12) yields a relation by which the virtual 3D laser spot velocity, which is the quantity to be controlled, is thus constrained by
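To illustrate the geodesic error, the following Python sketch computes the cross product of the current and desired unit vectors and uses it as a rotation command on the unit sphere. The sign convention (current × desired), the proportional gain, and the explicit Euler integration are assumptions made for illustration; the paper's exact control law is not reproduced here.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(c * c for c in a))


def normalize(a):
    length = norm(a)
    return tuple(c / length for c in a)


def geodesic_error(x, x_star):
    """Rotation vector x x x_star: orthogonal to the arc plane, norm sin(angle)."""
    return cross(x, x_star)


def servo_on_sphere(x0, x_star, gain=2.0, dt=0.01, steps=500):
    """Integrate x_dot = omega x x with omega = gain * geodesic_error (assumed law).

    Returns the final unit vector and the history of error norms.
    """
    x = normalize(x0)
    x_star = normalize(x_star)
    error_norms = []
    for _ in range(steps):
        e = geodesic_error(x, x_star)
        error_norms.append(norm(e))
        omega = tuple(gain * c for c in e)   # pseudo-control on the sphere
        x_dot = cross(omega, x)              # always orthogonal to x
        # Euler step followed by re-normalization to stay on the unit sphere
        x = normalize(tuple(xi + dt * di for xi, di in zip(x, x_dot)))
    return x, error_norms
```

With this choice the current vector rotates about the axis orthogonal to the arc plane, so it moves towards the desired vector along the great circle joining them, and the error norm shrinks at each step for starting angles below 180 degrees.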

Single-camera case of observing hidden portions of vocal fold
We can effectively model and control the laser path with one camera, the actuated mirror, and the auxiliary mirror, by first establishing the angular velocity of the actuated mirror that controls the orientation of the laser beam. The general solution to (22) contains a term that can be interpreted as the motion of $\mathbf{X}$ along its line of sight (thus a variation of its depth) that is not observable by the camera. It can be due to the irregular shape of the surface hit by the laser or to a specific motion of that surface. This observation allows solving for the unknown in (23); substituting it with (25) and simplifying gives a control law in which the primed quantities are the control gains, which can be tuned without explicit reconstruction of the depths. Again, the controller is independent of the mirror's position because both the image and the laser go through it. The unobservable term can be taken as zero unless one wishes to estimate and compensate for the surface shape and ego-motion. The relationship between the laser velocity and the angular velocity of the actuated mirror is given in [9] by the cross-product relation (28); making the angular velocity the subject of that formula gives (29).
Figure 5. The system model workflow
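The last step, recovering an angular velocity from a cross-product relation, can be sketched in Python as follows. The sketch assumes a relation of the form z_dot = omega × z, with z the unit direction of the laser beam; the names `z`, `z_dot`, and `omega` are illustrative. It returns the minimum-norm solution, omega = z × z_dot, and ignores any factor relating the rotation of the mirror to the rotation of the reflected beam.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def direction_rate(omega, z):
    """Rate of change of a unit direction z rotating with angular velocity omega."""
    return cross(omega, z)


def angular_velocity_from_rate(z, z_dot):
    """Minimum-norm omega such that omega x z = z_dot.

    Valid when z is a unit vector and z_dot is orthogonal to z. Any rotation
    about z itself is not determined by z_dot and is set to zero here.
    """
    return cross(z, z_dot)
```

The identity z × (omega × z) = omega − (z·omega) z shows why this works: for a unit z, the cross product recovers the part of omega orthogonal to the beam, which is the only part that moves the beam.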

Trifocal geometry
Let us now investigate the effect of using two cameras, in addition to the actuated mirror and the auxiliary mirror.
Vector  is a unit vector in the direction of a laser beam,  is the shortest distance from the centre of the micro-mirror to the plane of the auxiliary mirror. Vector  2 is the reflected unit vector of  ,  1 is the shortest distance between the auxiliary mirror and vocal fold plane, and 2 is the distance along the reflected laser beam. This simulation aims to validate the laser monocular visual servoing through an auxiliary mirror, controlled by (30).

Stereo-view imaging system and auxiliary mirror simulation result in a realistic case
The second simulation involves a stereoscopic imaging system. Thus, a second camera is added to the first simulation setup, and the control in (34) is applied. The obtained results in figure 9
Figure 10. Photograph of the experimental setup

Single-camera and auxiliary mirror experimental results
Using the setup shown in figure 10 with one camera, the monocular case was validated experimentally, and the results obtained in figure 12() were similar to those of the simulated case in section 4.

Stereo-view imaging system and auxiliary mirror experimental results
Experimental validation of stereo-view imaging was performed with the setup shown in figure 10. Both trajectories were straight, but in the right image the path did not reach the desired target. This could be due to differences in laser spot size, which cause the centre of gravity of the spot to move slightly during control.

Conclusion
The study shows that access to the vocal fold is improved by seeing through a mirror and servoing the surgical laser to reach the hidden portions of the vocal fold. Moreover, the mirror did not affect the controller. The derived control laws can work on both 2D and 3D paths without any prior knowledge of the scene. They were successfully validated both in simulation and experimentally; in all cases, the laser steering control law showed its ability to operate accurately. The experimental results further demonstrated that the proposed control laws were accurate and fully decoupled, with exponential decay of the image errors.
Future work will involve adapting the controller to different conditions, for instance in the presence of perturbations, and experimenting on a vocal fold mock-up.

Figure 1. Simulated image of vocal fold anatomy

Figure 6. Model schematic
In figure 6, three cameras observe a 3D point $\mathbf{X} = (X, Y, Z)^\top$ through a mirror as the virtual point $\bar{\mathbf{X}}$, which is projected onto a 2D point in each of the three image planes. The fundamental matrices and the epipolar lines show the relation between the cameras and actuated mirrors. There are mathematical relations between the epipolar lines and the 2D point,

Figure 7. Simulated set-up
Figure 7 shows the simulation setup used, where the marked point corresponds to the laser spot position on the vocal fold. In figure 8(), the orange asterisk is the initial position of the laser spot, the red plus sign is the desired location of the spot, and the magenta cross indicates geometric coherence. The trajectory marked with a blue line in the image of figure 8() is the path followed by the steered laser in the image, from the initial position to the desired location in the hidden parts.


Figure 8 (a). Image
Image () and figure 9() showed that the trajectory of the laser beam from the initial position to the desired position was straight. The error versus time plot in figure 9() converged to zero. Similarly, as shown in figure 9(), the mirror velocity had exponential decay.

Figure 13 (c). Error vs. time
Figure 13 (d). Mirror velocity
In figure 13() and figure 13(), which plot error versus time for each image, both error components had exponential decay. Figures 14 and 15 show live video screenshots of laser servoing for the conducted experiments.

Table 1: List of symbols used in the paper
Symbol | Remarks
$\dot{\mathbf{X}}$ | Velocity of the 3D point
 | The unknown depth along the line of sight passing through $\mathbf{x}$
 | The virtual projected spot on the mirror virtual image plane
$\mathbf{T}$ | Transformation matrix relating micro-mirror frame with world frame through the auxiliary mirror
 | The velocity of the virtual projected spot on the mirror virtual