Multistage approach for trajectory optimization for a wheeled inverted pendulum passing under an obstacle

Abstract A robotic system constructed as a wheeled inverted pendulum (WIP) serves as an impressive demonstrator, since this kind of system is inherently nonlinear, unstable, and nonminimum phase. These properties may pose several difficulties, when it comes to control and trajectory planning. This paper shows a method for deriving a highly dynamic trajectory compliant with the system dynamics by means of solving an optimal control problem (OCP) using multiple shooting. The assumed task includes that the WIP should pass a height-restricting barrier. This can be achieved by leaning back or forth, in order to reduce the overall height of the WIP. The constraints inherent to the definition of this trajectory are nonconvex due to the shape of the robot. The constraint functions have a local minimum in an infeasible region. This can cause problems when the initial guess is within this infeasible region. To overcome this, a multistage approach is proposed for this special OCP to evade the infeasible local minimum. After solving four stages of subsequent optimization problems, the optimal trajectory is obtained and can be used as feedforward for the real system.


Introduction
Mobile robotic systems with the structure of a wheeled inverted pendulum (WIP) are of ongoing interest in research, since this type of robotic system has a wide field of application, reaching from personal transport systems, e.g., the one commonly known as Segway or the TransBOT [1], over robotic wheelchairs [2], up to wheeled humanoid robots [3], to give only a few examples. This kind of system has the ability to turn on the spot, which results in high maneuverability and allows it to be used in confined spaces. This advantage comes at the cost that a self-balancing control has to be implemented, which is not a trivial task, since the underlying system is inherently nonlinear, unstable, and nonminimum phase. On the contrary, these properties make a mobile robot constructed as a WIP an ideal demonstrator platform, like the robotic system addressed in this paper. In order to create an impressive demonstration, an exceptional trajectory, as indicated in Fig. 1, should be followed. The desired task involves moving under an obstacle which is lower than the overall height of the WIP. This special type of obstacle avoidance is inspired by a limbo dance and pushes the system to its limits. Therefore, it can be used to test how a path following control performs near the system limits. Furthermore, it also has some interesting mathematical properties regarding the solvability of the optimal control problem (OCP). Before the OCP can be stated and solved, the nonlinear system dynamics has to be modeled thoroughly. To this end, different approaches can be used, as indicated by numerous contributions, listed in the review paper [4], and also more recent contributions like [5] and [6]. The latter also implements trajectory planning via optimal control, but without obstacles. Opposed to [7] or [8], where obstacles on the ground are addressed, this paper addresses obstacle avoidance as indicated in Fig. 1. A similar task has been considered in [9] by using two different controllers, one for stabilizing and one for holding a target inclination angle, and optimizing the switching point. Whereas in this paper the obstacle avoidance is incorporated directly into an OCP. In order to ensure, that the wheels do not slip, the ground reaction forces between the wheels and the ground surface are derived and used as constraints in the OCP, which is also implemented in [10]. As already mentioned, the desired task is set up in such a way, that at a defined position the overall height of the WIP has to be lower than the lowest point of the obstacle, while at the same time maintain forward movement. Furthermore, the considered WIP has a flat surface on top. Special about that is that the overall height of the point H 8 , depicted in Fig. 1, starts increasing if the robot is tilted backwards until its maximum height is reached. Only then the height of H 8 decreases if the robot is tilted further. This results in nonconvex constraints for the obstacle avoidance with a possibly infeasible local minimum and infeasible initial guess. This paper addresses this issue, by means of a multistage approach for solving the according OCP [11]. Despite finding a feasible solution for the OCP, the proposed procedure allows to obtain a close to minimal obstacle height, for which a feasible solution exists. Since the so gained optimal trajectory already satsifies the system dynamics and limits, an LQR approach can be used to stabilize the system along this trajectory, if the optimal control is used as feedforward.

Mathematical modeling
The derivation of the exact dynamic model of a WIP, as shown in Fig. 2, taking into account movements on the two-dimensional ground surface and two independently driven wheels, can be found in [10]. There, no slipping is assumed between the ground and the wheels, thus the WIP can only be moved along the instantaneous R x-axis, rotated about the R z-axis, and tilted about the R y-axis. In order to model these restrictions, non-holonomic constraints at velocity level have to be taken into account. On the contrary, the desired trajectory, adressed in this paper, is guided by a straight line on the ground surface along the I x-axis. Therefore, the model can be simplified by setting the orientation angle γ = 0 and the I ycoordinate of the position y = 0. Considering only longitudinal movements, the generalized coordinates z = x θ ξ η are sufficient to describe the positions and orientations of the basis and the two wheels. The coordinate x describes the position of the ground contact point along the I x-axis, θ is the inclination angle of the basis, and ξ and η denote the relative angles of the wheels. Since ideal rolling of the wheels is assumed, the relative angular velocities of the wheels are prescribed bẏ with the diameter of a wheel D W . Furthermore, the wheels are rotationally symmetric. Consequently, the relative angles of the wheels do not influence the system dynamics and are of no special interest otherwise. Thus it is sufficient to choose the minimal coordinates as q = x θ and accordingly define the minimal velocities byṡ = ẋθ . The minimal coordinates and the minimal velocities can then be combined to the vector of states x = q ṡ . Due to the symmetric structure of the overall robot and the mentioned simplification of the model, the driving torques have to be equal for each wheel. Therefore, the vector of inputs can be stated by u = M with M denoting the driving torque for each wheel.

Kinematics
Because of γ = 0, the reference frames I and R have always the identical orientation and the orientation of the body fixed reference frame B is defined solely by the inclination angle θ . The rotation matrix can be used to transform vectors represented in frame I, indicated by the left index I (·), to vectors represented in the frame B, indicated by the left index B (·). The inverse transformation can be aquired by R IB = R BI . The angular velocity of the basis and the two wheels are given by respectively. With the position vectors according to Fig. 2, the velocity vectors for the centers of mass (COM) of the basis and the two wheels i ∈ {1, 2}, can be stated by respectively. Hereω denotes the skew symmetric cross-product matrix of the vector ω.

Dynamics
Based on the definitions in Section 2.1, the momentum vectors for the three bodies are derived by with the mass of the basis m B , the mass of a wheel m W and i ∈ {1, 2}. Due to the shape of the basis, the principle axes can be approximated by the axes of the body fixed reference frame B. Therefore, the inertia tensors for the basis and the wheels related to the respective COM can be stated by respectively. Thereby the moment of inertia of the drive rotor J M is transformed to the gear output side with the gear ratio i G . The vectors of angular momentum can then be defined by Using the forces due to gravity with the gravitational acceleration g, and the torques due to the motors and viscous friction with the viscous friction coefficient d v , the reaction forces and torques for the basis and the wheels can be stated by respectively. According to [12], the equations of motion (EOM) are then determined as By evaluating and rearranging (15), the resulting EOM satisfy the form with the symmetric positive definite mass matrix M(θ ), the vector of nonlinear terms g(θ ,ṡ), and the constant input matrix B. Since the mass matrix M(θ ) is invertible, (16) can be converted to state space representation with affine-input structure, resulting iṅ In order to obtain the equilibrium state of the system, defined byṡ =s = 0, the equation system g(θ , 0) = Bu has to be solved. One solution to this equation system is the inclination angle of the upper equilibrium, denoted θ e , and the drive torque M e = 0.

Ground reaction forces
As mentioned above, the kinematic relation (1) is based on the assumption of ideal rolling of the wheels.
In order to ensure that no slipping occurs, the reaction forces between wheels and ground have to stay within static friction bounds. The ground reaction forces/torques, assumed to act at point G for simplicity, can be derived by summing up the reaction forces/torques (13) to (14) properly shifted into point G. This results in with I r GC B = I r GP + R IBB r PC B , and I r GC W i = I r GP + I r PC W i . As long as the EOM (16) are fulfilled, the ground reaction torques satisfy I M z G ≡ 0, which can be easily verified by plugging (17) into (18). As expected for the simplified model, the y-component of I f z G results to f z G,y ≡ 0. In order to ensure proper ground contact, the z-component has to satisfy the condition and to ensure ideal rolling the x-component has to be bound by with the static friction coefficient μ 0 .

Obstacle avoidance
As can be seen in Fig. 1, the WIP has to get from one side of the obstacle to the other. If only the upper part of the WIP, regarded as the "head," is considered, three possible outcomes can be observed. First, the head and thus the entire WIP passes under the obstacle and no collision takes place. Second, the head collides directly with the obstacle. And finally, the head passes above the obstacle, which would mean that a different part of the WIP collides with the obstacle. Therefore, it is sufficient to test for collisions between the head and the obstacle, as long as it can be enssured, that the head would not pass above the obstacle. In order to simplify the collision detection between the head and the obstacle, the cross section of the head is approximated, as shown in Fig. 3, by five circles located at the points with a radius r H i = h H 2 and i ∈ {1, . . . , 5}. To account for the sharp edge at the front of the head, the approximation is refined by adding two additional circles at the points  with r H 8 = 0 for consistency. With the absolute position vectors with i ∈ {1, . . . , 8} and the position vector to the center of the obstacle the relative position vectors between the head and the obstacle can be stated by A noncolliding trajectory always has to satisfy with i ∈ {1, . . . , 8} and the radius of the obstacle r O . In order to ensure that the head does not pass above the obstacle, a test point may never lie inside the triangular shaped region above the center of the obstacle, as indicated in Fig. 3. To achieve this, the condition with I n O = 0 0 −1 and i ∈ {1, . . . , 8} must always be met.

Continuous time domain
Combining the results of Sections 2 and 3, a weighted time and energy optimal trajectory for a fixed obstacle height a Oz can be obtained by solving an OCP with variable terminal time T E . The according OCP can be stated as with the time weight ν t , the input weight ν u , the initial state x 0 = 0 θ e 0 0 , and the terminal state x E = x E θ e 0 0 . The vectors x min and x max form box constraints for the states and are used to restrict especially the inclination angle θ to reasonable values, whereas the box constraints for the input, given by u min and u max , account for the maximal permissible driving torque. The angular velocity of the wheels ω W (x), according to (1), is limited to the maximum value ω W,max and the drive power P M (x, u) = ω W (x)M is constrained to the maximal drive power P M,max . The constraints regarding the ground reaction forces, (39) and (40), are according to (19) and (20)

Discrete time domain
To obtain a solution for the infinite-dimensional OCP (30), a direct multiple shooting approach [13][14][15] is chosen. Therefore, the time span t ∈ [0, T E ] is discretized with the sampling time T S = T E N , N = 1000, resulting in the vector of time stepst = 0 T S · · · iT S · · · T E ∈ R N+1 . Between each time step the value of the input vector u is assumed to be constant. This results in the matrices of discrete input valueŝ u = û 0û1 · · ·û N−1 ∈ R 1×N and state valuesx = x 0x1 · · ·x N ∈ R 4×(N+1) . As integration scheme the explicit Runge-Kutta method of fourth order RK4 is used, which is implemented explicitly in order to exploit the automatic differentiation capability of the used solver. Consequently, the state of the next time step can be obtained by a function f RK4 (x i ,û i , T E ). The OCP can then be approximated by the finitedimensional optimization problem x 0 = x 0 (45) for a fixed obstacle height a Oz . Constraint (44) accounts for the shooting gap between integration of one time step and the optimization varibale of the according next time step.

Minimal obstacle height
In order to obtain the minimal obstacle height a Oz for which a feasible solution exists, the nested optimization problem min u,x,T E ,a Oz a Oz (56) with the discretized OCP as constraint has to be solved. The obstacle height is restricted to reasonable values by a Oz,min > 0, which is chosen in such a way that it does not restrict the actual solution.

Numerical solution
For solving the nonlinear optimization problem (43), the optimization framework CasADi [16,17] is used with the solver Ipopt, which is based on [18]. In order to solve the nested optimization problem (56) a simple line search algorithm can be used for the top-level optimization problem. Thereby the obstacle height a Oz is initialized to some value a Oz > D W 2 + H + r O and then incremenetally reduced by  a fixed step size a Oz , which in turn is reduced once no feasible solution for the inner optimization problem can be found. A minimum step size can be used as terminal condition. To find an initial guess for the inner optimization problem, a straight forward approach would be to linearly interpolate the initial and terminal state and to find a hard lower bound for the terminal time, which is then relaxed by some factor. A reasonable lower bound for T E can be derived, for example, by the maximum drive speed and the distance between initial and terminal position. The resulting trivial initial guess can be stated aŝ But if this problem is tried to be solved, the optimization process tends to terminate prematurely once the obstacle height reaches a Oz < D W 2 + H + r O . Figure 4 shows the system states of the WIP, for an examplary trajectory, where such a premature termination occurs, and Fig. 5 indicates the resulting trajectory by showing the cross section of the head during motion and the obstacle. Remarkably, the WIP is not tilted at all directly underneath the obstacle, although it is known that a better feasible solution exists.
Since the essence of the desired task is that the WIP has to pass underneath the obstacle, there exists a point in time where the highest point of the WIP has to be below the lowest point of the obstacle. Figure 6 shows the overall height of the WIP for varying inclination angle θ . It can be seen that there exists a local minimum at θ = 0, which is caused by the shape of the head. If the initial guess is near this local minimum directly underneath the obstacle, the solver may fail to escape this local infeasible minimum.  Of course it is possible to adapt the initial guess or the solver settings such that the optimization process converges more robustly. But this involves tedious handcrafting of initial solutions, which can be avoided by using a multistage approach as proposed in the following section.

Multistage approach
The basis for the multistage approach is that the OCP (43) can be solved very robustly, if the constraints regarding the obstacle are neglected. The idea is then to approach the original problem via multiple stages of optimization problems by tightening the constraints and minimizing the obstacle height. Thereby the result of one stage is used as initial guess for the next stage. As a side effect, the minimal obstacle height can also be derived along the way and a higher level line search is no longer necessary.

Without obstacle
As already mentioned the first stage consists of finding a valid solution for the OCP (43) without the constraints regarding the obstacle. The optimization problem thus simplifies to s.t. (44) to (53) Using the trivial initial guess (59), the problem (60) can be solved and the results for the optimization variablesû,x, and T E can be used as initial guess for the next stage. The resulting trajectory is indicated in Fig. 7, which shows the cross section of the head during motion.

Head with circular shape and variable obstacle height
In the second stage, the constraints regarding the obstacle are modified in such a way that these constraints no longer have a local minimum at θ = 0, which has been discussed in Section 5. This can be achieved by using a coarser approximation of the head shape with only one larger circle. To this end, the absolute position vector to the center of the head is introduced. The relative position vector between the head and the obstacle can then be stated by Satisfying the relation leads to a more restrictive constraint, which no longer has a local minimum at θ = 0. Furthermore, the height of the obstacle is minimized starting at a height which is known to not interact with the initial guess obtained by the previous stage. The resulting optimization problem can be stated as with the weight ν O ν t and ν O ν u . Again the resulting trajectory, which is indicated by Fig. 8 and the detailed view in Fig. 9, can be used as initial guess for the next stage.

Head with original shape and variable obstacle height
As can be seen in Fig. 9, the previously introduced coarser approximation of the head shape still leaves a gap between the actual head and the obstacle. In order to further reduce the height of the obstacle and to push the system to its limits, the optimization problem (65) is solved again, but now with the original  constraints regarding the obstacle. The according optimization problem is given by (55) and (68) and Fig. 10 shows that the head now touches but does not intersect the obstacle.

Original problem
Due to the very high weight for the obstacle height ν O the cost functions of the problems, (65) and (69) are significantly different in contrast to the original problem. Therefore in the final stage, the obstacle height is fixed to the result for a Oz of (69) and the original problem (43) is solved using the remaining results of (69) as initial guess.
The results of this optimization are shown in Figs. 11-14. Figure 11 depicts the states of the WIP over time. In Fig. 12, it can be seen, that the driving torque and the angular velocity of the wheels, as well as the drive power stay always within the according maximum values (red). The ground reaction forces over time are shown in Fig. 13. Since the actual force I f z G,x between the wheels and the ground is always within the limits given by static friction (red), it can be assumed, that no slipping occurs. In Fig. 14 Table I shows how the terminal time and the obstacle height evolve over the four stages.

Experimental setup
In order to test the optimal trajectory on the real system, a linear quadratic regulator (LQR) based approach is used as shown in Fig. 15. A more elaborated controller design would be conceivable, given the nonlinearity of the underlying system. But the optimal trajectory already satisfies the system dynamcis and limits, thus the controller only has to compensate for model uncertainties and measurement errors. The actual system is equipped with encoders and an inertial measurement unit (IMU), providing measurments for the wheel angles ξ and η and the inclination angle θ as well as the respective    angular velocitiesξ ,η, andθ. In order to reduce the measurement noise, especially for the angular velocities, a Kalman observer is added to the control loop. The derivation of the LQR and the observer follows standard procedures and is based on an extended model, which also accounts for lateral deviations off the trajectory. For the extended model, a state vector x E = ξ η θξηθ and an input vector u E = M 1 M 2 are used and the derivation of the EOM is similar to the approach in Section 2.
Converting the optimal state values to the desired states of the extended model x E,des can be performed by ξ = η = 2x D W − θ and (1) and a feedforward for the control input u E,FF = M 1,FF M 2,FF is implemented by setting M 1,FF = M 2,FF = M. The LQR and the observer are based on the extended system dynamics linearized at the upper equilibrium point defined by x E,equ = 0 0 θ e 0 0 0 and u E,equ = 0. This results in the linearized state vector x E = x E − x E,equ and the linearized input vector u E = u E − u E,equ . As indicated in Fig. 15, the measured values are collected in x E,meas and the observed system state is denoted x E,obs with the deviations from the equilibrium point x E,meas and x E,obs , respectively. The control error is defined by e E,ctrl = x E,des − x E,obs and the actual system input results in u E = u E,ctrl + u E,FF with the optimized feedforward u E,FF and the output of the controller u E,ctrl . Experiments have shown that the proposed controller is capable of keeping the state of the actual WIP close enough to the optimal trajectory, such that a obstacle can be passed without a collision. Thereby the obstacle is installed at the same position, resulted by the optimization problem (69). Unfortunately the controller is implemented on an embedded micro processor board, which does not provide the possibility to track signals. As a partial compensation, the experiment has been put on our YouTube channel and is accessible via [19].

Conclusion
In this work, an OCP for a WIP has been derived for a desired trajectory with obstacle avoidance inspired by a limbo dance. This task involves to pass an obstacle, which is placed at a lower height than the overall WIP height in equilibrium state. The resulting optimization problem has non-convex constraints with an infeasible local minimum. If the initial guess for the OCP is near this infeasible local minimum, gradient-based solvers tend to terminate without providing a valid solution. In order to prevent tedious manual tuning of the initial guess, a multistage approach has been proposed to avoid entering the region of this local minimum. It has been shown that solving four subsequent optimization problems reliably result in an optimal solution for the trajectory and a close to optimal value for the minimum feasible obstacle height. The obtained optimal state and control values are then used as desired trajectory and feedforward for the controller of the real system. Finally, experiments have shown that the WIP can successfully pass under the obstacle.
Obviously, the proposed approach is closely related to the problem setup consisting of the type of the robot and the definition of the obstacle. But the idea of relaxing or tightening certain constraints which impede the solution of an OCP and advance the original problem via multiple stages of subsequent optimization problems might as well generalize to other problem setups.  Ethical approval. Not applicable.