Welcome back! Last time, we combined trajectory optimization with LQR stabilization that successfully managed the swing-up maneuver with two inverted pendulums. However, the computing time using our home-grown direct collocation algorithm was (unsurprisingly) high. Don't worry, in this article, we will make the trajectory solver ~100 times faster. For example, the trajectory in the cover image only took ~2.5 seconds to finish! But first, we'll go over some revisions to the dynamics of the model.

Switching to Stepper Motors

At the moment, me and a friend (mostly my friend) are assembling the hardware for the cart-pole system. After some thinking and testing, the DC motor we had planned to use is not up to par. It can provide enough angular velocity and torque for the swing-up maneuver, but it lacks the precision needed to accurately balance the inverted pendulums and provide accurate feedback control. Therefore, we decided to switch to a stepper motor instead. Stepper motors have the complete opposite strengths and weaknesses. They have high precision control, but they can lack speed and torque. However, the one that we have should be sufficient for our needs.

Introducing a stepper motor simplifies the dynamics of the system. Similar to the triple pendulum system of Glück et al. [1], we have assumed that there is an ideal velocity and acceleration control of the motor. Using this property, we can choose the angular acceleration of the cart as the control variable u. Therefore, as long as the stepper motor can provide enough torque, the dynamics simplify to:

The states of the system remain the same, but the symbols of the position, velocity and acceleration of the cart have changed to s. The previous equation that described the cart's acceleration has also been rewritten to describe the output torque of the stepper motor. This equation is used as a constraint in the trajectory solver.

These changes are reflected in the code. There is a new class with a way too long name called CartPoleStepperMotorSystem with updated methods, including a new method called constrain_states which returns the output torque of the motor given a state and control. As always, the code is available on GitHub.

...

    def constraint_states(self, state: np.ndarray, control: np.ndarray) -> np.ndarray:
        d_s = state[1]
        dd_s = control[0]
        thetas = state[2::2]
        d_thetas = state[3::2]

        sum1 = sum(self.ms*sin(thetas)*cos(thetas))
        sum2 = sum(self.ms*(self.ls/2)*d_thetas**2*sin(thetas))
        sum3 = sum((self.u_ps*d_thetas*cos(thetas))/(self.ls/2))
        sum4 = sum(self.ms*cos(thetas)**2)

        f = 3/7*(self.g*sum1-7/3*(sum2-self.cart.u_c*d_s)-sum3-(sum4-7/3*self.M)*dd_s)
        torque = f*self.motor.r

        return np.array([torque])

...

Updated Trajectory Solver

Using the updated dynamics, we are ready to upgrade our solver using an external library that is already fast and optimized. A popular choice in the industry is CasADi, which is "an open-source software framework for numerical optimization" [2]. CasADi can be written in C++, Matlab/Octave and Python and is quite complex. To address this issue, CasADi released Opti which simplifies the syntax of creating and solving nonlinear programs, which is what we'll use since it provides a gentler learning curve.

The code we'll write is specifically made for the CartPoleStepperMotorSystem and is not abstracted for any dynamical system, unlike our previous DirectCollocation class. The aptly named CartPoleStepperMotorDirectCollocation class has a constructor that takes in the number of discrete points of the trajectory to output (N), the number of collocation points that will be used for the spline (N_collocation), the system and the tolerance of the optimizer. The constructor then calls the method calculate_jacobian(), which constructs the jacobian of the dynamics, which will be discussed later in the part about linearization. The class also has its own methods for calculating the dynamics and constraints.

import numpy as np
from scipy.interpolate import CubicSpline #type: ignore
from .cartpolesystem import CartPoleStepperMotorSystem
import casadi as ca

class CartPoleStepperMotorDirectCollocation():
    def __init__(self, N: int, N_collocation: int, system: CartPoleStepperMotorSystem, tolerance=1e-6):
        self.N = N
        self.N_collocation = N_collocation
        self.N_states = system.num_states
        self.N_controls = system.num_controls
        self.tolerance = tolerance
        self.system = system
        self.calculate_jacobian()

    def differentiate(self, x, u):
        d_s = x[1]
        dd_s = u[0]
        next_x = [0 for _ in range(self.system.num_states)]
        next_x[0] = d_s
        next_x[1] = dd_s
        for i in range(self.system.num_poles):
            theta = x[2+2*i]
            d_theta = x[3+2*i]
            u_p = self.system.u_ps[i]
            l = self.system.ls[i]
            m = self.system.ms[i]
            dd_theta = 3/(7*l/2)*(self.system.g*ca.sin(theta)-dd_s*ca.cos(theta)-u_p*d_theta/(m*l/2))
            next_x[2+2*i] = d_theta
            next_x[3+2*i] = dd_theta
        return next_x

    def constraint_states(self, x, u):
        d_s = x[1]
        dd_s = u[0]
        sum1 = 0
        sum2 = 0
        sum3 = 0
        sum4 = 0

        for j in range(self.system.num_poles):
            theta = x[2+2*j]
            d_theta = x[3+2*j]
            l = self.system.ls[j]
            m = self.system.ms[j]
            u_p = self.system.u_ps[j]

            sum1 += m*ca.sin(theta)*ca.cos(theta)
            sum2 += m*l/2*d_theta**2*ca.sin(theta)
            sum3 += u_p*d_theta*ca.cos(theta)/(l/2)
            sum4 += m*ca.cos(theta)**2

        f = 3/7*(self.system.g*sum1-7/3*(sum2-self.system.cart.u_c*d_s)-sum3-(sum4-7/3*self.system.M)*dd_s)
        torque = f*self.system.motor.r

        return np.array([torque])

In the method make_solver, the first thing we do is to create an opti object using ca.Opti(). We then create the state and control variables in each collocation point using opti.variable() whose values will be optimized. Using the initial guess x0 and final state r, an initial guess is made, the objective function is set, and equality and inequality constraints are added using methods that will be described later. Using opti.solver("ipopt"), the solver is activated using the algorithm IPOPT, a popular nonlinear optimization algorithm. The final optimized values are then extracted and splines of the state and control trajectories are returned.

    def make_solver(self, end_time: float, x0, r, x_guess = np.array([]), u_guess = np.array([])):
        self.opti = ca.Opti()   
        self.xs = self.opti.variable(self.N_collocation,self.system.num_states)
        self.us = self.opti.variable(self.N_collocation,self.system.num_controls)
        self.h = end_time/self.N_collocation
        self.end_time = end_time

        self.x0 = x0
        self.r = r

        self.make_guess(x0, r, x_guess, u_guess)
        self.set_objective_function()
        self.set_eq_constraints()
        self.set_ineq_constraints()

        self.opti.solver("ipopt")
        sol = self.opti.solve()

        x_optimal_raw = sol.value(self.xs)
        u_optimal_raw = sol.value(self.us)

        time_collocation = np.linspace(0, self.end_time, self.N_collocation)
        time = np.linspace(0, self.end_time, self.N)

        states = np.vstack([
            CubicSpline(time_collocation, s_row)(time) for s_row in x_optimal_raw.T
        ]).T
        controls = np.vstack(np.interp(time, time_collocation, u_optimal_raw))

        return states, controls

The make_guess method works very similarly to the one in the DirectCollocation class. The only difference is that the guesses are automatically set in the method to be the initial values of the variables using opti.set_initial(). In the set_objective_function, the sums of squares objective function of the control signal is added. Here, since this function isn't called for every iteration, we can afford to use for-loops without slowing down the code, since this method is only called once. The objective function is added using opti.minimize(obj) where obj is the objective to minimize.

    def make_guess(self, x0: np.ndarray, r: np.ndarray, x_guess: np.ndarray, u_guess: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        self.x0 = x0
        self.r = r

        if x_guess.size == 0 or u_guess.size == 0:
            x_guess = np.linspace(x0, r, self.N_collocation)
            u_guess = np.zeros((self.N_collocation, self.system.num_controls))

        self.x_guess = x_guess
        self.u_guess = u_guess
        self.opti.set_initial(self.xs, x_guess)
        self.opti.set_initial(self.us, u_guess)

        return x_guess, u_guess

    def set_objective_function(self):
        obj = 0
        for i in range(self.N_collocation-1):
            obj += (self.us[i,0]**2+self.us[i+1,0]**2)*self.h/2
        self.opti.minimize(obj)

Constraints, both equality and inequality, are added using opti.subject_to(). In the set_eq_constraints, the initial and final state constraints are added. The equality constraints on the angles of the pendulums look a bit strange. The equations make sure that if the difference between the angle and the goal angle is a multiple of 2π, it is allowed. Similarly, the set_ineq_constraints adds the dynamic constraints, the lower and upper bounds of the variables, as well as the torque constraint.

    def set_eq_constraints(self):
        self.opti.subject_to(self.xs[0,0] == self.x0[0])
        self.opti.subject_to(self.xs[-1,0] == self.r[0])
        self.opti.subject_to(self.xs[0,1] == self.x0[1])
        self.opti.subject_to(self.xs[-1,1] == self.r[1])

        for i in range(self.system.num_poles*2):
            if i % 2 == 0:
                self.opti.subject_to(ca.arctan2(ca.sin(self.x0[i+2]-self.xs[0,i+2]),ca.cos(self.x0[i+2]-self.xs[0,i+2])) == 0)
                self.opti.subject_to(ca.arctan2(ca.sin(self.r[i+2]-self.xs[-1,i+2]),ca.cos(self.r[i+2]-self.xs[-1,i+2])) == 0)
            else:
                self.opti.subject_to(self.xs[0,i+2] == self.x0[i+2])
                self.opti.subject_to(self.xs[-1,i+2] == self.r[i+2])

    def set_ineq_constraints(self):
        for i in range(self.N_collocation):
            s = self.xs[i,0]
            last_s = self.xs[i-1,0]
            d_s = self.xs[i,1]
            last_d_s = self.xs[i-1,1]
            dd_s = self.us[i,0]
            last_dd_s = self.us[i-1,0]
            self.opti.subject_to(self.opti.bounded(self.system.state_lower_bound[0]+self.system.state_margin[0],s,self.system.state_upper_bound[0]-self.system.state_margin[0]))
            self.opti.subject_to(self.opti.bounded(self.system.state_lower_bound[1]+self.system.state_margin[1],d_s,self.system.state_upper_bound[1]-self.system.state_margin[1]))

            d_x = self.differentiate(self.xs[i,:],self.us[i,:])
            last_d_x = self.differentiate(self.xs[i-1,:],self.us[i-1,:])

            if i > 0:
                self.opti.subject_to((d_s+last_d_s)*self.h/2 == (s-last_s)) 
                self.opti.subject_to((dd_s+last_dd_s)*self.h/2 == (d_s-last_d_s)) 

                for j in range(self.system.num_poles):
                    theta = self.xs[i,2+2*j]
                    last_theta = self.xs[i-1,2+2*j]
                    d_theta = self.xs[i,3+2*j]
                    last_d_theta = self.xs[i-1,3+2*j]

                    dd_theta = d_x[3+2*j]
                    last_dd_theta = last_d_x[3+2*j]

                    self.opti.subject_to((d_theta+last_d_theta)*self.h/2 == (theta-last_theta))
                    self.opti.subject_to((dd_theta+last_dd_theta)*self.h/2 == (d_theta-last_d_theta))

            constraint_state = self.constraint_states(self.xs[i,:],self.us[i,:])
            f = constraint_state[0]
            torque = f*self.system.motor.r
            self.opti.subject_to(self.opti.bounded(self.system.motor.torque_bounds[0]*(1-self.system.motor.torque_margin),torque,self.system.motor.torque_bounds[1]*(1-self.system.motor.torque_margin))) #type: ignore

Updated Linearization

The previous linearization algorithm used SymPy which is written in pure Python. Since Python is pretty slow, linearizing the dynamics along an entire trajectory can take up to 10 seconds, which is not desirable. To linearize the dynamics, we will use the standard CasADi syntax. We first create symbolic state and control variables (x_vars and u_vars respectively) using ca.SX.sum(name_of_var). Using differentiate() we can get the dynamic equations (eqs). We then vertically concrete x_vars and u_vars, as well as the dynamic equations, using ca.vertcat(), which places the elements in a row. The Jacobian (J) can then be calculated using ca.jacobian(eqs, vars).

import casadi as ca    
...
    def calculate_jacobian(self):
        x_vars = []
        x_vars.append(ca.SX.sym("s"))
        x_vars.append(ca.SX.sym("d_s"))
        for i in range(self.system.num_poles):
            x_vars.append(ca.SX.sym(f"theta_{i}"))
            x_vars.append(ca.SX.sym(f"d_theta_{i}")) 
        u_vars = [ca.SX.sym("u")]

        eqs = self.differentiate(x_vars, u_vars)
        vars = x_vars + u_vars
        vars = ca.vertcat(*vars)
        eqs = ca.vertcat(*eqs)

        J = ca.jacobian(eqs, vars)
        self.J = J
        self.vars = vars
        self.eqs = eqs

Finally, the linearize method takes in the operating state and control values and substitutes these values in the Jacobian. The substituted Jacobian is then converted to a numeric matrix using ca.DM() and then to NumPy array. The A and B matrices of the linearized system can then be extracted.

    def linearize(self, x_vals: np.ndarray, u_vals: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        vals = ca.vertcat(*x_vals, *u_vals)
        J_eval = ca.substitute(self.J, self.vars, vals)
        J_eval_np = np.array(ca.DM(J_eval))
        A = np.array(J_eval_np[:,:-1]).astype(np.float64)
        B = np.vstack(np.array(J_eval_np[:,-1]).astype(np.float64))
        return A, B

Performance

Using two poles, a simulation time step of 0.005 seconds, 100 collocation points and a swing-up time of 5 seconds, the algorithm took 1.8 seconds to calculate the trajectory and 0.2 seconds to linearize and calculate the feedback gains along the trajectory. The resulting swing-up can be seen below. I'm really satisfied with how fast the algorithm is now. It's not fast enough for model-predictive control, but it is fast enough to calculate trajectories on the go.

Future Development

Next up, I'll focus on developing a nicer UI and a controller that can be used to send commands to the pendulum live. The goal is for the controller to be able to swing the pendulums to different equilibrium points on the go. This could example be to swing from the up-up position, where the first pendulum is pointing up and the second pendulum is pointing up, to the up-down position. For example, the cover image shows a triple pendulum going from the down-down-down position to the up-up-down position. That's all for now, I hope you enjoyed reading.

-SL

Previous part - Trajectory Stabilization using LQR

References

[1] T. Glück, A. Eder, and A. Kugi, "Swing-up control of a triple pendulum on a cart with experimental validation," Automatica, vol. 49, no. 3, pp. 801-808, Elsevier, 2013, doi:10.1016/j.automatica.2012.12.006.

[2] J.A.E. Andersson, J. Gillis, G. Horn, J.B. Rawlings, and M. Diehl, "CasADi - A software framework for nonlinear optimization and optimal control," Mathematical Programming Computation, vol. 11, no. 1, pp. 1-36, Springer, 2019, doi:10.1007/s12532-018-0139-4.

Sackarias' technical blog