Multi Vehicle Mixed Integer Programming

In this post, we take a look at this [1][1] paper, which introduces a simple, yet interesting approach to solving a multi-vehicle path planning problem.

My implementation of the algorithm that was used to evaluate its performance and generate all the results in this post can be found here

Problem

The paper tries to tackle the issue of finding fuel optimal paths for multiple vehicles, given vehicle-obstacle and vehicle-vehicle collision constraints. The paper being more than two decades old, makes this a rather interesting read because it comes from a time well before the advent of sophisticated trajectory optimization methods we see today.

The fuel optimal cost function is formulated as a linear program, with the collision constraints being represented as linear constraints using binary integer variables. The entire problem is thus cast as a Mixed Integer Program (MIP), and taking into account that it accounts for multiple vehicles, the rest of the post shall refer to the algorithm as MVMIP.

Formulation

Preliminaries

The planning problem under consideration is usually cast as an optimization problem with a quadratic cost.

mins,uJ=mins,u0(sTQs+uTRu)dts.t.  s˙=As+Bu\begin{aligned} \min_{s, u}J = \min_{s, u}\int_{0}^{\infty}(s^TQs + u^TRu)dt\\ \text{s.t.} \ \ \dot{s} = As + Bu \end{aligned}

Where sRns \in \MR^n represents the robot state and uRmu \in \MR^m represents the control input that need to be executed at each time step. QRn×nQ \in \MR^{n \times n} and RRn×mR \in \MR^{n \times m} are the cost matrices for the quadratic cost.

Due to the fact that we want to cast this as an MIP, this cost will be cast as a linear cost that looks like:

mins,uJ=mins,u0(qTs+rTu)dt\min_{s, u}J = \min_{s, u}\int_{0}^{\infty}(q^T|s| + r^T|u|)dt

Here qRnq \in \MR^n and rRmr \in \MR^m are cost vectors that form the linear equivalent of the quadratic cost. v|v| is a vector denoting the absolute values of the vector vv being operated on.

Objective

As is common in these scenarios, to make the problem tractable, it is discretized into NN time steps. Separate costs vectors are also introduced for the final state sNs_N. This is a familiar trope in planning and allows for greater flexibility in generating the results we need, which we will see later.

mins,uJ=mins,ui=0N1(qTsi+rTui)Δt+f(sN)s.t.  si+1=Asi+Bui\begin{aligned} \min_{s, u}J = \min_{s, u}\sum_{i=0}^{N-1}(q^T|s_i| + r^T|u_i|)\Delta t + f(s_N)\\ \text{s.t.} \ \ s_{i+1} = As_i + Bu_i \end{aligned}

sis_i and uiu_i are the variables of the optimization and represent the robot’s trajectory and control inputs required to generate the state trajectory. Solving for these for each robot will essentially solve the MVMIP problem. s0s_0 is the initial state of the robot and is not a variable. sfs_f denotes the expected or required final state of the robot. u0u_0 is the control input at the first time step and is a variable, but uNu_N is not as we don’t expect to apply any control on the last step where the robot is expected to have reached sfs_f.

As mentioned, qq applies to each state vector s1s_1 through sN1s_{N-1}, with sNs_N having a corresponding cost term of f(sN)f(s_N)

The objective as we see now tries to minimize the cost on the state and control. The minimum value for this objective is when both states and control inputs are zero but this is not what we desire. Driving the control inputs to zero is fine as we’re looking for fuel optimal paths, but driving the state also to zero is not the behavior we want as we’re not constructing a regulator.

Ideally we would want the state of the vehicle/robot to reach the final state sfs_f. So instead of driving the state to zero, we would like the problem to drive it to the required final state. This can easily be done by replacing sis_i with sisfs_i - s_f which makes it so that instead of driving si0s_i \to 0, we are driving sisf0s_i - s_f \to 0 which means driving sisfs_i \to s_f.

And finally, we get to the multiple vehicle part of this whole thing, where we have states and control inputs for each required vehicle. The state and control of the pth\pth vehicle will be represented as spi\spi and the control input as upi\upi. The objective function for the pth\pth vehicle is now (Note that some of the constant terms like Δt\Delta t can be ignored):

Jp=i=1N1qpTspispf+i=0N1rpTupi+ppTspNspfJ_p = \sum_{i=1}^{N-1}\qp^T|\spi - \spf| + \sum_{i=0}^{N-1}\rp^T|\upi| + \pp^T|\spn - \spf|

The question now is how does one frame this objective that has  .|\ . \| terms as a linear function with linear constraints? To answer this, we take a quick detour to go over a known result in linear programming.

Consider the following optimization problem:

minxqTx\min_{x}q^T|x|

Where q,xRnq, x \in \MR^n. This can be written as:

minxi=1nqixi\min_{x}\sum_{i=1}^{n}q_i |x_i|

Where xi|x_i| here is just the absolute value of the ith\ith element of xx.
We can now introduce slack variables sis_i for each xi|x_i| such that xisi|x_i| \le s_i and frame the objective as a linear function in ss. The objective will now be a minimization over ss and the constraint makes it so that minimizing ss will minimize each xi|x_i|

minxi=1nqixi=minsqTss.t.  xisi\begin{aligned} \min_{x}\sum_{i=1}^{n}q_i x_i = \min_{s}q^Ts\\ \text{s.t.} \ \ |x_i| \le s_i \end{aligned}

This is nice because it lets us keep a linear objective and represent the absolute inequality as two linear inequalities:

minsqTss.t.  xisixisi\begin{aligned} \min_{s}q^Ts\\ \text{s.t.} \ \ x_i \le s_i\\ -x_i \le s_i \end{aligned}

With this knowledge, we can now introduce slack variables ww and vv for ss and uu respectively. Representing the objective as an LP in these slack terms, along with taking the sum for every vehicle (assuming KK vehicles), we have the final objective function:

minw,vp=1K(i=1N1qpTwpi+i=1N1rpTvpi+ppTwpN)s.t.  spijspfwpijspij+spfwpijupikvpikupikvpiksp,i+1=Apspi+Bpupi\begin{aligned} \min_{w, v} \sum_{p=1}^{K}(\sum_{i=1}^{N-1}\qp^T\wpi + \sum_{i=1}^{N-1}\rp^T\vpi + \pp^T\wpn)\\ \text{s.t.} \ \ \spij - \spf \le \wpij\\ -\spij + \spf \le \wpij\\ \upik \le \vpik\\ -\upik \le \vpik\\ \spinext = \Ap\spi + \Bp\upi \end{aligned}

Note that we have the slack constraints for each element of the state and control vectors.

Vehicle-Obstacle Collision Constraints

Before we get to this, let’s take another detour to look at a concept that is something of a pre-requisite.

Let’s say we have an optimization problem of the form

minxf(x)s.t.  g1(x)c1g2(x)c2g3(x)c3\begin{aligned} \min_{x}f(x)\\ \text{s.t.} \ \ g_1(x) \le c_1\\ g_2(x) \le c_2\\ g_3(x) \le c_3\\ \vdots \end{aligned}

Solving this would mean solving for the fact that ALL of the constraints need to be satisfied.
g1(x)c1g_1(x) \le c_1 AND g2(x)c2g_2(x) \le c_2 AND g3(x)c3g_3(x) \le c_3, etc.

But what if we want don’t want each constraint to be satisfied? What if the our requirement is that at least one of them be active?
g1(x)c1g_1(x) \le c_1 OR g2(x)c2g_2(x) \le c_2 OR g3(x)c3g_3(x) \le c_3, etc.

In this case, we can use a variant of the big-M method to help reframe the constraints. First we introduce binary integer variables bi{0,1}b_i \in \set{0, 1} for each constraint. We also introduce a constant MM that is an extremely large number relative to the range of the gig_i functions. The constraints can then be written as:

g1(x)c1+Mb1g2(x)c2+Mb2g3(x)c3+Mb3gn(x)cn+Mbni=1nbin1\begin{aligned} g_1(x) \le c_1 + Mb_1\\ g_2(x) \le c_2 + Mb_2\\ g_3(x) \le c_3 + Mb_3\\ \vdots\\ g_n(x) \le c_n + Mb_n\\ \sum_{i=1}^nb_i \le n-1 \end{aligned}

Let’s dissect what’s happening here. Each bib_i can be either 0 or 1.
if a particular bi=1b_i = 1 (for the ith\ith constraint), we have:

g1(x)c1+Mg_1(x) \le c_1 + M\\

And given that Mg1(x)M \gg g_1(x), this constraint will always be satisfied, irrespective of the value of xx. if a particular bi=0b_i = 0, we have:

g1(x)c1g_1(x) \le c_1\\

Which is just the original constraint.

So if bi=1,i{1,n}b_i = 1, \forall i \in \set{1, \ldots n}, then this means that each constraint would trivially be satisfied, so this would be equivalent to not having any constraints at all.
On the other hand if bi=0,i{1,n}b_i = 0, \forall i \in \set{1, \ldots n}, then then problem will be treated as an optimization problem where each of the original constraints need to be satisfied.

What happens if at least one of the bib_i values are restricted to be 00? This would mean that at least one of the constraints (the ones where bi=0b_i = 0) will need to be satisfied (active), but the rest wouldn’t need to be active as they would be trivially satisfied due to MM.

Hence by having an additional constraint on the sum of bib_i values, restricting them to a maximum value of n1n-1, we can guarantee that at least one bi=0b_i = 0. This then translates to our desired property of the optimization problem requiring only one of the constraints to be satisfied!

Equipped with this, we can now look at how the paper frames collision avoidance constraints. It is assumed that each obstacle is rectangular and the vehicle moves in a two dimensional space (It can easily be extended to 3D). Each vehicle/robot is also assumed to be a point.

If sis_i is the current state of the vehicle, we assume that the state contains some xix_i and yiy_i, which correspond to the xx and yy coordinates of the vehicle in 2D space. If the mid point of the obstacle is (xc,yc)(x_c, y_c) and its size is (dx,dy)(d_x, d_y), we can compute the bottom left and top right points as:

xmin=xc0.5dxymin=yc0.5dyxmax=xc+0.5dxymax=yc+0.5dy\begin{aligned} \xmin = x_c - 0.5d_x\\ \ymin = y_c - 0.5d_y\\ \xmax = x_c + 0.5d_x\\ \ymax = y_c + 0.5d_y\\ \end{aligned}

The vehicle at (xi,yi)(x_i, y_i) will avoid the obstacle (not be in collision) iff:

xixminOR  xixmaxOR  yiyminOR  yiymax\begin{aligned} x_i \le \xmin\\ \text{OR} \ \ x_i \ge \xmax\\ \text{OR} \ \ y_i \le \ymin\\ \text{OR} \ \ y_i \ge \ymax\\ \end{aligned}

We require at least one of these constraints to be true. So we can use the big-M method to reframe these constraints!

xixmin+Mti1xixmax+Mti2yiymin+Mti3yiymax+Mti4k=14tik3\begin{aligned} x_i \le \xmin + Mt_{i1}\\ -x_i \le -\xmax + Mt_{i2}\\ y_i \le \ymin + Mt_{i3}\\ -y_i \le -\ymax + Mt_{i4}\\ \sum_{k=1}^4t_{ik} \le 3 \end{aligned}

These constraints ensure that the vehicle avoids the obstacle, but offers no additional clearance from it. It also does not account for the vehicle itself having volume. In order to account for these, we can add additional clearance terms cxc_x and cyc_y that encapsulate the size of the vehicle and any additional clearance to obstacles. Additionally we also express the constraints for the pth\pth vehicle and the lth\lth obstacle.

xpixl,min+cpl,x+Mtpli1xpixl,maxcpl,x+Mtpli2ypiyl,min+cpl,y+Mtpli3ypiyl,maxcpl,y+Mtpli4k=14tplik3\begin{aligned} \xpi \le \xlmin + \cplx + Mt_{pli1}\\ -\xpi \le -\xlmax - \cplx + Mt_{pli2}\\ \ypi \le \ylmin + \cply + Mt_{pli3}\\ -\ypi \le -\ylmax - \cply + Mt_{pli4}\\ \sum_{k=1}^4t_{plik} \le 3 \end{aligned}

Moving Obstacles

This framework easily extends to representing moving obstacles as well. Instead of having the min and max coordinates of the lth\lth obstacle fixed at (xl,min,yl,min)(\xlmin, \ylmin) and (xl,max,yl,max)(\xlmax, \ylmax), we can have its coordinates be different at each time step ii. Given the velocity of the object at each time step, we can estimate its coordinates and use them in the constraints as shown:

xpixli,min+cpl,x+Mtpli1xpixli,maxcpl,x+Mtpli2ypiyli,min+cpl,y+Mtpli3ypiyli,maxcpl,y+Mtpli4k=14tplik3\begin{aligned} \xpi \le \xlimin + \cplx + Mt_{pli1}\\ -\xpi \le -\xlimax - \cplx + Mt_{pli2}\\ \ypi \le \ylimin + \cply + Mt_{pli3}\\ -\ypi \le -\ylimax - \cply + Mt_{pli4}\\ \sum_{k=1}^4t_{plik} \le 3 \end{aligned}

Vehicle-Vehicle Collision Constraints

Vehicle-vehicle collision constraints can be represented the same way as vehicle-obstacle constraints. The only difference being that the constraint is between the coordinates of the pth\pth and qth\qth vehicles at the ith\ith time step for every pair of vehicles. Instead of using the minimum and maximum coordinates of the rectangular obstacles, we represent the volumes of the vehicles and any additional clearances through distances (dx,dy)(d_x, d_y) for each pair of vehicles.

xpixqidpq,xMbpqi1xqixpidpq,xMbpqi2ypiyqidpq,yMbpqi3yqiypidpq,yMbpqi4k=14bplik3\begin{aligned} \xpi - \xqi \ge \dpqx - Mb_{pqi1}\\ \xqi - \xpi \ge \dpqx - Mb_{pqi2}\\ \ypi - \yqi \ge \dpqy - Mb_{pqi3}\\ \yqi - \ypi \ge \dpqy - Mb_{pqi4}\\ \sum_{k=1}^4b_{plik} \le 3 \end{aligned}

Note that we have a different set of binary variables here.

Extensions

The entire problem has now reduced to a mixed integer linear program. The objective is a linear function and the constraints are linear in the state, control and binary variables for collision. Additionally, we can also enforce state and control limits for each vehicle as upper and lower bounds for each of these variables.

spisp,minspisp,maxupiup,minupiup,max\begin{aligned} s_{pi} \ge s_{p,min}\\ s_{pi} \le s_{p,max}\\ u_{pi} \ge u_{p,min}\\ u_{pi} \le u_{p,max}\\ \end{aligned}

Please refer to the paper 1, to see the entire LP written out in all its glory. The state and control constraints can also be formulated as approximate second order ball constraints using a collection of linear constraints (instead of box constraints) as seen in 2.

The papers model the obstacles as being rectangular, but we can always extend it to being general convex polygons. In this case, we would want the vehicle to lie outside the intersection of the half-spaces formed by the edges of the polygon.

Ol={z:alkTzblk, k{1,P}}(xpi,ypi)Ol\begin{aligned} O_l = \set{z: a_{lk}^Tz \le b_{lk}, \ \forall k \in \set{1, \ldots P}}\\ (\xpi, \ypi) \notin O_l \end{aligned}

Where OlO_l represents the space inside the lth\lth convex PP sided polygon.

Implementation

The MVMIP can be solved by setting up the objective and constraints and passing it to a linear solver that supports mixed integer programming. My implementation uses Google’s OR-Tools

The problem can be treated as a one-shot fixed arrival time problem where we set a large time step (NN) and solve for each vehicle trajectory. This method ensures that there is only a fixed initial cost to run the algorithm and we don’t make any adjustments later. The advantage of this method being that we are guaranteed to find the optimal solution to the initial problem setup.

The fixed long horizon method would not run well on a real system running in a dynamic environment. For such systems, we can cast it as a shorter receding horizon problem and execute the first few commands of the solution MPC style. This will make the MVMIP responsive to dynamic changes to the environment, but also comes at a cost for executing the algorithm at every timestep and will lead to a solution that will theoretically not be optimal (But is about the best it can do given that the environment is only going to be partially observable).

The biggest pain point of the algorithm is how the number of variables and constraints scale with KK, LL and NN. As we’ll see the time take to solve the MVMIP will explode as we deal with multiple vehicles and obstacles in a long horizon setup.

Results

Visual Solutions

Running the MVMIP for a single vehicle and obstacle results in this:

mvmip_setup1

Turns out that we’ve run into a classic problem in planning caused due to discretizing the problem. Because we only have collision constraints at each state and not in between successive states, if our control inputs have large limits, the MVMIP can find a solution that jumps through obstacles. It thinks that the trajectory is feasible because each state alone is collision free, but the reality is that the robot can pass through obstacles between discrete states.

How do we solve this? The simple LP formulation of the MVMIP makes it rather difficult to express anything too complex. We could model the space between successive states as another convex polygon and set up approximate constraints. An easier and less expensive way would be to just reduce the bounds of the control inputs such that each control step is guaranteed to move the robot by a maximum distance that is less than the size of the smallest obstacle.

up,max=maxuδmaxmaxuus.t.  f(s,u)Δtmax(xl,maxxl,min,yl,maxyl,min)\begin{aligned} u_{p, max} = \max_{u}\delta_{max} \approx \max_{u}u\\ \text{s.t.} \ \ f(s, u)\Delta t \le \max{(\xlmax - \xlmin, \ylmax - \ylmin)} \end{aligned}

This obvious has its own downsides of limiting the max control input that can be applied, among others limitations, but keeps it simple in the spirit of the algorithm.

Fixing the control limits now gives us the solution that we need:

mvmip_setup2

We can see that the vehicle maintains clearances to the obstacle as required, as is visually seen in the drawn boundary around the obstacle. The boundary around the vehicle itself is for vehicle-vehicle clearance.

We can verify that it works as expected even with moving obstacles:

mvmip_setup3

Now we can start messing around with some of the costs. To break it down, we have the qq vector costs on the state, rr vector costs on the control and pp vector costs on the final state. The qq costs encourage the robot to make progress towards the final state sfs_f. The rr costs encourage the robot to minimize the total control effort exerted and the pp costs encourage the robot to reach the goal by any means necessary.

By changing their relative magnitudes, we can achieve a range of behavior and decision making. If qq and pp are kept low and rr is high, the robot has no real incentive to move at all as it can minimize its total cost by staying in place. Any movement would accrue a large cost due to the high rr term which would overpower any reduction in qq or pp costs due to moving closer to the final state.

Making qq and pp moderately high while keeping a high value of rr can make the robot make progress towards the final state but stop when it has to make too much of a detour as that gets translated to applying higher control effort on average due to the longer path it will need to take. This can be seen here:

mvmip_setup4

We can further see the effects of different costs on this environment with multiple moving obstacles. High qq and pp, along with low rr will manifest in the form of aggressive decision making as expected.

mvmip_setup5

Reducing qq and increasing rr can make it act more conservatively.

mvmip_setup6

And finally reducing qq and increasing rr further can make it extremely passive. In this case it patiently waits for each obstacle before making its way forward. Note that pp needs to be very high in this case to reward it for making it to the final state. Otherwise, it will refuse to move at all.

mvmip_setup7

And finally, we have an example of the MVMIP in all of its glory — multiple vehicles along with multiple obstacles.
Each of the vehicles here have different costs and we can clearly see that they each have a different level of aggressiveness in their decision making.

mvmip_setup8

Computational Cost

How expensive is it to solve the MVMIP?

For small problems (Like the first one shown here), it can be pretty quick.
The single vehicle, single static obstacle case had 240 variables and 300 constraints and was able to be solved in around 10 ms.

The last one — multi vehicle, multi dynamic obstacle case had 8400 variables and 10500 constraints and took upwards of 30 min to solve. Obviously there are other factors such as how easy it is to actually find a solution given the constraints and so on, but in general the solve times explode as we have more vehicles and obstacles.

Some of this can be mitigated by using the receding horizon approach.

References

  1. Schouwenaars, T., De Moor, B., Feron, E., & How, J. (2001, September). Mixed integer programming for multi-vehicle path planning. In 2001 European control conference (ECC) (pp. 2603-2608). IEEE. [Link]

  2. Richards, A., & How, J. P. (2002, May). Aircraft trajectory planning with collision avoidance using mixed integer linear programming. In Proceedings of the 2002 American Control Conference (IEEE Cat. No. CH37301) (Vol. 3, pp. 1936-1941). IEEE. [Link]

See Also

Share

Comments