Trajectory generation and collision avoidance using model predictive control in swarm applications
Swarm robotics control is a set of methods used to make multiple vehicles move in a coordinated fashion along a desired trajectory. Coordinated trajectory generation hinges mainly on the automation, accuracy, and efficiency of the movement of each vehicle in the system. Each vehicle is assumed to communicate with the others and share some of its states so that a collision-free optimal path can be computed. This work focuses on collision avoidance and optimal path generation, dividing the process into three stages. First, for a given time step, a path is predicted for every vehicle given fixed initial and final states. Second, the predicted paths are checked for possible collisions, and colliding vehicles are treated as obstacles. Finally, for the predicted collisions, a new optimal path is generated by minimizing a chosen objective function. We seek a suitable algorithm that effectively combines these three aspects and finds the optimized path without these collisions, while at the same time calculating the control input required for the task. We look exclusively at swarms with fixed start and goal points that require transitions of states in between, with the aim of creating a completely autonomous system that needs only the initial and final positions of each unit of the swarm. To this end we use model predictive control (MPC), for its ability to predict states and optimize control. MPC enables us to use the predicted collisions as explicit constraints and accordingly modify the control input and the future states of the vehicle. When multiple agents are involved, the method can be distributed, yielding a distributed MPC (DMPC).
DMPC divides the calculation of paths into smaller segments that may be run in parallel, reducing both the computational complexity and the time taken to provide a fast-moving vehicle with the required control. The algorithm presented also predicts the future paths of the vehicles and removes potential collisions over a given number of control steps (the horizon), so that unfavorable trajectories are detected and avoided at a much earlier stage. The algorithm is then tested in a simulated environment for a varying number of vehicles in the swarm.
Keywords: UAVs, multi-agent, path planning, DMPC, optimal control
|UAV|Unmanned Aerial Vehicle|
|MPC|Model Predictive Control|
|DMPC|Distributed Model Predictive Control|
|CCMPC|Chance-Constrained Model Predictive Control|
|DNMPC|Decentralized Nonlinear Model Predictive Control|
|SQP|Sequential Quadratic Programming|
|SNMPC|Sequential Nonlinear Model Predictive Control|
|ROS|Robot Operating System|
Statement of the Problem
The problem centers on the specific application of constructing a collision-free, optimal path for every agent from a starting coordinate to an ending one. Other kinds of swarm movement become accessible once this problem is suitably addressed.
A defining metric for the quality of a solution is efficiency. The trajectories must be generated and evaluated before the agent overshoots the time frame in which it requires a correction of its control input. The computational load of generating such trajectories for every agent becomes unmanageable for larger swarms; hence, any algorithm considered for this problem must also improve scalability.
The main objectives of this report, then, are to compare existing algorithms to an MPC system and to present a structured algorithm that improves both efficiency and scalability while keeping accuracy to a satisfactory standard.
The scope of this report is limited to computer simulations of the swarm and the graphical mapping of the results so obtained. Four agents are shown for simplicity of demonstration; the number may be increased with predictable results. The movement of the agents is point to point, and velocity is treated as secondary to position because the generated trajectory relies solely on position.
The assumptions taken here are:
- The formations to be realised by the swarm are known.
- The position of each agent and its destination are accurately found within the time constraints.
- The swarm is decentralised, and can carry out parallel computation on the given optimisation function.
- Velocity is left unbounded, as its limits depend on practical hardware limitations; positions determine the algorithm's competence.
Objectives of the Research
The objectives of the research are:
- To create an algorithm capable of satisfying all the requirements stated above.
- To optimize the algorithm with respect to the conditions mentioned above.
- To compare the efficiency and accuracy of the algorithm with currently used alternatives.
- To plot the simulation of such a swarm movement.
- To ensure the lowest chances of collision in the algorithm.
The research done is limited to computer simulations that reflect the actual working of such a model, due to time and physical constraints of realising such a system.
Further, the methods do not take velocity as a constraint, which would be seen in practical applications. This does not affect the comparison of the capabilities of the algorithm, but may lead to other problems during real world implementation in terms of processing time and accuracy.
Lastly, the simulation does not take into account the difficulties of providing a decentralised processing unit with accurate positional data. This is a problem outside of the accuracy of the algorithm itself, and is not immediately relevant to the research.
Chance-Constrained Collision Avoidance, Decentralized and Sequential MPCs
A few of the algorithms currently being researched and used in swarm applications provide useful insights into the specific advantages we wish to see in the final algorithm. Chance-constrained MPC (CCMPC) minimizes a cost function with an optimal control input for a stochastic dynamical system, subject to probabilistic constraints. This is done over a window of time instants, and iterating this window forward yields the solution to the problem (Nemirovski and Shapiro 2006).
Another method explored is to decentralize the nonlinear MPC (DNMPC): each agent computes its own trajectory and inputs for each time frame, reducing the overall time and power required for computation, while a coordinator ensures the global constraints are adhered to. This also makes the model more scalable in the number of agents included in the swarm.
Further, these may be sequential nonlinear MPCs (SNMPC), whose iterative process uses sequential quadratic programming (SQP) to optimize the objective function toward a single optimal solution. SQP linearizes the constraints, so that a sequence of optimization sub-problems is solved (Zhu and Alonso-Mora 2019).
Global Offline Trajectory Generation Algorithms
Here, the trajectory is found from iterative, local second-order approximations, as opposed to the first-order optimality of sequential NMPC. This has two main advantages: the ability to solve problems online because less computation is required, and smooth trajectories in the velocity and position spaces when the acceleration domain is taken into account. This gives a quick solution for real-time use, which may also be made reactive to the environment, as well as a smooth trajectory for actual control inputs. The disadvantage is the loss of global optimality (Rossi, Santamaria-Navarro, Andrade-Cetto and Rocco 2017).
Requirements of an Ideal Trajectory Generation Algorithm
We see that the qualities most important to an ideal trajectory generation algorithm may be quantified and hence improved through proper design. Such an ideal algorithm should:
- Smooth the path through a set of waypoints while minimizing deviation from the waypoint path.
- Respect the velocity and curvature constraints imposed by the physical limitations of the agent's dynamics.
- Not produce an unmanageable path length as a result of the above requirements.
- Be quick and light enough for real-time reactions and online computation.
One of the solutions considered incorporates a differential flatness technique, which makes nonlinear optimizations easier to handle through static and dynamic feedback. The process takes a nonlinear system and finds a number of lower-dimensional functions that can be easily optimized and that match the original problem at the end points (Anderson, Beard and McLain 2003).
Model Predictive Control Efficiency
Most of our reading centered on the design of an algorithm that maximizes efficiency in terms of time rather than computational burden, while keeping semi-strict collision avoidance and accuracy requirements. This design is based on a distributed MPC, in which every agent in the swarm computes its own path (decentralized) but also compiles its predicted path with all the others at every time step. This gives accurate knowledge of where each vehicle in the system would be at a potential time step, so collisions can be found more accurately. The time taken for this to occur is the main factor through which efficiency is determined, with computation secondary. This approach is compared with other similar distributed MPC algorithms (Luis and Schoellig 2019).
A brief overview of many state-of-the-art algorithms for reliable, transparent, quick and computation-friendly control yields a condensed list of requirements and potential hindrances to efficiency when designing a trajectory generator. These include the inherently time-consuming behaviour of MPCs, as well as the overhead of iterative predictions, especially with unpredictable environments and a larger time window. Finally, a distributed MPC was judged the most efficient approach to collision-avoiding swarm control and the most malleable to further design changes and optimizations.
The aim of the project is to generate non-colliding trajectories from an initial point to a final one using position and control variables. We assume the position is sensed accurately and that velocity varies linearly and need not be simulated for the purposes of this project. The underlying calculations are illustrated below, and the entire code is written and plotted in MATLAB. The final simulation is shown in the graphs made through these efforts.
The data required to create the algorithm was compiled from a number of comparative studies on various types of model predictive control, isolating the main characteristics to be optimized for the best application in a swarm. In addition, the textbook Probabilistic Robotics proved highly useful in forming a basic foundation of concepts around which we could modify the algorithm (Thrun, Burgard and Fox 2005). These concepts included general model predictive control, fundamentals of dynamic models, and troubleshooting of the solver and its working. The algorithms considered included older implementations of MPCs on single vehicles, particularly a tail-sitting biplane. A few papers also touched on ways to improve efficiency (Luis and Schoellig 2019) and the nuances involved (Gao, Wu, Pan, Zhou and Shen 2018). Time, accuracy and minimal collisions are the metrics of the algorithm's success.
These resources proved incredibly useful in understanding the current state of knowledge in the domain and in identifying gaps and areas with room for experimentation. An example of how an MPC essentially works is shown in Fig 1. Since the model predictive control systems used here trade computational lightness for optimal control and scalability, the obvious approach is to search for methods that make the algorithm more efficient. While we do not attempt to build a system comparable in efficiency to the state of the art, we seek to understand exactly which factors have been manipulated to improve efficiency. The intuition behind this is the parallel nature of the distributed MPC, so we concentrate on the modular aspect of the algorithm and show how it can be used in a number of different settings, most importantly with differing numbers of agents. We also seek to keep collisions to a minimum so that the model is safe for use, although, given the assumptions made on position and velocity, this may not be conclusive. The project further aims to show that trajectory-generating algorithms benefit from having control as an optimization variable, as it avoids the need for re-correction and yields reliable prediction windows.
The method of writing and simulating the algorithm followed a number of steps, enumerated below:
- Definition of the Problem
The problem was defined partly by a need at the aerospace department, and also because the MPC system is currently under research for its potential to increase efficiency and scalability.
- Collection of Data
Data was collected mainly in the form of previously written code that suited only single vehicles which needed review, as well as a plethora of technical papers all on varying aspects of the algorithm, including the main advantages provided by MPCs and the main disadvantages that needed to be offset. Further knowledge on the subject was gained through course videos and textbooks that explained background concepts.
There was a need to understand both the qualities of a working MPC and the mathematics that enables one to create a system from scratch. As a prerequisite, predictive systems in general and advanced control systems and their types were extensively read up on. Further gaps in understanding were sorted out by approaching various students and academic staff on campus.
The mathematics formed a major portion of the skeleton of the algorithm. The swarm is modelled mathematically as a dynamic system using x, y, z position variables and a velocity for each. Both position and velocity have model functions that indicate how a prediction of a future state may be made. These are manipulated to form a modified dynamic system model in matrix form, allowing for easy coding and computation. Finally, an objective function dictates position at any given frame in a prediction window. The window refers to the number of time instants, from the current one, up to which the system predicts the agent's behaviour. The working of the window is further illustrated in Fig 2 and Fig 3.
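The prediction model described here can be sketched as a per-axis double integrator. The following is an illustrative Python sketch (the project code itself is written in MATLAB); the time step `DT` and the constant-acceleration integration are assumptions consistent with the position and velocity model functions above, not the report's exact matrices.

```python
# Illustrative per-axis double-integrator prediction model (assumed form;
# the actual project uses equivalent matrices in MATLAB).

DT = 0.1  # time step in seconds (placeholder value)

def step(pos, vel, acc, dt=DT):
    """Predict one future state under constant acceleration over dt."""
    new_pos = pos + vel * dt + 0.5 * acc * dt ** 2
    new_vel = vel + acc * dt
    return new_pos, new_vel

def predict(pos, vel, accs, dt=DT):
    """Roll the model forward over a prediction window, one entry in
    `accs` per time instant; returns the predicted (pos, vel) sequence."""
    window = [(pos, vel)]
    for a in accs:
        pos, vel = step(pos, vel, a, dt)
        window.append((pos, vel))
    return window
```

Applying `predict` independently to the x, y and z axes yields the full predicted window for one agent.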
The constraints include the amount of space the vehicle is allowed to move in and the physical limitations in its movement due to input control.
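Those two families of constraints can be sketched as simple box checks per axis; the numeric bounds below are placeholders, not values from the report.

```python
# Hypothetical workspace and input bounds; the actual values depend on the
# vehicle and the simulated environment.
POS_MIN, POS_MAX = -10.0, 10.0  # allowed workspace per axis
ACC_MAX = 2.0                   # physical limit on the input control

def feasible(pos, acc):
    """True if a predicted position/input pair satisfies the box constraints."""
    return POS_MIN <= pos <= POS_MAX and abs(acc) <= ACC_MAX
```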
There are three different ways in which we penalize the path created by the algorithm, each focusing on a particular variable or cost. These are then used in the cost minimization function. The three costs are:
(a) Trajectory Error Penalty, i.e., the greater the distance of the route from the initial position to the final one, the greater the cost of the trajectory.
(b) Penalty on Control, i.e., favour the fewest corrections to the agent's movement control, so that the agent receives fewer inputs and requires less computation.
(c) Penalty on Variation in Control, i.e., favour smooth trajectories that do not strain the physical restrictions of control on the vehicle and do not expect unrealistic reactions to sudden large amounts of control.
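For one axis, the three penalties can be sketched as sums over the prediction window. The quadratic form and the weights are illustrative assumptions; the actual cost is assembled and minimized in the project's MATLAB/CVX code.

```python
# Illustrative quadratic penalties (assumed form and placeholder weights).

def trajectory_error(positions, goal, w=1.0):
    """(a) Penalize distance of each predicted position from the goal."""
    return w * sum((p - goal) ** 2 for p in positions)

def control_effort(inputs, w=0.1):
    """(b) Penalize the magnitude of the control inputs."""
    return w * sum(u ** 2 for u in inputs)

def control_variation(inputs, w=0.1):
    """(c) Penalize step-to-step changes in control, for smoothness."""
    return w * sum((b - a) ** 2 for a, b in zip(inputs, inputs[1:]))

def total_cost(positions, inputs, goal):
    """Weighted sum that the solver would minimize for one agent."""
    return (trajectory_error(positions, goal)
            + control_effort(inputs)
            + control_variation(inputs))
```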
An expression that penalizes collision is also obtained, by comparing the predicted window of every agent with those of other vehicles in the swarm.
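That comparison can be sketched as a pairwise check over two predicted windows; the safety radius below is a placeholder value, not one from the report.

```python
import math

SAFE_DIST = 0.5  # minimum separation between agents (assumed value)

def first_collision(traj_a, traj_b, d_min=SAFE_DIST):
    """Return the first time index at which two predicted 3-D trajectories
    come closer than d_min, or None if the windows never conflict."""
    for k, (pa, pb) in enumerate(zip(traj_a, traj_b)):
        if math.dist(pa, pb) < d_min:
            return k
    return None
```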
Finally, all of these are compiled into a list of inequalities that define the constraints of the optimization problem.
The above constraints and costs give a cost function that optimizes the trajectory for each agent; note that this function does not depend on the number of agents. Given the decision variables and the inequality constraints, we now have a function to be called and optimized in each of the agents. This is where the distributed nature of the system comes in. As collision avoidance is provided only on demand, i.e., when a collision is detected in the predictions, the avoidance function is called only by the individual agents that are colliding. Hence, there are three sections of code, each called a varying number of times, to reduce time and computation.
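The on-demand structure can be sketched as follows. The two solver calls are hypothetical stand-ins for the project's MATLAB/CVX optimization functions (with and without avoidance constraints); only agents whose predictions conflict pay for the more expensive solve.

```python
def plan_swarm(agents, solve_free, solve_avoiding, collides):
    """One planning cycle with on-demand collision avoidance.

    agents         : list of agent states (opaque here)
    solve_free     : agent -> predicted trajectory, no avoidance constraints
    solve_avoiding : (agent, obstacle_trajs) -> trajectory with avoidance
    collides       : (traj_a, traj_b) -> bool, from the prediction comparison
    """
    # Every agent first solves its own unconstrained problem;
    # in principle these solves run in parallel on the agents themselves.
    trajs = [solve_free(a) for a in agents]
    for i, a in enumerate(agents):
        # Treat every conflicting neighbour's prediction as an obstacle.
        obstacles = [t for j, t in enumerate(trajs)
                     if j != i and collides(trajs[i], t)]
        if obstacles:  # re-solve only on demand
            trajs[i] = solve_avoiding(a, obstacles)
    return trajs
```

In a real deployment each solve runs on the agent itself, which is what makes the scheme distributed rather than merely decentralized.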
A simple description of the flowchart would look like Fig 4.
The algorithm followed is based largely on the flowchart and takes the initial and final positions of each agent as input. The output is the position, velocity and acceleration trajectories and the control inputs required at each time step to reach the next predicted time instant. The full algorithm is outlined in Fig 5.
The entire code was written in MATLAB, with the intention of changing it minimally when actual implementation on ROS (Robot Operating System) is needed. It uses CVX as a solver to optimize the given cost functions. This has the added benefit of exposing the limitations of the code through basic simulation, without even plotting the values obtained, as CVX reports infeasibility in a way that is easy to draw inferences from. The code consists of two different functions for path optimization, with and without the collision avoidance inequalities.
The code was first evaluated without collision avoidance, and easily found the optimal path even with multiple agents. Changes were made to improve speed, including cleaning up the code, and modularity was introduced once it showed sufficient results. Collision avoidance was then implemented; an overview of the method is shown in Fig 6. This needed to be evaluated in several ways, including time, number of collisions, and deviation from the optimal path after avoidance. The final results were drawn from observing how quickly the code calculated the steps (the practical time for the agent to reach each simulated stage was not considered).
The main trouble with the code was infeasibility due to the constraints imposed by the simulation. The solver often ended with no suitable value for the optimization function, as it exceeded the boundaries set. This was solved through accurate constraint modelling and by changing how many inequalities the solver optimized at a time. The collision avoidance was separated again so that it had priority over route optimization and so that the two sets of constraints would not become contradictory. The amount of deviation when a collision was avoided was also a recurring problem, especially when the agent could not feasibly reach the end point because of a collision just before it.
Most of the above problems were solved by recreating a practical example with more feasible constraints. The factors and constants used in the code, which should ultimately be determined from practical implementation, were also fixed to more appropriate values.
Lastly, the code was rewritten to exclude unnecessary computations and to better suit simulation instead of real-world applications. The plotting was made more readable and suitable comments were added.
RESULTS AND DISCUSSION
The results of the algorithm could be divided into two aspects.
One is the actual successful simulation of a system that generates trajectories for multiple agents for point-to-point movement without collision, in a stipulated amount of time, with both state and control constraints. This demonstrates a working distributed model predictive control algorithm.
The other is the evaluation of the algorithm against other path optimization metrics. This is again twofold. One is the distributed nature of the algorithm, defined by its time and computational burden; the speed with which the MPC calculates the best path is reflected in the smoothness of the trajectories, at least to the extent of the simulation. Two is the limitations of the code, either as inherent properties of the MPC or due to the current code composition and the constant values chosen. The major limitations are infeasibility and overshooting of the trajectory past the final positions. This is shown in the figures, but was fixed to a fair extent by adding velocity simulation and changing constants.
The final simulations for vehicles exchanging positions are shown in Figs 7, 8 and 9. Fig 7 shows the initial code, which has no collision avoidance but accurately controls the optimal trajectory to the final position. Fig 8 shows two collision-avoiding agents, and Fig 9 shows the same for multiple agents.
CONCLUSION AND RECOMMENDATIONS
The project serves as a learning base for how MPCs behave and for how to build one around a problem as well as optimize it around the given constraints and variables. The least computation corresponds to the best-case real-world implementation, as the accuracy and control optimization of an MPC are superior. The project also required extensive coding and mathematical understanding of the underlying concepts, which, on mastery, could be applied in a variety of control systems.
In future projects or modifications to the present system, simulation on actual agents is a consideration, along with an in-depth comparison of performance benchmarks and statistics under a variety of constraints and descriptors, including scalability, time and processing complexity. Comparison in a real-world application may bring more context, owing to practical bottlenecks not foreseeable or quantifiable through simulation alone.
Due to time restrictions on the completion of the project, only a simple simulation of the system was performed, without comparative analysis. The results themselves indicated trends toward our theoretical intuitions; however, they were not extensively and numerically evaluated.
Model predictive control has recently been viewed by the academic world as a system that may offer great potential in path planning, especially in swarm applications. As there is no clear best-suited algorithm for these applications, it becomes imperative to find methods to optimize code that shows such potential. Most importantly, the real-world applications of such algorithms are vast if the instability in computation time and the chances of collision are minimized.
Further improvement to this particular system would include a harder look at fringe cases as well as a secondary simulation in Gazebo or similar tools. Finally, a practical implementation with a varying number of agents is advised.
- Arkadi Nemirovski and Alexander Shapiro. 2006. Convex Approximations of Chance Constrained Programs. Society for Industrial and Applied Mathematics.
- Carlos E. Luis and Angela P. Schoellig. 2019. Trajectory Generation for Multiagent Point-To-Point Transitions via Distributed Model Predictive Control. IEEE Robotics and Automation Letters.
- Erik P. Anderson, Randal W. Beard and Timothy W. McLain. 2003. Real Time Dynamic Trajectory Smoothing for Uninhabited Aerial Vehicles. IEEE Transactions on Control Systems Technology.
- Fei Gao, William Wu, Jie Pan, Boyu Zhou and Shaojie Shen. 2018. Optimal Time Allocation for Quadrotor Trajectory Generation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
- Hai Zhu and Javier Alonso-Mora. 2019. Chance-Constrained Collision Avoidance for MAVs in Dynamic Environments. IEEE Robotics and Automation Letters.
- Roberto Rossi, Angel Santamaria-Navarro, Juan Andrade-Cetto and Paolo Rocco. 2017. Trajectory Generation for Unmanned Aerial Manipulators through Quadratic Programming. IEEE Robotics and Automation Letters.
- Sebastian Thrun, Wolfram Burgard and Dieter Fox. 2005. Probabilistic Robotics. The MIT Press.
Given the advanced nature of both the objective of the project and the mathematics involved, many sources of help proved indispensable in the initial understanding of the background work. Aside from my guide, Prof Mangal Kothari, incredible gratitude goes out to Mr Mahathi Bhargavapuri and Mr Abhiroop Rastogi for taking the time to sort out my ignorant blundering around concepts despite their busy schedules. The Department of Electrical Engineering, Indian Institute of Technology, Kanpur also proved outstandingly helpful when issues occurred with the solver used for computation. Further thanks go to other members of the Swarm Lab for occasional discussions and suggestions.