Learning Efficient Multi-Agent Robotic Navigation for Exploration and Mapping | NYU Tandon School of Engineering


Transportation & Infrastructure, Urban



Project Abstract

This project formalizes, both theoretically and experimentally, distributed multi-robot (i.e., swarm) navigation and exploration problems by leveraging Graph Neural Network (GNN) architectures within reinforcement learning policies. The approach acts directly on the graph structure of the swarm, exploiting the distributed representations learned by modern GNN methods, and thereby offers prospective scalability to large swarms comprising hundreds of agents. It is initially envisioned as a model-free implementation, with the option to extend to a model-based (or hybrid) implementation for comparison, as well as to scale to a large number of agents (𝑛 >> 3).


Project Description & Overview

Design and Simulation Setting:

  • Each drone has its own policy and decides its action independently.
  • The state history (i.e., mapping coverage of the environment) and the relative position of each drone in the swarm are shared pairwise.
  • State and action spaces are discrete.
  • Initially a small number of drones is employed (𝑛 ≤ 3), which is then scaled up (𝑛 > 3).
  • Initially a two-dimensional spatial environment (which could be expanded to three dimensions).
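The setting above can be sketched in a few lines of NumPy. This is an illustrative toy, not the project's simulator: the grid size, action set, and data structures are assumptions chosen to show discrete per-drone actions and a shared coverage map.

```python
import numpy as np

# Four discrete moves on the grid; the action encoding is an assumption.
ACTIONS = {0: (0, 1), 1: (0, -1), 2: (1, 0), 3: (-1, 0)}

def step(positions, actions, grid_size, coverage):
    """Apply one discrete action per drone and mark visited cells as covered."""
    new_positions = []
    for (r, c), a in zip(positions, actions):
        dr, dc = ACTIONS[a]
        r = int(np.clip(r + dr, 0, grid_size - 1))
        c = int(np.clip(c + dc, 0, grid_size - 1))
        coverage[r, c] = True
        new_positions.append((r, c))
    return new_positions

# Usage: two drones on a 5x5 grid (the initial small-n regime, n <= 3).
grid_size = 5
coverage = np.zeros((grid_size, grid_size), dtype=bool)
positions = [(0, 0), (4, 4)]
for (r, c) in positions:
    coverage[r, c] = True
positions = step(positions, [2, 3], grid_size, coverage)
print(positions, int(coverage.sum()))  # [(1, 0), (3, 4)] 4
```

A real episode would loop `step` until the coverage map crosses a target fraction, which is the efficiency metric discussed below.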

This project extends and expands on a previous CUSP Capstone project, with the goal of a publication that incorporates results from both projects. The previous Capstone project, currently in progress, covers only the convergence of strategies for discrete stepwise navigation actions and coverage in adaptive motion planning by a pair of drones (𝑛 = 2), optimizing a pairwise objective function via tabular reinforcement learning policies.
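The tabular policies mentioned above amount to Q-tables updated with one-step backups. A minimal sketch of such an update follows; the state/action counts and hyperparameters are assumptions for illustration, not values from the prior project.

```python
import numpy as np

# Illustrative tabular Q-learning for one drone:
# 25 states (e.g., a 5x5 grid) and 4 discrete moves are assumed sizes.
n_states, n_actions = 25, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.95  # learning rate and discount factor (assumed)

def q_update(s, a, r, s_next):
    """Standard one-step Q-learning backup acting directly on the table."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

q_update(s=0, a=2, r=1.0, s_next=1)
print(round(Q[0, 2], 3))  # 0.1 after a single update from a zero table
```

For 𝑛 = 2 the prior project can index such a table by the joint (pairwise) state; the table's size grows combinatorially with 𝑛, which motivates the GNN policy below.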

One obstacle to scaling this setting is redundant action, which increases in proportion to the number of drones acting independently. This limitation might be addressed by the aforementioned GNN policy network. In the GNN policy network, each node represents a drone, and each edge represents pairwise communication of state information (e.g., a Kalman state-transition matrix representation and/or syntactic traces as a transition-vector representation summarizing path history), together with the relative positions of the respective drones. Using this state information, each drone can take optimized actions to map the environment collaboratively. Performance can be evaluated by how efficiently the drones map the environment (e.g., time steps to reach 95% coverage).

By employing a GNN, the state-action spaces of each drone might prospectively be further augmented with structured and unstructured data streams, e.g., optical-flow data (images collected by each drone) as supplemental state and inertial measurement unit (IMU) data (drone acceleration) as supplemental action, in order to govern both agent-specific and aggregate autonomous drone-swarm behavior in real-world environments.
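The node/edge structure described above can be sketched as a single round of message passing in NumPy. This is a hedged sketch, not the project's architecture: the feature dimension, random weights, all-pairs communication graph, and mean aggregation are all illustrative assumptions.

```python
import numpy as np

# One-round message-passing sketch of the GNN policy: each node is a
# drone, each edge carries pairwise state information. Sizes and the
# mean aggregation are assumptions chosen for illustration.
rng = np.random.default_rng(0)
n_drones, feat_dim, n_actions = 3, 8, 4

X = rng.normal(size=(n_drones, feat_dim))             # per-drone state summaries
A = np.ones((n_drones, n_drones)) - np.eye(n_drones)  # all-pairs communication
W_msg = rng.normal(size=(feat_dim, feat_dim))
W_out = rng.normal(size=(feat_dim, n_actions))

# Aggregate neighbours' messages (mean), combine with own state, score actions.
messages = (A @ (X @ W_msg)) / A.sum(axis=1, keepdims=True)
H = np.tanh(X + messages)
logits = H @ W_out               # one row of action scores per drone
actions = logits.argmax(axis=1)  # each drone still decides independently
print(actions.shape)
```

Because the same weights are shared across all nodes, the parameter count is independent of the number of drones, which is what makes scaling to 𝑛 >> 3 plausible in principle.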


Datasets

Synthetically generated data from simulation.


Competencies

  • Proficiency with scientific coding (e.g., SciPy, NumPy, Julia).
  • Basic background or familiarity with robotic SLAM, adaptive state estimation/control, or state modeling (e.g., Kalman Filter) is beneficial.

Learning Outcomes & Deliverables

We expect to produce publications and source code related to this project.


Students

Natalie (Huiyue) Feng, Kunal Kulkarni, Wangtianhan Pang, Aneri Piyushkumar Patel