Method Overview
We investigate leveraging large language models (LLMs) within a multi-agent framework to enhance the decision-making capability of autonomous driving systems. This study develops a multi-agent autonomous driving decision-making framework based on LLMs that simulates the human decision-making process and exploits the LLM's own reasoning ability to form knowledge-driven driving, thereby improving the safety and efficiency of LLM agents in relatively complex environments. The framework consists of five modules: an environment module, a multi-agent interaction module, a multi-step planning module, a shared-memory module, and a ranking-based reflection module. The specific research contents are as follows:

(1) Environment module: Create a realistic autonomous driving simulation scenario for highway ramp merging, and convert the simulation images and data into a textual description of the scene that serves as part of the LLM's input (a minimal sketch follows this list).

(2) Multi-agent interaction module: By analyzing the historical behavior and real-time state information of surrounding vehicles to infer their potential intentions, each agent formulates a series of subsequent action plans, achieving implicit interaction similar to that of human drivers (see the intention-inference sketch below).

(3) Multi-step planning module: Build a three-layer progressive chain of thought, goal - plan - action, so that the LLM reasons step by step and layer by layer about complex scenes and arrives at a final action decision (see the prompting sketch below).

(4) Shared-memory module: Maintain a unified vector database accessible to all LLM agents, ensuring consistency of experience and performance across agents; this is analogous to the parameter-sharing mechanism among agents in multi-agent reinforcement learning and achieves similar benefits (see the shared-store sketch below).

(5) Ranking-based reflection module: Employ specific metrics to quantify the safety and efficiency of the vehicle's state after each action decision is executed; after the scenario concludes, a reflection agent revises low-scoring, erroneous decisions, and the revised decisions, together with the high-scoring ones, are stored in the shared-memory module for collective learning and improvement (see the reflection sketch below).
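To make the environment module concrete, the sketch below shows one way to render numeric simulator state as text for the LLM prompt. The VehicleState fields and the describe_scene wording are illustrative assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    vid: int          # vehicle id (0 = ego)
    lane: int         # lane index
    position: float   # longitudinal position in meters
    speed: float      # speed in m/s

def describe_scene(ego: VehicleState, others: list[VehicleState]) -> str:
    """Convert numeric simulator state into a textual scene description."""
    lines = [f"Ego vehicle: lane {ego.lane}, position {ego.position:.1f} m, "
             f"speed {ego.speed:.1f} m/s."]
    # Describe surrounding vehicles, nearest first, relative to the ego vehicle.
    for v in sorted(others, key=lambda v: abs(v.position - ego.position)):
        gap = v.position - ego.position
        rel = "ahead of" if gap >= 0 else "behind"
        lines.append(f"Vehicle {v.vid}: lane {v.lane}, {abs(gap):.1f} m {rel} ego, "
                     f"speed {v.speed:.1f} m/s.")
    return "\n".join(lines)

if __name__ == "__main__":
    ego = VehicleState(0, lane=1, position=150.0, speed=25.0)
    others = [VehicleState(1, 0, 170.0, 22.0), VehicleState(2, 1, 120.0, 27.0)]
    print(describe_scene(ego, others))
```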
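For the multi-agent interaction module, intention inference could itself be delegated to the LLM; the rule-based stand-in below (reusing VehicleState from the previous sketch) only shows the input/output contract: a short state history in, a textual intention out. The thresholds are arbitrary placeholders.

```python
def infer_intention(history: list[VehicleState]) -> str:
    """Heuristically infer a vehicle's likely intention from its recent
    state history; the framework feeds such inferences, as text, into the
    ego agent's prompt."""
    if len(history) < 2:
        return "intention unknown (insufficient history)"
    prev, recent = history[0], history[-1]
    if recent.lane != prev.lane:
        return "changing lanes"
    dv = recent.speed - prev.speed
    if dv > 1.0:               # placeholder threshold in m/s
        return "accelerating, unlikely to yield"
    if dv < -1.0:
        return "decelerating, possibly yielding to merging traffic"
    return "keeping lane at a steady speed"
```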
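The multi-step planning module's goal - plan - action chain can be realized as three successive LLM calls, each conditioned on the previous layer's output, as in the sketch below. The prompt wording, the `llm` text-completion callable, and the five-action set (in the style of common highway simulators) are assumptions for illustration.

```python
from typing import Callable

def decide(scene: str, memories: str, llm: Callable[[str], str]) -> str:
    """Three-layer goal -> plan -> action reasoning chain.
    `llm` is any text-in/text-out completion function."""
    goal = llm(
        f"Scene:\n{scene}\n\nRelevant past experience:\n{memories}\n\n"
        "Step 1 (Goal): state the ego vehicle's immediate driving goal, "
        "e.g. merge onto the highway safely before the ramp ends."
    )
    plan = llm(
        f"Scene:\n{scene}\nGoal: {goal}\n\n"
        "Step 2 (Plan): outline a short sequence of maneuvers that achieves the goal."
    )
    action = llm(
        f"Scene:\n{scene}\nGoal: {goal}\nPlan: {plan}\n\n"
        "Step 3 (Action): choose exactly one action for this timestep from "
        "[LANE_LEFT, IDLE, LANE_RIGHT, FASTER, SLOWER]."
    )
    return action
```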
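A minimal sketch of the shared-memory module follows, assuming an `embed` function that maps text to a fixed-size numpy vector (any sentence-embedding model would do). Because every agent holds the same SharedMemory instance, an experience written by one agent is immediately retrievable by all others, which is what makes the design analogous to parameter sharing in multi-agent reinforcement learning.

```python
import numpy as np

class SharedMemory:
    """A unified vector store shared by all LLM agents."""

    def __init__(self, embed):
        self.embed = embed                    # text -> np.ndarray embedding function
        self.keys: list[np.ndarray] = []
        self.entries: list[str] = []

    def add(self, scene: str, decision: str) -> None:
        """Store a scene/decision pair keyed by the scene's embedding."""
        self.keys.append(self.embed(scene))
        self.entries.append(f"Scene: {scene}\nDecision: {decision}")

    def retrieve(self, scene: str, k: int = 3) -> list[str]:
        """Return the k stored experiences most similar to the query scene."""
        if not self.keys:
            return []
        q = self.embed(scene)
        sims = [float(q @ key / (np.linalg.norm(q) * np.linalg.norm(key) + 1e-8))
                for key in self.keys]
        top = np.argsort(sims)[::-1][:k]      # indices of highest cosine similarity
        return [self.entries[i] for i in top]
```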
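Finally, a sketch of the ranking-based reflection loop. The safety and efficiency metrics, their equal weighting, and the 0.5 threshold are illustrative assumptions; the paper's actual scoring may differ. Low-scoring decisions are revised by the reflection LLM before being stored, while high-scoring ones are stored as-is, so the shared memory accumulates both corrected and successful experiences.

```python
from typing import Callable

def reflect_and_store(episode: list[tuple[str, str, dict]],
                      memory: "SharedMemory",
                      llm: Callable[[str], str],
                      threshold: float = 0.5) -> None:
    """After an episode, score each (scene, decision, outcome) triple,
    revise the low-scoring decisions via the reflection LLM, and store
    everything in the shared memory."""
    for scene, decision, outcome in episode:
        # Illustrative metrics: gap to the nearest vehicle (safety) and
        # speed relative to the limit (efficiency), each normalized to [0, 1].
        safety = min(outcome["min_gap"] / 30.0, 1.0)
        efficiency = min(outcome["speed"] / outcome["speed_limit"], 1.0)
        score = 0.5 * safety + 0.5 * efficiency
        if score < threshold:
            # Ask the reflection LLM to correct the erroneous decision.
            decision = llm(
                f"Scene:\n{scene}\nOriginal decision: {decision}\n"
                f"Outcome: min gap {outcome['min_gap']:.1f} m, "
                f"speed {outcome['speed']:.1f} m/s.\n"
                "This decision scored poorly on safety/efficiency. "
                "Give a corrected decision and a one-line lesson."
            )
        memory.add(scene, decision)
```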