Researchers at South Korea’s Chung-Ang University Develop a ‘Meta-Reinforcement’ Machine Learning Algorithm for Traffic Lights to Improve Vehicle Throughput – FutureCar.com

author: Eric Walz

No matter how advanced modern vehicles become, including those capable of autonomous driving, getting stuck in busy urban traffic is unlikely to go away anytime soon as long as traditional traffic signal technology controls the flow of vehicles.

But recent advancements in artificial intelligence (AI) and machine learning have shown promise in optimizing traffic signal control that can make driving in congested urban areas less frustrating.

Researchers from South Korea’s Chung-Ang University are experimenting with reinforcement learning (RL) algorithms to solve non-stationary traffic signal control problems. Their algorithm automatically customizes “reward functions” based on the classification of traffic regimes.

In machine learning, RL trains agents through a “trial and error” problem-solving method.

RL is one of the three basic machine learning paradigms, along with supervised learning and unsupervised learning. It focuses on how intelligent agents (here, the traffic light) should take actions in an environment in order to maximize a reward, which in this case is vehicles traveling uninterrupted through urban traffic. The goal of an RL agent is to maximize the total reward.

Basically, the reward function of a machine learning algorithm is an incentive that tells the agent what is correct and what is wrong through reward and punishment. But RL algorithms must often sacrifice immediate rewards (some drivers may get stuck at a red light) in order to maximize the total reward (improved overall traffic flow).
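As a rough illustration of that trade-off (not the paper’s implementation), a tabular Q-learning update shows how a discount factor lets an agent accept a small immediate penalty when it leads to higher future rewards. The states, actions, and values below are hypothetical toy examples:

```python
import numpy as np

# Toy setup: states are queue-length bins, actions are
# "keep phase" (0) or "switch phase" (1). All names are illustrative.
n_states, n_actions = 10, 2
Q = np.zeros((n_states, n_actions))

alpha = 0.1   # learning rate
gamma = 0.95  # discount factor: weights future rewards against immediate ones

def q_update(state, action, reward, next_state):
    """One Q-learning step: a small immediate penalty (e.g. holding a
    red light) can still raise Q if it leads to a high-value state."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

# Pretend state 5 is already known to be valuable, then take an action
# that costs -1 now but lands there: its Q-value still increases.
Q[5] = [0.0, 10.0]
q_update(0, 1, -1.0, 5)   # Q[0, 1] rises to 0.1 * (-1 + 0.95 * 10) = 0.85
```

Despite the immediate -1, the action’s estimated value becomes positive because the discounted future reward outweighs the short-term cost.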

Existing traffic signals rely on a “rule-based controller” (red means stop, green means go). The researchers’ objective is to reduce vehicle delays during light traffic conditions and to maximize vehicle throughput during periods of road congestion.

Sub-optimal traffic signal controllers like these affect the daily lives of people in urban areas who frequently deal with congestion and delays. Conventional traffic lights with fixed signal timings are not well equipped to ease traffic congestion.

In addition, existing traffic signal controllers cannot adapt to the ever-changing, random traffic patterns that occur throughout the day. Although a human traffic controller might perform better than a fixed controller, a person can only manage a few intersections at a time.

One of the biggest challenges for the researchers was implementing RL in a non-stationary environment, one in which vehicles pass through intersections at random. Previous research has explored RL algorithms as one possible solution for easing traffic woes, but RL algorithms do not always achieve the best results due to the dynamic nature of traffic environments.

To better address this, the researchers developed what they call a “meta-RL model” that adjusts its goal based on the traffic environment. The meta-RL algorithm has wide-area coverage and outperforms existing alternative algorithms, according to the researchers at Chung-Ang University.

The meta-RL machine learning model has two main goals: to maximize vehicle throughput through intersections during congested periods, such as rush hour, and to minimize vehicle delays during lighter traffic. The researchers, led by Prof. Keemin Sohn, developed a context-based meta-RL model for traffic signal control that incorporates an extended deep Q-network (EDQN).

Sohn is a professor in the School of Civil and Environmental Engineering at Chung-Ang University, Korea. His research interests include data science and artificial intelligence, with applications in transportation planning.

Here’s how the meta-RL model works. First, it determines whether traffic is “saturated” or “unsaturated” using a latent variable that indicates the overall environmental condition. Based on the current traffic flow, the model either maximizes throughput or minimizes delays, much as a human controller would. It does this by setting traffic signal cycles (the action).
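A minimal sketch of that regime-switching idea, with entirely hypothetical names and thresholds (the paper infers the regime from a latent variable, not a fixed cutoff): classify the regime from observed traffic, then select the objective the controller optimizes.

```python
# Hypothetical sketch of regime-dependent control, not the authors' code.
from dataclasses import dataclass

@dataclass
class TrafficObs:
    throughput: float   # vehicles per hour crossing the intersection
    avg_delay: float    # seconds of delay per vehicle

SATURATION_THRESHOLD = 1800.0  # illustrative flow cutoff (veh/h)

def classify_regime(obs: TrafficObs) -> str:
    """Stand-in for the model's latent-variable regime inference."""
    return "saturated" if obs.throughput >= SATURATION_THRESHOLD else "unsaturated"

def performance(obs: TrafficObs, regime: str) -> float:
    """Objective the controller tries to improve: throughput when
    saturated, negative delay (i.e. less delay is better) otherwise."""
    return obs.throughput if regime == "saturated" else -obs.avg_delay
```

The key design point is that a single controller can pursue different objectives without retraining, simply by conditioning on the inferred regime.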

The action is guided by the provision of a “reward.” The reward function is set to +1 or -1, corresponding to better or worse performance in handling traffic relative to the previous signal interval. In addition, the EDQN acts as a decoder that jointly controls traffic signals for multiple intersections.
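That ±1 scheme can be sketched directly. Here `perf` is a hypothetical stand-in for whichever performance measure (throughput or delay) is currently active:

```python
def interval_reward(curr_perf: float, prev_perf: float) -> float:
    """Return +1 if the current signal interval handled traffic better
    than the previous one, -1 otherwise, matching the +/-1 reward
    described for the model."""
    return 1.0 if curr_perf > prev_perf else -1.0

# Example: throughput improved from 1500 to 1620 veh/h -> reward is +1.
assert interval_reward(1620.0, 1500.0) == 1.0
assert interval_reward(1400.0, 1500.0) == -1.0
```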

“Existing studies have devised meta-RL algorithms based on intersection geometry, traffic signal phases, or traffic conditions,” explained Prof. Sohn. “The meta-RL works autonomously in detecting traffic states, classifying traffic regimes, and assigning signal phases accordingly.”

The researchers trained and tested their meta-RL algorithm using Vissim 21.0, which is a commercial traffic simulator used by engineers to model real-world traffic conditions.

The team modeled a transportation network in southwest Seoul consisting of 15 intersections to serve as a realistic test environment. After meta-training, the RL model could adapt to new tasks without adjusting its parameters.

These experiments show that the proposed model could switch control tasks without any explicit traffic information. It could also differentiate between rewards according to the saturation level of traffic conditions.

The research team found that the EDQN-based meta-RL model outperformed existing algorithms for traffic signal control. However, the researchers stressed the need for an even more precise algorithm that accounts for the different saturation levels from intersection to intersection across an urban area.

“Existing research has employed reinforcement learning for traffic signal control with a single fixed objective. In contrast, this work has devised a controller that can autonomously select the optimal target based on the latest traffic condition. The framework, if adopted by traffic signal control agencies, could yield travel benefits that have never been experienced before,” said Prof. Sohn.

The findings of the study were published in the journal “Computer-Aided Civil and Infrastructure Engineering” and were made available online on Sept. 30.
