Abstract
Every day, an increasing number of services rely on the efficient operation of communication networks. Although these networks have grown highly complex, their principal purpose remains to enable data exchange. IP-optical networks play a pivotal role in worldwide data transmission, and operating them requires solving the routing and spectrum assignment (RSA) problem. While useful, traditional optimization methods and heuristics fall short under certain conditions, motivating the use of reinforcement learning (RL) for its adaptability and efficiency in complex decision-making problems. This paper extensively analyzes various partially observable Markov decision process (POMDP) formulations and their impact on solving the RSA problem, using proximal policy optimization (PPO) as the underlying RL algorithm throughout. We evaluate these formulations across different network topologies and demand patterns, benchmarking RL against baselines such as k-shortest path first-fit (KSPFF), integer linear programming (ILP), and a random policy. Our findings confirm that performance depends critically on the formulation: with a well-designed POMDP, RL can match or outperform the baselines. Notably, POMDPs with fewer possible actions and more concise observations improve the RL agent's performance. The best POMDP formulations also perform consistently across multiple topologies, and differently defined reward signals do not affect overall performance. These and related findings are presented using a carefully planned workflow with close attention to statistical significance, appropriate baselines, and comprehensive visualizations.

Reference entry
Christou, F.; Hornek, N.; Kirstädter, A.
The Impact of Reinforcement Learning Formulations on Solving the Routing and Spectrum Assignment Problem
Proceedings of the IEEE International Conference on Communications (ICC 2025), Montreal, June 2025, pp. 1–6