Harnessing Relational Structures in Multi-Objective Project Portfolio Optimization: A GNN-Enhanced Deep Reinforcement Learning Framework

Masoudi, Babak

doi:10.30473/coam.2026.75894.1340

- فلوچارت مراحل ورود و ثبت نام در سامانه نشریات علمی وزارت عتف

- راه اندازی سیستم مدیریت نشریات علمی دانشگاه پیام نور

تعداد نشریات	49
تعداد شماره‌ها	1,298
تعداد مقالات	11,176
تعداد مشاهده مقاله	23,355,642
تعداد دریافت فایل اصل مقاله	15,748,897

	Harnessing Relational Structures in Multi-Objective Project Portfolio Optimization: A GNN-Enhanced Deep Reinforcement Learning Framework
Control and Optimization in Applied Mathematics
مقالات آماده انتشار، پذیرفته شده، انتشار آنلاین از تاریخ 31 اردیبهشت 1405 اصل مقاله (1.01 M)
نوع مقاله: Research Article
شناسه دیجیتال (DOI): 10.30473/coam.2026.75894.1340
نویسنده
Babak Masoudi^*
Department of Information Technology, Payame Noor University (PNU), P.O. Box 19395-3697, Tehran, Iran
چکیده
Relational graph structures add a layer of complexity to multi-objective combinatorial optimization (MOCO) that often renders large-scale NP-hard instances computationally prohibitive. While traditional metaheuristics like NSGA-II remain the industry standard, their reactive nature prevents them from learning policies that generalize to unseen tasks. To address this, an end-to-end Deep Reinforcement Learning (DRL) framework is introduced, integrated with a Graph Convolutional Network (GCN) specifically for the Multi-Objective Project Portfolio Selection Problem (PPSP). By mapping the structural interdependencies of projects, the GCN provides critical cues that allow a Proximal Policy Optimization (PPO) agent to construct high-quality portfolios. Training stability is ensured through a reward normalization strategy derived from weighted-sum Pareto scalarization theory. Benchmarks on Barab'{a}si-Albert and fully-connected graph instances reveal that the proposed DRL agent achieves a Hypervolume indicator 2.4 times higher than NSGA-II on 50-project tasks. Notably, interpretability analysis shows the model learns to prioritize high-degree "hub" projects with strategic synergies. Regarding scalability, the agent maintained over 90% of its Hypervolume performance when transitioned from 50 to 200 projects in a zero-shot manner, requiring no further training. This efficiency is mirrored in its computational speed; an average inference time of 12.69 ms represents a 300-fold acceleration compared to the metaheuristic baseline. Such results underscore the potential of GNN-driven structural exploitation as a robust alternative for high-speed, multi-objective optimization.
تازه های تحقیق
A GNN-enhanced PPO framework is proposed for multi-objective project portfolio selection, unifying structural graph encoding with Pareto-grounded reward normalization. The DRL agent achieves a Hypervolume indicator 2.4× higher than NSGA-II on structured Barabási-Albert graph instances with N=50 projects. Zero-shot scalability is demonstrated: an agent trained on 50 projects maintains over 90% Hypervolume when tested on 200-project instances without retraining. Inference is completed in ≈12.69 ms — over 300× faster than NSGA-II — enabling real-time multi-objective decision support. Interpretability analysis confirms that the GNN learns to prioritize high-degree “hub” projects, providing transparent and explainable portfolio construction logic.
کلیدواژه‌ها
Combinatorial optimization؛ Multi-objective optimization؛ Deep reinforcement learning؛ Graph neural networks؛ Project portfolio selection

مراجع
[1] Amirian, S., Amiri, M., Taghavifard, M.T. (2024). “Optimizing supply chain design for sustainability and reliability: A comparative study of augmented epsilon and normalized normal constraint methods”. Control and Optimization in Applied Mathematics, 9(1), 97– 130. https://doi.org/10.30473/coam.2023.67540.1230 [2] Angioni, D., Archetti, C., Speranza, M.G. (2025). “Neural combinatorial optimization: A tutorial”. Computers & Operations Research, 182, 107102. https://doi.org/10. 1016/j.cor.2025.107102 [3] Boschetti, M.A., Maniezzo, V. (2024). “Contemporary approaches in matheuristics: An updated survey”. Annals of Operations Research, 343(2), 663–700. https://doi.org/ 10.1007/s10479-024-06302-z [4] Darvish, A., Sepehri, M. (2025). “Artificial intelligence-driven project portfolio optimization under deep uncertainty using adaptive reinforcement learning”. Applied Sciences, 15(23), 12713. https://doi.org/10.3390/app152312713 [5] Deb, K., Pratap, A., Agarwal, S., Meyarivan, T. (2002). “A fast and elitist multiobjective genetic algorithm: NSGA-II”. IEEE Transactions on Evolutionary Computation, 6(2), 182–197. https://doi.org/10.1109/4235.996017 [6] Ekmekcioğlu, Ö., Pınar, M.Ç. (2023). “Graph neural networks for deep portfolio optimization”. Neural Computing and Applications, 35(28), 20663–20674. https://doi.org/ 10.1007/s00521-023-08862-w [7] Fu, X., Gu, S. (2024). “Deep reinforcement learning algorithm based on graph weight multi-pointer network for solving multiobjective traveling salesman problem”. IEEE Access, 12, 179091–179103. https://doi.org/10.1109/ACCESS.2024.3505436 [8] Gama, F., Isufi, E., Leus, G., Ribeiro, A. (2020). “Graphs, convolutions, and neural networks: From graph filters to graph neural networks”. IEEE Signal Processing Magazine, 37(6), 128–138. https://doi.org/10.1109/MSP.2020.3016143 [9] Hu, D., He, J. (2025). “A dynamic multi-objective optimization approach for computing resource allocation in industrial model repository”. Swarm and Evolutionary Computation, 99, 102219. https://doi.org/10.1016/j.swevo.2025.102219 [10] Janinhoff, L., Klein, R., Sailer, D., Schoppa, J.M. (2024). “Out-of-home delivery in last-mile logistics: A review”. Computers & Operations Research, 168, 106686. https:// doi.org/10.1016/j.cor.2024.106686 [11] Kipf, T.N., Welling, M. (2017). “Semi-supervised classification with graph convolutional networks”. In: Proceedings of the 5th International Conference on Learning Representations (ICLR). https://arxiv.org/abs/1609.02907 [12] Liesiö, J., Salo, A., Keisler, J.M., Morton, A. (2021). “Portfolio decision analysis: Recent developments and future prospects”. European Journal of Operational Research, 293(3), 811–825. https://doi.org/10.1016/j.ejor.2020.12.015 [13] Lim, Q.Y.E., Cao, Q., Quek, C. (2022). “Dynamic portfolio rebalancing through reinforcement learning”. Neural Computing and Applications, 34(9), 7125–7139. https: //doi.org/10.1007/s00521-021-06853-3 [14] Liu, X., Tong, X., Liu, Q. (2021). “Profiling Pareto front with multi-objective Stein variational gradient descent”. Advances in Neural Information Processing Systems (NeurIPS), 34, 14721–14733. [15] Manchanda, S., Michel, S., Drakulic, D., Andreoli, J.M. (2023). “On the generalization of neural combinatorial optimization heuristics”. Machine Learning and Knowledge Discovery in Databases, (pp. 426–442). Springer, Cham. https://doi.org/10.1007/ 978-3-031-26412-2_27 [16] McPherson, M., Smith-Lovin, L., Cook, J.M. (2001). “Birds of a feather: Homophily in social networks”. Annual Review of Sociology, 27(1), 415–444. https://doi.org/10. 1146/annurev.soc.27.1.415 [17] Miettinen, K. (1999). Nonlinear multiobjective optimization. Springer Science & Business Media. ISBN: 978-0-7923-8278-2 [18] Mohseni, N., McMahon, P.L., Byrnes, T. (2022). “Ising machines as hardware solvers of combinatorial optimization problems”. Nature Reviews Physics, 4(6), 363–379. https: //doi.org/10.1038/s42254-022-00440-8 [19] Pereira, J.L.J., Oliver, G.A., Francisco, M.B., Cunha, S.S., Gomes, G.F. (2022). “A review of multi-objective optimization: Methods and algorithms in mechanical engineering problems”. Archives of Computational Methods in Engineering, 29(4), 2285–2308. https://doi.org/10.1007/s11831-021-09663-x [20] Sharma, S., Kumar, V. (2022). “A comprehensive review on multi-objective optimization techniques: Past, present and future”. Archives of Computational Methods in Engineering, 29(7), 5605–5633. https://doi.org/10.1007/s11831-022-09778-9 [21] Sutton, R.S., Barto, A.G. (2018). Reinforcement learning: An introduction (2nd ed.). The MIT Press. [22] Wang, H. (2024). “Multi-objective reinforcement learning based on nonlinear scalarization and long-short-term optimization”. Robotic Intelligence and Automation, 44(3), 475– 487. https://doi.org/10.1108/RIA-11-2023-0174 [23] Wang, S., Sun, W., Huang, M. (2024). “An adaptive large neighborhood search for the multi-depot dynamic vehicle routing problem with time windows”. Computers & Industrial Engineering, 191, 110122. https://doi.org/10.1016/j.cie.2024.110122 [24] Yuan, M., Yu, Q., Zhang, L., Lu, S., Li, Z., Pei, F. (2025). “Deep reinforcement learning based proximal policy optimization algorithm for dynamic job shop scheduling”. Computers & Operations Research, 183, 107149. https://doi.org/10.1016/j.cor.2025. 107149
آمار تعداد مشاهده مقاله: 121 تعداد دریافت فایل اصل مقاله: 42

سامانه مدیریت نشریات علمی. قدرت گرفته از سیناوب

پیوندهای مفید

اخبار و اعلانات

آمار

Harnessing Relational Structures in Multi-Objective Project Portfolio Optimization: A GNN-Enhanced Deep Reinforcement Learning Framework