Abstract To tackle challenges such as convergence difficulties and suboptimal performance when reinforcement learning is applied to intelligent decision-making for joint operations, this study introduces an enhanced decision-making approach for joint operations based on an improved Proximal Policy Optimization (PPO) algorithm. We propose a structured intelligent decision-making model designed to execute decision-making functions effectively. The strategy loss mechanism is improved by constraining the upper limit of the strategy loss function. Furthermore, a priority sampling mechanism is developed to assess sample values, thereby improving the efficiency of sampling during training.
Additionally, a network structure that supports distributed interaction and centralized learning is designed to expedite the training process. The proposed method is then applied to a joint operations simulation platform for intelligent decision-making. Simulation results demonstrate that our algorithm successfully addresses the aforementioned issues, enabling autonomous decisions based on battlefield dynamics and ultimately leading to victory.
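
For orientation, the following is a minimal sketch of the standard PPO clipped surrogate loss, the baseline that the improved strategy loss builds on. It is not the modified loss proposed in this study, whose exact form is not given in the abstract. The function name `ppo_clip_loss`, its arguments, and the default `epsilon=0.2` are assumptions chosen for illustration.

```python
def ppo_clip_loss(ratios, advantages, epsilon=0.2):
    """Standard PPO clipped surrogate loss, averaged over samples.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    epsilon:    clipping range (illustrative default, an assumption)

    Returns the negated surrogate objective, so that lower is better.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the ratio into [1 - epsilon, 1 + epsilon].
        clipped_r = min(max(r, 1.0 - epsilon), 1.0 + epsilon)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)

    # Negate: the surrogate is maximized, the loss is minimized.
    return -total / len(ratios)
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra objective, which limits how far one update can move the policy. With a negative advantage and a ratio above `1 + epsilon`, the unclipped term is selected and the loss is not bounded from above; this is one plausible reading of why an additional upper limit on the loss could help convergence.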