Abstract
In the Manufacturing Internet of Things (MIoT), tightly interdependent cyber and service networks cause cascading failures to propagate in intertwined horizontal (intra-layer) and vertical (cross-layer) directions, which greatly complicates recovery decisions after a cascading failure. Existing approaches neglect cross-layer dependencies and struggle to handle heterogeneous load patterns and multiple failure states simultaneously. To address these limitations, this paper proposes a coordinated recovery framework for cyber-service-coupled MIoT, termed the coupled reinforcement learning (Coupled-RL) mechanism. Specifically, the Coupled-RL-based recovery method employs two layer-specific recovery agents, one for the cyber network and one for the service network, together with a lightweight coordinator that orchestrates cross-layer decision-making. The coordinator is designed to avoid infeasible and globally suboptimal plans. Its Feasibility Module (FM) shares the set of repaired nodes between layers and filters out actions that violate cross-layer prerequisites. Its Prediction Module (PM) exchanges per-state maximal target Q-values across layers; these values are used to construct a coupled return and to inject cross-layer foresight into the agents' Bellman updates. A weighted coupled return function and an alternating decision procedure further enable decentralized policy coordination. Extensive experiments demonstrate that the proposed recovery method effectively addresses the two-layer coordination problem in MIoT during cascading failure recovery, and a comparative analysis against existing algorithms verifies its superiority.
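
To make the roles of the FM and the PM concrete, the following minimal Python sketch illustrates one plausible reading of the two ideas: masking actions by cross-layer prerequisites, and mixing the other layer's maximal target Q-value into a Bellman target. It is an illustration under stated assumptions, not the implementation used in this paper. The function names `feasible_actions` and `coupled_target`, the prerequisite map `prereq`, the convex weighting with `w`, and the discount `gamma` are all hypothetical choices introduced here.

```python
def feasible_actions(candidates, repaired_other, prereq):
    """FM-style filter (assumed form): keep a candidate node only if every
    cross-layer prerequisite it has is already repaired in the other layer."""
    return [a for a in candidates if prereq.get(a, set()) <= repaired_other]


def coupled_target(reward, q_next_own, q_next_other_max, feasible,
                   gamma=0.9, w=0.7):
    """PM-style coupled Bellman target (assumed form).

    The agent's own maximal next-state target Q-value, taken over feasible
    actions only, is blended with the maximal target Q-value reported by the
    other layer. The convex weight w is an assumption of this sketch.
    """
    own_max = max((q_next_own[a] for a in feasible), default=0.0)
    return reward + gamma * (w * own_max + (1.0 - w) * q_next_other_max)


# Hypothetical example: service nodes 0, 1, 2 depend on cyber nodes 10 and 11.
prereq = {0: {10}, 1: {11}}   # node 2 has no cross-layer prerequisite
repaired_cyber = {10}         # shared by the coordinator
allowed = feasible_actions([0, 1, 2], repaired_cyber, prereq)

q_next_service = {0: 1.0, 1: 5.0, 2: 2.0}  # illustrative target Q-values
target = coupled_target(reward=1.0, q_next_own=q_next_service,
                        q_next_other_max=4.0, feasible=allowed)
print(allowed, target)
```

In this sketch, node 1 has the largest Q-value but is excluded because its cyber prerequisite is not yet repaired, so the target is computed from the remaining feasible actions. Setting `w = 1.0` recovers an uncoupled single-layer target, which is one way to see what the cross-layer term contributes.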