With advances in electrical power systems, control engineering, and information and communication technologies (ICT), considerable effort has gone into improving the existing electrical power grid, resulting in the so-called intelligent grid, or smart grid (SG). The envisioned smart grid relies heavily on ICT to support two-way communication, demand-side management, and other critical smart grid operations. An attack on the SG environment may cause communication link failures. Moreover, an attacker may compromise a switch within the SG communication network, bringing down several communication links at once. Consequently, important information will not be communicated to and from SG entities such as relays and remote terminal units (RTUs), and cascading failures or blackouts may follow. Identifying these link failures and choosing alternative communication paths at run-time is therefore indispensable. To address this issue, we consider a software-defined networking (SDN)-based smart grid setting in which the SDN controller, in coordination with the network switches, learns about link failures inside the network and adapts dynamically to the changing network conditions. We model this problem as a multi-armed bandit (MAB) problem and propose a link failure learning algorithm (LFL-MAB). The proposed algorithm learns the strategy adopted by the attacker and selects reliable communication links. Simulation results show that the proposed algorithm outperforms other approaches and ensures reliable communication and resiliency within the smart grid network.
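
To make the bandit framing concrete, the following minimal sketch treats each candidate communication link as an arm and a successful delivery as a reward of 1. It uses the standard UCB1 rule as a stand-in and is not the LFL-MAB algorithm itself, whose details are not given here. The names `UCB1LinkSelector` and `simulate`, and the per-link availability probabilities `up_probs`, are hypothetical and introduced only for illustration.

```python
import math
import random


class UCB1LinkSelector:
    """Generic UCB1 bandit over candidate links (illustrative, not LFL-MAB)."""

    def __init__(self, n_links):
        self.counts = [0] * n_links        # how often each link was tried
        self.successes = [0.0] * n_links   # cumulative successful deliveries

    def select(self):
        # Try every link once before relying on confidence bounds.
        for link, count in enumerate(self.counts):
            if count == 0:
                return link
        total = sum(self.counts)
        # Score = observed success rate + exploration bonus (UCB1).
        scores = [
            self.successes[i] / self.counts[i]
            + math.sqrt(2.0 * math.log(total) / self.counts[i])
            for i in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, link, delivered):
        # Record the outcome observed on the chosen link.
        self.counts[link] += 1
        self.successes[link] += 1.0 if delivered else 0.0


def simulate(up_probs, rounds, seed=0):
    """Assumed toy environment: link i delivers with probability up_probs[i]."""
    rng = random.Random(seed)
    selector = UCB1LinkSelector(len(up_probs))
    for _ in range(rounds):
        link = selector.select()
        delivered = rng.random() < up_probs[link]
        selector.update(link, delivered)
    return selector.counts


if __name__ == "__main__":
    # Link 0 is attacked (never delivers); link 1 is healthy.
    print(simulate([0.0, 1.0], rounds=200))
```

In this toy setting the selector keeps probing the failed link occasionally, but the exploration bonus shrinks with each failed attempt, so most traffic ends up on the healthy link. A fixed availability probability is a simplification: an attacker who changes strategy over time would call for an adversarial or non-stationary bandit method instead.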