Explainable artificial intelligence based intelligent fault diagnosis: A systematic review from applications to insights
This repository collects explainable intelligent fault diagnosis (XIFD) methods, including papers, code, and datasets.
We will continue to update it and hope it benefits your research.
If you find this paper and repository useful, please cite our paper:
@article{XIFD_LTF,
title={Explainable artificial intelligence based intelligent fault diagnosis: A systematic review from applications to insights},
author={Li, Tianfu and Chen, Junfan and Liu, Tao and Sun, Chuang and Zhao, Zhibin and Chen, Xuefeng and Yan, Ruqiang},
journal={Reliability Engineering \& System Safety},
volume={267},
pages={111935},
year={2026}
}
IFD consists of four main tasks, that is, machine anomaly detection (AD), fault diagnosis (FD), remaining useful life (RUL) prediction, and cross-domain IFD, as shown below.
We list papers and implementation code (unofficial code is marked with *) in chronological order.
Post-hoc XIFD methods aim to explain how a trained model produces its prediction for a given input by developing additional explainers or techniques. They can be further categorized into local explainability and global explainability, depending on the object and purpose of the explanation.
| Class | Methods | Suitable for ML | Suitable for DL |
|---|---|---|---|
| Global | Knowledge distillation [IJCV 2021] | no | yes |
| Global | Activation maximization: AM [ADLT 2024] | no | yes |
| Local | Local approximation method: LIME [ISMIR 2017], SP-LIME [KDD 2016], S-LIME [KDD 2021], ALIME [IDEAL 2019], ILIME [ADBIS 2019] | yes | yes |
| Local | Gradient based method: Guided-BP [SMARTTECH 2022], Smooth gradients [arXiv 2017], Integrated gradients [PMLR 2017] | no | yes |
| Local | Class activation mapping: CAM [CVPR 2016], Grad-CAM [ICCV 2017], Grad-CAM++ [WACV 2018], LRP [PLoS One 2015] | no | yes |
| Local | SHAP based method: SHAP [NIPS 2017] | yes | yes |
Global explainability aims to help people understand the overall logic behind the model and its inner working mechanism.
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| KD | Ji [ASC 2022], Zhong [IEEE Sens. J. 2023], Sun [TIM 2023], Li [KBS 2022] | Can explain the decision-making process of a complex model through a simple model, but ignores the knowledge representation within the complex model. |
| AM | Yang [MST 2022], Jia [MSSP 2018] | Can visualize the input preferences of each neuron, but it does not directly explain why these features lead to the activation of neurons. |
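
As a rough illustration of the knowledge distillation (KD) idea in the table above, the sketch below computes a temperature-softened distillation loss between a complex "teacher" and a simple "student". It is a minimal sketch in plain Python, not the implementation of any listed paper; the logits, the temperature value, and the function names are hypothetical, and the `T^2` scaling follows a common convention rather than anything stated in this repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl


# Hypothetical logits for three health states (e.g. normal, inner race, outer race).
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]   # simple model that mimics the teacher
student_far = [0.5, 1.0, 4.0]     # simple model that disagrees with the teacher

loss_close = distillation_loss(teacher, student_close)
loss_far = distillation_loss(teacher, student_far)
print(loss_close, loss_far)
```

A student whose softened outputs track the teacher gets a lower loss, which is what lets the simple, more interpretable model stand in for the complex one.
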
Local explainability aims to deeply analyze the decision-making process of the model for a specific input sample and its neighborhood.
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| LIME | Yao [ASME 2021], Al-Zeyadi [IJCNN 2020], Sanakkayala [86], Akin [Micromachines 2022], Khan [UT 2024], Gawde [DAJ 2024], [Access 2024], Li [MST 2024], Lu [MST 2022], Mai [DCASE 2022] | Can explain tables, images, and text data, but can only provide explanations for predictions of a single sample, and the explanations are unstable. |
| SP-LIME | - | Can explain multiple samples, provided the selected samples cover the important features, but the algorithm accuracy is low. |
| S-LIME | - | Can produce stable explanations, but is not suitable for time-series data. |
| ILIME | - | Achieves higher explanation accuracy by selecting the samples most influential for the prediction, but is not applicable to text and image data. |
| GraphLIME | Li [AEI 2024] | Can explain the importance of different node features for node classification tasks, but ignores the impact of edges on model performance and cannot be used to explain graph classification models. |
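
The local approximation idea behind the LIME family can be sketched as follows: perturb the sample, weight the perturbations by their proximity to it, and fit a weighted linear surrogate whose coefficients serve as the explanation. This is a simplified sketch in plain Python under stated assumptions (Gaussian perturbations, an exponential kernel, ordinary weighted least squares, and hypothetical names such as `lime_explain`); the actual LIME algorithm additionally uses interpretable binary representations and sparse regression.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_explain(predict, x0, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around x0.

    Returns [intercept, w_1, ..., w_d]; the w_i are the local feature effects.
    """
    rng = random.Random(seed)
    d = len(x0)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x0]          # perturbed sample
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x0))
        weight = math.exp(-dist2 / kernel_width ** 2)           # proximity kernel
        row = [1.0] + z
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)


# Sanity check on a linear black box: the surrogate should recover its coefficients.
linear_fit = lime_explain(lambda z: 1.0 + 3.0 * z[0] - 2.0 * z[1], [1.0, 2.0])

# Hypothetical nonlinear black box: near x0 = [2, 0] the first feature dominates.
local_fit = lime_explain(lambda z: z[0] ** 2 + 0.1 * z[1], [2.0, 0.0])
print(linear_fit, local_fit)
```

Because the perturbations are random, a different `seed` gives slightly different coefficients for the nonlinear model, which illustrates the instability noted for LIME in the table.
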
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| Guided-BP | - | The target features are relatively concentrated. |
| Integrated gradients | Li [TNNLS 2021], Du [Sensors 2022] | Can explain the features within a CNN with less noise. |
| Smooth gradients | Peng [ISA Transactions 2022] | Can locate the image features that drive a decision, but cannot quantify their contribution. |
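
To make the gradient-based family more concrete, the sketch below approximates integrated gradients for a small differentiable function. It is only an illustration under assumptions: the `model` function is a hypothetical stand-in for a trained network, gradients are obtained by central differences instead of automatic differentiation, and the path integral is approximated with a midpoint rule.

```python
import math


def model(x):
    """Hypothetical differentiable fault score: sigmoid of a weighted sum with one interaction."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]
    return 1.0 / (1.0 + math.exp(-z))


def gradient(f, x, h=1e-5):
    """Central-difference gradient (a stand-in for autograd in a real framework)."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grads.append((f(up) - f(down)) / (2.0 * h))
    return grads


def integrated_gradients(f, x, baseline, steps=200):
    """Midpoint approximation of IG_i = (x_i - b_i) * integral of df/dx_i along the straight path."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, gradient(f, point))]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(model, x, baseline)
print(attributions, sum(attributions), model(x) - model(baseline))
```

The attributions sum (up to numerical error) to the difference between the model output at the input and at the baseline, which is the completeness property of integrated gradients.
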
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| CAM | Sun [Access 2020] | Effectively reduces parameters and prevents overfitting, but the original model structure needs to be modified. |
| Grad-CAM | Zhang [Sensors 2024], Chen [Access 2020], Lu [arXiv 2023], Ren [TIM 2023], Menno [Annual Conference of the PHM Society 2021], Mathew [Research Square 2024], Yu [Measurement 2022], Guo [TIM 2023], Senjoba [Applied Sciences 2024], Guo [CMC 2023] | Can be applied to different convolutional neural networks for explanation, but the gradient is unstable. |
| Grad-CAM++ | Chen [IEEE Sens. J. 2023] | Suitable for multi-target object detection explanations, but a lot of background information will be marked. |
| Score-CAM | Chen [Building and Environment 2023] | A gradient-free method with good visualization effect. |
| Smoothed Score-CAM | Yang [Neurocomputing 2023] | Introduces an enhanced visual explanation algorithm to smooth the traditional Score-CAM. |
| FreGrad-CAM | Kim [TII 2020] | Designed to visualize the learned frequency features. |
| MultiGrad-CAM | Li [JMS 2023] | Designed to address the issue of traditional Grad-CAM feature resolution decreasing with increasing network layers. |
| GCN-CAM | Chen [SAFEPROCESS 2021] | Designed to visualize the learned features of GNNs. |
| SGG-CAM | Sun [Measurement 2022] | Designed to solve the problem of the insufficiently concentrated and accurate activation response of traditional CAM to fault areas. |
| Grad-Absolute-CAM | Li [Building and Environment 2021] | Designed to address the issue of traditional Grad-CAM being unable to focus on activating negative feature maps. |
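
The core Grad-CAM computation shared by many of the variants above can be sketched for a 1D signal: average the gradients of the class score over each feature map to get channel weights, take the weighted sum of the feature maps, and apply ReLU. The feature maps and gradients below are hypothetical numbers chosen for illustration; in practice they come from the last convolutional layer of a trained network and from backpropagation.

```python
def grad_cam_1d(feature_maps, gradients):
    """Grad-CAM for 1D feature maps.

    feature_maps[k][i]: activation of channel k at position i.
    gradients[k][i]: gradient of the class score w.r.t. that activation.
    Returns a heat map over positions, normalised to [0, 1].
    """
    # Channel weights: global average pooling of the gradients.
    weights = [sum(g) / len(g) for g in gradients]
    length = len(feature_maps[0])
    cam = [
        max(0.0, sum(w * fmap[i] for w, fmap in zip(weights, feature_maps)))  # ReLU
        for i in range(length)
    ]
    peak = max(cam)
    return [c / peak for c in cam] if peak > 0 else cam


# Hypothetical activations and gradients for two channels over four time positions.
feature_maps = [
    [0.1, 0.9, 0.2, 0.0],
    [0.5, 0.1, 0.4, 0.6],
]
gradients = [
    [0.2, 0.4, 0.2, 0.2],
    [-0.1, -0.3, -0.1, -0.1],
]

heat_map = grad_cam_1d(feature_maps, gradients)
print(heat_map)
```

Here the heat map peaks at the second position, where the positively weighted channel is most active; positions dominated by the negatively weighted channel are zeroed by the ReLU.
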
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| SHAP | Groote [MSSP 2022], Ahmad [Preprints 2021], Li [TAES 2023], Bindingsbø [Frontiers in Energy Research 2023], Moosavi [Electronics 2024], Brusa [Applied Sciences 2023], Pham [ATiGB 2022], Hasan [Sensors 2021], Yan [EAAI 2024], Jang [TII 2023], Santos [MLKE 2024] | Can explain the effect of features on the model’s predictions, but it cannot provide an explanation of the causal relationship between the features and the results. |
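
SHAP builds on Shapley values, which can be computed exactly for a handful of features by enumerating feature subsets. The sketch below does this in plain Python; it is not the SHAP library (which relies on approximations such as KernelSHAP), and both the toy `predict` function and the choice to replace "absent" features with a baseline value are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def predict(z):
    """Hypothetical fault score with one additive feature and one interaction."""
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict, x, baseline)
print(phi)
```

The additive feature receives its full effect, the two interacting features split their joint effect equally, and the values sum to `predict(x) - predict(baseline)`. The cost grows exponentially with the number of features, which is why practical tools approximate this computation.
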
| Method | Literature | Usage and Disadvantages |
|---|---|---|
| LRP | Grezmak [Procedia CIRP 2019], [IEEE Sens. J. 2019], Kim [ESA 2024], Wang [RESS 2023], Nie [JIM 2021], Herwig [TI 2023], Han [JEET 2022], Xiong [Building Simulation 2024], Qu [SSRN 2023], Parziale [SSRN 2023] | Can provide explanations for model decisions, but has high computational costs for complex deep learning models. |
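
Layer-wise relevance propagation (LRP) redistributes the output score backwards through the network, layer by layer. The sketch below applies the epsilon rule to a tiny bias-free ReLU network in plain Python. The weights and input are hypothetical, and the bias-free architecture is an assumption chosen so that relevance is conserved (up to the small epsilon) from output to input.

```python
def forward(x, w1, w2):
    """Two-layer network: hidden = ReLU(w1 x), output = w2 . hidden (no biases)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def lrp_dense(activations, weights, relevance_out, eps=1e-9):
    """LRP epsilon rule for one dense layer; weights[k][j] connects input j to output k."""
    relevance_in = [0.0] * len(activations)
    for k, row in enumerate(weights):
        z = sum(w * a for w, a in zip(row, activations))
        denom = z + (eps if z >= 0 else -eps)   # stabiliser keeps the sign of z
        for j, (w, a) in enumerate(zip(row, activations)):
            relevance_in[j] += a * w / denom * relevance_out[k]
    return relevance_in


# Hypothetical weights and input features.
w1 = [
    [0.5, -0.2, 0.1],
    [-0.3, 0.8, 0.4],
    [0.2, 0.1, -1.0],
]
w2 = [1.0, 2.0, -1.5]
x = [1.0, 2.0, 0.5]

hidden, output = forward(x, w1, w2)
relevance_hidden = lrp_dense(hidden, [w2], [output])      # output layer -> hidden layer
relevance_input = lrp_dense(x, w1, relevance_hidden)      # hidden layer -> input
print(output, relevance_input)
```

The input relevances sum to the output score, and the second feature receives the largest share. Each layer needs its own backward pass, which hints at the computational cost mentioned in the table for deep models.
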
Attention mechanisms help reveal what a model focuses on during learning. They allow the model to automatically and selectively focus on important information and ignore unimportant information when dealing with large amounts of input data. This makes the decision-making process more transparent and easier to visualize, which in turn enhances model explainability.
Currently, many types of attention mechanisms have been developed, such as spatial attention, channel attention, and mixed attention.
Self-attention-based XIFD methods can effectively capture long-range dependencies in sequences and display the data of interest in the feature extraction process.
| Method | Sub-method | Literature | Usage and Disadvantages |
|---|---|---|---|
| Non-self-attention | Traditional attention | Yang [ASC 2020], Chen [TIM 2021] | Does not need to consider all relationships within the sequence and adapts well to a variety of tasks, but has difficulty capturing relationships between the data points of the sequence. |
| Non-self-attention | External attention | Zhang [BE 2022] | - |
| Non-self-attention | Channel attention | Ren [TIM 2023], Wang [Access 2023], Chen [IEEE Sens. J. 2022], Chan [GLOBECOM 2021] | - |
| Non-self-attention | Path attention | Zheng [RESS 2024] | - |
| Non-self-attention | CBAM attention | Li [TNNLS 2023], Zhang [Sensors 2024] | - |
| Non-self-attention | Mixed attention | Liu [Applied Sciences 2022], Wang [TII 2019], Su [JMPSCE 2024, IET Renewable Power Generation 2022], Khaniki [arXiv 2024], Xu [RESS 2022], Peng [JDMD 2023], Zhao [ESA 2024] | - |
| Self-attention | Self-attention | Li [TNNLS 2022], Che [Shock and Vibration 2023], Wang [TR 2023], Han [KBS 2024], Jiao [Measurement 2023] | Has a large receptive field, fewer parameters, and lower complexity, but requires a large amount of data and makes it difficult to explain the specific role of each input element. |
| Self-attention | Multi-head self-attention | Tang [TIM 2022], Liu [AEI 2022], Keshun [Nonlinear Dynamics 2024] | - |
| Self-attention | Scaled dot-product attention | Ning [Electronics 2024] | - |
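
For readers unfamiliar with how attention weights become an explanation, the sketch below implements scaled dot-product self-attention in plain Python and returns the attention weights alongside the outputs. It is a minimal sketch: the token vectors are hypothetical, and the tokens are used directly as queries, keys, and values, whereas real models first apply learned projections.

```python
import math


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Returns (outputs, weights); weights[i][j] is how strongly token i attends to token j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical embeddings of three signal segments; segments 0 and 2 are similar.
tokens = [
    [2.0, 0.0],
    [0.0, 2.0],
    [2.0, 0.0],
]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to one, and the first segment attends equally to itself and to the similar third segment while largely ignoring the second. Plotting such a weight matrix over the input is the usual way attention-based XIFD methods visualize what the model focuses on.
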
Currently, there are two ways to implement PIFD. The first establishes a physical simulation model (PSM) and generates corresponding data to assist in model training, thereby guiding the model to effectively extract fault features. The second embeds physics equations in the loss function of the model to guide model training, as in the recently emerging physics-informed neural networks (PINNs).
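
The second approach, embedding physics in the loss function, can be sketched as a data-fitting term plus a penalty on the residual of a governing equation. The example below is a hedged illustration only: it assumes an undamped oscillator `x'' + omega^2 x = 0` as the physics, approximates the second derivative with finite differences instead of automatic differentiation, and uses a hypothetical weighting factor `lam`. None of these choices come from the papers listed in this repository.

```python
import math


def physics_residual(x, dt, omega):
    """Residual of x'' + omega^2 x = 0 at interior points, via central differences."""
    return [
        (x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt ** 2 + omega ** 2 * x[i]
        for i in range(1, len(x) - 1)
    ]


def physics_informed_loss(pred, target, dt, omega, lam=1e-6):
    """Return (total, data_term, physics_term) with total = data + lam * physics."""
    data_term = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    residual = physics_residual(pred, dt, omega)
    physics_term = sum(r * r for r in residual) / len(residual)
    return data_term + lam * physics_term, data_term, physics_term


dt = 1e-3
omega = 2.0 * math.pi * 5.0                      # assumed 5 Hz natural frequency
times = [i * dt for i in range(200)]
measured = [math.cos(omega * t) for t in times]  # hypothetical measurements

consistent = [math.cos(omega * t) for t in times]          # obeys the assumed physics
inconsistent = [math.cos(2.0 * omega * t) for t in times]  # wrong frequency

loss_ok = physics_informed_loss(consistent, measured, dt, omega)
loss_bad = physics_informed_loss(inconsistent, measured, dt, omega)
print(loss_ok, loss_bad)
```

A prediction that violates the assumed equation is penalized by the physics term even where no labelled data are available, which is how the physics guides training.
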
- A Gradient Alignment Federated Domain Generalization Framework for Rotating Machinery Fault Diagnosis [IOT 2025]
- Federated Domain Generalization for Fault Diagnosis: Cross-Client Style Integration and Dual Alignment Representation [IOT 2025]
- Dual-contrastive Multi-view Graph Attention Network for Industrial Fault Diagnosis under Domain and Label Shift [TIM 2025]
There are eight open-source datasets and two self-collected datasets for research on domain generalization-based fault diagnosis.
| Task | Dataset | Object | Description |
|---|---|---|---|
| Anomaly detection | MIMII [30] | Valve, pump, fan, slide rail | This dataset is a sound dataset that simulates the sound of components such as valves, pumps, fans, and slides under normal and abnormal conditions. The data was recorded by an 8-channel microphone array and simulated the impact of noise in a real factory. |
| Anomaly detection | MIMII DG [31] | Fan, gearbox, bearing, slide rail, valve | The dataset contains sound recordings from five types of industrial machines (fan, gearbox, bearing, slide rail, valve), collected under multiple domain-shift scenarios. Each audio clip is about 10 seconds long, sampled at 16 kHz using a TAMAGO-03 microphone in soundproof or anechoic chambers. |
| Anomaly detection | ToyADMOS [32] | Micromachines | The dataset is the first large-scale dataset for anomalous sound detection in machine operations, featuring around 540 hours of normal sounds and over 12,000 anomalous samples recorded with four microphones at a 48 kHz sampling rate. |
| Anomaly detection | IMAD-DS [33] | Motor & robotic arm | This multi-sensor industrial dataset contains normal and faulty operation data from robotic arms and brushless motors. It includes signals from microphones and accelerometers and introduces domain shifts such as variations in load, speed, and background noise. |
| Anomaly detection | RflyMAD [34] | Drones | This dataset is used for multi-rotor drone fault detection and health management. It contains data on 11 common faults (such as motor failure, propeller failure, etc.) in six flight states, covering both simulated and real flight scenarios. |
| Anomaly detection | PyScrew [35] | Screw | The dataset collects data from six screw tightening scenarios, including more than 34,000 industrial screw tightening operations, covering various health conditions such as thread wear, surface friction, and assembly failures. |
| Diagnosis | CWRU [36] | Bearing | The dataset consists of four sub-datasets collected under operating conditions of 0 hp/1797 rpm, 1 hp/1772 rpm, 2 hp/1750 rpm, and 3 hp/1730 rpm, respectively. The motor bearing faults include ball fault, inner ring fault, and outer ring fault. |
| Diagnosis | MFPT [37] | Bearing | This dataset consists of four sets of bearing vibration data. The first sub-dataset contains three baseline conditions, the second contains three outer race fault conditions, the third contains seven outer race fault conditions under seven different loads, and the fourth contains seven inner race fault conditions under seven different loads. |
| Diagnosis | PU [38] | Bearing | The dataset contains 32 sub-files, including 26 faulty bearings and 6 healthy bearings. The faulty bearings include 12 artificial damages caused by EDM and 14 real damages caused by accelerated life tests. |
| Diagnosis | JNU [39] | Bearing | The dataset contains four bearing health states: normal state, ball fault, inner race fault, and outer race fault. Vibration data were collected under three different working conditions, with the motor speeds set to 600, 800, and 1000 rpm, respectively. |
| Diagnosis | HIT [40] | Bearing | This dataset is for aero-engine inter-shaft bearing failure. The test bench consists of a modified aero-engine, a motor drive system and a lubricating oil system. The experiment collected data of one outer ring failure and two bearing inner ring failures at high- and low-pressure rotors at 28 different speeds. |
| Diagnosis | VATM [41] | Bearing & rotor | This dataset is a multi-sensor dataset that collects vibration, acoustic, temperature and drive current data of bearing inner and outer rings, shaft misalignment, rotor imbalance and other faults under three different torque load conditions. |
| Diagnosis | HUSTBearing [21] | Bearing | This dataset collects 9 different failure modes, including 2 groups of bearing failure data at 4 different speeds. |
| Diagnosis | XJTUSpurgear [23] | Gear | This dataset collects the failure data of spur gears with four different degrees of tooth root cracks under three different working conditions, that is, 900 rpm, 1200 rpm, and 0-1200-0 rpm. |
| Diagnosis | SEU [42] | Gearbox | The gear dataset includes four fault types: broken tooth, missing tooth, tooth root crack, and tooth surface wear; the bearing dataset includes four fault types: ball fault, inner ring fault, outer ring fault, and mixed fault. |
| Diagnosis | XJTUGearbox [23] | Gearbox | This dataset collects fault data for 4 types of gear faults (tooth surface wear, missing teeth, tooth root cracks and broken teeth) and 4 types of bearing faults (inner ring, outer ring, rolling element and mixed faults). |
| Diagnosis | WT-Gearbox [43] | Gearbox | This dataset collects broken teeth, tooth surface wear, tooth root cracks, and missing tooth faults in the gearbox. Eight working conditions are considered for each fault. In addition to the X and Y axis vibration signals, the main shaft encoding signal is also collected to consider the impact of equipment disassembly and assembly on the monitoring signal. |
| Diagnosis | UoC [44] | Gearbox | In this dataset, the pinion gear mounted on the input shaft was tested under nine different health conditions, including healthy, root crack, spalling, missing tooth, and chipping tip with five different severity levels. |
| Prognosis | CMAPSS [45] | Aero-engine | This dataset is open-source aviation engine performance degradation data from NASA and consists of four sub-datasets, which are engine performance degradation data under different operating conditions and failure mode combinations. |
| Prognosis | N-CMAPSS [46] | Aero-engine | This dataset was also generated by simulation using the CMAPSS software developed by NASA and ETH. It contains 8 subsets, simulating the performance degradation data of 128 engines under 7 different failure modes. |
| Prognosis | PHM2010 [47] | Tool | This dataset is the open-source tool wear data of the 2010 PHM competition at different speeds, feed rates, and cutting depths. It consists of 6 sub-datasets, each containing 315 samples. |
| Prognosis | IMS [48] | Bearing | This dataset is a dataset of a bearing run-to-failure experiment. The dataset consists of three subsets, each of which contains the performance degradation data of four bearings. |
| Prognosis | FEMTO-ST [49] | Bearing | This dataset is the open-source bearing performance degradation data of the 2012 PHM competition, which includes bearing data under three different working conditions. |
| Prognosis | XJTU-SY [50] | Bearing | This dataset contains the full-life cycle vibration signals of 15 rolling bearings under three working conditions, and clearly marks the failure location of each bearing. |
| Prognosis | GearLifeCyle [51] | Gear | This dataset is the full-life vibration data of gears generated by gear fatigue tests conducted using the FZG gear contact fatigue test bench, and includes performance degradation data collected under four operating conditions. |
| Title | Journal | Date | Code | Task |
|---|---|---|---|---|
| Conditional Contrastive Domain Generalization For Fault Diagnosis | TIM | 2022 | Github | Diagnosis |
If you have any problems, please feel free to contact me.
Name: Tianfu Li
Email address: tianfu.li@kust.edu.cn

