Failure Analysis of Mechatronic Systems

The failure modes of a mechatronic system include failure modes of mechanical, electrical, computer, and control subsystems, which could be classified as hardware and software failures. The failure analysis of mechatronic systems consists of hardware and software fault detection, identification (diagnosis), isolation, and recovery (immediate or graceful recovery), which requires intelligent control. The hardware fault detection could be facilitated by redundant information on the system and/or by monitoring the performance of the system for a given/prescribed task. Information redundancy requires sensory system fusion and could provide information on the status of the system and its components, on the assigned task of the system, and the successful completion of the task in case of operator error or any unexpected change in the environment or for dynamic environment. The simplest monitoring method identifies two conditions (normal and abnormal) using sensor information/signal: if the sensor signal is less than a threshold value, the condition is normal, otherwise it is abnormal. In most practical applications, this signal is sensitive to changes in the system/process working conditions and noise disturbances, and more effective decision-making methods are required.

Generally, monitoring methods can be divided into two categories: model-based methods and featurebased methods. In model-based methods, monitoring is conducted on the basis of system modeling and model evaluation. Linear, time-invariant systems are well understood and can be described by a number of models such as state space model, input-output transfer function model, autoregressive model, and autoregressive moving average (ARMA) model. When a model is found, monitoring can be performed by detecting the changes of the model parameters (e.g., damping and natural frequency) and/or the changes of expected system response (e.g., prediction error). Model-based monitoring methods are also referred to as failure detection methods.

Model-based systems suffer from two significant limitations. First, many systems/processes are nonlinear, time-variant systems. Second, sensor signals are very often dependent on working conditions. Thus, it is difficult to identify whether a change in sensor signal is due either to the change of working conditions or to the deterioration of the process. Feature-based monitoring methods use suitable features of the sensor signals to identify the operation conditions. The features of the sensor signal (often called the monitoring indices) could be time and/or frequency domain features of the sensor signal such as mean, variance, skewness, kurtosis, crest factor, or power in a specified frequency band. Choosing appropriate monitoring indices is crucial.

Ideally the monitoring indices should be: (i) sensitive to the system/process health conditions, (ii) insensitive to the working conditions, and (iii) cost effective.

Once a monitoring index is obtained, the monitoring function is accomplished by comparing the value obtained during system operation to a previously determined threshold, or baseline, value. In practice, this comparison process can be quite involved. There are a number of feature-based monitoring methods including pattern recognition, fuzzy systems, decision trees, expert systems, and neural networks. Fault detection and identification (FDI) process in dynamic systems could be achieved by analytical methods such as detection filters, generalized likelihood ratio (which uses Kalman filter to sense discrepancies in system response), and multiple mode method (which requires dynamic model of the system and could be an issue due to uncertainty in the dynamic model) (Chow and Willsky, 1984).

As mentioned above, the system failures could be detected and identified by investigating the difference between various functions of the observed sensor information and the expected values of these functions. In case of failure, there will be a difference between the observed and the expected behavior of the system, otherwise they will be in agreement within a defined threshold. The threshold test could be performed on the instantaneous readings of sensors, or on the moving average of the readings to reduce noise. In a sensor voting system, the difference of the outputs of several sensors and each component (sensor or actuator) is included in at least one algebraic relation. When a component fails, the relations including that component will not hold and the relations that exclude that component will hold. For a voting system to be fail-safe and detect the presence of a failure, at least two components are required. For a voting system to be fail-operational and identify the failure, at least three components are required, e.g., three sensors to measure the same quantity (directly or indirectly).

Article Source: