
6. EXPERIMENTS WITH MULTIVARIATE DATA

6.6. Fault detection and progression

data samples during the same period. However, because of their lower descriptor values and earlier appearance in the sequence, they were left to represent normal states. The choice of which classes to label directly affects the degree of under- or over-diagnosis in future predictions.

Figure 38: Distances of data samples from the weight centres of the best matching class among the labelled (fault mode related) classes increase as the fault progresses.
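The distance plotted in Figure 38 can be illustrated with a minimal sketch. It assumes that each class is represented by one weight centre and that the distance is Euclidean, as in the captions of Figures 42 and 43. The function name and the toy arrays are hypothetical and are not taken from the experiment.

```python
import numpy as np

def best_match_distances(samples, weights):
    """Return the index of the nearest weight centre and the distance to it.

    samples: array of shape (n_samples, n_descriptors)
    weights: array of shape (n_classes, n_descriptors), e.g. only the
             weight centres of the labelled classes (assumption)
    """
    # Pairwise Euclidean distances, shape (n_samples, n_classes)
    diffs = samples[:, None, :] - weights[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    best = dists.argmin(axis=1)
    return best, dists[np.arange(len(samples)), best]

# Toy data for illustration only
weights = np.array([[0.0, 0.0], [3.0, 4.0]])
samples = np.array([[0.0, 1.0], [3.0, 2.0], [6.0, 8.0]])
best, dist = best_match_distances(samples, weights)
```

Plotting `dist` against time for samples whose best matching class is a labelled one would give a trend of the kind shown in the figure.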

Figure 39 shows the anomaly index for the whole time period and reveals several anomalies in the data set. Some of these coincide with the date range of Figure 38, indicating that the fault mode is not exactly the same as the one in the other gearbox. The anomalies in early August can be considered random and should not cause alerts. However, the anomalies starting in mid-September last longer and have significant index values.

Figure 39: Anomaly index trend for the whole data set shows random anomalies in August, but more continuous anomalies in mid-September.
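One possible way to separate short random anomalies from sustained ones is to raise an alert only when the anomaly index stays above a threshold for a minimum number of consecutive samples. The sketch below is an illustration of that idea, not the method used in the experiment. The threshold, the minimum run length and the example values are hypothetical.

```python
import numpy as np

def sustained_alerts(index, threshold, min_run):
    """Flag samples that belong to a run of at least `min_run`
    consecutive anomaly index values above `threshold`."""
    above = np.asarray(index) > threshold
    alerts = np.zeros(above.shape, dtype=bool)
    run_start = None
    # A trailing False acts as a sentinel that closes the last run
    for i, flag in enumerate(np.append(above, False)):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_run:
                alerts[run_start:i] = True
            run_start = None
    return alerts

# Hypothetical anomaly index values: one isolated spike, one sustained run
anomaly_index = [0.1, 0.9, 0.1, 0.2, 0.8, 0.9, 0.95, 0.1]
alerts = sustained_alerts(anomaly_index, threshold=0.5, min_run=3)
```

With these values the isolated spike is ignored and only the three consecutive high values are flagged.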

Let us study the trajectory plot again in Figure 40, which shows the travel paths just before transitions. We can see that the first anomalies happen in mid-September in classes 38, 41, 43 and 51. When compared with the previous trajectory in Figure 37 with data from the gearbox “GB01”, it can be seen that these classes are linked to the events just before the transition into the fault mode.

Apparently, we are gaining more information on the early warning and the fault progress. In order to use this information, we need to re-train the classifier. Before that, it is necessary to memorize the de-normalized weight factors of all labelled classes (21 to 29, 31, 32, 55 to 62, 65 and 66). The class numbers are used as labels here for illustrative purposes, but in a real application fault mode and severity would be more appropriate. De-normalization is necessary because the mean and standard deviation will change. It is done by multiplying each weight by the standard deviation (scale factor) and adding the mean (offset factor).
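The de-normalization described above can be sketched as follows. The sketch assumes per-descriptor normalization of the form (x - mean) / std. The function names and numeric values are hypothetical.

```python
import numpy as np

def denormalize(weights, mean, std):
    """Multiply by the scale factor (std) and add the offset factor (mean)."""
    return weights * std + mean

def normalize(values, mean, std):
    """Inverse operation, assumed to be the normalization used for training."""
    return (values - mean) / std

# Hypothetical statistics of the original data population (two descriptors)
old_mean = np.array([10.0, 2.0])
old_std = np.array([2.0, 0.5])

# Hypothetical normalized weight vectors of two labelled classes
norm_weights = np.array([[1.0, -2.0], [0.0, 4.0]])

# Weight vectors in original units, independent of later changes in mean and std
raw_weights = denormalize(norm_weights, old_mean, old_std)
```

Storing the weights in original units means they can later be normalized again with whatever mean and standard deviation the retrained classifier uses.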

Figure 40: Trajectory view showing fault progression as transitions on the map. The upper number in the class gives the class index and the lower number the number of data samples used to train the class.

We already have a reasonable amount of training data for the normal modes and therefore append only the anomalies to the data bank, which now consists of 6136 data samples. All data are then normalized using the combined (original and appended) data population. The retrained classifier map is given in Figure 41. We can now see the classifier’s ability to adapt to the new data.
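A minimal sketch of appending the anomalies and normalizing over the combined population is given below. It assumes that the mean and the population standard deviation are computed per descriptor and that no descriptor is constant. The function name and the toy data are hypothetical.

```python
import numpy as np

def renormalize_bank(original, anomalies):
    """Append anomalous samples to the data bank and normalize all
    samples with the statistics of the combined population."""
    bank = np.vstack([original, anomalies])
    mean = bank.mean(axis=0)
    std = bank.std(axis=0)  # population standard deviation (assumption)
    return (bank - mean) / std, mean, std

# Toy data for illustration only: two descriptors per sample
original = np.array([[1.0, 10.0], [3.0, 30.0]])
anomalies = np.array([[5.0, 50.0], [7.0, 70.0]])
bank_norm, new_mean, new_std = renormalize_bank(original, anomalies)
```

The returned `new_mean` and `new_std` differ from the statistics of the original population, which is why weights memorized in original units have to be normalized again before they are compared with the retrained map.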

Figure 41: The classifier map shows how the labelled classes are located on the map after retraining.

The first number in the class is the class index, the second one is the number of training samples and the third one gives the index of the respective class on the previous classifier map. It can be seen that a new set of data was used to train the classes in the top left corner of the map.

The classes on the new map have been labelled by searching for the best matching class for each of the memorized weight vectors and labelling that class accordingly. Because the weight vectors represented the average symptom values of the training samples, very high membership was detected. Some interesting observations can be made:

1. There were no hits in data set “GB02” in the classes 29 to 32 that represented the most severe fault mode in data set “GB01”. See distances of data samples in these classes in Figure 42.

2. There were no hits in data set “GB01” to classes 21 to 28 that represented the most severe fault mode in data set “GB02”. See distances of data samples in these classes in Figure 43.

3. Both data sets have hits in classes 53 to 56 before entering the classes mentioned in the previous points.

The first two points above verify that the fault modes differ, at least in the later stage of fault progression. In fact, the bearing fault in “GB01” was in the outer race of the bearing, while in “GB02” it was in the inner race. Despite the differences between the weight centres of classes 21 and 29, the two fault modes still occupy neighboring classes. In conclusion, it is difficult to predict the correct type of an early bearing fault while the state is still in classes 53 to 56, where both gearboxes remained for more than six weeks.
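The relabelling step described earlier, in which each memorized weight vector is matched to its best matching class on the retrained map, could be sketched as follows. The sketch assumes Euclidean distance and assumes that the memorized vectors have already been brought to the same scale as the weights of the new map. The function name, labels and toy arrays are hypothetical.

```python
import numpy as np

def transfer_labels(memorized, labels, new_weights):
    """Assign each memorized label to the class on the new map whose
    weight vector is closest to the memorized weight vector."""
    assigned = {}
    for vector, label in zip(memorized, labels):
        dists = np.linalg.norm(new_weights - vector, axis=1)
        # If two memorized vectors match the same class, the later one wins
        assigned[int(dists.argmin())] = label
    return assigned

# Toy data for illustration only
new_weights = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
memorized = np.array([[0.9, 1.1], [4.8, 5.3]])
labels = ["old class 21", "old class 29"]
new_labels = transfer_labels(memorized, labels, new_weights)
```

Here `new_labels` maps class indices on the new map to the labels carried over from the previous map. In a real application the labels would more appropriately be fault mode and severity than old class numbers.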

Figure 42: Euclidean distances of data samples from respective weight centres in classes 29 to 32 representing the most severe fault mode of data set “GB01”.

Figure 43: Euclidean distances of data samples from respective weight centres in classes 21 to 28 representing the most severe fault mode of data set “GB02”.

This experiment shows that a classifier can be used to detect anomalies that might be caused by developing faults. We can also differentiate closely resembling fault modes, even when they appear in different machines in the same environment. The next challenge is to generalize the anomaly and fault detection even more.
