Semi-supervised deep rule-based approach for the classification of Wagon Bogie springs condition

This paper focuses on a new model for classifying the condition of wagon bogie springs from images acquired by wayside equipment. We discuss the application of a semi-supervised learning approach based on a deep rule-based (DRB) classifier to achieve high classification accuracy and to check whether a bogie has spring problems. We use a pre-trained VGG19 deep convolutional neural network to extract the attributes from the images that serve as input to the classifiers. The performance is evaluated on a data set composed of images provided by a Brazilian railway company, which covers two spring conditions: normal condition (no elastic reserve problems) and bad condition (with elastic reserve problems). In addition, additive Gaussian noise is applied to the images to challenge the proposed model. Finally, we analyse the performance of the semi-supervised DRB (SSDRB) classifier and its distinctive characteristics compared with other classifiers. The reported results demonstrate a relevant performance of the SSDRB classifier applied to the questions raised.


Introduction
Since the emergence of steam-powered machines, rail transportation has become an effective solution for connecting urban centers, as well as a low-cost alternative for industrial transport. As a result, wagons are subjected to stress cycles under heavy loads, which increases bogie defects caused by spring fatigue.
In this context, image processing and computational intelligence techniques are increasingly part of the solution, since their capacity to detect critical wagon conditions helps to guarantee the safety and high productivity of these transportation systems. Deep rule-based (DRB) classifiers were introduced in Gu (2017a, 2017b). The DRB classifier is a general approach that serves as a strong alternative to current deep neural networks (DNN) Gu (2017a, 2017b). It is non-parametric, non-iterative, highly parallelizable and computationally efficient, and it is reported to achieve very high classification rates, surpassing other methods.
Moreover, the DRB classifier has been further extended with a self-organising, self-evolving semi-supervised learning strategy that exploits the idea of "pseudo label", which fits naturally with its prototype-based nature. Starting with a small amount of labelled training images, the semi-supervised DRB (SSDRB) classifier is able to pseudo-label the remaining images based on the ensemble properties of the training images using non-parametric measures. As semi-supervised learning is leveraged to obtain labeled cluster samples without a full retraining Li et al. (2022), the SSDRB classifier can also self-organize its system structure and self-update recursively with the pseudo-labelled data and, thus, it supports real-time streaming data processing.
The proposed SSDRB classifier also inherits the advantage of the DRB classifier's transparency as its semi-supervised learning process only concerns the visual similarity between the identified prototypes and the unlabelled samples, which is highly human interpretable compared with the state-of-the-art approaches, SVM Cristianini and Shawe-Taylor (2000) and deep learning networks LeCun et al. (2015).
Furthermore, the proposed SSDRB classifier can not only perform classification on out-of-sample images but also support recursive online training on a sample-by-sample or chunk-by-chunk basis. Moreover, unlike other semi-supervised approaches, the proposed approach can learn new classes actively, without the involvement of human experts, and thus self-evolve Angelov (2013).
Therefore, it is important to present SSDRB for the railway sector. This paper discusses the architecture and approach of the semi-supervised deep rule-based (SSDRB) algorithm Gu (2017a, 2017b) to classify the condition of wagon bogie springs, identifying whether or not they are in bad condition.
The main contributions of this work are summarized below:
- The classification of wagon bogie spring conditions through SSDRB has not been addressed before.
- The model discussed in this paper has advantages that are not covered by classical methods, such as a learning process that is easy for a specialist to interpret, online or offline training, the capability to classify out-of-sample images, and the capability to deal with uncertainty.
- We use a pre-trained vgg-verydeep-19 deep convolutional neural network (VGG19) Mateen et al. (2018) to extract the attributes of the images. In this way, the proposed model can learn abstract features and obtain higher precision.
- We present the performance analysis in terms of classification accuracy using a data set with images acquired from railway wayside equipment.
Our major conclusions are:
- The use of VGG19 as a feature extractor is effective for this application.
- The SSDRB classifier achieved a mean accuracy greater than 96%.
- The obtained results show that the SSDRB classifier is an excellent alternative for the classification of wagon bogie spring conditions. It can assist the railway inspection process by reducing inspection times, ensuring greater reliability and availability of the railway.
The rest of the paper is organized as follows: Section 2 deals with the formulation of the problem. Section 3 aims to discuss the concept of SSDRB. Section 4 discusses the results of computer simulations. Section 5 presents the main conclusions.

Problem formulation
The bogie suspension springs are a fundamental part of the wagon's damping set, whose function is to dissipate the energy caused by unwanted vertical movements that occur in railway dynamics.
As wagon bogie springs are the main part of the suspension and damping, classifying their condition is critical to assist railway companies in verifying railway conditions and safety, along with granting higher service reliability.
In 2017, 11 wagons of a 33-wagon freight train traveling north of Ely derailed, as shown in Fig. 1. The derailment was caused by ineffective damping on the wagon bogies Commuters face train delays for days (2017). The line was blocked, affecting passenger services from Peterborough and Cambridge to Stansted Airport and London for 7 days. Rail experts say the cost of the derailment could top £1.0 million Cost of freight train derailment could top £1 million (2017).
As shown in Fig. 2, a similar accident had occurred 10 years earlier, when a line was closed for six months as a result of a derailment caused by bogie suspension problems Derailed freight train near ely causes chaos in the east (2017); Commuters face train delays for days (2017). The River Ouse had also been shut to traffic.
According to ANTF (Brazilian National Association of Railway Transporters) O setor ferroviário de carga brasileiro (2019), the volume of goods transported by the railroads increased by 95% in the period from 1997 to 2019.
In 2019, approximately US$ 600 million was invested, allowing for a significant growth in the rolling stock fleet. In 1997, the railroads had 1,154 locomotives, and by 2019 they totaled 3,405 units, an increase of 195%. In the same period, the number of wagons went from 43,816 to 115,434, an increase of 163%.
This trend increases the speed and the loads transported, changing the dynamic wheel-rail contact and thus increasing the probability of bogie spring defects. Defects in the springs occur for different reasons, for example as a result of fatigue due to repeated passages over rail components such as welds, joints and switches, or due to the impacts of defective wagon bogie springs. If spring defects grow and their detection is delayed, they can lead to high maintenance costs. Therefore, rapid and automatic defect detection is essential.
Accordingly, the proposed model can reduce the impact of overhauls on train operation, since it enables a preventive maintenance routine in which interventions are carried out only when anomalies are observed. Furthermore, it is an interpretable model that is easily understood by humans, making it replicable to other types of wagons and scalable along the rail, thus reducing the time spent on inspection.
From this perspective, the proposed model is necessary especially because it makes it possible to:
- Prevent accidents caused by unexpected failures;
- Reduce the number of unproductive hours in maintenance;
- Eliminate the manual process of visual inspection;
- Reduce the number of recurrent preventive interventions;
- Reduce the probability of brinelling on a bearing;
- Increase the productivity of rail operations, given the reduction in the frequency and time of operational maintenance;
- Reduce the TST (Time Stopped Train) index by increasing the mean time between failures (MTBF).
Fig. 3 shows springs with no elastic reserve problems, which can be seen from the space between their turns. In contrast, Fig. 4 shows springs with no space between their turns, which is a critical defect because they no longer have damping capacity.
Taking into account the scenario previously exposed in this Section, this work aims to classify the two main conditions that can occur with the wagon bogie springs: springs with elastic reserve (without defect) and springs without elastic reserve (with bad condition).

The proposal: SSDRB
The DRB architecture is shown in Fig. 5. The input I consists of an image of wagon bogie springs, whose pixel values are normalized to the range [0, 255] Simonyan and Zisserman (2014). Subsequently, the image is rescaled to 227 × 227 pixels to increase generalization and reduce computational complexity Simonyan and Zisserman (2014).
VGG19 is used for feature extraction due to its simpler structure and better performance. The extracted data are processed by the fuzzy rule-based (FRB) layer, which constitutes a massively parallel set of zero-order fuzzy rules of AnYa type Angelov and Yager (2012) and is the basis of the DRB classifier. Finally, the decision-maker classifies the images based on their degree of similarity to the prototypes generated in the training stage.
The use of pre-trained deep convolutional neural networks to extract global feature vectors from images to train generic classifiers is a widely used alternative, since it allows the classifier to learn more abstract and discriminative high-level attributes and so obtain greater precision Simonyan and Zisserman (2014); Xia et al. (2017). In this work, the feature vectors extracted by VGG19 from the wagon bogie spring images have dimension 1 × 4096 Xia et al. (2017).
The previously mentioned FRB layer is the learning mechanism of the DRB classifier. The FRB subsystems are independent of each other and can be changed without influencing the others. Furthermore, each FRB subsystem contains a set of massively parallel fuzzy rules, formulated around generalized prototypes P learned from the corresponding class segments. As they all have the same consequent, they can be combined through the logical connector "OR", as follows:

$$R_c:\ \text{IF } (I \sim P_{c,1}) \text{ OR } (I \sim P_{c,2}) \text{ OR } \ldots \text{ OR } (I \sim P_{c,N_c}) \text{ THEN (class } c) \qquad (1)$$

In Eq. 1, "$\sim$" denotes similarity; $c = 1, 2, \ldots, C$; $N_c$ is the number of prototypes of the $c$th class.

Fig. 5: Architecture of the DRB classifier for wagon bogie springs condition.

During the training process, the prototypes are identified from the data density in the feature space, and from these prototypes the corresponding fuzzy rules are generated. Due to the large dimension of the feature vectors $X_{c,1}$ ($X_{c,1} = [X_{1,1}, X_{1,2}, \ldots, X_{1,M}]$), the cosine dissimilarity Gu et al. (2017), given in Eq. 2, is used as the distance measure, since it is free from the problems associated with high dimensionality Senoussaoui (2013); Aggarwal and Charu (2001); Beyer (1999). The cosine of the angle between two feature vectors is

$$\cos(\theta_{x,y}) = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}$$

where $\theta_{x,y}$ is the angle between feature vectors $x$ and $y$, and the norm of $x$ is

$$\|x\| = \sqrt{\langle x, x \rangle} = \sqrt{\sum_{j=1}^{M} x_j^2}$$

Thus, the cosine dissimilarity used as the distance measure is given as

$$d(x, y) = \sqrt{2 - 2\cos(\theta_{x,y})} = \left\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \right\| \qquad (2)$$

For this paper, $M = 4096$ was adopted. Eq. 2 is important because it improves computational efficiency by allowing recursive calculation. In the DRB training process, the DRB classifier identifies prototypes of the segments of the observed images of each class in an autonomous and non-parametric way and forms clouds of data around the prototypes of similar segments of the same class.

Algorithm 1: DRB training process. Notation: $N_c$ is the number of prototypes; $p_{c,1}$ is the first prototype; $S_{c,1}$ is the corresponding support; $r_{c,1}$ is the radius, with $r_0 = \sqrt{2 - 2\cos(\pi/6)}$. Main steps: calculate the density of $x_k$; generate/update the AnYa type fuzzy rule.
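As a small illustration of the cosine dissimilarity discussed above, the following Python sketch assumes the form d(x, y) = sqrt(2 - 2 cos θ), which equals the Euclidean distance between the two vectors after each is scaled to unit norm. This form is an assumption chosen to be consistent with the radius r 0 mentioned in the text, and the function names are hypothetical.

```python
import math

def norm(x):
    """Euclidean norm of a feature vector."""
    return math.sqrt(sum(v * v for v in x))

def cos_angle(x, y):
    """Cosine of the angle between feature vectors x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (norm(x) * norm(y))

def cosine_dissimilarity(x, y):
    """Assumed form sqrt(2 - 2*cos(theta)); max() guards against rounding."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cos_angle(x, y)))

def unit_distance(x, y):
    """Euclidean distance between x/||x|| and y/||y||."""
    nx, ny = norm(x), norm(y)
    return math.sqrt(sum((a / nx - b / ny) ** 2 for a, b in zip(x, y)))

# Initial radius quoted in the text: the dissimilarity at an angle of pi/6
R0 = math.sqrt(2.0 - 2.0 * math.cos(math.pi / 6))

# Toy 3-dimensional vectors (the paper uses M = 4096)
x = [1.0, 2.0, 2.0]
y = [2.0, 0.0, 1.0]
d_xy = cosine_dissimilarity(x, y)
```

Because the measure depends only on the direction of the vectors, rescaling a feature vector leaves its dissimilarity to any other vector unchanged.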
In this way, C massively parallel zero-order fuzzy rules of AnYa type are formed (learned) in total through independent training processes, based on the identified prototypes.
The detailed training process for the FRB subsystems is described in Angelov and Gu (2017a), and the main procedure of the training process is summarized in pseudocode form in Algorithm 1.
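To make the idea of density-driven prototype identification more concrete, the sketch below shows one possible, heavily simplified training loop for a single class. It is not Algorithm 1: the Cauchy-type density, the rule for creating a new prototype, the fixed radius, and all names (`ClassPrototypes`, `learn`, `density`) are assumptions made for illustration, loosely following the general DRB literature.

```python
import math

R0 = math.sqrt(2 - 2 * math.cos(math.pi / 6))  # initial radius r_0 quoted in the text

def unit(x):
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class ClassPrototypes:
    """Prototypes behind one class rule (simplified, hypothetical structure)."""

    def __init__(self):
        self.count = 0        # samples seen so far
        self.mean = None      # running mean of the unit-norm samples
        self.prototypes = []  # each: {"centre": [...], "support": int}

    def density(self, x):
        # Assumed Cauchy-type density around the running mean; for unit-norm
        # data the mean squared norm is 1, so the spread is 1 - ||mean||^2.
        spread = 1.0 - sum(m * m for m in self.mean)
        if spread <= 0.0:  # all samples so far are identical
            return 1.0
        return 1.0 / (1.0 + sq_dist(x, self.mean) / spread)

    def learn(self, x):
        x = unit(x)
        self.count += 1
        if self.count == 1:
            self.mean = list(x)
            self.prototypes.append({"centre": x, "support": 1})
            return
        # Recursive update of the mean, then densities of sample and prototypes
        self.mean = [m + (v - m) / self.count for m, v in zip(self.mean, x)]
        d_x = self.density(x)
        d_p = [self.density(p["centre"]) for p in self.prototypes]
        nearest = min(self.prototypes, key=lambda p: sq_dist(x, p["centre"]))
        far = math.sqrt(sq_dist(x, nearest["centre"])) > R0  # radius kept fixed here
        if d_x > max(d_p) or d_x < min(d_p) or far:
            # New prototype: density is extreme, or the sample is outside the radius
            self.prototypes.append({"centre": x, "support": 1})
        else:
            # Otherwise absorb the sample into the nearest prototype
            s = nearest["support"]
            nearest["centre"] = [(s * c + v) / (s + 1) for c, v in zip(nearest["centre"], x)]
            nearest["support"] = s + 1

rule = ClassPrototypes()
for sample in ([1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.02, 1.0]):
    rule.learn(sample)
```

The loop is single-pass and needs no stored training data beyond the prototypes and the running mean, which is what makes sample-by-sample updating possible in this kind of classifier.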
Once the training process is completed, new images can be classified using the identified FRB subsystems. During the validation process, each test image receives a score of confidence from each of the fuzzy rules identified in the training stage. As a result, for each testing image, a 1 × C vector of scores of confidence with respect to the nearest prototypes (one per spring condition class) is generated, and the label of the testing image is decided by the "winner-takes-all" principle, that is, the class with the highest score of confidence is assigned.
In the SSDRB classifier, after the training step performed by the DRB with the labeled images, the model is able to learn from the unlabeled images. For a set {U} of U unlabeled images, a vector of scores of confidence is extracted from each image $U_i$ ($i = 1, 2, \ldots, U$) using Eq. 6.
The images that satisfy the condition in Eq. 9 are used to update the meta-parameters. This condition compares the highest score of confidence of $U_i$ with its second highest score, scaled by a free parameter greater than 1.
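The decision and pseudo-labelling steps described above can be sketched as follows. The kernel exp(-d²) used for the score of confidence, the name `phi` for the free parameter, and the toy prototypes are assumptions made for illustration; only the winner-takes-all rule and the comparison between the highest and second highest scores come from the text.

```python
import math

def unit(x):
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x]

def confidence_scores(x, prototypes_by_class):
    """One score per class, based on that class's nearest prototype.
    The kernel exp(-d^2) is an assumption, not taken from the text."""
    x = unit(x)
    scores = []
    for prototypes in prototypes_by_class:
        d2 = min(sum((a - b) ** 2 for a, b in zip(x, unit(p))) for p in prototypes)
        scores.append(math.exp(-d2))
    return scores

def winner_takes_all(scores):
    """Index of the class with the highest score of confidence."""
    return max(range(len(scores)), key=scores.__getitem__)

def is_confident(scores, phi=1.2):
    """Pseudo-labelling condition: highest score > phi * second highest (phi > 1)."""
    top, second = sorted(scores, reverse=True)[:2]
    return top > phi * second

# Toy prototypes: class 0 (normal springs) and class 1 (bad springs), both hypothetical
prototypes_by_class = [[[1.0, 0.0]], [[0.0, 1.0]]]
clear_scores = confidence_scores([0.9, 0.1], prototypes_by_class)
ambiguous_scores = confidence_scores([1.0, 1.0], prototypes_by_class)
```

In this toy case the first image is close to the class 0 prototype and would be pseudo-labelled, while the second lies midway between the two prototypes and would be left unlabelled.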

Experimental results
Based on the theory addressed in the previous section, tests were carried out to evaluate the classification model proposed in this work.
The database used in this application consists of wagon bogie images from the Brazilian railway company MRS Logística S.A. [29]. It is composed of images captured by wayside equipment fixed to the MRS railroad, which photographs the wagon bogies each time a train passes through the site.
The images are taken from both sides of the wagon as it passes the equipment; examples of the images in the database are shown in Figs. 3 and 4. It is important to mention that the wagons are empty when they pass through the site, so the expected standard condition is that the springs are not compressed and have space between their turns.
The type of bogie used on freight wagons may vary according to the type of wagon, and the spring type and its geometric distribution are particular to each bogie type. Therefore, in the present work we used bogie images that correspond to the most representative portion of the main transport flow of MRS Logística S.A.
The experiment was conducted by collecting 250 images evenly distributed over different periods of the day, of which 125 show wagon bogies with springs without elastic reserve problems and 125 show bogies with elastic reserve problems.
In addition, we split the data set equally and randomly, with 70% of the database used for training and 30% for testing. All data were presented to the classifier 200 times, and the tests with the SSDRB classifier were performed in offline mode with the free parameter (> 1) of the SSDRB classifier set to 1.2. Furthermore, all simulations were performed in the MATLAB 2017 environment, running on an Intel Core i7-3537U CPU at 2.00 GHz with 12 GB DDR4 2700 MHz memory and the Windows 10 64-bit operating system.
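The random 70/30 split repeated 200 times can be sketched as below. This is an illustrative Python sketch, not the MATLAB code used in the experiments; the per-class (stratified) split, the rounding down of the training share, and the use of the run index as the random seed are assumptions.

```python
import random

def stratified_split(labels, train_percent=70, seed=0):
    """Return (train_indices, test_indices), split separately within each class."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        n_train = len(idx) * train_percent // 100  # integer arithmetic, rounds down
        train += idx[:n_train]
        test += idx[n_train:]
    return train, test

# 125 images per class, as in the data set described above
labels = [0] * 125 + [1] * 125
runs = [stratified_split(labels, seed=run) for run in range(200)]
train, test = runs[0]
```

With 125 images per class, 70% is 87.5 images, so some rounding convention is unavoidable; rounding down gives 87 training and 38 test images per class.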
To represent possible defects in the acquired images and to challenge the proposed models, we added additive white Gaussian noise (AWGN) at three different intensities to the whole data set: PSNR = 20 dB, PSNR = 6 dB and PSNR = 3 dB, as shown in Figs. 6 and 7. The peak signal-to-noise ratio (PSNR) Pullano et al. (2012) is the most used parameter to measure the quality of a corrupted image compared with the original one Huang et al. (2017). The three PSNR values adopted are enough to represent problems that can occur due to factors such as dirt on the lens of the equipment responsible for acquiring the images Prasad and Kishore (2017).

The PSNR is the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation. It is computationally lightweight and is usually expressed on the logarithmic decibel scale. There is an inverse relationship between PSNR and the mean squared error (MSE), so a higher PSNR indicates better image quality. Considering a noise-free m × n monochrome image I and its noisy approximation K, the PSNR of the corrupted image is calculated as follows:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right) \qquad (10)$$

where

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ I(i,j) - K(i,j) \right]^2$$

In Eq. 10, the elements of the matrix are represented using linear pulse-code modulation (PCM) Tomar and Jain (2015) with B bits per sample, where the maximum value is $2^B - 1$. Therefore, 255 in the term $255^2$ is the maximum possible pixel value of the image, since the pixels in this work are represented using 8 bits per element. In practical applications, the measurement noise can be heavy-tailed non-Gaussian noise Yan et al.

The confusion matrix is a classic method used in machine learning to summarize the prediction results of supervised classification or to determine the behavior of classification models, where the rows are the real data and the columns are the predicted classes James et al. (2013).
The confusion matrix is therefore used to present the results of the SSDRB computer simulations. Table 1 shows the performance for the original data set. The results for images with additive white Gaussian noise (AWGN) at PSNR of 20 dB, 6 dB and 3 dB are shown in Tables 2, 3 and 4, for images with Cauchy noise in Table 5, and for images with Laplace noise in Table 6.
The numerical results presented in Tables 7, 8, 9 and 10 show the overall performance for the original data set, images with Gaussian white noise, images with Cauchy noise, and images with Laplace noise, respectively. Four metrics were used to measure performance during the test phases: accuracy, mean squared error (MSE), Cohen's kappa coefficient Stehman (1996) and F-score Sokolova et al. (2006).
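The noise levels above are specified through the PSNR of Eq. 10. As a rough illustration, the sketch below computes the PSNR of an 8-bit image and derives the Gaussian noise standard deviation whose expected MSE yields a target PSNR. It is a plain-Python sketch with hypothetical function names; it does not clip or round the noisy pixels to [0, 255], which would change the PSNR actually obtained, and it is not the procedure used by the authors.

```python
import math
import random

PEAK = 255.0  # maximum pixel value for 8 bits per element

def mse(img, noisy):
    """Mean squared error between two m x n images given as lists of rows."""
    m, n = len(img), len(img[0])
    total = sum((img[i][j] - noisy[i][j]) ** 2 for i in range(m) for j in range(n))
    return total / (m * n)

def psnr(img, noisy):
    """PSNR in dB; undefined (division by zero) when the images are identical."""
    return 10.0 * math.log10(PEAK ** 2 / mse(img, noisy))

def sigma_for_psnr(target_db):
    """Noise standard deviation whose expected MSE gives the target PSNR."""
    return PEAK / 10.0 ** (target_db / 20.0)

def add_awgn(img, target_db, seed=0):
    """Add zero-mean Gaussian noise without clipping the result."""
    rng = random.Random(seed)
    sigma = sigma_for_psnr(target_db)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

# Synthetic flat grey image standing in for a bogie photograph
img = [[128.0] * 64 for _ in range(64)]
noisy = add_awgn(img, 20.0)
measured = psnr(img, noisy)  # close to, but not exactly, 20 dB
```

For the three levels used in the experiments, this relation gives noise standard deviations of 25.5 at 20 dB, about 128 at 6 dB and about 180 at 3 dB on the 0 to 255 scale, which shows how severe the two lower PSNR settings are.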
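For a two-class problem, three of these metrics can be read directly from the confusion matrix, as sketched below. The counts in the example are made up for illustration and are not results from this paper; treating class 0 as the positive class for the F-score is also an assumption. The MSE is omitted because the text does not specify how it is computed.

```python
def binary_metrics(cm):
    """cm[r][c]: rows are the real classes, columns are the predicted classes.
    Class 0 is treated as the positive class for the F-score (an assumption)."""
    tp, fn = cm[0]
    fp, tn = cm[1]
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals
    p_e = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e)
    f_score = 2 * tp / (2 * tp + fp + fn)
    return accuracy, kappa, f_score

# Hypothetical counts, not taken from the paper's tables
example_cm = [[50, 5], [10, 35]]
accuracy, kappa, f_score = binary_metrics(example_cm)
```

Cohen's kappa discounts the agreement that would be expected by chance, so it is lower than the raw accuracy whenever the classifier makes any errors.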

Statistical analysis
Statistical analysis was performed as in Amaral et al. (2019). The two-sample t-test was applied to the test accuracy metric in order to evaluate the results in Tables 7, 8, 9 and 10. Considering two sets of samples $G_1$ and $G_2$, the two-sample t-test allows us to make inferences from two independent data samples and to verify their statistical validity. This statistical test is expressed as:

$$t = \frac{\mu_{G_1} - \mu_{G_2}}{\sqrt{\dfrac{\sigma^2_{G_1}}{L_{G_1}} + \dfrac{\sigma^2_{G_2}}{L_{G_2}}}} \qquad (12)$$

where $\mu_{G_1}$, $\mu_{G_2}$, $\sigma^2_{G_1}$ and $\sigma^2_{G_2}$ are the means and variances (squared standard deviations) of the samples belonging to $G_1$ and $G_2$, respectively. Also, $L_{G_1} = \#\{G_1\}$ and $L_{G_2} = \#\{G_2\}$, in which # denotes the cardinality operator. The number of degrees of freedom is defined as $L_{G_1} + L_{G_2} - 2$. In addition to the determination of $t$, it is important to state the hypotheses, which are given by:

$$H_0: \mu_{G_1} = \mu_{G_2}, \qquad H_1: \mu_{G_1} \neq \mu_{G_2}$$

Given a significance level $\alpha$, usually around 0.05, the p-value is calculated from $t$ and represents the lowest value of $\alpha$ at which the null hypothesis ($H_0$) is rejected. Thus, a p-value below $\alpha$ means that the null hypothesis is rejected Moore (2009).
The sets of samples $G_1$ and $G_2$ in Eq. (12) are the test accuracy values obtained over the 200 runs, where $G_1$ refers to SSDRB and $G_2$ denotes each of the other classifier methods listed in Table 11. The number of degrees of freedom in these statistical tests is 64, which is relatively high, so there is no need to verify the normality of the error distributions Moore (2009). Table 11 presents the results of the two-sample t-test performed for the test accuracy metric in Tables 7, 8, 9 and 10. Rejections of the null hypothesis are indicated by the letters 'W' and 'L', representing respectively the wins and the losses of the method tested, while acceptance of the null hypothesis is indicated by 'E', which means that the tested methods are statistically equal.
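The win/loss/equality decision described above can be sketched in Python as follows. The form of the statistic (difference of means divided by the square root of the summed variance-over-size terms) is an assumption; for equal sample sizes it coincides with the pooled-variance statistic, whose number of degrees of freedom is the sum of the sample sizes minus 2. The p-value uses a normal approximation to the t distribution, which is only reasonable for a high number of degrees of freedom, and the accuracy values are made up.

```python
import math
from statistics import NormalDist, mean, variance

def two_sample_t(g1, g2):
    """Return the t statistic and the degrees of freedom len(g1) + len(g2) - 2."""
    l1, l2 = len(g1), len(g2)
    t = (mean(g1) - mean(g2)) / math.sqrt(variance(g1) / l1 + variance(g2) / l2)
    return t, l1 + l2 - 2

def verdict(g1, g2, alpha=0.05):
    """'W' if g1 is significantly better, 'L' if worse, 'E' if H0 is not rejected."""
    t, _ = two_sample_t(g1, g2)
    # Two-sided p-value under a normal approximation (an assumption of this sketch)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    if p_value >= alpha:
        return "E"
    return "W" if t > 0 else "L"

# Hypothetical test accuracies of two methods over four runs
acc_a = [0.97, 0.98, 0.96, 0.99]
acc_b = [0.90, 0.91, 0.89, 0.92]
t_value, dof = two_sample_t(acc_a, acc_b)
```

An exact p-value would require the cumulative distribution function of Student's t distribution, which is not part of the Python standard library.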

Conclusion
This work discusses the application of image processing and computational intelligence techniques, introducing the SSDRB classifier and analysing its results in the classification of wagon bogie springs. VGG19 was used for pre-processing the images and proved to be very effective in extracting attributes from them due to its simple structure and good performance.
Compared with the classical classifiers, the SSDRB classifier showed higher accuracy in the tests and proved to be a strong alternative, since it presented the best result for the original data set. Although it did not achieve the numerically best result for images with PSNR of 3 dB, where the Linear SVM showed higher accuracy, the SSDRB maintained a mean accuracy greater than 96%, even under non-Gaussian noise. In addition, the model presented in this paper has other advantages that are not covered by classical methods, such as a learning process that is easy for a specialist to interpret, online or offline training, the ability to classify out-of-sample images and the ability to deal with uncertainty.
The condition of bogie springs can be detected and classified in the present context, generating a significant reduction in cost and time. It is therefore worth mentioning that intelligent systems can support decision-making processes, increasing the flexibility and efficiency of the process. In this respect, the SSDRB classifier is an attractive alternative for quickly and efficiently diagnosing and classifying the condition of wagon bogie springs, reducing the costs and time spent on inspection.
The model discussed in this paper is limited by the need to define meta-parameters, such as the theta of the semi-supervised DRB. Therefore, as future work, we intend to define the meta-parameters optimally using an effective method reported in the literature, such as the one presented in Li et al. (2022).
In addition, as future work, we intend to improve the process by researching more efficient image pre-processing techniques and by evaluating other distance measures for the data density.