Anticipating Railway Operation Disruption Events Based on

Transcription

Anticipating Railway Operation Disruption Events Based on
A publication of
CHEMICAL ENGINEERING TRANSACTIONS
VOL. 33, 2013
Guest Editors: Enrico Zio, Piero Baraldi
Copyright © 2013, AIDIC Servizi S.r.l.,
ISBN 978-88-95608-24-2; ISSN 1974-9791
The Italian Association
of Chemical Engineering
Online at: www.aidic.it/cet
DOI: 10.3303/CET1333120
Anticipating Railway Operation Disruption Events Based on
the Analysis of Discrete-Event Diagnostic Data
Olga Fink*,a, Enrico Ziob, Ulrich Weidmanna
a
Institute for Transportation Planning and Systems, ETH Zurich, Wolfgang-Pauli-Str. 15, 8093 Zurich, Switzerland
Chair on Systems Science and the Energetic Challenge, European Foundation for New Energy-Electricité de France
(EDF) at École Centrale Paris and SUPELEC, Grande Voie des Vignes, 92295 Chatenay-Malabry Cedex, France;
Department of Energy, Politecnico di Milano, Italy
[email protected]
b
Public transport reliability is a highly important factor affecting passenger service quality, transportation
mode choice, and operating costs. Unreliable service increases operating costs and reduces patronage. In
railway systems reliability is significantly influenced by the technical reliability of infrastructure and rolling
stock systems. To meet the stringent requirements on reliability and availability, many advanced railway
systems and components include monitoring and diagnostic tools. The data generated by these systems
can be supplied to data-based methods predicting reliability.
Approaches to predict component failures and remaining useful life are usually based on continuously
measured diagnostic signal data. The use of event-based diagnostic data is limited.
This paper describes our research applying Echo-State Networks (ESN) in combination with Restricted
Boltzmann Machines (RBM) and fuzzy logic to predict potential railway network disruptions based on
discrete-event diagnostic data. The case study focuses on predicting impending failures of a train door
system on the level of an individual system potentially causing disruption events of railway operations.
The proposed approach achieved an average prediction accuracy of 97 %. The research results
demonstrate the suitability of the proposed combination of methods for use in predicting railway operation
disruption events. The findings show that the prediction of medium-term class event patterns is especially
helpful since railway operators can use this information to take remedial actions to prevent the disruption.
1. Introduction
In railway systems with high frequency service, small operational disruptions can lead to cascading events
throughout the network. Anticipating and preventing technical failures of single components or systems
can significantly improve overall service reliability, which is a decisive factor for transportation mode
choice. This can be achieved by predicting impending disruption events caused by technical failures of
single components or systems.
Both railway infrastructure systems as well as rolling stock systems are increasingly equipped with
monitoring and diagnostic systems. These diagnostic systems allow operators to identify failure causes
and locations of failures in shorter times and thereby provide higher availability of systems.
Diagnostic systems can be generally subdivided in those that monitor a parameter continuously and
enable therefore an inference on the condition of the system, and those that automatically record events.
The recorded discrete events can range from confirming an enabled function to indicating that a signal
exceeds defined limits. Some of these events are used to warn the train driver or to assist the
maintenance crew in their fault finding and corrective actions.
Diagnostic systems usually provide huge amounts of high dimensional and dynamic data that is non-linear
and can be difficult to interpret and handle with physical models. Therefore, data-based methods are
increasingly applied to extract the information in the data. It is assumed that there is a huge amount of
Please cite this article as: Fink O., Zio E., Weidmann U., 2013, Anticipating railway operation disruption events based on the analysis of
discrete-event diagnostic data, Chemical Engineering Transactions, 33, 715-720 DOI: 10.3303/CET1333120
715
structure in the data, but the structure is too complicated to be represented by a simple model. In these
cases, machine learning techniques are usually applied.
Different machine learning techniques have been applied to different problems of reliability prediction,
diagnostics and prognostics, such as several types of neural networks (Zio et al., 2012), support vector
machines (Pai, 2006) and combinations of different techniques (Caesarendra et al., 2011).
For continuously measured signals there are several studies with the application focus on railway systems
(Roberts and Silmon, 2012). Many of the studies focus on railway infrastructure systems (Garcia Marquez
and Schmid, 2007). Two of the studies focussed specifically on train door systems (Lehrasab et al., 2002)
and (Smith et al., 2010), as the present study. In both studies, continuously measured diagnostic data
were used as indicators allowing inference on the performance of the considered systems. Examples of
such indicators are electrical current and voltage signals from the open/close cycles, closing and opening
times, pressure, velocity and airflow. In (Smith et al., 2010), the input parameters were used in a
regression model to derive the level of performance of the system. In (Lehrasab et al., 2002) a
combination of different methods was applied to classify performance indicators and to indicate faults. The
achieved classification accuracy was between 66 % and 90 %. The network types applied in these studies
were Multilayer Perceptrons (MLP) with different network design principles (such as e.g. cascade
correlation), Radial Basis Functions (RBF) and Self-Organizing-Maps (SOM).
Discrete-event diagnostic data contain less information on the condition of the system and the evolution of
the condition in time, compared to continuously measured diagnostic data. There are therefore many
differences compared to using continuously measured signals. A previous study (Fink et al., 2013) has
shown that discrete-event diagnostic data can be used for predictions with good classification precision. In
the present study a different approach is applied: the data is pre-processed in a different way and the
classification is now distinguishing between short-term, mid-term and long-term occurring events,
compared to only two classes used in the previous study. Furthermore, the applied algorithms are partly
different. The approach proposed in the present study applies Restricted Boltzmann Machines (RBM),
Echo State Networks (ESN) and fuzzy classification for predicting potential operational disruption events.
Discrete-event diagnostic data from the railway door system was considered in this study. Door systems
can directly or indirectly lead to delays in railway operations. For example, if a sliding step, part of the
primary door system designed to bridge the gap between the train door and the platform, cannot be
retracted, the train will not be permitted to depart. In this case the sliding step must be retracted manually,
either by the conductor or the driver, which results in a delay. Therefore, anticipating failures of door
systems can prevent operational disruptions and service delays and thereby lead to improved reliability
and efficiency. The prediction of mid-term occurring events is particularly beneficial for anticipation and
prevention.
The remainder of the paper is organized as follows. The next Section describes the theoretical background
of the applied methods, which include fuzzy logic, Restricted Boltzmann Machines (RBM), Echo State
Networks (ESN). The full algorithm applied to predict impending diagnostic events with the potential for
causing railway operational disruptions is introduced in Section 3. Section 4 presents the results obtained
applying the algorithm to the passenger train door system diagnostic data. Finally, Section 5 discusses
results and makes recommendations for further research.
2. Theoretical background
In this Section, the theoretical background of the applied algorithms is summarized, namely Fuzzy logic,
Restricted Boltzmann Machines and Echo State Networks.
2.1 Fuzzy logic
Fuzzy logic is the logic of fuzzy sets, which enables dealing with real world phenomena and especially with
vagueness. Fuzzy logic enables the transformation of linguistic variables into appropriate set theory
(Zadeh, 1973). It allows partial set memberships, contrary to crisp set memberships originating from binary
logic in which a value is either part of the set or not. The degree of membership of a fuzzy logic variable is
between 0 and 1 and is defined by a membership function. A membership function can have different
forms, such as triangular, trapezoidal, Gaussian.
In classification, patterns are classified as belonging to a specific set based on similarity in the data.
Especially if the classes can be described by linguistic variables the application of the fuzzy logic is
beneficial.
In the current case study, three classes are defined: data patterns leading to a disruption event in the
short-term, in the medium-term or in the long-term. This classification is sufficiently precise for the
operators to anticipate the failures. The medium-term class event patterns are especially helpful since
railway operators can use this information to take remedial actions to prevent the disruption.
716
2.2 Restricted Boltzmann Machines (RBM)
Restricted Boltzmann Machines are networks of symmetrically connected neuron-like units. RBMs are also
referred to as stochastic neural networks. Boltzmann machines consist of two layers: a visible layer and a
hidden layer. Each unit in the visible layer is connected to all units in the hidden layer and vice versa. The
visible layer contains the input parameters; the hidden layer contains the latent parameters that the
networks learn (Ackley et al., 1985). The hidden layer learns to model the distribution of the visible layer of
variables. However, the units within one layer are not interconnected. Therefore, the networks are called
“restricted”. This restriction simplifies the learning process.
The learning process of RBMs can be either supervised or unsupervised. Single RBMs can be composed
to more complex structures, such as in deep learning, in this case their weights are adjusted after an initial
unsupervised learning process by back propagation learning algorithm (Hinton et al., 2006).
2.3 Echo State Networks (ESN)
Echo state networks are a specific type of recurrent neural networks. Similar to other recurrent neural
networks, ESN are able to exhibit dynamic temporal behavior and have a memory (Jaeger, 2005). ESN
are typically applied for modeling complex dynamic systems. They have been applied for many practical
applications, such as iterated prediction of time series, signal classification and dynamic pattern
recognition tasks (Verstraeten, 2009).
The main structural element of ESN is a reservoir rather than a layered structure. The ESN are defined by
randomly and sparsely connected neurons, which are organized in the reservoir. The weights between the
connected neurons within the reservoir are fixed and are not trained during the training process (Jaeger,
2005).
In the training process, input signals (“training signals”) are presented to the reservoir to induce nonlinear
responses. The single neurons exhibit an “echoed” response to the training signal and generate a variation
or a transformation of the induced training signal. Subsequently, a desired output signal is determined by a
trainable linear combination of all the generated response signals. In supervised learning a teacher signal
is fed back to the reservoir. The process is illustrated schematically in Figure 1.
Figure 1 Functional Principle of Echo State Networks, Based on (Jaeger, 2005)
The ESN’s main difference from other neural networks is that only the weights of the reservoir output
signals are trained (Jaeger, 2005). The weights of the connections within the reservoir are not trained but
are generated randomly. This approach significantly reduces the learning process compared to other
algorithms (e.g. back propagation through time). One of the major properties of the echo state network is
the so-called echo state property. This property ensures that the network has a fading memory and thus
that the influence of initial conditions, which are randomly generated, diminishes asymptotically
(Verstraeten, 2009). Therefore, at the end of the training process only learned relationships influence the
output.
3. Applied data and algorithms
3.1 Applied data
In this study diagnostic discrete-event data from a railway fleet consisting of 52 train sets, part of which
consists of 9 cars and the other part of 11 coaches, each with two doors on each side, was applied. The
available observation period was 313 days (approximately ten months). It is assumed that due to the size
of the fleet, the data set covers all relevant combinations of different parameters and that these differences
are reflected in the occurrence of the diagnostic events. The data are considered sufficient to demonstrate
the feasibility of the approach.
Data were collected automatically by event recorders. These begin recording parameters when a
predefined diagnostic event occurs. The number and character of parameters recorded depends on the
affected system. The parameters include speed, outside temperature, overhead line voltage etc. The
717
recorded events are always assigned a time stamp, train location (i.e. train number, car number) and
actual location via GPS. Additionally, the usage profiles of each door can be deduced from the diagnostic
data. Diagnostic events fall into one category:
- Driver action required – high priority;
- Driver action required – low priority;
- Driver information;
- Maintenance.
Depending on the category, the corresponding events will be communicated to the driver or only to the
maintenance crew. High priority diagnostic events are those that can potentially result in a delay-causing
event. For door systems, high priority events that require driver action include those of the sliding steps
that cannot be retracted or of the doors that do not close prior to departure. The signals that are relevant
for maintenance include information on deviations from the normal operation of various door system
components and subsystems.
There are 261 distinct event codes for the door system considered in this research. These event codes
indicate the specific door affected by the event. For instance, there can be four different codes for one type
of event (one for each door in the car). In this research, the allocation of an event to a specific door system
is performed in the structure of the data and not in the coding of the events. Therefore it was possible to
reduce the 261 codes to 72 distinct events. Out of the 72 events, 12 required a high priority driver action.
Note that the functionality of door systems can also be affected by external influences, such as
passengers obstructing the door. But, only those events originating from technical malfunctions can be
predicted by functional data-driven algorithms.
For this study one of the high priority events with a root cause in technical faults is selected to demonstrate
the feasibility of the approach.
3.2 Preprocessing of input data
In this research, the observation time window was set at four weeks because this was considered
sufficiently long for different diagnostic event patterns to evolve and given the amount of available data
(ten months). In order to cover many different combinations of diagnostic event patterns and also to
generate a sufficient number of input signals, the data patterns were generated by moving a four-week
fixed time window over the 313-day study period, one day at a time. The consequence of this approach is
that the time periods overlap and the data patterns can show similarities.
It is assumed that the last occurred signal that represents an event has the biggest influence on the
system. Therefore, the time since the last occurred event is calculated for all the signals. Thereby, the
whole history of the data range is used and not only the signals occurring in the defined observation
window.
Before presenting the data to the algorithm the inputs were preprocessed. The inputs were normalized to
be in the value range between 0.1 and 0.9. In order to capture possible deviations from the minimum and
the maximum values which are not covered by the selected data set and to ensure a better generalization
ability, the data range was extended symmetrically by 20 %.
In order to ensure a good learning capability of the algorithm, the input data set is balanced in such a way
that the data set is composed of equal numbers of patterns from each of the three classes. This approach
is valid only if the selected data patterns from one class are representative of the class.
Subsequently, the data sequence is randomized in order to ensure that the generalization ability of the
algorithm is not affected by the sequence of the presented patterns.
For this case study, trapezoidal membership functions were selected. The short-term events occur within
less than 11 days, the medium-term events occur within the time period of 7 and 37 days, and the longterm events occur within longer than 30 days (Figure 2).
Figure 2 Trapezoidal membership functions (short-term, mid-term, long-term)
718
3.3 Applied algorithm
The applied algorithm was composed of a Restricted Boltzmann Machine and several Echo State
Networks, applied to subsets of the data either in parallel or in series. The proposed algorithm learns to
distinguish between the patterns assigned to the three fuzzy classes.
In the first step, the input is presented to a RBM which learns in an unsupervised manner the probability
distribution of the hidden layer. The output of the RBM, the probability distribution of the hidden layer, is
evenly partitioned onto three subsets with overlapping of the boundaries by 10 % of the input length for
each of the subsets. Each of the subsets is presented to an echo state reservoir with different reservoir
parameters. The outputs of the three reservoirs are joined in the next step in a single reservoir which is
trained in a supervised manner with ridge regression. Ridge regression is similar to ordinary linear
regression, but with the difference that a regularization term is included in the minimization problem of
residuals, which imposes rigidity (Hoerl and Kennard, 1970).
In order to compare the results, the class with maximum degree of membership in the original data set is
compared to the predicted class.
4. Results
The holdout technique is used to evaluate the generalization ability of the algorithm classification. For this
purpose, the data set is divided into two subsets, in a way that both are representative of the underlying
distribution of the entire data set and are independent. The algorithm learns based on the first subset of
training data to generalize the data patterns. The second subset of testing data is used to test the
generalization ability of the algorithm. In this research 90 % of data were used for training and 10 % for
testing. Totally, 1075 data patterns were presented to the algorithm during the training phase. The
algorithm was tested on 119 patterns. As mentioned above, the composition of the primary data set
(including training and testing data sets) is balanced between all of the three classes. Both subsets, for
training and testing, are also balanced by randomizing the sequence of the patterns.
In the first step the misclassification rate is computed to evaluate the overall performance of the algorithm.
The misclassification rate does not distinguish between the classification performances of the algorithm for
different classes. It only gives information on the rate of patterns that were misclassified by the algorithm,
irrespective of which of the three classes was misclassified. In the present study the misclassification rate
was 3.5 %.
The confusion matrix is a performance assessment parameter that can distinguish between the classes
and compare the classification performance within a single class (Aronoff, 1982). The classification
performance within each of the classes is displayed in Table 1. The Table shows that the classification
performances of the classes “short” and “long” give a similarly good performance of correctly classified
patterns, above 97 %. The classification performance for the class “mid” is 95 %, slightly below that of the
other two classes. This is due to more similarity of data patterns of the “mid” class to those of the other
classes.
Table 1: Confusion matrix for the classification results
Class “short”
(predicted)
Class “short” (actual) 97.4 %
Class “mid” (actual) 2.3 %
Class “long” (actual) 0 %
Class “mid”
(predicted)
0%
95.3 %
2.7 %
Class “long”
(predicted)
2.6 %
2.3 %
97.3 %
5. Conclusions and discussion
The study proposes a combination of algorithms to extract information on the condition of a railway system
from diagnostic event data. The proposition applied to a specific case study of a door system leads to
good results in terms of precision of classification of potentially occurring disruptions in railways operation.
The main practical purpose of the performed predictions is to reduce the number of operational disruption
events and thereby increase service reliability. This approach enables railway operators to reduce
corrective maintenance actions by improving their planned predictive maintenance actions and thus to
reduce overall maintenance costs.
The prediction of operational disruptions on mid-term horizon is the most beneficial for railway operators.
In principle, if all the mid-term events could be anticipated before occurrence, there would be no short-term
events. Actually, those events in which the fault progression is accelerated and the patterns cannot be
detected in mid-range will still occur even if the anticipation of predicted mid-range failures is successful.
719
Despite the good performance witnessed in the case studied, there remain some limitations. First, only
diagnostic event data could be included in the model developed. The significance of the results could be
increased by integrating additional data in the approach (e.g. data on performed maintenance activities).
There are several other ways to extend the research. The history of the data is partly included in the
approach by computing the interval from the present point of time to the last occurred event. Integrating
the evolution of the condition in time would enable dynamic monitoring of the process: an observed “jump”
from class “long-term” to class “mid-term” would initiate a more attentive monitoring procedure and alert
maintenance planning.
Additionally, part of the data could not be used because the record of the events does not only contain
disruptions caused by technical malfunction but also those caused by external influences, such as
passengers obstructing the door etc. To be able to distinguish the high priority events due to technical root
causes, it might be beneficial to introduce filtering algorithms that can filter the data in an unsupervised
manner in order only to distinguish those data patterns with the technical root cause.
6. Acknowledgement
The authors would like to thank ALSTOM Transportation for providing the data for this research project.
The participation of Olga Fink to this research is partially supported by the Swiss National Science
Foundation (SNF) under grant number 205121_147175.
The participation of Enrico Zio to this research is partially supported by the China NSFC under grant
number 71231001.
References
Ackley D. H., Hinton G. E., Sejnowski T. J. 1985, A Learning Algorithm for Boltzmann Machines, Cognitive
Science 9 (1): 147-169.
Aronoff S. 1982, Classification Accuracy: A User Approach, Photogrammetric Engineering and Remote
Sensing 48 1299-1307.
Caesarendra W., Widodo A., Pham Hong T., Bo-Suk Y., Setiawan J. D. 2011, Combined Probability
Approach and Indirect Data-Driven Method for Bearing Degradation Prognostics, IEEE Transactions on
Reliability 60 (1): 14-20.
Fink O., Nash A., Weidmann U. 2013, Predicting Potential Railway Operations Disruptions Caused by
Critical Component Failure Using Echo State Neural Networks and Automatically Collected Diagnostic
Data, Transportation Research Record: Journal of the Transportation Research Board
Garcia Marquez F. P., Schmid F. 2007, A Digital Filter-Based Approach to the Remote Condition
Monitoring of Railway Turnouts, Reliability Engineering & System Safety 92 (6): 830-840.
Hinton G. E., Osindero S., Teh Y.-W. 2006, A Fast Learning Algorithm for Deep Belief Nets, Neural
Computation 18 (7): 1527-1554.
Hoerl A. E., Kennard R. W. 1970, Ridge Regression: Biased Estimation for Nonorthogonal Problems,
Technometrics 12 (1): 55-67.
Jaeger H. 2005, Tutorial on Training Recurrent Neural Networks, GMD Report 159, German National
Research Center for Information Technology
Lehrasab N., Dassanayake H. P. B., Roberts C., Fararooy S., Goodman C. J. 2002, Industrial Fault
Diagnosis: Pneumatic Train Door Case Study, Proceedings of the Institution of Mechanical Engineers,
Part F: Journal of Rail and Rapid Transit 216 (3): 175-183.
Pai P.-F. 2006, System Reliability Forecasting by Support Vector Machines with Genetic Algorithms,
Mathematical and Computer Modelling 43 (3-4): 262-274.
Roberts C., Silmon J. 2012, Advanced Techniques for Monitoring the Condition of Mission-Critical Railway
Equipment, Railway Safety, Reliability, and Security: Technologies and Systems Engineering, F.
Flamini, Hershey PA, Information Science Reference 341-354.
Smith A. E., Coit D. W., Yun-Chia L. 2010, Neural Network Models to Anticipate Failures of Airport Ground
Transportation Vehicle Doors, Automation Science and Engineering, IEEE Transactions on 7 (1): 183188.
Verstraeten D. 2009, Reservoir Computing: Computation with Dynamical Systems, Faculteit
Ingenieurswetenschappen, Ghent, Ghent University
Zadeh L. A. 1973, Outline of a New Approach to the Analysis of Complex Systems and Decision
Processes, IEEE Transactions on Systems, Man and Cybernetics (1): 28-44.
Zio E., Broggi M., Golea L. R., Pedroni N. 2012, Failure and Reliability Predictions by Infinite Impulse
Response Locally Recurrent Neural Networks, Chemical engineering transactions 26
720

Documents pareils