Predicting Roughness Progression of Asphalt Overlays
JOINT CSHRP/ALBERTA BAYESIAN APPLICATION

1.0 INTRODUCTION

Bayesian analysis was first introduced
to Alberta Transportation and Utilities staff at a
workshop held in Calgary in May 1991. Clayton, Sparks and
Associates Inc. (now VEMAX Management Inc.) of Saskatoon
in cooperation with Decision Focus Inc. of California,
were hired by the Canadian Strategic Highway Research
Program (CSHRP) to develop and transfer Bayesian
Statistical Methods to Canadian Highway Agencies.

Bayesian statistical theory was
developed by Reverend Thomas Bayes in his paper entitled
"An Essay Towards Solving a Problem in the Doctrine
of Chances", published in 1763. The theory was later
revisited in the mid-to-late 1980s by Raiffa, Pratt and
Zellner. The main advantage of this statistical method
over conventional statistics is that expert judgement can
be combined, in a statistically rigorous manner with real
world data. This approach is especially useful in
situations where the databases are of insufficient
quality or size to support classical regression. Such
situations are very common in highway performance
modelling applications. In November 1994, a workshop on
Bayesian methods was held in Winnipeg. Alan Mah from
Research and Development Branch and Marian Kurlanda from
Roadway Engineering Branch attended this workshop. It was
decided that each of the eight agencies participating in
the Joint CSHRP/Agency Bayesian Application Project would
prepare its own model using the Bayesian approach, go
through a first iteration, and present their findings in
Ottawa in May 1995. In January 1995, Alberta's Bayesian
Application was selected to be the development of an
overlay roughness progression model for the first overlay
cycle. To limit the scope of the modelling effort, it was
decided that only pavement sections located in Central
Alberta with granular base courses would be considered.
Roughness was selected because the Department's PMS
lacked an adequate model. The Department's existing model
predicts roughness progression of original pavements. It
was felt that a new model specific to overlays would be
timely given the decrease in new pavement construction
and the increasing emphasis on pavement
rehabilitation.

2.0 BAYESIAN METHOD

The Bayesian statistical approach
combines prior knowledge (experience) with field data. In
highway engineering, new models are continually needed to
better predict pavement performance or to run various
Pavement Management Systems; however, it takes much time
and expense to gather data about pavement performance. In
such situations, the Bayesian approach is useful in
short-circuiting the data collection cycle. After gathering
some data, which may not be sufficient to support
meaningful classical regression, one can collect some
expert judgement and combine the two sources of
information into a relatively robust regression model.
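As a rough illustration of how the two sources of information are combined, the sketch below merges an expert-based estimate and a data-based estimate of a single regression coefficient by weighting each by its precision (the inverse of its variance). This is the standard normal-normal result, simplified to one coefficient and ignoring covariances between coefficients. It is not a description of how any particular software performs the calculation, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Normal summary (mean and variance) of one regression coefficient."""
    mean: float
    variance: float


def combine(prior: Estimate, data: Estimate) -> Estimate:
    """Precision-weighted (normal-normal) combination of two estimates."""
    if prior.variance <= 0 or data.variance <= 0:
        raise ValueError("variances must be positive")
    w_prior = 1.0 / prior.variance  # precision of the expert judgement
    w_data = 1.0 / data.variance    # precision of the field data
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior.mean + w_data * data.mean)
    return Estimate(post_mean, post_var)


# Hypothetical values: the experts believe RCI drops about 0.10 per year of
# overlay age and are fairly sure of it; a small, noisy data set suggests
# a drop of 0.04 per year with twice the standard deviation.
prior = Estimate(mean=-0.10, variance=0.02 ** 2)
data = Estimate(mean=-0.04, variance=0.04 ** 2)
posterior = combine(prior, data)
print(f"posterior mean = {posterior.mean:.3f}, "
      f"sd = {posterior.variance ** 0.5:.3f}")
```

In this example the posterior lies closer to the expert value because the expert estimate carries the smaller variance, and the posterior variance is smaller than either input variance.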
The expert judgement serves to bridge the gaps in field
data. It is obvious that a lot of valuable
information can be obtained from people who have observed
pavement performance throughout their careers. These
professional and field staff know what variables are
contributing to pavement performance. They understand the
functional relations of the variables. Their impressions
of these relationships can be encoded and, when combined
with field data, can have a profound
impact on the resulting posterior models.

Several steps can be followed to guide the execution of a Bayesian statistical analysis. These steps are outlined in what has been termed "the template" (see Figure 1). The Bayesian Application in Alberta
followed the steps laid out in the template. The
discussion also follows this general outline.

2.1 Model Selection

As described in the introduction, we
have decided to model pavement performance in terms of
roughness, and specifically Riding Comfort Index (RCI) of
asphalt overlays (1st overlay) on pavements over granular
base located in the central part of Alberta.

2.2 Select Dependent Variable

Riding Comfort Index (RCI) was selected
as the dependent variable. RCI is a measure of pavement
roughness. In Alberta, roughness is assessed subjectively
by a panel of highway engineers; this method was first
suggested by the Canadian Good Roads Association. To use
this subjective judgement for pavement inventory
purposes, the panel's subjective RCI ratings are
correlated to results obtained using response type
roughness measuring equipment (in Alberta's case, the Cox
Road Meter). A correlation is then used to convert the
number of counts obtained using the Cox Road Meter for a
particular highway to RCI. Typically, in developing the
correlation, a circuit of pavement test sections is
established. The sections are rated by a panel of raters
who drive the sections. The sections' roughness is also
measured using the Department's Cox Road Meters.
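A minimal sketch of this calibration step is given below. It assumes, purely for illustration, that the panel's mean RCI varies linearly with the logarithm of the road meter counts per kilometre and that RCI is reported on a 0 to 10 scale. The Department's actual correlation equations are not reproduced here, and the circuit data shown are invented.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical calibration circuit: (counts per km, mean panel RCI)
circuit = [(50, 8.1), (120, 7.0), (300, 5.9), (700, 4.8), (1500, 3.9)]
log_counts = [math.log10(counts) for counts, _ in circuit]
panel_rci = [rci for _, rci in circuit]
a, b = fit_line(log_counts, panel_rci)


def counts_to_rci(counts_per_km):
    """Convert a road meter reading to RCI, clamped to the assumed 0-10 scale."""
    if counts_per_km <= 0:
        raise ValueError("counts per km must be positive")
    return max(0.0, min(10.0, a + b * math.log10(counts_per_km)))


print(f"RCI at 400 counts/km: {counts_to_rci(400):.1f}")
```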
Equations correlating Riding Comfort Index and number of counts per
kilometer (as recorded by the Cox Meters) are then
developed. These correlations are used in the
Department's Pavement Management System.

Correlation of Road Meters with Riding
Comfort Index is a standard procedure performed by
Roadway Engineering Branch every three to four years. The
procedure is also performed whenever major
repairs are made to the Department's Cox Road Meter(s).

RCI is a subjective measure of
roughness and is not stable over time. Its value depends
on many other subjective factors and not just pavement
roughness.

2.3 Select Model Type

In regression analysis, Classical or
Bayesian, there are three distinct types of models:
empirical, mechanistic and empirical-mechanistic. The
different types of models essentially refer to how the
functional form of the regression equation was derived.
With empirical models, the functional form of the
regression equation is derived by statistically
processing the data; the data drives the selection of
the functional form. Typically, software programs such as
SAS, Shazam or NCSS are used to facilitate the selection
of the empirical functional form. With mechanistic
models, the functional form is developed from engineering
first principles (i.e. physical laws like F=ma).
Empirical-mechanistic models blend the two
approaches. The Alberta Bayesian roughness model
uses an empirical model type.

2.4 Select Independent Variables

Road roughness is one of the most important pavement evaluation parameters; unfortunately, it is also one of the most complex. The mechanism driving roughness progression is not fully understood; however, some work has been undertaken to quantify it. For instance, the World Bank's Highway Design and Maintenance Model (HDM) postulated that changes in roughness are due to a combination of:
The HDM model (a relatively sophisticated model) predicts roughness progression as a function of:
The HDM model uses parameters that are
not very often recorded in pavement management databases.
To support the calibration of such a complex regression
model one would need data from a variety of pavement
sections where the various forms of distresses had been
measured for a long time. So far no one has developed
such a database (with the exception of the World Bank
studies, which were in tropical and subtropical areas). Efforts
like CSHRP may provide one in the future, but the
ongoing care and feeding of these complex models will
continue to be a problem.

On the other side of the complexity
spectrum, Alberta's current roughness model predicts
roughness only as a function of the immediate past
roughness and pavement age. This model is currently used
in the Department's Pavement Management System. The
proposed Bayesian model would be somewhat more complex.
Our intention was to expand the model to consider 4 to 6
independent variables.

To select which independent variables were to be included in our model, we solicited the opinion of experts in the Department. A list of candidate variables was developed and the experts were asked to rank each variable with respect to its influence on the RCI of overlays. The candidate variables provided to the experts included:

All these variables are routinely gathered and stored in Alberta's Pavement Management System database.

A questionnaire was prepared and circulated to the seven experts listed in Table 1. The experts were selected from Alberta Transportation and collectively reflect 164 years of transportation experience. The experts were asked which variables they felt were the most important with respect to predicting the RCI of an overlay. Each variable was rated using a five-point scale (very important 100%, 75%, 50%, 25%, and not important 0%). The ratings of all seven participating experts were entered into a Microsoft Excel spreadsheet and the statistics for each variable were calculated. Results of this analysis are summarized in Figure 2. Additional details with respect to the selection of the independent variables are provided in Appendices A and B.

Figure 2 - Ranking of Candidate Variables

Based on the experts' ranking, the following six variables were selected to be used in the first iteration of the Bayesian model.

2.4.1 Development of Soil Roughness Factor

The soil_type variable listed above is
a categorical variable with many states (CH, CL, etc.).
As such, it is difficult to implement in the context of a
regression analysis. In order to capture the effect of
soil type in the regression model, a correlation was
developed that would map the effect of different soil
types to a continuous variable which reflected its
perceived influence on RCI. The correlation was developed
subjectively by encoding our panel of experts.

The soil factor developed by our
experts quantifies the aggressiveness of different soil
types with respect to roughness. Each soil type was rated
on a scale from zero to five (0 to 5). An index
value of "0" reflected a passive soil
environment (i.e., soils having very little or no effect
on the roughness progression). A value of "5"
reflected a soil type that contributes substantially to
roughness progression. The experts were provided with a
questionnaire and asked to rate the soil types described.
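One way to turn such ratings into a continuous variable is sketched below: each soil type's ratings are averaged across the experts and the mean is used as its soil roughness factor. Averaging is an assumption made for this illustration, and the ratings and the soil classes other than CH and CL are hypothetical.

```python
from statistics import mean

# Hypothetical expert ratings per soil type (0 = passive, 5 = aggressive).
ratings = {
    "CH": [5, 4, 5],
    "CL": [3, 3, 4],
    "SM": [1, 2, 1],
    "GW": [0, 1, 0],
}

# Collapse the ratings into one continuous factor per soil type.
soil_factor = {soil: round(mean(values), 2) for soil, values in ratings.items()}


def soil_roughness_factor(soil_type: str) -> float:
    """Look up the continuous soil factor for a categorical soil type."""
    try:
        return soil_factor[soil_type]
    except KeyError:
        raise KeyError(f"no expert rating available for soil type {soil_type!r}")


for soil in ratings:
    print(soil, soil_roughness_factor(soil))
```

The resulting factor can then enter a regression as an ordinary numeric variable in place of the categorical soil type.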
The questionnaire and the results obtained from our experts are
attached in Appendix C.

2.5 Postulate Functional Form

The following functional form was proposed for the first iteration of the Bayesian project:

The first iteration model was purposely
designed to be as simple as possible, and an
additive-linear functional form was selected. Subsequent
iterations may attempt to improve the functional form of
the model by adding cluster variables. One suggestion for
future iterations is to explore a curvilinear
correlation between RCI and the overlay age variable.

2.6 Assemble Information

To calculate the coefficients (b_i's) of
the regression equation, the data (expert judgement and
field data) has to be gathered. The data would be
composed of:
All data was combined in Microsoft
Excel.

2.6.1 Assembling Sample Data

Sample data was obtained from the
Department's Pavement Management System datafile. There
are two datafiles used in the Department's PMS; one for
primary and one for secondary highway networks. The files
are updated each year and both reside on the Department's
mainframe computer. The files were transferred from the
mainframe computer to a personal computer and stored in
ASCII format. However, the files were so large (8 MB)
that only the SAS text editor was able to handle them.
The files were divided into 35 files, each of
approximately 200,000 bytes. This allowed each of the
smaller files to be handled using the QED text editor.

The data files were selectively
compiled into the "Bayesian database". Certain
limitations were set on information that was brought into
the Bayesian database. These limitations included
screening the data and including only those sections
which complied with the following criteria:
Several macro routines were written
using the QED text editor to assemble the sample data in
a format that could be easily transferred to Excel. These
routines greatly reduced the time required to
preprocess the PMS data.

Preliminary regression analyses were run on the
data extracted from the Department's PMS. These analyses
showed that despite our attempt to filter the data, it
was still quite 'noisy'. Further analysis of the data
showed that there were errors in our PMS datafile. These
errors were due to several reasons, including but not
limited to, the following:
Data with obvious errors were deleted
from the Bayesian database. However, the discrepancies
due to the subjectivity of RCI could not be addressed
within the scope of this project.

The sample data was also used to
prepare histograms for each variable. These histograms
helped to screen the data (quality assurance), assess the
distribution of each variable, and to set "encoding
intervals".

2.6.2 Assembling Expert Judgment Data (Prior Data)

Several methods have been developed by
the Consultant for encoding an expert's
judgement. For our project, the full orthogonal matrix
method was selected. The advantages of this method
include:
The disadvantages of this method
include:
The matrix used to encode the judgement of the experts is shown in Appendix D. The matrix contains two sheets; each sheet is specific to a different setting of the Soil Roughness Factor variable (1.0 and 4.0). The remaining variables were encoded at two or three levels, as illustrated in Table 2. The variables thought to dominate roughness progression were encoded using three levels. Less dominant variables were encoded using two levels.

An encoding package was developed to
support the encoding of the experts. This package
defined: the problem, the variables and the process of
encoding expert judgement. The purpose of the package was
to ensure that there was no misunderstanding amongst the
experts and that they all interpreted the problem
consistently. A copy of the encoding package is included
in Appendix D.

The encoding package was sent to each
of the seven experts listed in Table 1. Five of the experts reviewed the
package and completed the matrices as instructed. The
completed matrices were returned to us for analysis.

Information from the matrix was
manually entered into Excel. The resulting spreadsheet
for each expert has an identical format to the
spreadsheet containing the sample data. The expert
judgement data is shown in Appendix E and the PMS data in
Appendix F.

The Bayesian analysis consisted of a single iteration using a combined prior representing the five experts encoded. The analysis utilized the XLBayes software. The methodology followed is illustrated conceptually in Figure 3.

Figure 3 - Bayesian Analysis Process

A combined prior was selected because
the Department wanted a "single" model which
reflected the Department's understanding of the problem.
If each expert had been analysed separately, five separate
posterior models would have been developed. The
Department would then be faced with the challenge of
selecting amongst these models. This dilemma was
pre-empted by developing a combined prior.

Before combining the experts, their individual prior models were compared (quality assurance) to determine if the experts agreed on the relative impact that each independent variable had on roughness progression. This was accomplished by running a classical regression analysis on each of their encoded judgements and then computing prediction sensitivity results for each of their (prior) models. As illustrated in Figure 4, the experts were in agreement. All of the experts agreed on the sign and magnitude of each of the coefficients. These results are very promising, as they demonstrate that there is a consistent understanding amongst the experts with respect to pavement roughness and the factors that influence it.

Figure 4 - Sensitivity Analysis

Given the high degree of agreement
amongst the experts, we were confident in our decision to
combine their judgements into a single prior.

The combined prior was computed using
the classical regression option in XLBayes. This process
was straightforward. The raw dependent and independent
data from the experts was "ranged in" and the
program executed. XLBayes computed and returned the
statistics for the prior, namely:
Once the prior was calculated, the next
step was to select the NPrior option in the main menu
and compute the posterior model. To do this, the prior and
sample data information was "ranged in" and the
program executed. XLBayes then computed and returned the
statistics for the prior model, posterior model and data
model. The prediction feature is optional and was not
used for this analysis.

In the event that a reader may want to reproduce our results, Table 3 provides the names of the files used in the analysis described above.

4.0 MODEL RESULTS AND EVALUATION

As a result of the Bayesian analysis, a new (posterior) model for predicting the RCI of overlaid pavement was developed. The model has six independent variables. The predictive (posterior) equation is as shown below:

The model combines data taken from the
Department's Pavement Management System (PMS) and expert
judgement of the five experts who participated in the final
encoding of their expertise.

Table 4 illustrates the resulting Prior model (based on expert judgement only), Data model (based on data only), and Posterior model (combined) resulting from the analysis. The model results were analyzed using the Evaluation Table shown in Table 5. For each independent variable, the rationality of the sign and magnitude, as well as the statistical significance of the coefficient, is compared. The table indicates which information (prior or data) is reflected in the posterior. The output of the analysis is shown in Appendix G and the model sensitivity analysis in Appendix H.

In the case of our models, all
variables in all three models have rational signs. There
is reasonable agreement between the data and the experts
on the magnitude of each coefficient. There do not
appear to be any significant outliers with respect to
sign.

The results indicate that the posterior
model relies more heavily on the expert judgement than on the
PMS data. This is a function of the relative degrees of
information and variances/covariances for the two data
sets. This can be seen graphically in the probabilistic
density plots for each variable included in Appendix G.

If the posterior model were used, the
predicted RCI values would be lower than those predicted
using the data model. The analysis also indicates that,
as a rule, predictions made using the posterior model will
have less uncertainty than the predictions made using the
data model.

Three out of six independent variables
have not produced appreciable changes to the model. These
were: rci_perf (RCI performance of the original
pavement), soil_factor and ol_thick (overlay thickness).
In the case of the rci_perf variable, it may be worthwhile to
redefine this variable. As it is used in the model, the
variable is interpreted as the slope of the RCI-time
curve for the original pavement. As such, the assumption
made is that the RCI value changes as a linear (straight
line) function of time. In fact, the change can be along
a curve. Further investigation into this relationship may
be needed.

The soil_factor variable was developed
to facilitate the inclusion of soil influence in the
model. The variable was developed using expert judgement.
It may be that two soil variables are required: one to
describe how the soil influences development of
transverse cracks, and the other to describe how the soil
influences development of heaving resulting from
expansion or frost action. Both distresses will cause
pavement roughness, but they are governed by
different physical phenomena.

Overlay thickness (ol_thick) is the
third variable for which the posterior's change over the
prior was not appreciable. The model indicates that the
overlay thickness does not contribute substantially to
the performance of the overlaid pavements.

The overlay age (ol_age) and the
initial RCI after the overlay (ini_rci) have the greatest
impact on roughness progression. Subsequent fine tuning
of the model should focus on improving these two
variables. One possibility may be to transform the
variables. In keeping with the philosophy of Bayes, the experts
should be solicited for their ideas on how best to
transform the age and traffic variables.

Redevelopment of traffic data
(total_esal) and segmentation of the model is another
possibility to improve the model. The distribution of the
traffic data in the PMS is bimodal. As such, two models
could be developed: one for low and one for high traffic
volumes. The second iteration of the model could also
include a redefined traffic variable. Instead of using
annual traffic, cumulative traffic may be tried.

5.0 CONCLUSIONS

As part of the joint Alberta/CSHRP
Bayesian Applications Project, a model for predicting the
RCI of overlaid pavements with granular bases located in
central Alberta was developed. The model combines data
from the Department's PMS and the expert judgement of
five experts. The model can be used in the Department's
PMS; however, some changes in the PMS program would be
needed to facilitate its integration into the software.

In the future, the Bayesian method
should definitely be investigated as a tool for other
model development projects in the Department, especially
in those areas where not much historical data exists.
Such projects may include the prediction of performance
of new materials and pavement treatments not previously
used by the Department. The performance of crack sealants
may be one example. This method is not limited to
pavement engineering problems. Other areas of possible
use include traffic engineering, location studies or
geotechnical engineering problems.

At present, use of the method has been confined to the Roadway Engineering Branch of the Department. Only a handful of people were involved in the study. At first, the participants were skeptical about the method and its potential, but with time and more information they became convinced that the method has application within the Department. However, more work should be done to further publicize the method and its potential. An XLBayes manual and brochure explaining the method and its advantages would be of great value to highway agencies across Canada.