| Prediction of Rutting in Alternative Asphalt Concrete Overlay Methods
JOINT CSHRP/NEW BRUNSWICK BAYESIAN APPLICATION

4.2 ITERATION 2

4.2.1 Definition of Model

Two specific models to predict rutting
are to be developed to address both thick and thin
asphalt concrete overlays on asphalt concrete pavements.

4.2.2 Dependent Variable

As in the first iteration, the dependent variable (Yi) is the depth of rutting, measured in millimeters as per the SHRPP333 specification (1.2 m straight edge).

4.2.3 Independent Variables

The independent variables selected for
the three models were based on variables which were
included in the CLTPP rutting model regression
equation. All of these variables were for the top lift of
a virgin AC overlay. The variables selected are: % Air Voids; % Retained on 4.75 mm sieve; % Crushed particles; Age in years; and Traffic Log10 annual lane.

4.2.4 Model Type and Functional Form

The model is an empirical model with a curvilinear functional form (Log Traffic):

Rut = Bo + B1 (Voids) + B2 (Retained) + B3 (Age) + B4 (Crushed) + B5 (Log Traffic)

4.2.5 Model Inputs

4.2.5.1 Actual Data "Data"

There was no new data introduced to
either database for the second iteration. The consultant suggested removing any zero distress data from the database. Two of the assumptions of regression are, first, that the errors have a mean equal to zero (error = predicted - actual) and, second, that the distribution of the error is normal. Zero data points would have no error and therefore would not be normally distributed. Thus zero distress data do not fit the assumptions of regression and cannot be used.

As well as removing zero distress data, the variable thickness was removed from both the thick and thin databases. Only those contracts with complete sets of independent variables and corresponding dependent variables were included.

4.2.5.2 Encoding Expert Judgement "Prior"

The same expert judgement encoded from
the nine experts used in iteration one was used in
iteration two. The variable thickness was removed from
each matrix and classical regressions were redone for
both models. For the thick model the expert judgement
from the thick overlay with milling and the thick overlay
with padding matrices had to be combined before a
classical regression could be done.

4.2.6 Analysis of Data

The process of selecting the best
combination of experts to give the best model was a trial
and error approach. This trial and error approach was
first carried out on the thin model. Due to time
constraints, the same experts who were found to give the
best predictions for the thin model were then used as the
best combination of experts for the thick model.

As suggested by the consultant at the workshop, all of the zero distress data was removed.

4.2.7 Model Runs

4.2.7.1 Thin Model

A summary sheet comparing model predictions to actual measured rut depth follows the model runs on page 38.

4.2.7.1.1 First Model Run

For the second iteration of the thin
model, the first step in selecting the best combination
of experts was to include all of the expert judgement for
the prior in one database. This prior was run through the
XLBayes program with the thin database, and the resulting
posterior model predicted twice the measured rut
value (see Appendix F, Iteration 2, Run 1 for results).
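
Each model run in this section is judged by comparing predicted rut depth with measured rut depth. The Python sketch below illustrates that comparison using the functional form from 4.2.4. The coefficients, contract inputs and measured depths are hypothetical placeholders, not values from the report, and the function names are illustrative. The mean ratio is only one plausible way to summarise a statement such as "predicting twice the measured rut value"; it is not the XLBayes procedure.

```python
from statistics import mean


def predict_rut(coeffs, voids, retained, age, crushed, log_traffic):
    """Evaluate Rut = Bo + B1*Voids + B2*Retained + B3*Age + B4*Crushed + B5*LogTraffic."""
    b0, b1, b2, b3, b4, b5 = coeffs
    return (b0 + b1 * voids + b2 * retained + b3 * age
            + b4 * crushed + b5 * log_traffic)


def mean_prediction_ratio(predicted, measured):
    """Average predicted/measured ratio; 1.0 means unbiased on average.

    Measured depths must be non-zero, which holds here because zero
    distress data were removed from the databases (4.2.5.1).
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return mean(p / m for p, m in zip(predicted, measured))


# Hypothetical coefficients (Bo, B1..B5) and contracts, for illustration only.
coeffs = (1.0, 0.5, 0.0, 1.0, 0.0, 2.0)
contracts = [
    # (voids %, retained %, age in years, crushed %, log10 traffic)
    (4.0, 40.0, 3.0, 80.0, 5.0),
    (6.0, 45.0, 1.0, 90.0, 4.0),
]
measured = [8.0, 6.5]  # hypothetical measured rut depths, mm

predicted = [predict_rut(coeffs, *c) for c in contracts]
ratio = mean_prediction_ratio(predicted, measured)
print(f"mean predicted/measured ratio: {ratio:.2f}")
```

A ratio near 1.0 would indicate predictions close to the measured values on average; the same summary could be recomputed after each change to the set of experts.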
4.2.7.1.2 Second Model Run

To help select the best combination of experts, a line plot comparing each expert's predictions was constructed (page 35). This plot showed that expert Flemming was predicting very high rut values and that expert Nicholson was predicting very low rut values. In additional discussions, both experts identified areas of uncertainty and were not comfortable with their estimates of rut values. Therefore the second run on the thin model had Flemming's and Nicholson's expert judgement removed. This second run brought the predicted values closer to the measured rut values (see Appendix F, Iteration 2, Run 2 for results).

Figure: Line Plot Comparison of Experts

4.2.7.1.3 Third Model Run

Another line plot of predicted rut
values was constructed, which showed two groupings: the predictions of experts Hughes and Robertson were grouped below those of the other experts. Therefore, for this run the expert judgement from Hughes and Robertson was removed, leaving experts MacFarlane, Legere, Leblanc, Doucet and Crandall. For this third run the experts were combined using an NPrior approach, in which each expert's judgement is statistically combined in series until a single database results, as opposed to compiling all of the expert judgement into one database at once. The resultant model pushed the predicted rut values even further above the measured values than the previous two models had. See Appendix F for complete calculations.

4.2.7.1.4 Fourth Model Run

Since Ray Leblanc's predictions were
higher than the other experts, a separate model was
developed to determine if his predictions were higher
because of the region of the Province in which he worked.
A Bayesian model was developed from a database consisting
of data from contracts from his region of the Province
together with a "prior" database using only his
expert judgement. The result of this model showed that
his predictions were still higher than the actual
measured rut values for those contracts on which he
worked. It appears from the results that he anticipates
that rutting is a more serious problem than it is.
Therefore run four (4) has expert Leblanc removed, leaving four experts: MacFarlane, Legere, Doucet and Crandall. For this fourth run, two trials were made to develop a model. The first trial combined the experts into one database in one step, while the second trial combined the experts using the NPrior statistical approach described previously. There was a negligible difference between the two methods of combining the expert judgement. This model, with Leblanc removed, predicted rut values closer to the actual measured rut values than did the third model run.

4.2.7.1.5 Fifth Model Run

This fifth run included the experts from run 2 except LeBlanc, leaving experts Robertson, MacFarlane, Legere, Hughes, Doucet and Crandall. The resulting model predicted closer to the actual measured rut values than any previous model.

It is interesting to note that when
combining the experts an attempt was made to hold the
Degrees of Freedom (DOF) constant for each expert so that
each expert was assured of having equal weighting on the
resultant model. However, it was found that there was a
negligible difference between this model and the
resultant model where all experts were combined in one
operation. Therefore, the method of inputting the prior
expert judgement data as one database in one step works
just as well.

4.2.7.1.6 Sixth Model Run

The sixth model run used the same experts as model five (5): Robertson, MacFarlane, Legere, Hughes, Doucet and Crandall. This model combined the expert judgement by taking the average rut value of the 6 experts for each cell location of a 48 cell matrix. This model predicted closer to the actual measured rut values than any of the previous models.

4.2.7.1.7 Seventh Model Run

The seventh model run was attempted to see if the predicted rut value could come closer to the actual measured rut value. This model was the same as model six but had the variable "% crushed" removed because it appeared to be acting like a constant. Removing "% crushed" worsened the predictions. Therefore, from a predictive versus actual rut measurement comparison, the model resulting from the sixth run proved to be the best model.

4.2.7.2 Thick Model

A summary sheet comparing thick models to actual rut depth (mm) can be viewed on page 41.

4.2.7.2.1 First Model Run

The Thick with Milling and Thick with
Padding databases were combined for the second iteration.
Due to time limitations the experts that were selected as
the best combination of experts in the THIN model
(Nicholson, Flemming and LeBlanc removed) continued to be
used as the best combination of experts for the Thick
model. All the expert judgement was combined into one
database. This first model run was not predicting rut
values close to the actual measured rut values.

4.2.7.2.2 Second Model Run

For the second run the expert judgement was combined in a 48 cell matrix using average rut values (the milling and padding values were averaged for each expert, and the results were then averaged across the 6 experts). This model predicted marginally better than the first run.

4.2.7.2.3 Third Model Run

Additional quality control was done on the database and the remaining outliers were removed. This model predicted rut values closer to the actual measured values than the first two models.

4.2.7.2.4 Fourth Model Run

For the fourth run the variable "% crushed" was removed. The rut predictions became worse instead of better.

4.2.7.2.5 Fifth Model Run

The fifth run used a cumulative log traffic variable instead of the annual log traffic variable used in all previous models. This model predicted closer than any previous model.

4.2.7.2.6 Sixth Model Run

The sixth run was based on model five with the variable "% crushed" removed. This increased the difference between the predicted rut value and the actual measured rut value.

4.2.7.2.7 Seventh Model Run

Since expert judgement was encoded for
125 mm thickness, the next model used a condensed database that included only contracts with thickness from 125 mm to 140 mm. Average rut values were used for the expert judgement, and cumulative traffic values were used for both the expert judgement and the database. This resulted in a model that predicted rut values considerably lower than the actual rut values.

4.2.7.2.8 Conclusion

From a predictive versus actual rut measurement comparison, the best predictive model for the Thick Model was model 5, with a cumulative log traffic variable and average rut values for the expert judgement.

4.2.8 Sensitivity Analysis

The sensitivity analysis performed in this second iteration is the same as that performed in iteration 1. Therefore, only the results of iteration 2
will be explained for the two models, Thin and Thick.

4.2.8.1 Building Predictive Cases

4.2.8.1.1 Thin Overlay Model

The table and the sensitivity line plot for the thin model can be seen on the following pages (43 & 44). The sensitivity line plot shows that the variable age still appears to be the dominant variable, but not to the extent that it was in the first iteration. From the graph it can be seen that the "prior", "posterior" and "data" all agree on the slope (sign) of the variables age, % crushed and traffic. They show that as the age of the overlay and the traffic on the overlay increase, the rut depth increases, and as the variable % crushed increases, the rut depth decreases. The experts and the data do not agree on the slope of the variables % air voids and % retained. The data, at this time, are not very definitive; therefore, the posterior reflects the prior.

Table: 80% CI Prediction Table for Thin

Figure: Sensitivity Plot for Thin
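
The two readings taken from the sensitivity plot, namely which variable dominates and whether the "prior", "data" and "posterior" agree on the sign of each slope, can be illustrated with a one-at-a-time sketch in Python. It assumes the linear form of 4.2.4. All coefficients, baseline values and variable ranges below are hypothetical and were chosen only so that the output mirrors the qualitative pattern described for the thin model; they are not the report's fitted values, and the function names are illustrative.

```python
VARIABLES = ("voids", "retained", "age", "crushed", "log_traffic")


def predict_rut(coeffs, case):
    """Evaluate the linear form of 4.2.4 for one case (a dict keyed by variable name)."""
    b0, *slopes = coeffs
    return b0 + sum(b * case[name] for b, name in zip(slopes, VARIABLES))


def sensitivity_swings(coeffs, baseline, ranges):
    """Change in predicted rut as each variable moves from its low to its high
    value while every other variable stays at its baseline value."""
    swings = {}
    for name in VARIABLES:
        low, high = ranges[name]
        at_low = predict_rut(coeffs, {**baseline, name: low})
        at_high = predict_rut(coeffs, {**baseline, name: high})
        swings[name] = at_high - at_low
    return swings


def slopes_agree(*swing_sets):
    """For each variable, True when every model gives the same slope sign
    (a zero swing is counted with the non-positive slopes)."""
    return {
        name: len({s[name] > 0 for s in swing_sets}) == 1
        for name in VARIABLES
    }


# Hypothetical coefficient sets (Bo, B1..B5), for illustration only.
prior = (1.0, 0.5, 0.125, 1.0, -0.0625, 1.0)
data = (1.0, -0.25, -0.125, 0.5, -0.0625, 0.5)
posterior = (1.0, 0.25, 0.0625, 0.75, -0.0625, 0.75)

# Hypothetical baseline case and low/high range for each variable.
baseline = {"voids": 4.0, "retained": 40.0, "age": 5.0,
            "crushed": 80.0, "log_traffic": 5.0}
ranges = {"voids": (2.0, 6.0), "retained": (30.0, 50.0), "age": (1.0, 9.0),
          "crushed": (60.0, 100.0), "log_traffic": (4.0, 6.0)}

prior_swings = sensitivity_swings(prior, baseline, ranges)
data_swings = sensitivity_swings(data, baseline, ranges)
post_swings = sensitivity_swings(posterior, baseline, ranges)

# The dominant variable is the one with the largest absolute swing.
dominant = max(post_swings, key=lambda name: abs(post_swings[name]))
agreement = slopes_agree(prior_swings, data_swings, post_swings)

print("dominant variable:", dominant)
print("slope agreement:", agreement)
```

With these placeholder numbers, age produces the largest swing, and the three coefficient sets disagree only on % air voids and % retained. That outcome follows from how the values were chosen and should not be read as a result.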
4.2.8.1.2 Thick Overlay Model

The sensitivity line plot of the Thick model does not indicate that any variable is dominant in the model (see table and line plot, pages 45 & 46). Each variable seems to have an equal effect on the prediction of the model. The "prior", "posterior" and "data" agree on the slope of all the variables except % retained. The variable % retained in the "data" model is not definitive at this time.

Table: 80% CI Prediction Table for Thick

Figure: Sensitivity Plot for Thick

4.2.8.2 Sensitivity of Input Assumptions

Due to time constraints the only
sensitivity analysis performed was to determine the
sensitivity of the posterior model to variations to the
Degrees of Freedom (DOF) of the "prior". It was
very important to check this particular sensitivity
because combining experts increased their degrees of freedom to a level not considered representative of their knowledge: their degrees of freedom are elevated considerably, from 42 to 426.

4.2.8.2.1 Thin Overlay Model

The plot of the "posterior" standard error against the "prior" DOF shows a slight curve in the line (see page 48), indicating that the thin model is slightly sensitive to the "prior" DOF. As long as the "prior" DOF is at least as high as that for the data, the sensitivity should not affect the results.

Figure: Sensitivity to DOF for Thin

4.2.8.2.2 Thick Overlay Model

The sensitivity plot on page 49, representing the "posterior" standard error against the "prior" DOF for the thick model in this second iteration, is horizontal. The "prior" degrees of freedom are not a dominant factor in this model.

Figure: Sensitivity to DOF for Thick

4.2.9 Inference from Analysis of Iteration 2

4.2.9.1 Thin Model

When the evaluation table for the thin
model using the sixth model run was reviewed, the statistical results looked good (see page 53). The model had a low intercept value and a reasonably low standard error, and all the variables were found to be statistically significant. However, the model was still predicting rut values approximately 1.5 times greater than the actual rut values.

After reviewing the model runs of this second iteration for the Thin Model, it was decided to try a model run using the same experts as model four (Crandall, Doucet, Legere and MacFarlane) to verify that the method of averaging rut values was valid. In this run, instead of using the two trials mentioned in model four, another trial was run in which the expert judgements were combined to get an average rut value of the four experts (model six followed this method with six experts) for each cell location of the 48 cell matrix. When this model was run through the Bayesian analysis, it was found to predict closest to the actual measured rut values. A comparison of predicted versus actual measured rut values for all three methods of entering expert judgement can be viewed on the following page (page 52). The first trial combined the experts into one database, the second trial combined the experts using the NPrior statistical approach within XLBayes explained in the third model run, and the third trial combined expert judgement to get an average rut value of the four experts for each cell location of the 48 cell matrix.

The proper method to follow when
combining experts is not an established science. It is
not known what the "philosophical" implications
are in terms of Bayes Theorem, according to the Workshop
Summary of the Canadian Bayesian Applications, dated May,
1995. The knowledge of how or when to combine experts is
certain to be studied in the future. However, for
purposes of this report, combining experts using average
expert judgement brings the predicted rut measurement
closer to the actual measured rut value and therefore
deserves consideration.

4.2.9.2 Thick Model

The evaluation table for the second iteration of the thick model is shown on page 54. The model used was model 5, with a cumulative log traffic variable and average rut values for the expert judgement. All the variables were statistically significant as well. However, a consistent difference between the predicted rut measurement and the actual measured rut value could not be determined, as it varied throughout the database.

ANALYSIS AND INTERPRETATION OF RESULTS

THIN MODEL

The following evaluation table shows the results from iteration two for the thin model using combined expert judgement.

MODEL of: Thin O/L

The thin model has a low intercept value (Bo = 0.126523), a reasonably low standard error (Se = 2.65572) and strong T values. However, the model is still predicting rut values approximately 1.5 times greater than the actual rut values.

THICK MODEL

The evaluation table shows the results of the second iteration of the thick model using combined expert judgement. The Thick model has a fair intercept (Bo = 3.316905), a good standard error (Se = 2.233469) and strong T values. |