
Prediction of Rutting in Alternative Asphalt Concrete Overlay Methods

JOINT C-SHRP/NEW BRUNSWICK BAYESIAN APPLICATION

4.2 ITERATION 2

4.2.1 Definition of Model

Two specific models to predict rutting are to be developed to address both thick and thin asphalt concrete overlays on asphalt concrete pavements.

4.2.2 Dependent Variable

As in the first iteration, the dependent variable (Yi) is the depth of rutting, measured in millimeters as per the SHRP-P-333 specification (1.2 m straight edge).

4.2.3 Independent Variables

The independent variables selected for the two models were based on variables included in the C-LTPP rutting model regression equation. All of these variables were for the top lift of a virgin AC overlay. The variables selected are: % Air Voids; % Retained on 4.75 mm sieve; % Crushed particles; Age in years; and Traffic Log10 annual lane.

4.2.4 Model Type and Functional Form

The model is empirical, with a curvilinear functional form introduced through the Log Traffic term:

Rut = Bo + B1 (Voids) + B2 (Retained) + B3 (Age) + B4 (Crushed) + B5 (Log Traffic)
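As a minimal sketch of the functional form above, the following Python function evaluates the equation for a single pavement section. The function name, parameter names and coefficient values are assumptions made for illustration; the coefficients are hypothetical placeholders, not the fitted values from this report.

```python
import math

def predict_rut(coeffs, voids, retained, age, crushed, annual_traffic):
    """Evaluate Rut = Bo + B1*Voids + B2*Retained + B3*Age + B4*Crushed + B5*log10(Traffic)."""
    b0, b1, b2, b3, b4, b5 = coeffs
    return (b0 + b1 * voids + b2 * retained + b3 * age
            + b4 * crushed + b5 * math.log10(annual_traffic))

# Hypothetical coefficients (Bo..B5), for illustration only.
example_coeffs = (0.1, 0.2, 0.01, 0.5, -0.02, 0.8)

rut_mm = predict_rut(example_coeffs, voids=4.0, retained=45.0, age=5.0,
                     crushed=80.0, annual_traffic=100_000)
```

Only the traffic term is transformed, which is why the form is curvilinear in traffic while remaining linear in the coefficients.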

4.2.5 Model Inputs

4.2.5.1 Actual Data - "Data"

No new data were introduced to either database for the second iteration. The consultant suggested removing all zero distress data from the database. Two of the assumptions of regression are, first, that the errors (error = predicted - actual) have a mean of zero and, second, that the errors are normally distributed. Zero data points would have no error and therefore would not be normally distributed. Thus zero distress data do not fit the assumptions of regression and cannot be used.

As well as removing zero distress data, the variable thickness was removed from both the thick and thin databases. Only those contracts with complete sets of independent variables and corresponding dependent variables were included.
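The screening steps described above can be sketched as a simple filter. This is an illustrative sketch only: the field names and the sample records are hypothetical, and the actual database layout is not given in the report.

```python
# Hypothetical field names for the five independent variables and the dependent variable.
REQUIRED = ("voids", "retained", "crushed", "age", "log_traffic", "rut_mm")

def clean_records(records):
    """Drop the thickness variable, incomplete rows, and zero distress rows."""
    cleaned = []
    for rec in records:
        # Remove the thickness variable from every record.
        row = {key: value for key, value in rec.items() if key != "thickness"}
        if any(row.get(key) is None for key in REQUIRED):
            continue  # incomplete set of independent/dependent variables
        if row["rut_mm"] == 0:
            continue  # zero distress data are excluded
        cleaned.append(row)
    return cleaned

# Made-up sample records, for illustration only.
sample = [
    {"voids": 4.1, "retained": 45, "crushed": 80, "age": 5,
     "log_traffic": 5.2, "thickness": 40, "rut_mm": 3.0},
    {"voids": 3.8, "retained": 50, "crushed": 75, "age": 2,
     "log_traffic": 5.0, "thickness": 40, "rut_mm": 0},
    {"voids": None, "retained": 48, "crushed": 90, "age": 7,
     "log_traffic": 5.4, "thickness": 50, "rut_mm": 6.0},
]
cleaned = clean_records(sample)
```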

4.2.5.2 Encoding Expert Judgement - "Prior"

The same expert judgement encoded from the nine experts used in iteration one was used in iteration two. The variable thickness was removed from each matrix and classical regressions were redone for both models. For the thick model the expert judgement from the thick overlay with milling and the thick overlay with padding matrices had to be combined before a classical regression could be done.

4.2.6 Analysis of Data

The process of selecting the best combination of experts to give the best model was a trial and error approach. This trial and error approach was first carried out on the thin model. Due to time constraints, the same experts who were found to give the best predictions for the thin model were then used as the best combination of experts for the thick model.

As suggested by the consultant at the workshop, all of the zero distress data was removed.

4.2.7 Model Runs

4.2.7.1 Thin Model

A summary sheet comparing model predictions to actual measured rut depth follows the model runs on page 38.

Comparison Summary Sheet

4.2.7.1.1 First Model Run

For the second iteration of the thin model, the first step in selecting the best combination of experts was to include all of the expert judgement for the prior in one database. This prior was run through the XLBayes program with the thin database, and the resulting posterior model predicted twice the measured rut value (see Appendix F - Iteration 2 - Run 1 for results).

4.2.7.1.2 Second Model Run

To help select the best combination of experts, a line plot comparing each expert's prediction was constructed (page 35). This plot showed that expert Flemming was predicting very high rut values and that expert Nicholson was predicting very low rut values. In additional discussions, each of the two experts identified areas of uncertainty and was not comfortable with his estimates of rut values. Therefore the second run on the thin model had Flemming's and Nicholson's expert judgement removed. This second run brought the predicted values closer to the measured rut values (see Appendix F - Iteration 2 - Run 2 for results).

Line Plot Comparison of Experts

4.2.7.1.3 Third Model Run

Another line plot of predicted rut values was constructed, which showed two groupings: the predictions of experts Hughes and Robertson were grouped below those of the other experts. Therefore, for this run the expert judgement from Hughes and Robertson was removed, leaving experts MacFarlane, Legere, Leblanc, Doucet and Crandall. For this third run the experts were combined using an N-Prior approach, where each expert's judgement is statistically combined in series until one database results, as opposed to compiling all of the expert judgement into a database at once. The resultant model pushed the predicted rut values even further above the measured values than the previous two models had. See Appendix F for complete calculations.

4.2.7.1.4 Fourth Model Run

Since Ray Leblanc's predictions were higher than those of the other experts, a separate model was developed to determine if his predictions were higher because of the region of the Province in which he worked. A Bayesian model was developed from a database consisting of data from contracts from his region of the Province together with a "prior" database using only his expert judgement. The result of this model showed that his predictions were still higher than the actual measured rut values for those contracts on which he worked. It appears from the results that he anticipates that rutting is a more serious problem than it is. Therefore run four has expert Leblanc removed, leaving four experts: MacFarlane, Legere, Doucet and Crandall.

For this fourth run, two trials were made to develop a model. The first trial combined the experts into one database, in one step, while the second trial combined the experts using the N-Prior statistical approach as described previously. There was a negligible difference between the two methods of combining the expert judgement. With Leblanc removed, this model predicted rut values closer to the actual measured rut values than the third model run did.
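One possible reading of why the two trials above agree is that, in an ordinary least-squares setting, a fit depends on the data only through running sums, so adding experts one at a time gives the same sums as pooling them in one step. The sketch below illustrates this with a straight-line fit. It is not the XLBayes N-Prior algorithm, whose internals are not described here; the expert labels and (age, rut) values are hypothetical.

```python
def suff_stats(rows):
    """Sufficient statistics (n, Sx, Sy, Sxx, Sxy) for a straight-line fit."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return (n, sx, sy, sxx, sxy)

def add_stats(a, b):
    """Combine two sets of sufficient statistics by element-wise addition."""
    return tuple(p + q for p, q in zip(a, b))

def fit_line(stats):
    """Least-squares intercept and slope from the sufficient statistics."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Hypothetical (age in years, rut in mm) judgements from three experts.
experts = {
    "A": [(2, 3.0), (5, 6.0), (10, 11.0)],
    "B": [(2, 2.0), (5, 5.5), (10, 9.0)],
    "C": [(2, 4.0), (5, 7.0), (10, 12.5)],
}

# Trial 1: all experts compiled into one database in one step.
pooled_rows = [row for rows in experts.values() for row in rows]
one_step = fit_line(suff_stats(pooled_rows))

# Trial 2: experts combined in series, one at a time.
running = suff_stats(experts["A"])
for name in ("B", "C"):
    running = add_stats(running, suff_stats(experts[name]))
in_series = fit_line(running)
```

Under these assumptions `one_step` and `in_series` agree up to floating-point rounding. Any weighting applied between steps, such as holding each expert's degrees of freedom fixed, would break this equivalence.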

4.2.7.1.5 Fifth Model Run

The fifth run included the experts from run 2 except LeBlanc, leaving experts Robertson, MacFarlane, Legere, Hughes, Doucet and Crandall. The resulting model predicted closer to the actual measured rut values than any previous model.

When the experts were combined, an attempt was made to hold the Degrees of Freedom (DOF) constant for each expert so that each expert had equal weighting in the resultant model. However, the difference between this model and the resultant model where all experts were combined in one operation was negligible. Therefore, inputting the prior expert judgement data as one database in one step works just as well.

4.2.7.1.6 Sixth Model Run

The sixth model run used the same experts as model five: Robertson, MacFarlane, Legere, Hughes, Doucet and Crandall. This model combined the expert judgement to get an average rut value of the 6 experts for each cell location of a 48 cell matrix. This model predicted closer to the actual measured rut values than any of the previous models.
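The cell-by-cell averaging described above can be sketched as follows. The function name and the numbers are hypothetical; only four cells and three experts are shown for brevity, where the report uses 48 cells and six experts.

```python
def average_cells(expert_matrices):
    """Average the rut value across experts for each cell of the judgement matrix."""
    n_experts = len(expert_matrices)
    n_cells = len(expert_matrices[0])
    if any(len(matrix) != n_cells for matrix in expert_matrices):
        raise ValueError("every expert must supply the same number of cells")
    return [sum(matrix[i] for matrix in expert_matrices) / n_experts
            for i in range(n_cells)]

# Hypothetical rut estimates (mm): one list per expert, one entry per cell.
matrices = [
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 5.0, 7.0, 9.0],
    [4.0, 6.0, 8.0, 10.0],
]
averaged = average_cells(matrices)
```

Note that the averaged prior contains one row per cell rather than one row per expert per cell, so it carries fewer observations into the regression than a database in which every expert's matrix is stacked.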

4.2.7.1.7 Seventh Model Run

This seventh model run was attempted to see if the predicted rut value could come closer to the actual measured rut value. This model was the same as model six but had the variable "% crushed" removed because it appeared to be acting like a constant. Removing "% crushed" worsened the predictions. Therefore, from a predictive versus actual rut measurement comparison, the model resulting from the sixth run proved to be the best model.
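A comparison of model runs against measured rut depth can be sketched as below. The report does not state which summary statistic was used to rank the runs, so the mean predicted/actual ratio and the mean absolute error are assumptions, and the run names and values are hypothetical.

```python
def comparison_summary(predicted, actual):
    """Return (mean predicted/actual ratio, mean absolute error in mm)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n = len(actual)
    # Zero distress records were removed, so actual values are assumed non-zero.
    ratios = [p / a for p, a in zip(predicted, actual)]
    abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(ratios) / n, sum(abs_errors) / n

# Hypothetical measured and predicted rut depths (mm).
actual = [4.0, 5.0, 8.0]
runs = {
    "run_a": [8.0, 10.0, 16.0],
    "run_b": [6.0, 7.5, 12.0],
}
summary = {name: comparison_summary(pred, actual) for name, pred in runs.items()}

# Rank the runs by mean absolute error (smaller is closer to the measurements).
best = min(summary, key=lambda name: summary[name][1])
```

A ratio near 2.0 corresponds to a model "predicting twice the measured rut value"; a ratio near 1.0 would indicate agreement on average.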

4.2.7.2 Thick Model

A summary sheet comparing thick models to actual rut depth (mm) can be viewed on page 41.

Comparison Summary Sheet

4.2.7.2.1 First Model Run

The Thick with Milling and Thick with Padding databases were combined for the second iteration. Due to time limitations, the experts selected as the best combination in the Thin model (Nicholson, Flemming and LeBlanc removed) were also used as the best combination of experts for the Thick model. All the expert judgement was combined into one database. This first model run did not predict rut values close to the actual measured rut values.

4.2.7.2.2 Second Model Run

For the second run the expert judgement was combined in a 48 cell matrix using the average rut values (the average of the milling and padding values for each expert was then averaged across the 6 experts). This model predicted marginally better than the first run.

4.2.7.2.3 Third Model Run

Additional quality control was done on the database and the remaining outliers were removed. This model predicted rut values closer to the actual measured values than the first two models.

4.2.7.2.4 Fourth Model Run

For the fourth run the variable "% crushed" was removed. The rut predictions became worse instead of better.

4.2.7.2.5 Fifth Model Run

The fifth run used a cumulative log traffic variable instead of the annual log traffic variable used in all previous models. This model predicted closer to the actual measured rut values than any previous model.
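The report does not say how the cumulative traffic variable was derived. As one hedged illustration, if annual traffic is ASSUMED constant over the life of the overlay, cumulative traffic is annual traffic multiplied by age, and the log10 values simply add. The function name and inputs below are hypothetical.

```python
import math

def cumulative_log_traffic(log_annual_traffic, age_years):
    """log10 of cumulative traffic, ASSUMING constant annual traffic over the overlay's life.

    log10(annual * age) = log10(annual) + log10(age)
    """
    if age_years <= 0:
        raise ValueError("age must be positive")
    return log_annual_traffic + math.log10(age_years)

# Hypothetical section: log10 annual traffic of 5.0 and an overlay age of 10 years.
log_cumulative = cumulative_log_traffic(5.0, 10.0)
```

Under this assumption the cumulative variable carries information about both traffic and age, which may change how the two variables share explanatory power in the regression.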

4.2.7.2.6 Sixth Model Run

The sixth run was based on model five with the variable "% crushed" removed. This increased the difference between the predicted and actual measured rut values.

4.2.7.2.7 Seventh Model Run

Since expert judgement was encoded for a thickness of 125 mm, the next model used a condensed database that included only contracts with thicknesses from 125 mm to 140 mm. Average rut values were used for the expert judgement, and cumulative traffic values were used for both the expert judgement and the database. This resulted in a model that predicted rut values considerably lower than the actual rut values.

4.2.7.2.8 Conclusion

From a predictive versus actual rut measurement comparison the best predictive model for the Thick Model was model 5 with a cumulative log traffic variable and average rut values for the expert judgement.

4.2.8 Sensitivity Analysis

The sensitivity analysis in this second iteration follows the same procedure as in iteration 1. Therefore, only the results of iteration 2 are explained for the two models, Thin and Thick.

4.2.8.1 Building Predictive Cases

4.2.8.1.1 Thin Overlay Model

The table and the sensitivity line plot for the thin model can be seen on the following pages (43 & 44). This sensitivity line plot shows that the variable age still appears to be the dominant variable, but not to the extent that it was in the first iteration. From the graph it can be seen that the "prior", "posterior" and "data" all agree on the slope (sign) of the variables age, % crushed and traffic. They show that as the age of the overlay and the traffic on the overlay increase, the rut depth increases, and as the variable % crushed increases, the rut depth decreases. The experts and the data do not agree on the slope of the variables % air voids and % retained. At this time the data are not very definitive; therefore, the posterior reflects the prior.

Table: 80% CI Prediction Table for Thin

Figure: Sensitivity Plot for Thin
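The predictive cases behind a sensitivity plot of this kind can be sketched as a one-at-a-time calculation: each variable is moved from a low to a high value while the others are held at a baseline, and the change in predicted rut is recorded. The intercept, slopes, baseline and ranges below are hypothetical and are not the report's fitted values or the ranges used in its tables.

```python
VARIABLES = ("voids", "retained", "age", "crushed", "log_traffic")

def predict(intercept, slopes, case):
    """Linear prediction of rut depth (mm) for one case."""
    return intercept + sum(slopes[var] * case[var] for var in VARIABLES)

def one_at_a_time(intercept, slopes, baseline, ranges):
    """Change in predicted rut as each variable moves from its low to its high value."""
    effects = {}
    for var in VARIABLES:
        low, high = ranges[var]
        low_case = dict(baseline, **{var: low})
        high_case = dict(baseline, **{var: high})
        effects[var] = (predict(intercept, slopes, high_case)
                        - predict(intercept, slopes, low_case))
    return effects

# Hypothetical model and input ranges, for illustration only.
intercept = 0.1
slopes = {"voids": 0.2, "retained": 0.01, "age": 0.5,
          "crushed": -0.02, "log_traffic": 0.8}
baseline = {"voids": 5.0, "retained": 45.0, "age": 6.0,
            "crushed": 80.0, "log_traffic": 5.0}
ranges = {"voids": (3.0, 7.0), "retained": (30.0, 60.0), "age": (1.0, 11.0),
          "crushed": (60.0, 100.0), "log_traffic": (4.0, 6.0)}

effects = one_at_a_time(intercept, slopes, baseline, ranges)
dominant = max(effects, key=lambda var: abs(effects[var]))
```

The sign of each effect corresponds to the slope discussed in the text, and the variable with the largest absolute effect is the one that would appear dominant on the plot.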

 

4.2.8.1.2 Thick Overlay Model

The sensitivity line plot of the Thick model does not indicate that there are any variables that are dominant in the model (see table and line plot pages 45 & 46). Each variable seems to have equal effect on the prediction of the model. The "prior", "posterior" and the "data" agree on the slope of all the variables except for the variable % retained. The variable % retained in the "data" model is not definitive at this time.

Table: 80% CI Prediction Table for Thick

Figure: Sensitivity Plot for Thick

4.2.8.2 Sensitivity of Input Assumptions

Due to time constraints, the only sensitivity analysis performed was to determine the sensitivity of the posterior model to variations in the Degrees of Freedom (DOF) of the "prior". It was very important to check this particular sensitivity because combining experts increased their degrees of freedom to a level not considered representative of their knowledge. Their degrees of freedom are elevated considerably, from 42 to 426.
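The two DOF figures quoted above are consistent with the usual regression definition, residual DOF = number of observations minus number of estimated coefficients, assuming 48 cells per expert, six coefficients (Bo through B5) and nine experts pooled. This reading is an inference from the numbers rather than something the report states, and the function below is a hypothetical illustration.

```python
def residual_dof(n_experts, cells_per_expert=48, n_coefficients=6):
    """Residual degrees of freedom for a prior built from stacked expert matrices."""
    return n_experts * cells_per_expert - n_coefficients

single_expert_dof = residual_dof(1)  # 48 - 6
pooled_dof = residual_dof(9)         # 9 * 48 - 6
```

On this reading, stacking the matrices treats every expert's cell as an independent observation, which is why the pooled prior appears far more certain than any single expert.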

4.2.8.2.1 Thin Overlay Model

The plot of the "posterior" standard error against the "prior" DOF shows a slight curve in the line (see page 48), indicating that the thin model is slightly sensitive to the "prior" DOF. As long as the "prior" DOF is at least as high as that for the data, the sensitivity should not affect the results.

Figure: Sensitivity to DOF for Thin

4.2.8.2.2 Thick Overlay Model

The sensitivity plot on page 49, representing the sensitivity of the "posterior" standard error to the "prior" DOF for the thick model in this second iteration, is horizontal. The "prior" degrees of freedom are not a dominant factor in this model.

Figure: Sensitivity to DOF for Thick

4.2.9 Inference from Analysis of Iteration 2

4.2.9.1 Thin Model

When the evaluation table for the thin model using the sixth model run was reviewed, the statistical results looked good (see page 53). The model had a low intercept value, a reasonably low standard error, and all the variables were found to be statistically significant. However, the model was still predicting rut values approximately 1.5 times greater than the actual rut values.

After reviewing the model runs of this second iteration for the Thin Model, it was decided to try a model run using the same experts as model four (Crandall, Doucet, Legere and MacFarlane) to verify that the method of averaging rut values was valid. In this run, instead of using the two trials mentioned in model four, another trial was run in which the expert judgements were combined to get an average rut value of the four experts for each cell location of the 48 cell matrix (model six followed this method with six experts). When this model was run through the Bayesian analysis, it was found to predict closest to the actual measured rut values. The comparison of predicted versus actual measured rut values for all three methods of entering expert judgement can be viewed on the following page (page 52). The first trial combined the experts into one database, the second trial combined the experts using the N-Prior statistical approach within XLBayes explained in the third model run, and the third trial combined expert judgement to get an average rut value of the four experts for each cell location of the 48 cell matrix.

The proper method to follow when combining experts is not an established science. It is not known what the "philosophical" implications are in terms of Bayes Theorem, according to the Workshop Summary of the Canadian Bayesian Applications, dated May, 1995. The knowledge of how or when to combine experts is certain to be studied in the future. However, for purposes of this report, combining experts using average expert judgement brings the predicted rut measurement closer to the actual measured rut value and therefore deserves consideration.

4.2.9.2 Thick Model

The evaluation table for the second iteration of the thick model is shown on page 54. The model used was model 5, with a cumulative log traffic variable and average rut values for the expert judgement. All the variables were statistically significant as well. However, a consistent difference between the predicted and actual measured rut values could not be determined, as it varied throughout the database.

ANALYSIS AND INTERPRETATION OF RESULTS

THIN MODEL

The following evaluation table shows the results from iteration two for the thin model using combined expert judgement.

MODEL of : Thin O/L

Evaluation Table

The thin model has a low intercept value (Bo = 0.126523), a reasonably low standard error (Se = 2.65572) and strong T values. However, the model is still predicting rut values approximately 1.5 times greater than the actual rut values.

THICK MODEL

The evaluation table shows the results of the second iteration of the thick model using combined expert judgement.

Evaluation Table

The Thick model has a fair intercept (Bo = 3.316905), a good standard error (Se = 2.233469) and strong T values.
