Weibull curve fitting
A Weibull distribution is fitted to the data (wind speed or wave height) above a given threshold. The fitted distribution is used to extrapolate beyond the largest observed value and to calculate the probability that a given extreme value is exceeded.
The Weibull distribution
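According to a conditional Weibull distribution, the probability P(u>x) that a value u exceeds x, for x ≥ w, is given by:
$$P(u>x) = P(u>w)\,\exp\left[-\left(\frac{x}{d}\right)^{a} + \left(\frac{w}{d}\right)^{a}\right]$$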
where w is a (given) threshold, and a and d are the Weibull shape and scale parameters, respectively. Both Weibull parameters a and d, as well as the probability P(u>w) that the value u exceeds the threshold w, are estimated from the data, as follows.
The probability P(u>w) is simply the fraction of the observed values u_i, i = 1, ..., N, that exceed the (given) threshold value w. The parameters a and d are chosen such that the likelihood of the observed values with u > w, given these parameters, is maximal. This amounts to minimizing the cost function J, defined as
$$J(a, d) = -\sum_{i=1}^{M} \ln f(u_i)$$
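with respect to a and d. In the above equation the summation is over all observations u_i, i = 1, ..., M, exceeding the threshold w, and f(u) is the probability density function of the values above the threshold:
$$f(u) = \frac{a}{d}\left(\frac{u}{d}\right)^{a-1}\exp\left[-\left(\frac{u}{d}\right)^{a} + \left(\frac{w}{d}\right)^{a}\right], \quad u > w$$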
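The estimation procedure can be sketched in Python. This is a minimal illustration, not the original implementation. It assumes positive data values and the conditional density f(u) = (a/d) (u/d)^(a-1) exp[-(u/d)^a + (w/d)^a] for u > w. Under that assumption, the scale that minimizes J for a fixed shape a has the closed form d = [mean(u_i^a - w^a)]^(1/a), so only a needs a numerical search. The golden-section search used here presumes a single minimum between the hypothetical bounds a_lo and a_hi. All function names are illustrative.

```python
import math


def scale_for_shape(peaks, w, a):
    """Scale d that minimizes J for a fixed shape a (closed form, see lead-in)."""
    return (sum(u ** a - w ** a for u in peaks) / len(peaks)) ** (1.0 / a)


def cost(peaks, w, a, d):
    """J = -sum ln f(u_i) over the observations above the threshold."""
    return -sum(
        math.log(a / d) + (a - 1) * math.log(u / d) - (u / d) ** a + (w / d) ** a
        for u in peaks
    )


def fit_weibull(values, w, a_lo=0.2, a_hi=5.0, tol=1e-6):
    """Return (a, d, p_w) estimated from all observations and threshold w."""
    peaks = [u for u in values if u > w]
    if not peaks:
        raise ValueError("no observations above the threshold")
    p_w = len(peaks) / len(values)  # fraction of observations exceeding w

    def profile(a):
        # J as a function of a alone, with d set to its best value for that a
        return cost(peaks, w, a, scale_for_shape(peaks, w, a))

    # Golden-section search for the shape (assumes one minimum in the bracket)
    g = (math.sqrt(5) - 1) / 2
    lo, hi = a_lo, a_hi
    while hi - lo > tol:
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if profile(x1) < profile(x2):
            hi = x2
        else:
            lo = x1
    a = 0.5 * (lo + hi)
    return a, scale_for_shape(peaks, w, a), p_w


def exceedance_probability(x, w, a, d, p_w):
    """P(u > x) for x >= w under the fitted conditional Weibull distribution."""
    return p_w * math.exp(-(x / d) ** a + (w / d) ** a)
```

With the fitted triple (a, d, p_w), a call such as exceedance_probability(x, w, a, d, p_w) extrapolates to values of x beyond the largest observation. A general-purpose optimizer over both a and d would serve equally well; the profile approach is used here only to keep the sketch free of external dependencies.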
The Weibull shape parameter
For waves in deep water, the distribution of significant wave heights is approximately exponential, which corresponds to a Weibull shape parameter of one. For wind speed the shape parameter is about two. The fitted value of the shape parameter therefore gives important information on the quality of the fit. Deviations from these typical values can have several causes:
- Threshold too high or too low
- Not enough (independent) measurements
- A climate subject to rare, extreme events (hurricanes, typhoons)
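The link between a shape parameter of one and an exponential distribution can be checked by substitution. Assuming the conditional Weibull form P(u>x) = P(u>w) exp[-(x/d)^a + (w/d)^a], setting a = 1 gives

$$P(u>x) = P(u>w)\,\exp\left[-\frac{x-w}{d}\right], \quad x \ge w$$

so the excess above the threshold is exponentially distributed with mean d. For a = 2, the value typical of wind speed, the exponent becomes -(x^2 - w^2)/d^2, which is the tail of a Rayleigh distribution.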
When too few data are available, it may be helpful to fix the Weibull shape parameter at a reasonable, predetermined value. This removes one degree of freedom from the fit, so that only the scale parameter remains to be estimated from the data.
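A fit with the shape parameter held fixed could look like the sketch below. It assumes the conditional density f(u) = (a/d) (u/d)^(a-1) exp[-(u/d)^a + (w/d)^a] for u > w; setting the derivative of the resulting negative log-likelihood with respect to d to zero then gives d = [mean(u_i^a - w^a)]^(1/a), so no numerical search is needed. The function name is hypothetical.

```python
def fit_scale_fixed_shape(values, w, a):
    """Estimate the scale d with the shape a held at a predetermined value.

    Assumes the conditional (left-truncated) Weibull density named in the
    lead-in; only observations above the threshold w enter the estimate.
    """
    peaks = [u for u in values if u > w]
    if not peaks:
        raise ValueError("no observations above the threshold")
    return (sum(u ** a - w ** a for u in peaks) / len(peaks)) ** (1.0 / a)


# Shape fixed at one (exponential tail): d is the mean excess over w
d = fit_scale_fixed_shape([1.0, 2.0, 3.0, 4.0, 5.0], w=2.0, a=1.0)
print(d)  # 2.0
```

For wave heights a natural fixed value would be a = 1, and for wind speed a = 2, in line with the typical values mentioned above.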
The bootstrap method
The reliability of the exceedance values is estimated by recomputing them for different samples taken from the available data values. The 90% confidence interval is obtained from the spread of these results. If the method fails to give reasonable results, the confidence interval is left out.
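A percentile bootstrap along these lines is sketched below. Several details are assumptions rather than the documented procedure: samples are drawn with replacement and have the same size as the data, the interval is read from the 5th and 95th percentiles of the resampled estimates, and the interval is dropped (None) when more than half of the resampled fits fail. The example estimator fixes the shape at one purely to keep the sketch short; all names are hypothetical.

```python
import math
import random


def exponential_tail_exceedance(values, w, x):
    """Example estimator: P(u > x) with the shape fixed at a = 1 (assumption)."""
    peaks = [u for u in values if u > w]
    if not peaks:
        raise ValueError("no observations above the threshold")
    p_w = len(peaks) / len(values)
    d = sum(u - w for u in peaks) / len(peaks)  # mean excess over w
    return p_w * math.exp(-(x - w) / d)


def bootstrap_interval(values, estimator, n_boot=1000, level=0.90, seed=0):
    """Percentile bootstrap interval for estimator(values), or None on failure."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        # Draw n values with replacement from the original data
        resample = [values[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(estimator(resample))
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # skip resamples for which the fit fails
    if len(estimates) < n_boot // 2:
        return None  # too many failures: leave the interval out
    estimates.sort()
    tail = (1.0 - level) / 2.0
    last = len(estimates) - 1
    return estimates[int(tail * last)], estimates[int((1.0 - tail) * last)]
```

A call such as bootstrap_interval(data, lambda v: exponential_tail_exceedance(v, w, x)) returns a (lower, upper) pair for the exceedance probability at x. The same wrapper works with any estimator that maps a sample to a single number, including a full two-parameter fit.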