The goodnessOfFit operator requires a distribution function, a relative frequency distribution, or a sample vector as its left operand and a relational function as its right operand.
Most of the time the right operand will be the relational function =, although < or > may be used in certain cases. The left argument is an optional parameter list which applies only when the left operand is a distribution function. The right argument is a sample vector. There are several types of goodnessOfFit tests. These include:
The following flow chart shows how TamStat determines which type of test to perform:
The syntax of the goodnessOfFit operator is:
[ConfLevel] report [Parameters] distributionFunction|relativeFrequency|SampleVector goodnessOfFit relationalFunction SampleVector
Some examples follow:
To test whether a sample is from a particular distribution, we perform a goodness-of-fit test. Let’s open a package of regular M&M’s and count the number of each color. Suppose there are 15 brown M&M’s, 13 yellow, 12 red, 32 blue, 20 orange, and 16 green. First we create a list of colors:
COLORS ← 'Brown,Yellow,Red,Blue,Orange,Green'
Then we create a list of the corresponding counts:
FREQ ← 15 13 12 32 20 16
We will test whether the sample came from a uniform distribution; that is, whether the manufacturer produces the same number of each color:
report uniform goodnessOfFit = COLORS FREQ
We reject the null hypothesis since the p-Value is less than 0.05 and the test statistic is greater than the critical value. It is evident that the colors are not uniformly distributed.
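The decision above can be cross-checked by hand outside TamStat. The following Python sketch assumes that the uniform test is Pearson's chi-square test with k − 1 degrees of freedom; the report itself is not reproduced here, so that is an assumption, and the function names are illustrative rather than part of TamStat.

```python
import math

def pearson_chi_square(observed, proportions):
    """Pearson statistic: sum of (O - E)^2 / E, with E = n * p."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df."""
    y = x / 2.0
    if df % 2 == 1:
        # Odd df: start from Q(1/2, y) = erfc(sqrt(y)).
        q, a = math.erfc(math.sqrt(y)), 0.5
    else:
        # Even df: start from Q(0, y) = 0.
        q, a = 0.0, 0.0
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

freq = [15, 13, 12, 32, 20, 16]          # counts from the example above
k = len(freq)
stat = pearson_chi_square(freq, [1.0 / k] * k)   # equal proportions (assumed H0)
p_value = chi_square_sf(stat, k - 1)
print(round(stat, 4), p_value < 0.05)
```

With 108 candies and 18 expected per color, the statistic is 274/18, or about 15.22, which exceeds the 5% critical value of about 11.07 for 5 degrees of freedom. That agrees with the rejection stated above.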
M&M/Mars used to publish the proportions of each color on the internet. They were 13% brown, 14% yellow, 13% red, 24% blue, 20% orange, and 16% green. To test whether the M&M’s are still distributed this way, we perform a multinomial goodness-of-fit test. First we set the proportions, making sure that they total 1:
PROP ← 0.13 0.14 0.13 0.24 0.2 0.16
sum PROP
1
Then we run the test and display the report showing a much better fit:
report COLORS PROP multinomial goodnessOfFit = COLORS FREQ
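To see why the published proportions fit better, it helps to look at the expected counts they imply. This Python sketch again assumes a Pearson chi-square statistic (an assumption about the test, not TamStat's actual code) and uses the standard 5% critical value for 5 degrees of freedom, about 11.07; the variable names are illustrative.

```python
freq = [15, 13, 12, 32, 20, 16]                  # observed counts
prop = [0.13, 0.14, 0.13, 0.24, 0.20, 0.16]      # published proportions

n = sum(freq)
expected = [n * p for p in prop]                 # expected count per color

# Per-color contribution (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(freq, expected)]
stat = sum(contributions)

CRITICAL_5DF_05 = 11.0705                        # chi-square 0.95 quantile, df = 5
print([round(e, 2) for e in expected])
print(round(stat, 3), stat < CRITICAL_5DF_05)
```

Blue contributes most of the statistic (32 observed against about 25.9 expected), but the total of roughly 2.3 is far below the critical value, so under these assumptions the null hypothesis would not be rejected.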
Five children were selected from a class at random and timed in a short race. The times were 6.3, 4.2, 4.7, 6, and 5.7 seconds. The previous race times were uniformly distributed between 4 and 8 seconds. Test whether the race time distribution has improved.
report 4 8 rectangular goodnessOfFit < 6.3 4.2 4.7 6 5.7
────────────────────────────────────────
kolmogorov Test
i Value S(x) F(x) T+ T-
1 4.2 0.2 0.05 0.05 0.15 *
2 4.7 0.4 0.175 -0.025 0.225
3 5.7 0.6 0.425 0.025 0.175
4 6 0.8 0.5 -0.1 0.3
5 6.3 1 0.575 -0.225 0.425 *
Mean: 5.38 Sample Size: N = 5
H₀:F(x)≥F*(x) H₁:F(x)<F*(x)
┌────────────────┬───────────────────┐
│Test Statistic: │P-Value: │
│T=0.425 │p=0.12367 │
├────────────────┼───────────────────┤
│Critical Value: │Significance Level:│
│T(α)=0.509 │α=0.05 │
└────────────────┴───────────────────┘
Conclusion: Fail to reject H₀
────────────────────────────────────────
The report above shows no significant improvement in race times: the test fails to reject the hypothesis that the times are still uniformly distributed between 4 and 8 seconds. Note that TamStat uses the name “rectangular” for the continuous analog of the discrete uniform distribution. Also note that the largest positive and largest negative differences are flagged with an asterisk in the report.
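The columns of the report can be reproduced with a few lines of Python. This is a minimal sketch that assumes S(x) = i/N for the i-th sorted value, T+ = F(x) − (i − 1)/N, and T− = S(x) − F(x), which is what the printed numbers are consistent with; it also assumes there are no tied values. The function names are illustrative and are not TamStat functions.

```python
def ks_columns(sample, cdf):
    """Return rows (x, S, F, T+, T-) for a one-sample Kolmogorov table (no ties)."""
    xs = sorted(sample)
    n = len(xs)
    rows = []
    for i, x in enumerate(xs, start=1):
        s = i / n                    # empirical CDF at x
        f = cdf(x)                   # hypothesized CDF at x
        t_plus = f - (i - 1) / n     # F(x) minus empirical CDF just below x
        t_minus = s - f              # empirical CDF minus F(x)
        rows.append((x, s, f, t_plus, t_minus))
    return rows

def rectangular_cdf(x, a=4.0, b=8.0):
    """CDF of the continuous uniform (rectangular) distribution on [a, b]."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

times = [6.3, 4.2, 4.7, 6, 5.7]
rows = ks_columns(times, rectangular_cdf)
t_stat = max(r[4] for r in rows)     # one-sided statistic: largest S(x) - F(x)
for r in rows:
    print([round(v, 3) for v in r])
print(round(t_stat, 3))
```

The largest value of S(x) − F(x) occurs at 6.3 seconds, matching the test statistic T = 0.425 in the report.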
Automobile emissions from a previous year have been measured and were normally distributed with a mean of 5.6 and a standard deviation of 1.2. Twelve cars were randomly selected, and the following emissions measurements taken:
4.8, 6.2, 6.0, 5.9, 6.6, 5.5, 5.8, 5.9, 6.3, 6.6, 6.2, 5.0
Do the current emissions have the same distribution as the previous year?
X ← 4.8 6.2 6 5.9 6.6 5.5 5.8 5.9 6.3 6.6 6.2 5
report 5.6 1.2 normal goodnessOfFit = X
──────────────────────────────────────────────────────
kolmogorov Test
i Value S(x) F(x) T+ T-
1 4.8 0.083333 0.25249 0.25249 -0.16916
2 5 0.16667 0.30854 0.2252 -0.14187
3 5.5 0.25 0.46679 0.30013 -0.21679
4 5.8 0.33333 0.56618 0.31618 -0.23285 *
5 5.9 0.5 0.59871 0.18204 -0.098706
6 5.9 0.5 0.59871 0.18204 -0.098706
7 6 0.58333 0.63056 0.13056 -0.047225
8 6.2 0.75 0.69146 0.024796 0.058537
9 6.2 0.75 0.69146 0.024796 0.058537
10 6.3 0.83333 0.72017 -0.029834 0.11317
11 6.6 1 0.79767 -0.11899 0.20233 *
12 6.6 1 0.79767 -0.11899 0.20233
Mean: 5.9 Sample Size: N = 12
H₀:F(x)=F*(x) H₁:F(x)≠F*(x)
┌────────────────┬───────────────────┐
│Test Statistic: │P-Value: │
│T=0.3161839595 │p=0.14499 │
├────────────────┼───────────────────┤
│Critical Value: │Significance Level:│
│T(α)=0.375 │α=0.05 │
└────────────────┴───────────────────┘
Conclusion: Fail to reject H₀
──────────────────────────────────────────────────────
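The two-sided statistic in this report can be checked independently. The Python sketch below computes the usual Kolmogorov distance, the largest gap between the empirical CDF and the hypothesized normal CDF, grouping tied values so that each distinct value is examined once. This tie handling is an assumption and gives different intermediate values from the T+ column on tied rows, but the maximum is the same. The helper names are illustrative.

```python
import math

def normal_cdf(x, mean, sd):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_two_sided(sample, cdf):
    """Largest gap between the empirical CDF and cdf, checked at each distinct value."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    below = 0                                  # observations strictly below x
    for x in sorted(set(xs)):
        at_or_below = below + xs.count(x)      # observations <= x
        f = cdf(x)
        # Gap just after the jump at x, and gap just before it.
        d = max(d, at_or_below / n - f, f - below / n)
        below = at_or_below
    return d

x = [4.8, 6.2, 6, 5.9, 6.6, 5.5, 5.8, 5.9, 6.3, 6.6, 6.2, 5]
d = ks_two_sided(x, lambda v: normal_cdf(v, 5.6, 1.2))
print(round(d, 6))
```

The maximum is reached at 5.8, where F(x) exceeds the empirical CDF just below that value by about 0.316, in agreement with the reported test statistic.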
When the exact distribution is unknown, one can test whether the data come from a family of distributions. For example, if the data appear bell-shaped, we can test whether the sample comes from a normal distribution. The student survey contains the weights of students. Let us test whether the student weights are normally distributed:
report normal goodnessOfFit = Weight
─────────────────────────────────────────────────────────────────────────────
Lillefors Test
i Xi Zi S(Zi) F(Zi) T+ T-
1 100 -1.6545 0.026316 0.049017 0.049017 -0.022702
2 105 -1.5359 0.052632 0.062281 0.035965 -0.0096497
3 115 -1.2988 0.078947 0.097008 0.044376 -0.01806
4 115 -1.2988 0.10526 0.097008 0.01806 0.0082556
5 120 -1.1802 0.13158 0.11895 0.013689 0.012626
...............................................................
8 139.5 -0.71788 0.21053 0.23642 0.052206 -0.02589 *
...............................................................
22 165 -0.11325 0.57895 0.45492 -0.097716 0.12403 *
...............................................................
34 220 1.1908 0.89474 0.88314 0.014722 0.011594
35 225 1.3094 0.92105 0.9048 0.010064 0.016252
36 245 1.7836 0.94737 0.96276 0.041705 -0.015389
37 260 2.1393 0.97368 0.98379 0.036425 -0.010109
38 280 2.6135 1 0.99552 0.021835 0.0044811
Mean: 169.7763158 Standard Deviation: 42.17478069 Sample Size: N = 38
H₀:normal H₁:not normal
┌────────────────┬───────────────────┐
│Test Statistic: │P-Value: │
│T=0.1240315761 │p=0.16469 │
├────────────────┼───────────────────┤
│Critical Value: │Significance Level:│
│T(α)=0.156 │α=0.05 │
└────────────────┴───────────────────┘
Conclusion: Fail to reject H₀
────────────────────────────────────────────────────────────────────────────
Note that when the number of observations is large, the report displays only the first and last five observations, along with the observations containing the largest and smallest differences.
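The Zi and F(Zi) columns come from standardizing each observation with the sample mean and standard deviation and then evaluating the standard normal CDF. The short Python check below reproduces the first row of the report from the printed mean and standard deviation. It is a sketch of that single step under those assumptions, not of the full test, because the Weight data are not listed in full here.

```python
import math

def standard_normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Mean, standard deviation, and sample size as printed in the report above.
mean, sd, n = 169.7763158, 42.17478069, 38

x1 = 100                         # first (smallest) observation in the report
z1 = (x1 - mean) / sd            # standardized value Zi
f1 = standard_normal_cdf(z1)     # F(Zi) under the fitted normal
s1 = 1 / n                       # S(Zi) for the first sorted observation
print(round(z1, 4), round(s1, 6), round(f1, 6), round(s1 - f1, 6))
```

Because the mean and standard deviation are estimated from the same sample, the critical values differ from those of the Kolmogorov test with a fully specified distribution, which is why the report labels this a separate test.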
A random sample of 9 packages from a delivery service is taken and each parcel is weighed. A random sample of 15 packages from another delivery service is taken and those packages are also weighed. Are the distributions of weights for each delivery service the same?
⍝ Weights for delivery service X
X ← 7.6 8.4 8.6 8.7 9.3 9.9 10.1 10.6 11.2
⍝ Weights for delivery service Y
Y ← 5.2 5.7 5.9 6.5 6.8 8.2 9.1 9.8 10.8 11.3 11.5 12.3 12.5 13.4 14.6
report X goodnessOfFit = Y
────────────────────────────────────────
smirnov Test
i X Y S1 S2 S1-S2
1 0 5.2 0 0.066667 0.066667
2 0 5.7 0 0.13333 0.13333
3 0 5.9 0 0.2 0.2
4 0 6.5 0 0.26667 0.26667
5 0 6.8 0 0.33333 0.33333
......................................
18 11.2 0 1 0.6 0.4 *
......................................
20 0 11.5 1 0.73333 0.26667
......................................
21 0 12.3 1 0.8 0.2
22 0 12.5 1 0.86667 0.13333
23 0 13.4 1 0.93333 0.066667
24 0 14.6 1 1 0 *
24 0 14.6 1 1 0 *
Sample Size: N = 9 M = 15
H₀:F(x)=G(x) H₁:F(x)≠G(x)
┌────────────────┬───────────────────┐
│Test Statistic: │P-Value: │
│T=0.4 │p=0.33060 │
├────────────────┼───────────────────┤
│Critical Value: │Significance Level:│
│T(α)=0.573 │α=0.05 │
└────────────────┴───────────────────┘
Conclusion: Fail to reject H₀
────────────────────────────────────────
The test fails to reject H₀, so there is no significant evidence that the distributions of weights for the two delivery services differ.
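The two-sample statistic is the largest absolute difference between the two empirical CDFs, evaluated over the pooled observations. The Python sketch below reproduces it for the data above; the function name is illustrative and is not a TamStat function, and the critical value and p-value are not computed.

```python
def smirnov_statistic(x, y):
    """Largest |S1(v) - S2(v)| over all pooled observation values v."""
    n, m = len(x), len(y)
    best = 0.0
    for v in sorted(set(x) | set(y)):
        s1 = sum(1 for a in x if a <= v) / n   # empirical CDF of x at v
        s2 = sum(1 for b in y if b <= v) / m   # empirical CDF of y at v
        best = max(best, abs(s1 - s2))
    return best

x = [7.6, 8.4, 8.6, 8.7, 9.3, 9.9, 10.1, 10.6, 11.2]
y = [5.2, 5.7, 5.9, 6.5, 6.8, 8.2, 9.1, 9.8, 10.8, 11.3, 11.5, 12.3, 12.5, 13.4, 14.6]
t = smirnov_statistic(x, y)
print(len(x), len(y), round(t, 4))
```

The maximum occurs at 11.2, where all 9 of the X weights but only 9 of the 15 Y weights have been reached, giving 1 − 0.6 = 0.4 as in the report.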