Kolmogorov-Smirnov test

nonparametric test, one or two samples, weakly ordinal

most often - where you have data in the form of frequencies that fall into ordered classes

example 1 (one-sample application)

one sample case - compare sample data with some expected population distribution

sample - random sample of 100 farms located at different distances from the market place. Distances of farms from markets allocated to classes

class (distance)     0-4.9    5-9.9    10-14.9    15+
frequency (farms)    30       25       25         20
% of total land      10       20       30         40

 

% of total land = expected population distribution under a null hypothesis

frequency of farms - observed distribution/sample

H0: distance from market has no influence on farm location

(observed sample distribution is no different from the expected population distribution)

H1: distance from market and farm location are significantly related (observed sample distribution is significantly different from the expected population distribution)

significance level: α = 0.05, i.e. H0 is rejected at the 95% confidence level

compute D statistic: first convert observed and expected distributions into cumulative proportions and list the difference

class            0-4.9          5-9.9          10-14.9        15+
observed (Oi)    30/100=0.30    25/100=0.25    25/100=0.25    20/100=0.20
ΣOi              0.30           0.55           0.80           1.00
expected (Ei)    10/100=0.10    20/100=0.20    30/100=0.30    40/100=0.40
ΣEi              0.10           0.30           0.60           1.00
diff             0.20           0.25           0.20           0.00

find max difference = 0.25

D is found by inspection of the differences between the cumulative proportions

difference is absolute value
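The cumulative-proportion steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name ks_one_sample_d and the variable names are made up here, and the inputs are the farm frequencies and land percentages from this example.

```python
from itertools import accumulate


def ks_one_sample_d(observed_counts, expected_weights):
    """Return (D, list of absolute cumulative differences) for grouped data.

    observed_counts: class frequencies from the sample.
    expected_weights: expected class shares (any scale, e.g. percentages).
    """
    n_obs = sum(observed_counts)
    n_exp = sum(expected_weights)
    # Running totals divided by the grand total give cumulative proportions.
    cum_obs = [c / n_obs for c in accumulate(observed_counts)]
    cum_exp = [c / n_exp for c in accumulate(expected_weights)]
    # D is the largest absolute gap between the two cumulative curves.
    diffs = [abs(o - e) for o, e in zip(cum_obs, cum_exp)]
    return max(diffs), diffs


farms = [30, 25, 25, 20]      # observed frequency of farms per distance class
land_pct = [10, 20, 30, 40]   # expected distribution: % of total land per class

d, diffs = ks_one_sample_d(farms, land_pct)
print("D =", round(d, 2))
```

Run on the farm data, D comes out at 0.25, the same maximum difference found by inspection.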

computed values of D must exceed table values to reject H0

(pages 282 and 283 in text)

for samples over 35, the critical value at α=0.05 is 1.36/√n

so with n=100 the critical value is 1.36/√100 = 0.136

if observed and expected values are equal then D=0

critical value of 0.136 means that if the null hypothesis is true, we expect a Dmax value this large or larger in only 5% of samples

therefore D=0.136 defines lower limit of top 5% of probability distribution of D with a sample of 100

in this case, D=0.25 which is in the top 5%

Dmax > Dcritical (0.25 > 0.136): reject null hypothesis, accept research hypothesis

therefore at α=0.05, can reject H0 and conclude the observed and expected distributions are significantly different

farm location seems to be dependent on distance from the market place
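The decision rule for example 1 can be written out as a short check. This is a sketch under the assumptions already stated in the notes (large-sample rule, coefficient 1.36 for α=0.05); the function name ks_critical_large_sample is hypothetical, and for small samples the table values in the text should be used instead.

```python
import math


def ks_critical_large_sample(n, coefficient=1.36):
    """Large-sample critical D for the one-sample test: coefficient / sqrt(n).

    The default coefficient 1.36 corresponds to alpha = 0.05.
    """
    return coefficient / math.sqrt(n)


n = 100            # number of farms in the sample
d_observed = 0.25  # maximum cumulative difference from example 1

d_critical = ks_critical_large_sample(n)
reject_h0 = d_observed > d_critical

print(f"D critical = {d_critical:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With n=100 the critical value is 0.136, so the observed D of 0.25 leads to rejecting H0.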

example 2 (two-sample application)

commuting distances

                 0-0.9    1-1.9    2-4.9    5-9.9    10+    Total
Working class    10       20       10       5        0      n1=45
Middle class     5        10       20       20       5      n2=60

 

H0: (two tailed) no significant difference in commuting distances of the 2 groups

Test (2 tailed)

D = max |Σc1 - Σc2|

where Σc1 and Σc2 are the cumulative proportional distributions of the 2 samples

        0-0.9         1-1.9         2-4.9         5-9.9         10+
c1      10/45=0.22    20/45=0.44    10/45=0.22    5/45=0.11     0/45=0.00
c2      5/60=0.08     10/60=0.17    20/60=0.33    20/60=0.33    5/60=0.08
Σc1     0.22          0.67          0.89          1.00          1.00
Σc2     0.08          0.25          0.58          0.92          1.00
Diff    0.14          0.42          0.31          0.08          0.00
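The two-sample calculation follows the same pattern as the one-sample case, except that both cumulative distributions come from observed counts. A minimal sketch, with the hypothetical function name ks_two_sample_d and the commuting counts from this example:

```python
from itertools import accumulate


def ks_two_sample_d(counts_1, counts_2):
    """Return D, the largest absolute gap between two cumulative proportions.

    Both lists must use the same ordered classes.
    """
    n1, n2 = sum(counts_1), sum(counts_2)
    cum_1 = [c / n1 for c in accumulate(counts_1)]
    cum_2 = [c / n2 for c in accumulate(counts_2)]
    return max(abs(a - b) for a, b in zip(cum_1, cum_2))


working_class = [10, 20, 10, 5, 0]   # n1 = 45
middle_class = [5, 10, 20, 20, 5]    # n2 = 60

d = ks_two_sample_d(working_class, middle_class)
print("D =", round(d, 2))
```

The largest gap falls in the 1-1.9 class (30/45 against 15/60), giving D of about 0.42.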

large sample approximation, say over 100 cases: Dcritical at α=0.05 is 1.36 × √((n1+n2)/(n1×n2))

= 1.36 × √((45+60)/(45×60)) = 0.27

Dtest > Dcritical

0.42 > 0.27 therefore can reject null, suggest real differences in the commuting distances of two groups
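The critical value of 0.27 is consistent with the usual large-sample formula for two samples, 1.36 × √((n1+n2)/(n1×n2)) at α=0.05. The sketch below assumes that formula; the function name ks_two_sample_critical is made up for illustration.

```python
import math


def ks_two_sample_critical(n1, n2, coefficient=1.36):
    """Large-sample critical D for two samples (assumed standard formula).

    coefficient = 1.36 corresponds to a two-tailed test at alpha = 0.05.
    """
    return coefficient * math.sqrt((n1 + n2) / (n1 * n2))


n1, n2 = 45, 60      # working class and middle class sample sizes
d_observed = 0.42    # maximum cumulative difference from example 2

d_critical = ks_two_sample_critical(n1, n2)
reject_h0 = d_observed > d_critical

print("D critical =", round(d_critical, 2))
print("reject H0" if reject_h0 else "fail to reject H0")
```

For 45 and 60 observations this reproduces the 0.27 used above, so the comparison 0.42 > 0.27 holds.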

for a stricter test

alternate test of significance for a one-tailed test (chi-square approximation)

χ² = 4D² × mn/(m+n)

m = # of observations in sample one

n = # of observations in sample two

with df=2

critical value is 5.99 (chi-square at α=0.05)

this can be used only when N>40 with N-2 df (pg 276 in text)
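The one-tailed alternative can be tried on the commuting data. The sketch below assumes the commonly used chi-square approximation χ² = 4D² × mn/(m+n) with 2 degrees of freedom; that formula is an assumption here, as is the function name ks_chi_square, so check it against the text before relying on it.

```python
def ks_chi_square(d, m, n):
    """Chi-square approximation for the one-tailed two-sample test.

    Assumed form: 4 * D^2 * (m * n) / (m + n), compared against
    the chi-square distribution with 2 degrees of freedom.
    """
    return 4 * d ** 2 * (m * n) / (m + n)


m, n = 45, 60                # observations in sample one and sample two
d = 30 / 45 - 15 / 60        # maximum cumulative difference from example 2
critical_value = 5.99        # chi-square, df = 2, alpha = 0.05

chi2 = ks_chi_square(d, m, n)
reject_h0 = chi2 > critical_value

print("chi-square =", round(chi2, 2))
print("reject H0" if reject_h0 else "fail to reject H0")
```

Under these assumptions the statistic is about 17.86, well above 5.99, which agrees with the two-tailed result.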

the degree of difference is important not the sign of the difference