Runs test for randomness

randomness is a key assumption for all statistical tests

how do we decide if this assumption is reasonable

example

suppose we wish to ascertain whether the sequence of annual rainfall totals over the past 15 years at some location has been generated by a random process or whether the sequence contains a trend

even if the location were chosen randomly the data may still not be random since individual rainfall values are not selected randomly

nonparametric methods can be used to test for randomness of a sample even after it has been collected

number of runs test

test is based on examining the number of runs in a sequence

run: an unbroken sequence of like items bounded by unlike items (or by the ends of the sequence)

using the convention of + and - to denote 2 types of items, a sequence could be [++++----], which contains 2 runs

when values are compared to the median, + denotes above and - denotes below

another sequence [+-+---+-] contains 6 runs
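
counting runs by hand is error-prone for longer sequences. a minimal Python sketch of the count is given here, assuming the sequence is supplied as a string of + and - symbols; the function name count_runs is illustrative and not part of the notes

```python
from itertools import groupby


def count_runs(signs):
    """Count runs: maximal unbroken stretches of identical symbols."""
    # groupby yields one group per stretch of equal neighbours,
    # so the number of groups is the number of runs.
    return sum(1 for _ in groupby(signs))


print(count_runs("++++----"))  # 2
print(count_runs("+-+---+-"))  # 6
```

the two printed values match the run counts of the two example sequences above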

the total number of runs in a sequence is a good indicator of randomness

if there are too few runs it is possible some clustering pattern is present and the series is probably not random

if there are too many runs there may be a repeating or alternating pattern

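for given values of n1, n2 and n = n1 + n2 it is feasible to construct the sampling distribution of the number of runs R, with mean and variance

$$E(R) = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

$$\sigma_R^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

where n1 = number of +s and n2 = number of -s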

this sampling distribution is closely approximated by the normal distribution provided n1, n2 ≥ 10
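
a short Python sketch of the mean, variance and normal-approximation Z score of R follows. it assumes the standard runs-test formulas, mean = 2 n1 n2 / (n1 + n2) + 1 and variance = 2 n1 n2 (2 n1 n2 - n1 - n2) / ((n1 + n2)^2 (n1 + n2 - 1)); the function names are illustrative only

```python
import math


def runs_mean_variance(n1, n2):
    """Mean and variance of the number of runs R under randomness."""
    n = n1 + n2
    mean = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return mean, variance


def runs_z(r, n1, n2):
    """Z score of an observed run count r (normal approximation)."""
    mean, variance = runs_mean_variance(n1, n2)
    return (r - mean) / math.sqrt(variance)


mean, variance = runs_mean_variance(10, 10)
print(mean, round(variance, 3))
```

a strongly negative Z points to too few runs (clustering or trend) and a strongly positive Z to too many runs (alternation)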

any ‘tied’ value (an observation equal to the median) surrounded by 2 observations of opposite sign is noncritical: no matter which sign is assigned to the observation, the number of runs remains the same

if it is surrounded by like signs, it is a critical tie: the sign assigned to it changes the number of runs

to handle this, compute the R statistic twice, once assigning the sign most conducive to the rejection of H0 and once assigning the sign least conducive to the rejection of H0
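
one way to sketch the tie rule in Python is to try every sign assignment for the ties and keep the smallest and largest run count. this brute-force search is an assumption made for illustration, not the notes' procedure; it assumes ties are marked 0 and is practical only for a handful of ties

```python
from itertools import groupby, product


def count_runs(signs):
    """Number of maximal unbroken stretches of identical symbols."""
    return sum(1 for _ in groupby(signs))


def runs_range_with_ties(signs, tie="0"):
    """Smallest and largest R over all +/- assignments to the ties."""
    tie_positions = [i for i, s in enumerate(signs) if s == tie]
    counts = []
    for fill in product("+-", repeat=len(tie_positions)):
        trial = list(signs)
        for pos, sign in zip(tie_positions, fill):
            trial[pos] = sign
        counts.append(count_runs(trial))
    return min(counts), max(counts)


print(runs_range_with_ties("+0-"))  # noncritical tie: (2, 2)
print(runs_range_with_ties("+0+"))  # critical tie: (1, 3)
```

when the two values are equal every tie is noncritical; when they differ, the test is run at both extremes. note that n1 and n2 also change with the assigned signs, so the mean and variance of R should be recomputed for each case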

suppose we have the following sequence of rainfall over 15 years

 

year   rainfall (inches)   sorted        sequence
1951   66                  56            +
1952   57                  56            -
1953   56                  57            -
1954   60                  58            0
1955   61                  58            +
1956   63                  59            +
1957   59                  59            -
1958   58                  60 (median)   -
1959   56                  60            -
1960   60                  61            0
1961   61                  61            +
1962   63                  63            +
1963   66                  63            +
1964   59                  66            -
1965   58                  66            -

 

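step 1 find the median: 60, the 8th of the 15 sorted values

marking each year + (above the median), - (below) or 0 (tied) and counting runs gives R=6, n=15, n1=8, n2=7 (both ties are noncritical and are counted here as +)

with these values the mean of R is 2(8)(7)/15 + 1 = 8.47 and its standard deviation is 1.86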
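
the counts can be checked with a short Python sketch. the rainfall values are those of the table; treating both ties as + is an assumption chosen because it reproduces n1=8 and n2=7

```python
from itertools import groupby
from statistics import median

rainfall = [66, 57, 56, 60, 61, 63, 59, 58, 56, 60, 61, 63, 66, 59, 58]

med = median(rainfall)  # middle value of the 15 sorted totals
signs = ["+" if x > med else "-" if x < med else "0" for x in rainfall]

# Assumption: both ties are noncritical, so either sign gives the same R;
# '+' is used here because it reproduces n1=8, n2=7.
filled = ["+" if s == "0" else s for s in signs]

R = sum(1 for _ in groupby(filled))  # number of runs
n1 = filled.count("+")
n2 = filled.count("-")

print(med, "".join(signs), R, n1, n2)
```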

 

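to test if rainfall has been decreasing, a 1-tailed test is applied

a small value of R supports this hypothesis

for α = 0.05, Z.05 = ±1.645 (table 12, pg 284); the lower-tail test uses -1.645

hence the critical value is

$$A = E(R) + Z \sigma_R$$

where E(R) is the mean of R, σ_R is its standard deviation, and the A’s are the critical values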

A= 8.47 +(-1.645)(1.86)

A=8.47 -3.06

A=5.41

since R = 6 > 5.41, R does not fall in the rejection region, so we accept (do not reject) H0

H0 is that rainfall has not been decreasing
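
the 1-tailed calculation can be sketched in Python as follows, assuming the standard runs-test mean and variance formulas; the function name runs_critical_value is illustrative

```python
import math


def runs_critical_value(n1, n2, z):
    """Critical run count A = mean + z * sd (normal approximation)."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    sd = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    return mean + z * sd


R = 6
A = runs_critical_value(8, 7, z=-1.645)  # lower tail, alpha = 0.05
reject_h0 = R < A                        # too few runs would reject H0

print(round(A, 2), reject_h0)  # 5.41 False
```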

the same 1-tailed approach applies whether the series is suspected of increasing or decreasing, since a steady trend in either direction gives few runs

for 2-tailed test

A=8.47 ±(1.96)(1.86)

A=8.47±3.65

A’s range 4.82 to 12.12

since R = 6 lies inside this range we conclude the series is generated by a random process
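
a Python sketch of the 2-tailed version, taking Z from the standard library's NormalDist instead of a printed table. the 2-sided p-value is an extra not computed in the notes, and the formulas assumed are the standard runs-test mean and variance

```python
import math
from statistics import NormalDist

n1, n2, R = 8, 7, 6
n = n1 + n2
mean = 2 * n1 * n2 / n + 1
sd = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))

z = NormalDist().inv_cdf(0.975)  # about 1.96 for alpha = 0.05, 2-tailed
lower, upper = mean - z * sd, mean + z * sd
is_random = lower <= R <= upper  # inside the range: do not reject H0

# 2-sided p-value under the normal approximation
p_two_sided = 2 * NormalDist().cdf(-abs((R - mean) / sd))

print(round(lower, 2), round(upper, 2), is_random)
```

carrying full precision gives limits of about 4.83 and 12.11, slightly different from 4.82 and 12.12, which come from rounding the mean and standard deviation to 8.47 and 1.86 first; the conclusion is unchanged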

you can also use table 14 pg 286, which lists the critical values for a 2-tailed test for n’s ≤ 20