runs test

Runs test for randomness

randomness is a key assumption for all statistical tests

how do we decide if this assumption is reasonable

example

suppose we wish to ascertain whether the sequence of annual rainfall totals over the past 15 years at some location has been generated by a random process or whether the sequence contains a trend

even if the location were chosen randomly the data may still not be random since individual rainfall values are not selected randomly

nonparametric methods can be used to test for randomness of a sample even after it has been collected

number of runs test

test is based on examining the number of runs in a sequence

run: an unbroken sequence of like items bounded by unlike items or by an end of the sequence

using the convention of + and - to denote the 2 types of items, the sequence [++++----] contains 2 runs

when values are compared to the median, + is used for values above it and - for values below it

another sequence [+-+---+-] contains 6 runs
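the run count for the two sequences above can be checked with a minimal sketch; the function name count_runs is an assumption made for illustration, not something from the notes

```python
def count_runs(signs):
    """Count runs: maximal unbroken stretches of the same symbol."""
    runs = 0
    previous = None
    for s in signs:
        if s != previous:  # a change of symbol starts a new run
            runs += 1
            previous = s
    return runs


print(count_runs("++++----"))  # 2
print(count_runs("+-+---+-"))  # 6
```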

the total number of runs in a sequence is a good indicator of randomness

if there are too few runs it is possible some clustering pattern is present and the series is probably not random

if there are too many runs there may be a repeating or alternating pattern

for given values of n₁, n₂ and n it is feasible to construct the sampling distribution of the number of runs R, with mean and variance

μ_R = 2n₁n₂/n + 1

σ²_R = 2n₁n₂(2n₁n₂ - n) / (n²(n - 1))

where n₁ = number of +s, n₂ = number of -s and n = n₁ + n₂

this sampling distribution is closely approximated by the normal distribution provided n₁, n₂ ≥ 10
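a minimal sketch of the normal approximation, assuming the standard runs-test mean and variance (written out in the code comments); the function names runs_mean_sd and runs_z and the counts in the usage lines are hypothetical, chosen only for illustration

```python
from math import sqrt


def runs_mean_sd(n1, n2):
    """Mean and standard deviation of R under randomness."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1                                   # 2*n1*n2/n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return mean, sqrt(variance)


def runs_z(r, n1, n2):
    """Standardized number of runs, to compare against a normal table."""
    mean, sd = runs_mean_sd(n1, n2)
    return (r - mean) / sd


# hypothetical counts with n1, n2 >= 10, so the approximation is reasonable
mean, sd = runs_mean_sd(12, 13)
print(mean, sd, runs_z(9, 12, 13))
```

a strongly negative Z points to too few runs (clustering or trend), a strongly positive Z to too many (alternation)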

a ‘tied’ value (one equal to the median) that sits between 2 observations of opposite sign is noncritical: no matter which sign is applied to the observation, the number of runs remains the same

if it is surrounded by like signs it is a critical tie: the sign assigned to it changes the number of runs

to handle critical ties, compute the R statistic twice, once assigning the sign most conducive to the rejection of H₀ and once assigning the sign least conducive to the rejection of H₀
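one way to see the effect of ties is to try every sign assignment and report the smallest and largest R; this is only a sketch, the name runs_range_with_ties and the use of "0" to mark a tie are assumptions, and it is practical only when there are few ties (2^k assignments for k ties)

```python
from itertools import product


def count_runs(signs):
    """Count maximal unbroken stretches of the same symbol."""
    runs, previous = 0, None
    for s in signs:
        if s != previous:
            runs += 1
            previous = s
    return runs


def runs_range_with_ties(signs):
    """Return (min R, max R) over all +/- assignments to ties marked '0'."""
    tie_positions = [i for i, s in enumerate(signs) if s == "0"]
    counts = []
    for choice in product("+-", repeat=len(tie_positions)):
        filled = list(signs)
        for pos, sign in zip(tie_positions, choice):
            filled[pos] = sign
        counts.append(count_runs(filled))
    return min(counts), max(counts)


print(runs_range_with_ties("+0-"))  # noncritical tie: (2, 2)
print(runs_range_with_ties("+0+"))  # critical tie: (1, 3)
```

which extreme is "most conducive to rejection" depends on the alternative: when testing for a trend (too few runs) it is the minimum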

suppose we have the following sequence of rainfall over 15 years

year	rainfall (inches)	sorted	sequence
1951	66	56	+
1952	57	56	-
1953	56	57	-
1954	60	58	0
1955	61	58	+
1956	63	59	+
1957	59	59	-
1958	58	60 (median)	-
1959	56	60	-
1960	60	61	0
1961	61	61	+
1962	63	63	+
1963	66	63	+
1964	59	66	-
1965	58	66	-

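step 1 find the median: the 8th of the 15 sorted values, 60 inches

each year is marked + if above the median, - if below and 0 if tied; the 2 ties (1954 and 1960) each sit between a - and a +, so they are noncritical

counting the ties as +, R=6, n=15, n₁=8, n₂=7

the mean of R is 2(8)(7)/15 + 1 = 8.47 and the variance is 2(8)(7)[2(8)(7) - 15] / [15²(14)] = 3.45, so the standard deviation is 1.86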
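the counts above can be reproduced from the rainfall column; in this sketch the ties are counted as +, which is an inference from n₁=8 and n₂=7 rather than something the notes state, and the variable names are illustrative

```python
from statistics import median

# annual rainfall (inches), 1951 to 1965, from the table above
rainfall = [66, 57, 56, 60, 61, 63, 59, 58, 56, 60, 61, 63, 66, 59, 58]

med = median(rainfall)

# assumption: values tied with the median are counted as '+'
signs = ["+" if x >= med else "-" for x in rainfall]

# a new run starts at the first item and at every change of sign
R = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
n1 = signs.count("+")
n2 = signs.count("-")

print(med, "".join(signs), R, n1, n2)
```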

to test whether rainfall has been decreasing, a 1-tailed test is applied

a small value of R supports this hypothesis

for α=0.05, Z_.05=±1.645 (table 12 pg 284); the lower tail uses -1.645

hence A = mean + Z(standard deviation)

the A’s are the critical values

A= 8.47 +(-1.645)(1.86)

A=8.47 -3.06

A=5.41

since R=6 > 5.41, R does not fall in the lower tail, so we cannot reject H₀

H₀ is that rainfall has not been decreasing
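the 1-tailed calculation can be sketched with the standard library; NormalDist().inv_cdf supplies the normal quantile in place of the printed table, and the variable names are illustrative

```python
from math import sqrt
from statistics import NormalDist

R, n1, n2 = 6, 8, 7
n = n1 + n2

mean = 2 * n1 * n2 / n + 1
sd = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))

z = NormalDist().inv_cdf(0.05)   # lower-tail quantile, about -1.645
critical = mean + z * sd         # about 5.41

# few runs (R below the critical value) would point to a trend
reject_h0 = R < critical
print(round(critical, 2), reject_h0)
```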

this 1-tailed approach applies when the question is whether the series is increasing or decreasing, since a steady increase or decrease would give few runs

for 2-tailed test

A=8.47 ±(1.96)(1.86)

A=8.47±3.65

the A’s range from 4.82 to 12.12

since R=6 is inside this range we cannot reject H₀ and conclude the series is consistent with a random process
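a sketch of the 2-tailed check under the same assumptions (normal quantile from NormalDist instead of the table, illustrative variable names); the bounds differ slightly in the second decimal from the hand calculation because the notes round the intermediate values

```python
from math import sqrt
from statistics import NormalDist

R, n1, n2 = 6, 8, 7
n = n1 + n2

mean = 2 * n1 * n2 / n + 1
sd = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 2-tailed test at 0.05
lower = mean - z * sd
upper = mean + z * sd

# H0 (randomness) is rejected only if R falls outside [lower, upper]
reject_h0 = not (lower <= R <= upper)
print(round(lower, 2), round(upper, 2), reject_h0)
```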

you can also use table 14 pg 286, which lists the critical values for a 2-tailed test for n₁, n₂ ≤ 20
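for small samples the exact sampling distribution of R can be built by counting arrangements, which is the kind of calculation a small-sample table rests on; this is a sketch using the standard combinatorial formula, the function name runs_pmf is an assumption, and it does not reproduce the entries of table 14

```python
from math import comb


def runs_pmf(n1, n2):
    """Exact P(R = r) under randomness for n1 '+' signs and n2 '-' signs."""
    total = comb(n1 + n2, n1)  # equally likely arrangements of the signs
    pmf = {}
    for r in range(2, n1 + n2 + 1):
        k = r // 2
        if r % 2 == 0:
            # k runs of each sign, starting with either sign
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:
            # k + 1 runs of one sign and k runs of the other
            ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        if ways:
            pmf[r] = ways / total
    return pmf


pmf = runs_pmf(8, 7)
lower_tail = sum(p for r, p in pmf.items() if r <= 6)  # P(R <= 6)
print(lower_tail)
```

comparing lower_tail with α gives an exact 1-tailed test without the normal approximation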