U test

Mann-Whitney U-test

use when at least strong ordinal data are available, e.g. ratings 1/2/3/4/5

it is a powerful distribution-free test

when the data are interval/ratio (with unknown or non-normal distribution) the values are allotted rank numbers in a single sequence

it is used to test whether there is a difference between two independent samples

that is, do the samples come from different populations?

example: shopping behavior in London, $/month on food

downtown (A) 135 137 144 140 146 145 mean=$141.17 n₁=6

suburbs (B) 132 133 142 134 136 143 mean=$136.67 n₂=6

sample sizes n₁ = n₂ = 6; if the sizes differ, the smaller sample is always n₁

averages indicate suburban store food prices are lower than downtown prices

but is the difference significant?

H₀: that the samples are drawn from the same population and that the differences are due to chance variation

H₁: that the samples are drawn from different populations and that the differences are significant

step 1: rank order the values (prices) across both groups combined, but maintain the group identity

if the values are tied use the arithmetic average of the rankings they would otherwise receive

let # in group A be n₁

let # in group B be n₂

let Σ ranks (A) be R₁ = 50

let Σ ranks (B) be R₂ = 28
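A minimal sketch of step 1 in Python, assuming the 12 values from the example. The helper name `average_ranks` is made up for illustration; it ranks the pooled values and gives tied values the arithmetic average of the ranks they would otherwise receive.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


downtown = [135, 137, 144, 140, 146, 145]  # group A
suburbs = [132, 133, 142, 134, 136, 143]   # group B

# rank the pooled data, then split the ranks back by group
ranks = average_ranks(downtown + suburbs)
R1 = sum(ranks[:len(downtown)])
R2 = sum(ranks[len(downtown):])
print(R1, R2)  # 50.0 28.0
```

There are no ties in this data set, so the tie branch only matters for other samples.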

step 2: calculate U - inspect each B in turn and count the number of As which precede it

U = 0+0+0+1+3+3 = 7; this is the test statistic
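The counting in step 2 can be sketched as below. The function name `u_by_counting` is hypothetical, and counting a tie as one half is an assumed convention; the notes do not say how ties are handled when counting directly.

```python
def u_by_counting(a, b):
    """For each value in b, count how many values in a precede it."""
    u = 0.0
    for y in b:
        for x in a:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5  # assumed convention: a tie counts as one half
    return u


downtown = [135, 137, 144, 140, 146, 145]  # group A
suburbs = [132, 133, 142, 134, 136, 143]   # group B

U = u_by_counting(downtown, suburbs)
print(U)  # 7.0
```

Swapping the arguments counts the Bs that precede each A and gives the other U value.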

step 3: look up critical value pg 280 in text

for 2 tailed at .10 the value is 7

for 2 tailed at .05 the value is 3

U will be larger when the groups are interleaved, e.g. [BABABA] gives U = 0+1+2 = 3

U will be small when the groups separate cleanly, e.g. [BBBAAA] gives U = 0

the computed value of U must be less than or equal to the value in the table to reject H₀

here U = 7 and the critical value at .05 is 3, so we cannot reject the null hypothesis at the .05 level
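The decision rule can be written out as a small check. This is only a sketch: the critical values are the two quoted in these notes for n₁ = n₂ = 6, and the names `reject_h0` and `critical_two_tailed` are made up for illustration.

```python
def reject_h0(u, critical_value):
    """Reject H0 when the computed U is at or below the tabled value."""
    return u <= critical_value


# two-tailed critical values as quoted in the notes for n1 = n2 = 6
critical_two_tailed = {0.10: 7, 0.05: 3}

U = 7
for alpha, crit in critical_two_tailed.items():
    print(alpha, reject_h0(U, crit))
```

With these tabled values U = 7 just reaches the .10 cutoff but not the .05 cutoff.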

computational formula

U₁ = n₁n₂ + n₁(n₁+1)/2 - R₁ = (6x6) + (6(6+1) / 2) - 50 = 7

U₂ = n₁n₂ + n₂(n₂+1)/2 - R₂ = (6x6) + (6(6+1) / 2) - 28 = 29

check: U_min = n₁ x n₂ - U_max, i.e. 7 = 36 - 29

taking the smaller value of U we get 7, which is greater than the critical value of 3 (2 tailed at .05), so we cannot reject the null hypothesis
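A short sketch of the computational formula and the check, using the rank sums from the example (R₁ = 50, R₂ = 28). The function name `u_from_rank_sum` is hypothetical.

```python
def u_from_rank_sum(n_this, n_other, rank_sum):
    """U for one group: n1*n2 + n(n+1)/2 - R, with n and R from that group."""
    return n_this * n_other + n_this * (n_this + 1) / 2 - rank_sum


n1, n2 = 6, 6
R1, R2 = 50, 28  # rank sums for downtown (A) and suburbs (B)

U1 = u_from_rank_sum(n1, n2, R1)
U2 = u_from_rank_sum(n2, n1, R2)
print(U1, U2)  # 7.0 29.0

# check: the two U values always add up to n1 * n2
assert min(U1, U2) == n1 * n2 - max(U1, U2)
```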

for larger sample sizes (> 20) we can test H₀ with a z statistic using the normal approximation z = (U - n₁n₂/2) / √(n₁n₂(n₁+n₂+1)/12)
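A hedged sketch of the large-sample test, assuming the usual normal approximation to U (mean n₁n₂/2, variance n₁n₂(n₁+n₂+1)/12) with no tie or continuity correction. The function names and the sample figures (n₁ = n₂ = 25, U = 200) are made up for illustration.

```python
import math


def u_to_z(u, n1, n2):
    """z statistic for U under the normal approximation (no corrections)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def two_tailed_p(z):
    """Two-tailed p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# hypothetical larger samples, not from the shopping example
z = u_to_z(200, 25, 25)
p = two_tailed_p(z)
print(round(z, 3), round(p, 4))
```

A U well below n₁n₂/2 gives a large negative z, which matches the idea that a low U signals a large difference between the samples.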

significance

a low value of U is produced when there is a large difference between 2 samples

for H₁: X ≠ Y (a two tailed test) the value needed is the smaller of U_x and U_y

for H₁: X > Y (a one tailed test) the value needed is U_x or U₁

for H₁: X < Y (a one tailed test) the value needed is U_y or U₂
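The three cases above can be collected into one helper. This is a sketch only: the function name `u_for_alternative` and the labels "two-sided", "greater" and "less" are assumptions made for illustration.

```python
def u_for_alternative(u_x, u_y, alternative):
    """Pick the U value to compare with the table for a given H1."""
    if alternative == "two-sided":
        return min(u_x, u_y)  # two tailed: smaller of the two
    if alternative == "greater":  # H1: X > Y
        return u_x
    if alternative == "less":     # H1: X < Y
        return u_y
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# U values from the shopping example: downtown = X, suburbs = Y
print(u_for_alternative(7, 29, "two-sided"))  # 7
print(u_for_alternative(7, 29, "greater"))    # 7
print(u_for_alternative(7, 29, "less"))       # 29
```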