
parametric tests t-test/z test

parametric tests are more efficient/powerful than nonparametric tests, but there are 4 restrictions on their use

1) data must be measured at the interval/ratio scale

2) data must be drawn from a normally distributed population

3) data must be drawn in independent samples

4) when you have 2 or more samples, the populations from which the samples are drawn are assumed to have equal variance = homoscedasticity assumption

it's ok to assume this if n ≤ 40, otherwise use the F test (ANOVA)

H0: μ1 = μ2, or μ1 - μ2 = 0, i.e. the population means are equal

equivalently, the samples are drawn from the same population, or there is no significant difference between them

 

e.g. growth rates in northern and southern Ontario

H0: there is no significant difference between growth rates in N and S Ontario cities; use the t-test if n ≤ 40

northern      southern
x̄1 = 10.6     x̄2 = 15.0
n1 = 11       n2 = 10
s1 = 11.8     s2 = 9.6

x̄i = mean of ith sample, si = standard deviation of ith sample

t statistic

t = (x̄1 - x̄2) / SE

where SE = SE(x̄1 - x̄2) is the standard error of the difference between the two sample means

therefore

you may see somewhat different formulas if the analyst uses the n-1 correction when calculating the sample variance

 

S is the pooled estimate of the variance of the data, a kind of average of the 2 sample variances

 

t-tables

df      2 tails
10      2.228
20      2.086
30      2.042
inf     1.960

as n increases, the critical value of t approaches 1.96; in other words, the t distribution approaches a normal distribution
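The limiting value in the table can be checked against the standard normal distribution. A minimal sketch using only Python's standard library; the dictionary `t_table` simply copies the tabulated values above, and the 0.05 significance level is an assumption inferred from those values.

```python
from statistics import NormalDist

alpha = 0.05  # assumed two-tailed significance level behind the table

# Two-tailed critical value of the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Critical t values copied from the table in these notes
t_table = {10: 2.228, 20: 2.086, 30: 2.042}

# Gap between each tabulated t and the normal limit; it shrinks as df grows
gaps = {df: t_crit - z_crit for df, t_crit in t_table.items()}

print(f"z critical = {z_crit:.3f}")
for df, gap in gaps.items():
    print(f"df = {df:2d}: t critical exceeds z critical by {gap:.3f}")
```

The gaps shrink steadily with increasing df, which is the convergence the table illustrates.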

for our 2-tailed test, df = (n1-1)+(n2-1) = (11-1)+(10-1) = 10+9 = 19

at 0.05 with df=19, critical value of t=2.093 (pg 274 in textbook)

the calculated t is smaller in magnitude than the critical value, therefore we cannot reject H0

conclude similar growth rates, they are not significantly different
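The Ontario example can be worked through numerically. This is a minimal sketch, not necessarily the exact formula used in class: it assumes the common pooled form SE = sqrt(S^2 (1/n1 + 1/n2)) with S^2 = ((n1-1) s1^2 + (n2-1) s2^2) / (n1+n2-2), i.e. s1 and s2 are treated as n-1 standard deviations. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate.

    Assumption: sd1 and sd2 were computed with the n-1 divisor.
    Returns (t, standard error of the difference, degrees of freedom).
    """
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: a weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, se, df

# Northern vs southern Ontario growth rates (values from the notes)
t, se, df = pooled_t(10.6, 11.8, 11, 15.0, 9.6, 10)

t_crit = 2.093  # two-tailed, 0.05 level, df = 19 (value quoted in the notes)
reject_h0 = abs(t) > t_crit

print(f"t = {t:.3f}, SE = {se:.3f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions t comes out at roughly -0.93, well inside the critical value of 2.093, which agrees with the conclusion that H0 cannot be rejected. Using n rather than n-1 in the variances changes t only slightly and does not change the decision.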

 

z test if n ≥ 40

e.g. clay content at 2 sites

site 1        site 2
x̄1 = 62.7     x̄2 = 61.8     small difference in means
n1 = 120      n2 = 150
s1 = 2.50     s2 = 2.62     small standard deviations

therefore reject H0, there is a significant difference between sites

 

for the z distribution, the critical value at the 5% significance level is z = 1.96 (2-tailed test)
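The clay content example can be checked numerically. A minimal sketch, assuming the usual large-sample form z = (x̄1 - x̄2) / sqrt(s1^2/n1 + s2^2/n2); the function name `two_sample_z` is hypothetical.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic for the difference between two means.

    Assumption: each sample variance is used directly as an estimate
    of its population variance, which is reasonable for large n.
    Returns (z, standard error of the difference).
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se, se

# Clay content at site 1 and site 2 (values from the notes)
z, se = two_sample_z(62.7, 2.50, 120, 61.8, 2.62, 150)

z_crit = 1.96  # two-tailed, 5% level
reject_h0 = abs(z) > z_crit

print(f"z = {z:.3f}, SE = {se:.3f}, reject H0: {reject_h0}")
```

Under this assumption z is roughly 2.88, which exceeds 1.96. The difference in means is small, but so is the standard error with samples this large, so H0 is rejected.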

t test for paired samples

t = d̄ / (Sd / √n)

where dj = x1j - x2j is the difference between the paired values x1j and x2j, d̄ is the mean of the n differences and Sd is their standard deviation

if we assume that the differences dj are a random sample from a normal population, we can generalize the test to allow hypotheses concerning any value for the mean difference in the population

μd = μ1 - μ2

example

a cartographer tests the time taken by intro students to perform a given set of tasks involving extracting information from maps; at the end of the course the test is repeated

student   1st time   2nd time   difference
1         16         15          1
2         23         21          2
3         17         16          1
4         14         15         -1
5         16         15          1
6         21         19          2
7         19         18          1
8         24         10         14
9         26         15         11
10        19         20         -1

 

d̄ = 31/10 = 3.10

Sd = 5.11

if α = 0.05, tc = 2.262 with df = 9
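The paired example can be reproduced from the raw times. A minimal sketch, assuming the standard paired form t = d̄ / (Sd / sqrt(n)) with Sd computed using the n-1 divisor (which matches the 5.11 quoted above); the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Times for the 10 students (values from the table in the notes)
first = [16, 23, 17, 14, 16, 21, 19, 24, 26, 19]
second = [15, 21, 16, 15, 15, 19, 18, 10, 15, 20]

# Per-student differences: 1st time minus 2nd time
d = [a - b for a, b in zip(first, second)]
n = len(d)

d_bar = mean(d)   # mean difference
s_d = stdev(d)    # standard deviation of the differences (n-1 divisor)
t = d_bar / (s_d / math.sqrt(n))

t_crit = 2.262    # two-tailed, alpha = 0.05, df = 9 (value from the notes)
reject_h0 = abs(t) > t_crit

print(f"mean d = {d_bar:.2f}, Sd = {s_d:.2f}, t = {t:.3f}, reject H0: {reject_h0}")
```

Under these assumptions t is roughly 1.92, which is below the two-tailed critical value of 2.262, so H0 of no mean difference would not be rejected at the 0.05 level.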

one tailed and two tailed tests

so far we’ve only tested the null hypothesis against the alternative hypothesis H1 that there is a difference between the means of the populations from which the 2 samples were taken

 

since we want to know if the difference lies in either direction it is a 2-tailed test

 

if we want to test that there is a difference between means in a specified direction we have a 1-tailed test

 

if H1 is x > y, then the null hypothesis can be rejected only if the observed difference is in that direction (x > y) and it is significant at the chosen level
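The choice between one and two tails can change the decision. A minimal sketch using the paired-sample summary values from the cartography example (d̄ = 3.10, Sd = 5.11, n = 10). The one-tailed critical value 1.833 is taken from standard t tables for df = 9 at the 0.05 level and does not appear in these notes; the directional H1 (first time greater than second time) is also an assumption made for illustration.

```python
import math

# Summary values from the paired example in the notes
d_bar, s_d, n = 3.10, 5.11, 10
t = d_bar / (s_d / math.sqrt(n))

crit_two_tailed = 2.262  # alpha = 0.05, df = 9 (from the notes)
crit_one_tailed = 1.833  # alpha = 0.05, df = 9 (standard t table; assumption)

# Two-tailed: H1 is a difference in either direction
reject_two_tailed = abs(t) > crit_two_tailed

# One-tailed: H1 is that the first time exceeds the second time,
# so only a large positive t counts as evidence against H0
reject_one_tailed = t > crit_one_tailed

print(f"t = {t:.3f}")
print(f"two-tailed reject H0: {reject_two_tailed}")
print(f"one-tailed reject H0: {reject_one_tailed}")
```

With these numbers the same t of about 1.92 fails the two-tailed test but passes the one-tailed test, which is why the direction of H1 should be fixed before looking at the data.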