Lab Three: Measures of Central Tendency, Dispersion, Skewness and Kurtosis

INTRODUCTION

This lab presents three of the most commonly used methods available to the researcher when attempting to describe a given distribution of observations:

1) measures of central tendency (mean, mode and median) which reveal what observation(s) is (are) most representative of the entire distribution,

2) measures of dispersion (range, mean deviation, variance and standard deviation), which provide information concerning how the observations are spread out in the distribution, and

3) techniques that enable the researcher to describe the overall shape of the distribution (skewness and kurtosis measures).

An understanding of these descriptive statistics is useful in its own right and is also needed later when performing parametric or non-parametric tests.

Formulae and/or Definitions:

Mode: the most frequently occurring observation in a data set.

Median (Md): the middle observation of a ranked set of data.

Mean: $\bar{x} = \frac{\sum x}{n}$

Mean of frequencies:

Range: the highest observation minus the lowest observation

Standard Deviation = square root of the variance

(In all cases: 'x' is an individual observation, '$\sum$' is the summation of every individual observation 1 through n, and 'n' is the total number of observations.)

For your purposes, s is the same thing as $\sigma$.

skewness=

kurtosis=
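
As a point of reference only, one common moment-based convention for the dispersion and shape measures named in this lab is sketched below. It assumes that every sum is divided by n (consistent with treating s and sigma as the same thing) and that kurtosis is reported without subtracting 3, so a normal distribution scores 3. The textbook may use a different convention (for example n - 1 in the denominator, or kurtosis minus 3), in which case the textbook version takes precedence.

```latex
\begin{align*}
\text{mean deviation} &= \frac{\sum \lvert x - \bar{x} \rvert}{n} \\
\text{variance: } s^{2} &= \frac{\sum (x - \bar{x})^{2}}{n} \\
\text{skewness} &= \frac{\sum (x - \bar{x})^{3}}{n\, s^{3}} \\
\text{kurtosis} &= \frac{\sum (x - \bar{x})^{4}}{n\, s^{4}}
\end{align*}
```

Under this convention a skewness near 0 indicates a roughly symmetric distribution, positive values indicate a longer right tail, and negative values a longer left tail.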

Instructions:

All calculations for the questions in Part A are to be performed using the 'life expectancy' data set and must be done by hand. Questions for Part B refer to the 'frost-free days' listing and can be computed either manually or with the use of SPSS for Windows. Note: when calculating the variance, standard deviation, skewness and kurtosis values it is recommended that you construct a table similar to the example illustrated in Earickson, p. 95.
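
The layout of such a working table can be sketched in a few lines of Python. This is only an illustration of the idea, not a reproduction of the textbook example: the five data values are hypothetical (they come from neither lab data set), and the formulas assume division by n and a kurtosis that is not reduced by 3.

```python
# Working table for hand calculations: one row per observation holding the
# deviation from the mean and its 2nd, 3rd and 4th powers.
# The data are hypothetical and are not taken from either lab data set.
data = [2, 4, 4, 6, 9]

n = len(data)
mean = sum(data) / n

rows = []
for x in data:
    d = x - mean
    rows.append((x, d, d**2, d**3, d**4))

# Column totals feed the variance, skewness and kurtosis formulas.
sum_d2 = sum(r[2] for r in rows)
sum_d3 = sum(r[3] for r in rows)
sum_d4 = sum(r[4] for r in rows)

# Assumption: population-style formulas that divide by n.
variance = sum_d2 / n
std_dev = variance ** 0.5
skewness = sum_d3 / (n * std_dev**3)
kurtosis = sum_d4 / (n * std_dev**4)

print(f"{'x':>4} {'x-mean':>8} {'^2':>8} {'^3':>8} {'^4':>8}")
for x, d, d2, d3, d4 in rows:
    print(f"{x:>4} {d:>8.2f} {d2:>8.2f} {d3:>8.2f} {d4:>8.2f}")
print(f"{'sum':>4} {'':>8} {sum_d2:>8.2f} {sum_d3:>8.2f} {sum_d4:>8.2f}")
print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.2f}")
print(f"skewness={skewness:.3f} kurtosis={kurtosis:.3f}")
```

The same column structure (x, deviation, squared, cubed and fourth-power deviation, with totals in the last row) works on paper for the Part A data.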

PART A: (35 marks)

Life Expectancy At Birth (for Selected Countries)

COUNTRY         YEARS    COUNTRY         YEARS
AFGHANISTAN        36    ALGERIA            57
ARGENTINA          70    BANGLADESH         48
BOTSWANA           61    BRAZIL             64
CANADA             75    CHAD               44
CUBA               75    CYPRUS             74
DOMINICAN REP      62    ETHIOPIA           47
W. GERMANY         73    GREECE             74
GUATEMALA          60    GUYANA             68
INDIA              55    IRAQ               59
JAPAN              77    N. KOREA           64
LIBYA              57    LUXEMBOURG         73
MALDIVES           47    MEXICO             65
MOZAMBIQUE         51    NEW ZEALAND        73
NORWAY             76    PANAMA             71
POLAND             72    ROMANIA            71
SAUDI ARABIA       56    SINGAPORE          72
SRI LANKA          69    SWITZERLAND        79
TURKEY             63    UGANDA             47
UNITED KINGDOM     74    UNITED STATES      75
VIET NAM           64    ZAIRE              50

Source: The World Bank Atlas (1985). The World Bank, Washington, DC, pp. 5-9.

 

Questions:

1) Calculate the mode, median and mean for the 40 countries sampled. The mean (or average) is the most commonly used measure of central tendency, but what can be gained by evaluating a distribution's median or mode? Which measure of central tendency is most appropriate when nominal-level data are used?

2) Calculate the range, mean deviation, variance and standard deviation. In general terms, what does each of these dispersion statistics measure?

3) On graph paper, draw a histogram (6 classes would be most appropriate). Mark on the histogram the location of the mean, of one standard deviation above and below the mean, and of two standard deviations above and below the mean. How many of the 40 observations are within 1 standard deviation of the mean? How many observations are beyond 2 standard deviations?

4) Interpret the results obtained from questions 1 through 3 (i.e., what can be said about the distribution's central tendency and dispersion characteristics?).

5) What do skewness and kurtosis measure, and why is it important to know their values? Calculate the skewness and kurtosis for this data set and interpret the results.
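
The mechanics behind question 3 (equal-width classes and counting observations within a given number of standard deviations of the mean) can be sketched as follows. The function names `class_counts` and `count_within`, the sample values, and the choices to use equal-width classes and a standard deviation that divides by n are all assumptions made for illustration; Part A itself must still be worked by hand.

```python
import math

def class_counts(values, n_classes=6):
    """Split the data range into equal-width classes and count each one.

    Assumes the values are not all identical (the class width would be 0).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for v in values:
        # The maximum value is placed in the last class.
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_classes + 1)]
    return edges, counts

def count_within(values, k):
    """Count observations lying within k standard deviations of the mean."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: standard deviation computed with n in the denominator.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(1 for v in values if abs(v - mean) <= k * sd)

# Hypothetical sample, not one of the lab data sets.
sample = [10, 12, 15, 18, 20, 22, 22, 25, 30, 40]
edges, counts = class_counts(sample)
print("class edges:", edges)
print("class counts:", counts)
print("within 1 sd:", count_within(sample, 1))
print("beyond 2 sd:", len(sample) - count_within(sample, 2))
```

Equal-width classes are only one possible choice; rounding the class limits to convenient whole numbers is often easier to draw on graph paper.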

PART B: (35 marks)

Number of Frost-Free Days (for Various Canadian Centres)

Station         # of Days    Station         # of Days
St. Johns             130    Goose Bay             122
Sydney                145    Yarmouth              174
Charlottetown         150    Saint John            170
Chatham (N.B.)        122    Schefferville          73
Quebec                132    Montreal              183
Val d'Or               98    Kapuskasing            83
Ottawa                142    Toronto               192
Windsor               173    Thunder Bay           101
Sioux Lookout         113    Winnipeg              118
Churchill              81    Regina                107
Medicine Hat          125    Calgary               106
Edmonton              127    Penticton             143
Vancouver             212    Victoria              202
Estevan Point         226    Dawson                 92
Whitehorse             87    Fort Smith             64
Yellowknife           108    Inuvik                 45
Frobisher Bay          59    Alert                   4

Source: Canada and the World (An Atlas Resource) (1985). J. Matthews and R. Morlow, Toronto, Canada, p. 161.

Questions:

1) What are the mean, median and mode values for the 'number of frost-free days' for the 34 stations listed above?

2) What values are calculated for: the range, mean deviation, variance, standard deviation, skewness and kurtosis?

3) To show the distribution graphically, draw a 6-class histogram.

4) With respect to the results obtained in both Parts A and B, which of the two data sets would be more appropriate for parametric testing? In answering this question, make reference to the derived skewness and kurtosis values and the general shape of your two histograms. (Note: one of the assumptions of parametric tests is that the underlying data set be normally distributed.)
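
Because Part B may be computed by machine, the statistics in questions 1 and 2 could also be cross-checked with a short script instead of SPSS. The sketch below uses Python's standard `statistics` module (version 3.8 or later). The `describe` helper is hypothetical, and the shape measures assume the divide-by-n, moment-based definitions with kurtosis not reduced by 3, so the figures may differ slightly from SPSS output, which applies sample corrections.

```python
import statistics

# Frost-free days for the 34 stations, in the order listed in the table.
frost_free_days = [
    130, 145, 150, 122, 132, 98, 142, 173, 113, 81, 125, 127, 212, 226,
    87, 108, 59, 122, 174, 170, 73, 183, 83, 192, 101, 118, 107, 106,
    143, 202, 92, 64, 45, 4,
]

def describe(values):
    """Return central tendency, dispersion and shape measures as a dict."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population form: divides by n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
        "range": max(values) - min(values),
        "mean_deviation": sum(abs(v - mean) for v in values) / n,
        "variance": statistics.pvariance(values),
        "std_dev": sd,
        # Assumption: moment-based shape measures, kurtosis not reduced by 3.
        "skewness": sum((v - mean) ** 3 for v in values) / (n * sd**3),
        "kurtosis": sum((v - mean) ** 4 for v in values) / (n * sd**4),
    }

for name, value in describe(frost_free_days).items():
    print(f"{name}: {value}")
```

Passing the Part A values to the same helper gives skewness and kurtosis figures computed on an identical basis, which keeps the comparison in question 4 consistent.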

PART C: (10 marks)

Make four lists of ten numbers each, such that:

(List 1) the mean is 75.
(List 2) the mean is the same as the mean for List 1, but the range is bigger than the range of the numbers in List 1.
(List 3) the median is 6 and the range is 9.
(List 4) the mean is bigger than the median.

PART D: (20 marks)

Now it's time to do some work using your own data. Collect some data applicable to your area of interest in geography. The data must be suitable for analysis with the measures of central tendency, meaning they should be of at least interval level.

For these data, explain where they come from, what the likely sources of error are, and their significance to geographic study. Then analyze your data and explain what you find.