Measures of Central Tendency and Variability
Central Tendency
Many variables tend to be approximately normally distributed
- number of siblings
- salary of workers in one job category
- number of sunny days per month
- satisfaction with internet browser
By this we mean:
- the values tend to cluster around a middle or central value
- the further you get from this central value, the fewer observations there are
A Normally Distributed Variable:
- can be quite well-described with two values
  - a measure of central tendency
  - a measure of variability
So...
You need to know:
- where the middle is
- how variable the data are
Measuring the middle:
- arithmetic average (mean)
- middle score (median)
- most frequent score (mode)
Mean
- Can be thought of as the balance point
- To calculate the mean:
  - Add up the individual values
  - Divide by the number of values
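The two steps above can be sketched in Python. This is only an illustration: the function name `mean` and the sample scores are made up for the example.

```python
def mean(scores):
    """Arithmetic mean: add up the values, then divide by how many there are."""
    if not scores:
        raise ValueError("the mean of an empty list is undefined")
    total = sum(scores)          # add up the individual values
    return total / len(scores)   # divide by the number of values


scores = [2, 4, 4, 5, 10]  # hypothetical scores
print(mean(scores))  # 5.0
```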
Median
- This is the middle score
- To find the median:
  - line up all the scores from lowest to highest
  - count the number of scores (N)
  - identify the position of the middle score
    - if N is odd, the middle score is at position N/2 + 0.5
    - if N is even, the middle falls exactly between positions N/2 and N/2 + 1
  - count through your lined-up scores until you reach the middle position
  - the score there is your median
  - if the middle falls between 2 scores, add them together and divide by 2
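As a rough sketch, the same procedure can be written in Python. The function name `median` and the example scores are hypothetical. Positions in the steps above count from 1, while Python lists count from 0, so each index is one less than the position.

```python
def median(scores):
    """Middle score of a list, following the line-up-and-count procedure."""
    ordered = sorted(scores)          # line up the scores from lowest to highest
    n = len(ordered)                  # count the number of scores
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    if n % 2 == 1:
        position = (n + 1) // 2       # same as N/2 + 0.5, counting from 1
        return ordered[position - 1]
    lower = ordered[n // 2 - 1]       # score at position N/2
    upper = ordered[n // 2]           # score at position N/2 + 1
    return (lower + upper) / 2        # average of the two middle scores


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```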
Mode
- Put the scores in piles with identical scores piled together
- Look for the tallest pile, or the most frequently-occurring score
- This is the mode
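The "piles" idea maps quite directly onto a counter. The sketch below uses `collections.Counter` from the Python standard library. The function name `modes` and the scores are hypothetical, and returning every score that ties for the tallest pile is a choice made for this example, since the text does not say what to do with ties.

```python
from collections import Counter


def modes(scores):
    """Return the most frequently occurring score(s), lowest first."""
    if not scores:
        raise ValueError("the mode of an empty list is undefined")
    piles = Counter(scores)            # one pile per distinct score
    tallest = max(piles.values())      # height of the tallest pile
    return sorted(score for score, height in piles.items() if height == tallest)


print(modes([15, 12, 15, 14, 15, 12]))  # [15]
print(modes([1, 1, 2, 2, 3]))           # [1, 2]
```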
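Note: If the data are normally distributed, the Mean, Median, and Mode
are equal. The reverse is not guaranteed: equal values are consistent
with a symmetrical distribution, but they do not prove that it is normal.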
Non-Normal Distributions
- The three values are NOT the same
  - Mean: 13.3
  - Median: 14.75
  - Mode: 15
- This tells us that the data are NOT symmetrical, and so not normally distributed
- In fact, the ordering of these values tells us HOW the data are not symmetrical
  - when mean < median < mode
    - a tail of low scores pulls the mean down (negative skew)
  - when mean > median > mode
    - a tail of high scores pulls the mean up (positive skew)
  - this asymmetry is called skew
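The ordering rule can be checked with a few lines of Python. This is a sketch only: the helper name `skew_direction` and both data sets are invented for illustration (they are not the data behind the values above), and the rule is treated as a rule of thumb, not a formal test of skew.

```python
from statistics import mean, median, mode


def skew_direction(scores):
    """Classify skew from the ordering of mean, median, and mode."""
    m, md, mo = mean(scores), median(scores), mode(scores)
    if m < md < mo:
        return "negative"   # tail of low scores pulls the mean down
    if m > md > mo:
        return "positive"   # tail of high scores pulls the mean up
    return "no clear skew from this ordering"


low_tail = [2, 9, 13, 14, 15, 15, 15]   # hypothetical scores with a few low values
high_tail = [1, 1, 1, 2, 3, 7, 20]      # hypothetical scores with a few high values
print(skew_direction(low_tail))   # negative
print(skew_direction(high_tail))  # positive
```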
Variability
- in addition to identifying the middle, we need to talk about how variable the data are
- to do this:
  - we use measures of variability
  - these measures tell us how much the individual scores differ from each other
Range
- the simplest measure of variability
- it just tells us the extent of the distribution: how different the largest value is from the smallest
- to calculate the range, just subtract the smallest value from the largest
- when one set of data has a greater range than another, this suggests (but does not prove) that it is the more variable of the two
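A minimal Python sketch of the calculation, comparing two made-up data sets. The name `value_range` is hypothetical, chosen to avoid clashing with Python's built-in `range`.

```python
def value_range(scores):
    """Range: the largest value minus the smallest value."""
    if not scores:
        raise ValueError("the range of an empty list is undefined")
    return max(scores) - min(scores)


set_a = [10, 12, 13, 15]  # hypothetical scores
set_b = [2, 12, 13, 25]   # hypothetical scores, more spread out
print(value_range(set_a))  # 5
print(value_range(set_b))  # 23
```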
The Inter-Quartile Range
- is a more complex measure of variability
- to calculate this value:
  - line up all the scores from lowest to highest
  - divide the scores into 4 equal-sized groups:
    - the median is the dividing line between groups 2 and 3
    - you can also identify the dividing line between groups 1 and 2, and between groups 3 and 4
    - the first gives you a score above which 3/4 of your scores fall (the first quartile)
    - the second gives you a score above which 1/4 of your scores fall (the third quartile)
  - the difference between these two scores is the inter-quartile range
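One way to sketch this in Python is to split the ordered scores at the median and take the median of each half as the dividing lines. This is an assumption for illustration: the text does not say how to place the dividing lines when the scores do not split evenly, and statistical packages use several slightly different conventions. The function name `interquartile_range` is hypothetical.

```python
from statistics import median


def interquartile_range(scores):
    """IQR using the 'median of each half' convention (one of several in use)."""
    ordered = sorted(scores)           # line up the scores from lowest to highest
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two scores")
    half = n // 2
    lower_half = ordered[:half]        # groups 1 and 2
    upper_half = ordered[-half:]       # groups 3 and 4 (for odd N the middle score is left out)
    q1 = median(lower_half)            # dividing line between groups 1 and 2
    q3 = median(upper_half)            # dividing line between groups 3 and 4
    return q3 - q1


print(interquartile_range([1, 3, 5, 7, 9, 11, 13, 15]))  # 8.0
```

With these eight scores the dividing lines fall at 4 and 12, so the middle half of the data spans 8 points.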
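Standard Deviation
- Is the arithmetic measure of variability: like the mean, it is calculated from every score
- Takes into account the difference between each individual score and the group mean
- Calculates a sort of average deviation: the square root of the average squared difference from the mean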
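The "sort of average deviation" can be made concrete with a short Python sketch. It assumes the population form, which divides by N. The text does not say which form is intended, and the sample form divides by N - 1 instead. The function name and the scores are hypothetical.

```python
import math


def standard_deviation(scores):
    """Population standard deviation (divides by N; an assumption, see above)."""
    n = len(scores)
    if n == 0:
        raise ValueError("the standard deviation of an empty list is undefined")
    group_mean = sum(scores) / n
    # squared difference between each individual score and the group mean
    squared_deviations = [(x - group_mean) ** 2 for x in scores]
    variance = sum(squared_deviations) / n   # average squared deviation
    return math.sqrt(variance)               # back to the original units


print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

The standard library's `statistics.pstdev` computes the same population value, and `statistics.stdev` gives the N - 1 version.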
You can pair these measures
- mean and standard deviation
- median and inter-quartile range
- mode and range
- The first pair is the most powerful, and the most frequently used
- The second pair is good for skewed data
- The third pair is rarely used, and only for simple descriptive purposes
Questions? email ajohnson@uwo.ca
This page was last updated on Sunday, February 21, 1999 at 08:28:40 PM