Standard Deviation & Variance
Dr. Harinatha Reddy
Sri Krishnadevaraya University
Department of Microbiology
Standard deviation:
• In statistics, the standard deviation is a measure used to quantify the amount of variation or dispersion in a set of data.
• It is represented by the Greek letter σ (sigma), or in short form by S or SD.
• It is also known as the root mean square deviation.
Formula for ungrouped data:
$SD = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N}}$ or $SD = \sqrt{\dfrac{\sum dx^2}{N}}$, where $dx = X - \bar{X}$
SD for Ungrouped data:
Example 1: A hen lays eight eggs. Each egg was weighed and
recorded as follows:
60, 56, 61, 68, 51, 53, 69, 54
S.No.   Observation (X)   Deviation (dx = X - X̄)   Squared deviation (dx²)
1       60                60-59 = 1                 1×1 = 1
2       56                56-59 = -3                (-3)×(-3) = 9
3       61                61-59 = 2                 2×2 = 4
4       68                68-59 = 9                 81
5       51                51-59 = -8                64
6       53                53-59 = -6                36
7       69                69-59 = 10                100
8       54                54-59 = -5                25
N = 8   ∑X = 472                                    ∑dx² = 320
Mean (X̄) = ∑X / N
= 472 / 8
= 59
$SD = \sqrt{\dfrac{\sum dx^2}{N}} = \sqrt{\dfrac{320}{8}} = \sqrt{40} \approx 6.32$
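The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch that follows the slide's convention of dividing by N (the population form); the variable names are illustrative and not part of the original slides.

```python
import math
import statistics

# Egg weights from Example 1.
weights = [60, 56, 61, 68, 51, 53, 69, 54]

n = len(weights)
mean = sum(weights) / n                             # 472 / 8
squared_devs = [(x - mean) ** 2 for x in weights]   # the dx^2 column
sd = math.sqrt(sum(squared_devs) / n)               # sqrt(320 / 8)

print(mean, sum(squared_devs), round(sd, 2))        # 59.0 320.0 6.32

# Cross-check against the standard library's population SD (divides by N).
assert math.isclose(sd, statistics.pstdev(weights))
```

Note that `statistics.stdev` would divide by N - 1 instead and give a slightly larger value, so `pstdev` is the call that matches this example.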
Exercise 1: Calculate SD for ungrouped data:
• 10,15,20,25,4,8
Standard deviation for grouped data (discrete variables):
$SD = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{n}}$
where $n = \sum f$
Example: Standard deviation for grouped data (discrete variables):
Workers (X)   Frequency (f)
0             1
1             1
2             2
3             3
4             6
5             5
6             4
7             3
8             3
9             2
First calculate the mean. For discrete data, Mean (X̄) = ∑fx / ∑f
Workers (X)   Frequency (f)   fx
0             1               0
1             1               1
2             2               4
3             3               9
4             6               24
5             5               25
6             4               24
7             3               21
8             3               24
9             2               18
              ∑f = 30         ∑fx = 150
Mean (X̄) = ∑fx / ∑f = 150 / 30 = 5
Workers (X)   Frequency (f)   fx        X - X̄      (X - X̄)²   f(X - X̄)²
0             1               0         0-5 = -5   25         25×1 = 25
1             1               1         1-5 = -4   16         16×1 = 16
2             2               4         2-5 = -3   9          9×2 = 18
3             3               9         -2         4          12
4             6               24        -1         1          6
5             5               25        0          0          0
6             4               24        1          1          4
7             3               21        2          4          12
8             3               24        3          9          27
9             2               18        4          16         32
              ∑f = 30         ∑fx = 150                       ∑f(X - X̄)² = 152
$SD = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{n}} = \sqrt{\dfrac{152}{30}} \approx 2.25$
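The frequency-table calculation above can be sketched in Python as follows. The helper name `grouped_sd` is hypothetical, and the sketch assumes the divisor is n = ∑f, as stated for this formula.

```python
import math

# Workers (X) and their frequencies (f) from the example table.
workers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
frequency = [1, 1, 2, 3, 6, 5, 4, 3, 3, 2]

def grouped_sd(values, freqs):
    """Hypothetical helper: mean, sum of f*(X - mean)^2, and SD with divisor n = sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    ss = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return mean, ss, math.sqrt(ss / n)

mean, ss, sd = grouped_sd(workers, frequency)
print(mean, ss, round(sd, 2))   # 5.0 152.0 2.25
```

The same helper should work for any discrete frequency table, such as the blood cell exercise, by swapping in the two lists.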
Exercise 1: Calculate the SD for the following discrete data:
Blood cells   No. of days
90            5
55            9
60            5
70            4
80            10
100           20
Standard deviation for grouped data (continuous series)
Table 1. Number of hours per week spent watching television
Hours     Number of students
10-14     2
15-19     12
20-24     23
25-29     60
30-34     77
35-39     38
40-44     8
First calculate the mean: Mean (X̄) = ∑fM / ∑f
M = midpoint of each class (also written as X)
Hours      Midpoint (X)   Frequency (f)   fX
10 to 14   12             2               24
15 to 19   17             12              204
20 to 24   22             23              506
25 to 29   27             60              1,620
30 to 34   32             77              2,464
35 to 39   37             38              1,406
40 to 44   42             8               336
                          ∑f = 220        ∑fX = 6,560
Mean (X̄) = ∑fM / ∑f = 6,560 / 220 = 29.82
Hours   Midpoint (M or X)   Frequency (f)   fM or fX         M - X̄                (M - X̄)²           f(M - X̄)²
10-14   12                  2               12×2 = 24        12-29.82 = -17.82    (-17.82)² = 317.6   317.6×2 = 635.2
15-19   17                  12              17×12 = 204      17-29.82 = -12.82    (-12.82)² = 164.4   164.4×12 = 1,972.8
20-24   22                  23              22×23 = 506      -7.82                61.2                1,407.6
25-29   27                  60              27×60 = 1,620    -2.82                8.0                 480.0
30-34   32                  77              32×77 = 2,464    2.18                 4.8                 369.6
35-39   37                  38              37×38 = 1,406    7.18                 51.6                1,960.8
40-44   42                  8               42×8 = 336       12.18                148.4               1,187.2
                            ∑f = 220        ∑fX = 6,560                                               ∑f(M - X̄)² = 8,013.2
Mean (X̄) = 29.82
$SD = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{n}}$, where $n = \sum f$
The class midpoint (M) is written as X in the equation.
$SD = \sqrt{\dfrac{8013.2}{220}} = \sqrt{36.42} \approx 6.03$
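For a continuous series the only extra step is replacing each class by its midpoint. The sketch below assumes the midpoint is (lower limit + upper limit) / 2 and that the divisor is n = ∑f; the variable names are illustrative.

```python
import math

# Class limits (hours) and number of students from Table 1.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
students = [2, 12, 23, 60, 77, 38, 8]

midpoints = [(lo + hi) / 2 for lo, hi in classes]    # 12, 17, 22, ...
n = sum(students)
mean = sum(f * m for m, f in zip(midpoints, students)) / n
ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, students))
sd = math.sqrt(ss / n)

print(n, round(mean, 2), round(ss, 1), round(sd, 2))  # 220 29.82 8002.7 6.03
```

The unrounded sum of squares comes out near 8,002.7 rather than the 8,013.2 in the table, because the table rounds the mean and each squared deviation before multiplying by f. The SD still rounds to 6.03 either way.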
Coefficient of standard deviation
Coefficient of standard deviation = Standard deviation (SD) / Arithmetic mean (X̄) = $\dfrac{SD}{\bar{X}}$
Mean (X̄) = 29.82
Coefficient of SD = $\dfrac{SD}{\bar{X}} = \dfrac{6.03}{29.82} \approx 0.202$
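As a quick check of this ratio, the snippet below divides the SD of the television example (6.03) by its mean as computed in the table (29.82). The variable names are illustrative only.

```python
# Values taken from the worked television-hours example.
sd = 6.03
mean = 29.82

coefficient_of_sd = sd / mean
print(round(coefficient_of_sd, 3))   # 0.202
```

The coefficient is a pure ratio with no units, so it can be used to compare the spread of data sets measured on different scales.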
Uses of standard deviation:
• Standard deviation is based on all the observations.
• Of all the measures of dispersion, standard deviation is considered the best because it is the least affected by sampling fluctuations.
• It is used in finding the standard error.
Variance
Variance:
• The variance is the arithmetic mean of the squared deviations from the mean of the data.
• It is represented by s² or σ².
• Formula for ungrouped data:
$s^2 = \dfrac{\sum (X - \bar{X})^2}{N - 1}$
Example: Variance for ungrouped data:
• Observations: 23, 22, 20, 24, 16, 17, 18, 19, 21
• Ascending order: 16, 17, 18, 19, 20, 21, 22, 23, 24
• Mean: 180/9 = 20
S.No.   Observation (X)   Deviation from mean (dx = X - X̄)   Squared deviation (X - X̄)² or dx²
1       16                16-20 = -4                          (-4)² = 16
2       17                17-20 = -3                          (-3)² = 9
3       18                18-20 = -2                          4
4       19                19-20 = -1                          1
5       20                20-20 = 0                           0
6       21                21-20 = 1                           1
7       22                22-20 = 2                           4
8       23                23-20 = 3                           9
9       24                24-20 = 4                           16
N = 9   ∑X = 180                                              ∑(X - X̄)² = 60
$s^2 = \dfrac{\sum (X - \bar{X})^2}{N - 1} = \dfrac{60}{9 - 1} = \dfrac{60}{8} = 7.5$
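The variance example can be reproduced in Python as a sanity check. This sketch uses the N - 1 divisor shown in the worked calculation (60/(9-1)), which is what `statistics.variance` in the standard library also uses; the variable names are illustrative.

```python
import math
import statistics

# Observations from the ungrouped variance example.
observations = [23, 22, 20, 24, 16, 17, 18, 19, 21]

n = len(observations)
mean = sum(observations) / n                      # 180 / 9
ss = sum((x - mean) ** 2 for x in observations)   # sum of squared deviations
sample_variance = ss / (n - 1)                    # 60 / 8

print(mean, ss, sample_variance)                  # 20.0 60.0 7.5

# Cross-check with the standard library's sample variance (divides by N - 1).
assert math.isclose(sample_variance, statistics.variance(observations))
```

Dividing by N instead, as `statistics.pvariance` does, would give 60/9 ≈ 6.67, so it matters which convention an exercise expects.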
Exercise 1: Calculate the variance for the ungrouped data:
• 10, 2, 8, 6, 15, 20, 4, 5
Variance for grouped data (continuous series):
Variance of grouped data formula:
$s^2 = \dfrac{\sum f (M - \bar{X})^2}{N - 1}$
X̄: mean
M or X: midpoint of the class interval
N = ∑f (total frequency)
The first step in finding the variance for grouped data is to calculate the mean:
Mean (X̄) = ∑fM / ∑f
Example: Variance for grouped data:
Age     H1N1 patients
31-35   2
36-40   3
41-45   8
46-50   12
51-55   16
56-60   5
61-65   2
66-70   2
The first step is to calculate the mean: Mean (X̄) = ∑fM / ∑f
Class interval   Midpoint (M or X)   Frequency (f)   fM or fX
31-35            33                  2               66
36-40            38                  3               114
41-45            43                  8               344
46-50            48                  12              576
51-55            53                  16              848
56-60            58                  5               290
61-65            63                  2               126
66-70            68                  2               136
                                     ∑f = 50         ∑fM = 2500
Mean (X̄) = ∑fM / ∑f = 2500 / 50 = 50
Mean (X̄) = 50
Class interval   Midpoint (M or X)   Frequency (f)   fM or fX     (M - X̄)        (M - X̄)²        f(M - X̄)²
31-35            33                  2               66           33-50 = -17    (-17)² = 289    289×2 = 578
36-40            38                  3               114          38-50 = -12    144             144×3 = 432
41-45            43                  8               344          -7             49              392
46-50            48                  12              576          -2             4               48
51-55            53                  16              848          3              9               144
56-60            58                  5               290          8              64              320
61-65            63                  2               126          13             169             338
66-70            68                  2               136          18             324             648
                                     ∑f = 50         ∑fM = 2500                  ∑(M - X̄)² = 1052   ∑f(M - X̄)² = 2900
$s^2 = \dfrac{\sum f (M - \bar{X})^2}{N - 1} = \dfrac{2900}{50 - 1} = \dfrac{2900}{49} = 59.18$
$SD = \sqrt{59.18} \approx 7.69$
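The grouped-variance table can be rebuilt row by row in Python. This is a sketch under the same assumptions as the example: midpoints are (lower + upper) / 2 and the divisor is N - 1 with N = ∑f. The variable names are illustrative.

```python
import math

# Age classes and H1N1 patient counts from the example.
classes = [(31, 35), (36, 40), (41, 45), (46, 50),
           (51, 55), (56, 60), (61, 65), (66, 70)]
patients = [2, 3, 8, 12, 16, 5, 2, 2]

midpoints = [(lo + hi) / 2 for lo, hi in classes]          # 33, 38, ..., 68
n = sum(patients)
mean = sum(f * m for m, f in zip(midpoints, patients)) / n

# One entry per table row: f * (M - mean)^2
weighted_sq = [f * (m - mean) ** 2 for m, f in zip(midpoints, patients)]
ss = sum(weighted_sq)

variance = ss / (n - 1)
sd = math.sqrt(variance)

print(weighted_sq)   # [578.0, 432.0, 392.0, 48.0, 144.0, 320.0, 338.0, 648.0]
print(mean, ss, round(variance, 2), round(sd, 2))   # 50.0 2900.0 59.18 7.69
```

The 61-65 row works out to 169 × 2 = 338, which is the value consistent with the column total of 2900.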
Coefficient of Variation (CV)
$CV = \dfrac{\text{Standard deviation}}{\text{Mean}} \times 100$
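A small function makes the CV formula concrete. The function name `coefficient_of_variation` is hypothetical, and the sample call simply reuses the SD (7.69) and mean (50) from the H1N1 grouped-data example; that pairing is an illustration, not a result stated in the slides.

```python
def coefficient_of_variation(sd, mean):
    """Return the CV as a percentage: (SD / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100

# Illustration with the H1N1 example values: SD = 7.69, mean = 50.
cv = coefficient_of_variation(7.69, 50)
print(round(cv, 2))   # 15.38
```

Because the CV is expressed as a percentage of the mean, it is 100 times the coefficient of standard deviation for the same data.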
Calculate the variance, standard deviation and coefficient of variation for the data.
Yield of wheat per hectare   No. of wheat fields
10-20                        22
20-30                        5
30-40                        2
50-60                        12
60-70                        16
70-80                        10
First find the SD, then the CV:
CV = (Standard deviation / Mean) × 100
Class interval   Midpoint (M or X)   Frequency (f)   fM or fX   (X - X̄) or (M - X̄)   (X - X̄)² or (M - X̄)²   f(X - X̄)² or f(M - X̄)²
10-20                                22
20-30                                5
30-40                                2
50-60                                12
60-70                                16
70-80                                10
                                     ∑f =            ∑fM =                            ∑(X - X̄)² =             ∑f(X - X̄)² =
Significance of Variance:
• It is easy to calculate.
• It indicates the variability clearly.
• The variance is the most informative among the measures of
dispersion for populations.
• It is the most frequently used measure of variation in data, especially with normal, binomial or Poisson distributions.
Biostatistics Standard deviation and variance
