A groundwork for risk assessment




                                   Diane Christina | 2009
   Descriptive statistics are used to describe the main
    features of a collection of data in quantitative terms

   Inferential statistics comprise the use of statistics
    and random sampling to draw conclusions
    about some unknown aspect of a population
   [Diagram: Population (mean) → Random sample → Sample (mean);
    the sample mean is calculated to estimate the population mean]
   Measures of central tendency
    (Mean, Median, Mode)

   Measures of dispersion
    (variance, standard deviation)

   Measures of shape
    (skewness)
   Mean
     Arithmetic Mean


     Geometric Mean


   Median

   Mode

   Quartiles
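
The measures of central tendency listed above can be sketched with Python's standard `statistics` module. This is only an illustration: the 15 observations are the ones from the stem-and-leaf slide in this deck, the small `repeated` list is a hypothetical sample added so that a mode exists, and the quartiles use the module's default "exclusive" method, which is one of several quartile conventions.

```python
import statistics

# The 15 observations from the stem-and-leaf slide of this deck.
data = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 23, 59, 73, 75, 83]

arithmetic_mean = statistics.mean(data)
geometric_mean = statistics.geometric_mean(data)  # n-th root of the product
median = statistics.median(data)                  # middle value of the sorted data

# Quartiles: the three cut points that split the sorted data into four parts.
# The default method ("exclusive") is an assumption; other conventions exist.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Hypothetical sample with a repeated value, used only to illustrate the mode.
repeated = [2, 3, 3, 5, 7]
mode = statistics.mode(repeated)

print(f"arithmetic mean = {arithmetic_mean:.2f}")
print(f"geometric mean  = {geometric_mean:.2f}")
print(f"median          = {median}")
print(f"quartiles       = {q1}, {q2}, {q3}")
print(f"mode of {repeated} = {mode}")
```

The geometric mean is never larger than the arithmetic mean for positive data, which is a quick sanity check on the output.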
   Range: the difference between the largest and the
    smallest value in the data set

   Interquartile range: the range of values between the
    first and the third quartile, $IQR = Q_3 - Q_1$

   Mean absolute deviation: $MAD = \frac{\sum |x - \bar{x}|}{n}$

   Variance (for sample variance): $S^2 = \frac{\sum X^2 - (\sum X)^2 / n}{n - 1}$

   Standard deviation: $S = \sqrt{S^2}$
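
As a rough sketch, the dispersion measures above can be computed directly from their definitions. The data are the 15 observations from the stem-and-leaf slide of this deck; the quartile method (the `statistics` default) is an assumption, and the sample variance uses the computational formula with n - 1 in the denominator.

```python
import math
import statistics

# The 15 observations from the stem-and-leaf slide of this deck.
data = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 23, 59, 73, 75, 83]
n = len(data)
mean = sum(data) / n

# Range: largest value minus smallest value.
value_range = max(data) - min(data)

# Interquartile range: third quartile minus first quartile
# (default "exclusive" quartile method, an assumed convention).
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Mean absolute deviation: average absolute distance from the mean.
mad = sum(abs(x - mean) for x in data) / n

# Sample variance via the computational formula:
# S^2 = (sum(X^2) - (sum(X))^2 / n) / (n - 1)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(f"range = {value_range}, IQR = {iqr}")
print(f"MAD = {mad:.2f}, S^2 = {variance:.2f}, S = {std_dev:.2f}")
```

The hand-computed variance should agree with `statistics.variance(data)`, which is a convenient cross-check.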
Interpretation of Standard Deviation
   E.g. µ = 100, σ = 15

   • ± 1σ = 85 to 115 (68% of values)
   • ± 2σ = 70 to 130 (95% of values)
   • ± 3σ = 55 to 145 (99.7% of values)

   [Figure: bell curve; x-axis: Value Changes, y-axis: Frequency]
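
The 68% / 95% / 99.7% figures can be checked numerically. The sketch below assumes the values follow a normal distribution with µ = 100 and σ = 15, as in the example, and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Normal distribution from the example: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

coverage = {}
for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    # Share of values that fall within k standard deviations of the mean.
    coverage[k] = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} sigma: {low:.0f} to {high:.0f} covers {coverage[k]:.1%}")
```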
Skewness
      is a measure of the asymmetry of the probability
         distribution of a real-valued random variable

   Negatively skewed (skewed to the left): Mean < Median < Mode
   Positively skewed (skewed to the right): Mode < Median < Mean

   $Sk = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$
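
A minimal sketch of the Sk coefficient, assuming the sample standard deviation is used in the denominator (the slide does not say which one). The function name `pearson_skewness` is illustrative; the data are the 15 observations from the stem-and-leaf slide of this deck.

```python
import statistics

def pearson_skewness(values):
    """Sk = 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (assumption)
    return 3 * (mean - median) / std_dev

# The 15 observations from the stem-and-leaf slide of this deck.
data = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 23, 59, 73, 75, 83]
sk = pearson_skewness(data)

# A negative value indicates a left skew, a positive value a right skew.
print(f"Sk = {sk:.3f}")
```

For these data the mean lies below the median, so the coefficient comes out negative (skewed to the left).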
Class Interval   Frequency   Mid Point   Relative Frequency   Cumulative Frequency
 20 ≤ x < 30         6          25              .12                    6
 30 ≤ x < 40        18          35              .36                   24
 40 ≤ x < 50        11          45              .22                   35
 50 ≤ x < 60        11          55              .22                   46
 60 ≤ x < 70         3          65              .06                   49
 70 ≤ x < 80         1          75              .02                   50
   Totals           50                         1.00
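
The derived columns of the frequency distribution follow mechanically from the class limits and the class frequencies. A short sketch, using the class intervals and frequencies from the table (variable names are illustrative):

```python
from itertools import accumulate

# Class intervals (lower limit inclusive, upper limit exclusive) and frequencies.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)

# Mid point: halfway between the class limits.
midpoints = [(low + high) / 2 for low, high in classes]

# Relative frequency: class frequency divided by the total.
relative = [f / total for f in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for (low, high), f, mid, rel, cum in zip(
    classes, frequencies, midpoints, relative, cumulative
):
    print(f"{low} <= x < {high}  {f:>3}  {mid:>5.0f}  {rel:.2f}  {cum:>3}")
```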
Data:
 86   77   91   60   55
 76   92   47   88   67
 23   59   73   75   83

 STEM   LEAF
   2    3
   4    7
   5    5   9
   6    0   7
   7    3   5   6   7
   8    3   6   8
   9    1   2
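
A stem-and-leaf plot for two-digit values can be built by splitting each value into its tens digit (stem) and units digit (leaf). The sketch below uses the 15 observations shown above; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem) with sorted units digits (leaves)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 23, 59, 73, 75, 83]
plot = stem_and_leaf(data)

print("STEM  LEAF")
for stem, leaves in plot.items():
    print(f"{stem:>4}  " + " ".join(str(leaf) for leaf in leaves))
```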
To determine the likelihood of an event
Methods of assigning probabilities:
 Classical (a priori) probability
 Relative frequency of occurrence
 Subjective probability
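
The first two methods can be illustrated with a few lines of Python; subjective probability is a judgment and has no formula. In this sketch the fair six-sided die is an assumed example, the 18-out-of-50 count is the 30 ≤ x < 40 class from the frequency table in this deck, and both function names are hypothetical.

```python
from fractions import Fraction

def classical_probability(favorable_outcomes, possible_outcomes):
    """A priori probability: favorable outcomes over equally likely outcomes."""
    return Fraction(favorable_outcomes, possible_outcomes)

def relative_frequency(times_occurred, total_observations):
    """Probability estimated from how often the event occurred in past data."""
    return times_occurred / total_observations

# Classical: an even number on a fair six-sided die (assumed example).
p_even = classical_probability(3, 6)

# Relative frequency: 18 of 50 observations fell in the class 30 <= x < 40.
p_class = relative_frequency(18, 50)

print(f"P(even) = {p_even}, P(30 <= x < 40) = {p_class:.2f}")
```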
   General law of addition
          $P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$
   Special law of addition (X and Y mutually exclusive)
          $P(X \cup Y) = P(X) + P(Y)$
   General law of multiplication
          $P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$
   Special law of multiplication (X and Y independent)
          $P(X \cap Y) = P(X) \cdot P(Y)$
   Law of conditional probability
          $P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} = \frac{P(X) \cdot P(Y \mid X)}{P(Y)}$
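
These laws can be checked by brute force on a small sample space. The sketch below assumes two fair six-sided dice, which is an illustrative choice and not part of the slides; the events X, Y, Z and W and the helpers `prob` and `cond` are likewise hypothetical names.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice (assumed example).
space = set(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability of an event (a subset of the sample space)."""
    return Fraction(len(event), len(space))

def cond(a, b):
    """Conditional probability P(a | b)."""
    return prob(a & b) / prob(b)

X = {o for o in space if o[0] == 6}       # first die shows 6
Y = {o for o in space if sum(o) >= 10}    # total is at least 10
Z = {o for o in space if o[1] % 2 == 0}   # second die is even (independent of X)
W = {o for o in space if o[0] == 1}       # first die shows 1 (excludes X)

# General law of addition.
assert prob(X | Y) == prob(X) + prob(Y) - prob(X & Y)
# Special law of addition: X and W are mutually exclusive.
assert prob(X | W) == prob(X) + prob(W)
# General law of multiplication.
assert prob(X & Y) == prob(X) * cond(Y, X) == prob(Y) * cond(X, Y)
# Special law of multiplication: X and Z are independent.
assert prob(X & Z) == prob(X) * prob(Z)
# Law of conditional probability.
assert cond(X, Y) == prob(X) * cond(Y, X) / prob(Y)

print(f"P(X and Y) = {prob(X & Y)}, P(X | Y) = {cond(X, Y)}")
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point tolerance.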
Construct a risk model and measure the degree of relatedness of variables
Find the equation of the regression line
             $\hat{Y} = b_0 + b_1 X$

   where the estimated Y intercept is
             $b_0 = \bar{Y} - b_1 \bar{X}$

   and the estimated slope is
             $b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum XY - (\sum X)(\sum Y)/n}{\sum X^2 - (\sum X)^2/n}$
Hospital   Number of beds (X)   Full Time Employees (Y)
    1             23                    69
    2             29                    95
    3             29                   102
    4             35                   118
    5             42                   126
    6             46                   125
    7             50                   138
    8             54                   178
    9             64                   156
   10             66                   184
   11             76                   176
   12             78                   225

   $\hat{Y} = b_0 + b_1 X$
   $\hat{Y} = 30.9125 + 2.232 X$
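
The fitted line for the hospital data can be reproduced from the sums of squares. This is a plain-Python sketch of the least-squares formulas for b0 and b1; the variable names and the `predict` helper are illustrative, and the 60-bed query at the end is a hypothetical input.

```python
# Hospital data: number of beds (X) and full-time employees (Y).
beds = [23, 29, 29, 35, 42, 46, 50, 54, 64, 66, 76, 78]
employees = [69, 95, 102, 118, 126, 125, 138, 178, 156, 184, 176, 225]

n = len(beds)
sum_x = sum(beds)
sum_y = sum(employees)

# SSxy = sum(XY) - sum(X) * sum(Y) / n
ss_xy = sum(x * y for x, y in zip(beds, employees)) - sum_x * sum_y / n
# SSxx = sum(X^2) - (sum(X))^2 / n
ss_xx = sum(x * x for x in beds) - sum_x ** 2 / n

b1 = ss_xy / ss_xx                  # slope
b0 = sum_y / n - b1 * sum_x / n     # intercept: mean(Y) - b1 * mean(X)

def predict(x):
    """Predicted number of full-time employees for a hospital with x beds."""
    return b0 + b1 * x

print(f"Y-hat = {b0:.4f} + {b1:.3f} X")
print(f"predicted employees for 60 beds: {predict(60):.1f}")
```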
   Coefficient of determination (r²): a measure of how well
    the regression line approximates the real data points
   The proportion of variability of the dependent
    variable (Y) explained by the independent variable (X)
   r² = 0 ---> no regression prediction of Y by X
   r² = 1 ---> perfect regression prediction of Y by X
       (100% of the variability of Y is accounted for by X)
   r² = Explained Variation / Total Variation
   Total Variation = Explained Variation + Unexplained Variation
    (the total variation of the dependent variable, Y, is measured by the sum of squares of Y, SSyy)

   Explained Variation = sum of squares of regression (SSR)
                     $SSR = \sum_i (\hat{Y}_i - \bar{Y})^2$

   Unexplained Variation = sum of squares of error (SSE)
                     $SSE = \sum_i (Y_i - \hat{Y}_i)^2$
   r² = Explained Variation / Total Variation

   $r^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i Y_i^2 - (\sum_i Y_i)^2 / n}$
Hospital   Number of beds (X)   Full Time Employees (Y)
    1             23                    69
    2             29                    95
    3             29                   102
    4             35                   118
    5             42                   126
    6             46                   125
    7             50                   138
    8             54                   178
    9             64                   156
   10             66                   184
   11             76                   176
   12             78                   225

   $\hat{Y} = b_0 + b_1 X$
   $\hat{Y} = 30.9125 + 2.232 X$
   SSE = 2448.6
   r² = 0.886
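
SSE and r² for the hospital data can be recomputed as a check. The sketch below refits the line with unrounded coefficients, so SSE comes out at about 2448.9 rather than the 2448.6 shown on the slide; the small gap is presumably due to rounding in the slide's hand calculation. Variable names are illustrative.

```python
# Hospital data: number of beds (X) and full-time employees (Y).
beds = [23, 29, 29, 35, 42, 46, 50, 54, 64, 66, 76, 78]
employees = [69, 95, 102, 118, 126, 125, 138, 178, 156, 184, 176, 225]

n = len(beds)
mean_x = sum(beds) / n
mean_y = sum(employees) / n

# Least-squares slope and intercept.
ss_xy = sum(x * y for x, y in zip(beds, employees)) - n * mean_x * mean_y
ss_xx = sum(x * x for x in beds) - n * mean_x ** 2
b1 = ss_xy / ss_xx
b0 = mean_y - b1 * mean_x

predicted = [b0 + b1 * x for x in beds]

# Total variation: SSyy = sum(Y^2) - (sum(Y))^2 / n
ss_yy = sum(y * y for y in employees) - sum(employees) ** 2 / n
# Explained variation: SSR = sum((Y-hat - mean(Y))^2)
ssr = sum((y_hat - mean_y) ** 2 for y_hat in predicted)
# Unexplained variation: SSE = sum((Y - Y-hat)^2)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(employees, predicted))

r_squared = 1 - sse / ss_yy

print(f"SSyy = {ss_yy:.1f}, SSR = {ssr:.1f}, SSE = {sse:.1f}")
print(f"r^2 = {r_squared:.3f}")
```

Here r² is about 0.886, so roughly 89% of the variability in full-time employees is accounted for by the number of beds.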
Diane Christina | 2009

diane.christina@apb-group.com | me@dianechristina.com
                     http://dianechristina.wordpress.com

Business Statistics_an overview
