Please note : the data presented in all course material for the statistical module are
generated by computers to demonstrate the methodologies, and should not be confused with
actual clinical information.
Introduction
Exercises
Exercise 1 contains exercises for statistics involving a single group, most commonly used in surveys, epidemiological studies, and quality controls.
The theoretical basis for the normal distribution, 95% confidence intervals, and sample size is discussed in
Contents_1. Probability.
The programs for calculating probabilities can be found in StatPgm 1. Probability of z and t. The programs for calculating sample size and confidence intervals can be found in StatPgm 2a. One Group : Survey for Means and Proportions.
The program to convert a column of data into mean and Standard Deviation or a table of counts can be found in StatPgm 7. Supportive Utilities.
The Microsoft Office package of Word, Excel, and Powerpoint, or similar software, should be open during the exercise. Excel is a useful tool to manipulate data, Powerpoint is useful to edit graphics, and the results should be copied to and edited in a Word file.
1. Estimating a Standard Deviation
Question 1_1
2. Estimating a Mean
Questions 2_1 : Sample Size
Calculate the sample size (number of cases to observe) required for each of the following auditing projects
 To audit blood loss during Caesarean Section, we expect, from past experience, that the Standard Deviation is 500mls
 Calculate the sample size required to establish the amount of blood loss with a 95% confidence interval precision
of ±100mls, ±200mls, and ±400mls
 Produce a bar chart to show the relationship between sample size requirements for estimating blood loss with an expected
Standard Deviation of 500mls and 95% confidence interval precisions of ±100mls, ±200mls, and ±400mls
 We expect that, in twin pregnancies, the baby delivered first is likely to be bigger than the second one, with a
Standard Deviation of the difference of 100g.
 Calculate the sample size (number of sets of twins) required to establish the difference between twin 1 and twin 2, with
a 95% confidence interval precision of ±20g, ±40g, and ±50g
 Produce a bar chart to show the relationship between sample size requirements for estimating difference in birth weight
between twins for 95% confidence interval precisions of ±20g, ±40g, and ±50g
 The ultraviolet lamp (to reduce infections and pollutants) for IVF laboratories should have a mean wave length of 150nm,
and a Standard Deviation of 50nm. In our lamp factory, we wish to establish that the wave length emitted from the lamps we
manufacture complies with specification.
 Calculate the sample size (number of lamps inspected) required to establish the departure from 150nm, with
a 95% confidence interval precision of ±10nm, ±15nm, and ±30nm
 Produce a bar chart to show the relationship between sample size requirements for estimating departure from 150nm for 95%
confidence interval precisions of ±10nm, ±15nm, and ±30nm
Answers 2_1
 To audit blood loss with an expected Standard Deviation of 500mls
 The sample sizes are 99 Caesarean Sections for a 95% confidence interval precision of ±100mls, 27 Caesarean Sections
for a 95% confidence interval precision of ±200mls, 9 Caesarean Sections for a 95% confidence interval precision
of ±400mls
 To establish difference in birth weight between twin 1 and twin 2, with an expected Standard Deviation of the difference of 100g
 The sample sizes are 99 sets of twins for a 95% confidence interval precision of ±20g, 27 sets of twins for a 95%
confidence interval precision of ±40g, and 18 sets of twins for a 95% confidence interval precision of ±50g
 To establish departure from specification in ultraviolet emission, with an expected Standard Deviation of 50nm
 The sample sizes are 99 lamps for a 95% confidence interval precision of ±10nm, 46 lamps for a 95%
confidence interval precision of ±15nm, and 14 lamps for a 95% confidence interval precision of ±30nm
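The answers above are slightly larger than the familiar normal approximation n = (1.96 × Standard Deviation / precision)² would give (97, 25, and 7 for the blood loss audit), which is consistent with a calculation based on the t distribution. The following Python sketch is one way to reproduce them. It assumes that the required n is the smallest sample for which t × Standard Deviation / √n does not exceed the precision, with t taken on n - 1 degrees of freedom. This is an assumption about the method, not a description of the StatPgm program itself. The function names are illustrative, and the t quantile is found numerically so that only the standard library is needed.

```python
import math
from functools import lru_cache


def t_cdf(x, df):
    """P(T <= x) for x >= 0, integrating the t density by Simpson's rule."""
    norm = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    norm /= math.sqrt(df * math.pi)

    def pdf(t):
        return norm * (1 + t * t / df) ** (-(df + 1) / 2)

    steps = 2 * max(50, math.ceil(x / 0.04))  # even count, step <= 0.02
    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


@lru_cache(maxsize=None)
def t_quantile(p, df):
    """Value t with P(T <= t) = p, for 0.5 < p < 1, found by bisection."""
    lo, hi = 0.0, 100.0  # assumes the quantile is below 100
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_size_for_mean(sd, precision, confidence=0.95):
    """Smallest n for which t * sd / sqrt(n) <= precision (t on n - 1 df)."""
    p = 1 - (1 - confidence) / 2
    n = 2
    while t_quantile(p, n - 1) * sd / math.sqrt(n) > precision:
        n += 1
    return n


# The three audits from Questions 2_1: (expected SD, unit, precisions)
for sd, unit, precisions in [(500, "mls", (100, 200, 400)),
                             (100, "g", (20, 40, 50)),
                             (50, "nm", (10, 15, 30))]:
    for precision in precisions:
        n = sample_size_for_mean(sd, precision)
        print(f"SD {sd}{unit}, precision +/-{precision}{unit}: n = {n}")
```

Because the sample size depends only on the ratio of Standard Deviation to precision, the three audits share some results: a ratio of 5 gives the same n whether the units are mls, g, or nm.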
Questions 2_2 : Precision

1660 1400 2350 1210 1050 1870 2120
2100 1680 730 2340 1480 1630 1850
1360 1500 1300 1140 1640 1490 2350
1320 1790 1700 1230 1600 700 1060
1470 1430 1470 1390 2530 1200 1250
2070 1030 1040 2470 1820 1270 1410
1780 680 1240 1630 2510 1410 2150

The table above shows blood loss in mls from 49 Caesarean Sections
 Calculate the mean, the 95% confidence interval of blood loss, and the 95% confidence interval of the mean blood loss
 Based on these observations, can we conclude that blood loss is in excess of the benchmark of 1500mls
 Using the data as a reference, estimate the 90^{th}, 95^{th}, and 99^{th} percentiles of blood loss
 Using the data as a reference, estimate the percentiles of blood loss corresponding to 2000mls, 2250mls, and 2500mls
 Plot all 49 values, and mark the mean, the 95% confidence interval of blood loss, and the 95% confidence interval of the mean
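As a sketch of how these calculations could be carried out outside the StatPgm pages, the following Python fragment computes the mean, Standard Deviation, Standard Error, the 95% confidence intervals, and the 90th, 95th, and 99th percentiles. It assumes that a percentile is estimated as mean + t × Standard Deviation, and that the t values for 48 degrees of freedom can be copied from standard tables. Both are assumptions of this sketch, and the variable names are illustrative only.

```python
import math
import statistics

# Blood loss (mls) from the 49 Caesarean Sections in the table
blood_loss = [
    1660, 1400, 2350, 1210, 1050, 1870, 2120,
    2100, 1680, 730, 2340, 1480, 1630, 1850,
    1360, 1500, 1300, 1140, 1640, 1490, 2350,
    1320, 1790, 1700, 1230, 1600, 700, 1060,
    1470, 1430, 1470, 1390, 2530, 1200, 1250,
    2070, 1030, 1040, 2470, 1820, 1270, 1410,
    1780, 680, 1240, 1630, 2510, 1410, 2150,
]

# t for 48 degrees of freedom, keyed by cumulative probability.
# Copied from standard tables (an assumption of this sketch, not computed).
T48 = {0.90: 1.2994, 0.95: 1.6772, 0.975: 2.0106, 0.99: 2.4066}

n = len(blood_loss)
mean = statistics.mean(blood_loss)
sd = statistics.stdev(blood_loss)   # sample SD, n - 1 denominator
se = sd / math.sqrt(n)              # Standard Error of the mean

# Two tail 95% intervals: for individual measurements and for the mean
ci_values = (mean - T48[0.975] * sd, mean + T48[0.975] * sd)
ci_mean = (mean - T48[0.975] * se, mean + T48[0.975] * se)

# One tail 95% lower limit of the mean, for comparison with a benchmark
one_tail_mean = mean - T48[0.95] * se

# Percentiles estimated from the mean and Standard Deviation
percentiles = {p: mean + T48[p] * sd for p in (0.90, 0.95, 0.99)}

print(f"n = {n}, mean = {mean:.0f}, SD = {sd:.0f}, SE = {se:.1f}")
print(f"95% CI of blood loss: {ci_values[0]:.0f} to {ci_values[1]:.0f}")
print(f"95% CI of mean: {ci_mean[0]:.0f} to {ci_mean[1]:.0f}")
print(f"one tail lower limit of mean: {one_tail_mean:.0f}")
for p, value in percentiles.items():
    print(f"{p:.0%} percentile: {value:.0f}")
```

Results from this sketch may differ from program output by about 1ml because of rounding at intermediate steps.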

Twin 1 Twin 2 Twin 1 Twin 2 Twin 1 Twin 2
1940 1830 1860 1840 2170 2020
2250 2260 1900 1910 1930 2040
1680 1520 2000 1850 1850 1840
1860 1740 2070 2010 1850 2010
2310 2170 1830 1890 2040 2060
2250 2140 2080 1860 2080 1900
2000 1770 1820 1910

The table above shows the birth weights (in g) of twins in order of delivery
 Calculate the paired difference (twin 1 - twin 2) for each pair, and calculate the mean, Standard Deviation, 95%
confidence interval of the difference, and 95% confidence interval of the mean
 Interpret the results as to whether the order of birth is related to difference in the weight of the twins
 Produce a graphic plot to show the relationship of birth weight between the first and second twin, so that whether
one twin is likely to be larger than the other can be easily visualized
 Produce a plot to show the paired differences, their mean, and 95% confidence intervals of the difference and the mean

144 153 154 126 124 142
158 138 156 166 135 148
168 120 147 155 154 166
153 177 153 135 167 137
166 132 125 133 157 158
157 137 147 145 166 164
124 140 161 153 155 150
145 148 143 123 147 131
144 159

In our factory producing ultraviolet light lamps for IVF laboratories, the standard required for emission is a mean wave length of 150nm. The table above shows the wave lengths emitted by 50 lamps sampled for quality control
 Calculate the 95% confidence interval of the wave length emitted
 Interpret the results as to whether the quality of the lamps complies with the standard
 Produce a graphic plot to show the relationship of measurements and the standard
Answers 2_2
 Blood loss from Caesarean Sections
 n = 49, mean = 1569mls, Standard Deviation = 469mls, Standard Error of the mean = 67.0mls
 95% confidence interval of blood loss = >783mls (one tail), 627mls to 2512mls (two tails)
 95% confidence interval of mean = >1457mls (one tail), 1435mls to 1705mls (two tails)
 90^{th} percentile = 2178mls, 95^{th} percentile = 2356mls, and 99^{th} percentile = 2698mls
 2000mls = 82^{nd} percentile, 2250mls = 92^{nd} percentile, and 2500mls = 97^{th} percentile
 The one tail 95% confidence interval of the mean is 1457mls or more, and this includes the benchmark value of 1500mls. These
observations therefore do not support the conclusion that blood loss in this group is in excess of the 1500mls benchmark.
 Paired difference in twins
 n = 20, mean = 60g, Standard Deviation = 113g
 95% confidence interval of paired difference (twin 1 - twin 2) = -177g to 297g
 95% confidence interval of mean of paired difference = >16g (one tail) and 7g to 113g (two tail)
 The two tail 95% confidence interval does not overlap the null (0) value, so a statistically significant difference in
birth weight exists, the first twin weighing more than the second twin. Please note : This is artificially
generated data, and the results are not real.
 Quality control in ultraviolet wave length emission
 n = 50, mean = 148nm, Standard Deviation = 14nm
 95% confidence interval of measurements = 120nm to 175nm
 95% confidence interval of mean = 144nm to 152nm
 The two tail 95% confidence interval of the mean overlaps the standard value (150nm), so the sample mean does not depart
significantly from the standard. The conclusion that the lamps produced conform to the standard can be made.
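The reasoning used for the twins and the lamps is the same: calculate the 95% confidence interval of the mean, then check whether it includes the reference value (0 for the paired difference, 150nm for the lamps). The Python sketch below illustrates this under stated assumptions. The t values (19 and 49 degrees of freedom) are copied from standard tables rather than computed, and the function name mean_ci is illustrative only.

```python
import math
import statistics

# Birth weights (g) read pair by pair from the twins table
twin1 = [1940, 1860, 2170, 2250, 1900, 1930, 1680, 2000, 1850, 1860,
         2070, 1850, 2310, 1830, 2040, 2250, 2080, 2080, 2000, 1820]
twin2 = [1830, 1840, 2020, 2260, 1910, 2040, 1520, 1850, 1840, 1740,
         2010, 2010, 2170, 1890, 2060, 2140, 1860, 1900, 1770, 1910]

# Wave lengths (nm) from the 50 lamps sampled
lamps = [144, 153, 154, 126, 124, 142, 158, 138, 156, 166, 135, 148,
         168, 120, 147, 155, 154, 166, 153, 177, 153, 135, 167, 137,
         166, 132, 125, 133, 157, 158, 157, 137, 147, 145, 166, 164,
         124, 140, 161, 153, 155, 150, 145, 148, 143, 123, 147, 131,
         144, 159]


def mean_ci(values, t_crit):
    """Return (mean, sd, lower, upper) for the interval mean +/- t * SE."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD, n - 1 denominator
    half_width = t_crit * sd / math.sqrt(n)
    return mean, sd, mean - half_width, mean + half_width


# Paired differences, twin 1 minus twin 2
diffs = [a - b for a, b in zip(twin1, twin2)]

# t values copied from standard tables (assumed, not computed here)
T19_TWO_TAIL, T19_ONE_TAIL, T49_TWO_TAIL = 2.0930, 1.7291, 2.0096

d_mean, d_sd, d_lo, d_hi = mean_ci(diffs, T19_TWO_TAIL)
d_one_tail_lo = mean_ci(diffs, T19_ONE_TAIL)[2]
l_mean, l_sd, l_lo, l_hi = mean_ci(lamps, T49_TWO_TAIL)

twins_differ = not (d_lo <= 0 <= d_hi)     # interval excludes 0
lamps_comply = l_lo <= 150 <= l_hi         # interval includes 150nm

print(f"twins: mean {d_mean:.0f}g, SD {d_sd:.0f}g, "
      f"two tail {d_lo:.0f}g to {d_hi:.0f}g, one tail >{d_one_tail_lo:.0f}g")
print(f"lamps: mean {l_mean:.0f}nm, SD {l_sd:.0f}nm, "
      f"two tail {l_lo:.0f}nm to {l_hi:.0f}nm")
print("twins differ:", twins_differ, "| lamps comply:", lamps_comply)
```

Working on the paired differences, rather than on the two columns of birth weights separately, removes the variation between sets of twins, so that only the within-pair difference is tested.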
3. Estimating a Proportion
Questions 3_1 : Sample Size
Questions 3_2 : Precision
Calculate the following from the data collected
 In an audit of a hospital over one week, there were 77 deliveries, 27 of them by Caesarean Section
 What is the Caesarean Section rate
 If we use the 95% confidence interval to define the benchmark, can we conclude that the Caesarean Section
rate from this audit is significantly different to the 25% benchmark
 If we use the 95% confidence interval to define the benchmark, can we conclude that the Caesarean Section
rate from this audit is significantly higher than the 25% benchmark
 Produce a graphic representation of the one and two tail 95% confidence intervals, set against the 25% benchmark
 We observed 500 babies born in a month, and found 240 girls
 What is the percentage of girls in that month
 If we use the 95% confidence interval to make our statistical decision, can we conclude that the proportion of girls is
significantly less than the 49% we expect
 If we use the 95% confidence interval to make our statistical decision, can we conclude that the proportion of girls is
significantly different to the 49% we expect
 From these results, can we confirm or reject the hypothesis that some female fetuses were aborted since ultrasound sex
identification of the fetus became available.
 Produce a graphic representation of the one and two tail 95% confidence intervals for proportion of girls,
set against the 49% benchmark
 An audit of an IVF unit over one month found 100 cycles with 22 pregnancies
 What is the pregnancy rate per cycle
 If we use the 95% confidence interval to define the benchmark, can we conclude that the pregnancy
rate from this audit is significantly different to the 30% benchmark
 If we use the 95% confidence interval to define the benchmark, can we conclude the pregnancy
rate from this audit is significantly lower than the 30% benchmark
 Produce a graphic representation of the one and two tail 95% confidence intervals, set against the 30% benchmark
Answers 3_2
 From an audit of 77 deliveries with 27 Caesarean Sections
 The Caesarean Section rate from this audit is 35%
 This is not significantly different to 25%, as the two tail 95% confidence interval is 24% to 46%, overlapping 25%
 This is significantly higher than 25%, and the one tail (right side) 95% confidence interval is 26% or more, not
overlapping 25%

 From observing 500 newborns, 240 of which are girls
 The girls constitute 48% of newborns
 The 95% confidence interval is 51.7% or less for the one tail model, and 43.6% to 52.4% for the two tail model; both include
49%, so the proportion is not statistically different from 49%
 This set of observations does not support the hypothesis that there are fewer girls than expected, therefore it does
not support the hypothesis that more female fetuses are lost than male fetuses, whatever the cause.
 The absence of significant difference is not the same as significant sameness, as the sample size is 500, well short of
the 9601 cases required to detect a difference of 1%
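The figure of 9601 is what the usual normal approximation formula for a proportion, n = z² × p × (1 - p) / precision², gives for a precision of 1%. The short Python sketch below assumes z = 1.96 and p = 0.49; these inputs are assumptions chosen because they reproduce the quoted figure, not values taken from the StatPgm program. The function name is illustrative.

```python
import math


def sample_size_for_proportion(p, precision, z=1.96):
    """Cases needed so that z * sqrt(p * (1 - p) / n) <= precision."""
    raw = z * z * p * (1 - p) / precision ** 2
    # Round first so floating point noise cannot push an exact
    # integer result up to the next whole number.
    return math.ceil(round(raw, 6))


# Expected proportion of girls 49%, precision of 1 percentage point
n_required = sample_size_for_proportion(0.49, 0.01)
print("cases required:", n_required)
```

The required sample size grows with the inverse square of the precision, so halving the detectable difference multiplies the number of cases by four.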

 From an audit of 100 IVF cycles with 22 Pregnancies
 The pregnancy rate from the audit is 22%
 This is not significantly different to 30%, as the two tail 95% confidence interval is 13.9% to 30.1%, overlapping 30%
 This is significantly lower than 30%, and the one tail (left side) 95% confidence interval is 28.8% or less, not
overlapping 30%
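The three sets of answers above follow the same steps: calculate the proportion, its Standard Error, and the one and two tail 95% confidence limits, then compare them with the benchmark. The Python sketch below shows these steps using the normal approximation, with z = 1.96 (two tail) and z = 1.645 (one tail). The normal approximation is an assumption here, and the function name and dictionary keys are illustrative only.

```python
import math


def proportion_ci(events, n, z_two=1.96, z_one=1.645):
    """95% confidence limits of a proportion by the normal approximation."""
    p = events / n
    se = math.sqrt(p * (1 - p) / n)        # Standard Error of the proportion
    return {
        "rate": p,
        "two_tail": (p - z_two * se, p + z_two * se),
        "one_tail_lower": p - z_one * se,  # interval is "this value or more"
        "one_tail_upper": p + z_one * se,  # interval is "this value or less"
    }


# (label, events, sample size, benchmark) for the three audits
audits = [
    ("Caesarean Section", 27, 77, 0.25),
    ("Girls", 240, 500, 0.49),
    ("IVF pregnancy", 22, 100, 0.30),
]

for label, events, n, benchmark in audits:
    ci = proportion_ci(events, n)
    lo, hi = ci["two_tail"]
    differs = not (lo <= benchmark <= hi)  # two tail interval excludes benchmark
    print(f"{label}: rate {ci['rate']:.1%}, two tail {lo:.1%} to {hi:.1%}, "
          f"one tail >= {ci['one_tail_lower']:.1%} or <= {ci['one_tail_upper']:.1%}, "
          f"differs from {benchmark:.0%}: {differs}")
```

Only one of the two one tail limits is relevant to each audit: the lower limit when testing for a rate higher than the benchmark (Caesarean Section), and the upper limit when testing for a rate lower than the benchmark (girls, IVF pregnancy).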

 Please note : the 95% CI horizontal lines are in fact much closer to the vertical benchmark lines (25% and 30%) than shown; I have changed
the plotted data so that they can be seen to just cross or miss the benchmark line. Students should not be alarmed if their own graphics show
the horizontal lines touching the vertical line.
