Please note: the data presented in all course material for the statistical module are
generated by computers to demonstrate the methodologies, and should not be confused with
actual clinical information.
Introduction
Standard Deviation
Mean
Proportion
This page supports the programs in StatPgm 2a. One Group : Survey for Means and Proportions,
and discusses one of the most common research models, the Single Group Survey to establish population Standard Deviations, means, or proportions.
Single group surveys are usually fact-finding exercises. Companies survey consumer needs and preferences (market research), political parties survey voters' intentions and priorities (polls), and social scientists survey opinions, prejudices, and personalities.
In health care, particularly in large public hospitals, single group surveys are used mostly for quality assurance: to establish the current state of outcomes and to set a benchmark.
Note : In this page, associated programs, and this module, error is expressed as the 95% confidence interval of measurements.
Nearly all parametric statistics depend on an accurate Standard Deviation. Once established, it is used in subsequent sample size determination, and to calculate statistical significance and error.
The tolerable error, as a percent of the Standard Deviation, is required for sample size determination. Once the data have been collected, the error is estimated from the sample size actually obtained.
Examples
 We wish to know what the Standard Deviation of birth weight is, with an error margin (95% confidence interval) of ±5%. The sample size required is 770.
After measuring 770 babies, we find the Standard Deviation to be 400g, so the error is 5% of 400g = 20g.
The 95% confidence interval of Standard Deviation of birth weight is therefore 400±20 = 380g to 420g.
 We manage a biochemical laboratory, and have established a new measurement for a hormone. We wish to establish the Standard
Deviation of that hormone in the normal population to a 95% confidence interval of ±10%. The sample size required
is 193 measurements.
After measuring 200 blood samples, we found the Standard Deviation of the hormone to be 100mgm/cc.
With a sample size of 200, the 95% confidence interval for error is 9.8% of the Standard Deviation, so the 95% confidence
interval for the Standard Deviation of the hormone is 100±9.8 = 90.2mgm/cc to 109.8mgm/cc.
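The error figures in these examples can be approximated with a short sketch. It assumes the common large-sample rule that the standard error of a sample Standard Deviation is about SD/sqrt(2n); this page does not state the formula the StatPgm program uses, so the function names and the formula itself are illustrative assumptions. Under this rule the sample size for ±10% comes out at 193, but ±5% gives 769 rather than the 770 quoted above, which suggests the program uses a slightly different calculation.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Two-sided 95% quantile of the standard normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)

def sd_relative_error(n):
    """Approximate 95% CI half-width of a sample SD, as a fraction of the SD.

    Assumes the large-sample rule SE(SD) ~ SD / sqrt(2n); this is an
    assumption, not necessarily the formula used by the StatPgm program.
    """
    return Z95 / sqrt(2 * n)

def sd_sample_size(rel_error):
    """Smallest n whose approximate relative error is within rel_error."""
    return ceil(Z95 ** 2 / (2 * rel_error ** 2))

# Hormone example: SD = 100 found from 200 samples.
sd, n = 100.0, 200
err = sd_relative_error(n) * sd
print(f"n={n}: error = ±{err:.1f}, 95% CI of SD = {sd - err:.1f} to {sd + err:.1f}")
print("sample size for ±10%:", sd_sample_size(0.10))
```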
Estimating the mean of a measurement of interest in a survey is commonly conducted in market research. These include household income, amount of time spent on various activities, amount of money spent on various products, and so on. Estimating mean values is particularly important for industrial quality control, concerning variations from specification, cost of production, sales volume, and so on.
In health care, mean values are commonly established for the normal range of a particular laboratory measurement, volume of blood loss in a particular operation, dosages required for certain drugs to work, duration of hospital stay, and so on. In epidemiology, the same applies to age, weight, BMI and, in fact, anything that is measurable and Normally distributed.
A particularly useful application of the mean is the paired difference model, where the difference within each pair is calculated, and the 95% confidence interval of the mean difference is used to decide whether the difference deviates from null (0).
At the planning stage of a survey, the sample size, the number of cases that need to be examined, is estimated. This depends on a known or expected Standard Deviation of the measurement, and the tolerable error in terms of the 95% Confidence Interval.
Once the data is collected, the 95% confidence interval of the measurement is estimated. This depends on the sample size used, and the Standard Deviation found.
Examples
 We wish to know what the mean birth weight of male babies is. From previous studies, we expect the Standard Deviation of birth weight to be 450g, and we require our results to have a 95% confidence interval of ±100g. The sample size required is 81 boys.
We managed to weigh 80 boys, and found a mean birth weight of 3700g and a Standard Deviation of 500g. With sample size=80 and SD=500, the 95% confidence interval of the measurement is ±111g. The result is therefore that the mean birth weight of boys is 3700±111 = 3589g to 3811g.
 We manage a biochemical laboratory, and have established a measurement for a new hormone. We wish to know what the reference normal value is. From previous studies we know the Standard Deviation of this hormone is 100mgm/cc of blood, and we wish to have our normal value with a 95% confidence interval of ±10mgm/cc. The sample size required is 387 samples of blood from normal subjects.
Having measured 400 samples of blood from normal subjects, we found a mean of 900mgm/cc and a Standard Deviation of 95mgm/cc. With sample size=400 and SD=95, the 95% confidence interval for error is 9.3, so the 95% confidence interval of the mean is mean±9.3 = 900±9.3 = 890.7 to 909.3. We can conclude that the mean value of this hormone in normal individuals is 890.7mgm/cc to 909.3mgm/cc.
 We wish to know whether intelligence (IQ) is affected by birth order. We know that the Standard Deviation of IQ is 10, and we want to know the difference (IQ_{older} - IQ_{younger}) to a precision of ±5. The sample size required is 18 sets of siblings.
We measure IQ from 20 sets of siblings, and calculate the difference diff = (IQ_{older} - IQ_{younger}). The mean of the difference is 3.2, and the Standard Deviation of the difference is 4.8. The precision = 2.2, and the 95% confidence interval of the difference is 3.2-2.2 to 3.2+2.2 = 1.0 to 5.4. As the 95% confidence interval does not overlap the null value (0), we can consider that the difference is significantly different from 0, and that birth order does affect IQ.
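The three examples above can be checked with the following sketch. It assumes the 95% confidence interval of a mean is t × SD / sqrt(n), with t taken from Student's t distribution on n-1 degrees of freedom, and that the sample size is the smallest n meeting the tolerable error. The page does not state the program's formula, so this is an assumption, although it does reproduce the quoted figures. The helper names are hypothetical, and `t95` uses a standard series approximation to the t quantile so that only the Python standard library is needed; `scipy.stats.t.ppf(0.975, df)` would give the exact value.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Two-sided 95% quantile of the standard normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)

def t95(df):
    """Approximate two-sided 95% Student t quantile for df degrees of freedom.

    Series expansion in powers of 1/df around the normal quantile;
    adequate for df of roughly 15 or more, less accurate for very small df.
    """
    z = Z95
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    g3 = (3 * z**7 + 19 * z**5 + 17 * z**3 - 15 * z) / 384
    return z + g1 / df + g2 / df**2 + g3 / df**3

def mean_error(n, sd):
    """95% CI half-width of a mean from n observations with standard deviation sd."""
    return t95(n - 1) * sd / sqrt(n)

def mean_sample_size(sd, error):
    """Smallest n whose 95% CI half-width does not exceed the tolerable error."""
    n = max(2, ceil((Z95 * sd / error) ** 2))  # normal-theory starting point
    while mean_error(n, sd) > error:
        n += 1
    return n

# Planning: (expected SD, tolerable error) for the three examples.
for sd, error in [(450, 100), (100, 10), (10, 5)]:
    print(f"SD={sd}, error=±{error}: n = {mean_sample_size(sd, error)}")

# Analysis: (sample size, SD found, mean found) for the three examples.
for n, sd, mean in [(80, 500, 3700), (400, 95, 900), (20, 4.8, 3.2)]:
    e = mean_error(n, sd)
    print(f"n={n}, SD={sd}: {mean} ± {e:.1f} = {mean - e:.1f} to {mean + e:.1f}")
```

For the paired difference model, the same two functions are applied to the per-pair differences rather than to the raw measurements.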
Estimating the proportion of cases in a survey with a characteristic of interest is one of the most common types of research conducted. Almost weekly, there are published surveys on how many people will vote for a party in an election, whether the government is doing a good job, or how many people will buy a particular product.
In health care, quality control requires a constant survey of admissions, discharges, complication rate, and so on.
At the planning stage of a survey, the sample size, the number of cases that need to be examined, is estimated. This depends on the anticipated proportion and the 95% confidence interval of tolerable error.
After the data is collected, the proportion found and its 95% confidence interval are calculated. This depends on the sample size and the proportion observed.
Examples
 We wish to know what the Caesarean Section rate is in a hospital. We suspect that it may be 20% (proportion=0.2), and the 95% confidence interval we require is ±5% (0.05). Using the expected proportion of 0.2 and tolerable error of 0.05, the sample size required is 246 births.
We examined 250 births, and found 55 Caesarean Sections, so the proportion positive, the observed Caesarean Section rate, was 55/250 = 0.22 (22%). Using the sample size of 250 and observed proportion of 0.22, the 95% confidence interval is ±0.0513 (5.13%). Our Caesarean Section rate is therefore 0.22±0.0513 = 0.1687 to 0.2713 (16.9% to 27.1%). In other words, we are 95% sure that our CS rate is 16.9% to 27.1%.
 We wish to know the proportion of our IVF cycles that result in a viable pregnancy beyond 20 weeks. From past records, we suspect that this may be 20% (proportion = 0.2), and the 95% confidence interval we require is ±5% (0.05). Using the expected proportion of 0.2 and tolerable error of 0.05, the sample size required is 246 cycles.
We examined 250 IVF cycles, and found 55 pregnancies viable beyond 20 weeks, so the proportion positive, the observed viable pregnancy rate, was 55/250 = 0.22 (22%). Using the sample size of 250 and observed proportion of 0.22, the 95% confidence interval is ±0.0513 (5.13%). Our pregnancy rate is therefore 0.22±0.0513 = 0.1687 to 0.2713 (16.9% to 27.1%). In other words, we are 95% sure that 16.9% to 27.1% of our IVF cycles result in a pregnancy viable beyond 20 weeks.
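The figures in both proportion examples are consistent with the usual normal approximation, in which the 95% confidence interval is 1.96 × sqrt(p(1-p)/n). The sketch below assumes that formula, which the page does not state explicitly, and uses hypothetical function names.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Two-sided 95% quantile of the standard normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)

def proportion_error(n, p):
    """95% CI half-width of a proportion p observed in n cases (normal approximation)."""
    return Z95 * sqrt(p * (1 - p) / n)

def proportion_sample_size(p, error):
    """Smallest n giving a 95% CI half-width within error for an expected proportion p."""
    return ceil(Z95**2 * p * (1 - p) / error**2)

# Planning: expected proportion 0.2, tolerable error ±0.05.
print("sample size:", proportion_sample_size(0.2, 0.05))

# Analysis: 55 positive cases out of 250.
n, positives = 250, 55
p = positives / n
e = proportion_error(n, p)
print(f"p = {p:.2f} ± {e:.4f} = {p - e:.4f} to {p + e:.4f}")
```

The normal approximation becomes unreliable when the proportion is close to 0 or 1 or when the sample is small, in which case an exact or score-based interval would be a safer choice.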
