StatTools : Sample Size for Prediction Statistics Explained and Tables


Related Links:
Prediction Statistics Explanation Page
Sample Size for Prediction Statistics Program Page

Introduction

True Positive Rate (Sensitivity) and True Negative Rate (Specificity) are both proportions that follow the binomial distribution. Where two groups are being compared, the model is the same as that for comparing two proportions, as described in the Sample size for Two Proportions Explanations and Tables Page.

More recently, Casagrande et al. suggested an improved sample size calculation that provides greater precision and allows both paired and unpaired comparisons. This algorithm is used for the tables on this page and for the calculations in the Sample Size for Prediction Statistics Program Page.

Unpaired comparisons

Unpaired comparison involves two groups of unrelated individuals. An example is to compare the Sensitivity of the mother feeling decreased fetal movement as a predictor of impending stillbirth between one group of women in their first pregnancy and another group of women who have had a baby before. The sample size calculated is the number of subjects needed in each of the groups.
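As an illustration of the unpaired case, the following is a minimal Python sketch, not the StatTools program itself. It assumes the standard one tail normal approximation for two independent proportions, with an optional continuity correction of the form usually attributed to Casagrande et al. The function name unpaired_sample_size and its parameters are hypothetical, chosen for this example only.

```python
import math
from statistics import NormalDist


def unpaired_sample_size(p1, p2, alpha=0.05, power=0.8, continuity=True):
    """Approximate number of subjects needed in EACH group (one tail).

    p1, p2 : the two Sensitivities (or Specificities) to be compared.
    alpha  : Probability of Type I Error, one tail.
    power  : 1 - Probability of Type II Error.
    Assumption: normal approximation; this is a sketch, not StatTools' code.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must lie strictly between 0 and 1 and differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one tail critical value
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p1 - p2)
    p_bar = (p1 + p2) / 2

    # Squared numerator of the uncorrected two-proportion formula
    a = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2

    if continuity:
        # Continuity-corrected form (assumed here to match Casagrande et al.)
        n = a * (1 + math.sqrt(1 + 4 * delta / a)) ** 2 / (4 * delta ** 2)
    else:
        n = a / delta ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Sensitivity of 0.6 in first pregnancies against 0.8 in later pregnancies
    print(unpaired_sample_size(0.6, 0.8))
```

The continuity correction always increases the estimate, so the corrected value is the more conservative of the two.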

Paired comparisons

Paired comparison is used to compare two tests or predictors, when both are administered to the same individual to predict the same outcome. An example is to compare the Sensitivities of the mother feeling decreased fetal movement, and that of an ultrasound detection of abnormal blood flow pattern, as predictors of impending stillbirth. Both tests can be administered to the same pregnant woman, and the qualities of the tests compared against the outcome.

Paired comparison is considerably more powerful, as it reduces or eliminates variation between individuals. The sample size required is the number of subjects who receive both tests, or the number of matched pairs.

Two sample sizes are calculated for paired comparisons, the minimum and the maximum. In theory, the correct sample size is somewhere between the minimum and the maximum, depending on how strongly the two tests are correlated (agree with each other). In practice, a conclusion that a statistically significant difference exists can be drawn if this is demonstrated once the sample size reaches or exceeds the minimum, but a conclusion that there is no significant difference can only be drawn after the maximum sample size has been reached.
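The idea of a minimum and a maximum can be sketched in Python as follows. This is an illustration under stated assumptions and not necessarily the algorithm used by StatTools: it assumes a McNemar-type normal approximation in which the sample size depends on the proportion of discordant pairs, takes the smallest possible discordance (the two tests agree as much as they can) for the minimum, and takes the discordance expected if the two tests were independent for the maximum. The function name paired_sample_size_range is hypothetical.

```python
import math
from statistics import NormalDist


def paired_sample_size_range(p1, p2, alpha=0.05, power=0.8):
    """Return (minimum, maximum) number of subjects receiving both tests.

    Assumptions (for illustration only):
      * one tail normal approximation to a McNemar-type comparison;
      * minimum uses the least possible discordance, |p1 - p2|;
      * maximum uses the discordance under independence of the two tests.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must lie strictly between 0 and 1 and differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p1 - p2)

    def n_for(psi):
        # psi is the assumed proportion of discordant pairs
        numerator = (z_alpha * math.sqrt(psi)
                     + z_beta * math.sqrt(psi - delta ** 2)) ** 2
        return math.ceil(numerator / delta ** 2)

    psi_min = delta                            # tests agree as much as possible
    psi_max = p1 * (1 - p2) + p2 * (1 - p1)    # tests behave independently
    return n_for(psi_min), n_for(psi_max)


if __name__ == "__main__":
    n_min, n_max = paired_sample_size_range(0.6, 0.8)
    print("minimum:", n_min, "maximum:", n_max)
```

The more the two tests agree with each other, the fewer discordant pairs there are and the closer the requirement lies to the minimum.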

The sample size for paired comparison can also be used to calculate the approximate sample size required to establish an effective predictor (True Positive or True Negative Rate), by comparing the value to be detected against the diagnostic equivalent of the null value (0.5). The program, however, overestimates the sample size requirement, as it assumes that both values in the pair are sample estimates, whereas 0.5 is a constant with no error. A table for this sample size is also presented on this page.
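To show what treating 0.5 as a constant looks like, here is a minimal Python sketch of the standard one-sample normal approximation for a single proportion tested against a fixed null value. It is offered only for comparison and is not the calculation used by StatTools; the function name one_sample_size_vs_null and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_size_vs_null(p, p0=0.5, alpha=0.05, power=0.8):
    """Approximate subjects needed to show that a rate p differs from p0.

    p0 is treated as a constant with no sampling error (one tail model).
    Assumption: normal approximation to the binomial; illustration only.
    """
    if not (0 < p < 1 and 0 < p0 < 1) or p == p0:
        raise ValueError("p and p0 must lie strictly between 0 and 1 and differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)

    # Variance under the null uses p0; variance under the alternative uses p
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p * (1 - p))) ** 2
    return math.ceil(numerator / (p - p0) ** 2)


if __name__ == "__main__":
    # Subjects needed to show that a Sensitivity of 0.8 exceeds 0.5
    print(one_sample_size_vs_null(0.8))
```

Because only one of the two values carries sampling error, this form generally asks for fewer subjects than a calculation that treats both values as estimates.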

StatTools follows Casagrande's example and calculates sample size for prediction parameters using the one tail model. Users needing the two tail model can use the algorithm provided, but should halve the Probability of Type I Error (α).
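The reason halving α works can be checked with a short Python sketch: the one tail critical value at α/2 is the same as the two tail critical value at α. The helper names below are hypothetical and only illustrate the conversion.

```python
from statistics import NormalDist


def one_tail_z(alpha):
    """Critical z value for a one tail test at the given alpha."""
    return NormalDist().inv_cdf(1 - alpha)


def alpha_for_two_tail(alpha_two_tail):
    """Alpha to enter into a one tail algorithm to obtain a two tail result."""
    return alpha_two_tail / 2


if __name__ == "__main__":
    # A two tail test at alpha = 0.05 uses the one tail critical value at 0.025
    print(round(one_tail_z(alpha_for_two_tail(0.05)), 3))  # → 1.96
    print(round(one_tail_z(0.05), 3))                      # → 1.645
```

The larger critical value for the two tail model means the resulting sample size is larger than for the one tail model at the same nominal α.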