Q&A from Previous Modules

## December 3rd 2017
I understand the one tail 95% confidence interval, but I remain confused about how to interpret it, in particular which tail to use to support or not support the research hypothesis.
I will demonstrate with the following example. Let's say we want to know whether boys weighed more at birth than girls, and we have the results of 3 studies. The research hypothesis is that boys weigh more than girls (group 1 > group 2), so we look at the right tail.

In studies 1 (>-45) and 3 (>-405) the right tail 95% confidence intervals overlap the null value, so the results do not support the research hypothesis. In study 2 the right tail 95% confidence interval does not overlap the null value, so the results of study 2 support the research hypothesis.

If you are still confused, one way to clarify the situation is to draw the 95% confidence intervals as a Forest Plot. I will present a formal diagram (to the right), but you can do this quickly using paper and pencil.

Please note the following
- A determination to use the right tail was made from the start, and not after examining the figures. In this case, the hypothesis is that group 1 > group 2, so the right tail is looked at. The 3 left tails, in black, are of no interest
- Only when the right tail does not overlap the null value does the result support the research hypothesis
- In study 1, the right tail (red) overlaps the null value, so the result does not support the research hypothesis
- In study 2, the right tail (blue) does not overlap null, so the result supports the research hypothesis
- In study 3, the right tail (red) overlaps null, so the result does not support the research hypothesis
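
The decision rule in the notes above can be sketched in a few lines of Python. This is only an illustration: the function name `right_tail_supports` is made up for this example, the lower limits for studies 1 and 3 are the ones quoted in the answer, and the value for study 2 is a hypothetical placeholder, because the answer only states that it lies above the null value.

```python
def right_tail_supports(lower_limit, null_value=0.0):
    """Right tail interval is (lower_limit, +infinity).

    It supports the hypothesis "group 1 > group 2" only when the whole
    interval lies above the null value, i.e. the lower limit exceeds it.
    """
    return lower_limit > null_value


# Lower limits of the right tail 95% confidence intervals.
studies = {
    "study 1": -45.0,   # quoted in the answer (> -45)
    "study 2": 20.0,    # hypothetical value, assumed only to be above null
    "study 3": -405.0,  # quoted in the answer (> -405)
}

for name, lower in studies.items():
    if right_tail_supports(lower):
        verdict = "supports"
    else:
        verdict = "does not support"
    print(f"{name}: difference > {lower:g}, {verdict} the research hypothesis")
```

Note that the tail is fixed before the data are looked at: the function never checks the left tail at all.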
## July 25th 2017
I do understand one and two tail, as well as the 95% confidence interval. However, I keep getting the calculations wrong. Is there a simple approach I can use to get the right answers?
Let me use the example data in StatPgm_3a_2Measurements.php, pgm 3aii
- Data

| Grp | n | mean | sd |
|---|---|---|---|
| grp 1 | 24 | 153.9 | 3.1 |
| grp 2 | 25 | 157.1 | 2.8 |

- Results
- Difference = mean₁ - mean₂ = 153.9 - 157.1 = -3.2
- Standard Error (SE) = 0.8
- 95% CI (two tail)
  - t for two tail = 2.0
  - 95% CI = (difference - t x SE) to (difference + t x SE) = (-3.2 - 2.0 x 0.8) to (-3.2 + 2.0 x 0.8) = -4.9 to -1.5 (with minor rounding errors)
  - As the whole of the 95% confidence interval lies below the null value (0) and does not overlap it, we can conclude that a significant difference exists
- 95% CI (one tail)
  - t for one tail = 1.7
  - The right tail, to be used if the hypothesis is "difference > 0". The 95% CI excluding the 5% on the left = > 5th percentile = > difference - t x SE = > -3.2 - 1.7 x 0.8 = > -4.6.
    This set of data shows the difference > -4.6, overlapping 0, so we cannot conclude the difference is greater than 0
  - The left tail, to be used if the hypothesis is "difference < 0". The 95% CI excluding the 5% on the right = < 95th percentile = < difference + t x SE = < -3.2 + 1.7 x 0.8 = < -1.8.
    This set of data shows the difference < -1.8, not overlapping 0, so we can conclude the difference is less than 0
- The things to check
  - Make sure the t value you use to calculate the confidence interval is the correct one, as there is one t for one tail and a different t for two tail
  - The two tail 95% CI is easy, as it is (difference - t x SE) to (difference + t x SE)
  - The one tail 95% CI is equally easy once you are familiar with it, but a bit counter-intuitive for the beginner, because the left/right, +/-, and </> are not aligned and have to be carefully placed
  - Conventionally, the difference is group 1 - group 2, so the one tail hypothesis of grp 1 < grp 2 is the same as difference < 0. The 95% confidence interval to use here is -∞ to (difference + t x SE), or <(difference + t x SE). If this upper limit is <0 then the result is significant. If it is >0 then the result is not significant
  - On the other hand, the one tail hypothesis of grp 1 > grp 2 is the same as difference > 0. The 95% confidence interval to use here is (difference - t x SE) to +∞, or >(difference - t x SE). If this lower limit is >0 then the result is significant. If it is <0 then the result is not significant
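
For readers who want to check the arithmetic, the calculations above can be sketched in Python. This is a minimal sketch under stated assumptions, not the code behind StatPgm 3aii: the pooled standard error formula is assumed (the answer only reports SE = 0.8), the t values are the rounded ones quoted above (2.0 and 1.7), and the function names are made up for this illustration.

```python
import math


def pooled_se(n1, sd1, n2, sd2):
    """Standard error of the difference between two means.

    Assumption: pooled variance, as in the usual unpaired t test.
    """
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def confidence_limits(diff, se, t_two_tail, t_one_tail):
    """Return the limits of the two tail and both one tail 95% CIs."""
    return {
        # two tail: (difference - t x SE) to (difference + t x SE)
        "two_tail": (diff - t_two_tail * se, diff + t_two_tail * se),
        # right tail, for the hypothesis difference > 0: > (difference - t x SE)
        "right_tail_lower": diff - t_one_tail * se,
        # left tail, for the hypothesis difference < 0: < (difference + t x SE)
        "left_tail_upper": diff + t_one_tail * se,
    }


diff = 153.9 - 157.1                 # group 1 - group 2
se = pooled_se(24, 3.1, 25, 2.8)
limits = confidence_limits(diff, se, t_two_tail=2.0, t_one_tail=1.7)

low, high = limits["two_tail"]
print(f"SE = {se:.1f}")
print(f"two tail: {low:.1f} to {high:.1f}, significant: {high < 0 or low > 0}")
print(f"right tail: > {limits['right_tail_lower']:.1f}, "
      f"supports difference > 0: {limits['right_tail_lower'] > 0}")
print(f"left tail: < {limits['left_tail_upper']:.1f}, "
      f"supports difference < 0: {limits['left_tail_upper'] < 0}")
```

Keeping the two t values as separate arguments makes the first item on the checklist explicit: the one tail and two tail intervals never share the same t.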
## July 25th 2017
What is the null value?
The null value is defined by Fisher as the value representing no difference in the null hypothesis. For a difference between two means, as in the examples on this page, the null value is 0.

## July 11th 2017
Why are the results published on the teaching and example pages sometimes slightly different from those I get when I do the calculations myself?
As I have previously explained, computers have different processors, so calculations are precise to different numbers of decimal places. On top of this, statistical calculations often use multiple iterations (repeated calculations to obtain the best approximation). Depending on the machine and the program, results may therefore differ slightly, by anything up to 1-2%. This usually shows up as a difference of 1 to 5 in sample size calculations, and differences at the second or third decimal place in precision results. Students should not be alarmed by these minor differences.

## July 6th 2017
In what way do t and z values differ?
Both z and t are calculated the same way: t or z = (value - mean) / (Standard Deviation or Standard Error). Both mean the number of Standard Deviations (or Standard Errors) from the mean.

z was first devised by Fisher, who mathematically assumed that he was dealing with a population (everyone involved) or very large numbers. t was devised later (by someone who called himself Student), as a correction for z when the data are from a sample (not everyone), or when the number of observations (sample size) is small. The reason for its development was so that conclusions can be drawn with few observations (small sample size).

When the sample size is infinite (everyone), z and t (one tail) have the same value. As sample size decreases, the probability value from t becomes larger than the probability value for z. When the degrees of freedom (sample size - 1) are fewer than 400, the difference is big enough to be noticed.

## July 6th 2017
Why do I get different results when I enter the data with different numbers of decimal places?
Most modern computers use 64 bit (64 0/1) processors. This means that in multiplication and division, the numbers are accurate to more than 14 decimal places. For outputting results, such precision is both unnecessary and confusing, so most statistical programs truncate the results to a default number of decimal places. In the programs for the module, all output is truncated to 4 decimal places, even though this is more than is needed in many situations.

The calculations start with the numbers entered as data. Numbers entered with different decimal precision are interpreted by the computer as different values. For example, 1.2 means 1.20000000000 and 1.22 means 1.22000000000. To the user they are the same, with a trivial difference in precision, but to the computer they are completely different values, and in a complex calculation the differences accumulate, so the results become different.

Using the computer to perform any calculation therefore requires some consideration of precision. Both in entering the data and in presenting the results, the number of decimal places should be no more than adequate for the purpose. For example, there is no point using any decimal places in birth weight when babies are weighed to the nearest 10g, and no point in using more than 1 decimal place in height when most heights are measured to the nearest half cm.

## June 1st 2017
In the difference between two means, the one tail model provides two 95% confidence intervals. Which one should I choose?
It depends on the research question. I will illustrate with the following example.

We are comparing the birthweight in grams of boys and girls, with the data as shown to the right. Using StatPgm 3aii from StatPgm_3a_2Measurements.php, the 95% confidence intervals (boys - girls) are:
- One tail : <=254 or >=46
- Two tail : 26 to 274
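
If the research question is whether the birthweights of boys and girls differ, then the two tail model is used, so that the 95% confidence interval (2.5th to 97.5th percentile of the difference) is 26 to 274. Given that the 95% confidence interval does not overlap the null value (0), we can conclude that a significant difference exists, and boys are found to be heavier than girls.

If the research question is whether boys are heavier than girls, then the one tail model is used, with the exclusion of the left tail, so that the 95% confidence interval (5th to 100th percentile of the difference) is >=46. As this does not overlap the null value (0), we can conclude that boys are heavier than girls.

If the research question is whether boys are lighter than girls, again the one tail model is used, with the exclusion of the right tail, so that the 95% confidence interval (0th to 95th percentile of the difference) is <=254. As this overlaps the null value (0), we cannot conclude that boys are lighter than girls.

Put another way, the <= interval is used to test group 1 (boys) < (lighter than) group 2 (girls), and the >= interval is used to test group 1 (boys) > (heavier than) group 2 (girls).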