Medical Statistics. David Machin
calculation of the midpoint of two values. Differences between the results using the different conventions are usually small and unimportant in practice.
To calculate the quartiles for the foot corn size data in Tables 2.4 and 2.5, as the number of observations is even (n = 16), the upper quartile is the ¾(n + 1)th ordered value, that is the ¾(17) = 12¾th ordered value. When the quartile lies between two observations, the easiest option is to take their mean (there are more complicated methods). The upper quartile (simple method) is the mean of the 12th and 13th ordered values, or (4 + 5)/2 = 4.5 mm. Alternatively, rounding 12¾ to the nearest integer gives the 13th ordered value and an upper quartile of 5 mm.
A more complicated method for estimating the upper quartile is interpolation between the 12th and 13th ordered values. The interpolation involves moving ¾ of the way from the 12th ordered value towards the 13th, i.e. 4 + ¾ × (5 − 4) = 4.75 mm (or 4.8 mm when rounding to one decimal place).
The lower quartile is the ¼(n + 1)th ordered value, that is the ¼(17) = 4¼th ordered value. The lower quartile (simple method) is the mean of the 4th and 5th ordered values, or (2 + 2)/2 = 2.0 mm. Rounding 4¼ to the nearest integer gives the 4th ordered value and a lower quartile of 2 mm. A more complicated method for estimating the lower quartile is interpolation between the 4th and 5th ordered values. The interpolation involves moving ¼ of the way from the 4th ordered value towards the 5th, i.e. 2 + ¼ × (2 − 2) = 2 mm (or 2.0 mm when rounding to one decimal place).
The IQR is 2.0 to 4.5 mm by the simple method, 2.0 to 5.0 mm by rounding to the nearest integer, and 2.0 to 4.8 mm by the more complex interpolation method. As already mentioned, the differences between the results using the different conventions for calculating the quartiles are usually small and unimportant in practice.
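The three conventions described above can be compared side by side in a few lines of Python. This is only a minimal sketch: the function names `quartile_position` and `quartile_estimates` are illustrative, not part of any library, and the inputs are just the neighbouring ordered values quoted in the text (4 and 5 mm for the upper quartile, 2 and 2 mm for the lower quartile). Rounding a position that ends in exactly ½ upwards is an assumption made here, not a rule stated in the text.

```python
import math


def quartile_position(n, p):
    """Position of the quartile among n ordered values, using the p(n + 1) rule."""
    return p * (n + 1)


def quartile_estimates(position, lower_value, upper_value):
    """Return (simple, nearest, interpolated) estimates of a quartile.

    lower_value and upper_value are the ordered values either side of a
    fractional position, e.g. the 12th and 13th values for position 12.75.
    """
    frac = position - math.floor(position)
    if frac == 0:
        # The position falls exactly on an ordered value (passed as lower_value).
        return (lower_value, lower_value, lower_value)
    simple = (lower_value + upper_value) / 2                     # mean of the two neighbours
    nearest = upper_value if frac >= 0.5 else lower_value         # round position (ties up: assumption)
    interpolated = lower_value + frac * (upper_value - lower_value)
    return (simple, nearest, interpolated)


n = 16
upper = quartile_estimates(quartile_position(n, 0.75), 4, 5)
lower = quartile_estimates(quartile_position(n, 0.25), 2, 2)

print(upper)  # (4.5, 5, 4.75)
print(lower)  # (2.0, 2, 2.0)
```

Statistical packages implement several further quartile conventions, so small discrepancies between software and hand calculation are to be expected.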
2.10 Exercises
Figure 2.11 shows the anatomical site of the foot corn by randomised group for 201 patients (Farndon et al. 2013) who were taking part in a randomised controlled trial to investigate the effectiveness of salicylic acid plasters compared with usual scalpel debridement for treatment.
Figure 2.11 Anatomical site of corn on the foot by randomised group for 201 patients with corns
(Source: Farndon et al. 2013).
2.1 What type of graph is Figure 2.11?
A. Bar chart
B. Pie chart
C. Histogram
D. Scatterplot
E. Dot plot
2.2 What type of data in Figure 2.11 is anatomical site of foot corn?
A. Discrete
B. Nominal
C. Binary
D. Ordinal
E. Continuous
2.3 Using the data shown in Figure 2.11, which anatomical site of the corn on the foot was the least frequently reported among patients in the corn plaster group? The least frequently reported anatomical site of the corn on the foot among patients in the corn plaster group was:
A. Apex
B. Proximal interphalangeal joint
C. Interdigital
D. Metatarsal head
E. Plantar calcaneus
2.4 Using the data shown in Figure 2.11, approximately how many patients in the scalpel treated group had a corn on the proximal interphalangeal joint (middle part of toe on the top)? The approximate number of patients in the scalpel treated group with a corn on the proximal interphalangeal joint was:
A. 5
B. 8
C. 11
D. 50
E. 100
2.5 Using the data shown in Figure 2.11, what approximate percentage of the sample of patients in the corn plaster treated group had a corn on the metatarsal head (ball of the foot at the bottom)? The approximate percentage of patients in the corn plaster treated group who had a corn on the metatarsal head was:
A. 20%
B. 30%
C. 40%
D. 50%
E. 60%
The baseline corn sizes, in mm, of 10 randomly selected patients from the corn plaster randomised controlled trial (RCT) (Farndon et al. 2013) are given below:
2 2 2 3 4 4 5 5 7 10
2.6 The mean corn size (in mm) for this sample of 10 patients is:
A. 1.0
B. 2.0
C. 3.0
D. 4.0
E. 4.4
2.7 The median corn size (in mm) for this sample of 10 patients is:
A. 1.0
B. 2.0
C. 3.0
D. 4.0
E. 4.4
2.8 The modal corn size (in mm) for this sample of 10 patients is:
A. 1.0
B. 2.0
C. 3.0
D. 4.0
E. 4.4
2.9 The range of corn sizes (in mm) for this sample of 10 patients is:
A. 1 to 10
B. 2 to 5
C. 2 to 10
D. 3 to 11
E. 3 to 12
2.10 The interquartile range (IQR) of corn size (in mm) for this sample of 10 patients is:
A. 2 to 10
B. 2 to 7
C. 2 to 5
D. 3 to 7
E. 3 to 10
2.11 The variance of corn size (in mm²) for this sample of 10 patients is:
A. 0.8
B. 2.5
C. 3.5
D. 4.4
E. 6.5
2.12 The standard deviation of corn size (in mm) for this sample of 10 patients is:
A. 0.8
B. 2.5
C. 3.5
D. 4.4
E. 6.5
3 Summary Measures for Binary Data
3.1 Summarising Binary and Categorical Data
3.2 Points When Reading the Literature
Summary
This chapter illustrates methods of summarising binary and categorical data. It covers proportions, risk, rates, relative risk, and odds ratios. The importance of considering the absolute risk difference (ARD) as well as the relative risk is emphasised.
3.1 Summarising Binary and Categorical Data
Categorical data are simply data which can be put into categories. Binary data are the simplest type of categorical data. Each individual has a label which takes one of two types. A simple summary would be to count the different types of label. However, a raw count is rarely useful. For example, there were 45 656 new cases of breast cancer registered in England in 2016. On its own this sounds like a large number, but there were 303 135 new cases of all cancers registered in 2016. Thus breast cancer accounts for 15.1% (45 656/303 135) of all new cancer registrations in England. Proportions are a special example of a ratio. When time is also involved (as in counts per year), the ratio is known as a rate. The mid‐year population of England in 2016 was estimated as 55 268 067. Thus, the breast cancer registration rate in 2016 was 0.0008 (45 656/55 268 067), or about 83 registrations per 100 000 population.
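The arithmetic behind the proportion and the rate can be checked with a short Python sketch. The variable names are illustrative only, and expressing the rate per 100 000 population is a presentational choice assumed here for readability. The counts are the 2016 England figures quoted above.

```python
# Counts quoted in the text for England, 2016.
breast_cases = 45_656
all_cancer_cases = 303_135
mid_year_population = 55_268_067

# Proportion: the numerator (breast cancers) is a subset of the denominator (all cancers).
share_of_registrations = breast_cases / all_cancer_cases

# Rate: new registrations during one year divided by the mid-year population.
registration_rate = breast_cases / mid_year_population
rate_per_100k = registration_rate * 100_000  # scaling per 100 000 is an assumption

print(f"Share of all cancer registrations: {share_of_registrations:.1%}")
print(f"Registration rate: {registration_rate:.4f} per person per year")
print(f"Registration rate: {rate_per_100k:.1f} per 100 000 per year")
```

Scaling the rate to a round population base avoids quoting several leading zeros, which are easy to misread.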
Ratios, Proportions, Percentages, Risk and Rates
A ratio is simply one number divided by another. If we measure how far a car travels in a given time then the ratio of the distance travelled to the time taken to cover this distance is the speed.
Proportions are ratios of counts where the numerator (the top number) is a subset of the denominator (the bottom number). Thus if, in a study of 50 patients, 30 are depressed, the proportion depressed is 30/50 or 0.6. It is usually easier to express this as a percentage (%), so we multiply the proportion by 100, and state that 60% of the patients are depressed. Clearly proportions must lie between 0 and 1 and percentages between 0 and 100%.
A proportion is known as a risk if the numerator counts events which happen prospectively. Hence if 100 students start an introductory statistics course