Statistics in Nutrition and Dietetics. Michael Nelson
at the distributions of variables and check for extreme values. Some extreme values may be genuine. Others may be a result of ‘fat finger’ syndrome (like typing an extra zero and ending up with 100 rather than 10 as a data point).
Understand how to use ‘missing values’ in SPSS. These help you to identify gaps in the data and how to handle them (for example, the difference between ‘I don’t know’, Not Applicable, or missing measurement).
If you find an unusual observation, check it with your supervisor or research colleagues. They may want to inspect your data to see that your observations are correct. Don’t try to hide an unusual observation (or worse still, ignore it, or leave it out of the data set without telling anyone). Always be frank and open about letting others inspect your data, especially if you or they think there may be something wrong. We all make mistakes. It is no great shame if there are some errors in the data that we missed and that someone else helpfully spots for us. Be thick‐skinned about this. The real embarrassment comes if we do lots of analysis and the errors in the data only come to light when we present our results.
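As a rough illustration of the first check above (scanning a variable for extreme values), the following Python sketch flags observations that lie far outside the bulk of the data. The function name `flag_extreme_values`, the 1.5 × interquartile range cut-off, and the example data are all assumptions made for illustration; the text does not prescribe a particular method, and a flagged value still needs to be checked by a person, since it may be genuine.

```python
from statistics import quantiles


def flag_extreme_values(data, k=1.5):
    """Return the observations lying more than k * IQR beyond the quartiles.

    The k = 1.5 cut-off is a common rule of thumb, not a rule from the text.
    """
    q1, _, q3 = quantiles(data, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


# Made-up data: the 100 is the kind of 'fat finger' entry described above
# (an extra zero typed after 10).
observations = [12, 9, 11, 10, 100, 8, 10, 11]
suspect = flag_extreme_values(observations)
print(suspect)  # [100]
```

The sketch only lists candidates for inspection. Whether a flagged value is an error or a genuine extreme is a judgement to make with your supervisor or colleagues.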
1.7.2 Never Present Endless Detailed Tables Containing Raw Data
It is your job as a scientist to summarize data in a coherent form (in tables, graphs, and figures), tell an interesting story about the relationships between the variables you have measured, and interpret the results intelligently for your reader, using appropriate statistical analyses.
Of course, you need to keep accurate records of observations, and make sure that your data set (spreadsheet) is stored securely and that you have backup copies of everything. Bulking up a report with tables of raw data is bad practice, however. No one will read them.
Chapter 15 provides lots of examples about how to summarize data to make the presentation of results interesting. It also shows how to present results according to the type of audience you are talking to. If I am presenting new results about the impact of school food on attainment to scientific colleagues, I will include lots of information about the methods that I used to identify my samples, make observations, and analyze the data, as well as details about the findings themselves. My scientific colleagues will need enough information to be confident that my data are unbiased, that I have used the right analytical approaches, and that the results are statistically significant. This is the same basic approach that I will take when I am writing a paper for submission to a peer‐reviewed journal. In contrast, if I am presenting results on the same topic to a group of teachers, I will use lots of graphs and charts to summarize the results in a way that tells a good story. The underlying data will be the same, of course. But the teachers are likely to be bored by too much detail about methods – they probably just want to know the headline about whether better school food has a positive impact on attainment, and how big that impact is likely to be.
1.7.3 Significant Digits and Rounding
It always surprises me, when teaching undergraduate and postgraduate students, that they often don’t know how to round numbers properly. When asked to present a result to two decimal places, for example, some provide a long string of digits after the decimal point (far more than two) in the mistaken hope that this is somehow ‘better’ or ‘more accurate’, statistically speaking. For others, it becomes evident that the concept of ‘rounding’ is not familiar, and their answers vary like leaves blowing in the wind.
The underlying principle is that when undertaking calculations, it is useful to retain as many digits as possible during the course of the calculation. This is what happens when you use your calculator, Excel, or SPSS [11]. This will produce the most mathematically precise result. However, when the final value is presented, it should contain no more significant digits than the numbers from which it is derived.
For example, calculate the average height of a group of five students, each of whom had been measured to the nearest centimetre (Table 1.1).
The average (the arithmetic mean) of these five values is 164.4 cm. However, presenting the result to the nearest millimetre (tenth of a centimetre) would be misleading, as it would imply that your starting observations had a similar level of precision. You should, instead, round the value for the result to the number of significant digits with which you started. In this case, round 164.4 cm (four significant digits) to 164 cm (three significant digits, to the nearest whole centimetre). This is the value that you should report for your result, as it reflects the level of precision of your original observations [12].
TABLE 1.1 Height of Five Students (cm)
163 |
152 |
176 |
166 |
165 |
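The calculation for Table 1.1 can be sketched in a few lines of Python. This is only an illustration of the principle (keep full precision during the calculation, round once at the end); the variable names are assumptions, not part of the text.

```python
from statistics import mean

# Heights from Table 1.1, each measured to the nearest centimetre.
heights_cm = [163, 152, 176, 166, 165]

# Keep all available digits while calculating.
full_precision_mean = mean(heights_cm)
print(full_precision_mean)  # 164.4

# Round only the final value, to the precision of the original
# measurements (whole centimetres).
reported_mean_cm = round(full_precision_mean)
print(reported_mean_cm)  # 164
```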
TABLE 1.2 Rules for Rounding
Rule | Original Value | Rounded Value |
If the final digit is less than 5, round to the value of the preceding digit | 164.4 | 164 |
If the final digit is greater than 5, round to the value which is one higher than the preceding digit | 164.6 | 165 |
If the final digit is equal to 5, and the preceding digit is odd, round up to the next even number | 163.5 | 164 |
If the final digit is equal to 5, and the preceding digit is even, round down to the preceding number | 164.5 | 164 |
The standard conventions for rounding are as shown in Table 1.2.
These conventions may differ from those which you have been taught, but they are the conventions that most calculators and computers follow when undertaking computations.
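The four rules in Table 1.2 can be checked with Python's standard `decimal` module, whose `ROUND_HALF_EVEN` mode implements this convention. The helper name `round_half_even` is an assumption for illustration. Values are passed as strings so that they are represented exactly as written.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str) -> Decimal:
    """Round a decimal string to a whole number using the Table 1.2 rules."""
    return Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)


# The four original values from Table 1.2.
for original in ["164.4", "164.6", "163.5", "164.5"]:
    print(original, "->", round_half_even(original))
# 164.4 -> 164
# 164.6 -> 165
# 163.5 -> 164
# 164.5 -> 164
```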
Some people have been taught always to round up if the last digit is equal to 5, but the calculations in Table 1.3 illustrate the error which is introduced if that rule is followed.
Although this error seems small, it can be substantial if we are dealing with small numbers. For example, if the original values had been 3.5, 4.5, 5.5, and 6.5, and we were rounding to whole numbers, the average would be equal to 5 for the original values and correctly rounded values (4, 4, 6, 6), but the average for the incorrectly rounded values would be 5.5 – an error of 10%!
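The small-number example above can be reproduced as follows. This is a minimal sketch using Python's `decimal` module, with `ROUND_HALF_UP` standing in for the 'always round up on 5' habit; the function and variable names are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP
from statistics import mean

values = [Decimal("3.5"), Decimal("4.5"), Decimal("5.5"), Decimal("6.5")]


def mean_after_rounding(data, rounding):
    """Round each value to a whole number, then average the rounded values."""
    rounded = [v.quantize(Decimal("1"), rounding=rounding) for v in data]
    return mean(rounded)


true_mean = mean(values)                                     # 5
half_even_mean = mean_after_rounding(values, ROUND_HALF_EVEN)  # 4, 4, 6, 6 -> 5
half_up_mean = mean_after_rounding(values, ROUND_HALF_UP)      # 4, 5, 6, 7 -> 5.5

# Relative error introduced by always rounding up on 5.
relative_error = (half_up_mean - true_mean) / true_mean
print(true_mean, half_even_mean, half_up_mean, relative_error)  # 5 5 5.5 0.1
```

Rounding to the even digit sends half of the ties up and half down, so the errors cancel in the average. Always rounding up pushes every tie in the same direction, which is where the 10% bias comes from.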
Curiously, Excel and some statistical packages (e.g. Minitab) display numbers which are always rounded up when the last digit is 5. However, underlying calculations are based on the correct rules for rounding shown in Table 1.2. To ensure that your calculations are correct, always follow these rules.
Finally, be careful when reporting values from Excel or Minitab – even though the calculations will be right, the final