The purpose of the Analysis stage is to use the collected data and information from the Plan to deal with the questions formulated in the Problem step. The form and formality of the Analysis depend on many things, including the complexity of the Problem and Plan, the skill of the analyst, the amount of variability induced by the Plan, and the intended audience of the study. We propose the following general breakdown of the stage:
A statistical model describes the behaviour of the measured response variates for the units included in the sample if we repeatedly executed the Data step according to the Plan. The model reflects properties of the study population, the sampling protocol and the measurement systems used. The model also includes the influence of measured explanatory variates on the response variate.
Once an initial model is postulated, fitting and model assessment tools can be used to suggest refinements to the model. This iterative process continues until the model is consistent with the internal structure of the collected data and known information about the sampling protocol and measurement systems. The final model is used to estimate attributes of interest in the study population and to assess the uncertainty due to sampling and measuring errors.
Michelson limited his analysis to two calculations: the average of the 100 measured velocities in air (a numerical summary) and an estimate of possible error (a formal procedure). The error is based on a worst-case scenario. It combines probable errors, based on the estimated standard deviations of replicate determinations, with the maximal systematic error, based on Michelson's knowledge of his apparatus and the functions used to calculate the speed of light from the measured response variates. For more discussion on the use of probable error, see Stigler [48].
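To make the arithmetic behind such an error estimate concrete, here is a minimal Python sketch. It assumes the classical definition of the probable error of a mean (0.6745 times the standard error, under a Gaussian model) and assumes that "worst case" means adding magnitudes. The function names are hypothetical, and the sketch is not a reconstruction of Michelson's own calculation.

```python
import math
import statistics

# 0.6745 is approximately the 75th percentile of the standard normal
# distribution: half of all Gaussian errors fall within +/- 0.6745 sd.
PROBABLE_ERROR_FACTOR = 0.6745


def probable_error_of_mean(values):
    """Probable error of the sample mean, assuming Gaussian variation."""
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return PROBABLE_ERROR_FACTOR * standard_error


def worst_case_error(random_probable_error, max_systematic_errors):
    """Assumed worst-case rule: add the magnitudes of every error source."""
    return random_probable_error + sum(abs(e) for e in max_systematic_errors)
```

Adding magnitudes is deliberately conservative: it allows for every systematic error acting in the same direction at once, which is why an interval built this way is wider than one based on sampling variability alone.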
After making a small adjustment for temperature (in air), based on the effects of temperature change on the systems used to determine the angle of deflection, and correcting to a vacuum, Michelson concludes his analysis by reporting the speed of light in vacuo (kilometres per second) to be
Although Michelson did not formally propose a model, he carried out numerous checks that are equivalent to aspects of model assessment ([39] page 139). For example, to see if the measured speed of light was systematically influenced by the distinctness of the image, an explanatory variate, he calculated and compared the average velocities stratified by distinctness of image. This checking was repeated for many other explanatory variates.
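A check of this kind, comparing averages across the levels of an explanatory variate, can be sketched in a few lines of Python. The record layout, the field names `speed` and `distinctness`, and the numerical values are all hypothetical illustrations, not Michelson's data.

```python
from collections import defaultdict
from statistics import mean


def stratified_means(records, key, response="speed"):
    """Return {level: (count, mean response)} for each level of `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[response])
    return {level: (len(vals), mean(vals)) for level, vals in groups.items()}


# Hypothetical determinations (km/s) with a made-up image-distinctness grade.
records = [
    {"speed": 299900, "distinctness": 1},
    {"speed": 299800, "distinctness": 1},
    {"speed": 299860, "distinctness": 2},
    {"speed": 299840, "distinctness": 2},
    {"speed": 299880, "distinctness": 2},
    {"speed": 299870, "distinctness": 3},
    {"speed": 299850, "distinctness": 3},
]

by_distinctness = stratified_means(records, "distinctness")
for level in sorted(by_distinctness):
    count, average = by_distinctness[level]
    print(f"distinctness={level}: n={count}, mean speed={average}")
```

Group averages that differ by much more than the variation within groups would suggest that the explanatory variate systematically influences the response; the same function can be reused with any other recorded variate as `key`.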
Today, we can use corresponding graphical methods. Perhaps the speed depends on some of the explanatory variates that are not part of its calculation. For example, has the effect of temperature been successfully removed from the determinations?
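A scatterplot of the determinations against temperature is the natural graphical check. As a simple numerical companion, one could compute the least-squares slope of speed on temperature: a slope near zero, relative to its variability, is consistent with the temperature effect having been removed. The sketch below is one possible way to do this; the function name is hypothetical, and it is an illustration rather than a method used in the original analysis.

```python
from statistics import fmean


def least_squares_slope(x, y):
    """Slope of the least-squares line of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar, y_bar = fmean(x), fmean(y)
    # Sum of squared deviations of x, and sum of cross-products with y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary for the slope to be defined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx
```

Called with the recorded temperatures as `x` and the corresponding speed determinations as `y`, the result is the estimated change in reported speed per degree of temperature.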
Curiously, in his comparisons of group averages, Michelson did not compare morning and evening measurements nor attempt to relate the measurement to the date, as we explored in the Data stage. There are other interesting relationships to be found in this data; we leave further exploration to the reader.
Note that there is often not a clear distinction between the checks for internal consistency in the Data stage and these model checks in the Analysis stage. The same plots or summaries may appear in either.
Today, we can contemplate any number of ways to summarize, model and analyze the data. For example, we might construct a histogram and calculate a 5-number summary of the 100 reported values. Based on a Gaussian model, which appears to fit the data well, a confidence interval for the mean is
Note that the confidence interval is much shorter than the interval reported by Michelson, who included both variability and possible bias in his calculation. More complex modelling, analysis and model assessment are also possible. The above is used to demonstrate the sub-stages within the Analysis stage of PPDAC. Again, it is evidence of Michelson's precision as a scientist that his analysis so carefully parallels what can be done today.
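The numerical summary and interval described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the function names are hypothetical, the quartiles use the "inclusive" interpolation rule (one of several conventions), and the interval uses a normal quantile in place of the Student t quantile, an approximation that makes the interval slightly too short (for 100 values the 95% t quantile is about 1.98 rather than 1.96).

```python
import math
import statistics


def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def mean_confidence_interval(values, level=0.95):
    """Approximate confidence interval for the mean under a Gaussian model.

    Uses a normal quantile, so it is only an approximation to the
    t-based interval; the two are close when the sample is large.
    """
    centre = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return (centre - z * standard_error, centre + z * standard_error)
```

An interval of this form reflects sampling variability only. It makes no allowance for systematic error in the apparatus, which is why it is shorter than an interval that also budgets for possible bias.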
Another output of this stage is a set of interesting observations that may well direct future investigations.