
The Analysis

The purpose of the Analysis stage is to use the collected data and information from the Plan to deal with the questions formulated in the Problem step. The form and formality of the Analysis depend on many things, including the complexity of the Problem and Plan, the skill of the analyst, the amount of variability induced by the Plan, and the intended audience of the study. We propose the following general breakdown of the stage:

A statistical model describes the behaviour of the measured response variates for the units included in the sample if we repeatedly executed the Data step according to the Plan. The model reflects properties of the study population, the sampling protocol and the measurement systems used. The model also includes the influence of measured explanatory variates on the response variate.

Once an initial model is postulated, fitting and model assessment tools can be used to suggest refinements to the model. This iterative process continues until the model is consistent with the internal structure of the collected data and known information about the sampling protocol and measurement systems. The final model is used to estimate attributes of interest in the study population and to assess the uncertainty due to sampling and measuring errors.

Michelson limited his analysis to two calculations: the average of the 100 measured velocities in air (a numerical summary) and an estimate of possible error (a formal procedure). The error is based on a worst-case scenario that combines probable errors, derived from the estimated standard deviations of replicate determinations, with the maximal systematic error, derived from Michelson's knowledge of his apparatus and of the functions used to calculate the speed of light from the measured response variates. For more discussion on the use of probable error, see Stigler [48].

After making a small adjustment for temperature (in air), based on the effects of temperature change on the systems used to determine $\phi$, the angle of deflection, and correcting to a vacuum, Michelson concludes his analysis by reporting the speed of light in vacuo (in kilometres per second) to be

\begin{displaymath}299944 ~\pm 51
\end{displaymath}

Although Michelson did not formally propose a model, he carried out numerous checks that are equivalent to aspects of model assessment ([39] page 139). For example, to see if the measured speed of light was systematically influenced by the distinctness of the image, an explanatory variate, he calculated and compared the average velocities stratified by distinctness of image. This checking was repeated for many other explanatory variates.
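
The stratified comparison described above can be sketched in a few lines. This is only an illustration of the idea, not Michelson's own procedure: the function name `stratified_means` is hypothetical, and the numbers are placeholders rather than his measurements.

```python
from collections import defaultdict
from statistics import mean


def stratified_means(values, strata):
    """Return {stratum: (count, average)} for values grouped by stratum."""
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    groups = defaultdict(list)
    for value, stratum in zip(values, strata):
        groups[stratum].append(value)
    # Sort the strata so the summary is reported in a stable order.
    return {s: (len(vs), mean(vs)) for s, vs in sorted(groups.items())}


# Placeholder numbers for illustration only; NOT Michelson's measurements.
speeds = [850, 740, 900, 1070, 930, 850]
distinctness = [3, 2, 2, 1, 3, 3]

summary = stratified_means(speeds, distinctness)
for level, (n, avg) in summary.items():
    print(f"distinctness {level}: n={n}, average={avg:.1f}")
```

Group averages that differ by much more than the within-group variation would suggest that the explanatory variate influences the response.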

Today, we can use corresponding graphical methods. Perhaps the speed depends on some of the explanatory variates that are not part of its calculation. For example, has the effect of temperature been successfully removed from the determinations?

  
Figure 12: Adjusted speed of light (jittered) versus temperature.
\begin{figure}
\centerline{\psfig{figure=speed-temp.ps,height=3.0in}}
\end{figure}

A plot of speed versus temperature is shown in Figure 12. A fairly weak increasing trend is discernible in the plot. However, even this trend depends heavily on the three points in the lower left corner and so is not likely to alter the result significantly. Again the values have been jittered to resolve the over-plotting of identical values.
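
One way to gauge how heavily a trend leans on a few points is to compare the least-squares slope with and without them. The sketch below assumes a simple straight-line fit, which the text does not specify; `ls_slope` is a hypothetical helper and the values are placeholders, not the plotted data.

```python
def ls_slope(x, y):
    """Least-squares slope of y on x for a straight-line fit."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


# Placeholder numbers for illustration only; NOT Michelson's measurements.
# The first three points play the role of a low-temperature, low-speed cluster.
temperature = [58, 60, 62, 75, 80, 85, 90]
speed = [700, 720, 690, 850, 860, 840, 850]

slope_all = ls_slope(temperature, speed)
slope_trimmed = ls_slope(temperature[3:], speed[3:])
print(f"slope using all points:      {slope_all:.2f}")
print(f"slope without first cluster: {slope_trimmed:.2f}")
```

A slope that largely vanishes once the cluster is set aside indicates that the apparent trend is carried by those few points.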

Curiously, in his comparisons of group averages, Michelson did not compare morning and evening measurements nor attempt to relate the measurement to the date, as we explored in the Data stage. There are other interesting relationships to be found in these data; we leave further exploration to the reader.

Note that there is often not a clear distinction between the checks for internal consistency in the Data stage and these model checks in the Analysis stage. The same plots or summaries may appear in either.

Today, we can contemplate any number of ways to summarize, model and analyze the data. For example, we might construct a histogram and calculate a 5-number summary of the 100 reported values. Based on a Gaussian model, which appears to fit the data well, a $95\%$ confidence interval for the mean is

\begin{displaymath}299852.3 ~{\pm} ~15.7
\end{displaymath}
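
The text does not state how this interval was computed. Assuming the usual interval for the mean of a Gaussian sample (an assumption on our part, with $\bar{y}$ the sample average, $s$ the sample standard deviation, $n = 100$ the number of values and $t_{0.975,\,n-1}$ the corresponding Student-$t$ quantile), it would take the form

\begin{displaymath}\bar{y} ~\pm~ t_{0.975,\,n-1} \, \frac{s}{\sqrt{n}}
\end{displaymath}

With $n = 100$ the quantile is approximately 1.98, so under this assumption a half-width of 15.7 would correspond to $s \approx 79$ km/s.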

Correcting for temperature, following Michelson, and converting to a vacuum, a $95\%$ confidence interval for the speed of light (km/s) in vacuo is

\begin{displaymath}299944.3~{\pm}~15.7
\end{displaymath}
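
The two intervals share the same half-width because adding a fixed correction moves the centre but not the spread. A minimal sketch of that arithmetic follows, assuming a $t$-based interval for the mean; the function names are hypothetical, the sample is a placeholder, and the correction of 92.0 is simply the difference between the two centres reported above.

```python
from math import sqrt
from statistics import mean, stdev


def mean_interval(values, crit):
    """Return (centre, half_width) for the interval mean +/- crit * s / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate the spread")
    return mean(values), crit * stdev(values) / sqrt(n)


def shift(interval, correction):
    """Add a fixed correction to the centre; the half-width is unchanged."""
    centre, half_width = interval
    return centre + correction, half_width


# Placeholder sample for illustration only; NOT Michelson's measurements.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]

# 2.776 is approximately the 0.975 Student-t quantile for 4 degrees of freedom.
in_air = mean_interval(sample, crit=2.776)

# 92.0 = 299944.3 - 299852.3, the shift implied by the two reported centres.
in_vacuo = shift(in_air, 92.0)

print(f"in air:   {in_air[0]:.1f} +/- {in_air[1]:.2f}")
print(f"in vacuo: {in_vacuo[0]:.1f} +/- {in_vacuo[1]:.2f}")
```

Treating the correction as a known constant is itself an assumption; any uncertainty in the correction would widen the interval.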

Note that the confidence interval is much shorter than that reported by Michelson, who included both variability and possible bias in his calculation. Other more complex modelling, analyses and model assessment can be made. The above is used to demonstrate the sub-stages within the Analysis stage of PPDAC. Again it is evidence of Michelson's precision as a scientist that his analysis so carefully parallels what can be done today.

Another output of this stage is the set of interesting observations that may well direct future investigations.



2000-05-24