\documentstyle[psfig,fullpage]{article}
\title{Scientific Method, Statistical Method, and the Speed of Light.}
\author{R.J. Mackay and R.W. Oldford \thanks{Research supported by the Natural
Sciences and Engineering Research Council of Canada}\\
Department of Statistics and Actuarial Science\\
University of Waterloo}
\begin{document}
\bibliographystyle{plain}
\maketitle
\begin{abstract}
What is ``statistical method''?
Is it the same as ``scientific method''?
This paper answers the first question by specifying the elements and procedures
common to all statistical investigations and organizing these into a single structure.
This structure is illustrated by careful examination of the first scientific
study on the speed of light carried out by A.A. Michelson in 1879.
Our answer to the second question is negative.  To understand this,
a history of the speed of light up to the time of Michelson's study is presented.
The larger history and the details of a single study allow us to place the method
of statistics within the larger context of science.
\end{abstract}
\section{Introduction.}
{\small
\input{pearson}
}
{\small
\input{kendall-long}
}
The view that statistics entails the quantitative expression of scientific
method has been around since the birth of statistics as a discipline.
Yet statisticians have shied away from articulating the relationship between
statistics and scientific method, perhaps with good reason.
For centuries great minds have debated what constitutes
science and its method without resolution (e.g. see \cite{Madden:methods}).
And in this century, historical examinations of scientific episodes  (e.g. \cite{Kuhn:rev})
have cast doubt on method in scientific discovery.
One radical position, established by examination of the works of Galileo, is that of the
philosopher Paul Feyerabend who writes of method in science:
{\small
\input{feyerabend1}
}
\noindent Feyerabend then proposes, somewhat facetiously, that the only universal method to
be found in science is ``anything goes.''
Whether Feyerabend's view holds for science in general is debatable;
that it does not hold for statistics is the primary thesis of this paper.

By examining in some detail one particular scientific study, namely A.A. Michelson's
1879 determination of the speed of light \cite{aamich:1880}, we illustrate what we consider to be
the common structure of statistics, what we propose to call {\em statistical method}.

There are several reasons for selecting Michelson's study. 
First, physical science is sometimes regarded as presenting a greater challenge to the explication of 
statistical
method than, say, medical or social science where {\em populations of interest are well defined}.
An early instance is Edgeworth's hesitation in 1884 to describe statistics as the ``Science of Means in 
general
(including physical observations)'', preferring instead the less ``philosophical'' compromise that
it is the science ``of those Means which are presented by social phenomena'' (\cite{edge:methods}).


Second, the speed of light in vacuum is a fundamental constant whose
value has become ``known''; in
1983,
it was {\em defined}
\footnote{
By that time
the determinations had so little variability that it
was considered known to 1 part in $10^9$, and the standard metre could
not be measured to that great a precision.
The second is similarly defined; it is the time taken for
9,192,631,770 periods of the radiation corresponding to the transition
between two hyperfine levels of the atom Cesium-133.
By {\em defining} these two quantities all uncertainty was shifted
to the unit of distance, a metre, now defined to be the
distance travelled by light through a vacuum in 1/299792458
second! See \cite{metre:def}.
}
to be 299,792.458 km/s.
So we are in the extremely rare inferential position of ``knowing the answer.''

Third, Michelson reported his study in an era when it was possible to publish a significant
amount of detail, permitting others insight into the difficulties he faced and the solutions
he found.

Fourth, the determination of the speed of light has been
(and continues to be) important to science and to technology.
Consequently its history is rich enough to 
provide a backdrop on which large scale questions of the nature
of science and statistics can be discussed.

Fifth, the determinations are known in the statistical literature,
first appearing in Stigler's paper (\cite{Stigler:robust}) on robust
estimates of location.  

Finally, and most importantly, a historical study has the virtue of being
based entirely on public material.  Information gathered together into
a single source is information that can be checked against common sources,
that can be improved as new historical material becomes available,
and that can be a common test bed for others to use.
To these ends, we have tried to present the history without reference to method.

These discussions require separate contexts of differing detail.
A broad historical sweep is necessary to appreciate what can be meant by scientific method.
It is provided in Section 2, where we give a history of the
determination of the speed of light from antiquity to the late 1800s.
The stage thus set, the optics, apparatus, and method of Michelson's first determinations of the speed of
light are described in Section 3.
These provide the details necessary for discussion
of statistical method.
The structure which we propose is described in Section 4.
Scientific method is examined in Section 5 and contrasted with statistical method in
Section 6.
A final section explores what we consider to be important ramifications of our approach.

\section{Historical background.}
The thought of Aristotle (384-322 BC) dominated western science for
nearly two millennia.
So powerful was his cosmology that it compelled him to declare that
``$\ldots$ light is due to the presence of something,
but it is not a movement'' (\cite{Aristotle:sense} $446^b25-447^a10$).
No movement, no speed.
And if that were not enough, the argument for finite speed is easily dismissed:
{\small
\input{aristotle.tex}
}
{\noindent This view was echoed by many thinkers in
western history: Augustine (ca 354-430), John Pecham (ca 1230-1292),
Albert the Great (ca 1200-1280),
Thomas Aquinas (ca 1225-1274), and Witelo (ca 1230-ca 1275) to name a few.
So too, the opposite view was argued by some, notably Ibn Al-Haytham
(ca 965-1040) and Roger Bacon (ca 1219-1292).
But without empirical demonstration to the contrary, the case for instantaneous perception
of the source could always be made.
In the absence of data, arguments pro and con were forced to be based on
the contemporary theory of light, or on interpretation of the conflicting views
of ancient authorities, or on established religious doctrines, or on
mathematical arguments that demonstrated the necessity or absurdity of
one of the alternatives \cite{Lindberg:medieval}.}

The debate continued into the beginning of the ``scientific revolution''
of the seventeenth century.\footnote{D.C. Lindberg presents preliminary
evidence of the debate in medieval Europe \cite{Lindberg:medieval}.}
Such giants
as Francis Bacon\footnote{Bacon had doubts about the infinite
speed when considering the great distances that light must travel
from the stars to Earth but found such speed easier to swallow
given the already fantastic speeds at which stars must travel in their
daily orbit about the Earth! See Aphorism 46 of Book II of the {\em Novum Organum},
e.g. \cite{Bacon:Novum}.}
 (1561-1626), Johannes Kepler (1571-1630),
and Ren\'{e} Descartes (1596-1650) believed the speed to be infinite.

Descartes, for example, likened the transmission of light to that of pushing
on a stiff stick  -- the instant one end (the source) was pushed the other end (the
perception) moved (pp. 258-9 of \cite{Gaukroger:Descartes}).
The analogy is powerful; there is no perceptible movement anywhere
along the stick, no matter how long a stick is used!
Descartes strongly held this view;
when his colleague and scientific mentor, Isaac Beeckman
(1588-1637), claimed to have performed an experiment
which demonstrated that the speed was finite,
Descartes dismissed the claim, saying that if it were true, then
he knew nothing of philosophy and his whole theory would
be refuted!\footnote{From \cite{Descartes:speed} page 307: {\em ``Contra ego,
si quae talis mora sensu perciperetur, totam meam Philosophiam
funditus eversam fore inquiebam.''} A rough translation, due to our
classically trained colleague G.W. Bennett, is
``On the contrary, I would be worried that my entire Philosophy would be
on the point of being completely overturned if any delay of this sort
were to be perceived by the senses.''}
Beeckman and Descartes could not agree on an experiment to resolve the
issue.\footnote{It is doubtful that Beeckman's 1629 experiment \cite{Beeckman:1629}
was successful.  The experiment involved firing a mortar and observing
its flash in a mirror situated some 1851.85 metres away; the movement of a clock
situated at the side of the mortar would measure the time elapsed.
With today's value, the time for the flash to reach the mirror
and return would be about $\frac{1}{100,000}$ of a second!
Descartes argued that even if Beeckman could detect a delay of $\frac{1}{24}$ of
a pulse beat (or about $\frac{1}{24}$ of a second yielding
a speed of only around 89 km/s), then it should be possible to detect a delay
between the occurrence and perception of a lunar eclipse of about one hour.
The flaws in this argument are discussed in detail in \cite{Descartes:speed}.}

Among these giants, Galileo Galilei (1564-1642) stands alone
in his disagreement;
he wrote
{\small
\input{galileo.tex}
}
{\noindent
In the same book, Galileo proposed a demonstration to determine whether light was instantaneous.
It was essentially the same as the one Beeckman had proposed earlier, and it drew similar fire from Descartes.
In a letter to the great experimental scientist Marin Mersenne (1588-1647),
dated 11 October 1638, Descartes gave a scathing review\footnote{E.g. ``... his fashion of
writing in dialogues, where he introduces three persons who do nothing but exalt
each of his inventions in turn, greatly assists in [over]pricing his merchandise.''
Page 388 of \cite{Drake:sci-bio}. The substantive criticisms are generally
directed at Galileo's not having identified the causes of the phenomena he investigated.
For most scientists at this time, and particularly for Descartes, that is the whole point of science.}
of Galileo's book. Of the proposed demonstration, Descartes wrote
``His experiment to know if light is
transmitted in an instant is useless, since eclipses of the moon, related so closely to
calculations made of them, prove this incomparably better than anything that could be tested on earth.''
\footnote{
Page 389 of \cite{Drake:sci-bio}.
This refutation appears to be based on the argument he gave to Beeckman as described in note 5.}
Nevertheless, the demonstration was tried in 1667 by members of the Florentine Academy,
but without success \cite{cohen:1940}.
Light's movement was either instantaneous or near enough so as to be too fast
to measure successfully.


In 1676 the first empirical evidence of a finite speed was presented.
The Danish astronomer Ole R\"{o}mer (1644-1710), while investigating
an entirely different matter, gathered data and found a discrepancy
which led to the discovery.
Interestingly, this important and purely
scientific discovery came about while R\"{o}mer was working on what we would today call
a very applied problem.

\subsection{Longitude.}
One of the great practical problems of that time was the
determination of longitude, particularly at sea.
The basis for the determination is the comparison of the local time at sea with the time
at a fixed reference point --- the prime meridian.
If, for example, the local time is determined to be
two hours earlier than the time at the
prime meridian, the location must be 360 $\times$ 2/24 = 30 degrees
longitude west of the prime meridian.
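The arithmetic in this example can be written as a one-line calculation. The following minimal sketch (the function name and sign convention are ours, for illustration only) converts a time difference into degrees of longitude:

```python
def longitude_west(hours_behind_prime_meridian):
    """Degrees of longitude west of the prime meridian, given how many
    hours local time lags prime-meridian time.  The Earth turns through
    360 degrees in 24 hours, i.e. 15 degrees per hour."""
    return 360.0 * hours_behind_prime_meridian / 24.0

# The example from the text: local time two hours behind gives 30 degrees west.
print(longitude_west(2))  # 30.0
```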

The times can be determined astronomically.
For example, local time zero can be defined to be that time when
some star, say Arcturus, is observed to cross the imaginary line
of longitude running directly north-south through the local position;
the corresponding standard time zero would be
that time when the same star crosses the prime meridian.
Stars are so distant that their direction is effectively the same from any
point on Earth, so these two crossings occur at different times,
separated by an interval determined by the difference in longitude.
Carefully determined tables of prime meridian crossing times
of various stars would allow navigators
to set their local clock.
To determine the difference between the local clock and the standard
clock, closer astronomical events like an eclipse or occultation
of the moon or a planet can be used.
These events are observed at essentially the same moment
of time whatever the observer's location on Earth, and furthermore are predictable.
So comparison of the local time of the close event with its tabulated
standard time would give the time difference necessary to calculate
longitude. 

In 1609, after hearing Flemish reports of a spyglass constructed from
two lenses that would enlarge the image of distant objects,
Galileo set about the design and construction of the first astronomically
useful telescope.\footnote{According to Stillman Drake
(\cite{Drake:disc} page 29), Hans Lipperhey,
a lens grinder from the Netherlands, is generally assigned credit for the
telescope's invention; he applied for its patent in 1608.}
In March of the next year, Galileo reported his discovery of the
four principal moons of Jupiter \cite{Galileo:starry}.
For the first time,
here was an orbital system that was demonstrably not centred about
the Earth.
Galileo argued that this was compelling evidence against
the Ptolemaic system (all celestial
bodies revolve around a fixed Earth) and in favour of the
Copernican sun-centred system.
His public support of the Copernican system as a true
representation of the movement of the planets (as opposed to a convenient
calculational model)
brought Galileo into conflict with those who would interpret certain
Biblical passages literally \cite{Galileo:duchess}.
Some of these people wielded considerable influence
within the Catholic Church in Rome;
by order of Pope Urban VIII, Galileo was banned from further publication
and placed under house arrest from 1633 until his death in 1642.
This did not prevent him from continuing his
scientific work.\footnote{Today's visitor to Florence's Museum of Science can find
a glass and ivory case displaying an ironic relic
-- Galileo's bony middle finger pointing heavenward.}

But this momentous scientific
discovery also had commercial potential.
King Philip III of Spain had offered a handsome prize
to anyone who could come up with
a practical method of determining a ship's position
when out of sight of land.
Galileo hit upon the idea of using the predicted times of the eclipses
of Jupiter's moons to provide the common celestial clock
necessary to determine longitude.
In November of 1616 he began negotiations with
Spain for navigational uses of his astronomical discoveries
and in 1617 worked on developing a telescope for use at sea while
continuing his negotiations with Spain \cite{Drake:disc}.
Unfortunately the tables he produced were not accurate enough
for their intended purpose --- the theory at the time
did not account for the perturbations of the moons due to their
mutual interaction \cite{nauthist:1968}.

Although many writers advocated the method's use at sea,
those who appreciated the practical difficulty of directing a
very long telescope at Jupiter while aboard a lively ship
were skeptical and undoubtedly amused by the proposed
method.
It was never to become successful at sea.\footnote{The problem remained unsolved for more than 150 years until the development
of accurate portable clocks by the English inventor John Harrison. For
a popular account, see \cite{sobel:long}.}
But on land, very accurate determinations of
longitude could be obtained this way and resulted in
a substantial reform of geography in the 17th and 18th centuries.

\subsection{The first evidence.}
In 1671 R\"{o}mer went to Hven, an island community near Copenhagen,
to help re-determine the longitude of the observatory located there.
With others, he began observing a series
of eclipses of Io, the innermost of Jupiter's four principal moons.
In the end they
had eight months of observations or, since Io makes one revolution
of Jupiter in 42 hours,
timings on about 140 eclipses over 2/3 of the
year.
The time intervals between these eclipses
were not regular but depended on where the Earth
was in its orbit.
The length of the
interval was shorter when the Earth approached Jupiter than it was when
the Earth moved away from the planet.
The mathematically predicted time of an eclipse was too early if the
Earth was near Jupiter and too late if the Earth was far from Jupiter.
This systematic lack of fit allowed R\"omer to announce in Paris
in September 1676 that the eclipse predicted for November 9 that year
would actually occur 10 minutes later.
The observation bore him out and R\"omer argued that
the discrepancy was due to the finite speed of light. 
The light takes longer to reach us the farther we are from its source.

From his observations, R\"{o}mer estimated that light takes about twenty-two
minutes to cross the full diameter of Earth's orbit or about eleven minutes
for light from the sun to reach us on Earth.
On this basis, he estimated light's speed to be about 214,000 kilometres per
second.\footnote{For more on R\"{o}mer see \cite{Romer:bio}.  For more detail
on this study see \cite{cohen:1940}.}

R\"{o}mer's ``proof'' was not immediately accepted by all.
Alternative explanations were provided by Gian Domenico Cassini (1625-1712)
then an astronomer at the newly formed Academie des Sciences in Paris.
In 1666 Cassini had published tables on the eclipses of the satellites
of Jupiter, and from this work he too had noticed
inequalities in the time intervals of eclipses
that depended on the location
of Jupiter in its own elliptical orbit.
He had briefly considered a finite speed
of light in 1675 but soon rejected it for a more traditional explanation.
Cassini, and later his nephew Giacomo Filippo Maraldi (1665-1729),
suggested that Jupiter's orbit and the motion of its satellites
might explain the observed inequalities
(\cite{Cassini:bio}, \cite{Newcomb:1882} and \cite{Romer:bio}).
Many astronomers continued to hold the view that
light's movement was instantaneous.

It was not until a study by James Bradley (1693-1762)\footnote{See
\cite{Bradley:bio} and \cite{Romer:bio}.}
was reported in 1729 that nearly all agreed that the speed is finite.
Bradley had been studying the parallax of the stars and discovered an annual
variation in the position of stars that could not be explained by the parallax
effect.
However, it could be explained by the motion of the Earth if light's
speed were finite.
Based on careful observations, Bradley estimated that light took 
eight minutes and twelve seconds to reach the Earth from the sun
resulting in a value for light's speed of 301,000 km/sec.

In 1809, based on observations on the eclipses of Jupiter's moons for 150
years, Jean-Baptiste Joseph Delambre (1749-1822) estimated the time
taken by light to travel from the sun to Earth to be eight minutes and
13.2 seconds resulting in a speed of about 300,267.64 $\approx$ 300,300 km/sec.\footnote{
The time here is as reported in \cite{Newcomb:1882}.
To calculate the speed, the distance between the Earth and sun must be known.
In the estimate reported here, the distance used was 148,092,000 km as derived from
Bradley's figures above.}

The results of these early astronomical estimates are summarized in Table
\ref{table:astronomy}.
\begin{table}[ht]
\scriptsize{
\begin{center}
\begin{tabular}{|lllc|}
\hline
Year & Authors & Observational Source & Speed (km/sec) \\
\hline
1676 & R\"{o}mer & Jupiter satellites & 214~000 \\
1726 & Bradley & Aberration of stars & 301~000 \\
1809 & Delambre & Jupiter satellites & 300~300 \\
\hline
\end{tabular}
\end{center}
\caption{Studies based on astronomical observation.}
\label{table:astronomy}
}
\end{table}

Unfortunately, measurements of the speed made in this way depended on the
astronomical theory and observations used.
Simon Newcomb (1835-1909) tells of an inaugural dissertation in 1875 by Glasenapp
in which observations of the eclipses of Io from 1848 to 1870
show that widely ranging values for the speed
``could be obtained from different classes
of these observations by different hypotheses'' (\cite{Newcomb:1882} page 114).
It was shown that values for the sun-to-Earth travel time could be produced between 496 and
501 seconds, resulting in
speeds between 295,592.8 $\approx$ 295,600 and 298,572.6 $\approx$ 298,600 km/s.
\footnote{Again, using Bradley's Earth to sun distance.}
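All of these astronomical determinations reduce to distance over time. A short sketch (using Bradley's Earth-to-sun distance of 148,092,000 km, as stated in the footnotes above) reproduces the figures quoted in this section:

```python
EARTH_SUN_KM = 148_092_000  # distance derived from Bradley's figures (see text)

def speed_from_travel_time(seconds):
    """Speed of light (km/s) implied by a sun-to-Earth light travel time."""
    return EARTH_SUN_KM / seconds

# Bradley (8 min 12 s), Delambre (8 min 13.2 s), and Glasenapp's range:
print(speed_from_travel_time(8 * 60 + 12))    # 301000.0
print(speed_from_travel_time(8 * 60 + 13.2))  # about 300267.64
print(speed_from_travel_time(501))            # about 295592.8
print(speed_from_travel_time(496))            # about 298572.6
```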

Better determinations of the speed might be made if both
source and observer were terrestrial.
Because all would then be accessible, greater control could be exerted
over the study and hence the observations.
But this brings us back to the age-old problem:
how could the speed of light be measured terrestrially?

\subsection{Terrestrial determinations.}
Imagine two people standing at either end of a very long track.
The first uncovers a powerful light source at an appointed time and
the second records the time at which the light is seen.
The length of the track divided by the difference between the start time
and the time the light is perceived gives a
measurement of the speed of light.\footnote{This is essentially the experiment proposed by Isaac
Beeckman to Descartes in 1629.  See footnote 5.}
The trouble, of course, is that light is so fast that the distance must either be
very large or the time taken very small.
Extremely large distances and extremely short time intervals
are very difficult to measure directly.

Matters can be improved if both observers have light sources
which they cover with a screen.
Time measurement begins when the first observer removes the screen
sending light to the second.
The second light source is uncovered when the
second observer sees the first.
Now when the first observer sees the second light source
he again screens his source.
The time between uncovering and covering the first light source
is a measure of the time light takes to travel twice the
distance between the two observers.
The improvements are obvious. The distance is doubled and a single clock
has replaced two supposedly synchronized clocks.
Here was Galileo's proposed study of 1638; more than 200 years would
pass before it was improved sufficiently to produce results.

The necessary innovations were introduced by Hippolyte Fizeau (1819-1896).
One innovation was to replace the second person by a fixed flat mirror
whose surface is perpendicular to the beam of light from the source.
When this was done, the light beam was reflected directly
back at its origin and 
one human source of variation was completely removed from the system.
The second innovation was to automate the covering and uncovering of
the source, thereby further reducing the variation from the first human source. 

Together, these advances allowed Fizeau to replace the direct measurement of time with
an indirect measurement of speed.
Rather than measure time between uncovering and covering, Fizeau
could measure the minimum speed that the screen must travel in order to
cover the source at the exact time the light returns.
The trick was to use an accurately machined toothed wheel
placed spinning in front of the source to act as the moving screen.
The teeth screen the source while the gaps uncover it
and so the wheel acted just as Galileo's observer.
Any light returning to the source strikes either a tooth or a gap.
If the wheel is set spinning fast enough that every beam sent out
strikes a tooth on its way back, no image of the source is observed.
Twice this speed produces a full image as the beam sent out
returns through the next available gap.
Three times the speed produces no image, and so on.
The speed of rotation, coupled with the distance travelled
(twice 8,633 metres in Fizeau's setup),
could be transformed into a measure of the speed of light.
In this way, Fizeau produced the
first terrestrial determination of the speed of light in 1849.
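Fizeau's indirect measurement can be expressed as a small calculation. At the first extinction of the image, the light's round trip takes exactly as long as the wheel takes to advance from the centre of a gap to the centre of the next tooth, half of one tooth-plus-gap period. The sketch below uses the 8,633-metre distance from the text; the tooth count and rotation rate are illustrative values of the order commonly reported for Fizeau's wheel, not figures taken from this paper:

```python
def fizeau_speed(distance_m, n_teeth, rev_per_sec):
    """Speed of light (m/s) implied by the first extinction of the image.
    The round-trip time equals the time for the wheel to turn half of one
    tooth+gap period, i.e. 1 / (2 * n_teeth * rev_per_sec) seconds, so
    c = 2 * distance / time = 4 * distance * n_teeth * rev_per_sec."""
    return 4.0 * distance_m * n_teeth * rev_per_sec

# Fizeau's distance of 8,633 m (from the text), with illustrative wheel
# parameters (720 teeth, about 12.6 revolutions per second):
print(fizeau_speed(8633, 720, 12.6))  # roughly 3.1e8 m/s
```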

Others were quick to build on this monumental achievement.
Only two years later Leon Foucault (1819-1868), a former collaborator of Fizeau,
produced more accurate measurements based on a rotating mirror rather than
a toothed wheel.
\section{Michelson's 1879 determinations of the speed of light}
In November of 1877 Albert Abraham Michelson (1852-1931),
then a twenty-four-year-old
ensign in the US Navy and an instructor in physics at the
U.S. Naval Academy in Annapolis, Maryland,
hit upon the means to improve Foucault's rotating mirror approach.
Even then, he needed to conduct many preliminary studies before being
confident of an improved value for the speed of light.
In his own words (\cite{aamich:1880} page 115) ``Between this time and March
of the following year a number of preliminary experiments were performed
in order to familiarize myself with the optical arrangements.
Thus far the only apparatus used was such as could be adapted from the
apparatus in the laboratory of the Naval Academy.''

In April 1878, he initiated contact with Professor Simon Newcomb (1835-1909)
of the US Navy
(\cite{swenson:1972} page 38),
who was then superintendent of the navy's {\em Nautical Almanac}
and renowned in the navy and the scientific community as an astronomer.
Michelson discussed his work and methods with Newcomb.
At this point however, Michelson was still an unknown who would not
be funded by the US Navy for such specialized research.
Fortunately, having married Margaret McLean Heminway in the spring of 1877,
he could turn to a wealthy father-in-law for financial support.
His father-in-law\footnote{Referred to in \cite{aamich:1880} only as
a ``private gentleman''.}
had become deeply interested in
Michelson's preliminary results
and in July of 1878 provided him the \$2000 necessary to purchase the fine
optical instruments to carry out his measurements.
So began a lifelong quest to determine the speed of light.

\subsection{Optical theory.}
One of the difficulties with having great distances between the
source and the mirror in Fizeau's scheme is that the intensity of the light will decrease
with distance.
In order to keep the image as bright as possible, a lens can be placed
between the source of the light and the mirror.
If, as in the diagram below,
\begin{figure}[htp]
\centerline{\psfig{figure=point-source.ps,height=.75in,width=5in}}
\caption{S and M are placed at the point-source focus of each other.}
\label{fig:point-source}
\end{figure}
the source, S, and the mirror, M are placed so that a point-source light from
one is focused precisely on the other,
then the return image will be as bright and as crisp as possible.

Note that the distance between the lens, L, and the mirror, M,
is not equal to that between L and the source, S.
As M moves farther from the lens, S will need to be moved closer in
order for both points to remain at the focus of the other's point source.
This is true provided both points
are beyond the focal length of the lens (that point where
beams of light parallel on one side of the lens
would meet on the other side).
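The trade-off described here is the conjugate-focus property of a converging lens. In modern notation (the standard thin-lens relation, which the text describes but does not write down), if $f$ denotes the focal length of L, then
\[
\frac{1}{|LS|} + \frac{1}{|LM|} = \frac{1}{f},
\]
so as $|LM|$ increases, $|LS|$ must decrease toward $f$; neither distance can be smaller than $f$ if a real image is to form.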

By moving S and M farther apart, all the while keeping each at the other's
point focus, we increase the distance the light must travel and therefore
the time it will take.
Even so, the time taken is exceedingly short and difficult to measure.

Instead of Fizeau's wheel, Foucault
used a rotating mirror interposed between S and L
as in the next diagram.\footnote{According to Newcomb (\cite{Newcomb:1882} page 117) this had been
suggested much earlier by Charles Wheatstone (1802-1875)
and tried without success
by Dominique Fran\c{c}ois Jean Arago (1786-1853) in 1838.}
\begin{figure}[htbp]
\centerline{\psfig{figure=focus.ps,height=1.5in,width=5in}}
\caption{Interposing a mirror, R, between the source S and the lens L.}
\label{fig:focus}
\end{figure}
Light rays from the source that strike R and proceed through the lens L
will strike M and return to the source S.
If, after the light beam first strikes R outbound from S, R can be rotated
before it is struck again by the beam returning from M, then the
returning beam will no longer return exactly to the source S but
will instead be deflected away from S in the direction of the rotation
(Figure \ref{fig:deflect}).
\begin{figure}[htp]
\centerline{\psfig{figure=mirror.ps,height=2.0in,width=2in}}
\caption{Rotating the mirror R causes the returning beam to be deflected.}
\label{fig:deflect}
\end{figure}

By rotating the mirror at a constant speed, the amount of deflection will be the
same for all light beams that go through L, strike M and return.
Then, for a continuous beam of light from S and a constant high speed of rotation
of R, an image of the source will appear beside S instead of coincident
upon it (as shown in Figure \ref{fig:displacement}).
\begin{figure}[htp]
\centerline{\psfig{figure=displacement.ps,height=1.0in,width=2.6in}}
\caption{The return image I is displaced from the source S by the
rotating mirror R.}
\label{fig:displacement}
\end{figure}
The faster R rotates, or the longer $|RS|$ is, the farther the returned image, I, will be displaced from
the source, S, and the easier it will be to measure the deflection.

By carefully measuring the amount of displacement from S to I (see Figure
\ref{fig:displacement}),
and the distance from
S to R, the angle of deflection can be determined. 
Together with the known, fixed speed of rotation, this angle can be used to
determine the time it took light to travel the distance from R to M and back.
Dividing distance by time gives a determination of the speed of light.

Let $\theta$ denote the angle of deflection. Then the angle through which the mirror has rotated
is easily shown to be $\theta / 2$.
The angle $\theta$ in degrees is $\arctan(|IS|/|SR|)$.
If the speed of rotation is $n$, measured in cycles per second, then the time taken for the light beam to travel
from $R$ to $M$ and back is $\frac{1}{n} \times \frac{\theta /2}{360}$ seconds.
The speed of light transmitted under the conditions of the study is therefore
\[
\frac{2 \times 360\, n}{\arctan(|IS|/|SR|)} \times  2|RM|.
\]
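As a numerical sketch of this calculation (the measurement values below are invented for illustration only, chosen merely to be of a plausible order of magnitude, and are not Michelson's figures):

```python
import math

def rotating_mirror_speed(displacement_m, dist_SR_m, rev_per_sec, dist_RM_m):
    """Speed of light (m/s) from a rotating-mirror measurement.
    theta (degrees) is the deflection angle; the mirror turns theta/2
    while the light makes the round trip R -> M -> R, which therefore
    takes (1/n) * (theta/2) / 360 seconds."""
    theta_deg = math.degrees(math.atan(displacement_m / dist_SR_m))
    round_trip_time = (theta_deg / 2.0) / 360.0 / rev_per_sec
    return 2.0 * dist_RM_m / round_trip_time

# Illustrative (invented) values: |IS| = 125.7 mm, |SR| = 10 m,
# 250 revolutions per second, |RM| = 600 m.
print(rotating_mirror_speed(0.1257, 10.0, 250.0, 600.0))  # close to 3e8 m/s
```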

In this arrangement, the distances $|$IS$|$ and $|$SR$|$ should be as large
as possible to reduce the error in measuring $\theta$. The distance $|$IS$|$ is maximized by
maximizing the speed of rotation of R and the distance $|$RM$|$.
Michelson's principal innovation in Foucault's design allowed
$|$RM$|$ to be very large.
In Foucault's setup, M was spherical with centre at R.
The greatest distance $|$RM$|$ achieved by Foucault
was 20 metres
(page 117 \cite{aamich:1880})
which produced a displacement $|$IS$|$
of only 0.7mm
(page 118 \cite{Newcomb:1882}).
Michelson chose to place the rotating
mirror at the focal point of the lens
which allowed him to
use a flat mirror for M.
That is, R should be placed
at that point where {\em parallel} light beams passing through
the lens from M meet on the other side as in Figure \ref{fig:parallel}.
\begin{figure}[htp]
%\centerline{\psfig{figure=/usr/people/rwoldford/admin/courses/st231/notes/cases/light/parallel.ps,height=0.
%75in,width=5in}}
\centerline{\psfig{figure=parallel.ps,height=0.75in,width=5in}}
\caption{R at the focal point of L.}
\label{fig:parallel}
\end{figure}
Then if the diameter of M was as large as that of L
any single beam passing from R through L would {\em necessarily} strike
M {\em and return} through L to R {\em whatever the distance between L and M}.
This permitted M to be placed very far away.
The only difficulty is that the farther away M is from L, the closer the
point-source focus S will
be to the focal point R which conflicts
with maximizing the distance between S and R.
This can be remedied somewhat by using a lens of large focal length. 

These innovations produced a displacement of more than 100 mm. Such a large displacement solved another difficulty.
Originally the eyepiece to observe the displaced image at S was offset using an inclined plate of silvered
glass to avoid interference between the observer and the outgoing
beam of light. Once the displacement exceeded 40 mm, it was possible to remove the
inclined plate and observe the displaced image directly. Michelson (page 116, \cite{aamich:1880}) noted

\subsection{Physical apparatus}

The following quotations and details are taken from Michelson's description of his study
(pages 118--124, \cite{aamich:1880}).

``The study would take place on a clear, almost level, stretch along the north
sea-wall of the Naval Academy.  A frame building was erected at the western
end of the line, a plan of which is represented
in Fig. 3\footnote{See our Figure \ref{fig:room}, which reproduces Michelson's Fig. 3.}
\begin{figure}[htp]
\centerline{\psfig{figure=light-path.ps,height=2.0in}}
\caption{Room showing experimental setup.}
\label{fig:room}
\end{figure}

%``The building was 45 feet long and 14 feet wide, and raised so that the line
The building was 45 feet long and 14 feet wide, and raised so that the line
along which the light travelled was about 11 feet above the ground.
A heliostat at H reflected the sun's rays through the slit at S to the revolving
mirror R, thence through a hole in the shutter, through the lens, and to
the distant mirror.''
%\footnote{{\em Ibid.}}

The heliostat is an instrument used to focus the sun's rays and direct them
in a narrow beam. This, then, was the source of light.
Because it is easier to adjust than the heliostat,
a small mirror, F, directs the beam from the heliostat to the slit.

``The lens was mounted in a wooden frame, which was placed on a support moving
on a slide, about 16 feet long, placed about 80 feet from the building.
... The fixed mirror was ... about 7 inches in diameter, mounted in a brass
frame capable of adjustment in a vertical and horizontal plane by screw motion.
.... To facilitate adjustment, a small telescope furnished with cross-hairs was
attached to the mirror by a universal joint.
The heavy frame was mounted on a brick pier, and the whole surrounded by a
wooden case to protect it from the sun.''
%\footnote{{\em Ibid} page 122.}

Unlike Foucault, Michelson used a flat mirror as the fixed mirror and
focused the light with a lens of long focal length
(an eight-inch non-achromatic lens with a 150-foot focus).
The lens was placed in position about 80 feet from the building
and the fixed mirror a distance of about 1920 feet from the building.
Both the mirror M and the lens L needed to be placed perpendicular to a common central axis
as in Figure \ref{fig:focus}.

Michelson gives no account 
%in \cite{aamich:1880}
of how the lens came to be positioned but he does
describe the positioning of the mirror in some detail.
First it was placed in position with the reflective surface
facing the hole in the building.

``A theodolite\footnote{A land surveying instrument used to measure
angles.} was placed at about 100 feet in front of the mirror,
and the latter was moved about by the screws till the observer at the theodolite
saw the image of his telescope reflected in the center of the mirror.
Then the telescope attached to the mirror was pointed (without moving
the mirror itself) at a mark on a piece of card-board attached to the
theodolite.''
%\footnote{{\em Ibid}, page 122.}

In this way the telescope atop the mirror was placed at right angles
to its reflective surface.

``The theodolite was then moved to 1,000 feet, and, if found necessary,
the adjustment\footnote{to the telescope.} repeated.''
%\footnote{{\em Ibid.}}

With the telescope thus placed, the mirror was moved until its
telescope pointed at the hole in the building. A final adjustment was made by having someone
focus a spyglass at the fixed mirror from inside the building.
The mirror was then moved using the screws until the observer saw the image of his
spyglass reflected centrally in the mirror.
%This last adjustment had to be repeated before every series of observations
%as the mirror would change its position between morning and evening.

The rotating mirror was a 1.25 inch circular disc (0.2 in. thick)
silvered on one side.
It was held on a vertical spindle that was in turn held in a cast iron frame.
This frame could be tilted side to side and forwards
and backwards by means of small cords.
The spindle had pointed ends which pivoted in
conical sockets in the frame; these were the only contact points between the
frame and the spindle.
The top part of the spindle passed through the centre of a small wheel
inside a circular enclosure attached to the frame.
This wheel held the spindle by friction.
Forcing air into the enclosure, over the surface of the wheel, and out
again in a circular fashion would cause the wheel, and hence the spindle,
to turn.
The spindle would have to be carefully balanced so that it turned smoothly
without wobbling.
The air to power this small turbine came
from a steam-powered pump located in the basement
of the building.
A tube connected the pump to the turbine.
Because
the mirror's rotational speed remains constant only while the pressure from
the pump is constant,
a system of regulators, valves and feed-back control\footnote{{\em Ibid}, figures 11 and 12, page 124.}
was installed to adjust the pressure and hence the speed. Michelson notes that the system could
hold the speed of rotation constant for three or four seconds, which was sufficient to
make a measurement.

So as to further increase the distance $|$SR$|$,
the rotating mirror was placed slightly closer to the lens 
than at the focal point of the lens ({\em i.e.} its parallel beam focus).
This would make for a slightly less clear image than having R at the
focus, as fewer rays strike and are returned from M.

``A limit is soon reached, however, for the quantity of light received
diminishes rapidly as the revolving mirror approaches the lens.''
%\footnote{{\em Ibid} page 118.}

This limit is about 15 feet closer to L than is its focal point.
Michelson's previous studies showed that
if R rotates at about 258 revolutions per second, and
the distance $|$SR$|$, or {\em radius},\footnote{Names of variates, like ``radius,''
whose values Michelson recorded 
are italicized here when first mentioned.}
is about 28.6 feet, then the
deflection should be around 115 mm.
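The predicted deflection can be verified with a short calculation. This sketch assumes the modern value of the speed of light and the approximate mirror distance of about 1985 feet reported later in the paper.

```python
import math

FT_TO_M = 0.3048
C = 2.99792458e8            # modern value of the speed of light, m/s

n = 258.0                   # revolutions per second
radius = 28.6 * FT_TO_M     # |SR| in metres
rm = 1985.0 * FT_TO_M       # |RM| in metres (approximate)

time_of_flight = 2.0 * rm / C                        # R -> M -> R, seconds
mirror_angle = 2.0 * math.pi * n * time_of_flight    # radians turned by R
theta = 2.0 * mirror_angle                           # angle of deflection
displacement_mm = radius * math.tan(theta) * 1000.0

print(f"predicted deflection ~ {displacement_mm:.0f} mm")
```

This predicts about 114 mm, in good agreement with the 115 mm figure quoted above.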

\subsection{Measurement equipment}

Michelson made use of several pieces of measurement equipment.

Distances $|$SR$|$ and $|$RM$|$ were measured using a steel tape, nominally 100
feet long.

The {\em displacement} $|$IS$|$ was measured by means of a calibrated
micrometer as shown in Figure \ref{fig:micrometer}.
\begin{figure}[htbp]
%\centerline{\psfig{figure=/usr/people/rwoldford/admin/courses/st231/notes/cases/light/micrometer.ps,height
%=2.0in,width=2.0in}}
\centerline{\psfig{figure=micrometer.ps,height=2.0in,width=2.0in}}
\caption{Micrometer measures the displacement $|$IS$|$.}
\label{fig:micrometer}
\end{figure}
The source of the light was a narrow vertical slit that was
fixed in place on the micrometer.
The micrometer had a small telescope that could be moved left to right
using a dial at the right.
Each turn of the screw 
would move the telescope some small known amount. In Figure
\ref{fig:micrometer}, the horizontal scale shown marks the amount turned.
At the focus of the telescope
lens (about 2 inches), and in nearly the same plane as
the slit, S, was a single vertical silk fibre that served as a vertical
cross-hair for alignment purposes.
By turning the screw, the telescope could be positioned so that this fibre
was centred on the returning image of the slit at $I$.
The amount the telescope had to be moved from its initial position at the slit,
to the position of the image would be the displacement $|$IS$|$.

The speed of rotation $n$, {\em number of revolutions per second}, of the
revolving mirror was set using an electric tuning fork which vibrated at 
about 128 cps. The valve from the pump
was opened to rotate the mirror R and make its speed in revolutions per second 
match the frequency of the electric tuning fork in vibrations per second.
The speed and frequency were matched by having a small mirror attached to one arm
of the tuning fork placed so that
some light reflected from the revolving mirror was in turn
reflected by the tuning fork's mirror to produce an image
of the disk of the revolving mirror on a piece of plane glass located
near the lens of the eyepiece of the micrometer.
If the tuning fork frequency and the speed of the revolving mirror were the same, 
then the final image appearing on the glass would be distinct.
In most of Michelson's determinations,
the frequency of the fork was half that of the revolving mirror, so that two distinct
images were produced.\footnote{{\em Ibid}, figure 13, page 124.}

The frequency of the electric tuning fork, called $Vt_2$, was measured by counting the
{\em beats per second} between it and a standard tuning fork $Vt_3$ with known
frequency 256.070 cps at 65 degrees fahrenheit. A 60 second count period was used. The {\em temperature} was recorded
to correct  the frequency of the
standard fork for temperature. 
The frequency of the electric fork is thus one half of the sum of 256.070, the number of beats per second,
and the correction for temperature.

The final result for the speed of the revolving mirror in revolutions
per second is determined from the frequency of the electric tuning fork and the number of distinct images on
the glass plate.
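This frequency computation can be sketched as follows. The beat count and temperature correction below are hypothetical illustrative values, not Michelson's recorded data.

```python
STANDARD_FREQ = 256.070   # frequency of the standard fork Vt3, cps at 65 F

beats_per_second = 1.5    # hypothetical count of beats between Vt2 and Vt3
temp_correction = -0.02   # hypothetical correction to the standard fork

# Frequency of the electric fork: half the sum of the standard frequency,
# the beats per second, and the correction for temperature.
electric_freq = (STANDARD_FREQ + beats_per_second + temp_correction) / 2.0

# With two distinct images, the mirror turns at twice the fork frequency.
images = 2
mirror_speed = images * electric_freq     # revolutions per second

print(f"mirror speed = {mirror_speed:.3f} rev/s")
```

With these illustrative inputs the mirror speed comes out near the roughly 258 revolutions per second at which the apparatus was run.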

\subsection{Producing one determination of the speed of light}

\begin{enumerate}
\item
The distance $|$RM$|$ from the rotating mirror to the fixed mirror was measured
five times, each time allowing for temperature,  and the average used as the
``true distance'' between the mirrors for all determinations. 

\item
%On each occasion that the apparatus was to be used, 
The fire for the pump was
started about a half
hour before measurement began. After this time, there was sufficient
pressure to begin the determinations. 

\item
The fixed mirror M was adjusted as described above and the heliostat placed and adjusted
so that the sun's image was directed at the slit. 

\item
The revolving mirror was adjusted in two different axes.
First it was inclined to the right or left so that the direct reflection of the
light from the slit fell above or below the eyepiece of the micrometer.
Michelson
found that he had to tilt the revolving mirror as ``Otherwise this light would
overpower that which forms the image to be observed.''\footnote{{\em Ibid}.} ``The
revolving mirror was then adjusted by being moved about, and inclined forward and
backward, till the light was seen reflected back from the distant
mirror.''\footnote{{\em Ibid}, page 124.} Some adjustment in the calculations
was made for the tilting of the mirror.

\item
The distance $|$SR$|$ from the revolving mirror to the cross-hair of the eyepiece
was measured using the steel tape.
 
\item
The vertical cross-hair of the eyepiece of the micrometer
was centred on the slit and its position recorded in terms of the position of
the screw.

\item
The electric tuning fork was started.  The  frequency of the fork was measured
two or three times for each set of observations.

\item
The temperature was recorded.

\item
The revolving mirror was started. The eyepiece was set approximately to capture the displaced image.
If the image did not appear in the eyepiece,
the mirror was inclined forward or back until it came into sight. 
\item
The speed of
rotation of the mirror was adjusted until the image of the revolving mirror came
to rest.

\item
The micrometer eyepiece was moved by turning the screw until its
vertical cross-hair was centred on the return image of the slit.
The number of turns of the screw was recorded.
The displacement is the difference in the two positions.
To express this as the distance $|$IS$|$ in millimetres
the measured number of turns was multiplied by the calibrated number of mm.
per turn of the screw.

\item
Steps 10 and 11 were repeated 
until ten measurements of the displacement $|IS|$ were made.

\item
The rotating mirror was stopped, the temperature noted and the frequency of the electric fork was 
determined again.
\end{enumerate}
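The displacement recorded in steps 6 and 11 reduces to a simple computation; a minimal sketch follows, in which the screw calibration and the two screw positions are hypothetical illustrative values.

```python
MM_PER_TURN = 0.98765      # hypothetical calibrated mm per turn of the screw

turns_at_slit = 12.500     # hypothetical screw position centred on the slit (step 6)
turns_at_image = 128.913   # hypothetical screw position centred on the return image (step 11)

# The displacement |IS| is the difference in screw positions
# multiplied by the calibrated mm per turn.
displacement_mm = (turns_at_image - turns_at_slit) * MM_PER_TURN

print(f"|IS| = {displacement_mm:.2f} mm")
```

With these inputs the displacement is near the 115 mm scale of Michelson's observed deflections.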


\section{Statistical Method and Michelson's 1879 Study}

Using the apparatus and methods described above, Michelson began the first
of his many studies to determine the speed of light.
The study was conducted in 1879 and the results were published in 1880.
Here we examine the detail of the study 
to illustrate what we mean by {\em statistical method}. 

Statistical method, unlike Scientific Method as we will see in the next section,
can be usefully represented as a series of five stages: {\em Problem, Plan, Data,
Analysis, Conclusion}. We use the acronym PPDAC to refer to this series.
One stage leads to the next and is dependent on previous
stages. Looking back, this means that each stage is carried out and legitimized
(or not) in the context of the stages which precede it (e.g. there is little
value in a Plan that does not address the Problem). In such a case, one of the
two stages must be modified. Looking ahead at any stage, decisions can be made
that will simplify actions taken in a later stage (e.g. a well specified Problem
can be addressed by a simple Plan).

The structure for statistical method is useful in two ways: first, to provide a
template for actively conducting an empirical investigation and second, to
critically review completed studies. The structure of all empirical studies, either implicitly or explicitly,
can be represented by the five-stage model. Below, we use the structure in the second manner to
examine Michelson's 1879 study.

Each stage of statistical method comes with its own issues to be understood
and addressed. In the context of  Michelson's study, we introduce language
appropriate to describe these issues. As noted in the Introduction,  we chose Michelson's study for several 
reasons. In many ways, it is not typical of a  statistical investigation. We urge the readers to test the 
proposed structure and language
on other applications. 

\subsection{The Problem}
The purpose of this stage is to provide a clear statement of what is to be
learned from the study. 
To do so, it is important to translate the contextual problem under study into
a language that can guide the design and implementation of the subsequent stages
of Statistical Method. Understanding what is to be learned from the study is so
important that it is surprising that it is rarely, if ever,  treated in any introduction to 
statistics. In a cursory review, we could find no elementary statistics text that provided a structure to
understand the  problem. 
For example, the popular and well-regarded book by Moore and McCabe  \cite{MooreMcCabe:text} 
makes no mention of the role of statistics in problem formulation. 

Two exceptions are the paper by Hand \cite{Hand:decon} and  Chatfield's book, \cite{Chatfield:prob}. Hand's 
aim was ``to stimulate debate about the need to formulate research questions sufficiently precisely
that they may be unambiguously and correctly matched  with statistical techniques''. He suggests five 
principles to aid in this matching but no structure or language. Chatfield provides excellent advice to get a 
clear understanding of the physical background to the situation under study,  to clarify the objectives
and to formulate the problem in statistical terms. 

One of our goals is to provide appropriate 
language to formulate the problem and a structured approach  to this formulation. 

To execute the Problem stage, issues can be addressed using the following
terminology. 
\begin{enumerate}
\item
{\em Units and Target Population} - these specify the collective to which we are
interested in applying our learning or conclusion.
\item
{\em Variates} - numerical or categorical values attached to every unit in the
target population (values of variates may differ from unit to unit). 
\item
{\em Population Attributes} - functions that apply to the entire 
population calculated through the variate values on individual units.
\item
{\em Problem Aspect}  - either {\em causative}, {\em predictive} or
{\em descriptive}. A problem with a causative aspect corresponds to one where
interest lies in investigating the nature of a causative relationship
involving two or more variates in the target population. A problem has a predictive aspect if the object is to
predict the values of variates on one or more units in the target population. A problem has a descriptive
aspect
if the object is to estimate or describe one or more attributes of the population.
\end{enumerate}

To illustrate the terminology, in 1879, Michelson was keen to determine the speed of white light as it travels
between any two relatively stationary points in a vacuum.  A unit is one
transmission of such light between a source and destination, both located in a
vacuum. The target population is all such transmissions, before, during and
after 1879. The primary variate of interest, which we call the {\em response
variate}, is the speed of the light associated with each such transmission.  There are
many other variates, which we call {\em explanatory variates}
attached to  each unit  such as the distance between the two
points, the motion of the points with respect to each other, properties of the
source and so on.  In Michelson's problem, there is no direct interest in these other
variates. 

The attribute of interest is the speed of light averaged across all units in the target
population. This example is unusual in that it was believed that there is no
variation in the value of the response variate from unit to unit, {\em i.e.} that the speed of white light in a vacuum
is constant.

The problem here is descriptive. The aim is to estimate a population attribute.
If Michelson had been attempting to show that the speed of light can be changed
by, for example, having the destination move with respect to the source, then the Problem
would have had a causative aspect. It is important to decide the aspect at the Problem stage
because of the special requirements of the Plan needed to establish causation.

This language can be used to clarify many statistical issues. Consider, for example, Deming's notion of an 
enumerative versus analytic study \cite{Deming:1953}, \cite{HahnandMeeker:1993}.  If the target 
population can be listed so that a probabilistic sampling protocol giving every unit a 
positive inclusion probability can be used, the study is enumerative. Otherwise it is analytic.
Deming was particularly interested in 
contrasting the use of formal statistical procedures in sample surveys 
to studies of industrial processes,\footnote{We have found it is often useful to specify the target population by describing
the process that generates the units. This process is called, naturally, the target process.} where the target
population included units not yet produced. All studies on such processes are analytic.

In applying Exploratory Data Analysis, the target population is the sample, the units that are included in the 
study. The object is to find interesting relationships among the usually many variates available. The 
relationships are defined as attributes in the target population. 

Attributes can be numerical or graphical. For example, a scatter plot constructed in our imaginations using
all units in the target population is an attribute. The coefficients of the least squares line fitted to this scatter
plot and the
residual variation around the line are numerical attributes.
A clear specification of the attribute of interest can resolve many issues. Lord's paradox, as presented by
\cite{Hand:decon}, is easily resolved by noting that it involves two different attributes. See our discussion of
Hand's paper.

Problems with a predictive aspect pose a serious challenge to the language introduced here. Suppose 
the value of a response variate is measured yearly for
a number of years and the object is to predict the value for a subsequent year. What is the target population? Are there attributes
that can be defined to specify the Problem? We return to this issue in Section 6 when the language for
all stages of PPDAC is available.   
 
\subsection {The Plan}
The purpose of this stage is to develop a plan for the collection and analysis of the data. We propose to break
the planning into several sub-stages, some of
which inevitably overlap. In an active use of PPDAC, some iteration may be
required within the stage and between stages before a satisfactory plan is developed.

\subsubsection{Specifying the study units and study population}

The {\em study population} is the collective of {\em study units} for which the values of
the variates of interest could possibly be determined. This notion corresponds directly to the frame in the sample survey
literature.
For numerical attributes, the difference between the attributes of interest in the study population and the corresponding attributes
in the target population is called {\em study error}.

The study units may or may
not be part of the target population; in Michelson's study they were not.
Because the distances required to measure the speed of light were so large, it
was not practical to have the light travel through even a partial vacuum.\footnote{Even as he was dying, Michelson directed a
study to measure the speed of light in a mile-long tube that was evacuated to a near vacuum \cite{aamich:vacuum}.}
All of the units in Michelson's study involved the transmission
of light through air at a particular location over a specified time period. The source and destination were a
fixed distance apart and both remained stationary over the course of the study.
Michelson decided to look at transmission of light at one hour before
sunset or one hour after sunrise during a few days in June 1879. Within these
constraints, he was free to choose the units on which he would determine the
speed of light.

The study population and the study units were very different from the target
in this instance.  Michelson recognized that measuring the speed of light in air
would result in a study error. He planned to
correct the error by using a factor based on the refractive index of air. Note that this correction is 
outside the purview of statistical method. It requires contextual knowledge. 

There are several competing criteria in specifying the study population. Ideally, the attributes of interest, 
defined for the study population should equal those in the target population to eliminate study error.
This can be guaranteed 
only if the target and study populations are the same, a very rare circumstance. In most cases, we need to 
rely on the judgement of the investigators to assess how different these attributes are. Applying statistical 
method ensures that this comparison is considered before the Plan is executed.
Since all units in the study population by definition must be available for inclusion in the study, this 
population is at best a subset of the target population. For example, in process studies, the study population 
is limited to those units to be produced over a selected time frame. In many cases, such as the use of 
animals to study human health issues, for economic or ethical reasons, the study units are fundamentally 
different from the units in the target population. 

 
This language can again be used to discuss many important issues. For example, in meta-analysis,
one major issue is the inclusion or exclusion of studies from the analysis. One aspect
of this issue can be discussed by comparing the study population to the target for each study considered for
inclusion.

It is our contention that all study populations are finite. Given the specified time frame, the limited financial
resources available, and the time taken to make a determination,
Michelson's study population could have been enumerated. Of course, this was not done; the point is that it
was possible. Accepting the contention, it follows that all studies are potentially enumerative if conclusions 
are applied to the study population. Extrapolating these conclusions to the target population requires 
induction and is context dependent. 


\subsubsection{Selection of the response variates to be measured}

The Plan must include a step in which we decide what variates we will measure
on each unit to be selected in the sample. 
Response variates, corresponding as much as possible to those used to
define attributes of interest in the target population, must be clearly defined. 

Michelson could not measure the speed of light on a unit directly with his
apparatus. Instead, for each determination,  he measured the following response
variates to calculate the speed of light.
\begin{enumerate}
\item
the displacement $d$ of the image of the slit. This was measured on each unit.
\item
the radius $r$, the distance between the cross-hairs of the slit and the front
face of the rotating mirror. This value was not always determined for units
measured in the same time period but was measured each morning or evening when
units were sampled.
\item
the number of beats $B$ per second 
between the electric $Vt_2$ fork and the standard $Vt_3$. This variate was
determined once for each set of 10 determinations of $d$.
\item
the temperature $T$ measured once for each set of 10 determinations of $d$.
\end{enumerate}

The values of the response variates were combined with several constants
according to the formulae (3) and (4) (\cite{aamich:1880} page 133) to produce a value for the
speed of light in air at temperature $T$.

\subsubsection{Dealing with explanatory variates}

There are usually many other variates associated with each unit in the study population.
We call these explanatory variates; they can be used to explain differences
in the response variates from unit to unit in the study population.

It is important to decide how explanatory variates will be dealt with during
the planning stage. There are three choices. First, the study population can be
redefined by holding an explanatory variate fixed or by restricting its range of values.
Second, the explanatory variate can be measured or deliberately set for each unit included in
the study and its value utilized in the analysis or third, the
explanatory variate can be ignored completely. The third course of action is
taken if it is known in advance that the explanatory variate is unimportant (i.e.
it does not explain variation in the response variates) or out of ignorance, not
recognizing the presence or importance of the variate.

Reviewing Michelson's apparatus and proposed method, there are many explanatory
variates in the study population that may explain why the speed of
light as determined from the measured response variates
varies from unit to unit. 
Michelson recognized that it was important to consider these variates and in his
Plan dealt with them in all three ways. For example, he fixed the distance from
the rotating to the fixed mirror, thus further defining the study population. He
also deliberately varied the angle of inclination of the plane of rotation of the
revolving mirror from $\arctan(0.02)$ in the early determinations to
$\arctan(0.015)$ in the final twelve sets. He measured a large number of
explanatory variates such as the observer, the day, the quality of the image and
so on. He ignored barometric pressure because (\cite{aamich:1880}, page 141)
``... error due to neglecting
barometric height is exceedingly small.''

We have constructed a fishbone diagram (Figure \ref{fig:fishbone}),
a useful tool for deciding how to deal with potentially important explanatory variates. 

\begin{figure}[htbp]
\centerline{\psfig{figure=fishbone.ps,height=5.0in}}
\caption{Fishbone diagram.}
\label{fig:fishbone}
\end{figure}

The primary difference between experimental and observational studies can be explained with  
this language. In an experimental study, values of explanatory variates corresponding to factors of
interest are set by the experimenter and assigned to units in the sample. In an observational study,
these explanatory variates are not controlled, except perhaps by the sampling protocol. Their
measured values are used in the analysis.

\subsubsection{The measurement processes}


A key element of the Plan is to decide how to measure the selected response
and explanatory variates on the units in the sample. To determine the value of any variate on a unit, we
call the measuring devices, methods and individuals involved the {\em measurement
process}. Once a measurement process is specified, it is important to understand
its properties. 
We call {\em measurement error} the difference between the value of the variate
determined by the measurement process and the ``true'' value. Measurement error is
propagated through the Analysis and hence to the Conclusions.

In many applications, an iteration  of PPDAC is
applied to investigate the attributes of the measurement process within the
overall study.
We define the
properties of the measurement process in terms of repeatedly measuring the same
study unit. Two concepts are {\em bias}, an attribute of the (target) measurement process
describing systematic measurement error, and {\em variability}, an attribute of
the (target) measurement process describing the change in the error from one determination
to the next. 

Michelson paid careful attention to the measurement processes he had specified
for his study and discussed at great length investigations he
undertook to ensure that there was little bias and variability.
Consider, for example, the measurement of the distance between the
two mirrors \cite{aamich:1880}(page 125).
To avoid bias, he calibrated a steel tape against a Wurdeman copy of
the standard yard. The calibration used a comparator with two microscopes, one
fixed and one that can be moved towards or away from the fixed microscope by
turning a screw. The distance between the microscopes was set to 1 standard yard.
Then the tape was placed in the comparator so that .1 ft corresponded to the
cross-hairs of the fixed microscope and the length of the first yard of the tape
was determined by rotating the screw until the cross-hairs of the movable
microscope corresponded to 3.1 ft on the tape. This procedure was repeated 33
times to determine the cumulative number of turns of the screw corresponding to
the length of the tape from .1 ft to 99.1 ft. The temperature was recorded so
that an adjustment (unexplained) could be made.

Next, he carried out a separate study to determine the distance corresponding
to 1 turn of the screw of the movable microscope. This was accomplished by
measuring 20 times the number of turns that correspond to 1 mm and then
averaging. It is clear that Michelson appreciated the power of averaging to
reduce variability in measurement. Combining the results of the two studies and
adjusting  for temperature, the corrected length of the 100 ft steel tape was
100.006 ft. 

To measure the distance between the two mirrors (approximately 2000 ft), the
plan was to place lead markers along the ground and use the tape to measure the
distance from one to the next following a carefully defined standard procedure.
The tape was to be placed along the (nearly) level ground and stretched using a
constant weight of 10 lbs. This led Michelson to investigate the stretch of the
tape.

To adjust for stretch, another small study was conducted in which the tape was
stretched using a 15 lb force and the stretch in mm at 20 ft intervals was
measured.  The data are shown below.
\begin{center}
\begin{tabular}{c c}
Length&Amount of Stretch \\
100&8.0 \\
80&5.0 \\
60&5.0 \\
40&3.5\\
20&1.5 \\
\end{tabular}
\end{center}
The correction, in mm,  for stretch in the tape to measure the distance between
the mirrors is then
\[
\mbox{correction} ~=~ \frac{8.0+5.0+5.0+3.5+1.5}{300}~ \times ~100~ \times ~ \frac{10}{15}
\]
Converted to feet and multiplied by 20, the overall correction for stretch was
+0.33 feet.

In the language we have introduced, for this small study, the study population
using a 15 lb force is different from the target population which requires a 10 lb
stretching force. Note also the curious weighted average for estimating the
amount of stretch per foot of tape.
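
As a check on the arithmetic, the stretch correction can be reproduced in a few lines (the millimetre-to-foot conversion factor and all variable names are ours):

```python
# Stretch, in mm, measured at 20 ft intervals under a 15 lb force.
stretch_mm = [8.0, 5.0, 5.0, 3.5, 1.5]        # at 100, 80, 60, 40, 20 ft
total_length_ft = 100 + 80 + 60 + 40 + 20     # 300, the divisor in the formula

# Stretch per 100 ft of tape, scaled from the 15 lb test force down to the
# 10 lb working force (assuming stretch proportional to force).
stretch_per_100ft_mm = sum(stretch_mm) / total_length_ft * 100 * (10 / 15)

mm_per_ft = 304.8                             # standard conversion factor
# The full distance took 20 tape lengths, so multiply by 20.
correction_ft = stretch_per_100ft_mm / mm_per_ft * 20
# correction_ft comes to about 0.335, which Michelson reported as +0.33 ft.
```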

The goal of introducing the corrections for stretch and length of the tape was 
to reduce bias in the final measurement of the distance between the two mirrors.
To reduce the variability of the distance measurement, the procedure was repeated
5 times (with corrections for temperature on each). The temperature corrected
measurements varied from 1984.93 to 1985.17 ft. Michelson used the average of the
5 determinations and then corrected for stretch and bias in the tape to get his
final measure of distance between the two mirrors. 

The case study is an excellent example of a careful scientist reducing measurement error 
from his measurement processes using two different approaches. Based on empirical studies,
he reduced bias by calibration and correction, and he reduced variability
by averaging. At the conclusion of his paper,
Michelson provided a detailed discussion of the effects of possible measurement bias on his
estimate of the speed of light. It is alarming to realize how often modern
data are produced and analyzed with little consideration for the properties of
the measurement process.\footnote{And no wonder since so little attention is paid to the measurement process in the teaching of statistics. Consider the advice of
Moore and McCabe \cite{MooreMcCabe:text} page 223 ``But, by and large,
questions of measurement belong to the substantive fields of science, not
the methodological field of statistics. We will therefore take for granted
that all variables we work with have specific definitions
and are satisfactorily measured.'' Two useful references are Youden \cite{Youden:meas} and
Wheeler and Lyday \cite{Wheeler:meas}. }

\subsubsection{The sampling protocol}

The {\em sampling protocol} is the procedure used to select units from the
study population to be measured. The goal of the sampling protocol is to select
units that are representative of the study population with respect to the
attribute(s) of interest. The sampling protocol deals with how and when the units
are selected and how many units are selected.

Michelson decided to sample a number of units one hour after sunrise and one
hour before sunset for a number of days between June 13 and July 2. The units
were selected in groups of 10 with from one to six groups taken per time period.
Units were selected by Michelson and, on two occasions, by his assistants
Lieutenant Nazro and Mr. Clason.  In all, 1000 units were sampled. Over the
course of the sampling, other explanatory variates were manipulated (speed of
rotation of the mirror, the angle of inclination of the rotating mirror etc.)
Michelson recognized the importance of selecting units with different values for
these explanatory variates so that he could verify that they did not affect the
measured velocity of light. Consider, for example, his discussion of observer
bias in the final section of the paper. To deal with this issue, additional sets
of measurements were taken by another observer who was blind to Michelson's
results. There was no systematic difference in the two sets of values. 

We call {\em sampling error} the difference between the attribute of interest
in the study population
and the corresponding attribute in the sample. As with measurement processes, there may be
bias and variability
associated with the sampling protocol. These are properties of the protocol and
not of any particular sample of units. As with the measurement process, bias and
variability are defined in terms of the properties of the sampling error when
repeatedly applying the sampling protocol to the study population. These replications are always
hypothetical which means that we can describe sampling bias and variability only
through a representation of the sampling protocol by a mathematical model. We
postpone discussion of  this model to the Analysis section although in the active
use of PPDAC, mathematical  models for the potential sampling protocol (and
measurement processes) are used to help with issues such as sample size.
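
The idea of defining sampling bias and variability through hypothetical replications of the protocol can be illustrated by simulation (the study population values here are invented):

```python
import random
import statistics

# Invented study population of 1000 units; each value is a variate measurement.
rng = random.Random(2)
study_population = [rng.gauss(50, 5) for _ in range(1000)]
population_mean = statistics.mean(study_population)

# Repeatedly apply a simple random sampling protocol (n = 10) and record the
# sampling error of the sample mean on each hypothetical replication.
errors = []
for _ in range(5000):
    sample = rng.sample(study_population, 10)
    errors.append(statistics.mean(sample) - population_mean)

sampling_bias = statistics.mean(errors)   # near 0 for random sampling
sampling_sd = statistics.stdev(errors)    # near 5 / sqrt(10)
```

Under random sampling the bias is essentially zero; a non-random protocol that, say, favoured large values would reveal its bias in the same way.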

In experimental plans, once the units have been sampled, the value of one or more explanatory variates is deliberately
manipulated on the sampled units before the response variate is measured. The {\em experimental design} is that part of the sampling
protocol that deals with how this manipulation will be done.

\subsubsection{The data collection protocol}

The {\em data collection protocol} is the procedure for executing the above steps of the Plan to collect and record the data. 
It deals with management and administrative issues such as who does what and when.
The goal is to avoid mistakes.  

Michelson gives us no indication of how
he planned to record his data. However, the meticulous care he showed elsewhere
in the planning of his study suggests that he would have been especially careful
to ensure that the data were recorded as measured.

In today's context, amongst other issues, this step will include consideration of data entry,
file structures, analysis software, and so on, especially for Plans in which a
large amount of data is to be accumulated. 

\subsection{The Data}

The primary purpose of the Data stage is to execute the Plan, monitoring any
deviations or exceptional occurrences as they occur.  Once the data are collected, processed (Michelson 
had to calculate the speed of light for each determination)
and stored, we propose to search for anomalies and to cleanse the data set when
appropriate. The goal is internal consistency. This is likely to be more profitable in an
active use of PPDAC as questions about the validity of any particular value can
be answered directly by the individuals making and recording the measurements.

As far as we can tell, Michelson  used all of the measurements on the 1000
units that he collected. Unfortunately, he did not report all 1000 data points
but instead gave the average value of the displacement $d$  for the 10
determinations in each set. 

All recorded explanatory variates were treated as
constant over the set. The values for the measured speed of light 
in air for each
set and the associated response and explanatory variates are given in
Tables \ref{table:michelson-data-1} and \ref{table:michelson-data-2}.
Table \ref{table:michelson-data-key} explains the columns in the tables.
\begin{table}
\input{data1.tex}
\caption{Michelson's data: First 50 observations.}
\label{table:michelson-data-1}
\end{table}

\begin{table}
\input{data2.tex}
\caption{Michelson's data: Last 50 observations.}
\label{table:michelson-data-2}
\end{table}

\begin{table}
{\small
\input{data-key.tex}
}
\caption{Michelson's data: Key to variates.}
\label{table:michelson-data-key}
\end{table}

Michelson did not question the internal consistency
of his data in print. Given our computational resources, it is an easy 
task to examine a large number of plots of the response variates versus the order of collection or other 
explanatory variates not involved in the Problem. 
Figure \ref{fig:speed-day}
is a plot of the recorded values for the speed 
of light in air versus the day of collection. Because so many values were recorded as identical, the plotted 
values
have uniform random noise in the range from -4 to 4 added;
this has the desired visual effect of spreading 
the points out in the plot.
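
The jittering step can be sketched as follows (the function and the tied values are ours, for illustration only):

```python
import random

def jitter(values, half_width=4.0, seed=0):
    """Add uniform noise in (-half_width, half_width) to each value,
    spreading tied points apart so they are all visible in a plot."""
    rng = random.Random(seed)
    return [v + rng.uniform(-half_width, half_width) for v in values]

# Hypothetical tied recordings of the speed of light in air (km/s).
speeds = [299850, 299850, 299850, 299880, 299880]
jittered = jitter(speeds)
```

The jitter is purely a display device; all numerical summaries are computed from the unjittered values.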

\begin{figure}[hbp]
\centerline{\psfig{figure=speed-day.ps,height=2.0in}}
\caption{Adjusted speed of light (jittered) versus day.}
\label{fig:speed-day}
\end{figure}

There is an apparent decreasing relationship that becomes even stronger
if the three outlying values are ignored.
The noticeable exceptions to this relationship appear to be the values
obtained on the last three days.
However, checking with the data as presented in the tables, we see that on
the third last day, Michelson inverted the rotating mirror R.
After two days in this position, he inverted it again to get the original
position.
Arguably, these changes affected the process; prior to that time the
measured values seemed to be drifting downwards.
This also holds for morning and evening measurements considered separately.
Clearly, the date is acting as a surrogate for some other lurking variable.
What that could be is not known.

\subsection{The Analysis}

The purpose of the analysis stage is to use the collected data and information
from the Plan to deal with the questions formulated in the problem step. The form
and formality of the Analysis depends on the complexity of the Problem and Plan, the
skill of the analyst, the
amount of variability induced by the Plan, and the audience for the documentation of the study. We propose 
the
following general breakdown of the stage:
\begin{itemize}
\item
construct graphical and numerical summaries selected to address the Problem directly
\item
model the Plan and data
\item
fit and assess the model 
\item
develop formal statistical procedures
\end{itemize}
All sub-stages are directed to addressing the Problem. 

Michelson limited his
analysis to the calculation of the average of the 100 measured velocities in air,
a numerical summary and an estimate of possible error, a formal procedure. The
error is based on a worst-case scenario, combining probable errors based on the
estimated standard deviations of replicate determinations and maximal systematic
error, based on Michelson's knowledge of his apparatus and the functions used to
calculate the speed of light from the measured response variates. For more
discussion on the use of probable error, see Stigler \cite{stig:hist}. 

After making a small adjustment for temperature (in air) based on the effects of temperature change on the 
systems used to determine $\phi$, the angle of deflection, and correcting to a vacuum, Michelson
concludes his analysis by reporting the speed of light in vacuo (kilometres per
second) to be
\[
299944 ~\pm 51
\]  

Although Michelson did not formally propose a model, he carried out numerous
checks that are equivalent to aspects of model assessment  (\cite{aamich:1880} page 139). For
example, to see if the measured speed of light was systematically influenced by
the distinctness of the image, an explanatory variate, he calculated and compared
the average velocities stratified by distinctness of image.  This checking was repeated for many other 
explanatory variates.

Today, we can use corresponding graphical methods. Perhaps the speed depends on some of the 
explanatory variates that are
not part of its calculation.
For example, has the effect of temperature been successfully removed from
the determinations?
A plot of speed versus temperature is shown in Figure \ref{fig:speed-temp}.
A fairly weak increasing trend is discernible in the plot.
However, even this trend depends heavily on the three points in the lower
left corner and so is not likely to alter the result significantly. Again the values have been jittered to resolve 
the over-plotting of identical values. 

\begin{figure}[htp]
\centerline{\psfig{figure=speed-temp.ps,height=2.0in}}
\caption{Adjusted speed of light (jittered) versus temperature.}
\label{fig:speed-temp}
\end{figure}

Curiously, in his comparisons of group averages, Michelson
did not compare morning and evening measurements
nor attempt to relate the measurement to the date, as we explored in 
the Data stage.
There are other interesting relationships to be found in these
data; we leave further exploration to the reader.

Note that there is often not a clear distinction between the checks for internal consistency in the Data stage and 
these model checks in the Analysis stage. The same plots or summaries may appear in either.  

Today, we can contemplate any number of ways to summarize, model and analyze the data. For example,
we might construct a histogram and calculate a 5-number summary of the 100
reported values. Based on a Gaussian model, which appears to fit the data well, a $95\%$ confidence 
interval  for the mean  is 
$$
299852.3 ~{\pm} ~15.7
$$
Correcting for temperature, following Michelson, and converting to a vacuum,
a $95\%$ confidence interval for the speed of light (km/s) in vacuo is
$$
299944.3~{\pm}~15.7 
$$

Note that the confidence interval is much shorter than that reported by Michelson,
who included both variability and possible bias in his calculation. Other more
complex modeling, analyses and model assessment can be made. The above is used to
demonstrate the sub-stages within the Analysis stage of PPDAC. Again it is
evidence of Michelson's precision as a scientist that his analysis so carefully
parallels what can be done today. 
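
The interval arithmetic can be sketched as follows; the paper does not report a sample standard deviation, so the value below is our back-calculation from the stated half-width, and the $t$ quantile is the standard tabled value.

```python
import math

n = 100
t_crit = 1.9842        # t quantile, 0.975 level, 99 degrees of freedom
half_width = 15.7      # reported half-width of the 95% CI (km/s)

# Implied sample standard deviation (our back-calculation, not from the paper).
s = half_width * math.sqrt(n) / t_crit   # roughly 79 km/s

mean_air = 299852.3    # mean of the 100 reported speeds in air (km/s)
lower, upper = mean_air - half_width, mean_air + half_width
```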

\subsection{The Conclusion}

The purpose of the Conclusion stage is to report the results of the study in the
language of the Problem. Concise numerical and graphical summaries can be used to clarify the discussion. Statistical jargon should be avoided.
As well, the Conclusion provides an opportunity to discuss the
strengths and weaknesses of the Plan, Data and Analysis especially in regards to possible errors.

In Michelson's study, he concludes by 
reporting the speed of light (km/s) in vacuo
as
$299944 ~\pm 51$.
He then discusses possible ``Objections'', including, among others not mentioned
above, uncertainty of the laws of reflection and refraction in media in rapid
rotation, retardation caused by reflection, imperfections in the lens, periodic
variation in friction at the pivots of the rotating mirror and change of speed of
rotation. In each case, he refers back to the Plan and the model assessment to
demonstrate that the objection would have little effect on the estimate of the
speed of light.

In our language, we would start with the reported speed of light based on the
confidence interval. To the discussion given by Michelson, we would add
the possible error due to
the difference between the target and study populations.

We can find no reason in the paper as to why there is such a relatively
large bias in Michelson's final reported speed.
Note that the defined true value is well outside both the confidence interval
and Michelson's interval of plausible values.

\section{On method in science.}
When examining the writings of those who have thought long and hard about the nature of science
one finds the same difficulties appearing again and again.\footnote{John
Losee's book \cite{Losee:intro} provides a
reasonable starting point.}
There is, for the most part, a great enthusiasm that science is progressing in some sense,
that we are learning ever more about the world around us, that we are continually solidifying that
knowledge, that our increasingly sophisticated technology is testament to the power of science.
Yet, when pressed, not only can we not agree on the method of science,
we can't quite agree on what science
is, or even whether what it talks about is real!
Looking over the history described in this paper we can get some inkling as to why this state
of affairs persists.

The progress seems real enough, from the question of light's speed being meaningless, to
discussion of whether it is finite or not, to increasing evidence for finite speed, to
ever `better' estimates of its value.
It might seem that scientific knowledge is the conjunction of the facts accumulated so far,
that theories live or die according to their verification or falsification by these facts,
and that, eventually, the truth will be inferred from the collection of facts.

Kuhn's work \cite{Kuhn:rev} describes a framework for this progress --
within a scientific `paradigm' normal science is pursued as a puzzle-solving activity,
this eventually produces anomalies, anomalies accumulate until a crisis is reached, a new paradigm
is somehow introduced, normal science proceeds again, and so on.
For example, normal science was pursued within a paradigm where light was without speed,
astronomical anomalies began to appear, leading ultimately to a theory where light had
a finite speed, whereupon normal science set about solving problems to establish its value.
In a more elaborate history, many such Kuhnian cycles would have been detectable. 

But what about method?
Long ago Aristotle wrote that knowledge, being ``a state of capacity to demonstrate'',
required the teaching of the principles of demonstration and so
the teaching of science necessarily ``$\ldots$
proceeds sometimes through induction and sometimes by deduction''(\cite{Aristotle:Nicomachean}
1139$^b$19 - 36).
But each is tricky to apply -- Francis Bacon, that strongest of proponents
of inductive method, allowed his perception of the incredible speed at which
stars move in their orbit about the Earth to form his inductive base and so concluded that
an infinite speed of light was reasonable;
no lesser talents
than Aristotle and Descartes by pure deduction demonstrated that light could
not possibly have finite speed.
Using induction and deduction in combination as in the
hypothetico-deductive approach is no easier.
It appears explicitly only twice in the above history
-- once by Aristotle to dismiss the argument of Empedocles, and once
by Descartes to dismiss that of Beeckman -- and wrong in both cases!
At various times each of these has been suggested as {\em the} method of science.
 
A slightly different tack is to take one such method and raise it to the status of
a criterion to distinguish science from non-science.
Karl Popper did this in 1934 with the hypothetico-deductive approach.
Contemptuous of the widely held view that the use of inductive methods
distinguished science from non-science, Popper proposed instead that
``it must be possible for an empirical scientific system to be refuted by experience.''
\footnote{\cite{Popper:logic}, page 41. }
That is, to merit the name scientific a theory must be falsifiable;\footnote{In a
paper meant to be a general resource \cite{Good:science},
I.J. Good gives partial prior credit to R.A. Fisher since tests of significance
\cite{Fisher:methods} predate Popper.
This credit seems misplaced -- Popper uses falsifiability as a {\em demarcation criterion}
for science, Fisher does nothing of the sort.}
a decisive experiment which refutes the theory is a crucial falsifying experiment.
By this criterion, the geocentric theory of the universe is scientific, being falsifiable
by any orbital system not centred about the Earth; Galileo's discovery of the moons of
Jupiter refuted this theory.
Similarly the scientific theories of light held by Aristotle and Descartes were refuted by
R\"{o}mer's determination of the speed of light.
This criterion is turned into method by having scientists focus on trying to refute theory;
theories are corroborated only by surviving the most stringent of testing.

But normal science is conservative. 
Crucial experiments are typically only recognized as such long after the fact
-- Cassini et al.\ showed at the time that R\"{o}mer's observations could be accommodated by existing
theory.\footnote{See \cite{Lakatos:meth} pages 71 - 90 for further examples and discussion.}
If theories were thrown out when first refuted, the result would
be chaos.  Instead normal science motors along, sometimes fine tuning its theory
to accommodate the new information,
sometimes patching the theory with auxiliary hypotheses, and sometimes just
tossing the information into the back seat
where Popper's refutations become Kuhn's anomalies.
As the anomalies accumulate, the ride gets rougher and some members of the scientific community
become increasingly uneasy that a crisis is around the corner.

It is here that Kuhn's work is most interesting and most troublesome.
Kuhn likens the transition from one paradigm to the next to that of a gestalt
shift in visual perception.
Like a gestalt shift, a paradigm shift is sudden and without reason.
Unlike a gestalt shift, a paradigm shift does not allow the scientist to switch
between paradigms; no neutral third viewpoint exists from which both paradigms can be seen
-- if there were then this would be the new paradigm.
This is not to say that the new paradigm cannot be reasoned about and justified to some
satisfaction, but rather that it may not be possible to do so by comparing it to the old.
For once the transition is complete, the convert's view of the
field will have changed -- its methods, its concepts, its questions, even its data --
and the old paradigm can only be viewed from the perspective of the new.
In a word, the two paradigms are incommensurable.  Concepts, theory, methods, and data that
are meaningful according to one might not be according to the other.

Consider the concept of light.
According to Aristotle, light required an intervening transparent substance (like air or water);
it could not exist in a vacuum.
Things are transparent, of course, only because they contain a `certain substance' which is `also
found in the eternal upper body' (possibly aether? itself a concept Aristotle tells us he has
changed from that of Anaxagoras\footnote{\cite{Aristotle:heaven} 270$^b$20-25.}).
`Of this substance, light is the activity.' But it is not movement.
Moreover, the visibility in the dark
of bioluminescent plants and animals does {\em not} depend upon light!\footnote{See \cite{Aristotle:soul} 418$^a$26 to
419$^b$2 for most of the points made here.}
From this Aristotle says he has explained light.
Not only is Aristotle's concept different from ours, but to really understand what he
means by light we would need to become immersed in his paradigm.
Scientific concepts like light change in irreversible ways; some like aether disappear
altogether -- even after thousands of years of service.

Nor are concepts alone determined by the paradigm. 
So too are the `empirical facts' --
Francis Bacon's data included fantastic speeds for the movement of the stars about the Earth;
Glaseknapp demonstrated that different theory produced different `observed' speeds of light.
Even relatively raw `sense data' can be dependent upon theory.
Soon after Galileo announced the discovery of Jupiter's moons, he had others verify his
observations using his telescopes.
Many could not see the satellites;
those who could see multiple lighted spots could not be certain that these were not
artefacts of the new instrument. 
Only once the optics of telescopes was developed could there be confidence in the verity of the
observations.\footnote{See chapter 9 of \cite{Feyerabend:method}.}
Modern instruments produce observations that are irrevocably `theory laden.'

Paradigm shifts, incommensurability, and theory laden data have all contributed
to what Ian Hacking \cite{Hacking:phil} calls ``a crisis in rationality''  -- at least for
philosophers of science.  Is there such a thing as scientific reasoning?
Are the entities with which science deals real or are they human constructs?
Does it make sense to think that there is in fact an ideal truth to which science might
converge?

\section{And what of statistics?}
When statisticians look at the nature of science, they
see reflected the nature of statistics.\footnote{A notable exception is Pearson's
{\em The Grammar of Science} \cite{pearson:grammar}.}
Deduction becomes probability theory; induction, statistical theory (e.g. 
pp 6-7 of \cite{Barnett:comparative});
scientific method is hypothetico-deductive
(e.g. \cite{Box:science}, \cite{Durbin:pres-rss}, \cite{Nelder:pres-rss}),
self-evident in statistics through
formal hypothesis testing and model criticism; put it together and you have,
reminiscent of Aristotle,
what George Box has called ``the advancement of learning'' \cite{Box:science}.
But, as the previous section has shown, science is not really like that.
Neither should be our understanding of statistics.\footnote{
Indeed, John Tukey's long battle for the legitimacy of exploratory data analysis might have
been easier if there had been greater sympathy in the statistical research community
for separate contexts for discovery and for justification in science.
E.g. see \cite{Tukey:both}.}

Certainly statistical investigation meets with the same issues raised in the previous section
but it can deal with them more easily. This is because it has a considerably more focussed domain
of application.  For example,
consider the two old chestnuts of the philosophy of science -- the realist/anti-realist debate and the problem 
of
induction.

The realist/anti-realist debate concerns whether the entities of science are real or
mere theoretical constructs.
The primary entities of statistical investigation are the units of the {\em study} population
and the values of variates measured on them.
The units and their collective must be determined with sufficient care for it to be
possible to select any individual from the collective.
Sometimes considerable effort must be put into ensuring that measurement systems
return reliable values of the variates they purport to measure.
Within this context, statisticians become scientific realists in Hacking's sense --
if we can select them and take measurements on them, they are real \cite{Hacking:phil};
if we cannot, then statistical investigation ceases.
Whether future scientific study shows the units to be composites of other more `fundamental'
units or that the variates measured are to be interpreted differently
is beside the point.

\begin{figure}[htp]
\centerline{\psfig{figure=induction.eps,height=4.0in}}
\caption{Induction from the set of measured values to the target population.}
\label{fig:induction}
\end{figure}


As regards induction, for statistics the problem can be neatly separated into two pieces (see Figure \ref{fig:induction}).
Ultimately, interest lies in the {\em target} population, as it is nearest
to the broad scientific concerns of the problem.
This population may be infinite, possibly uncountably so, and its definition can
involve phrases like `all units now and {\em in the future}.'
Drawing conclusions about this population will often require
arguments that are extra-statistical for they will be based on the similarities of, and
differences between, the {\em target} population and the {\em study} population.
Such arguments may ultimately be unable to avoid assuming
Hume's `uniformity of nature' principle (\cite{Hume:treatise} page 89) and hence what
philosophers mean by the `problem of induction.'
Such weighty problems dissipate when focus shifts to drawing
conclusions about the {\em study} population.
By definition, all study populations are finite in size, and random
selection of units to form a sample is possible.
Random selection provides the strongest grounds for inductive inference.
When, for whatever reason, random selection has not been employed, the case
must be made either that it has been closely enough approximated or that the
sample is itself similar in its attributes of interest to the study (or target) population.
The latter is much like
making the case for the transfer of conclusions from the {\em study} to the
{\em target} population and so can be just as difficult.
In either case, the arguments will to a large extent be extra-statistical.

The critical reader might suppose that the structure we propose is designed
to relegate all the difficult problems to the realm of the `extra-statistical.'
But this is not sweeping them under the rug.  Just the opposite. They are exposed
as potentially weak links in the chain of inference about which statistics has nothing to say.\footnote{This 
does not
preclude further statistical studies being carried out to address some of these problems
(e.g. further investigation of study error).}
The five stage structure is a template for any statistical investigation
and so its applicability could be regarded as a demarcation criterion for statistics.
Post-hoc, the structure allows us to identify the strengths and weaknesses in the
statistical argument; in some investigations, even weak arguments may be all that
are available.
Ad hoc, it provides a useful strategy for finding out about populations and their attributes.

\section{Conclusions}
Statistics is not about the method of science with its paradigm shifts and incommensurability;
it is about investigating phenomena as they relate to populations of units.
As fascinating as the questions raised in Section 5 might be, they are not our questions.
That is a good thing; the empirical evidence to date suggests that they may not be
resolvable.

The five stage PPDAC process with the associated language and sub-stages
provides a good framework for describing investigations such as Michelson's,
especially for people learning the intricacies of Statistics. More importantly,
in actively planning and executing an empirical investigation, we believe that
the framework is very valuable to ensure that important issues are at least
considered.  And this is the case for every statistical investigation.

\begin{figure}[h]
\begin{center}
\begin{tabular}{|ll|ll|}
\hline
&$~~~~~~~~$ && \\
{\bf Problem} & & & \\
&& - Units \& Target Population (Process)&\\
&& - Response Variate(s)&\\
&& - Explanatory Variates &\\
&& - Population Attribute(s) &\\
&& - Problem Aspect(s) --  causative, descriptive, predictive &\\
& & & \\
{\bf Plan} & & & \\
&& - Study Population (Process)&\\
& &  $~~~$(Units, Variates, Attributes)&\\
&& - Selecting the response variate(s) &\\
&& - Dealing with explanatory variates &\\
&& - Sampling Protocol &\\
&& - Measurement processes &\\
&& - Data Collection Protocol&\\
& & & \\
{\bf Data} & & &\\
&& - Execute the Plan &\\
&& $~~~$ and  record all departures &\\
&& - Data Monitoring &\\
&& - Data Examination &\\
&& $~~~$ for internal consistency&\\
&& - Data storage &\\
& & & \\
{\bf Analysis} & & &\\
&& - Data Summary&\\
&& $~~~$ numerical and graphical&\\
&& - Model construction &\\
&& $~~~$ build, fit, criticize cycle &\\
&& - Formal analysis &\\
& & & \\
{\bf Conclusions} & & &\\
&& - Synthesis&\\
&& $~~~$ plain language, effective presentation graphics&\\
&& - Limitations of study &\\
&& $~~~$ discussion of potential errors &\\
& && \\
\hline
\end{tabular}
\caption{The statistical method.}
\label{fig-ppdac}
\end{center}
\end{figure}
Karl Pearson had it almost right.  Whatever the case for science, we can say that
the unity of Statistics consists alone in its method, not in its material.
And it is this method that should be given the broadest dissemination.

\section*{Acknowledgements}
Thanks are due to many people for many helpful discussions.
They include our colleagues Greg Bennett and
Winston Cherry of the Department of Statistics and Actuarial
Science,
astronomers Judith Irvin of Queen's University
and Dieter Brookner of Kingston who pointed out Cotter's book
\cite{nauthist:1968} to us,
and Stephen Stigler of the University of Chicago for his
helpful comments on early drafts of this paper.

All quantitative graphics were produced using the Quail statistical software
environment now available on the world-wide web.

\bibliography{research}
\end{document}

