\subsection{The Problem}
The purpose of the Problem stage is to provide a clear statement of what is to be
learned from the study. 
To do so, it is important to translate the contextual problem under study into
a language that can guide the design and implementation of the subsequent stages
of Statistical Method. Understanding what is to be learned from the study is so
important that it is surprising that it is rarely treated in any introduction to 
Statistical Method. In a cursory review, we could find no elementary text that provided a structure to
understand the  problem. 
For example, the popular text  by Moore and McCabe  \cite{MooreMcCabe:text} 
makes no mention of the role of Statistics in problem formulation. The cost of this 
omission is incalculable.

Two exceptions are the paper by Hand \cite{Hand:decon} and Chatfield's book \cite{Chatfield:prob}. Hand's 
aim was ``to stimulate debate about the need to formulate research questions sufficiently precisely
that they may be unambiguously and correctly matched  with statistical techniques''. He suggests five 
principles to aid in this matching but no structure or language. Chatfield provides excellent advice on how to get a 
clear understanding of the physical background to the situation under study, clarify the objectives
and formulate the problem in statistical terms. One of our goals is to provide appropriate statistical terms, 
a language to formulate the problem, and a structured approach  to this formulation. 

We break the Problem stage into several sub-stages.
\subsubsection{Specifying the Target Units and Population}
Statistical Method is concerned with gaining knowledge about a population. Here we borrow the
language of sample surveys \cite{cochran:1977} with the intent to apply it
broadly. The {\em target population} is the collection of {\em units} about which we want to learn
or to which we want to apply the conclusions.

To illustrate the terminology, in 1879, Michelson was keen to determine the speed of white light as
 it travels
between any two relatively stationary points in a vacuum.  A unit is one
transmission of such light between a source and destination, both located in a
vacuum. The target population is all such transmissions, before, during and
after 1879.

\begin{itemize}
\item Poorly specified; looking back, sometimes arbitrary.
\item Finite or not is philosophical: limits may be unknown, and the population is conceptually
uncountable in this case. It does not matter.
\item One or two.
\item Unit changes, e.g., blood pressure over time; measurement system study.
\end{itemize}
The Problem stage is specified through the following elements.
\begin{enumerate}
\item
{\em Units and Target Population} - these specify the collective to which we are
interested in applying our learning.
\item
{\em Variates} - numerical or categorical values attached to every unit in the
target population (values of variates may differ from unit to unit). 
\item
{\em Population Attributes} - functions defined on the entire 
population and calculated from the variate values on individual units.
\item
{\em Problem Aspect} - either {\em causative}, {\em predictive} or
{\em descriptive}. A problem has a causative aspect if
interest lies in investigating the nature of a causative relationship
between two or more variates in the target population. A problem has a predictive aspect if the object is to 
predict the values of variates on one or more units in the target population. A problem has a descriptive 
aspect if the object is to estimate or describe one or more attributes of the population. 
\end{enumerate}

Returning to Michelson's study, the primary variate of interest, which we call the {\em response
variate}, is the speed of light associated with each transmission. There are
many other variates, which we call {\em explanatory variates},
attached to each unit, such as the distance between the two
points, the motion of the points with respect to each other, properties of the
source and so on. In Michelson's problem, there is no direct interest in these other
variates. 

The attribute of interest is the speed of light averaged across all units in the target
population. This example is unusual in that it was believed that there is no
variation in the value of the response variate from unit to unit, i.e., that the speed of white light in a vacuum 
is constant.
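
As a sketch only, in notation that is ours and introduced purely for illustration, suppose the target population $\mathcal{P}$ were finite with $N$ units, and write $y_u$ for the value of the response variate on unit $u$. The attribute of interest is then the population average
\[
a(\mathcal{P}) = \frac{1}{N} \sum_{u \in \mathcal{P}} y_u .
\]
If, as was believed, $y_u = c$ for every unit $u$, then $a(\mathcal{P}) = c$, so the average and the common value coincide. The finiteness assumption is made only to keep the formula simple.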

The problem here is descriptive. The aim is to estimate a population attribute.
If Michelson had been attempting to show that the speed of light can be changed
by, for example, having the source move towards the destination, then the Problem
would have had a causative aspect. It is important to decide the aspect at the Problem stage
because of the special requirements of the Plan needed to establish causation.

This language can be used to clarify many statistical issues. Consider, for example, Deming's notion of an 
enumerative versus an analytic study \cite{Deming:1953,HahnandMeeker:1993}. If the target 
population can be listed so that a probabilistic sampling protocol giving every unit a 
positive inclusion probability can be used, the study is enumerative. Deming was particularly interested in 
industrial processes\footnote{We have found it is often useful to specify the target population by describing 
the process that generates the units. This process is called, naturally, the target process.}, where the target 
population includes units not yet produced. All studies on such processes are analytic. 
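
The distinction can be sketched in symbols. The notation here is ours, not Deming's, and is offered only as an illustration. Let $\mathcal{P}$ denote the target population, $S$ the random set of units selected by the sampling protocol, and $\pi_u = \Pr(u \in S)$ the inclusion probability of unit $u$. The study is enumerative when a protocol exists with
\[
\pi_u > 0 \quad \mbox{for every } u \in \mathcal{P} .
\]
A unit not yet produced cannot be listed and so cannot be selected, which forces $\pi_u = 0$ for that unit. The condition therefore fails whenever the target population includes future production.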

In applying Exploratory Data Analysis, the target population is the sample, the units that are included in the 
study. The object is to find interesting relationships among the usually many variates available. The 
relationships are defined as attributes in the target population. 

A clear specification of the attribute of interest can resolve many issues. Lord's paradox, as presented by 
Hand \cite{Hand:decon}, is easily resolved by noting that it involves two different attributes. See our discussion of 
Hand's paper. 

