Part One: Probability

Notes on Proportions

A proportion is a simple ratio; how it is computed (and viewed) depends on the context in which it is defined. We consider population proportions, sample proportions, and how a proportion is a special kind of mean.

The Population Proportion

We begin with a Population (T) and an attribute (A). The proportion in this context is simply the ratio of members of T with attribute A to the total number of members in T.

In symbols, P_{T,A} = N_{T,A} / N_T.

In short, the proportion P_{T,A} tells us how often the attribute A occurs in the population T.
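As a minimal sketch of this definition, the population proportion can be computed by counting the members with the attribute and dividing by the population size. The population (the integers 1 through 20), the attribute ("is divisible by 4") and the function name population_proportion are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def population_proportion(population, has_attribute):
    """Return (members with the attribute) / (all members) as an exact fraction."""
    members = list(population)
    if not members:
        raise ValueError("population must be non-empty")
    count_with_attribute = sum(1 for member in members if has_attribute(member))
    return Fraction(count_with_attribute, len(members))


# Hypothetical population T: the integers 1..20.
# Hypothetical attribute A: "is divisible by 4".
T = range(1, 21)
P = population_proportion(T, lambda n: n % 4 == 0)
print(P, float(P))
```

Using Fraction keeps the ratio exact; converting with float gives the familiar decimal form.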

The Sample Proportion

We begin with a Random Sample (S) from Population (T) and an attribute (A). The proportion in this context is simply the ratio of members of S with attribute A to the total number of members in S.

In symbols, p_{S,A} = n_{S,A} / n_S.

In short, the proportion p_{S,A} tells us how often the attribute A occurs in the sample S.

The Linkage between Population and Sample Proportions

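Sometimes, the population proportion P_{T,A} can be used to indicate something about the sample proportion p_{S,A}, and vice versa.

In probability theory, P_{T,A} predicts the behavior of families of sample proportions p_{S,A}. In statistics, an observed p_{S,A} can say something about P_{T,A}.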

Under certain circumstances, knowing how often A occurs in a sample drawn from a population can lead to reliable estimates of how often A occurs in the population itself. Conversely, knowing how often A occurs in a population predicts the frequency of appearance of A in samples drawn from the population.
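The linkage can be explored with a small simulation. The sketch below assumes a hypothetical population of 1000 members, 300 of which have attribute A (a population proportion of 0.30), and draws 1000 random samples of size 100. The population, the sample size, the seed and the helper name sample_proportion are all illustrative assumptions.

```python
import random


def sample_proportion(sample, has_attribute):
    """Return the fraction of sample members that have the attribute."""
    return sum(1 for member in sample if has_attribute(member)) / len(sample)


def is_a(member):
    """Hypothetical attribute test: the member is labelled "A"."""
    return member == "A"


# Hypothetical population: 300 members with attribute A, 700 without.
population = ["A"] * 300 + ["not A"] * 700
true_proportion = sample_proportion(population, is_a)

rng = random.Random(0)  # fixed seed so the run is repeatable
proportions = [
    sample_proportion(rng.sample(population, 100), is_a)  # sample without replacement
    for _ in range(1000)
]

average = sum(proportions) / len(proportions)
print(f"population proportion:     {true_proportion:.3f}")
print(f"smallest sample proportion: {min(proportions):.3f}")
print(f"largest sample proportion:  {max(proportions):.3f}")
print(f"average sample proportion:  {average:.3f}")
```

Individual sample proportions vary from sample to sample, but as a family they cluster around the population proportion, which is what lets each quantity say something about the other.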

The Probability Proportion

One way of interpreting the probability of an event is as the proportion of trials, in a long run of trials of an experiment, that result in the occurrence of the event. For example, if we toss a fair coin a very large number of times, most long runs of tosses yield a proportion of "heads" of approximately 0.50, or 50%. Conversely, if a large number of draws with replacement from a bowl containing blue, green, red and yellow balls yields approximately 25% blue draws, then we may be led to believe that approximately 1 in 4 balls in the bowl is blue.

This interpretation is also known as the "frequentist" or "long run" interpretation of probability.
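The long run interpretation can be illustrated by simulating coin tosses and recording the proportion of heads at several points along the run. This is only a sketch: the simulated coin, the seed, the checkpoints and the function name running_proportion_of_heads are assumptions made for illustration.

```python
import random


def running_proportion_of_heads(num_tosses, checkpoints, rng):
    """Toss a simulated fair coin and record the proportion of heads at each checkpoint."""
    heads = 0
    recorded = {}
    for toss in range(1, num_tosses + 1):
        if rng.random() < 0.5:  # treat this outcome as "heads"
            heads += 1
        if toss in checkpoints:
            recorded[toss] = heads / toss
    return recorded


rng = random.Random(42)  # fixed seed so the run is repeatable
checkpoints = {10, 100, 1_000, 10_000, 100_000}
results = running_proportion_of_heads(100_000, checkpoints, rng)

for tosses in sorted(results):
    print(f"{tosses:>7} tosses: proportion of heads = {results[tosses]:.4f}")
```

Short runs can stray noticeably from 0.50, while the proportion at the end of a long run is typically very close to it.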

The Proportion Random Variable and The Proportion Mean

The random variable for an attribute A is simple: code a trial as "1" if and only if that trial presents attribute A; otherwise, code the trial as "0". The proportion is then a special kind of mean, namely the mean of this simple random variable.

For example, suppose that a coin is tossed 50 times, and that we write X=0 if heads shows and X=1 if tails shows. Suppose that our 50 tosses show 28 heads and 22 tails. Then the proportion of tails in the sample of tosses is the ratio 22/50, or 0.44. Simply view the 50 as the sample size and the 22 as the sum of 22 1's and 28 0's, so that 22/50 is the mean of the 50 coded values.
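The coin example can be checked directly by building the list of coded values and taking its mean. The variable names below are illustrative, and the order of the values in the list is arbitrary, since only the counts matter.

```python
# Code each toss as X = 1 for tails and X = 0 for heads, as in the example:
# 22 tails and 28 heads in 50 tosses.
coded_tosses = [1] * 22 + [0] * 28

sample_size = len(coded_tosses)      # 50
sum_of_codes = sum(coded_tosses)     # 22, the number of tails

# The mean of the 0/1 codes is the proportion of tails.
proportion_of_tails = sum_of_codes / sample_size
print(sample_size, sum_of_codes, proportion_of_tails)
```

The sum of the codes counts the trials showing the attribute, so dividing by the sample size gives the sample proportion and the sample mean at the same time.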