This is the second of a three-part series on sampling. The third part will come more quickly than the second part did. 🙂
There are four types of sampling: simple random, stratified, cluster, and systematic. I will give a brief definition as well as an example for each.
A simple random sample is the most basic type. It is one in which every person of interest (together, the population) has an equal chance of being selected for the survey. We will talk more about it later.
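To make the definition concrete, here is a minimal sketch in Python. The list of voters and the function name simple_random_sample are made up for illustration, and the sketch assumes we actually have a complete list of the population, which is rarely true in practice.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # seed is only here to make the example repeatable
    return rng.sample(population, n)

# Hypothetical population: 1000 voters identified by made-up labels.
voters = [f"voter_{i}" for i in range(1, 1001)]
survey = simple_random_sample(voters, n=50, seed=42)
print(len(survey), len(set(survey)))  # 50 50
```

Note that random.sample draws without replacement, so nobody is surveyed twice.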
Stratified sampling is one in which the population is divided into groups, and the sample is drawn from each group in proportion to its relative size. For instance, if the sample is to come from individuals in either of two cities, and one city has 1000 people and the other city has 2000 people, then the sample would consist of 1/3 of its subjects (people) from the first city (since it has 1/3 of the total) and 2/3 from the second city.
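The two-city arithmetic above can be sketched in a few lines of Python. This is only an illustration: the resident IDs, the sample size of 300, and the function name stratified_sample are my own assumptions, and the sketch draws a simple random sample inside each city.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw from each group in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Quota for this group = overall sample size times the group's share.
        quota = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical populations: resident IDs for the two cities.
cities = {
    "city_a": list(range(1000)),        # 1000 people
    "city_b": list(range(1000, 3000)),  # 2000 people
}
picked = stratified_sample(cities, sample_size=300, seed=1)
print({name: len(ids) for name, ids in picked.items()})
# {'city_a': 100, 'city_b': 200}
```

With less convenient numbers, the rounded quotas may not add up exactly to the requested sample size, so a real survey would need a rule for handling the leftover.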
Cluster sampling is similar to stratified sampling in that the population is divided into groups (called clusters in this case), but for cluster sampling, one (or more) of the clusters is chosen to represent the other clusters.
So, for example, assume that voters across the country are to be surveyed, and that we would like to sample in proportion to each state’s population. Instead of going to every state, the pollsters may sample from just a handful of states if they believe that one state is representative of others. Perhaps they sample only from Oregon to represent the three states on the west coast.
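Here is a rough sketch of the cluster idea in Python. The voter lists and their sizes are invented, and the function name cluster_sample is hypothetical. One more assumption to flag: the sketch picks the cluster at random, whereas in the example above the pollsters chose Oregon because they judged it to be representative.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster, seed=None):
    """Pick a few clusters, then survey people only inside those clusters."""
    rng = random.Random(seed)
    # Assumption: clusters are chosen at random rather than by judgment.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], per_cluster) for name in chosen}

# Hypothetical voter lists for three west coast states (sizes are made up).
states = {
    "Washington": [f"WA-{i}" for i in range(500)],
    "Oregon": [f"OR-{i}" for i in range(300)],
    "California": [f"CA-{i}" for i in range(2000)],
}
result = cluster_sample(states, n_clusters=1, per_cluster=50, seed=7)
for state, people in result.items():
    print(state, len(people))
```

Only one state gets visited, which is the whole point: it is cheaper, but it leans on the belief that the chosen state looks like the others.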
Finally (and in no particular order), there is systematic sampling. Systematic sampling occurs when every kth member of the population is selected (where k is some natural number, such as 4). So, for example, assume that researchers want to sample hospital patients and are interested in patients in some 24-hour period, perhaps a Saturday. Assume also that they expect about 400 patients in a day and would like to sample about 100 of them. To do so, they sample every fourth person on the register sheet. The advantage of this is that by sampling people throughout the day, they are more apt to avoid peculiarities related to time of day.
For example, if they instead sampled the first 100 people who came to the hospital that Saturday morning, they might get a different type of patient. Perhaps the people going early in the day are more apt to be giving blood. Thus, if the sample is meant to ascertain the reasons people go in, they are likely to get a distorted picture.
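The hospital example can be sketched in Python to show how different the two approaches look. Everything here is made up for illustration: the register is an extreme toy case in which the first 100 patients of the day are all blood donors, the function names are hypothetical, and the random starting point is my own addition (you could just as well start with the first person on the sheet).

```python
import random

def systematic_sample(items, k, seed=None):
    """Take every kth item, starting from a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumption: random start, not required by the method
    return items[start::k]

def donor_share(sample):
    """Fraction of the sample whose reason for visiting is blood donation."""
    return sample.count("blood donation") / len(sample)

# Toy register of 400 patients in arrival order (made-up reasons for visiting).
register = ["blood donation"] * 100 + ["other"] * 300

first_100 = register[:100]
every_4th = systematic_sample(register, k=4, seed=0)

print(donor_share(register))                   # 0.25
print(donor_share(first_100))                  # 1.0
print(len(every_4th), donor_share(every_4th))  # 100 0.25
```

In this toy case, the first 100 patients suggest that everyone comes in to give blood, while every fourth patient reproduces the true share of 25%.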
Back to the simple random sample. In practice, it is usually impossible (for all intents and purposes) to give everyone an equal chance of being selected. Take the example of polling for the Presidential election. I am not quite sure of the pollsters' exact methods, but one thing I am sure of is that not every registered voter has an equal chance of being selected. If the polling is done by telephone, for instance, not everybody has a telephone (though in this day and age, just about everybody does), some people may not pick up their phone, and some, if asked, do not want to divulge who they are leaning toward.
Also, just about any sample that is not a simple random sample is going to include a simple random sample (loosely speaking, as we discussed in the paragraph above) within it. Take the example given above with stratified sampling: within each city, the subjects would ideally be chosen by a simple random sample. Perhaps it is not expedient to give everyone in a city an equal chance of being selected. You might not have their names, their phone numbers, etc. But you try to make it as random as is reasonably possible.
At the crux of all sampling is bias, and specifically the ability to avoid it. Bias is where the sample is a distorted representation of the population.
In part 3, we will discuss the most famous case of bias, as well as the polling in the Presidential election.
There are many aspects to discuss with that. Everybody seems to have an opinion on it.