Sampling (statistics)
Sampling is that part of statistical practice concerned with the selection of individual observations intended to yield some knowledge about a population of concern, especially for the purposes of statistical inference. Each observation measures one or more properties (weight, location, etc.) of an observable entity enumerated to distinguish objects or individuals. Survey weights often need to be applied to the data to adjust for the sample design. Results from probability theory and statistical theory are employed to guide practice.
The sampling process comprises several stages:
Defining the population of concern
Specifying a sampling frame, a set of items or events possible to measure
Specifying a sampling method for selecting items or events from the frame
Determining the sample size
Implementing the sampling plan
Sampling and data collecting
Reviewing the sampling process
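The frame, method, and sample-size stages above can be sketched in a few lines. This is only an illustration under stated assumptions: the population, the unit names, the measured values, and the sample size of 50 are all hypothetical, and simple random sampling without replacement is just one of the possible sampling methods.

```python
import random
import statistics

# Hypothetical population: 500 units, each with one measured property.
population = {f"unit-{i:03d}": 50 + (i % 7) for i in range(500)}

# Sampling frame: the list of units that can actually be selected.
frame = sorted(population)

# Sampling method and sample size (both assumptions for this sketch):
# simple random sampling without replacement, 50 units.
sample_size = 50
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample_ids = rng.sample(frame, sample_size)

# Data collection and a simple estimate of the population mean.
sample_mean = statistics.mean(population[u] for u in sample_ids)
true_mean = statistics.mean(population.values())
print(f"sample mean = {sample_mean:.2f}, population mean = {true_mean:.2f}")
```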
Process Sampling
The following is an excerpt from Chapter 11 of Pyzdek's Guide to SPC, Volume 2: Applications and Special Topics by Thomas Pyzdek, © 1992 by Quality Publishing. It may be ordered from the Quality Publishing Order Form.
Sampling to determine process control is more an art form than a science. The objective is to select subgroups such that the variation of measurements or counts within the subgroup will be produced by only common causes. The spread of the control limits will be based only on within-subgroup variation. Thus, any additional variation will produce subgroup statistics that fall beyond the control limits, signaling a special cause of variation.
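One common way to base control limits on within-subgroup variation alone is the X-bar chart, whose limits are the grand mean plus or minus A2 times the average subgroup range. The sketch below assumes subgroups of size 5 (for which the tabulated constant A2 is 0.577) and uses simulated measurements; the function name `xbar_limits` and the data are hypothetical and are not taken from the excerpt.

```python
import random
import statistics

A2_N5 = 0.577  # tabulated X-bar chart constant for subgroups of size 5


def xbar_limits(subgroups):
    """Return (lcl, center, ucl) computed from within-subgroup ranges only."""
    means = [statistics.mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    center = statistics.mean(means)   # grand mean of the subgroup means
    r_bar = statistics.mean(ranges)   # average within-subgroup range
    return center - A2_N5 * r_bar, center, center + A2_N5 * r_bar


# Hypothetical data: 25 subgroups of 5 measurements from one stable process.
rng = random.Random(0)
subgroups = [[rng.gauss(30, 5) for _ in range(5)] for _ in range(25)]

lcl, center, ucl = xbar_limits(subgroups)
print(f"LCL = {lcl:.2f}, center = {center:.2f}, UCL = {ucl:.2f}")
```

Because the limits depend only on the ranges inside each subgroup, a shift between subgroups does not widen them; it shows up instead as subgroup means falling outside the limits.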
I have always found it helpful to think about the process as a bowl of blue chips with numbers written on them. A controlled process is one where the same bowl of chips is sampled time after time. If the chips in the bowl have different numbers on them, there will be variation in the sample.
However, since the bowl doesn’t change, the variation will be relatively consistent from one sample to the next. After sampling the bowl numerous times we will become more and more comfortable setting up some limits on the variation we expect to see in future samples from the same bowl. The bowl represents a controlled process, a predictable process.
Now let’s say that there are two bowls, one with blue chips and one with green chips. Assume further that the numbers written on the blue chips are quite different from those written on the green chips. Furthermore, let’s say that you don’t get to see the chips themselves; all you know is the numbers you obtained. Sometimes the sample is taken from the blue chips and sometimes from the green chips. Could you tell the difference?
The answer depends a great deal on the way you formed your subgroups. If your subgroups were formed from a mixture of blue and green chips, then the process is neither blue nor green; the process is blue + green. The subgroup variation would include the variation within the blue chips, the variation within the green chips, and the difference between the two.
For example, if the blue chips varied from 10 to 50 and the green chips varied from 60 to 100, a mixed sample of both blue and green would vary from 10 to 100. Control limits based on the mixed sample would show a greater spread, and your estimate of the process capability would indicate a less capable process than either the blue or the green alone. In other words, you would probably conclude that the blue + green process was "in control and capable of holding a tolerance of 10 to 100."
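The effect of mixing can be checked with a small simulation. The sketch below assumes chip values are uniformly distributed over the stated ranges (10 to 50 for blue, 60 to 100 for green) and uses subgroups of four; the distribution, the subgroup size, and all names are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable


def draw_blue():
    return rng.uniform(10, 50)   # assumed spread of the blue chips


def draw_green():
    return rng.uniform(60, 100)  # assumed spread of the green chips


def average_range(subgroups):
    """Average within-subgroup range, the quantity that sets the limit width."""
    return statistics.mean(max(s) - min(s) for s in subgroups)


k = 200  # number of subgroups, four chips each
blue_only = [[draw_blue() for _ in range(4)] for _ in range(k)]
green_only = [[draw_green() for _ in range(4)] for _ in range(k)]
# Mixed subgroups: two blue and two green chips in every subgroup.
mixed = [[draw_blue(), draw_blue(), draw_green(), draw_green()] for _ in range(k)]

for name, groups in [("blue", blue_only), ("green", green_only), ("mixed", mixed)]:
    print(f"{name:>5}: average subgroup range = {average_range(groups):.1f}")
```

The mixed subgroups show a much larger average range than either bowl alone, so limits built from them would be wide enough to hide the difference between the two bowls.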
The objective of forming rational subgroups is to identify the underlying process so that departure from the underlying process can be quickly detected and corrected. The underlying process can be thought of as the performance that could be attained if all special causes of variation were eliminated and the process was operating at its best. To do this you must plan carefully to avoid mixing processes from different cause systems, which is comparable to mixing the blue chips and the green chips.
Here’s a more down-to-earth example. An o-ring is made in a mold with fifty cavities. It is known that there is a substantial difference between the cavities. It would be a mistake to form a subgroup using o-rings from cavities known to be different, because the cavity-to-cavity variation would mask the variation caused by other factors such as material, temperature, etc.
SPC methods useful for this type of data are presented in chapter 21. However, taking a longer-term perspective, you should try to modify the mold so that there is less variation between the different cavities. Eventually you would like to get the molding process so consistent that the o-rings are all alike regardless of which cavity in the mold produced them. That is the ultimate goal of SPC: to change the real world for the better, not to make a control chart look better.
Monday, May 26, 2008