**Observational Studies and Designed Experiments**

Besides classifying statistical studies as either descriptive or inferential, we often need to classify them as either observational studies or designed experiments. In an observational study, researchers simply observe characteristics and take measurements, as in a sample survey. In a designed experiment, researchers impose treatments and controls and then observe characteristics and take measurements. **Observational studies can reveal only association, whereas designed experiments can help establish causation**.

**Census, Sampling, and Experimentation**

If the information you need is not already available from a previous study, you might acquire it by conducting a census – that is, by obtaining information for the entire population of interest. However, conducting a census may be time consuming, costly, impractical, or even impossible.

Two methods other than a census for obtaining information are sampling and experimentation. If sampling is appropriate, you must decide how to select the sample; that is, you must choose the method for obtaining a sample from the population. Because the sample will be used to draw conclusions about the entire population, it should be a representative sample – that is, it should reflect as closely as possible the relevant characteristics of the population under consideration.

Three basic principles of experimental design are: **control**, **randomization**, and **replication**. In a designed experiment, the individuals or items on which the experiment is performed are called experimental units. When the experimental units are humans, the term subject is often used in place of experimental unit. Generally, each experimental condition is called a treatment, of which there may be several.

**Sampling**

Most modern sampling procedure involve the use of **probability sampling**. In probability sampling, a random device – such as tossing a coin, consulting a table of random numbers, or employing a random-number generator – is used to decide which members of the population will constitute the sample instead of leaving such decisions to human judgement. The use of probability sampling may still yield a non representative sample. However, probability sampling helps eliminate unintentional selection bias and permits the researcher to control the chance of obtaining a non representative sample. Furthermore, the use of probability sampling guarantees that the techniques of inferential statistics can be applied.

Simple Random Sampling

Simple random sampling is a sampling procedure for which each possible sample of a given size is equally likely to be the one obtained. There are two types of simple random sampling. One is **simple random sampling with replacement (SRSWR)**, whereby a member of the population can be selected more than once; the other is **simple random sampling without replacement (SRS)**, whereby a member of the population can be selected at most once.

Simple random sampling is the most natural and easily understood method of probability sampling – it corresponds to our intuitive notion of random selection by lot. However, simple random sampling does have drawbacks. For instance, it may fail to provide sufficient coverage when information about subpopulations is required and may be impractical when the members of the population are widely scattered geographically.

Systematic Random Sampling

One method that takes less effort to implement than simple random sampling is systematic random sampling.

Cluster Sampling

Another sampling method is cluster sampling, which is particular useful when the members of the population are widely scattered geographically.

Stratified Sampling

Another sampling method, known as stratified sampling, is often more reliable than cluster sampling. In stratified sampling, the population is first divided into subpopulations, called **strata**, and then sampling is done from each stratum. **Ideally, the members of each stratum should be homogenous relative to the characteristic under consideration**. In stratified sampling, the strata are often sampled in proportion to their size, which is called **proportional allocation**.

**Statistical Designs**

Once we have chosen the treatments, we must decide how the experimental units are to be assigned to the treatments (or vice versa). In a **completely randomized design**, all the experimental units are assigned randomly among all the treatments. In a **randomized block design**, experimental units are similar in ways that are expected to affect the response variable are grouped in **blocks**. Then the random assignment of experimental units to the treatment is made block by block. Or, the experimental units are assigned randomly among all the treatments separately within each block.

**One-Mean z-Interal Procedure**

**One-Mean t-Interval Procedure**

**One-Mean z-Test**

**One-Mean t-Test**

**Wilcoxon Signed-Rank Test**

Note: The following points may be relevant when performing a Wilcoxon signed-rank test:

- If an observation equals 𝜇0 (the value for the mean in the null hypothesis), that observation should be removed and the sample size reduced by 1.
- If two or more absolute differences are tied, each should be assigned the mean of the ranks they would have had if there were no ties.

**Pooled t-Test**

**Pooled t-Interval Procedure**

**Nonpooled t-Test**

**Nonpooled t-Interval Procedure**

**Mann-Whitney Test** (Wilcoxon rank-sum test, Mann-Whitney-Wilcoxon test)

Note: When there are ties in the sample data, ranks are assigned in the same way as in the Wilcoxon signed-rank test. Namely, if two or more observations are tied, each is assigned the mean of the ranks they would have had if there had been no ties.

**Paired t-Test**

**Paired t-Interval Procedure**

**Paired Wilcoxon Signed-Rank Test**

**One-Proportion z-Interval Procedure**

**One-Proportion z-Test**

**Two-Proportions z-Test**

**Meta-Analysis: Which Model Should We Use?**

Fix effect model

It makes sense to use the fixed-effect model if two conditions are met. First, we believe that all the studies included in the analysis are functionally identical. Second, our goal is to compute the common effect size for the identified population, and not to generalize to other populations. For example, suppose that a pharmaceutical company will use a thousand patients to compare a drug versus placebo. Because the staff can work with only 100 patients at a time, the company will run a series of ten trials with 100 patients in each. The studies are identical in the sense that any variable which can have an impact on the outcome are the same across the ten studies. Specifically, the studies draw patients from a common pool, using the same researchers, dose, measure, and so on.

Random effects

By contrast, when the researcher is accumulating data from a series of studies that had been performed by researchers operating independently, it would be unlikely that all the studies were functionally equivalent. Typically, the subjects or interventions in these studies would have differed in ways that would have impacted on the results, and therefore we should not assume a common effect size. Therefore, in these cases the random-effects model is more easily justified than the fixed-effect model. Additionally, the goal of this analysis is usually to generalize to a range of scenarios. Therefore, if one did make the argument that all the studies used an identical, narrowly defined population, then it would not be possible to extrapolate from this population to others’ nd the utility of the analysis would be severely limited.

Heterogeneity

To understand the problem, suppose for a moment that all studies in the analysis shared the same true effect size, so that the (true) heterogeneity is zero. Under this assumption, we would not expect the observed effect to be identical to each other. Rather, because of within-study error, we would expect each to fall within some range of the common effect. Now, assume that the true effect size does vary from one study to the next. In this case, the observed effects vary from one another for two reasons. One is the real heterogeneity in effect size, and the other is the within-study error. If we want to quantify the heterogeneity we need to partition the observed variation into these two components, and then focus on the former.

The mechanism that we use to extract the true between-studies variation from the observed variation is as follows:

- We compute the total amount of study-to-study variation actually observed.
- We estimate how much the observed effects would be expected to vary from each other if the true effect was actually the same in all studies.
- The excess variation (if any) is assumed to reflect real differences in effect size (that is, the heterogeneity)

**Clinical Trials**

**Randomization**

The function of randomization include:

- Randomization removes the potential of bias in the allocation of participants to the intervention group or to the control group. Such selection bias could easily occur, and cannot be necessarily prevented, in the non-randomziared concurrent or historical control study because the investigator or the participant may influence the choice of intervention. The direction of the allocation bias may go either way and can easily invalidate the comparison. This advantage of randomization assumes that the procedure is performed in a valid manner and that the assignment cannot be predicted.
- Some what related to the first, is that randomization tends to produce comparable groups; that is, measured as well as unknown or unmeasured prognostic factors and other characteristics of the participants at the time of randomization will be, on the average, evenly balanced between the intervention and control groups. This dose not mean that in any single experiment all such characteristics, sometimes called baseline variables or covariates, will be perfectly balanced between the two groups. However, it does mean that for independent covariates, whatever the detected or undetected differences that exist between the groups, the overall magnitude and direction of the differences will tend to be equally divided between the two groups. Of course, many covariates are strongly associated; thus, any imbalance in one would tend to produce imbalances in the others.
- The validity of statistical tests of significance is guaranteed. The process of randomization makes it possible to ascribe a probability distribution to the difference in outcome between treatment groups receiving equally effective treatments and thus to assign significance levels to observed differences. The validity of the statistical tests of significance is not dependent on the balance of the prognostic factors between the randomized groups. The chi-square test for two-by-two tables and Student’s t-test for comparing two means can be justified on the basis of randomization alone without making further assumptions concerning the distribution of baseline variables. If randomization is not used, further assumptions conceding the comparability of the groups and the appropriateness fo the statistical models must be made before the comparisons will be valid. Establishing the validity of these assumptions may be difficult.

In the simplest case, **randomization is a process by which each participant has the same chance of being assigned to either intervention or control**. An example would be the toss of a coin, in which heads indicates intervention group and tails indicates control group. Even in the more complex randomization strategies, the element of chance underlies the allocation process. Of course, neither trial participant nor investigator should know what the assignment will be before the participant’s decision to enter the study. Otherwise, the benefits of randomization can be lost.

**The Randomization Process**

Two forms of experimental bias are of concern. The first, **selection bias**, occurs if the allocation process is predictable. In this case, the decision to enter a participant into a trial may be influenced by the anticipated treatment assignment. If any bias exists as to what treatment particular types of participants should receive, then a selection bias might occur. A second bias, **accidental bias**, can arise if the randomization procedure does not achieve **balance** on risk factors or prognostic covariates. Some of the allocation procedures are more vulnerable to accidental bias, especially for small studies. For large studies, however, the chance of accidental bias is negligible.

Fixed Allocation Randomization

Fixed allocation procedures assign the interventions to participants with a respecified probability, usually equal (e.g., 50% for two arms, 33% for 3, or 25% for 4, etc.) and that allocation probability is not altered as the study progresses. Three methods of randomization belong to the fixed allocation, including: simple, blacked, and stratified randomization. The most elementary form of randomization is referred to as **simple or complete randomization**. One simple method is to toss an unbiased coin each time a participant is eligible to be randomized (for two treatment combinations). Using this procedure, approximately one half of the participants will be in group A and one half in group B. In practice, for small studies, instead of tossing a coin to generate a randomization schedule, a random digit table on which the equally likely digits 0 to 9 are arranged by tows and columns is usually used to accomplish simple randomization. For large studies, a more convenient method for producing a randomization schedule is to use a random number producing algorithm, available on most computer systems. Another simple randomization is to use a uniform random number algorithm to produce random numbers in the interval from 0.0 to 1.0. Using a uniform random number generator, a random number can be produced for each participant. If the random number is between 0 and p, the participant would be assigned to group A; otherwise to group B. For equal allocation, the probability cut point, p, is one-half (i.e., p = 0.50). If equal allocation between A and B is not desired, then p can be set to the desired proportion in the algorithm and the study will have, on the average, a proportion p of the participants in group A. In addition, this strategy could be adapted easily to more than two groups.

**Blocked randomization**, sometimes called permuted block randomization, avoids serious imbalance in the number of participants assigned to each group, an imbalance which could occur in the simple randomization procedure. More importantly, blocked randomization guarantees that at no time during randomization will the imbalance be large and that at certain points the number of participants in each group will be equal. This protects against temporal trends during enrollment, which is often a concern for larger trials with long enrollment phases. If participants are randomly assigned with equal probability to groups A or B, then for each block of even size (for example, 4, 6, or 8) one half of the participants will be assigned to A and the other half to B. The order in which the interventions are assigned in each block is randomized, and this process is repeated for consecutive blocks of participants until all participants are randomized.

## Post a Comment