Perspectives on Hematology, Health Care, and The Profession of Pharmacy.

Factorial Designs

March 5, 2018 | Clinical Trials, Medical Statistics, Research

In this section we will describe the completely randomized factorial design. This design is commonly used when there are two or more factors of interest. Recall, in particular, the difference between an observational study and a designed experiment. Observational studies involve simply observing characteristics and taking measurements, as in a sample survey. A designed experiment involves imposing treatments on experimental units, controlling extraneous sources of variation that might affect the experiment, and then observing characteristics and taking measurements on the experimental units.

Also recall that in an experiment, the response variable is the characteristic of the experimental outcome that is measured or observed. A factor is a variable whose effect on the response variable is of interest to the experimenter. Generally, a factor is a categorical variable whose possible values are referred to as the levels of the factor. In a single-factor experiment, we will assign experimental units to the treatments (or vice versa). Experimental units should be assigned to the treatments in such a way as to eliminate any bias that might be associated with the assignment. This is generally accomplished by randomly assigning the experimental units to the treatments.

In certain medical experiments, called clinical trials, randomization is essential. To compare two or more methods of treating illness, it is important to eliminate any bias that could be introduced by medical personnel assigning patients to the treatments in a nonrandom fashion. For example, a doctor might erroneously assign patients who exhibit less severe symptoms of the illness to a less risky treatment.

PS: Advantages of randomized design over other methods for selecting controls

  • First, randomization removes the potential of bias in the allocation of participants to the intervention group or to the control group. Such selection bias could easily occur, and cannot necessarily be prevented, in the non-randomized concurrent or historical control study because the investigator or the participant may influence the choice of intervention. This influence can be conscious or subconscious and can be due to numerous factors, including the prognosis of the participant. The direction of the allocation bias may go either way and can easily invalidate the comparison. This advantage of randomization assumes that the procedure is performed in a valid manner and that the assignment cannot be predicted.
  • Second, somewhat related to the first, is that randomization tends to produce comparable groups; that is, measured as well as unknown or unmeasured prognostic factors and other characteristics of the participants at the time of randomization will be, on the average, evenly balanced between the intervention and control groups. This does not mean that in any single experiment all such characteristics, sometimes called baseline variables or covariates, will be perfectly balanced between the two groups. However, it does mean that for independent covariates, whatever the detected or undetected differences that exist between the groups, the overall magnitude and direction of the differences will tend to be equally divided between the two groups. Of course, many covariates are strongly associated; thus, any imbalance in one would tend to produce imbalances in the others.
  • Third, the validity of statistical tests of significance is guaranteed. As has been stated, “although groups compared are never perfectly balanced for important covariates in any single experiment, the process of randomization makes it possible to ascribe a probability distribution to the difference in outcome between treatment groups receiving equally effective treatments and thus to assign significance levels to observed differences.” The validity of the statistical tests of significance is not dependent on the balance of prognostic factors between the randomized groups.

Often in clinical trials, double blind studies are used. In this type of study, patients (the experimental units) are randomly assigned to treatments, and neither the doctor nor the patient knows which treatment has been assigned to the patient. This is an effective way to eliminate bias in treatment assignment so that the treatment effects are not confounded (associated) with other nonexperimental and uncontrolled factors.

Factorial designs involve two or more factors. Consider, for example, an experiment in which researchers studied the effects of two factors (hydrophilic polymer and irrigation regimen) on weight gain (the response variable) of Golden Torch cacti. The two levels of the polymer factor were: used and not used. The irrigation regimen had five levels to indicate the amount of water usage: none, light, medium, heavy, and very heavy. This is an example of a two-factor or two-way factorial design.

In this experiment every level of polymer occurred with every level of irrigation regimen, for a total of 2 * 5 = 10 treatments. Often these 10 treatments are called treatment combinations to indicate that we combine the levels of the various factors together to obtain the actual collection of treatments. Since, in this case, every level of one factor is combined with every level of the other factor, we say that the levels of one factor are crossed with the levels of the other factor. When all the possible treatment combinations obtained by crossing the levels of the factors are included in the experiment, we call the design a complete factorial design, or simply a factorial design.

It is possible to extend the two-way factorial design to include more factors. For example, in the Golden Torch cacti experiment, the amount of sunlight the cacti receive could have an effect on weight gain. If the amount of sunlight is controlled in the two-way study so that all plants receive the same amount of sunlight, then the amount of sunlight would not be considered a factor in the experiment.

However, since the amount of sunlight a cactus receives might have an effect on its growth, the experimenter might want to introduce this additional factor. Suppose we consider three levels of sunlight: high, medium, and low. The levels of sunlight could be achieved by placing screens of various mesh sizes over the cacti. If amount of sunlight is added as a third factor, there would be 2 * 5 * 3 = 30 different treatment combinations in a complete factorial design.

Possibly we could add even more factors to the experiment to take into account other factors that might affect weight gain of the cacti. Adding more factors will increase the number of treatment combinations for the experiment (unless the added factor has only one level). In general, the total number of treatment combinations for a complete factorial design is the product of the numbers of levels of all factors in the experiment.
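To make the counting concrete, here is a minimal Python sketch that crosses the factor levels of the cactus example and counts the treatment combinations. The factor names and levels follow the text; the code itself is illustrative, not from the original study.

```python
from itertools import product

# Illustrative factors and levels from the Golden Torch cacti example.
factors = {
    "polymer": ["used", "not used"],
    "irrigation": ["none", "light", "medium", "heavy", "very heavy"],
    "sunlight": ["high", "medium", "low"],
}

# A complete factorial design crosses every level of every factor.
treatment_combinations = list(product(*factors.values()))

# The count equals the product of the numbers of levels: 2 * 5 * 3 = 30.
print(len(treatment_combinations))   # 30
print(treatment_combinations[0])     # ('used', 'none', 'high')
```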

Obviously, as the number of factors increases, the number of treatment combinations increases. A large number of factors can result in so many treatment combinations that the experiment is unwieldy, too costly, or too time consuming to carry out. Most complete factorial designs involve only two or three factors.

To handle many factors, statisticians have devised experimental designs that use only a fraction of the total number of possible treatment combinations. These designs are called fractional factorial designs and are usually restricted to the case of all factors having two or three levels each. Fractional factorial designs cannot provide as much information as a complete factorial design, but they are very useful when a large number of factors is involved and the number of experimental units is limited by availability, cost, time, or other considerations. Fractional factorial designs are beyond the scope of this thread.

Once the treatment combinations are determined, the experimental units need to be assigned to the treatment combinations. In a completely randomized design, the experimental units are randomly assigned to the treatment combinations. If this random assignment is not done or is not possible, the treatment effects might become confounded with other uncontrolled factors that would make it difficult or impossible to determine whether an effect is due to the treatment or due to the confounding with uncontrolled factors.

Besides the random assignment of experimental units to treatment combinations, it is important that we use randomization in other ways when conducting an experiment. Often experiments are conducted in sequence. One treatment combination is applied to an experimental unit, and then the next treatment combination is applied to the next experimental unit, and so forth. It is essential that the order in which the experiments are conducted be randomized.

For example, consider an experiment in which measurements are made that are sensitive to heat or humidity. If all experiments associated with the first level of a factor are conducted on a hot and humid day, all experiments associated with the second level of the factor are conducted on a cooler, less humid day, and so on, then the factor effect is confounded with the heat/humidity conditions on the days that the experiments are conducted. If the analysis indicates an effect due to the factor, we do not know whether there is actually a factor effect or a heat/humidity effect (or both). Randomization of the order in which the experiments are conducted would help keep the heat/humidity effect from being confounded with any factor effect.
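A minimal sketch of how both randomizations might be carried out in Python, using the two-way cactus design with a hypothetical three cacti per treatment combination (the unit names and fixed seed are illustrative assumptions):

```python
import random
from itertools import product

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# The 10 treatment combinations of the two-way cactus example,
# with a hypothetical 3 cacti per combination (30 units total).
treatments = list(product(["used", "not used"],
                          ["none", "light", "medium", "heavy", "very heavy"]))
units = [f"cactus_{i}" for i in range(30)]

# Randomly assign experimental units to treatment combinations.
rng.shuffle(units)
assignment = dict(zip(units, treatments * 3))

# Independently randomize the run order, so that conditions such as heat
# and humidity on a given day are not confounded with any factor level.
run_order = list(assignment.items())
rng.shuffle(run_order)
for unit, treatment in run_order[:3]:
    print(unit, treatment)
```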

Experimental and Classification Factors

In the description of designing experiments for factorial designs, we emphasized the idea of being able to assign experimental units to treatment combinations. If the experimental units are assigned randomly to the levels of a factor, the factor is called an experimental factor. If all the factors of a factorial design are experimental factors, we consider the study a designed experiment.

In some factorial studies, however, the experimental units cannot be assigned at random to the levels of a factor, as in the case when the levels of the factor are characteristics associated with the experimental units. A factor whose levels are characteristics of the experimental unit is called a classification factor. If all the factors of a factorial design are classification factors, we consider the study an observational study.

Consider, for instance, a household energy consumption study in which the response variable is household energy consumption and the factor of interest is the region of the United States in which a household is located. A household cannot be randomly assigned to a region of the country. The region of the country is a characteristic of the household and, thus, a classification factor. If we were to add home type as a second factor, the levels of this factor would also be a characteristic of a household, and, hence, home type would also be a classification factor. This two-way factorial design would be considered an observational study, since both of its factors are classification factors.

There are many studies that involve a mixture of experimental and classification factors. For example, in studying the effect of four different medications on relieving headache pain, the age of an individual might play a role in how long it takes before headache pain dissipates. Suppose a researcher decides to consider four age groups: 21 to 35 years old, 36 to 50 years old, 51 to 65 years old, and 66 years and older. Obviously, since age is a characteristic of an individual, age group is a classification factor.

Suppose that the researcher randomly selects 40 individuals from each age group and then randomly assigns 10 individuals in each age group to one of the four medications. Since each person is assigned at random to a medication, the medication factor is an experimental factor. Although one of the factors here is a classification factor and the other is an experimental factor, we would consider this a designed experiment.

Fixed and Random Effect Factors

There is another important way to classify factors that depends on the way the levels of a factor are selected. If the levels of a factor are the only levels of interest to the researcher, then the factor is called a fixed effect factor. For example, in the Golden Torch cacti experiment, both factors (polymer and irrigation regimen) are fixed effect factors because the levels of each factor are the only levels of interest to the experimenter.

If the levels of a factor are selected at random from a collection of possible levels, and if the researcher wants to make inferences to the entire collection of possible levels, the factor is called a random effect factor. For example, consider a study to be done on the effect of different types of advertising on sales of a new sandwich at a national fast-food chain. The marketing group conducting the study feels that the city in which a franchise store is located might have an effect on sales. So they decide to include a city factor in the study, and randomly select eight cities from the collection of cities in which the company’s stores are located. They are not interested in these eight cities alone, but want to make inferences to the entire collection of cities. In this case the city factor is a random effect factor.

Analysis of Variance

March 4, 2018 | Medical Statistics

Analysis-of-variance procedures rely on a distribution called the F-distribution, named in honor of Sir Ronald Fisher. A variable is said to have an F-distribution if its distribution has the shape of a special type of right-skewed curve, called an F-curve. There are infinitely many F-distributions, and we identify an F-distribution (and its F-curve) by its number of degrees of freedom, just as we did for t-distributions and chi-square distributions.

An F-distribution, however, has two numbers of degrees of freedom instead of one. Figure 16.1 depicts two different F-curves; one has df = (10, 2), and the other has df = (9, 50). The first number of degrees of freedom for an F-curve is called the degrees of freedom for the numerator, and the second is called the degrees of freedom for the denominator.

Basic properties of F-curves:

  • The total area under an F-curve equals 1.
  • An F-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis as it does so.
  • An F-curve is right skewed.

One-Way ANOVA: The Logic

In older threads, you learned how to compare two population means, that is, the means of a single variable for two different populations. You studied various methods for making such comparisons, one being the pooled t-procedure.

Analysis of variance (ANOVA) provides methods for comparing several population means, that is, the means of a single variable for several populations. In this section we present the simplest kind of ANOVA, one-way analysis of variance. This type of ANOVA is called one-way analysis of variance because it compares the means of a variable for populations that result from a classification by one other variable, called the factor. The possible values of the factor are referred to as the levels of the factor.

For example, suppose that you want to compare the mean energy consumption by households among the four regions of the United States. The variable under consideration is “energy consumption,” and there are four populations: households in the Northeast, Midwest, South, and West. The four populations result from classifying households in the United States by the factor “region,” whose levels are Northeast, Midwest, South, and West.

One-way analysis of variance is the generalization to more than two populations of the pooled t-procedure (i.e., both procedures give the same results when applied to two populations). As in the pooled t-procedure, we make the following assumptions:

  • Assumption 1: simple random samples.
  • Assumption 2: independent samples.
  • Assumption 3: normal populations (for each population, the variable under consideration is normally distributed).
  • Assumption 4: equal standard deviations (of the variable under consideration across the populations).

Regarding Assumptions 1 and 2, we note that one-way ANOVA can also be used as a method for comparing several means with a designed experiment. In addition, like the pooled t-procedure, one-way ANOVA is robust to moderate violations of Assumption 3 (normal populations) and is also robust to moderate violations of Assumption 4 (equal standard deviations) provided the sample sizes are roughly equal.

How can the conditions of normal populations and equal standard deviations be checked? Normal probability plots of the sample data are effective in detecting gross violations of normality. Checking equal population standard deviations, however, can be difficult, especially when the sample sizes are small; as a rule of thumb, you can consider that condition met if the ratio of the largest to the smallest sample standard deviation is less than 2. We call that rule of thumb the rule of 2.

Another way to assess the normality and equal-standard-deviations assumptions is to perform a residual analysis. In ANOVA, the residual of an observation is the difference between the observation and the mean of the sample containing it. If the normality and equal-standard-deviations assumptions are met, a normal probability plot of (all) the residuals should be roughly linear. Moreover, a plot of the residuals against the sample means should fall roughly in a horizontal band centered and symmetric about the horizontal axis.
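As a rough illustration, the two checks just described (the rule of 2 and a residual analysis) might be coded as follows. The region names and simulated data are purely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical energy-consumption samples for four regions (simulated data).
samples = {region: rng.normal(loc=100, scale=15, size=12)
           for region in ["Northeast", "Midwest", "South", "West"]}

# Rule of 2: the ratio of the largest to the smallest sample standard
# deviation should be less than 2.
sds = {g: s.std(ddof=1) for g, s in samples.items()}
ratio = max(sds.values()) / min(sds.values())
print(f"SD ratio = {ratio:.2f} -> {'met' if ratio < 2 else 'questionable'}")

# Residuals: each observation minus the mean of the sample containing it.
residuals = np.concatenate([s - s.mean() for s in samples.values()])
# A normal probability plot of `residuals` should be roughly linear,
# e.g., scipy.stats.probplot(residuals) if SciPy is available.
```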

The Logic Behind One-Way ANOVA

The reason for the word variance in analysis of variance is that the procedure for comparing the means analyzes the variation in the sample data. To examine how this procedure works, let’s suppose that independent random samples are taken from two populations – say, Populations 1 and 2 – with means 𝜇1 and 𝜇2. Further, let’s suppose that the means of the two samples are xbar1 = 20 and xbar2 = 25. Can we reasonably conclude from these statistics that 𝜇1 ≠ 𝜇2, that is, that the population means are (significantly) different? To answer this question, we must consider the variation within the samples.

The basic idea for performing a one-way analysis of variance to compare the means of several populations:

  • Take independent simple random samples from the populations.
  • Compute the sample means.
  • If the variation among the sample means is large relative to the variation within the samples, conclude that the means of the populations are not all equal (significantly different).

To make this process precise, we need quantitative measures of the variation among the sample means and the variation within the samples. We also need an objective method for deciding whether the variation among the sample means is large relative to the variation within the samples.

Mean Squares and F-Statistic in One-Way ANOVA

As before, when dealing with several populations, we use subscripts on parameters and statistics. Thus, for Population j, we use 𝜇j, xbarj, sj, and nj to denote the population mean, sample mean, sample standard deviation, and sample size, respectively.

We first consider the measure of variation among the sample means. In hypothesis tests for two population means, we measure the variation between the two sample means by calculating their difference, xbar1 - xbar2. When more than two populations are involved, we cannot measure the variation among the sample means simply by taking a difference. However, we can measure that variation by computing the standard deviation or variance of the sample means or by computing any descriptive statistic that measures variation.

In one-way ANOVA, we measure the variation among the sample means by a weighted average of their squared deviations about the mean, xbar, of all the sample data. That measure of variation is called the treatment mean square, MSTR, and is defined as

MSTR = SSTR / (k - 1)

where k denotes the number of populations being sampled and

SSTR = n1(xbar1 - xbar)^2 + n2(xbar2 - xbar)^2 + … + nk(xbark - xbar)^2

The quantity SSTR is called the treatment sum of squares.

We note that MSTR is similar to the sample variance of the sample means. In fact, if all the sample sizes are identical, then MSTR equals that common sample size times the sample variance of the sample means.

Next we consider the measure of variation within the samples. This measure is the pooled estimate of the common population variance, 𝜎^2. It is called the error mean square, MSE, and is defined as

MSE = SSE / (n – k)

where n denotes the total number of observations and 

SSE = (n1 - 1)S1^2 + (n2 - 1)S2^2 + … + (nk - 1)Sk^2

The quantity SSE is called the error sum of squares. Finally, we consider how to compare the variation among the sample means, MSTR, to the variation within the samples, MSE. To do so, we use the statistic F = MSTR/MSE, which we refer to as the F-statistic. Large values of F indicate that the variation among the sample means is large relative to the variation within the samples and hence that the null hypothesis of equal population means should be rejected.
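Putting the pieces together, here is a minimal sketch of a one-way ANOVA computed directly from these formulas. The data are simulated, and scipy.stats.f_oneway is mentioned only as a cross-check:

```python
import numpy as np
from scipy.stats import f as f_dist

def one_way_anova(samples):
    """F-test of equal means for k groups; samples is a list of 1-D arrays."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = np.concatenate(samples).mean()

    # Treatment sum of squares / mean square: variation among sample means.
    sstr = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
    mstr = sstr / (k - 1)

    # Error sum of squares / mean square: pooled variation within samples.
    sse = sum((len(s) - 1) * s.var(ddof=1) for s in samples)
    mse = sse / (n - k)

    F = mstr / mse
    p = f_dist.sf(F, k - 1, n - k)   # upper-tail area under the F-curve
    return F, p

rng = np.random.default_rng(1)
groups = [rng.normal(mu, 10.0, size=15) for mu in (100, 100, 110)]
print(one_way_anova(groups))   # matches scipy.stats.f_oneway(*groups)
```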

In summary,

F = MSTR / MSE = [SSTR / (k - 1)] / [SSE / (n - k)]

If the null hypothesis of equal population means is true, this F-statistic has an F-distribution with df = (k - 1, n - k).

The Logic Behind Meta-analysis – Random-effects Model

December 25, 2017 | Clinical Research, Clinical Trials, Evidence-Based Medicine, Medical Statistics, Research

The fixed model starts with the assumption that the true effect size is the same in all studies. However, in many systematic reviews this assumption is implausible. When we decide to incorporate a group of studies in a meta-analysis, we assume that the studies have enough in common that it makes sense to synthesize the information, but there is generally no reason to assume that they are identical in the sense that the true effect size is exactly the same in all the studies. For example, suppose that we are working with studies that compare the proportion of patients developing a disease in two groups (vaccinated versus placebo). If the treatment works we would expect the effect size (say, the risk ratio) to be similar but not identical across studies. The effect size might be higher (or lower) when the participants are older, or more educated, or healthier than others, or when a more intensive variant of an intervention is used, and so on. Because studies will differ in the mixes of participants and in the implementations of interventions, among other reasons, there may be different effect sizes underlying different studies.

Or suppose that we are working with studies that assess the impact of an educational intervention. The magnitude of the impact might vary depending on the other resources available to the children, the class size, the age, and other factors, which are likely to vary from study to study. We might not have assessed these covariates in each study. Indeed, we might not even know what covariates actually are related to the size of the effect. Nevertheless, logic dictates that such factors do exist and will lead to variations in the magnitude of the effect.

One way to address this variation across studies is to perform a random-effects meta-analysis. In a random-effects meta-analysis we usually assume that the true effects are normally distributed. For example, in Figure 12.1 the mean of all true effect sizes is 0.60 but the individual effect sizes are distributed about this mean, as indicated by the normal curve. The width of the curve suggests that most of the true effects fall in the range of 0.50 to 0.70.

Suppose that our meta-analysis includes three studies drawn from the distribution of studies depicted by the normal curve, and that the true effects in these studies happen to be 0.50, 0.55, and 0.65. If each study had an infinite sample size the sampling error would be zero and the observed effect for each study would be the same as the true effect for that study. If we were to plot the observed effects rather than the true effects, the observed effects would exactly coincide with the true effects.

Of course, the sample size in any study is not infinite and therefore the sampling error is not zero. If the true effect size for a study is 𝜗i, then the observed effect for that study will be less than or greater than 𝜗i, because of sampling error. This figure also highlights the fact that the distance between the overall mean and the observed effect in any given study consists of two distinct parts: true variation in effect sizes (𝜁i) and sampling error (𝜀i). More generally, the observed effect Yi for any study is given by the grand mean, the deviation of the study’s true effect from the grand mean, and the deviation of the study’s observed effect from the study’s true effect. That is,

Yi = 𝜇 + 𝜁i + 𝜀i

Therefore, to predict how far the observed effect Yi is likely to fall from 𝜇 in any given study we need to consider both the variance of 𝜁i and the variance of 𝜀i. The distance from 𝜇 to each 𝜗i depends on the standard deviation of the distribution of the true effects across studies, called 𝜏 (or 𝜏2 for its variance). The same value of 𝜏2 applies to all studies in the meta-analysis, and in Figure 12.4 is represented by the normal curve at the bottom, which extends roughly from 0.50 to 0.70. The distance from 𝜗i to Yi depends on the sampling distribution of the sample effects about 𝜗i. This depends on the variance of the observed effect size from each study, VYi, and so will vary from one study to the next. In Figure 12.4 the curve for Study 1 is relatively wide while the curve for Study 2 is relatively narrow.

Performing A Random-Effects Meta-Analysis

In an actual meta-analysis, of course, rather than starting with the population effect and making projections about the observed effects, we start with the observed effects and try to estimate the population effect. In other words, our goal is to use the collection of Yi to estimate the overall mean, 𝜇. In order to obtain the most precise estimate of the overall mean (to minimize the variance) we compute a weighted mean, where the weight assigned to each study is the inverse of that study’s variance. To compute a study’s variance under the random-effects model, we need to know both the within-study variance and 𝜏2, since the study’s total variance is the sum of these two values.

The parameter 𝜏2 (tau-squared) is the between-studies variance (the variance of the effect size parameters across the population of studies). In other words, if we somehow knew the true effect size for each study, and computed the variance of these effect sizes (across an infinite number of studies), this variance would be 𝜏2. One method for estimating 𝜏2 is the method of moments (or the DerSimonian and Laird) method, as follows.

T^2 = (Q - df) / C

where

Q = Σ(Wi Yi^2) - (Σ(Wi Yi))^2 / Σ(Wi),   df = k - 1

where k is the number of studies, and

C = Σ(Wi) - Σ(Wi^2) / Σ(Wi)

In the fixed-effect analysis each study was weighted by the inverse of its variance. In the random-effects analysis, too, each study is weighted by the inverse of its variance. The difference is that the variance now includes the original (within-studies) variance plus the estimate of the between-studies variance, T^2. To highlight the parallel between the formulas here (random effects) and those in the previous threads (fixed effect) we use the same notation but add an asterisk (*) to represent the random-effects version. Under the random-effects model the weight assigned to each study is

Wi* = 1 / VYi*

where VYi* is the within-study variance for study i plus the between-studies variance, T^2. That is,

VYi* = VYi + T^2

The weighted mean, M*, is then computed as

M* = Σ(Wi* Yi) / Σ(Wi*)

that is, the sum of the products (effect size multiplied by weight) divided by the sum of the weights.

The variance of the summary effect is estimated as the reciprocal of the sum of the weights, or

VM* = 1 / Σ(Wi*)

and the estimated standard error of the summary effect is then the square root of the variance,

SEM* = √(VM*)
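A compact sketch of the whole random-effects procedure under the method of moments described above (the function name is ours, and the effect sizes and variances in the example call are invented, echoing the three-study illustration):

```python
import numpy as np

def dersimonian_laird(y, v):
    """Random-effects summary from effect sizes y and within-study variances v."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                                   # fixed-effect weights
    k = len(y)

    # Method-of-moments (DerSimonian-Laird) estimate of tau^2.
    q = np.sum(w * y**2) - np.sum(w * y) ** 2 / np.sum(w)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)            # truncated at zero

    # Random-effects weights: total variance = within-study variance + tau^2.
    w_star = 1.0 / (v + tau2)
    m_star = np.sum(w_star * y) / np.sum(w_star)  # weighted mean M*
    se_m = np.sqrt(1.0 / np.sum(w_star))          # standard error of M*
    return m_star, se_m, tau2

# Invented effect sizes and within-study variances:
print(dersimonian_laird([0.50, 0.55, 0.65], [0.02, 0.01, 0.015]))
```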

Summary

  • Under the random-effects model, the true effects in the studies are assumed to have been sampled from a distribution of true effects.
  • The summary effect is our estimate of the mean of all relevant true effects, and the null hypothesis is that the mean of these effects is 0.0 (equivalent to a ratio of 1.0 for ratio measures).
  • Since our goal is to estimate the mean of the distribution, we need to take account of two sources of variance. First, there is within-study error in estimating the effect in each study. Second (even if we knew the true mean for each of our studies), there is variation in the true effects across studies. Study weights are assigned with the goal of minimizing both sources of variance.

The Logic Behind Meta-analysis – Fixed-Effect Model

December 19, 2017 | Clinical Research, Clinical Trials, Evidence-Based Medicine, Medical Statistics, Research


Effect Size (Based on Means)

When the studies report means and standard deviations (more precisely, the sample standard error of the mean), the preferred effect size is usually the raw mean difference, the standardized mean difference, or the response ratio. When the outcome is reported on a meaningful scale and all studies in the analysis use the same scale, the meta-analysis can be performed directly on the raw data.

Consider a study that reports means for two groups (Treated and Control), and suppose we wish to compare the means of these two groups. The population mean difference (effect size) is defined as

Population mean difference = 𝜇1 – 𝜇2

Population standard error of the mean difference (pooled) = Spooled × √(1/n1 + 1/n2)

Overview

Most meta-analyses are based on one of two statistical models, the fixed-effect model or the random-effects model. Under the fixed-effect model we assume that there is one true effect size (hence the term fixed effect) which underlies all the studies in the analysis, and that all differences in observed effects are due to sampling error. While we follow the practice of calling this a fixed-effect model, a more descriptive term would be a common-effect model.

By contrast, under the random-effects model we allow that the true effect could vary from study to study. For example, the effect size might be higher (or lower) in studies where the participants are older, or more educated, or healthier than in others, or when a more intensive variant of an intervention is used, and so on. Because studies will differ in the mixes of participants and in the implementations of interventions, among other reasons, there may be different effect sizes underlying different studies.

Under the fixed-effect model, since all studies share the same true effect, it follows that the observed effect size varies from one study to the next only because of the random error inherent in each study. If each study had an infinite sample size the sampling error would be zero and the observed effect for each study would be the same as the true effect. If we were to plot the observed effects rather than the true effects, the observed effects would exactly coincide with the true effects.

In practice, of course, the sample size in each study is not infinite, and so there is sampling error and the effect observed in the study is not the same as the true effect. In Figure 11.2 the true effect for each study is still 0.60 but the observed effect differs from one study to the next.

While the error in any given study is random, we can estimate the sampling distribution of the errors. In Figure 11.3 we have placed a normal curve about the true effect size for each study, with the width of the curve being based on the variance in that study. In Study 1 the sample size was small, the variance large, and the observed effect is likely to fall anywhere in the relatively wide range of 0.20 to 1.00. By contrast, in Study 2 the sample size was relatively large, the variance is small, and the observed effect is likely to fall in the relatively narrow range of 0.40 to 0.80. Note that the width of the normal curve is based on the square root of the variance, or standard error.

Meta-analysis Procedure

In an actual meta-analysis, of course, rather than starting with the population effect and making projections about the observed effects, we work backwards, starting with the observed effects and trying to estimate the population effect. In order to obtain the most precise estimate of the population effect (to minimize the variance) we compute a weighted mean, where the weight assigned to each study is the inverse of that study’s variance. Concretely, the weight assigned to each study in a fixed-effect meta-analysis is

Wi = 1 / VYi

where VYi is the within-study variance for study i. The weighted mean, M, is then computed as

M = Σ(Wi Yi) / Σ(Wi)

That is, the sum of the products WiYi (effect size multiplied by weight) divided by the sum of the weights.

The variance of the summary effect is estimated as the reciprocal of the sum of the weights, or

VM = 1 / Σ(Wi)

Once VM is estimated, the standard error of the weighted mean is computed as the square root of the variance of the summary effect. Now we know the distribution, the point estimate, and the standard error of the weighted mean; thus, the confidence interval of the summary effect can be computed by the z-interval procedure.
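A minimal sketch of the fixed-effect computation just described, including the z-based confidence interval (the function name is ours, and the example effect sizes and variances are invented):

```python
import numpy as np
from scipy.stats import norm

def fixed_effect_summary(y, v, alpha=0.05):
    """Fixed-effect summary effect with a z-based confidence interval."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                        # inverse-variance weights
    m = np.sum(w * y) / np.sum(w)      # weighted mean (summary effect)
    se = np.sqrt(1.0 / np.sum(w))      # standard error of the summary effect
    z = norm.ppf(1 - alpha / 2)
    return m, (m - z * se, m + z * se)

# Invented effect sizes (e.g., raw mean differences) and their variances:
print(fixed_effect_summary([4.0, 5.5, 3.2], [1.2, 0.8, 2.0]))
```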

Effect Sizes Measurements

Raw Mean Difference

When the studies report means and standard deviations (continuous variables), the preferred effect size is usually the raw mean difference, the standardized mean difference (SMD), or the response ratio. When the outcome is reported on a meaningful scale and all studies in the analysis use the same scale, the meta-analysis can be performed directly on the raw difference in means, or the raw mean difference. The primary advantage of the raw mean difference is that it is intuitively meaningful, either inherently or because of widespread use. Examples of raw mean difference include systolic blood pressure (mm Hg), serum LDL-C level (mg/dL), body surface area (m2), and so on.

We can estimate the mean difference D from a study that used two independent groups, drawing on the inference procedure for two population means (independent samples). Recall that the sampling distribution of the difference between two sample means has these characteristics:

  • Its mean equals 𝜇1 - 𝜇2.
  • Its standard deviation equals √(𝜎1^2/n1 + 𝜎2^2/n2).
  • It is approximately normally distributed for large samples.

PS: All is based on the central limit theorem – if the sample size is large, the mean is approximately normally distributed, regardless of the distribution of the variable under consideration.

Once we know the sample mean difference, D, and the standard error of the mean difference, then in light of the central limit theorem we can compute the variance of D. In addition, knowing the group means, the group standard deviations, and the group sizes, we can compute the pooled sample standard deviation (Sp) or use the nonpooled method. Therefore, we have the value of the variance of D, which will be used by the meta-analysis procedures (fixed-effect or random-effects model) to compute the weight (Wi = 1 / VYi). And once the standard error is known, the synthesized confidence interval can be computed.
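For instance, a small sketch of computing D and its variance from two independent groups, assuming a common population standard deviation (the function name is ours):

```python
import numpy as np

def raw_mean_difference(x1, x2):
    """D and Var(D) for two independent groups, using the pooled SD."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)
    d = x1.mean() - x2.mean()

    # Pooled sample variance (assumes a common population SD).
    sp2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)

    var_d = sp2 * (1.0 / n1 + 1.0 / n2)   # Var(D) = Sp^2 * (1/n1 + 1/n2)
    return d, var_d
```

The returned variance is exactly what the weight formula Wi = 1 / VYi consumes.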

Standardized Mean Difference, d and g

As noted, the raw mean difference is a useful index when the measure is meaningful, either inherently or because of widespread use. By contrast, when the measure is less well known, the use of a raw mean difference has less to recommend it. In any event, the raw mean difference is an option only if all the studies in the meta-analysis use the same scale. If different studies use different instruments to assess the outcome, then the scale of measurement will differ from study to study and it would not be meaningful to combine raw mean differences.

In such cases we can divide the mean difference in each study by that study’s standard deviation to create an index (the standardized mean difference, SMD) that would be comparable across studies. This is the same approach suggested by Cohen in connection with describing the magnitude of effects in statistical power analysis. The standardized mean difference can be considered as being comparable across studies based on either of two arguments (Hedges and Olkin, 1985). If the outcome measures in all studies are linear transformations of each other, the standardized mean difference can be seen as the mean difference that would have been obtained if all data were transformed to a scale where the standard deviation within-groups was equal to 1.0.

The other argument for comparability of standardized mean differences is the fact that the standardized mean difference is a measure of overlap between distributions. In this telling, the standardized mean difference reflects the difference between the distributions in the two groups (and how each represents a distinct cluster of scores) even if they do not measure exactly the same outcome.

Computing d and g from studies that use independent groups

We can estimate the standardized mean difference from studies that used two independent groups as

d = (xbar1 - xbar2) / Swithin,   Swithin = √[((n1 - 1)S1^2 + (n2 - 1)S2^2) / (n1 + n2 - 2)]

where Swithin is the pooled standard deviation across groups, n1 and n2 are the sample sizes in the two groups, and S1 and S2 are the standard deviations in the two groups. The reason that we pool the two sample estimates of the standard deviation is that even if we assume that the underlying population standard deviations are the same, it is unlikely that the sample estimates S1 and S2 will be identical. By pooling the two estimates of the standard deviation, we obtain a more accurate estimate of their common value.

The sample estimate of the standardized mean difference is often called Cohen’s d in research synthesis. Some confusion about the terminology has resulted from the fact that the index 𝛿, originally proposed by Cohen as a population parameter for describing the size of effects for statistical power analysis, is also sometimes called d. The variance of d is given by,

Vd = (n1 + n2) / (n1 n2) + d^2 / (2(n1 + n2))

Again, with the standardized mean difference and its variance known, we could compute the confidence interval of the standardized mean difference. However, it turns out that d has a slight bias, tending to overestimate the absolute value of 𝛿 in small samples. This bias can be removed by a simple correction that yields an unbiased estimate of 𝛿, with the unbiased estimate sometimes called Hedges’ g (Hedges, 1981). To convert from d to Hedges’ g we use a correction factor, which is called J. Hedges (1981) gives the exact formula for J, but in common practice researchers use an approximation,

J = 1 - 3 / (4df - 1),   df = n1 + n2 - 2

g = J × d,   Vg = J^2 × Vd
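A sketch combining the formulas above for d, its variance, and the Hedges' g correction (the helper name is ours; it assumes raw data from two independent groups):

```python
import numpy as np

def cohens_d_and_hedges_g(x1, x2):
    """Standardized mean difference d, its variance, and bias-corrected g."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)

    # Pooled within-groups standard deviation.
    s_within = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1))
                       / (n1 + n2 - 2))
    d = (x1.mean() - x2.mean()) / s_within
    v_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

    # Small-sample correction factor J (the common approximation).
    df = n1 + n2 - 2
    J = 1 - 3 / (4 * df - 1)
    return d, v_d, J * d, J**2 * v_d   # d, Vd, g, Vg
```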

Summary

  • Under the fixed-effect model all studies in the analysis share a common true effect.
  • The summary effect is our estimate of this common effect size, and the null hypothesis is that this common effect is zero (for a difference) or one (for a ratio).
  • All observed dispersion reflects sampling error, and study weights are assigned with the goal of minimizing this within-study error.

Converting Among Effect Sizes

Although it would be ideal for all studies under investigation to report the same widely used outcome measure, it is not uncommon for the outcome measures of individual studies to differ. When we convert between different measures we make certain assumptions about the nature of the underlying traits or effects. Even if these assumptions do not hold exactly, the decision to use these conversions is often better than the alternative, which is to simply omit the studies that happened to use an alternate metric. This would involve loss of information, and possibly the systematic loss of information, resulting in a biased sample of studies. A sensitivity analysis to compare the meta-analysis results with and without the converted studies would be important.

Figure 7.1 outlines the mechanism for incorporating multiple kinds of data in the same meta-analysis. First, each study is used to compute an effect size and variance in its native index: the log odds ratio for binary data, d for continuous data, and r for correlational data. Then, we convert all of these indices to a common index, which would be either the log odds ratio, d, or r. If the final index is d, we can move from there to Hedges’ g. This common index and its variance are then used in the analysis.

We can convert from a log odds ratio to the standardized mean difference d using

d = LogOddsRatio × √3 / 𝜋

where 𝜋 is the mathematical constant. The variance of d would then be

Vd = VLogOddsRatio × 3 / 𝜋^2

where VlogOddsRatio is the variance of the log odds ratio. This method was originally proposed by Hasselblad and Hedges (1995) but variations have been proposed. It assumes that an underlying continuous trait exists and has a logistic distribution (which is similar to a normal distribution) in each group. In practice, it will be difficult to test this assumption.
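A small sketch of this conversion (the function name is ours, and the example odds ratio of 1.8 with Var(log OR) = 0.04 is invented):

```python
import math

def log_odds_ratio_to_d(log_or, v_log_or):
    """Hasselblad-Hedges conversion from a log odds ratio to d."""
    d = log_or * math.sqrt(3) / math.pi    # d = LogOddsRatio * sqrt(3) / pi
    v_d = v_log_or * 3 / math.pi ** 2      # Vd = V_LogOddsRatio * 3 / pi^2
    return d, v_d

print(log_odds_ratio_to_d(math.log(1.8), 0.04))
```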

Pharmacokinetics – Distribution Series II – Rate of Drug Distribution

November 13, 2017 | Biopharmaceutics, Pharmacokinetics

Figure 4.1 shows the plasma concentration and the typical tissue concentration profile after the administration of a drug by intravenous injection. It can be seen that during the distribution phase, the tissue concentration increases as the drug distributes to the tissue. Eventually, a type of equilibrium is reached, and following this, in the postdistribution phase, the tissue concentration falls in parallel with the plasma concentration.

Drug distribution is a two-stage process that consists of:

1. Delivery of the drug to the tissue by the blood

2. Diffusion or uptake of drug from the blood to the tissue

The overall rate of distribution is controlled by the slowest of these steps. The delivery of drug to the tissue is controlled by the specific blood flow to a given tissue. This is expressed as tissue perfusion, the volume of blood delivered per unit time (mL/min) per unit of tissue (g). Once at the tissue site, uptake or distribution from the blood is driven largely by the passive diffusion of drug across the epithelial membrane of the capillaries. Because most capillary membranes are very loose, drugs can usually diffuse from the plasma very easily. Consequently, in most cases, drug distribution is perfusion controlled. The rate of drug distribution will vary from one tissue to another, and generally, drugs will distribute fastest to the tissues that have the highest perfusion rates.

Perfusion-Controlled Drug Distribution

Drug is presented to the tissues in the arterial blood, and any uptake of drug by the tissue will result in a lower concentration of drug leaving the tissue in the venous blood. The amount of drug delivered to the tissue per unit time or rate of presentation of a drug to a tissue is given by

rate of presentation = Q * Ca

where Ca is the drug concentration in the arterial blood and Q is the blood flow to the tissue. Similarly,

rate at which drug leaves the tissue = Q * Cv

where Cv is the drug concentration in the venous blood. Therefore,

rate of uptake = Q * (Ca - Cv) (remember the O2ER in oxygen delivery?)

When drug uptake is perfusion controlled, the tissue presents no barrier for drug uptake, and the initial rate of uptake will equal the rate of presentation:

initial rate of uptake = Q * Ca

Thus, it is a first-order process. The value of Ca will change continuously as distribution proceeds throughout the body and as drug is eliminated. When the distribution phase in a tissue is complete, the concentration of drug in the tissue will be in equilibrium with the concentration leaving the tissue (venous blood). The ratio of these concentrations is expressed using the tissue:blood partition coefficient (Kp):

Kp = Ct / Cv

where Ct is the tissue concentration. The value of Kp will depend on the binding and the relative affinity of a drug for the blood and tissues. Tissue binding will promote a large value of Kp, whereas extensive binding to the plasma proteins will promote a small Kp.

Once the initial distribution phase is complete, the amount of drug in the tissue (At) at any time is

At = Ct * Vt = Kp * Cv * Vt

Distribution is a first-order process, and the rate of distribution may be expressed using the first-order rate constant for distribution (Kd). The physiological determinants of the rate constant for distribution are most easily identified by considering the redistribution process, which is governed by the same physiological factors and has the same rate constant as those for distribution.

If the drug concentration in arterial blood suddenly became zero, the

rate of redistribution = Kd * At = Kd * (Kp * Cv * Vt) = |Q * (Ca - Cv)| = |Q * (0 - Cv)| = Q * Cv

Thus, 

Kd = Q / (Vt * Kp)

The first-order rate constant for distribution is equal to tissue perfusion divided by the tissue:blood partition coefficient, and the corresponding distribution half-life is computed by dividing ln(2) (0.693) by Kd.
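As a rough numerical illustration of this relationship (the perfusion and Kp values are invented, and tissue density is assumed to be about 1 g/mL so that perfusion per gram of tissue can stand in for Q/Vt):

```python
import math

def distribution_half_life(perfusion_ml_per_h_per_g, kp):
    """Distribution half-life for perfusion-controlled distribution.

    Assumes Kd = (Q / Vt) / Kp, treating tissue density as ~1 g/mL
    so that perfusion per gram of tissue approximates Q / Vt.
    """
    kd = perfusion_ml_per_h_per_g / kp   # first-order rate constant (1/h)
    return math.log(2) / kd              # t1/2 = ln(2) / Kd

# Hypothetical values: perfusion of 5 mL/h per g tissue and Kp = 2.
print(f"t1/2 = {distribution_half_life(5.0, 2.0):.2f} h")   # ~0.28 h
```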

Summary

The time it takes for distribution to occur is dependent on tissue perfusion. Generally, drugs distribute to well-perfused tissues such as the lungs and major organs faster than they do to poorly perfused tissues such as resting muscle and skin.

The duration of the distribution phase is also dependent on Kp. If a drug has a high Kp value, it may take a long time to achieve equilibrium even if the tissue perfusion is relatively high. If, on the other hand, a drug has a high Kp value in a tissue with low perfusion, it will require an extended period of drug exposure to reach equilibrium.

The amount of drug in tissue at equilibrium depends on Kp and on the size of the tissue. A drug may concentrate in a tissue (high Kp), but if the tissue is physically small, the total amount of drug present in the tissue will be low. The distribution of a drug to such a tissue may not have a strong impact on the plasma concentration of the drug.

Redistribution of a drug from the tissues back to the blood is controlled by exactly the same principles. Thus, redistribution takes less time when the Kp value is small and the perfusion is high, and will take a long time when the Kp is high and the perfusion is low.

Diffusion-Controlled Drug Distribution

The epithelial junctions in some tissues, such as the brain, placenta, and testes, are very tightly knit, and the diffusion of more polar and/or large drugs may proceed slowly. As a result, drug distribution in these tissues may be diffusion controlled. In this case, drug distribution will proceed more slowly for polar drugs than for more lipophilic drugs. It must be pointed out that not all drug distribution to these sites is diffusion controlled. For example, small lipophilic drugs such as the intravenous anesthetics can easily pass membranes by the transcellular route and display perfusion-controlled distribution to the brain.

Diffusion-controlled distribution may be expressed by Fick's law

rate of uptake = Pm * SAm * (Cpu – Ctu)

where Pm is the permeability of the drug through the membrane (cm/h), SAm the surface area of the membrane (cm2), Cpu the unbound drug concentration in the plasma (mg/mL), and Ctu the unbound concentration in the tissue (mg/mL).

Initially, the drug concentration in the tissue is very low, Cpu >> Ctu, so the equation may be written

rate of uptake = Pm * SAm * Cpu

from which it can be seen that, under these circumstances, the rate of diffusion approximates a first-order process.
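A small sketch of this first-order approximation (the function name and the parameter values in the example are invented for illustration):

```python
def diffusion_uptake_rate(pm, sa_m, cpu, ctu=0.0):
    """Rate of diffusion-controlled uptake via Fick's law.

    pm:  membrane permeability (cm/h)
    sa_m: membrane surface area (cm^2)
    cpu, ctu: unbound plasma and tissue concentrations (mg/mL)
    Early in distribution Ctu ~ 0, so the rate is approximately
    first order in the unbound plasma concentration.
    """
    return pm * sa_m * (cpu - ctu)   # cm/h * cm^2 * mg/mL = mg/h

# Hypothetical values: Pm = 0.05 cm/h, SAm = 1000 cm^2, Cpu = 2 mg/mL.
print(diffusion_uptake_rate(0.05, 1000.0, 2.0))   # 100.0 mg/h
```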