Data Pooling and Coordinated Comparative Analysis

Pooling is the ultimate aggregation of evidence from multiple studies addressing the same topic in a similar manner. In this instance, it is not just the final results that are synthesized, but the raw data that are merged in the analysis. This procedure obviously requires the data from each of the component studies rather than reliance on published information. Furthermore, it requires a certain minimum degree of compatibility across studies in the structure of the data (study design and measurements). The ideal situation for data pooling, of course, is a true multicenter study, in which identical research protocols were followed to ensure that the data would be compatible in all respects. Sometimes the equivalent of a multicenter study with a common protocol results from a systematic effort by a consortium of investigators or a funding agency to ensure such compatibility.

In pooling data across studies that were not designed to be so fully compatible, some compromises or assumptions are necessary to allow the integration of results. In fact, such efforts should be viewed with appropriate skepticism as to whether a single estimate from the aggregation of studies is the most informative way to understand the evidence that the set of studies is providing. The only goal of pooling is to reduce random error by merging multiple studies. The logic behind this approach is straightforward: if a series of studies are indistinguishable with regard to methods, then any differences among their results are presumably due to random error. It is as though one large study had been conducted and its results arbitrarily subdivided, losing precision in the division with no compensating gain in validity. Under those conditions, nothing could be learned from examining the variability among the studies about the influence of study design and conduct on results. The most succinct presentation of results, and the one that would most accurately describe the role of random error, would be a pooled estimate. In practice, it is very rare for epidemiologic studies to be so comparable in design and conduct that the differences among them are uninformative.
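As a concrete illustration of the distinction between a pooled estimate and a set of study-specific estimates, consider the following minimal sketch. It assumes each contributing case-control study supplies individual-level records with hypothetical columns `case` (1/0), `exposure`, and `study`; the column names and the simple logistic model are illustrative, not drawn from any of the studies discussed in this chapter.

```python
# A minimal sketch of data pooling with individual-level records.
# Column names ('case', 'exposure', 'study') are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

def pooled_estimate(study_frames):
    """Merge the raw records from all studies and fit one logistic model,
    with study indicators so that each study contributes only its own
    internal case-control comparison."""
    pooled = pd.concat(study_frames, ignore_index=True)
    # C(study) adds a fixed effect for study membership, preserving the
    # within-study character of each comparison.
    fit = smf.logit("case ~ exposure + C(study)", data=pooled).fit(disp=False)
    return fit.params["exposure"], fit.conf_int().loc["exposure"]

def per_study_estimates(study_frames):
    """The contrast: one model per study, each with less precision."""
    return [smf.logit("case ~ exposure", data=df).fit(disp=False).params["exposure"]
            for df in study_frames]
```

Fitting one model to the merged records, with study indicators, is what yields the single pooled estimate; the per-study fits show the precision lost when the same data are subdivided.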

An evaluation of the role of residential magnetic field exposure and the risk of childhood leukemia (Greenland et al., 2000) provides a useful illustration of the value and limitations of the approach. A series of studies largely from the United States and Western Europe had evaluated the hypothesis that prolonged residential exposure to elevated levels of magnetic fields from outside power lines and electrical wiring in the home might lead to an increased risk of leukemia in children. Many methodologic concerns within each of the studies and the irregular pattern of results rendered the findings inconclusive regarding a causal relation between magnetic field exposure and cancer (NRC, 1997; Portier and Wolfe, 1998). Although there were many concerns, only one of those issues, random error, could be addressed by data pooling. In this instance, the most informative results might come from studying the occupants of homes with the very highest levels of magnetic fields, but such homes were rare in each of the studies (Table 11.1). Thus, enhancing precision for the higher dose portion of the dose-response curve was a perfectly valid, if narrow, goal.

Only a handful of cases and controls had been identified in each of the studies whose home measurements were above 0.2 microtesla, and even fewer above 0.3 microtesla, with the vast majority of homes between 0.05 and 0.2 microtesla (Table 11.1). To generate improved estimates of the dose-response function relating measured magnetic fields to childhood leukemia risk, Greenland et al. (2000) obtained raw data from all the investigators who had conducted relevant studies of this topic and pooled the data to generate dose-response estimates across a wide exposure range. The results are most interesting: no indication of increasing leukemia risk with increasing exposure was found for the range below 0.2 microtesla, whereas above that level, a clear (though still imprecise) indication was found of increasing risk with increasing exposure (Fig. 11.1). Risk estimates for exposure in the range of 0.3 to 0.6 microtesla, which no individual study could address, were the most supportive of a positive relation between magnetic fields and leukemia.

Table 11.1. Study-Specific Distributions of Magnetic-Field Data, Pooled Analysis of Magnetic Fields, Wire Codes, and Childhood Leukemia

Counts of cases and of controls in each contributing study (Coghill, Dockerty, Feychting, Linet, London, McBride, Michaelis, Olsen, Savitz, Tomenius, Tynes, Verkasalo†) by magnetic-field category (µT): <0.1, 0.1–<0.2, 0.2–<0.3, 0.3–<0.4, 0.4–<0.5, >0.5, total, and no measure.* [Individual cell counts are not recoverable from this copy; see Greenland et al. (2000).]

*No measure for a residence at or before the time of diagnosis (cases) or the corresponding index date (controls). †See Greenland et al. (2000) for citations to the original reports. Source: Greenland et al., 2000.

Figure 11.1. Floated case-control ratios from a 3-degree-of-freedom quadratic-logistic spline model fit to the pooled magnetic field data, with adjustment for study, age, and sex; horizontal axis, magnetic field (microteslas). Inner dotted lines are pointwise 80% confidence limits; outer dotted lines are pointwise 99% confidence limits (Greenland et al., 2000).

Note that the myriad limitations in the design of the individual studies, including potential response bias, exposure misclassification, and confounding, were not resolved through data pooling, but the sparseness of the data in the range of the dose-response curve in which random error was a profound limitation was overcome to some extent.
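For readers who want to see the mechanics, the following is a hedged sketch of the kind of model described in the caption to Figure 11.1: a quadratic-logistic spline with 3 degrees of freedom, adjusted for study, age, and sex. The column names (`case`, `field_uT`, `age`, `sex`, `study`) are assumed for illustration; this is not the authors' actual code.

```python
# A sketch of a 3-df quadratic-logistic spline dose-response fit on pooled
# data, mirroring the description in the Figure 11.1 caption.
# Column names are hypothetical.
import statsmodels.formula.api as smf

def spline_dose_response(pooled):
    # bs(..., df=3, degree=2) builds a quadratic B-spline basis with 3
    # degrees of freedom for the measured field; age, sex, and study
    # enter as adjustment covariates.
    formula = "case ~ bs(field_uT, df=3, degree=2) + age + sex + C(study)"
    return smf.logit(formula, data=pooled).fit(disp=False)

# Predicted case-control ratios across the exposure range can then be
# examined by calling .predict() on a grid of field_uT values, holding
# the adjustment covariates fixed.
```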

A technique that is often better suited to maximizing the information from a series of broadly comparable studies, which also requires access to the raw data or at least to highly cooperative collaborators willing to undertake additional analyses, is comparative analysis. In this approach, rather than integrating all the data into the same analysis, parallel analyses are conducted using the most comparable statistical techniques possible. That is, instead of the usual situation in which different eligibility criteria are employed, different exposure categories are created, different potential confounders are controlled, etc., the investigative team imposes identical decision rules on all the studies that are to be included. Applying identical decision rules and analytic methods removes those factors as candidate explanations for inconsistent results and sharpens the focus on factors that remain different across studies such as the range of exposure observed or selective non-response.

The basic design and the data collection methods are not amenable to modification at the point of conducting a comparative analysis, of course, except insofar as they can be changed by imposing restrictions on the available data (e.g., more stringent eligibility criteria). The series of decisions that lead from the raw data to the final results, however, are under the control of the investigators conducting the comparative analysis, and the extent to which those choices account for differences in results can be evaluated empirically. When multiple studies address the same issue in a similar manner and yield incompatible results, the possibility of artifactual differences resulting from the details of analytic methods needs to be entertained. Comparative analysis addresses this hypothesized basis for differences in results directly, and either pinpoints the source of disparity or demonstrates that such methodologic decisions were not responsible for the disparate findings. The opportunity to align the studies on a common scale of exposure and response yields an improved understanding of the evidence generated by the series of studies.
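A minimal sketch of this parallel-analysis discipline follows, assuming hypothetical column names (`case`, `field_uT`, `age`, `sex`); the specific eligibility rule, exposure cut points, and confounder set are placeholders chosen for illustration. The essential point is that each analytic decision is fixed once and then applied identically to every study.

```python
# A sketch of comparative (parallel) analysis under identical decision rules.
# Column names, cut points, and the eligibility rule are illustrative only.
import pandas as pd
import statsmodels.formula.api as smf

CUTPOINTS = [0.0, 0.1, 0.2, 0.3, float("inf")]   # one shared exposure scale
CONFOUNDERS = "age + sex"                         # one shared adjustment set

def comparative_analysis(study_frames):
    """study_frames: mapping of study name -> DataFrame of raw records."""
    results = {}
    for name, df in study_frames.items():
        df = df[df["age"] < 15].copy()            # identical eligibility rule
        df["exp_cat"] = pd.cut(df["field_uT"], CUTPOINTS,
                               labels=False, include_lowest=True)
        fit = smf.logit(f"case ~ C(exp_cat) + {CONFOUNDERS}",
                        data=df).fit(disp=False)
        results[name] = fit.params.filter(like="exp_cat")
    # Study-by-study log odds ratios, now on one common exposure scale.
    return pd.DataFrame(results)
```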

Lubin and Boice (1997) conducted such an analysis to summarize the evidence on residential radon and lung cancer. A series of eight pertinent case-control studies had been conducted, but both the range of radon exposure evaluated and the analytic approaches differed across studies, in some cases substantially. They found that the radon dose range addressed by the studies differed markedly (Fig. 11.2), and the results could be reconciled, in part, by taking the dose range into account more formally. Each study had focused on the internal comparison of higher and lower exposure groups within its own setting, yet the absolute radon exposure levels that the studies addressed were quite distinct, with the highest dose group ranging from approximately 150 Bq/m3 to approximately 450 Bq/m3. If in fact the studies in a lower dose range found no increase in risk with higher exposure, and those in the higher dose range did find such a pattern, it would be difficult to declare the studies inconsistent: their summary results would differ, yet they would be consistent with one another once the differing dose ranges were taken into account.

In this instance, the results were largely compatible when put on a common scale, with a combined relative risk estimate of 1.14 for a dose of 150 Bq/m3 (Table 11.2). Even the combination of evidence across residential exposures and the much higher occupational exposures from studies of miners showed compatible findings when put on a common dose scale. Few exposures permit dose to be quantified well enough to reconcile study findings in this way, but the example clearly illustrates the need to go beyond a simple dichotomy of consistent versus inconsistent results in gleaning the maximum information from a set of studies.
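To make the common-scale idea concrete, here is a small worked sketch in the spirit of Lubin and Boice (1997), under an assumed log-linear dose-response model; the slopes below are invented solely to show the mechanics and are not the published estimates.

```python
# Expressing every study as a relative risk at the same reference dose,
# assuming a log-linear model: log RR = beta * dose.
import math

REFERENCE_DOSE = 150.0  # Bq/m3, the dose used for the combined estimate

def rr_at_reference(slope_per_bq):
    """RR at the reference dose is exp(beta * 150) under the assumed model."""
    return math.exp(slope_per_bq * REFERENCE_DOSE)

# Hypothetical slopes (per Bq/m3), made up to illustrate the conversion.
hypothetical_slopes = {"study_A": 0.0005, "study_B": 0.0012}
for name, beta in hypothetical_slopes.items():
    print(name, round(rr_at_reference(beta), 2))
```

Expressing each study as a relative risk at the same reference dose is what allows studies spanning different exposure ranges to be compared directly rather than labeled consistent or inconsistent on their internal contrasts alone.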
