Metaprop: a Stata command to perform meta-analysis of binomial data
© Nyaga et al.; licensee BioMed Central Ltd. 2014
Received: 5 May 2014
Accepted: 11 July 2014
Published: 10 November 2014
Meta-analyses have become an essential tool in synthesizing evidence on clinical and epidemiological questions derived from a multitude of similar studies assessing the particular issue. Appropriate and accessible statistical software is needed to produce the summary statistic of interest.
Metaprop is a statistical program implemented to perform meta-analyses of proportions in Stata. It builds further on the existing Stata procedure metan which is typically used to pool effects (risk ratios, odds ratios, differences of risks or means) but which is also used to pool proportions. Metaprop implements procedures which are specific to binomial data and allows computation of exact binomial and score test-based confidence intervals. It provides appropriate methods for dealing with proportions close to or at the margins where the normal approximation procedures often break down, by use of the binomial distribution to model the within-study variability or by allowing Freeman-Tukey double arcsine transformation to stabilize the variances. Metaprop was applied on two published meta-analyses: 1) prevalence of HPV-infection in women with a Pap smear showing ASC-US; 2) cure rate after treatment for cervical precancer using cold coagulation.
The first meta-analysis showed a pooled HPV-prevalence of 43% (95% CI: 38%-48%). In the second meta-analysis, the pooled percentage of cured women was 94% (95% CI: 86%-97%).
By using metaprop, no studies with 0% or 100% proportions were excluded from the meta-analysis. Furthermore, study-specific and pooled confidence intervals were always within admissible values, contrary to the original publication, where metan was used.
Meta-analyses combine information from multiple studies in order to derive an average estimate. Different meta-analysis procedures exist depending on the statistic to be reported. Examples of statistics of interest include association measures such as risk difference, risk ratio, odds ratio, difference in means, or simply one-dimensional binomial or continuous measures such as proportions or means.
There are three important aspects in meta-analysis: a) the analysis framework, b) the model and c) the choice of the method to estimate the heterogeneity parameter. These aspects interact with each other. A meta-analyst has a choice between the fixed- and random-effects model.
In the fixed-effects model, it is assumed that the parameter of interest is identical across studies and the difference between the observed proportion and the mean is only due to sampling error. In the random-effects model, the observed difference between the proportions and the mean cannot be entirely attributed to sampling error and other factors such as differences in study population, study designs, etc. could also contribute. Each study estimates a different parameter, and the pooled estimate describes the mean of the distribution of the estimated parameters. The variance parameter describes the heterogeneity among the studies and in the case where the variance is zero, this model simply reduces to the fixed-effects model.
There are three frameworks for modeling binomial data. The most popular framework uses an approximation to the normal distribution by means of transformations and is known as the approximate likelihood approach [1, 2]. Common transformations include the logit and the arcsine. This approach is popular because it requires less statistical expertise, computes faster, and is supported by readily available software.
The second approach recognises the true nature of the data and is known as the exact likelihood approach. In this framework, the special relationship between the mean and the variance that characterises binomial data is captured by the binomial distribution. The beta-binomial distribution can be used to fit a random-effects model such that the beta distribution describes the distribution of the varying binomial parameters. While it is possible to perform computations to estimate the parameters of the binomial model, most common statistical software lacks functions to fit the beta-binomial model and, therefore, this approach is the least popular. WinBUGS, a software package for Bayesian statistics, has the capability to perform such analyses. Other software, e.g. R and SAS (PROC NLMIXED), can also be used, but extensive programming is required.
The third approach is a compromise between approximate and exact likelihood. In the first stage, the data are modeled using the binomial distribution. In the second stage, the normal distribution is used after the logit transformation to model the heterogeneity among the studies. This is an emerging approach and is often recommended by statisticians. Most statistical software, including Stata (melogit), R, and SAS (PROC NLMIXED), has the capability to perform such analyses.
There are three popular methods to estimate the parameters: the non-iterative method popularised by DerSimonian and Laird, maximum likelihood (ML), and restricted maximum likelihood (REML). For the random-effects model, the REML method is preferred because ML leads to underestimation of the variance parameter. For generalized linear mixed models [2, 7, 8], under which models for binomial data fall, the REML method is not used because it requires computationally intensive high-dimensional integration over the random effects; as a result, most software estimates the heterogeneity parameter using ML methods. The procedure proposed by DerSimonian and Laird is efficient for the mean but not for the heterogeneity parameter.
Various procedures to perform meta-analysis have been implemented in the Stata command metan. In metan, the confidence intervals are calculated using the normal distribution based on the asymptotic variance. For proportions such intervals may contain inadmissible values especially when the statistic is near the boundary. Furthermore, computation of confidence intervals is not possible when the statistic is on the boundary, as the estimated standard error is set to zero and as a consequence, the metan command automatically excludes studies with proportion equal to 0 or 1 from the calculation of the pooled estimate.
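To make the boundary problem concrete, the following sketch computes a Wald interval by hand for two hypothetical studies (19 of 20 and 20 of 20). The counts and scalar names are illustrative and are not taken from the datasets analysed in this article; the standard error is assumed to be the usual asymptotic one, sqrt(p(1-p)/n), and only built-in Stata functions are used.

```stata
* Hypothetical study: 19 events out of 20
scalar wald_p  = 19/20
* asymptotic standard error of a proportion
scalar wald_se = sqrt(wald_p*(1 - wald_p)/20)
* two-sided 95% critical value
scalar wald_z  = invnormal(0.975)
scalar wald_lo = wald_p - wald_z*wald_se
scalar wald_hi = wald_p + wald_z*wald_se
* the upper limit falls above 1, an inadmissible value
display "Wald 95% CI: " wald_lo " to " wald_hi

* Hypothetical study on the boundary: 20 events out of 20
scalar wald_p1  = 20/20
* the standard error collapses to zero, so no interval or weight can be formed
scalar wald_se1 = sqrt(wald_p1*(1 - wald_p1)/20)
display "standard error when p = 1: " wald_se1
```

A zero standard error implies an infinite inverse-variance weight, which is why such a study cannot enter the pooled estimate without some adjustment.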
Tests of significance on the pooled proportion typically rely on normal probabilities. Proportions ($p = r/n$) are binomial, and the normal distribution is a good approximation of the binomial distribution if n is large enough and p is not close to the margins. When n is small and/or p is near the margins, the test statistic may not be approximately normally distributed due to its skewness and discreteness. To make the normal distribution assumptions more applicable to significance testing, several transformations have been suggested. Freeman and Tukey presented a double arcsine transformation to stabilize the variance.
We have developed metaprop, a new program in Stata to perform meta-analyses of binomial data to supplement the metan command, which is typically used to pool associations. metaprop builds further on the metan procedure. It allows computation of 95% confidence intervals using the score statistic and the exact binomial method and incorporates the Freeman-Tukey double arcsine transformation of proportions. The program also allows the within-study variability to be modelled using the binomial distribution. This article presents a general overview of the program to serve as a starting point for users interested in performing meta-analysis of proportions in Stata.
Summary of the procedures available in metaprop
Option in metaprop: cimethod(score)
Computes the study-specific confidence intervals using the score method.
Study-specific intervals always yield admissible values (within the limits of 0 and 1).
The coverage probability of the study-specific confidence intervals is close to the nominal level.
The Wald confidence intervals for the pooled estimate could be inadmissible if study-specific estimates are on or close to the margin.
Option in metaprop: cimethod(exact)
Computes the study-specific confidence intervals using the exact method.
Study-specific intervals always yield admissible values.
It is a more conservative method, and therefore study-specific confidence intervals tend to be too wide.
The Wald confidence intervals for the pooled estimate could be inadmissible if study-specific estimates are on or close to the margin.
Option in metaprop: ftt
Performs the Freeman-Tukey double arcsine transformation, computes the weighted pooled estimate and performs the back-transformation on the pooled estimate.
The confidence intervals for the pooled estimate are always admissible. Tests of significance based on the normal approximation are more applicable than without the transformation.
The procedure could break down in case of extremely sparse data.
Option in metaprop_one: logit
Uses the binomial distribution to model the within-study variability.
The confidence intervals for the study-specific estimate and pooled estimate are always admissible.
Requires metaprop_one, available for Stata 13 or later versions.
It is an iterative procedure and therefore requires more computational time than non-iterative procedures.
Confidence intervals for the individual studies
Two types of confidence intervals for the study-specific proportions have been implemented. Throughout the text, for study $i$, $r_i$ denotes the number of observations with a certain characteristic, $n_i$ is the total number of observations, $\hat{p}_i = r_i/n_i$ is the observed proportion, $k$ is the total number of studies in the meta-analysis, and $1 - \alpha$ refers to the selected level of confidence.
Exact confidence intervals
The exact or Clopper-Pearson  confidence limits for a binomial proportion are constructed by inverting the equal-tailed test based on the binomial distribution.
The lower endpoint is the $\alpha/2$ quantile of a beta distribution, $\mathrm{Beta}(r_i, n_i - r_i + 1)$, and the upper endpoint is the $1 - \alpha/2$ quantile of a beta distribution, $\mathrm{Beta}(r_i + 1, n_i - r_i)$. Since the binomial distribution is discrete, the coverage probability of the exact intervals is not exactly $1 - \alpha$ but at least $1 - \alpha$, and consequently exact confidence intervals are considered conservative.
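The beta-quantile form of the exact limits can be evaluated directly with Stata's inverse cumulative beta function invibeta(a,b,p). The sketch below uses hypothetical counts (18 of 20) and illustrative scalar names; it assumes the conventional boundary values of 0 for the lower limit when there are no events and 1 for the upper limit when every observation is an event.

```stata
* Hypothetical study: 18 events out of 20, 95% confidence
scalar cp_r = 18
scalar cp_n = 20
scalar cp_a = 0.05
* lower limit: alpha/2 quantile of Beta(r, n - r + 1); set to 0 when r = 0
scalar cp_lo = cond(cp_r == 0, 0, invibeta(cp_r, cp_n - cp_r + 1, cp_a/2))
* upper limit: 1 - alpha/2 quantile of Beta(r + 1, n - r); set to 1 when r = n
scalar cp_hi = cond(cp_r == cp_n, 1, invibeta(cp_r + 1, cp_n - cp_r, 1 - cp_a/2))
display "exact 95% CI: " cp_lo " to " cp_hi
```

Both limits stay inside the unit interval by construction, since they are quantiles of a distribution supported on [0, 1].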
Score confidence intervals
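The score confidence limits for study $i$, with $\hat{p}_i = r_i/n_i$, are computed as
$$\frac{\hat{p}_i + \frac{z^2}{2n_i} \pm z\sqrt{\frac{\hat{p}_i(1-\hat{p}_i)}{n_i} + \frac{z^2}{4n_i^2}}}{1 + \frac{z^2}{n_i}},$$
where $z$ is the $1 - \alpha/2$ percentile of the standard normal distribution.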
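As an illustration, the score limits can be computed by hand, assuming that the score interval meant here is the Wilson interval (Wilson's 1927 paper appears in the reference list). The counts are hypothetical and deliberately placed on the boundary (20 of 20) to show that the limits remain admissible; the scalar names are illustrative.

```stata
* Hypothetical study on the boundary: 20 events out of 20, 95% confidence
scalar sc_r = 20
scalar sc_n = 20
scalar sc_p = sc_r/sc_n
scalar sc_z = invnormal(1 - 0.05/2)
* common denominator of the Wilson limits
scalar sc_den  = 1 + sc_z^2/sc_n
* centre of the interval, pulled away from the boundary
scalar sc_mid  = (sc_p + sc_z^2/(2*sc_n))/sc_den
* half-width of the interval
scalar sc_half = sc_z*sqrt(sc_p*(1 - sc_p)/sc_n + sc_z^2/(4*sc_n^2))/sc_den
scalar sc_lo = sc_mid - sc_half
scalar sc_hi = sc_mid + sc_half
display "score 95% CI: " sc_lo " to " sc_hi
```

With an observed proportion of 1 the upper limit equals 1 and the lower limit reduces to n/(n + z^2), so the interval has positive width even though the Wald standard error is zero.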
Confidence Intervals for the pooled estimate after transformation
Freeman-Tukey double arcsine transformation
The transformed proportion for study $i$ is $t_i = \arcsin\sqrt{r_i/(n_i+1)} + \arcsin\sqrt{(r_i+1)/(n_i+1)}$. The asymptotic variance of the transformed variable is defined as $1/(n_i + 0.5)$. This transformation is intended to achieve approximate normality. The pooled estimate is then computed using the DerSimonian and Laird method based on the transformed values and their variances. The confidence intervals for the pooled estimate are then computed using the Wald method.
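The pooling steps described above can be traced by hand in a short do-file. This is a sketch under stated assumptions rather than the code used inside metaprop: it takes the double arcsine in its sum form with approximate variance 1/(n + 0.5), uses the standard DerSimonian and Laird moment estimator truncated at zero, and runs on three hypothetical studies. All variable and scalar names are illustrative.

```stata
* Hypothetical counts: num = events, denom = study size
clear
input num denom
18 20
45 50
30 30
end

* Freeman-Tukey double arcsine value and its approximate variance
generate double t = asin(sqrt(num/(denom + 1))) + asin(sqrt((num + 1)/(denom + 1)))
generate double v = 1/(denom + 0.5)

* Inverse-variance (fixed-effect) mean and Cochran's Q
generate double w  = 1/v
generate double wt = w*t
generate double w2 = w^2
quietly summarize w
scalar dl_k  = r(N)
scalar dl_sw = r(sum)
quietly summarize wt
scalar dl_tfix = r(sum)/scalar(dl_sw)
quietly summarize w2
scalar dl_sw2 = r(sum)
generate double q = w*(t - scalar(dl_tfix))^2
quietly summarize q
scalar dl_Q = r(sum)

* DerSimonian-Laird moment estimate of the between-study variance
scalar dl_tau2 = max(0, (dl_Q - (dl_k - 1))/(dl_sw - dl_sw2/dl_sw))

* Random-effects weights, pooled value and Wald limits (transformed scale)
generate double ws  = 1/(v + scalar(dl_tau2))
generate double wst = ws*t
quietly summarize ws
scalar dl_sws = r(sum)
quietly summarize wst
scalar dl_t  = r(sum)/scalar(dl_sws)
scalar dl_se = sqrt(1/dl_sws)
scalar dl_lo = dl_t - invnormal(0.975)*dl_se
scalar dl_hi = dl_t + invnormal(0.975)*dl_se
display "pooled t = " dl_t ", 95% limits " dl_lo " to " dl_hi
```

The study with 30 events out of 30 keeps a finite variance on the transformed scale, so it contributes to the pooled value instead of being dropped.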
Inverse of Freeman-Tukey double arcsine transformation
The back-transformation is computed as
$$\hat{p} = \frac{1}{2}\left\{1 - \operatorname{sgn}(\cos t)\left[1 - \left(\sin t + \frac{\sin t - 1/\sin t}{n}\right)^{2}\right]^{1/2}\right\},$$
where $t$ is the transformed value and $n$ is the sample size. In the meta-analysis setting, $t$ is the pooled estimate or one of its confidence limits on the transformed scale. In practice, this formula has to be applied to means of $t$'s derived from binomials with different $n$'s, because most studies included in a meta-analysis have different sample sizes. For this case, Miller suggested that the harmonic mean of the $n_i$'s be used in the conversion formula. For a set of numbers, the harmonic mean is the inverse of the arithmetic mean of the reciprocals of the numbers in the set.
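A minimal sketch of the back-transformation, assuming Miller's inversion formula for the sum form of the double arcsine and the harmonic mean of the study sizes in place of n. The pooled value of 2.6 and the three study sizes are hypothetical, and the scalar names are illustrative.

```stata
* Hypothetical pooled value on the double arcsine scale
scalar m_t = 2.6
* harmonic mean of three hypothetical study sizes (20, 50 and 30)
scalar m_n = 3/(1/20 + 1/50 + 1/30)
scalar m_s = sin(m_t)
* Miller's inversion: returns a proportion between 0 and 1
scalar m_p = 0.5*(1 - sign(cos(m_t))*sqrt(1 - (m_s + (m_s - 1/m_s)/m_n)^2))
display "harmonic mean n = " m_n ", back-transformed proportion = " m_p
```

The same expression would be applied to the lower and upper limits on the transformed scale to obtain limits for the pooled proportion.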
The logistic-normal random-effects model
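Following the two-stage description given earlier (a binomial distribution within studies and a normal distribution on the logit scale between studies), the model can be sketched as below. The symbols $\mu$ (mean logit), $\tau^2$ (between-study variance) and $u_i$ (study-level random effect) are notation assumed here for illustration.

```latex
r_i \mid p_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = \mu + u_i, \qquad
u_i \sim N(0,\, \tau^2), \qquad i = 1, \dots, k .
```

Under this formulation a pooled proportion can be obtained by back-transforming the estimated mean logit, $e^{\hat{\mu}}/(1 + e^{\hat{\mu}})$, which always lies between 0 and 1, and setting $\tau^2 = 0$ gives the corresponding fixed-effects model.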
The datasets used for the illustration were part of meta-analyses conducted by Arbyn et al.  and Dolman et al. . The datasets are available as clickable examples in the help file for metaprop.
Meta-analysis of the presence of high-risk HPV DNA in women with equivocal cervical cytology, by terminology group (ASCUS, Borderline Dyskaryosis or ASC-US)
[Table body: study-specific ES with 95% confidence interval, and a random pooled ES for each of the three terminology groups and overall; test(s) of heterogeneity with degrees of freedom; random: test for heterogeneity between sub-groups.]
** I 2: the variation in ES attributable to heterogeneity
Significance of test(s) of ES = 0 (one line per random pooled ES): z = 17.22, p = 0.000; z = 9.58, p = 0.000; z = 14.57, p = 0.000; z = 25.31, p = 0.000
The dataset contains author and year, which identify each study, and tgroup, which corresponds to the triage group (ASCUS, LSIL, borderline dyskaryosis). num and denom indicate the number of women with a positive HPV test (HC2 assay) and the total number of tested women, such that num/denom is the proportion with a positive HC2 test. se indicates the standard error computed as $\sqrt{p(1-p)/denom}$, with $p = num/denom$. lo and up are the lower and upper confidence limits computed using the 'exact' method.
The dataset contains nb_cured and nb_treated, which indicate the number of women cured of CIN and the total number of women treated for CIN, such that frac = nb_cured/nb_treated is the proportion of women cured of CIN, and se is the standard error. region indicates the continent in which the study was conducted. For studies with frac = 1, se = 0, and the authors replaced se with a value computed from up and low, the exact binomial confidence limits, to ensure that such studies were not excluded from the analysis.
The metaprop command is an adaptation of the metan programme developed by Harris et al.  intended to perform fixed and random-effects meta-analysis in Stata on continuous variables or associations between continuous or binomial variables. The metaprop program and its help file are available for downloading at http://ideas.repec.org/c/boc/bocode/s457781.html. The command requires Stata 10 or later versions and can be directly installed within Stata by typing ssc install metaprop when one is connected to the internet. An update to metaprop to include the logistic-normal random-effects model is also available for download. The updated command metaprop_one requires Stata 13 and can be directly installed within Stata by typing ssc install metaprop_one when one is connected to the internet.
We reproduce Figure one in Arbyn et al. metaprop pools proportions and presents weighted sub-group and overall pooled estimates with inverse-variance weights obtained from a random-effects model.
. metaprop num denom, random by(tgroup) cimethod(exact) /*
*/ label(namevar =author, yearvar =year) /*
*/ xlab(.25,0.5,.75,1)xline(0, lcolor(black)) /*
*/ subti(Atypical cervical cytology, size(4)) /*
*/ xtitle(Proportion,size(2)) nowt /*
*/ plotregion(icolor(ltbluishgray)) /*
*/ diamopt(lcolor(red)) /*
*/ pointopt(msymbol(x) msize(0)) boxopt(msymbol(S) mcolor(black))
Table 2 and Figure 1 both present the study specific proportions with 95% exact confidence intervals for each study, the sub-group and overall pooled estimate with 95% Wald confidence intervals and the I 2 statistic which describes the percentage of total variation due to inter-study heterogeneity. The table presents additional information on the pooled proportions and includes tests of heterogeneity within the sub-groups and overall. Significant intra-group heterogeneity was observed (p <0.001 with I 2 exceeding 93% for all the three terminology groups). However, no inter-group heterogeneity was noted (p = 0.925), supporting the pooling of all studies into one pooled measure: 43% (95% CI: 39-46%).
Though the weights have been computed using the random-effects model, the heterogeneity statistics have been computed by re-calculating the overall pooled estimate, treating the sub-group pooled estimates as though they were fixed-effects estimates. Since all study-specific proportions are close to 0.5, metan (see Figure one in Arbyn et al.) and metaprop (see Figure 1) produce similar results.
We extracted the data that generated Figure two in Dolman et al. (see Figure 2). Since the proportion of cured women is close to or at 1 in some studies, we enabled the Freeman-Tukey double arcsine transformation. Otherwise, studies with an estimated proportion of 1 would be excluded from the analysis, leading to a biased pooled estimate. Alternatively, using cc(#) ensures that such studies are not excluded. However, the pooled estimate is then not guaranteed to lie within the [0,1] interval, whereas this is automatic when the Freeman-Tukey double arcsine (ftt) option is enabled. We used the score confidence intervals for the individual studies.
. metaprop nb_cured nb_treated, random by(region) ftt cimethod(score) /*
*/ label(namevar = study) graphregion(color(white)) plotregion(color(white)) /*
*/ xlab(0.5,0.6,.7,0.8, 0.9, 1) /*
*/ xtick(0.5,0.6,.7,0.8, 0.9, 1) force /*
*/ xtitle(Proportion,size(2)) nowt stats /*
*/ olineopt(lcolor(black) lpattern(shortdash)) /*
*/ diamopt(lcolor(black)) /*
*/ boxopt(msymbol(S)) rcols(col) /*
*/ astext(70) texts(80) nohet notable
We extracted the data that generated Figure two in Dolman et al. (see Figure 2) and fit the logistic-normal random-effects model to them. With this model, studies with cure rates close to or at 1 are not a concern, since the exact method is used. The confidence intervals for the individual studies are also computed with the exact method. We used the updated command metaprop_one, which requires Stata 13, to fit the generalized linear mixed model (GLMM).
. metaprop_one nb_cured nb_treated, random logit groupid(study) ///
label(namevar =author, yearvar =year) sortby(year author) ///
xlab(.1,.2,.3,.4,.5,.6,.7,.8,.9,1) xline(0, lcolor(black)) ///
ti(Positivity of p16 immunostaining, size(4) color(blue)) ///
subti("Cytology = HSIL", size(4) color(blue)) ///
xtitle(Proportion,size(3)) nowt nostats ///
olineopt(lcolor(red) lpattern(shortdash)) ///
diamopt(lcolor(red)) pointopt(msymbol(s) msize(2)) ///
astext(70) texts(100)
Meta-analysis of the proportion of women cured of CIN1 disease with cold coagulation
[Table body: ES with 95% confidence interval for Hussein & Galloway (1985), de Cristofaro (1990) and Loobuyck & Duncan (1993), followed by the random pooled ES.]
We have presented procedures to perform meta-analysis of proportions in Stata. We adapted and made additions to the metan command to provide procedures which are specific for binomial data where the user specifies n and N denoting the number of individuals with the characteristic of interest and the total number of individuals. With metaprop, it is possible to perform a test of heterogeneity between groups when sub-group analysis is desired and the random-effects model has been used to compute the pooled estimate. In metan, a test for intergroup comparison is only produced when the fixed effects model is used in a subgroup meta-analysis.
When the estimated proportion is 0 or 1, the estimate of the standard error is zero and therefore the Wald confidence intervals cannot be computed. Studies with zero standard error are often excluded, since the weight assigned to such studies is infinite. Excluding such studies could lead to biased results, and users often compute the standard error in an ad hoc way. The continuity correction enabled by the cc(#) option avoids exclusion of studies with 0% or 100% prevalence. While this ensures that the studies are retained, the confidence intervals for the pooled estimate may yield inadmissible values.
Furthermore, use of Wald confidence intervals for the individual studies when the estimated proportion is close to zero or one often yields inadmissible values. This is because the Wald confidence intervals are always symmetric around the estimate. In contrast to the Wald intervals, the exact or score confidence intervals can be asymmetric, especially near the extreme values. By computing the exact or score confidence intervals for the individual studies, we are guaranteed admissible values. While the exact confidence intervals are regarded as the 'gold' standard, we recommend the use of score confidence intervals because their coverage is close to the nominal level, whereas the coverage of the exact method is always higher than the nominal level. By using the Freeman-Tukey double arcsine transformation, all the studies are retained. Furthermore, we are guaranteed to have admissible confidence intervals for each individual study as well as for the pooled proportion. While the distribution of the Freeman-Tukey double arcsine statistic is more normal for sparse data, the procedure breaks down with extremely sparse data and should thus be used with caution. Whenever possible, the use of exact methods is recommended for binomial data. As the sample size increases and when the proportions are not extreme, methods relying on transformed data and exact methods give results similar to those of approximate methods.
metaprop enables epidemiologists to pool proportions in Stata, avoiding problems encountered with metan. metaprop allows inclusion of studies with proportions equal to zero or 100 percent, and avoids confidence intervals exceeding the 0 to 1 range. The logistic-normal random-effects model draws the users a step closer towards the use of exact methods recommended for binomial data.
Financial support was received from: (1) the 7th Framework Programme of DG Research of the European Commission through the COHEAHR Network (grant No. 603019, coordinated by the Vrije Universiteit Amsterdam, the Netherlands) and the HPV-AHEAD project (FP7-HEALTH-2011-282562, coordinated by IARC, Lyon, France); (3) The Scientific Institute of Public Health (Brussels, through the OPSADAC project).
- Agresti A, Coull BA: Approximate is better than 'exact' for interval estimation of binomial proportions. Am Stat. 1998, 52 (2): 119-126.
- Breslow NE, Clayton DG: Approximate inference in generalized linear mixed models. J Am Stat Assoc. 1993, 88: 9-25.
- Miller JJ: The inverse of the Freeman-Tukey double arcsine transformation. Am Stat. 1978, 32 (4): 138-
- Hamza TH, van Houwelingen HC, Stijnen T: The binomial distribution of meta-analysis was preferred to model within-study variability. J Clin Epidemiol. 2008, 61: 41-51. 10.1016/j.jclinepi.2007.03.016.
- Molenberghs G, Verbeke G, Iddib S, Demétrio CGB: A combined beta and normal random-effects model for repeated, over-dispersed binary and binomial data. J Multivar Anal. 2012, 111: 94-109.
- DerSimonian R, Laird N: Meta-analysis in clinical trials. Control Clin Trials. 1986, 7: 177-188. 10.1016/0197-2456(86)90046-2.
- Engel E, Keen A: A simple approach for the analysis of generalized linear mixed models. Stat Neerl. 1994, 48: 1-22. 10.1111/j.1467-9574.1994.tb01428.x.
- Molenberghs G, Verbeke G, Demétrio CGB, Vieira AMC: A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat Sci. 2010, 3: 325-347.
- Jackson D, Bowden J, Baker R: How does the DerSimonian and Laird procedure for random effects meta-analysis compare with its more efficient but harder to compute counterparts?. J Stat Plan Inference. 2010, 140: 961-970. 10.1016/j.jspi.2009.09.017.
- Harris R, Bradburn M, Deeks J, Harbord R, Altman D, Sterne J: metan: fixed- and random-effects meta-analysis. Stata J. 2008, 8 (1): 3-28.
- Box GEP, Hunter JS, Hunter WG: Statistics for experimenters. 1978, Hoboken (NJ), USA: J Wiley & Sons Inc, Wiley Series in Probability and Statistics
- Freeman MF, Tukey JW: Transformations related to the angular and the square root. Ann Math Stats. 1950, 21 (4): 607-611. 10.1214/aoms/1177729756.
- Clopper CJ, Pearson ES: The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934, 26 (4): 404-413. 10.1093/biomet/26.4.404.
- Brown LD, Cai TT, DasGupta A: Interval estimation for a binomial proportion. Stat Sci. 2001, 16: 101-133.
- Newcombe RG: Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med. 1998, 17: 857-872. 10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E.
- Wilson EB: Probable inference, the law of succession, and statistical inference. J Am Stat Assoc. 1927, 22 (158): 209-212. 10.1080/01621459.1927.10502953.
- Arbyn M, Martin-Hirsch P, Buntinx F, Ranst MV, Paraskevaidis E, Dillner J: Triage of women with equivocal or low-grade cervical cytology results: a meta-analysis of the HPV test positivity rate. J Cell Mol Med. 2009, 13 (4): 648-659. 10.1111/j.1582-4934.2008.00631.x.
- Dolman L, Sauvaget C, Muwonge R, Sankaranarayanan R: Meta-analysis of the efficacy of cold coagulation as a treatment method for cervical intra-epithelial neoplasia: a systematic review. BJOG. 2014, 121: 929-942. 10.1111/1471-0528.12655.
- Arbyn M, Ronco G, Anttila A, Meijer CJLM, Poljak M, Ogilvie G, Koliopoulos G, Naucler P, Sankaranarayanan R, Petok J: Evidence regarding human papillomavirus testing in secondary prevention of cervical cancer. Vaccine. 2012, 30 (Suppl 5): F88-F99.
- Arbyn M, Roelens J, Simoens C, Buntinx F, Paraskevaidis E, Martin-Hirsch PP, Prendiville WJ: Human papillomavirus testing versus repeat cytology for triage of minor cytological cervical lesions. Cochrane Database Syst Rev. 2013, 3 (CD008054): 1-201.
- Westfall PH, Young SS: Resampling-based multiple testing: examples and methods for P-value adjustment. 1993, Hoboken (NJ), USA: John Wiley & Sons
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.