Abstract
This review presents the findings from controlled school-based sex education interventions published in the last 15 years in the US. The effects of the interventions in promoting abstinent behavior reported in 12 controlled studies were included in the meta-analysis. The results of the analysis indicated a very small overall effect of the interventions on abstinent behavior. Moderator analysis could only be pursued partially because of limited information in primary research studies.
Parental participation in the program, age of the participants, virgin-status of the sample, grade level, percentage of females, scope of the implementation and year of publication of the study were associated with variations in effect sizes for abstinent behavior in univariate tests.
However, only parental participation and percentage of females were significant in the weighted least-squares regression analysis. The richness of a meta-analytic approach appears limited by the quality of the primary research. Unfortunately, most of the research does not employ designs to provide conclusive evidence of program effects.
Suggestions to address this limitation are provided.

Introduction
Sexually active teenagers are a matter of serious concern. In the past decades many school-based programs have been designed for the sole purpose of delaying the initiation of sexual activity. There seems to be a growing consensus that schools can play an important role in providing youth with a knowledge base which may allow them to make informed decisions and help them shape a healthy lifestyle (St Leger). The school is the only institution in regular contact with a sizable proportion of the teenage population (Zabin and Hirsch), with virtually all youth attending it before they initiate sexual risk-taking behavior (Kirby and Coyle). These are referred to in the literature as abstinence-only or value-based programs (Repucci and Herman). Other programs, designated in the literature as safer-sex, comprehensive, secular or abstinence-plus programs, additionally espouse the goal of increasing usage of effective contraception.
Although abstinence-only and safer-sex programs differ in their underlying values and assumptions regarding the aims of sex education, both types of programs strive to foster decision-making and problem-solving skills in the belief that through adequate instruction adolescents will be better equipped to act responsibly in the heat of the moment (Repucci and Herman). For most programs currently implemented in the US, a delay in the initiation of sexual activity constitutes a positive and desirable outcome, since the likelihood of responsible sexual behavior increases with age (Howard and Mitchell). Even though abstinence is a valued outcome of school-based sex education programs, the effectiveness of such interventions in promoting abstinent behavior is still far from settled.
Most of the articles published on the effectiveness of sex education programs follow the literary format of traditional narrative reviews (Quinn; Kirby; Visser and van Bilsen; Jacobs and Wolf; Kirby and Coyle). Two exceptions are the quantitative overviews by Frost and Forrest and by Franklin et al. In the first review (Frost and Forrest), the authors selected only five rigorously evaluated sex education programs and estimated their impact on delaying sexual initiation.
They used non-standardized measures of effect sizes, calculated descriptive statistics to represent the overall effect of these programs and concluded that those selected programs delayed the initiation of sexual activity. In the second review, Franklin et al. The discrepancy between these two quantitative reviews may result from the decision by Franklin et al. However, given that recent evidence indicates that weaker designs yield higher estimates of intervention effects Guyatt et al.
Given the discrepant results reported in these two recent quantitative reviews, there is a need to clarify the extent of the impact of school-based sex education on abstinent behavior and to explore the specific features of the interventions that are associated with variability in effect sizes.
Purpose of the study
The present study consisted of a meta-analytic review of the research literature on the effectiveness of school-based sex education programs in the promotion of abstinent behavior implemented in the past 15 years in the US in the wake of the AIDS epidemic. The goals were to:

Literature search and selection criteria
The first step was to locate as many studies conducted in the US as possible that dealt with the evaluation of sex education programs and which measured abstinent behavior subsequent to an intervention.
The primary sources for locating studies were four reference database systems: Branching from the bibliographies and reference lists in articles located through the original search provided another source for locating studies. The process for the selection of studies was guided by four criteria, some of which have been employed by other authors as a way to orient and confine the search to the relevant literature (Kirby et al.).
The criteria to define eligibility of studies were the following. Interventions had to be geared to normal adolescent populations attending public or private schools in the US and report on some measure of abstinent behavior: Studies that reported on interventions designed for cognitively handicapped, delinquent, school dropouts, emotionally disturbed or institutionalized adolescents were excluded from the present review since they address a different population with different needs and characteristics.
Community interventions which recruited participants from clinical or out-of-school populations were also eliminated for the same reasons. Studies had to be either experimental or quasi-experimental in nature, excluding three designs that do not permit strong tests of causal hypothesis: Studies had to be published between January and July A time period restriction was imposed because of cultural changes that occur in society—such as the AIDS epidemic—which might significantly impact the adolescent cohort and alter patterns of behavior and consequently the effects of sex education interventions.
Studies had to be published in a peer-reviewed journal. The reasons for this criterion are threefold. First, there have been many reports published in newspapers or advocacy newsletters claiming that specific sex education programs have a dramatic impact on one or more outcome variables, yet when these reports have been investigated, they often were found lacking in valid empirical evidence (Kirby et al.).
Second, unpublished studies are hard to locate and the quality of unpublished research makes it doubtful whether the cost involved in undertaking retrieval procedures is worth investing. This is not to say that all conference papers are defective or all journal articles are free of weaknesses. However, regardless of varying standards of review rigor and publication criteria between journals, published articles have at least survived some form of refereeing and editing process (Dunkin). Finally, an added advantage of including only published articles is that it helps reduce the risk of data dependence.
The probability of duplication of studies is likely to increase when dissertations and papers presented at conferences are included, since these often constitute earlier drafts of published studies. Even considering only published studies, it may be difficult to detect duplication. The same data set, or a subset of it, may be repeatedly used in several studies, published in different journals, with different main authors, and without any reference to the original data source.
Published studies which were known or suspected to have employed the same database were only included once. Only one effect size from each pair of articles was included to avoid the possibility of data dependence.

Coding of the studies for exploration of moderators
The exploration of study characteristics or features that may be related to variations in the magnitude of effect sizes across studies is referred to as moderator analysis.
A moderator variable is one that informs about the circumstances under which the magnitude of effect sizes varies (Miller and Pollock). The information retrieved from the articles for its potential inclusion as moderators in the data analysis was categorized in two domains: Demographic characteristics included the following variables: In terms of the characteristics of the programs, the features coded were: The type of sex education intervention was defined as abstinence-oriented if the explicit aim was to encourage abstinence as the primary method of protection against sexually transmitted diseases and pregnancy, either totally excluding units on contraceptive methods or, if including contraception, portraying it as a less effective method than abstinence.
An intervention was defined as comprehensive or safer-sex if it included a strong component on the benefits of use of contraceptives as a legitimate alternative method to abstinence for avoiding pregnancy and sexually transmitted diseases.
A study was considered to be a large-scale trial if the intervention group consisted of more than students. Finally, year of publication was also analyzed to assess whether changes in the effectiveness of programs across time had occurred. The decision to record information on all the above-mentioned variables for their potential role as moderators of effect sizes was based in part on theoretical considerations and in part on the empirical evidence of the relevance of such variables in explaining the effectiveness of educational interventions.
A limitation to the coding of these and of other potentially relevant and interesting moderator variables was the scantiness of information provided by the authors of primary research. Not all studies described the features of interest for this meta-analysis. For parental participation, no missing values were present because a decision was made to code as non-participation all interventions which did not specifically report that parents had participated, either through parent-youth sessions or homework assignments.
However, for the rest of the variables, no similar assumptions seemed appropriate, and therefore if no pertinent data were reported for a given variable, it was coded as missing (see Table I).

Decisions related to the computation of effect sizes
Once the pool of studies which met the inclusion criteria was located, studies were examined in an attempt to retrieve the size of the effect associated with each intervention. Since most of the studies did not report any effect size, it had to be estimated based on the significance level and inferential statistics with formulae provided by Rosenthal and by Holmes. When provided, the exact value for the test statistic or the exact probability was used in the calculation of the effect size.
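The estimation step described above can be illustrated with a minimal sketch. The conversions below are standard textbook ones (a t statistic to d, and an exact one-tailed p-value to d by way of a normal deviate and a correlation); they are assumed here for illustration and are not necessarily the exact formulae the authors applied. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def d_from_t(t: float, n1: int, n2: int) -> float:
    """Standardized mean difference from an independent-samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_p(p_one_tailed: float, n_total: int) -> float:
    """Effect size recovered from an exact one-tailed p-value.

    The p-value is converted to a standard normal deviate z, then to a
    correlation r = z / sqrt(N), and finally to d = 2r / sqrt(1 - r^2).
    """
    z = NormalDist().inv_cdf(1.0 - p_one_tailed)
    r = z / math.sqrt(n_total)
    return 2.0 * r / math.sqrt(1.0 - r * r)
```

For example, `d_from_t(2.0, 50, 50)` gives 0.4. When only a significance threshold rather than an exact probability is reported, the same conversion yields a conservative lower bound on the effect size.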
In order to avoid data dependence, a conservative strategy of including only one finding per study was employed in this review. When multiple variations of interventions were tested, the effect size was calculated for the most successful of the treatment groups. This decision rests on the assumption that should the program be implemented in the future, the most effective mode of intervention would be chosen. Similarly, to ensure the independence of the data in the case of follow-up studies, when multiple measurements were reported across time, a single estimate of effect size was included.
According to Matt and Cook, such estimates may be difficult, if not impossible, to obtain due to missing information in primary studies. The sample sizes used for the overall effect size analysis corresponded to the actual number used to estimate the effects of interest, which was often less than the total sample of the study.
Occasionally the actual sample sizes were not provided by the authors of primary research, but could be estimated from the degrees of freedom reported for the statistical tests. The overall measure of effect size reported was the corrected d statistic (Hedges and Olkin). These authors recommend this measure since it does not overestimate the population effect size, especially when sample sizes are small.
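As a rough illustration of the two steps just described, the sketch below applies the usual small-sample correction associated with Hedges and Olkin and recovers a total sample size from the degrees of freedom of a two-group t-test. The formulas are the commonly cited approximations; the function names, and the assumption that the reported test was a two-group t-test, are illustrative only.

```python
def hedges_corrected_d(d: float, n1: int, n2: int) -> float:
    """Shrink d by the approximate small-sample correction 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    return (1.0 - 3.0 / (4.0 * df - 1.0)) * d


def variance_of_d(d: float, n1: int, n2: int) -> float:
    """Large-sample variance of d, used later to weight each study."""
    return (n1 + n2) / (n1 * n2) + d * d / (2.0 * (n1 + n2))


def total_n_from_df(df: int) -> int:
    """Total N implied by the df of a two-group t-test (df = N - 2)."""
    return df + 2
```

With 10 participants per group, an uncorrected d of 0.5 shrinks to about 0.479; with several hundred participants the correction becomes negligible.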
The homogeneity of effect sizes was examined to determine whether the studies shared a common effect size. Testing for homogeneity required the calculation of a homogeneity statistic, Q. For the purposes of this review the probability level chosen for significance testing was 0.
Rejection of the hypothesis of homogeneity signals that the group of effect sizes is more variable than one would expect based on sampling variation and that one or more moderator variables may be present (Hall et al.). To examine the relationship between the study characteristics included as potential moderators and the magnitude of effect sizes, both categorical and continuous univariate tests were run.
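The homogeneity test can be sketched as follows, assuming the standard fixed-effect formulation in which each effect size is weighted by the inverse of its variance. The function name and return layout are hypothetical.

```python
def homogeneity_q(effects, variances):
    """Return (weighted mean effect, Q statistic, degrees of freedom).

    Q is the weighted sum of squared deviations from the weighted mean;
    under homogeneity it follows a chi-square distribution with k - 1 df.
    """
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return mean, q, len(effects) - 1
```

The returned Q would then be compared against a chi-square critical value for the given degrees of freedom. If SciPy is available, `scipy.stats.chi2.sf(q, df)` gives the corresponding p-value.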
Categorical tests assess differences in effect sizes between subgroups established by dividing studies into classes based on study characteristics. Hedges and Olkin presented an extension of the Q statistic to test for homogeneity of effect sizes between classes (QB) and within classes (QW). The relationship between the effect sizes and continuous predictors was assessed using a procedure described by Rosenthal and Rubin, which tests for linearity between effect sizes and predictors. A weighted least-squares regression analysis was conducted to test the joint effect of the significant moderators on the effect sizes.
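A minimal sketch of the categorical test, assuming the usual partition of the total Q into a between-class component (QB) and a within-class component (QW) with inverse-variance weights. The function name and the class labels passed to it are hypothetical.

```python
from collections import defaultdict


def partition_q(effects, variances, labels):
    """Split total heterogeneity into between-class and within-class parts."""
    groups = defaultdict(list)
    for d, v, label in zip(effects, variances, labels):
        groups[label].append((d, 1.0 / v))  # store effect with its weight

    total_w = sum(w for members in groups.values() for _, w in members)
    grand_mean = sum(w * d for members in groups.values() for d, w in members) / total_w

    q_between = 0.0
    q_within = 0.0
    for members in groups.values():
        group_w = sum(w for _, w in members)
        group_mean = sum(w * d for d, w in members) / group_w
        q_between += group_w * (group_mean - grand_mean) ** 2
        q_within += sum(w * (d - group_mean) ** 2 for d, w in members)
    return q_between, q_within
```

QB is referred to a chi-square distribution with (number of classes - 1) degrees of freedom, and QW to one with (number of studies - number of classes). The two components sum to the overall Q.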
The results of the univariate analyses were used to select the predictors to be included in the model. Categorical predictors were included as dummy variables. All predictors were entered simultaneously. Significance of each regression coefficient was tested using a z-test where the standard errors in the output of SPSS were adjusted by a factor of the square root of the mean square error for the regression model (Hedges and Olkin). Model specification was tested using the QE goodness-of-fit statistic.
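The standard-error adjustment can be made concrete with a small sketch for a single predictor plus intercept. It assumes the adjustment consists of dividing the standard error printed by a general-purpose weighted regression routine by the square root of the mean square error, which recovers the standard error implied by the known sampling variances. The function name and the returned keys are hypothetical.

```python
import math


def wls_one_predictor(effects, variances, x):
    """Weighted least-squares fit of effect size on one predictor."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * di for wi, di in zip(w, effects))
    swxy = sum(wi * xi * di for wi, xi, di in zip(w, x, effects))

    det = sw * swxx - swx ** 2  # determinant of X'WX
    slope = (sw * swxy - swx * swy) / det
    intercept = (swxx * swy - swx * swxy) / det

    # Weighted residual sum of squares: the QE goodness-of-fit statistic.
    q_error = sum(wi * (di - intercept - slope * xi) ** 2
                  for wi, xi, di in zip(w, x, effects))
    mse = q_error / (len(effects) - 2)

    se_program = math.sqrt(mse * sw / det)  # what a generic routine prints
    se_corrected = math.sqrt(sw / det)      # equals se_program / sqrt(mse)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_program": se_program,
        "se_corrected": se_corrected,
        "z": slope / se_corrected,
        "q_error": q_error,
        "mse": mse,
    }
```

QE is referred to a chi-square distribution with (number of studies - number of estimated coefficients) degrees of freedom. A significant value suggests that the predictors do not account for all of the systematic variation in effect sizes.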
Results
The search for school-based sex education interventions resulted in 12 research studies that complied with the criteria to be included in the review and for which effect sizes could be estimated. Among the set of categorical predictors studied, parental participation in the program, virginity status of the sample and scope of the implementation were statistically significant. The limited number of effect sizes precluded such analysis.
The confidence interval for parent participation does not include zero, thus indicating a small but positive effect. The scope of the implementation also appeared to moderate the effects of the interventions on abstinent behavior. In general, the remaining set of predictors had a moderate degree of intercorrelation, although none of the coefficients were statistically significant.
In the weighted least-squares regression analysis, only parental participation and the percentage of females in the study were significant. The test of model specification yielded a significant QE statistic, suggesting that the two-predictor model cannot be regarded as correctly specified (see Table IV).