Does environmental education benefit environmental outcomes in children and adolescents? A meta-analysis

The idea that education can be a vehicle to spread knowledge and help protect the natural environment has gained prominence since the 1960s. As formalized at the world's first intergovernmental conference on environmental education (Tbilisi Declaration; United Nations Educational, Scientific, and Cultural Organization [UNESCO] & United Nations Environment Programme [UNEP], 1977), the initial goals of environmental education were to foster global public knowledge about environmental issues and increase individuals' motivation and skills to protect or improve the natural environment. Later, educational approaches were developed to target social and economic sustainability more broadly, alongside environmental sustainability (Kopnina, 2012;Pavlova, 2013). To reflect their broader scope, these programs are often referred to as education for sustainable development, or sustainability education.
While educational programs can be implemented across the lifespan, targeting children and adolescents may be especially important. As the consumers, policymakers, and parents of the future, children and adolescents could be crucial agents for sustainable change (United Nations [UN], 2015). Indeed, young people can mobilize to help protect and stand up for the environment, as the millions of youth worldwide who took part in the 2019 climate protests demonstrate.
Over time, environmental educational programs for children and adolescents have been implemented in formal, school-based settings as well as in non-formal settings, for example at zoos and nature centers (Eshach, 2007). Programs have employed various educational approaches (e.g., active student participation, group learning, place-based learning) and learning activities (e.g., classroom lectures, field trips; Stern et al., 2014). The exact outcomes these programs target, as well as their conceptualizations, have been diverse as well (Ardoin et al., 2018). The outcome of environmental knowledge, for example, has been conceptualized in various ways, including students' awareness of climate change, their knowledge of how to recycle waste, or their understanding of ecological topics, such as the water cycle.
In this meta-analysis, we comprehensively define environmental education as all programs that provide children and adolescents with information or training to improve their environmental outcomes. As such, this meta-analysis includes programs that authors labeled as environmental education, as well as those that were labeled differently (e.g., education for sustainable development, conservation education, outdoor education). Thus, we use the term environmental education loosely and non-exclusively. We evaluate the influence of environmental education on four broad outcome categories, which we refer to collectively as "environmental outcomes": environmental knowledge (i.e., factual knowledge about and understanding of ecology, climate, and related natural science concepts and principles); environmental attitudes (i.e., favorable beliefs and feelings towards the subject of environmental education, or the importance of environmental sustainability more generally); environmental intentions (i.e., intent or willingness to engage in environmental behavior); and environmental behavior (i.e., behavior that either benefits the environment or harms it as little as possible; Steg & Vlek, 2009).

Understanding the effectiveness of environmental education
Does environmental education have the potential to affect environmental outcomes in children and adolescents positively? Although occasionally criticized for an arguable lack of theoretical richness (Dillon, 2003), many environmental education programs do make use of theory-driven educational approaches and principles (e.g., group learning, experiential learning) that have been thoroughly evaluated and proven useful in neighboring educational domains, such as science education (Dillon, 2003; Stern et al., 2014).
For decades, researchers, policymakers, and practitioners have attempted to evaluate the effectiveness of environmental education. Their work resulted in individual program evaluations as well as influential systematic literature reviews (e.g., Ardoin et al., 2018; Rickinson, 2001; Stern et al., 2014). These reviews have provided valuable insight into the state of the field, and they have documented how environmental education can be effective for improving, in particular, environmental knowledge, attitudes, and associated cognitions (Ardoin et al., 2018; Rickinson, 2001).
What the reviews do not provide, however, is empirical synthesis, which is what we aim to contribute with the present study. Empirical synthesis in the form of meta-analysis is less subject to potential bias, including bias that may occur when one relies on the count of statistically significant results to draw conclusions about a body of literature (Cumming, 2014). We note that one prior meta-analysis on the effectiveness of environmental education does exist (Zelezny, 1999). This study, however, focused exclusively on the promotion of environmental behaviors, and it was conducted over two decades ago. Much relevant research has been published since. In fact, a search in the ERIC database indicates that over 7000 papers listing environmental education as a keyword have been published since 1999. We include this recent literature in our analysis and cover the diversity of environmental education programs as they are currently employed.
Early environmental education programs commonly relied on the assumption that when children and adolescents fail to engage in environmental behavior, they do so at least in part because they lack knowledge about the environment (i.e., "knowledge deficit model," e.g., Burgess et al., 1998; Kollmuss & Agyeman, 2010). From this reasoning, one would assume that if environmental education effectively improves students' environmental knowledge, then it should predispose them to engage in more environmental behavior as well. This assumption, however, proved premature (Burgess et al., 1998; Heeren et al., 2016). Correlational work has shown that the link between environmental knowledge and environmental behavior is modest at best: children and adolescents often fail to act upon what they know (Diamantopoulos et al., 2003; Hasiloglu & Kunduraci, 2018; Marcinkowski & Reid, 2019).
Indeed, research has identified various barriers, including psychological barriers or "dragons of inaction", that hinder environmental behavior (Gifford, 2011). For example, children and adolescents may perceive environmental behavior to be counter-normative, perhaps especially among peers. As they are inclined to conform to peer norms (Brown, 2004; van Hoorn et al., 2016), this may lead them to refrain from engaging in environmental behavior. Similarly, young people may be relatively susceptible to the illusion that the impact of the environmental crisis will be bigger globally than locally, limiting the perceived urgency to act on their environmental attitudes (Gubler et al., 2019). Another common perception among young people is that they lack the means to substantially reduce their environmental impact, or that their individual actions do not matter much, which may lead them to believe that investing in environmental behavior is futile (Bandura, 1977; Meinhold & Malkus, 2005).
The pervasiveness of such attitude-behavior gaps suggests that environmental education, even if it can promote environmental knowledge and attitudes, may not necessarily provide a powerful or robust impetus for behavior change (e.g., Boyes et al., 2009). As most previous work has not distinguished between environmental intentions and behavior (Rickinson, 2001;Zelezny, 1999), the question as to whether environmental education can actually improve students' environmental behavior is still insufficiently answered. Given the supposed pervasiveness of psychological barriers to environmental behavior change in children and adolescents, we conservatively hypothesized that environmental education improves students' environmental knowledge and associated cognitions, but not their environmental behavior.

Possible differential effectiveness of environmental education
Various factors, including program and student characteristics, likely influence the effectiveness of environmental education. By improving our understanding of when environmental education is most effective, we will be able to optimize available programs and, in doing so, learn important lessons about student learning and behavior change processes more generally (Dillon, 2003;Stern et al., 2014). Accordingly, an additional aim of the current meta-analysis is to test moderators of environmental education effectiveness.
First, we examine whether two frequently used educational approaches are associated with increased environmental education effectiveness (Stern et al., 2014). We hypothesized that programs that let students actively work together through discussion or collaboration (i.e., group learning) are especially effective. Indeed, in educational domains other than environmental education, group learning is typically associated with improved student learning and achievement compared to individual learning, partly because students can model learning processes for each other (Kyndt et al., 2013; O'Donnell, 2006). Also, in line with the concern that many of today's children and adolescents are relatively disconnected from their natural environment (e.g., Bruni et al., 2017; Louv, 2005), while such a connection is an important driver of environmental behavior (Mackay & Schmitt, 2019; Whitburn et al., 2019), we hypothesized that environmental education is especially effective when implemented, at least partially, in nature (i.e., nature experience). Nature experiences (e.g., hiking in a nature park, observing animals in their natural habitat) may improve students' emotional connection to the natural environment, thus promoting their learning outcomes and environmental behavior (Dopko et al., 2019; Kuo et al., 2019; Otto & Pensini, 2017).
Second, we examine whether environmental education is less effective among middle adolescents compared to both younger children and older adolescents. There is some evidence to suggest that environmental consciousness dips in adolescence, especially among middle adolescents (i.e., 14- to 16-year-olds), who tend to report comparatively low levels of environmental knowledge, attitudes, and behavior (Olsson & Gericke, 2016; Otto et al., 2019). Environmental education may not be ideally suited to help overcome such environmental disengagement, especially because it is hard to align education-based interventions with middle adolescents' developmental needs (Eames et al., 2018; Yeager et al., 2018). Traditionally, environmental education programs often seek to explain what environmental problems we face and what can be done to counter them. Adolescents, however, do not necessarily like to be told what to think and do; they care deeply about autonomously forming their own opinions and making their own decisions. Of course, this is not to say that environmental education cannot be of use in this age period, but it may be hard to be impactful enough to counter the environmental disengagement that middle adolescents may exhibit (Corner et al., 2015; Yeager et al., 2018).

Overview
The present meta-analysis aims to synthesize the available empirical evidence on the effectiveness of environmental education among children and adolescents (≤19 years of age). Specifically, we examine the effectiveness of environmental education on improving environmental knowledge, attitudes, intentions, and behavior. In addition, this meta-analysis aims to improve understanding of the conditions under which environmental education is more or less effective. We test whether programs that use group learning or provide nature experience are more effective than programs that do not, and whether programs that target middle adolescents are less effective than programs targeting students in other age groups.

Protocol and registration
This study was approved by the ethics review board at the Faculty of Social Sciences, Utrecht University. We preregistered the study protocol with Open Science Framework (OSF) prior to data coding on November 19, 2019 (http://doi.org/10.17605/OSF.IO/H639V). Data and analysis code are also available on OSF.

Search strategy
We searched for relevant study reports until November 16, 2019 in the following databases: PsycINFO, ERIC, and Scopus (for the full search string, see the supplementary material). No restrictions were imposed on the search. Additionally, we searched the references of previous reviews and meta-analyses (i.e., Rickinson, 2001;Stern et al., 2014;Zelezny, 1999), and volumes of two specialist journals (i.e., Journal of Environmental Education, Environmental Education Research). We did not restrict our search to studies published in a certain range of years. This comprehensive approach allowed us to include a diverse set of studies, which we needed to compare program effectiveness across various educational approaches and populations.

Study selection
Studies were selected for inclusion if they (a) were reported in an available article written in English, (b) only included participants younger than twenty years, (c) used an experimental, quasi-experimental, or single group pre-posttest design, (d) examined the effect of environmental education on environmental knowledge, attitudes, behavioral intentions, and/or behavior, and (e) reported quantitative information on at least one of these outcomes. We decided to include studies that used single group pre-posttest designs, even if they are methodologically suboptimal, to be able to comprehensively synthesize the available evidence base.
We conceptualized environmental knowledge as respondents' factual knowledge about, and understanding of, ecology, climate, and related natural science concepts and principles, as indicated by their performance on a test or questionnaire. In some cases, environmental knowledge included forms of environmental awareness, though it did not include self-perceived knowledge (e.g., "I know a lot about global warming"). We conceptualized environmental attitudes as respondents' self-reported favorable beliefs and feelings towards the environment, the subject matter, or the importance of environmental sustainability and/or environmental behavior. Environmental attitudes included some forms of environmental awareness, though they did not include environmental identity, inclusion of nature in oneself, connectedness to nature, or implicit attitudes.
Whereas environmental intentions are gathered under the categories of environmental attitudes by some (e.g., Schultz et al., 2004) and environmental behavior by others (e.g., Zelezny, 1999), we decided to examine environmental intentions as a distinct construct, consistent with influential theories of behavior change (e.g., Ajzen, 1991;Webb & Sheeran, 2006). We conceptualized environmental intentions as respondents' self-reported intent or willingness to engage in environmental behavior. We conceptualized environmental behavior as respondents' self-reported, observed, or inferred behavior that either benefits the environment or harms it as little as possible (Steg & Vlek, 2009). Environmental behavior therefore comprised a relatively diverse set of behaviors, ranging from water and energy conservation to waste recycling. Behaviors that do not benefit (or do not limit harm to) the environment, such as spending time in nature, were not included. These comprehensive operationalizations allowed us to capture the diversity of environmental education outcomes as they have been assessed in the literature.
A flow diagram of the study selection process is presented in Fig. 1. Study selection proceeded in two steps. First, titles and abstracts were screened for eligibility by one author. Second, the full text of the remaining eligible studies was screened for eligibility. Ambiguities were checked with two other authors. If the full text was unavailable or if quantitative information necessary for effect size computation was not provided in the report, this information was requested from the authors if contact information was available. We contacted 83 authors, which allowed us to include an additional 17 studies.

Coding
Data were extracted for study and design characteristics (e.g., country, year of publication, sample size), program characteristics (e.g., program delivery, program intensity, program activities), participant characteristics (e.g., age, sex), outcome characteristics (e.g., operationalization of outcomes), risk of bias (e.g., coding difficulty), and several indices of study quality (i.e., study design, dropout, method of assessment). Details on the coding procedure are provided in the supplementary material.

Educational approaches
Environmental education programs were coded as involving group learning if they required students to work together actively through discussion or collaboration. They were coded as including nature experience if they physically took place, at least partially, in nature. These educational approaches were coded dichotomously (0 = no, 1 = yes). Thus, programs could use both, one, or neither of these educational approaches. When reports did include descriptions of the program but did not mention the educational approach, we assumed it was not used (i.e., code = 0). When reports insufficiently described the program that was implemented (14% of all reports), we did not code the educational approach (i.e., code = missing). We excluded these reports from the relevant moderator analyses. We initially planned also to code whether environmental education programs capitalized on personal relevance (i.e., explicitly linking program content to students' individual lifestyles, values, or self-beliefs), but we dropped this educational approach as the reports often provided too little detail to allow for reliable coding.

Age
We coded the mean age of the sample. If not available, the mean age was approximated with the median of the age range. To be able to graphically explore age variation within samples, we also coded the sample age range (maximum, minimum). If not reported, the age range was approximated with the reported sample age mean ±2 SD if available, or with the typical age range for the reported school grade(s).
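The two fallback rules for age coding described above can be sketched as follows. This is a minimal illustration only; the function names are hypothetical and are not taken from the authors' analysis code.

```python
from typing import Optional, Tuple


def approximate_mean_age(mean_age: Optional[float],
                         age_min: float,
                         age_max: float) -> float:
    """Return the reported mean age, or the midpoint of the age range
    when no mean was reported."""
    if mean_age is not None:
        return mean_age
    return (age_min + age_max) / 2.0


def approximate_age_range(mean_age: float, sd_age: float) -> Tuple[float, float]:
    """Approximate an unreported age range as mean +/- 2 SD."""
    return (mean_age - 2.0 * sd_age, mean_age + 2.0 * sd_age)
```

For example, a sample described only as "10 to 12 years old" would be assigned a mean age of 11 under this rule.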

Interrater reliability
To ensure reliable study selection and coding, we assessed interrater reliability. First, the full texts of 92 randomly selected studies were screened for eligibility by an independent rater (10% of 881 studies). Cohen's κ was excellent, κ = 0.93. Second, 17 randomly selected studies were coded by the same independent rater (10% of 169 studies). For categorical variables, Cohen's κ was satisfactory (κ = 0.81 to 1), except for the difficulty that raters experienced while coding the studies (κ = 0.49). This variable appeared to be too subjective, and was therefore excluded from analysis. For continuous variables, average intraclass correlations (ICC) were examined using a two-way random-effects model, to be able to examine if the two raters assigned the same scores to the same variables (i.e., absolute agreement; Koo & Li, 2015). ICCs were excellent, ranging from 0.99 to 1, with two exceptions: one poor ICC (0.28) was based on only three observations with one discrepancy, and another (0.14) resulted from an outlier in coding. We reduced the impact of the outlier, and the ICC improved (0.88). Between-rater discrepancies were discussed and resolved among the authors.
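As an illustration of the agreement statistic used for categorical variables, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance. The sketch below uses the textbook definition; the example ratings are invented for illustration and are not data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical dichotomous codes (e.g., 1 = group learning used, 0 = not used).
codes_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
codes_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(codes_a, codes_b)
```

In this invented example the raters disagree on one of ten items, which gives an observed agreement of .90 against a chance agreement of .50.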

Effect size calculation
Effect sizes were computed as Cohen's d, reflecting the standardized mean difference between the control and intervention group at posttest, or the difference between pretest and posttest in cases where no control group was available. Positive effect sizes reflect an improved outcome, and negative effect sizes reflect a worsened outcome (Borenstein & Hedges, 2009; Cohen, 1988; Lipsey & Wilson, 2001). Effect sizes and their variances were calculated based on means and standard deviations, using Equations (1) and (2), where exp refers to the experimental group or posttest and con refers to the control group or pretest. If no means and standard deviations were reported, we calculated effect sizes based on other statistics when possible, using the online Campbell Collaboration tool (e.g., SEs, t-values, binary proportions, p-values; 37% of effect sizes). If statistics were reported for scale items (rather than full scales), we calculated an effect size for the full scale by averaging the reported statistics when possible (e.g., Ms and SDs, p-values; 10% of effect sizes). If studies only reported that an effect was not significant, the effect was assigned an effect size of 0 (<1% of effect sizes). All effect sizes were transformed using a correction for small sample sizes, using Equations (3) and (4) (Hedges' g; Hedges & Olkin, 1985; Marfo & Okyere, 2019).
We adjusted outliers (identified by their distance from the mean and Cook's distance > 4/N) by substituting their values with the highest or lowest Hedges' g value that was not an outlier. Outliers (3.1% of effect sizes) were identified for environmental knowledge (n = 7), attitudes (n = 5), and behavior (n = 4). All analyses provided similar results when outliers were not adjusted.
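For readers who want to trace the computation, the sketch below implements the commonly used textbook formulas for Cohen's d, its variance, and the small-sample correction that yields Hedges' g (as given, for example, in Borenstein & Hedges, 2009). It is an assumption that these match the authors' Equations (1) to (4) exactly. In particular, the sketch treats pretest and posttest scores as two independent groups, which ignores the pre-post correlation. The function names and example values are hypothetical.

```python
import math


def cohens_d(m_exp, sd_exp, n_exp, m_con, sd_con, n_con):
    """Standardized mean difference and its variance (pooled SD)."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_con - 1) * sd_con ** 2) / (
        n_exp + n_con - 2
    )
    d = (m_exp - m_con) / math.sqrt(pooled_var)
    var_d = (n_exp + n_con) / (n_exp * n_con) + d ** 2 / (2 * (n_exp + n_con))
    return d, var_d


def hedges_g(d, var_d, n_exp, n_con):
    """Apply the small-sample correction factor J to d and its variance."""
    df = n_exp + n_con - 2
    j = 1 - 3 / (4 * df - 1)
    return j * d, j ** 2 * var_d


# Hypothetical study: posttest knowledge scores in two groups of 20 students.
d, var_d = cohens_d(m_exp=12.0, sd_exp=4.0, n_exp=20,
                    m_con=10.0, sd_con=4.0, n_con=20)
g, var_g = hedges_g(d, var_d, n_exp=20, n_con=20)
```

With these invented numbers, d is 0.50 and the correction shrinks it slightly, to about 0.49; the correction matters most for small samples.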

Multilevel modeling approach
We adopted a multilevel modeling approach using the metafor package in R, version 3.6.3 (Ouzzani et al., 2016; Viechtbauer, 2010). This approach allowed us to extract and analyze multiple effect sizes from the same study by accounting for the dependency between effect sizes (van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca, 2013). This means that if studies included multiple measures of, for example, pro-environmental attitudes, we included all of these in our analyses, taking into account that these measures were assessed in the same study. We applied a three-level random-effects model, which estimates variance in effect sizes between participants (sampling variance, level 1), outcomes (within-study variance, level 2), and studies (between-study variance, level 3; Wibbelink & Assink, 2015). We computed standard errors using the Knapp and Hartung (2003) method, which is based on t and F distributions (we used this method because the default computation of standard errors based on normal distributions may increase type I error; Assink & Wibbelink, 2016). We estimated model parameters using restricted maximum likelihood (REML; Assink & Wibbelink, 2016). Analysis proceeded in three steps:
1. We ran four intercept-only models to test if the overall effect sizes of environmental knowledge, attitudes, intentions, and behavior differed significantly from 0.
2. We conducted two one-sided log-likelihood ratio tests for each of the intercept-only models to test if there was significant heterogeneity within studies (level 2) and between studies (level 3). We examined how the total variance was distributed over the three levels (Cheung, 2014). If log-likelihood ratio tests were not significant, we still considered heterogeneity to be substantial if < 75% of the total amount of variance was located at level 1 (i.e., 75% rule; Hunter & Schmidt, 1990).
3. To test putative moderating effects, we extended each of the intercept-only models with a mixed-effects multilevel model. As including multiple moderators in one model can inflate type II error rates due to multicollinearity, we first evaluated the effect of moderators in separate models (Hox, 2010). All moderating effects were tested using Holm-Bonferroni correction for multiple testing (α = 0.00625; Holm, 1979).
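The variance decomposition in the second step can be sketched as follows. The sketch assumes the approach described in the three-level meta-analysis tutorials cited above, in which a "typical" sampling variance is computed from the individual sampling variances and compared with the estimated level 2 and level 3 variance components. The variance components themselves would come from the fitted multilevel model; here they are simply passed in as hypothetical numbers.

```python
def typical_sampling_variance(variances):
    """Typical within-study sampling variance across k effect sizes."""
    k = len(variances)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return (k - 1) * sum_w / (sum_w ** 2 - sum_w2)


def variance_shares(variances, sigma2_within, sigma2_between):
    """Percentage of total variance located at each of the three levels."""
    v = typical_sampling_variance(variances)
    total = v + sigma2_within + sigma2_between
    return {
        "level1": 100.0 * v / total,               # sampling variance
        "level2": 100.0 * sigma2_within / total,   # within-study variance
        "level3": 100.0 * sigma2_between / total,  # between-study variance
    }


def heterogeneity_is_substantial(shares):
    """75% rule: substantial if less than 75% of variance sits at level 1."""
    return shares["level1"] < 75.0


# Hypothetical inputs: four effect sizes and two estimated variance components.
shares = variance_shares([0.04, 0.04, 0.04, 0.04],
                         sigma2_within=0.02, sigma2_between=0.04)
```

With these invented inputs, 40% of the variance is sampling variance, so the 75% rule would flag the heterogeneity as substantial.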

Sensitivity analyses
We conducted sensitivity analyses to test whether our main findings are robust to study quality and related assessment features. We evaluated study quality in terms of the study design (randomized and quasi-experimental design versus single group pre-posttest design), percentage of dropout, and, for effect sizes of behavior, the method of behavior assessment (observation versus self-report). We tested whether effect sizes were sensitive to (i.e., moderated by) these three quality indices, as well as the number of weeks between the intervention and posttest assessment (i.e., follow-up). We initially also planned to use study preregistration as a quality index, but we dropped this index because none of the studies had been preregistered.

Publication bias
We did not statistically test for publication bias because the accuracy of available tests (e.g., Egger's regression, trim and fill) is unclear when applied in the context of a multilevel modeling approach with heterogeneous datasets (Coburn & Vevea, 2015;McDaniel et al., 2006). However, we were able to explore potential publication bias by visually examining a funnel plot. In the absence of bias, this plot should resemble a symmetrical inverted funnel (Sterne & Egger, 2001).
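To make the logic of the funnel plot concrete, the sketch below computes the pseudo 95% limits that are conventionally drawn around a pooled effect at a given standard error. In the absence of bias and heterogeneity, roughly 95% of effect sizes would be expected to fall between these limits, spread symmetrically. This is a simplified illustration that ignores the multilevel structure of the data; the function names and numbers are hypothetical.

```python
def funnel_limits(pooled_effect, standard_error, z=1.96):
    """Pseudo 95% confidence limits of a funnel plot at one standard error."""
    half_width = z * standard_error
    return pooled_effect - half_width, pooled_effect + half_width


def outside_funnel(effect, standard_error, pooled_effect, z=1.96):
    """True if an effect size falls outside the funnel at its standard error."""
    lower, upper = funnel_limits(pooled_effect, standard_error, z)
    return effect < lower or effect > upper


# Hypothetical small study (SE = 0.10) evaluated against a pooled effect of 0.40.
lower, upper = funnel_limits(0.40, 0.10)
```

A surplus of points beyond the upper limit among studies with large standard errors, without a matching surplus beyond the lower limit, is the kind of asymmetry a visual inspection looks for.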

Study pool
We obtained a total of 512 effect sizes from 169 independent studies, which were conducted in 43 countries from 6 continents (Fig. 2). The studies were published between 1971 and 2019 (Table S1 in the supplementary material presents an overview of the included studies). Effect sizes were computed over a total of N = 176,007 participants, ranging in age from 3 to 19. Of the effect sizes, 231 (45.1%) concerned knowledge, 171 (33.4%) concerned attitudes, 32 (6.3%) concerned intentions, and 78 (15.2%) concerned behavior. A total of 38 studies (k = 110 effect sizes) assessed delayed effects, ranging from one week to five years after the program (M = 10 weeks). Sample demographics and use of educational approaches are presented in Table 1.
The majority of effect sizes were based on a single group pre-posttest design (57.6%); others were based on a quasi-experimental (38.7%) or experimental (3.7%) design. Environmental outcomes were assessed mostly through self-report questionnaires (97.9% of all effect sizes). These questionnaires were often author-developed for the pertaining study (64.3% of effect sizes), but validated measures were occasionally used as well. We considered (variations of) the 2 Major Environmental Values scale (2-MEV, 7.0% of effect sizes; Bogner & Wiseman, 2006), Children's Environmental Attitudes and Knowledge Scale (CHEAKS, 3.3% of effect sizes; Leeming et al., 1995), and the New Ecological Paradigm scale (NEP, 1.0% of effect sizes; Dunlap et al., 2000) as validated measures. Environmental behavior was occasionally inferred or observed, rather than self-reported (e.g., by inspecting household electricity bills, or observing the recycling of sweet wrappers; 12.8% of effect sizes for behavioral outcomes). For each of the environmental outcomes, conceptualizations varied widely across studies. While most effect sizes (64.7%) on environmental knowledge and attitudes pertained to climate change or other global environmental challenges, some (30.8%) pertained to more specific topics (e.g., the water cycle) or environmental challenges (e.g., endangered local plants or animal species). Effect sizes on environmental intentions and behavior most often pertained to a composite of diverse environmental behaviors (41.8%), but otherwise pertained to specific environmental behaviors, including energy and water conservation (30%), recycling (7.3%), and social sphere or activist behaviors (3.6%).

Overall effectiveness of environmental education
As hypothesized, environmental education improved students' environmental knowledge, attitudes, and intentions, with the overall effect sizes for each of these outcomes being significantly different from zero (Fig. S1 in the supplementary material presents the forest plots). As shown in Fig. 3, the overall effect size for knowledge was large, g = 0.953 (SE = 0.075, 95% CI [0.805, 1.101], p < .001); the overall effect size for attitudes was small to medium, g = 0.384 (SE = 0.050, 95% CI [0.286, 0.482], p < .001); and the overall effect size for intentions was small, g = 0.256 (SE = 0.091, 95% CI [0.069, 0.443], p = .009). Differing from our hypothesis, environmental education also improved students' environmental behavior. The overall effect size for behavior was small to medium, and differed significantly from zero, g = 0.410 (SE = 0.073, 95% CI [0.264, 0.556], p < .001).
Notably, the 95% CI for improved knowledge did not overlap with the 95% CIs for improved attitudes, intentions, and behavior, indicating intervention effects were larger for knowledge compared to attitudes, intentions, and behavior. In addition, the overall effect sizes differed within and between studies (as established using one-sided log-likelihood ratio tests and the 75% rule [Hunter & Schmidt, 1990], see Table 2). These findings warrant subsequent moderation analyses to account for the observed variation in effect sizes (Assink & Wibbelink, 2016).
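The interval comparison described above can be checked directly from the confidence limits reported in the text. The sketch below is only an illustration of that check; the helper name is hypothetical.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two closed intervals (lower, upper) share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# 95% confidence intervals for Hedges' g as reported in the text.
reported_ci = {
    "knowledge": (0.805, 1.101),
    "attitudes": (0.286, 0.482),
    "intentions": (0.069, 0.443),
    "behavior": (0.264, 0.556),
}

# Compare knowledge against each of the other three outcomes.
knowledge_overlaps = {
    outcome: intervals_overlap(reported_ci["knowledge"], ci)
    for outcome, ci in reported_ci.items()
    if outcome != "knowledge"
}
```

Non-overlap of two 95% intervals is a conservative criterion for a difference. The reverse does not hold in general: overlapping intervals do not by themselves establish that two effects are equal.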

Educational approaches
We tested whether environmental education is more effective when it uses group learning or offers nature experience (compared to when it does not). Different from what we hypothesized, moderation for these two educational approaches was not significant for any of the outcomes when using a Holm-Bonferroni corrected α of .00625 (ps = .033 to .999; Table 3).
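The corrected threshold of .00625 equals .05 divided by 8, which is the first and most stringent step of the Holm-Bonferroni procedure for a family of eight tests. That the family contains eight tests is an inference from this arithmetic, not something stated in the text. The sketch below implements the general step-down procedure; the p-values in the example are invented, with only the smallest (.033) and largest (.999) taken from the reported range.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure; returns one rejection flag per p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha / m, the next with
        # alpha / (m - 1), and so on; testing stops at the first failure.
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break
    return rejected


# Hypothetical family of eight moderator tests spanning the reported range.
example_ps = [0.033, 0.12, 0.25, 0.41, 0.56, 0.73, 0.88, 0.999]
decisions = holm_bonferroni(example_ps)
first_threshold = 0.05 / len(example_ps)
```

Because the smallest p-value (.033) already exceeds the first threshold, the procedure stops immediately and no moderator effect is declared significant.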

Age
To test whether environmental education is less effective among middle adolescents compared to younger children and older adolescents, we examined moderation by mean age. Different from what we hypothesized, moderation by mean age was not significant for any of the outcomes, regardless of whether we tested linear effects (ps = .025 to .572) or quadratic effects (ps = .142 to .712; Table 3). It should be noted that sample age ranges differed substantially across studies, making the estimation of age effects based on mean ages somewhat imprecise. Accordingly, our tests of age effects need to be interpreted with caution (supplementary material, Fig. S2). Nonetheless, exploratory analyses including only effect sizes with small sample age variation (i.e., age ranges with a width ≤2 years) produced similar results (supplementary material, Table S2).

Fig. 3. Overall effects (Hedges' g) of environmental education on students' environmental knowledge, attitudes, intentions, and behavior. Note. Error bars represent 95% confidence intervals.

Sensitivity analyses
As reported in Table 3, overall effect sizes for environmental knowledge, attitudes, intentions, and behavior did not depend on study design (experimental and quasi-experimental design versus single group pre-posttest design), percentage of dropout, follow-up, or method of behavior assessment (inference or observation versus questionnaire).

Post Hoc exploratory analyses
Moderation analyses.
As our a priori moderators did not account for the substantial heterogeneity we found in effect sizes, we explored whether any of the other coded variables moderated the effect of environmental education on students' environmental knowledge, attitudes, intentions, and behavior: study characteristics (i.e., publication year), design characteristics (i.e., type of control group), intervention characteristics (i.e., classroom-based versus not classroom-based, lasting multiple sessions or days versus a single session or day), participant characteristics (i.e., percentage female, percentage minority, socioeconomically disadvantaged versus non-disadvantaged), and outcome characteristics (i.e., knowledge or attitudes pertaining to a specific versus more general or globally relevant environmental topic, composite of [intended] environmental behaviors versus a specific [intended] environmental behavior, use of validated versus nonvalidated measures). For consistency, we set α at 0.00625, similar to the tests of the a priori moderators. None of these characteristics significantly accounted for the heterogeneity of effect sizes (supplementary material, Table S3).

Comparison of activities.
Next, we explored whether some educational activities produced stronger (or rather, weaker) effects than others. We categorized programs based on the type of educational activities that they offered. We were able to classify 76.8% of all effect sizes as being derived from either a camp (82 effect sizes), field trip (69 effect sizes), school-wide curriculum (26 effect sizes), traditional classroom (50 effect sizes), investigation-based (37 effect sizes), gardening (10 effect sizes), or multimodal (119 effect sizes) activity format. Detailed descriptions of activity categories are provided in the supplementary material. We then ran intercept-only models to derive an overall effect size for each activity on environmental knowledge and on environmental attitudes, intentions, or behaviors (the latter three outcomes were combined because their overall effect sizes did not significantly differ, which allowed us to compute an overall effect size for most activity formats, based on at least 10 observations). We excluded school-wide curriculum and gardening activity formats from these analyses due to a small number of observations (<10) for each. The overall effect sizes are shown in Fig. 4a and b. All 95% confidence intervals are overlapping, indicating that the effect sizes did not differ significantly between different activities.
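The arithmetic behind the classification rate, and the interval comparison described above, can be sketched as follows. The effect-size counts per activity format come from the text, and the total of 512 effect sizes is the figure reported in the limitations section. The confidence intervals in the example are hypothetical placeholders, not the values plotted in Fig. 4, and `intervals_overlap` is an illustrative helper rather than the authors' procedure.

```python
# Minimal sketch: share of classified effect sizes and a CI-overlap check.
# Counts per activity format are taken from the text above.
activity_counts = {
    "camp": 82,
    "field trip": 69,
    "school-wide curriculum": 26,
    "traditional classroom": 50,
    "investigation-based": 37,
    "gardening": 10,
    "multimodal": 119,
}
TOTAL_EFFECT_SIZES = 512  # total reported in the limitations section

classified = sum(activity_counts.values())
share_classified = classified / TOTAL_EFFECT_SIZES
print(classified, round(share_classified * 100, 1))  # 393 and 76.8


def intervals_overlap(ci_a: tuple, ci_b: tuple) -> bool:
    """Return True when two (lower, upper) confidence intervals overlap."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# HYPOTHETICAL intervals, for illustration only (not the values in Fig. 4).
example_cis = {"camp": (0.40, 1.10), "field trip": (0.55, 1.30)}
print(intervals_overlap(example_cis["camp"], example_cis["field trip"]))
```

Checking for overlap is a rough visual heuristic rather than a formal test of the difference between two estimates, which fits the exploratory status of this comparison.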

Publication bias
To explore potential publication bias, we visually examined a funnel plot with SEs plotted against effect sizes of environmental knowledge, attitudes, intentions, and behavior. The funnel plot (Fig. 5) appears somewhat asymmetrical, with stronger positive effects being reported more frequently in studies with smaller samples (i.e., studies with larger SEs), and stronger negative effects being reported more frequently in studies with larger samples (i.e., studies with smaller SEs). Although asymmetry can arise from factors other than publication bias (Sterne et al., 2011), and we were not able to test whether asymmetry was statistically significant, our funnel plot suggests that the overall effect of environmental education on students' environmental outcomes is partially driven by smaller studies with relatively strong positive effects.

Discussion
The current meta-analysis synthesized five decades of global research on environmental education for children and adolescents. It shows that environmental education provides an effective means of improving students' environmental outcomes in each of the categories that we distinguished: environmental knowledge, attitudes, intentions, and behavior.
Our findings concerning environmental knowledge, attitudes, and intentions are consistent with conclusions drawn in previous narrative reviews (Ardoin et al., 2018;Rickinson, 2001). Stereotypes sometimes portray youth, and perhaps especially adolescents, as self-centered or apathetic. However, young people often care about beyond-the-self aims and are driven to make societal contributions (Damon et al., 2003;Fuligni, 2019). Our meta-analysis demonstrates that education can be a viable means of strengthening such motivations (Gould et al., 2011;Vare & Scott, 2007).
One other encouraging finding is that environmental education also appears to benefit students' environmental behavior. We did not hypothesize this effect, given how challenging it is to encourage behavior change in youth (Yeager et al., 2018). As argued by others (Stern et al., 2014), environmental education programs typically focus on promoting knowledge, and programs with such a focus should not necessarily be expected to influence behavior as well. That said, our finding is consistent with evidence from a previous meta-analysis on the behavioral outcomes of environmental education, conducted more than 20 years ago (Zelezny, 1999).
One interpretation is that environmental education can, better than we anticipated, effectively mitigate the psychological barriers that often prevent children and adolescents from engaging in environmental behavior. For example, while knowledge about climate change may not be a sufficient driver of environmental behavior, so-called "action knowledge" on how to engage in environmentally friendly behaviors (e.g., how to recycle waste) may effectively instill in young people the conviction that they can have a meaningful environmental impact, and thus foster behavior change (Otto & Pensini, 2017). An alternative interpretation is that environmental education programs may improve environmental behaviors for which young people experience relatively few psychological barriers to begin with (Gifford, 2011). For example, programs target conservation and recycling behaviors quite regularly (37% of effect sizes concerned these behaviors), whereas youth already perceive these behaviors as relatively easy to engage in (Boyes et al., 2009;Boyes & Stanisstreet, 2012). Psychological barriers may exert a more powerful impact on other environmental behaviors, such as adopting different eating habits (e.g., reducing meat consumption) or purchasing eco-friendly (e.g., sustainably produced) products. These behaviors are often perceived as more difficult to engage in, less useful, or in conflict with peer norms (Prabawa-Sear & Baudains, 2011). None of the studies included in this meta-analysis specifically targeted these behaviors.
Notwithstanding these interpretations, we emphasize that behavior was typically measured (i.e., 87% of effect sizes) via self-report rather than observation or behavioral assessment. Self-reports of environmental behavior, although informative, can be subject to bias. For example, social desirability bias may have led participants to report behaviors in accordance with the contents they were taught in environmental education (Chao & Lam, 2011;Oerke & Bogner, 2013). Although we found no evidence that effect sizes differed between studies that relied on self-reports and those that relied on inferred or observed measures of behavior, this result may be due to limited statistical power given that the latter measures were only rarely used (i.e., 13% of effect sizes for behavioral outcomes).

Unexplained heterogeneity
None of the moderators that we tested accounted for the heterogeneity in the effect sizes of environmental education. For example, we found no evidence that programs that used group learning or offered nature experience were more effective than those that did not. One possibility is that specific forms of group learning, especially those that emphasize peer discussion, may have had adverse effects in some studies. Indeed, peer discussion can make students realize that the main messages endorsed by environmental education may occasionally conflict with peer norms, possibly leveling out potential benefits of group learning (Bamberg, 2013;Gifford, 2011). Also, it is possible that environmental education programs need to offer relatively intensive nature experience to yield meaningful effects, perhaps more intensive than can be feasibly offered in most programs. Indeed, there is evidence that regular rather than single or short-term experiences in nature are needed to foster nature connectedness and improve environmental outcomes (Kuo et al., 2019;Rosa, Profice, & Collado, 2018;Whitburn et al., 2019).
Similarly, we found no evidence that the effectiveness of environmental education programs varies across age, or indeed, across any of the other putative moderators that we explored, which included study and design characteristics (e.g., publication year, study design), intervention characteristics (e.g., program duration), participant characteristics (e.g., sample size, sample sex), and outcome characteristics (e.g., operationalization of environmental outcomes). Our finding that recent environmental education efforts do not seem more effective than earlier efforts suggests that environmental education has not meaningfully improved in recent decades. We note though that it may also indicate that programs have been successfully adapted to target the changing and increasingly complex environmental challenges that the world faces (Dunlap et al., 2000;Intergovernmental Panel on Climate Change [IPCC], 2019).
Finally, we found no evidence that the type of activity that environmental education programs offer makes a difference in terms of program effectiveness. Camps, field trips, classroom-based approaches, investigation-based approaches, or some mixture of such activities all tend to yield similar effects. Thus, although we know that environmental education often does what it intends to do, we still know little about the conditions and approaches that facilitate its effectiveness. What can be concluded is that the positive effects of environmental education are relatively robust and can be obtained in diverse ways, for diverse outcomes, and for diverse student populations of different ages.

How can the field move forward?
The gravity of the current environmental crisis necessitates the promotion of environmental outcomes in young people, who will shape the future of the planet (IPCC, 2019). To support educators and policymakers in the development and evaluation of environmental education, this field of research has a critical role to fulfill. One notable strength of the research that we meta-analyzed is the diversity of the populations it sampled. We were able to include studies conducted in over 40 countries across the globe. As such, the field sets a positive example: most research in the social and behavioral sciences relies more heavily on samples from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) populations, raising concerns about representativeness (Henrich et al., 2010;Nielsen et al., 2017). The global implementation of environmental education has also allowed for programs to target environmental problems that matter locally, ranging from the conservation of endangered endemic monkeys (i.e., cotton-top tamarins) in Colombia (Feilen et al., 2018) to the protection of drinking water resources in India (Alexandar & Poyyamoli, 2012).
Notwithstanding these strengths, the field faces challenges as well. Below, we outline research priorities that may help further strengthen the field's impact. In doing so, we focus specifically on the promotion of environmental behavior. Environmental education primarily targets environmental knowledge (i.e., almost half of the effect sizes included in our study) and, as this meta-analysis demonstrated, it does so successfully. Arguably, optimization of our means to promote behavior change thus seems most urgent at this stage: behavioral and lifestyle changes are necessary for mitigating climate change (IPCC, 2019;Nielsen et al., 2021).
A first priority will be to gain a more fine-grained understanding of when and how environmental education impacts behavior change. This understanding can be advanced by diversifying methodological standards, both in terms of research design and assessment methods. More frequent use of randomized experimental designs will allow scholars to draw causal conclusions on the effects of environmental education on behavioral outcomes. Also, future program evaluations could be designed to empirically isolate effective program components, perhaps using techniques similar to those used in clinical intervention research to discern active therapy ingredients (e.g., additive and dismantling trials, factorial experiments; Leijten et al., 2021). Similarly, in terms of assessment methods, it would be good to complement self-reports of behavior with observational or other in vivo behavioral assessments (Camargo & Shavelson, 2009;Lange & Dewitte, 2019). Preferably, these observational or behavioral assessments are unobtrusive. In one study (Baur & Haase, 2015), for example, participants received sweets wrapped in invisibly labeled packages, which enabled the researchers to track whether students who had taken part in an environmental education program recycled their waste. By diversifying methodological standards, environmental education research can contribute to a deeper understanding of behavior change processes in young people and establish a rigorous knowledge base from which to inform educators and policymakers.
A second priority will be to broaden the scope of environmental behaviors that are targeted and assessed as outcomes. As it stands, environmental education mainly targets behaviors that are known to be relatively amenable to change, including conservation and recycling behaviors. While behavior change in these domains is important, it would be good to also target other impactful behaviors that seem harder to change (e.g., eco-friendly consumption; Herrero et al., 2016), arguably because the psychological barriers that hamper these behaviors are more potent. The development of theoretically precise, targeted intervention procedures aimed at changing the psychological process that impedes behavior change is promising in this regard (Walton, 2014;Yeager & Walton, 2011). What is more, we recommend expanding the current focus on individual behaviors to include public sphere environmental behaviors (e.g., contributing to environmental organizations), which may also contribute substantially to social change (IPCC, 2018;Jorgenson et al., 2019).

Strengths and limitations
Our meta-analysis took stock of the diverse, interdisciplinary, and global field of environmental education research and empirically demonstrated its accomplishments, from its early implementation in the 1970s until now. We have critically appraised these accomplishments and outlined research priorities that should foster the field's vitality during a time of pressing need for sustainable change. Our use of a multilevel modeling approach allowed us to extract multiple effect sizes from single studies, enabling us to distinguish among four major outcomes of environmental education for children and adolescents (Van den Noortgate et al., 2015). In addition, we tested both theoretically relevant and descriptive moderators in an attempt to understand better the conditions under which environmental education is most effective.
Our meta-analysis has limitations as well. First, the studies that we included are fairly heterogeneous in terms of the measurement approaches they used. For example, they either relied on validated measures or author-created alternatives; and they assessed environmental behavior either in terms of a composite of diverse behaviors (e.g., recycling, conservation, consumption) or rather a single, specific action (e.g., recycling sweets packages). Our findings should be interpreted in the light of such heterogeneity. Still, we established that our findings did, in fact, generalize across the measurement characteristics that we coded, suggesting that program effectiveness does not hinge upon a particular outcome operationalization.
Second, for the coding of educational approaches (i.e., group learning, nature experience), we needed to assume that programs used group learning or offered nature experience only when these approaches were explicitly mentioned in the study report. However, we acknowledge this is not necessarily the case, which may have rendered our coding of these variables somewhat less precise. Third, our estimation of age effects was not as precise as we had hoped. The sample age ranges of the studies we meta-analyzed differed substantially, making it impossible to estimate age effects precisely. In future work, such a precise estimate could be obtained by using individual participant data meta-analysis, a technique that allows for testing moderators on the level of the individual participant rather than study sample. Fourth, although our search strategy included several synonyms for 'environmental education', it is possible that some studies that could be classified as environmental education but used a different label were not included in our analyses. Nonetheless, with 169 studies and 512 effect sizes, our meta-analysis is the most comprehensive synthesis of environmental education research so far.

Conclusion
Our meta-analysis demonstrates that environmental education offers an effective approach to improving the environmental knowledge, attitudes, intentions, and behavior of young people. Priorities for the field are to diversify methodological standards, to discern effective environmental education components and approaches, and to expand the scope of targeted environmental behaviors. We hope that environmental education can further realize its potential to help young people develop into environmentally aware and engaged actors.

Declaration of competing interest
None.
This study was registered with Open Science Framework (OSF; http://doi.org/10.17605/OSF.IO/H639V). The data and analysis code that support the findings of this study are also available on OSF. This work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement 864137 awarded to Sander Thomaes).