Educational research generally aims to develop cognitive skills by means of practices that are based on scientific evidence (Slavin, 2020). That is, practices confirmed through the scientific method make it possible to improve school performance (Agarwal et al., 2018; Dunlosky et al., 2013; Sigman et al., 2014). According to the science of learning, retrieval practice is one of the most effective strategies (Dunlosky et al., 2013). This practice is intended to promote the retrieval of studied information by "taking information out of students' heads" (Agarwal et al., 2018; Roediger, Putnam & Smith, 2011). Thus, students are able to retrieve previously studied content of their own volition whenever they need it (Agarwal et al., 2018; Roediger, Putnam & Smith, 2011). Learned content can be accessed by means of simple techniques such as tests, self-tests, questionnaires, quizzes, and flashcards (Agarwal et al., 2018; Agarwal et al., 2012; Roediger, Putnam & Smith, 2011). It is advisable to use retrieval practice with no grading or with low stakes (Roediger, Putnam & Smith, 2011; Yang et al., 2021), as a learning strategy rather than as an assessment (Agarwal et al., 2012; McDaniel et al., 2013; Roediger, Putnam & Smith, 2011).
Considering that retrieval practice strengthens memory (McDaniel et al., 2013) and is seen as one of the best strategies to promote long-lasting learning (Dunlosky et al., 2013), it is necessary to understand how researchers reached such a conclusion. The experimental procedure used to observe the effects of retrieval practice can be described as follows: I) The students are introduced to the to-be-learned information; II) After that, there is a lag, which is the time between the introduction of content and the first retrieval practice; III) Subsequently, the students are reintroduced to the information by means of retrieval practice or a control condition (which, in most studies, consists of rereading the studied content) (Broek et al., 2016; Moreira, Pinto, Starling & Jaeger, 2019; Rowland, 2014); IV) Then, there is a retention interval, that is, the time between the reintroduction to content and the final test (Broek et al., 2016); V) Finally, retention is assessed by a test applied to the participants of the study, to measure performance in the retrieval practice condition in comparison to the control condition (Broek et al., 2016). In all these phases, there are variables that researchers might control, and these variables might interfere with the obtained results (for a brief summary, see Figure 1, with the most important variables found in each phase).
Note: the blue boxes show the most important variables identified by this systematic review in each phase of the experimental procedure involving retrieval practice.
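The two time variables in the procedure above, lag and retention interval, can be made concrete with a small sketch. The class name, field names, and example dates below are hypothetical and serve only to illustrate the definitions given in the text; they do not come from any of the reviewed studies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ExperimentTimeline:
    """Hypothetical timestamps for one participant in a retrieval practice experiment."""
    initial_study: datetime    # phase I: content is introduced
    reintroduction: datetime   # phase III: retrieval practice or control (e.g. rereading)
    final_test: datetime       # phase V: criterion test

    @property
    def lag(self) -> timedelta:
        # Phase II: time between content introduction and the first retrieval practice.
        return self.reintroduction - self.initial_study

    @property
    def retention_interval(self) -> timedelta:
        # Phase IV: time between the reintroduction to content and the final test.
        return self.final_test - self.reintroduction


# Illustrative values only: practice 10 minutes after study, final test one week later.
example = ExperimentTimeline(
    initial_study=datetime(2022, 3, 7, 9, 0),
    reintroduction=datetime(2022, 3, 7, 9, 10),
    final_test=datetime(2022, 3, 14, 9, 10),
)
print(example.lag, example.retention_interval)
```

Keeping the two intervals as separate derived quantities mirrors the way the review reports them as distinct variables.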
As a result of the research that used the experimental procedure described above, there is facilitation for learning (McDermott, 2021), better memory performance for introduced content, and long-lasting learning (e.g. McDaniel et al., 2013), including other benefits (see Roediger, Putnam & Smith, 2011). These are robust discoveries, with most studies showing effect sizes ranging from moderate (Adesope et al., 2017) to large, both in the laboratory and in the classroom (Agarwal et al., 2021; Yang et al., 2021).
Although retrieval practice is one of the best strategies for studying, when students decide which strategies to use to learn or review material, they usually choose less effective strategies that are not based on evidence, such as rereading material and/or notes (Ekuni & Pompeia, 2020; Karpicke et al., 2009; Tullis & Maddox, 2020). Unfortunately, rereading is not as beneficial as retrieval practice (Rowland, 2014). In addition, most undergraduate students report that they do not study the way their teachers have instructed them to (Hartwig & Dunlosky, 2012). In this sense, it is indispensable that students, as well as teachers, know which strategies are best to use.
A point that deserves attention is the fact that old study habits are hard to change. For instance, even though undergraduate students perceived that studying via retrieval practice led to better learning, they still chose to study by rereading content (Karpicke et al., 2009). Considering that childhood is a period of everlasting lessons, it is necessary to organize the teaching and learning process according to children's specific needs (e.g. by paying attention to the creative capacity of each age group) (Delvan et al., 2002). One plausible hypothesis is that, if children are taught good study habits from an early age, they might keep them throughout their academic career. Thus, it is possible to adapt curricula, teaching methods, and pedagogical routines (Agarwal et al., 2018; Roediger, Agarwal et al., 2018; Tullis & Maddox, 2020) to promote retrieval practice from childhood (Fazio & Agarwal, 2020).
However, before making practice recommendations for this age group (e.g. Fazio & Agarwal, 2020), it is necessary to further investigate whether retrieval practice also benefits child education (Fazio & Marsh, 2019). For example, Aslan and Bäuml (2015) detected positive effects only in older children (average age of 8 years), but not in younger children (average age of 6 years). Metcalfe et al. (2007) verified that the earlier an item is introduced by means of retrieval practice, that is, the shorter the lag, the greater the chance of retrieving the information on the final test after a one-week retention interval. Brojde and Wise (2008), using stories, pointed out that retrieving the same content two or more times with short-answer questions produced better retention than just reading the material. Adlof et al. (2021) verified that a learning protocol with multiple presentations via retrieval practice can be useful for learning new words, especially for children with specific language disorders. Other studies have identified that giving students retrieval practice opportunities optimizes learning, even when the children make mistakes, if feedback is provided (whether students got it right or wrong and what the correct answer is) (Agarwal et al., 2012; Agarwal et al., 2018; Fazio & Agarwal, 2020).
In one review, Fazio and Marsh (2019) assessed the benefits of retrieval practice for students from kindergarten to elementary school and verified that children retain more information when they are given a chance to recall it from memory, that is, by retrieval practice. However, this review approaches the literature critically rather than systematically. In the available systematic review, only 10% of the included studies on the effects of retrieval practice with students focus on children; 24% are on children and adolescents, and the rest focus on adults (Agarwal et al., 2021). Likewise, in the literature review conducted by Adesope et al. (2017), most of the studies on retrieval practice were conducted with adults (e.g. undergraduate students). Although most of the studies involved undergraduate students, the meta-analysis by Yang et al. (2021) shows that the benefits of learning via retrieval practice can be observed in higher education, as well as in basic education and high school.
Given the importance of providing guidelines for the use of effective strategies (Slavin, 2000), this study aims to provide a current synthesis of the variables considered in the studies, as well as the results obtained when retrieval practice was carried out with children. To do so, we followed the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) method (Moher et al., 2009). The obtained results are presented with a focus on the number of experiments and participants, the geographical location, age group, setting, materials, type of control condition, repetitions and moments for the retrieval practice, presence or absence of feedback, test formats, lag, retention interval, and effects of the retrieval practice.
Materials and methods
This review uses the PRISMA method (Moher et al., 2009). As eligibility criteria, we selected peer-reviewed articles on retrieval practice with children up to 12 years of age (the age at which people start to be considered adolescents). We considered research written in Portuguese or in English and conducted in any of the following settings: at home, in the laboratory (including research conducted at school in a separate room with the experimenter), or at school (in the context of a regular classroom).
The bibliographical search was carried out from May 2020 to March 2022. No publication date limit was set for the inclusion of articles: regardless of publication date, all articles identified in the search were considered in the screening/analysis. The included databases were: Portal of Journals of the Coordination for Higher Education Personnel of the Ministry of Education and Culture (CAPES/MEC), Education Resources Information Center (ERIC), PsycINFO, Scielo, PubMed, and Web of Science. As search strategies, the following combinations of keywords were used: "testing effect", "retrieval practice", "test-enhanced learning" (all of them in the title or topic/abstract) AND child* (in any part), as well as their versions in Portuguese. We kept results that focused on the practice as a study strategy (we excluded, for example, research focused on false memories). In the advanced search, we restricted the search to peer-reviewed articles. The searches were conducted by two researchers independently. When there was doubt regarding the eligibility of an article, both reassessed the full text against the previously established inclusion and exclusion criteria. In addition, we included studies found in the references of the critical review of the literature with children (Fazio & Marsh, 2019) and of the systematic review by Agarwal et al. (2021) that did not appear in the searches.
Results
As a result, 232 records were identified: 224 from database searches (49 from the CAPES journal database, 41 from ERIC, 23 from Web of Science, 75 from PubMed, 22 from PsycINFO, and 14 from Scielo) and 8 from other sources [references in the reviews (Agarwal et al., 2021; Fazio & Marsh, 2019) that did not appear in the searches (Adler et al., 2000; Agarwal et al., 2014; Carneiro et al., 2018; Hotta et al., 2016; Lima & Jaeger, 2020; Lipko-Speeda et al., 2014; Sheffield & Hudson, 2006; Spitizer, 1939)]. After eliminating 46 duplicated articles, we screened 186 articles. After checking the titles and abstracts, we excluded 97 results because they did not meet the eligibility criteria; that is, they did not refer to research on retrieval practice with children. Thus, 89 studies were read in full. Of these, 48 articles were excluded with justification (for example, they did not address retrieval practice with children). For details regarding the reasons why studies were excluded, please see the supplementary document. Therefore, this review contains 41 articles, according to Figure 2.
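The selection counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative. Note that the six database counts sum to 224, and adding the 8 records from other sources gives the total from which the 46 duplicates are removed.

```python
# Record counts per source, as reported in the text.
database_records = {
    "CAPES": 49,
    "ERIC": 41,
    "Web of Science": 23,
    "PubMed": 75,
    "PsycINFO": 22,
    "Scielo": 14,
}
other_sources = 8                 # references found in earlier reviews
duplicates = 46                   # removed before screening
excluded_on_title_abstract = 97   # did not meet eligibility criteria
excluded_on_full_text = 48        # excluded with justification

from_databases = sum(database_records.values())
identified = from_databases + other_sources
screened = identified - duplicates
read_in_full = screened - excluded_on_title_abstract
included = read_in_full - excluded_on_full_text

print(from_databases, identified, screened, read_in_full, included)
```

Each intermediate value matches a stage of the flow described in the text (screened, read in full, included).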
Sample Description
The present review includes a sample of approximately 92 experiments (comparisons), with 10,153 participants (babies and children) (Table 1). The numbers of experiments and participants are approximate because two studies are longitudinal and the exact number of experiments and the size of the final sample remain unclear (Agarwal et al., 2014; Bailey et al., 2012). For example, the research by Agarwal et al. (2014) was conducted with a sample of 1,408 students, including children and adolescents altogether (1,306 were students in elementary education whose ages ranged from 11 to 14 years). The study by Coyne et al. (2015) involved 15 children and adolescents, whose ages ranged from 8 to 16 years.
Haseneder et al. (2019) also developed their research with a combination of 357 children and adolescents whose ages ranged from 10 to 17 years. As a consequence, some results overlap investigations with children and adolescents. However, whenever possible, adolescents and adults were left out of the analysis, as the focus of this review is to go over the effects of the retrieval practice on children (up to 12 years old), according to the descriptions by the eligibility criteria.
Taking into consideration the age group of the target sample (children up to 12 years old) and the presence of beneficial effects of the retrieval practice, the literature shows that babies of 18 months or older (Sheffield & Hudson, 2006) already display the positive effects of this practice (Table 1). Three studies (Dang, Yang, Che et al., 2021; Dang, Yang & Chen, 2021; Lechuga et al., 2006) aimed to compare learning performance after retrieval practice with that of children of different ages. For example, younger children (6-9 years), older children (10-12 years), and adults practiced retrieving during the reintroduction to content. In the final test, the adults remembered more propositions (48%) than the older children (41%) and the younger children (34%) (Lechuga et al., 2006). On the other hand, Dang, Yang, Che et al. (2021), as well as Dang, Yang & Chen (2021), observed that the frequency of correct retrieval increases with age and that, when compared to rereading, learning was more substantial for both younger children and older children (Dang, Yang, Che et al., 2021). In the comparisons, the authors affirmed that there is a difference between the performance of children and adults. However, the difference between the younger children and the older children is just numeric and not statistical (p = .47).
Lipowski et al. (2014), in turn, verified that, although the general levels of performance were better for children aged from 8 years and 2 months to 9 years and 11 months than for younger children (6 years and 5 months to 8 years and 2 months of age), the magnitude of the benefits of the retrieval practice did not increase significantly with age.
Most of the studies involved children with typical development, except for three studies (Haebig et al., 2021; Leonard & Deevy, 2020; Leonard et al., 2020). In the study by Leonard and Deevy (2020), children with specific language disorders benefitted from spaced retrieval practice. In this study, the children were introduced to new words (names of exotic plants and animals) and their respective images and characteristics. Each child participated in two learning conditions (more retrieval practice and less studying; more studying and less retrieval practice), with short breaks between the four learning phases. During the retrieval practice manipulation, the children were shown a picture and were expected to remember the name and some of its characteristics. As a result, retrieval practice was more beneficial for learning than rereading/studying (Leonard & Deevy, 2020).
Another similar study showed that, regardless of their lexical skills, children with specific language disorders benefitted from retrieval practice for learning new words (Leonard et al., 2020). Haebig et al. (2021) conducted studies with 36 children aged between 4 and 6 years. Half of the children presented typical development and the other half had specific language disorder. As material, the researchers used sets of names of exotic plants and animals and corresponding images, in two conditions (immediate retrieval versus repeated retrieval), with cued-recall as criterion tests. The authors verified that contextual re-establishment, due to the spacing between retrieval practices, enhances the long-term retention of words for both groups of children (with typical development and with specific language disorder) (Haebig et al., 2021).
Coyne et al. (2015) studied children and adolescents who survived traumatic brain injuries. They identified that retrieval practice is a promising strategy for supporting learning and memory rehabilitation in children with pediatric cranioencephalic trauma (Coyne et al., 2015).
Regarding geographical location, 49% of the studies on retrieval practice in the teaching and learning process with children were developed in the US. Only 5% of studies that were included in this review were conducted in Latin America, according to Table 2.
Note: According to the authors, in the article by Agarwal et al. (2014), the number of comparisons (experiments) is approximate, as is the number of participants, because children and adolescents were considered together. In the study by Bailey et al. (2012), the number of comparisons (experiments) is approximate. In the experiments by Coyne et al. (2015) and Haseneder et al. (2019), the number of participants is also approximate, because the number of children and adolescents was reported together.
Some experiments used more than one testing format.
Materials
The materials used to investigate the effects of retrieval practice on children were numerous. There were stories (Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014), illustrated books (Cornell et al., 1988), texts adapted from pedagogical materials (Agarwal et al., 2014; Barenberg & Dutke, 2019; Jaeger et al., 2015; Karpicke et al., 2014; Leahy & Sweller, 2019; Lima & Jaeger, 2020; Marsh et al., 2012; Moreira, Pinto, Justi & Jaeger, 2019; Roediger, Agarwal et al., 2011; Rowley & McCrudden, 2020; Spitizer, 1939), proper names (Fritz et al., 2007), names of animals and plants, along with their respective information and semantic facts (Haebig et al., 2021), names of taxonomic categories represented by pictures (Lipowski et al., 2014), associated images and objects (Kliegl et al., 2018; Ma, Li, Duzi et al., 2020), associated images and pictures with their names and meanings (Leonard & Deevy, 2020; Leonard et al., 2020; Moore et al., 2018), lists of words (Bouwmeester & Verkoeijen, 2011; Dang, Yang, Che et al., 2021; Dang, Yang & Chen, 2021; Haebig et al., 2019; Jones et al., 2015; Karpicke et al., 2016; Ma, Li, Li & Zhou, 2020), words introduced by one illustration, one definition, and one context phrase (Goossens et al., 2016), and words and synonyms (Goossens, Camp, Verkoeijen & Tabbers, 2014). Other materials were pairs of associated words (Carneiro et al., 2018; Coyne et al., 2015; Hughes et al., 2018; Lechuga et al., 2006), pairs of concepts and their definitions (Lipko-Speeda et al., 2014), cards containing geographic information (Ritchie et al., 2013), fictitious maps (Rohrer et al., 2010), simple addition exercises (Bailey et al., 2012), a theoretical and demonstrative lecture on first aid (Haseneder et al., 2019), target actions (Sheffield & Hudson, 2006), and small items such as toys and a box with subdivisions (Hotta et al., 2016).
Equally important are studies that used content adapted from pedagogical materials (Agarwal et al., 2014; Karpicke et al., 2014; Leahy & Sweller, 2019; Lima & Jaeger, 2020; Marsh et al., 2012; Roediger et al., 2011b; Rowley & McCrudden, 2020; Spitizer, 1939), texts (Jaeger et al., 2015; Moreira, Pinto, Justi & Jaeger, 2019), and diverse disciplines such as chapters used in social studies (Roediger, Agarwal et al., 2011), sciences, history, and geography (Marsh et al., 2012).
Control conditions
Almost half of the studies used rereading as the control condition (Table 2). The rereading was done by the children (Bouwmeester & Verkoeijen, 2011; Carneiro et al., 2018; Cornell et al., 1988; Coyne et al., 2015; Dang, Yang, Che et al., 2021; Dang, Yang & Chen, 2021; Goossens, Camp, Verkoeijen & Tabbers, 2014; Goossens et al., 2016; Hughes et al., 2018; Jaeger et al., 2015; Karpicke et al., 2014; Karpicke et al., 2016; Leahy & Sweller, 2019; Leonard et al., 2020; Lima & Jaeger, 2020; Lipko-Speeda et al., 2014; Ma, Li, Duzi et al., 2020; Ma, Li, Li & Zhou, 2020; Moreira, Pinto, Justi & Jaeger, 2019), or by the experimenter (Dang, Yang, Che et al., 2021; Dang, Yang & Chen, 2021). Some studies combined rereading with other strategies or variables, such as conceptual maps (Karpicke et al., 2014), age (Dang, Yang, Che et al., 2021), elaboration and rereading (Ma, Li, Li & Zhou, 2020), elaborative codification (generation of mediating words) (Hughes et al., 2018), and pretests and posttests (Lima & Jaeger, 2020). Of the studies that used rereading as a control condition, only one found a negative effect of retrieval practice. However, the effect size was small and the average numeric difference between the groups was small (Goossens et al., 2016). The other ones replicated the positive effect. In the studies that found both positive and neutral effects, the outcome depended on other variables of the experimental condition. For example, Ma, Li, Li & Zhou (2020) found a neutral effect for retrieval practice without feedback. However, when the retrieval practice was followed by feedback, the effect was positive.
A control condition similar to rereading is to revisit the material by looking repeatedly at images/pictures (Kliegl et al., 2018; Lipowski et al., 2014). While Lipowski et al. (2014) found a positive effect of retrieval practice compared to looking at images, Kliegl et al. (2018) showed that this effect depended on the test format (cued-recall generated an effect whereas free retrieval did not) and that the size of the effect was considerable when feedback was included after retrieval practice attempts.
Five studies used copying/rewriting as a control condition (Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014; Goossens et al., 2016; Jones et al., 2015; Rohrer et al., 2010; Rowley & McCrudden, 2020). The retrieval practice strategy was better than copying, except in Goossens et al. (2016). In this study, the researchers taught vocabulary by copying the words or practicing retrieval with short breaks (distributed over one week) or long breaks (distributed over two weeks) between repetitions. In two groups, the repetitions with short breaks led to a beneficial effect of retrieval practice, whereas the effect was neutral for long breaks; in the other groups, there was no effect. In one group, however, copying was better than retrieval practice via cued-recall, although the effect size was quite small; for the other groups, the effect was neutral.
Interestingly, Jones et al. (2015), by comparing retrieval practice with the rainbow writing technique, which is based on the rewriting principle, verified that retrieval practice promotes more efficient learning, with more accurate spelling. One study used vision reintegration as a control (the babies watched a representational media showing target actions) versus live recreation (the babies were stimulated to physically perform the target actions). As a result, retrieval practice, that is, recreating the action, was better than watching the action (Sheffield & Hudson, 2006). Similarly, having the children place objects in locations according to instructions, that is, copying the movement executed by the experimenter, was worse than retrieval practice (Hotta et al., 2016). Two studies used oral repetition as a control and, again, retrieval practice was better and produced a large effect size in both (Fritz et al., 2007; Leonard & Deevy, 2020).
Six studies used a no-test control condition (Agarwal et al., 2014; Barenberg & Dutke, 2019; Brojde & Wise, 2008; Haseneder et al., 2019; Ritchie et al., 2013; Roediger, Agarwal et al., 2011), and some of these studies used combined controls: no test versus inference test versus fact testing (Brojde & Wise, 2008), no test versus rereading versus web-based testing (Roediger, Agarwal et al., 2011), and one study used conceptual maps versus not using conceptual maps as a control (Ritchie et al., 2013). As expected, all of these studies pointed out that being tested as a means to practice retrieval is better than doing nothing.
Placement of retrieval practice, repetitions, lag, and retention interval
Other comparisons analyzed repetitions and placement of the retrieval practice (Haebig et al., 2019; Haebig et al., 2021; Spitizer, 1939). They pointed to the fact that practicing immediately after learning leads to less forgetting (Spitizer, 1939). Likewise, in some groups, retrieval practice with short lags (distributed throughout one week) was better than retrieval practice with long lags (distributed throughout two weeks) (Goossens et al., 2016). In addition, increasing the number of retrieval practices with less rereading generated better retention of learned material when compared to more rereading with less retrieval practice (Leonard et al., 2020; Spitizer, 1939). Lima and Jaeger (2020), in research conducted in schools in the Brazilian southeast using encyclopedic texts as material, verified that the posttests (episodic memory dependent) resulted in more retrieval than the pretests (semantic memory dependent). That is, the children supplied more correct answers when taking the posttests than when taking the pretests. However, the pretests, as well as the posttests, produced better memory performance than rereading, as evidenced by the multiple-choice tests (Lima & Jaeger, 2020). Another finding is that including these tests in the learning process, that is, encouraging students to try to remember information during study time, might facilitate good posttest performance in comparison to rereading, in immediate posttests as well as in delayed posttests (Leahy & Sweller, 2019).
The lags used in the experimental procedures, that is, the time between learning (content introduction) and the first retrieval practice, ranged from immediate up to two weeks (Table 2). If lags ranging from seconds to a few minutes are counted as immediate, they represent 71% of the sample. Again, almost all studies showed positive effects of retrieval practice, except for the one by Goossens et al. (2016), in which there were indications that the effect of retrieval practice depended on a shorter interval between repetitions, but not on the lag, as the moment of the first retrieval (lag) did not change between the two conditions (Goossens et al., 2016).
The retention interval, that is, the time between the manipulation (retrieval practice and control condition) and the final test, ranged from immediate to 9 months (Haseneder et al., 2019) (Table 2). The results suggest that retrieval practice generates positive effects, including after long retention intervals.
Setting
Concerning the setting, most of the studies (56%) were conducted at school (Table 2) (Agarwal et al., 2014; Bailey et al., 2012; Barenberg & Dutke, 2019; Fritz et al., 2007; Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014; Goossens, Camp, Verkoeijen & Tabbers, 2014; Goossens et al., 2016; Haseneder et al., 2019; Jaeger et al., 2015; Jones et al., 2015; Karpicke et al., 2014; Karpicke et al., 2016; Kliegl et al., 2018; Leahy & Sweller, 2019; Lechuga et al., 2006; Lima & Jaeger, 2020; Lipko-Speeda et al., 2014; Moreira, Pinto, Justi & Jaeger, 2019; Ritchie et al., 2013; Roediger, Agarwal et al., 2011; Rohrer et al., 2010; Rowley & McCrudden, 2020; Spitizer, 1939). Among the studies conducted in the classroom at school, only one obtained a negative effect of retrieval practice, which is probably not related to the setting, but to the interval between repetitions and the control condition (copying) (Goossens et al., 2016). One result was neutral when compared to the concept map condition, but positive for the rereading condition (Karpicke et al., 2014), and the other results were positive for retrieval practice (Table 2).
Despite taking place at school, nine studies were conducted in individual rooms, which characterizes them as laboratory studies (Table 2) (Bailey et al., 2012; Fritz et al., 2007; Goossens, Camp, Verkoeijen & Tabbers, 2014; Jaeger et al., 2015; Kliegl et al., 2018; Leahy & Sweller, 2019; Lechuga et al., 2006; Lima & Jaeger, 2020; Lipko-Speeda et al., 2014). In the longitudinal study, some children changed schools or districts. Therefore, they were assisted on campus or in a van (Bailey et al., 2012). One study was conducted with small groups of 2 to 4 children (Rohrer et al., 2010), also mimicking the laboratory setting. Out of these studies, six found positive effects of retrieval practice, and the other studies were mixed (positive and neutral). Seventeen (41.5%) studies were conducted in a laboratory (see Table 2). Out of these studies, 6 were carried out in a laboratory, 12 pointed to positive effects of retrieval practice, 4 were mixed (positive and neutral), and one was positive for easy questions and negative for difficult questions in the absence of feedback (Marsh et al., 2012) (Table 2). That is, in all environments (classroom, school outside the regular classroom, or laboratory), there are more positive than negative or neutral effects of retrieval practice.
Other studies were totally (Cornell et al., 1988) or partially conducted in the participants' homes. Sheffield and Hudson (2006) conducted experiments 1 and 2 in the laboratory, and experiment 3 in the laboratory and at home, according to the availability of each participant. As a result, the obtained effects were more positive for retrieval practice alternated with readings of the text (Cornell et al., 1988), as well as in the condition where the children actively and physically re-enacted the target actions than when the children just listened to or observed the answers passively (Sheffield & Hudson, 2006).
Presence of feedback
Feedback was provided in 61% of the studies (Table 2). Of these, 68% found positive effects of the retrieval practice followed by feedback (Fritz et al., 2007; Haebig et al., 2021; Hotta et al., 2016; Hughes et al., 2018; Kliegl et al., 2018; Lipko-Speeda et al., 2014; Ma, Li, Duzi et al., 2020; Marsh et al., 2012; Roediger, Agarwal et al., 2011; Rohrer et al., 2010; Ritchie et al., 2013). The other ones were positive or neutral. However, difficult questions with no feedback generated a negative retrieval practice effect (Marsh et al., 2012).
Effects of the retrieval practice
Concerning the retrieval practice effects, 40 studies (approximately 93%, Table 2) found positive effects in some conditions that involved the retrieval practice (Agarwal et al., 2014; Bailey et al., 2012; Barenberg & Dutke, 2019; Bouwmeester & Verkoeijen, 2011; Carneiro et al., 2018; Cornell et al., 1988; Coyne et al., 2015; Dang, Yang, Che et al., 2021; Dang, Yang & Chen, 2021; Fritz et al., 2007; Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014; Goossens, Camp, Verkoeijen & Tabbers, 2014; Haebig et al., 2019; Haebig et al., 2021; Haseneder et al., 2019; Hotta et al., 2016; Hughes et al., 2018; Jaeger et al., 2015; Jones et al., 2015; Karpicke et al., 2014; Karpicke et al., 2016; Kliegl et al., 2018; Leahy & Sweller, 2019; Lechuga et al., 2006; Leonard & Deevy, 2020; Leonard et al., 2020; Lima & Jaeger, 2020; Lipko-Speeda et al., 2014; Lipowski et al., 2014; Ma, Li, Li & Zhou, 2020; Ma, Li, Duzi et al., 2020; Marsh et al., 2012; Moore et al., 2018; Moreira, Pinto, Justi & Jaeger, 2019; Ritchie et al., 2013; Roediger, Agarwal et al., 2011; Rohrer et al., 2010; Rowley & McCrudden, 2020; Sheffield & Hudson, 2006; Spitizer, 1939).
In approximately 25% of the research analyzed, experiments had a neutral effect, that is, no difference between carrying out the retrieval practice or the control condition (Carneiro et al., 2018; Cornell et al., 1988; Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014; Hughes et al., 2018; Karpicke et al., 2014; Kliegl et al., 2018; Leahy & Sweller, 2019; Lipko-Speeda et al., 2014; Ma, Li, Duzi et al., 2020; Ma, Li, Li & Zhou, 2020; Moore et al., 2018). This neutral effect can be explained by the use of some variables such as the testing format. For example, the students did a little better in the cued-recall than in the free recall (Karpicke et al., 2014), or the effect stopped being neutral and became positive in the presence of feedback (Ma, Li, Li & Zhou, 2020; Kliegl et al., 2018).
Formats of the retrieval practice
Retrieval practice can take a variety of formats, such as multiple-choice, free recall, short answer, cued-recall, fill-in-the-blank, and recognition tests, as well as combinations of them (Table 2). For example, Karpicke et al. (2014), using texts adapted from science textbooks as material, compared a free recall test with rereading and concept mapping as control conditions. The final test was in short-answer format; they found that retrieval practice is feasible as a study strategy for children when they receive guidance from their teachers to reach successful retrievals (Karpicke et al., 2014).
In experiment 2, Lechuga et al. (2006) applied cued-recall tests, presenting the first letters of the target word that matched the studied word pairs (material used in the research). In the final test, they used free recall. As in the previous study, even when the test formats differed between the study phase and the criterion test, it was still possible to verify the positive effects of the retrieval practice; that is, this effect does not depend on congruence between the format of the test used when the information is reintroduced and the format of the criterion test (Lechuga et al., 2006).
However, Marsh et al. (2012) offered easy and difficult tests with content on science, history, geography, and other educational topics. They conducted experiments with multiple-choice tests (in which students were given tricky choices, that is, wrong answers alongside the right alternative), and the final test was in short-answer format. As a result, they observed that the difficult questions with no feedback generated a negative retrieval practice effect, as students learned wrong answers.
Discussion
The present systematic review assessed 41 complete articles on retrieval practice with children. Given the need to equip educators with evidence-based practices (Slavin, 2020), it is relevant that the retrieval practice effect has been widely replicated, which supports proposing its use from childhood, as previously suggested (Fazio & Agarwal, 2020). The present review included studies on different age groups, starting with babies as young as 18 months old (Sheffield & Hudson, 2006). Such a result matters because it might extend the use of this study strategy to an early age, as studies suggest that the direct effect of the test, that is, the reduction in forgetting that makes learning last through the use of retrieval practice, shows up early in human life (Dang, Yang, Che et al., 2021). In addition to this direct effect, there is also facilitation of further learning (Dang, Yang & Chen, 2021).
Nonetheless, there are nuances concerning age, perhaps due to cognitive development itself. For example, one study indicates that older children benefit more from retrieval practice than younger children do (Lechuga et al., 2006), which is consistent with the findings by Aslan and Bäuml (2015). Nevertheless, it is still possible to observe the benefits of the retrieval practice in different age groups (Table 2), and these benefits do not appear to depend on levels of reading skills, comprehension, or processing speed (Karpicke et al., 2016). This corroborates the premise that the human brain is ready to learn from the very beginning using retrieval practices (Hotta et al., 2016).
However, to retrieve information accurately, it is necessary to have some skills developed (Bailey et al., 2012). Thus, we must pay attention to the development of children by adapting the practices, which takes us to the topic of test formats. The test formats (e.g. multiple-choice, short answer) might vary, which broadens the possibilities for teachers in the classroom, who do not need to be restricted to just one format. In addition, when the formats differ between the moment of reintroduction to content and the final test, transfer can occur, leading to even more pronounced results (Rohrer et al., 2010). It is important to emphasize that the test format must be chosen in accordance with the level of difficulty of the content; that is, for difficult content, it might be a good idea to avoid multiple-choice tests, as students can learn the wrong answer (Butler, 2018). Besides that, professionals should use scaffolding strategies when reintroducing content via retrieval practices, for example, by providing clues, such as the initials of the words the students are supposed to remember (cued-recall) (Lechuga et al., 2006), or by presenting images with little pixel distortion (Kliegl et al., 2018). Such strategies increase the probability of answering at least half of the questions correctly, which seems to be necessary to reveal the beneficial effect of this practice (Rowland, 2014). Other recommendations indicate that tests that carry little or no grading might benefit learning more than those that involve grading (Roediger, Agarwal, et al., 2011).
Similar to the review by Agarwal et al. (2021), most of the studies were conducted in the US. Therefore, it is still necessary to conduct research in other cultures with diverse populations (Henrich et al., 2010). Even though one of the inclusion criteria was that the articles were written in English or Portuguese, we found no studies in Portuguese. This is probably because publications are required to be internationalized, that is, published in English to extend the reach of the article.
The materials used in the studies were diverse, and it was possible to note benefits for vocabulary expansion, the learning of synonyms (Goossens, Camp, Verkoeijen & Tabbers, 2014), orthographic improvement (Jones et al., 2015), the learning of semantic information (Haebig et al., 2021), proper names (Fritz et al., 2007), texts in illustrated books (Cornell et al., 1988), chapters from books on science, history, and geography (Marsh et al., 2012), and the learning of first-aid skills (Haseneder et al., 2019). Such diversity is important because it shows that the effect of the retrieval practice can be observed with different materials, which makes it a highly useful strategy (Dunlosky et al., 2013).
As in other reviews (e.g. Agarwal et al., 2021), the most commonly used control condition was rereading. Here, our analysis also shows a replication of the benefit of retrieval practice. However, there are other ecologically valid strategies for the classroom in this age group, such as copying/rewriting (Rohrer et al., 2010; Goossens, Camp, Verkoeijen, Tabbers & Zwaan, 2014; Jones et al., 2015; Rowley & McCrudden, 2020), oral repetition (Fritz et al., 2007; Leonard & Deevy, 2020), and concept maps (Karpicke et al., 2014). Most of these comparisons point to a beneficial or neutral effect of the retrieval practice. Future studies should use other strategies that are widely used for this age group as control conditions, such as word-searching games, hidden vocabulary, and brainstorming.
Concerning the ideal placement of the retrieval practice in the intervening phase, it is possible to observe experiments that offer this practice before (Lima & Jaeger, 2020), during (Cornell et al., 1988), or after the introduction of content (Lipko-Speeda et al., 2014), with diverse lags, especially immediate ones (or almost immediate, from seconds to minutes). There is no consensus about the best moment for providing retrieval practice. For example, Üner and Roediger (2018) verified that retrieval practice after reading a whole chapter or after reading each section promotes equivalent benefits. However, one naturalistic study with undergraduate students showed that performing retrieval practice at the end of the class was more effective in promoting long-lasting learning than doing so at the beginning of the following class (Ekuni & Pompeia, 2020). At the same time, there are indications of greater benefit with shorter intervals between repetitions in experiments with children (Goossens et al., 2016). Smolen et al. (2016) explain that the ideal procedure involves not waiting too long after the initial introduction to the content to do the retrieval practice, because time makes it harder to reactivate memory.
There is no research comparing the same experiment in a real classroom environment and in the laboratory. However, concerning the setting, there are retrieval practice benefits in studies conducted at the participants' homes (e.g. Cornell et al., 1988), as well as in the laboratory (e.g. Haebig et al., 2021), and most importantly, at schools under ecologically valid conditions (e.g. Rowley & McCrudden, 2020). Thus, there is coherence with literature reviews that replicate the efficacy of retrieval practice strategies, not only in the laboratory (Moreira, Pinto, Starling & Jaeger, 2019).
Feedback was provided in most of the studies, although there is a positive effect of retrieval practice even without feedback (e.g. Moore et al., 2018). Nevertheless, it is important to emphasize that feedback helps children correct their mistakes and retain the correct information (Marsh et al., 2012). The inclusion of feedback reversed the negative effects of the retrieval practice when compared to rereading. The paradigms that include feedback do not measure only the participants' capacity to learn with the retrieval practice; they also measure their capacity to learn with the feedback itself (Fazio & Marsh, 2019). In other words, there is an enhancement in learning when feedback is included (Agarwal et al., 2012; Agarwal et al., 2018; Fazio & Agarwal, 2020). Some studies demonstrate more significant benefits when immediate feedback is provided (Kliegl et al., 2018).
Finally, concerning the durability of the retrieval benefit, most of the studies used short retention intervals (less than 30 days). We emphasize that more repetitions lead to better retention (Leonard et al., 2020; Spitzer, 1939), and in regard to long-lasting learning, this is a variable that must be taken into account. Performing several tests has learning benefits; however, a meta-analysis conducted by Rowland (2014) shows that the first test produces the effect of greatest magnitude, whereas the boost from the second repetition is not as robust as that from the first test. Future studies should test longer retention intervals for this age group.
Given the need to provide students and teachers with guidelines on the best way to study (Tullis & Maddox, 2020), it is worth noting that retrieval practice improves meta-cognition, that is, the capacity to recognize, or become aware of, one's own performance during the teaching and learning process (Agarwal et al., 2018; Rowland, 2014). This is a relevant factor in regard to the implementation of this practice from early childhood. Other advantages involve an increase in memory precision and less confusion (Moore et al., 2018), less test anxiety (Agarwal et al., 2014), and the promotion of self-confidence in preschoolers (Fritz et al., 2007).
Thus, the recommendation is that teachers can improve learning results by encouraging students to retrieve the previously introduced information. This applies to children with typical development and seems advisable for children with specific learning disorder (Haebig et al., 2021; Leonard & Deevy, 2020; Leonard et al., 2020), and even for children with brain injuries (Coyne et al., 2015).
However, even though the searches were carried out in a large number of databases (Journal Portals: CAPES/MEC, ERIC, PubMed, PsycINFO, Scielo, and Web of Science), the present systematic review has limitations regarding the results identified in the database search using the proposed keywords. It is not possible to guarantee that the searches identified all available papers on the theme. One example is the paper by Spitzer (1939). To mitigate this limitation, we also considered studies mentioned in other reviews. Nonetheless, some papers might not have been included in the present review. Another limitation is that the systematic literature review was carried out without a meta-analysis. Future studies might fill that gap, including unpublished studies to avoid publication bias.
In sum, the results of this research are consistent with the findings of systematic reviews with meta-analyses in adults, which indicated that retrieval practice is more beneficial for learning than commonly used strategies (such as rereading) (Adesope et al., 2017; Rowland, 2014). This effect is replicated both in the laboratory and in the classroom (Agarwal et al., 2021), across different educational levels, test formats, and procedures. Therefore, it is possible to improve the teaching and learning process by encouraging students to use the retrieval practice when they study from an early age, as suggested in the narrative review by Fazio and Marsh (2019).
Final Considerations
The present review focuses on studies involving retrieval practice with children, both in real classroom environments and in the laboratory. The results, with a high degree of replicability, indicate that retrieval practice is beneficial in the teaching and learning process for children from the age of 18 months, with a diversity of materials, and that it has durable effects. However, it is necessary to adapt the practice to the level of difficulty of the material and to provide clues so that retrieval is not too difficult. In addition, it is advisable to provide opportunities for retrieval practice immediately after the introduction of the content to avoid forgetting information. It is important to emphasize that retrieval practice can be enhanced by providing feedback. Thus, teachers in child education are provided with robust evidence to support their decision to include the retrieval practice more often in their pedagogical routines.