Narrative review and meta-analysis of MALL research on L2 skills

This study employed a narrative review and a meta-analysis to synthesize the literature on mobile-assisted language learning (MALL). Following a systematic retrieval of literature from 2008 to 2017, 17 studies with 22 effect sizes were included based on predetermined inclusion and exclusion criteria. By categorizing the characteristics of the studies retrieved, the narrative review revealed a detailed picture of MALL research in terms of the language aspects targeted, theoretical frameworks addressed, mobile technologies adopted, and multimedia components used. The qualitative review helped to contextualize and interpret the results found in the meta-analysis, which revealed a large effect for mobile technologies in language learning, identified three variables (i.e. type of activities, modality of delivery, and duration of treatment) that might influence the effectiveness of mobile technologies, and confirmed the existence of a redundancy effect and a novelty effect in MALL practice. Implications for future research and pedagogy are discussed.


Introduction
The advancement and sophistication of mobile technologies have created new opportunities for language learning, both inside and outside the classroom. For example, learning materials could be readily and effectively delivered to language learners via mobile devices (Thornton & Houser, 2005). Learners could engage in activities that are directly or indirectly related to language learning and interact/communicate with other people in the target language, which could enhance the utilization and retention of the newly acquired language knowledge (Duman, Orhon & Gedik, 2015). Furthermore, the affordances of mobile technologies could boost learners' interest and motivation in language learning, leading to their deeper engagement with learning resources and hence increased language proficiency (Golonka, Bowles, Frank, Richardson & Freynik, 2014). However, new technologies might also expose learners to incomprehensible input and inaccurate feedback, or distract them with innovative software or hardware, leading to an emphasis on technological means over pedagogical goals. In the past 20 years, a large amount of research has been conducted to examine the use of mobile technologies in language learning, and now a synthetic analysis of mobile-assisted language learning (MALL) is possible.
Previous syntheses of MALL research have mainly adopted a descriptive approach, documenting MALL studies' characteristics and research trends (Burston, 2014a; Chinnery, 2006; Duman et al., 2015; Kukulska-Hulme & Shield, 2008; Shadiev, Hwang & Huang, 2017). One exception is Burston (2015), who analyzed the effect of mobile technologies on learning outcomes. By setting high minimum requirements, such as a reasonable project duration and a number of participants adequate for statistical generalizability, Burston found positive results that evidenced a MALL application advantage. However, although this study confirmed the efficacy of MALL, it made no use of statistical analyses. Technically, it is still a narrative review of MALL, as noted by Plonsky and Ziegler (2016). More objective evidence for the effectiveness of mobile technologies in language learning has yet to be provided. Given that statistical analyses are essential for the quantitative evaluation of a treatment effect, the present study uses effect sizes to quantify MALL effectiveness.
To discuss MALL effectiveness, it is useful to first clarify what we mean by MALL. We adopted Kukulska-Hulme's (2020) definition of MALL as "the use of smartphones and other mobile technologies in language learning, especially in situations where portability and situated learning offer specific advantages" (p. 743). With the mobility of learners and learning, MALL makes it possible to deliver learning materials anytime and anywhere, to provide learning feedback just in time, to support learning in both formal and informal settings, to enhance individualized and collaborative learning, and to provide multimedia affordances for language learning (Burston, 2014b).
This study also takes account of two fundamental problems that may permeate MALL projects: inadequate research design and technocentricity, as suggested by Burston (2015). For one thing, the frequent use of pre- or quasi-experimental methods may fail to adequately reveal the impact of mobile devices on learning outcomes. Without a control group for objective comparison, differences found between the pre- and post-test results may not necessarily be attributable to the use of mobile devices (Tallent-Runnels, Thomas, Lan, Cooper, Ahern, Shaw & Liu, 2006). As for technocentricity, with too much focus on technological innovation, MALL researchers may pay scant attention to uncontrolled variables that might also influence the learning outcomes, such as the "wow factor" or novelty effect (Ma, 2017), modality of information delivery, and type of activities (Burston, 2015).
To summarize, the present study employs a meta-analysis to analyze the effectiveness of mobile technologies in language learning, by focusing on studies that featured a comparison between MALL-based treatments and non-MALL-based treatments. Although limitations are often noted about comparing MALL-based treatments with more traditional learning treatments, many studies conducted on MALL effectiveness are based on this comparison, so we believe it is useful and necessary to synthesize this line of research with a meta-analysis. In doing so, MALL-related variables like multimedia modality, technology-mediated activities, and the novelty effect are also examined. Finally, the meta-analysis is complemented with a narrative review, which categorizes the characteristics of MALL studies in terms of the language aspects targeted, theoretical frameworks addressed, technology type adopted, and multimedia components used. As noted by Boulton (2016), a quantitative analysis could provide essential insights, but it does not capture the whole picture. Instead, it needs to be complemented with qualitative analysis that is more thought provoking and heuristically rich. Therefore, the guiding question of the narrative review is: What are the characteristics of MALL research? Specifically,
1. What skills have commonly been investigated in MALL studies?
2. What theoretical frameworks have commonly been addressed in MALL studies?
3. What mobile technologies have commonly been adopted in MALL studies?
4. What multimedia components have commonly been used in MALL studies?
The questions that guided the meta-analysis are:
1. Is there any difference in the effects of MALL-based treatments and other treatments on second language (L2) skills?
2. What factors might affect the effectiveness of mobile technologies in L2 learning?

Study identification and retrieval
To conduct the synthesis, an exhaustive and replicable search procedure for relevant literature was carried out. First, two commonly used electronic databases in the fields of applied linguistics and education were searched: Linguistics and Language Behavior Abstracts (LLBA) and Education Resources Information Center (ERIC). We decided to include these two databases following previous meta-analyses on language learning in general (Li, 2010) and on computer-assisted language learning effectiveness in particular (Grgurović, Chapelle & Shelley, 2013). The keywords and combinations of keywords used were mobile-assisted language learning, mobile-assisted instruction, mobile-assisted pedagogy, mobile-assisted teaching, mobile-assisted assessment, mobile-based, m-learning, mobile devices, portable devices, traditional instruction, second language acquisition/learning/development, foreign language acquisition/learning/development.
Second, a manual search was performed in widely cited journals related to technology and applied linguistics, including Computer Assisted Language Learning (CALL), CALICO Journal, ReCALL, Language Learning and Technology (LL&T), System, and TESOL Quarterly. Next, the reference sections of the following published syntheses of MALL were also carefully examined: Burston (2014a), Burston (2015), Duman et al. (2015), Kukulska-Hulme & Shield (2008), and Shadiev et al. (2017).
In searching the primary studies, the "file-drawer" problem should be acknowledged. That is, some studies without significant findings might be tucked away in researchers' file cabinets due to the fact that significant results increase the likelihood of publication (Norris & Ortega, 2000). One way to alleviate the publication bias is to include unpublished PhD dissertations (Hunter & Schmidt, 2004; Lipsey & Wilson, 2001), for they are carefully designed and contain detailed information about research methodology and statistical analysis. However, although we collected relevant dissertations, none of them were included for further analysis. The reasons for this were (a) the inaccessibility of the full text of some relevant PhD studies, (b) the overlap between some dissertations and published journal articles, and (c) the mismatch with the focus of our synthesis. In view of the exclusion of PhD dissertations from the current study, the publication bias was further addressed with a funnel plot and a trim-and-fill analysis (Borenstein, Hedges, Higgins & Rothstein, 2011; Li, 2010).
The process described above led to a retrieval of 135 empirical studies of potential interest, among which vocabulary (37) was discussed most, followed by the four basic skills (i.e. reading, writing, listening, and speaking) (32), L2 learning in general (23), affective aspects (i.e. perception, beliefs, attitudes, motivation) (19), use of mobile technology (13), and other aspects (e.g. critical thinking, learner preparedness) (15). Note that different language aspects were sometimes examined in one study.

Inclusion and exclusion criteria
Explicit inclusion and exclusion criteria for the present synthesis were developed based on the research questions we posed.
A study was included if it met all of the following criteria:
1. The study was published between January 2008 and October 2017 (i.e. the end of the literature search). One reason for choosing the year 2008 as the starting point is that Apple launched the App Store with 500 apps that year, which brought along a new MALL research area: apps for language learning (Stockwell & Sotillo, 2011). Moreover, Burston (2015) noted that empirical MALL implementation projects were seldom carried out before the year 2008.
2. The study was published in English.
3. The study adopted mobile technologies to assist language learning.
4. The study targeted any of the four language skills (i.e. listening, reading, speaking, and writing). Our emphasis on language skills is not meant to diminish the importance of systematic research on learners' language features like vocabulary. We are particularly interested in exploring MALL effectiveness for learners' language use in communication. Ultimately, the goal of instructed language learning is to help learners communicate effectively, as emphasized in the Common European Framework of Reference for Languages (CEFR) and the American Council on the Teaching of Foreign Languages (ACTFL).
5. The study featured a comparison between MALL-based treatments and non-MALL-based treatments.
6. The study featured an empirical pre-test/post-test design and reported the mean difference between groups. In other words, Cohen's d was taken as a measure of effect size in this meta-analysis. Although it is possible to convert one effect size index to another (e.g. from r to d and vice versa), meta-analyses using different effect size measures are usually difficult to interpret (Lipsey & Wilson, 2001).
Studies were excluded for the following reasons:
1. The study did not address the effect of mobile technologies on L2 skills. Instead, it dealt with learners' motivation, perceptions, or other aspects of L2 learning.
2. The study measured L2 skills using self-report measures, or different L2 skills were measured with a unified test.
3. The report was a literature review or a theoretical piece without any empirical data (e.g. Burston, 2015).
4. The study adopted a pre-test/post-test design without group comparison.
5. The study did not provide sufficient data for effect size calculation (e.g. means, standard deviations, and sample sizes).
It is worth noting that when multiple skills were examined in a single study, each skill contributed an effect size to the meta-analysis. As a result, 17 studies were included, resulting in 22 effect sizes (https://figshare.com/s/8281d4373589517809e6). We wish to emphasize that the excluded reports are by no means less valuable than those included here. However, due to our particular focus, we were unable to include reports that did not meet the aforementioned requirements.

Coding
The coding protocol for this synthesis mainly consisted of three categories: source descriptors, substantive aspects, and methodological features (Lipsey & Wilson, 2001). In the following, the most relevant descriptors (e.g. independent, dependent, and moderator variables) that have been taken into account for the current study are discussed.
In terms of independent variables, the experimental group and comparison group in each study were coded as MALL-based treatment and non-MALL-based treatment respectively. Some studies had more than one experimental group or comparison group. For example, Saran, Seferoğlu and Çağiltay (2012) included a mobile-mediated group, a computer-assisted group, and a control group in their study. In such a case, one effect size was calculated for the comparison between the mobile-mediated group and the computer-assisted group, and another effect size was calculated for the comparison between the mobile-mediated group and the control group. Dependent variables were the four basic skills (i.e. reading, writing, listening, and speaking), which were indicated by learners' test scores reported in primary studies.
As for the moderator variables, in systematically coding the MALL studies, a number of potential moderators with a theoretical or empirical foundation were considered (e.g. learners' age, language proficiency, L1, target language). The coding results for these variables are presented in Table 1. Lastly, we decided to examine three variables in detail (i.e. type of activities, modality of delivery, and duration of treatment), because the information on these was complete in each study included.

Type of activities
Pedagogical practice involved in mobile language learning is performed either individually or collaboratively. When a task requires language learners to collaborate or negotiate, this results in the interweaving of language input, learners' internal capacities, and language output (Long, 1996), which enables learners to identify the gap between their production and the target forms and to monitor their own language (Gass & Mackey, 2015). Collaboration of this kind offers the learner both receptive and productive linguistic benefits. With the mediation of mobile technology, learner collaboration manifests itself either synchronously or asynchronously. Synchronous interaction relies on instant messaging in the form of audio, video, texts, or a combination (Awada, 2016). Conversely, asynchronous interaction does not involve real-time communication. It allows "revision and reediting of output or even task-breaking" (Wiemeyer & Zeaiter, 2015: 199), thus giving learners time to prepare and rehearse.
Collaboration aside, individual activities are also commonly seen in the MALL context (Cavus & Ibrahim, 2017). When required to undertake a mobile-mediated activity individually, learners are exposed to input or required to produce output. The beneficial role of input and output in learning a second language has long been established (Krashen, 1985; Swain, 1985), which is also the case for an input-output combination (Wen, 2018). However, it remains unknown how mobile technologies could be better integrated in individual learning activities to promote both input and output efficacy. Instead, most individual learning activities merely concern the delivery of content/materials/information. As implied by Burston (2013), this uncreative use of mobile technologies ignores the other distinctive features mobile technologies can offer, such as peer connectivity and advanced communication.
In light of the varied efficacy of different mobile-mediated activities, it is important to view type of activities as a moderator variable and examine its role in moderating the learning outcomes. In the present study, mobile-mediated activities were coded as individualized or collaborative, with the latter subcoded as synchronous or asynchronous.

Modality of delivery
The second moderator variable considered here is modality of delivery. Mobile technologies with multimedia capability provide learners with rich input in the form of text, audio, video, pictures, graphs, and so on, which increases the possibility of the input being efficiently processed, comprehended, and integrated (VanPatten, 2015). However, other researchers (e.g. Kalyuga & Sweller, 2014) were less convinced of the positive effect of input multimodality. Kalyuga and Sweller (2014) argued for the existence of a redundancy effect in multimedia learning. According to them, redundancy occurs when the same information is concurrently presented in multiple modes. Coordinating redundant information increases working memory load, which may interfere with rather than facilitate learning. Therefore, issues concerning the multimedia effect or the redundancy effect need further examination. In this meta-analysis, we coded the modality of information delivery as single (i.e. information provided in one mode), double (i.e. information delivered in two modes), and multiple (i.e. information offered in more than two modes).

Duration of treatment
The novelty effect, also known as the "wow factor" of mobile technology use, should also be acknowledged when evaluating the MALL effect (Stockwell & Hubbard, 2013). In their discussion of the potential impact of the novelty effect, Stockwell and Hubbard (2013) referred to Stockwell and Harrington's (2003) examination of 15 email messages sent by learners of Japanese to Japanese native speakers over five weeks. Stockwell and Harrington (2003) observed a pronounced "first-message effect," which refers to the phenomenon that the initial email was richer and more elaborate than the next. They attributed the phenomenon to the initial excitement of learners communicating via email. Chiu (2013) similarly found that learners' improvement is superior when the intervention or treatment is shorter than a month. The longer a treatment lasts, the less effective mobile technologies seem to be. One possible reason might be that a fatigue phenomenon is likely to appear among learners after their initial experience of novel technologies. Therefore, in the current analysis, the duration of MALL treatment was coded as within 4 weeks, 4-8 weeks, or above 8 weeks. This time division was based on Chiu (2013) and Chwo, Marek and Wu (2018).
In the coding process, the first author coded all articles twice, with 20% of them randomly selected and recoded by a research assistant. The intercoder reliability was 0.96. Disagreements were resolved through discussion and modifications were made accordingly.
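The three moderator codes described in this section can be pictured as simple categorization rules. The following is a minimal sketch only, not the coding instrument actually used: the function names and label strings are hypothetical, and the handling of boundary values (a treatment of exactly 4 or exactly 8 weeks) is an assumption, since the categories as stated do not specify it.

```python
def code_activity(collaborative: bool, real_time: bool = False) -> str:
    """Code the type of activity (labels are illustrative)."""
    if not collaborative:
        return "individualized"
    # Collaborative activities are subcoded by whether interaction is real-time.
    return "collaborative-synchronous" if real_time else "collaborative-asynchronous"


def code_modality(n_modes: int) -> str:
    """Code modality of delivery from the number of presentation modes."""
    if n_modes < 1:
        raise ValueError("a treatment must use at least one mode")
    if n_modes == 1:
        return "single"
    if n_modes == 2:
        return "double"
    return "multiple"  # more than two modes


def code_duration(weeks: float) -> str:
    """Code treatment duration in weeks.

    Assumption: exactly 4 weeks counts as 'within 4 weeks' and exactly
    8 weeks counts as '4-8 weeks'.
    """
    if weeks <= 0:
        raise ValueError("duration must be positive")
    if weeks <= 4:
        return "within 4 weeks"
    if weeks <= 8:
        return "4-8 weeks"
    return "above 8 weeks"
```

Making the rules explicit in this way is mainly useful for checking that two coders apply the same boundaries.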

Data analysis and methodological decisions
For the narrative review, we adopted a thematic analysis (i.e. a qualitative research method) to identify patterns in the data (Smith & Firth, 2011). The meta-analysis, however, required a quantitative measure: effect size. The effect size may take different forms, depending on the construct investigated. Given that this meta-analysis examined the relative effectiveness of two treatments (MALL-based vs. non-MALL-based), the effect size index here relates to mean differences, and hence Cohen's d was calculated in this study. Most second language acquisition (SLA) meta-analyses have followed Cohen's (1988) benchmarks in interpreting the magnitude of effect size: 0.80 for a large effect, 0.50 for a medium effect, and 0.20 for a small effect. The present meta-analysis similarly adopted this criterion.
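To make the effect size index concrete, the sketch below computes Cohen's d from group means, standard deviations, and sample sizes using the pooled standard deviation, and labels the result with Cohen's (1988) benchmarks. It is an illustration under stated assumptions rather than the procedure actually run: the function names are hypothetical, the use of post-test group statistics is assumed, and the "negligible" label for values below 0.20 is our own addition.

```python
import math


def cohens_d(mean_exp: float, sd_exp: float, n_exp: int,
             mean_ctl: float, sd_ctl: float, n_ctl: int) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (
        n_exp + n_ctl - 2
    )
    return (mean_exp - mean_ctl) / math.sqrt(pooled_var)


def interpret_d(d: float) -> str:
    """Label the magnitude of d with Cohen's (1988) benchmarks."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # assumption: label for values below the small benchmark


# Hypothetical groups: means 80 vs. 72, SD 10 in both, 30 learners each.
d = cohens_d(80, 10, 30, 72, 10, 30)
print(round(d, 2), interpret_d(d))
```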
A two-step procedure was applied in the meta-analysis: effect size aggregation followed by moderator analysis. The effect size aggregation generated an average effect size that indicated the overall effect of MALL treatment. The 95% confidence intervals (CIs) for the average effect size were also calculated. The moderator analysis (i.e. a heterogeneity analysis using the Q test) was conducted to identify the variables that could moderate the treatment effectiveness. All analyses were conducted in Comprehensive Meta-Analysis software (Rothstein, Sutton & Borenstein, 2006).
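The logic of the between-group Q test can be sketched as follows. This is a simplified illustration, not a reproduction of what the Comprehensive Meta-Analysis software computes: it assumes each subgroup is summarized by a mean effect size and its standard error, weights each subgroup by the inverse of its variance, and refers the weighted sum of squared deviations to a chi-square distribution with (number of groups - 1) degrees of freedom. The function names and the example values are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (closed form, integer df)."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite series.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def q_between(groups):
    """Between-group Q for a list of (mean effect size, standard error) pairs."""
    weights = [1.0 / se ** 2 for _, se in groups]
    grand_mean = sum(w * m for w, (m, _) in zip(weights, groups)) / sum(weights)
    q = sum(w * (m - grand_mean) ** 2 for w, (m, _) in zip(weights, groups))
    df = len(groups) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical subgroup summaries (not values from the included studies).
q, df, p = q_between([(1.0, 0.5), (0.0, 0.5)])
print(round(q, 2), df, round(p, 3))
```

A non-significant Q under this test indicates only that the subgroup means cannot be distinguished with the available studies, which is worth bearing in mind when the number of comparisons per subgroup is small.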
One issue meta-analysts face is sample size inflation. Sometimes more than one effect size is taken from a single study, which might lead to a bias towards the findings that support better learning outcomes or learning effects (Borenstein et al., 2011). To minimize this bias, a shifting unit of analysis was adopted (see Patall, Cooper & Robinson, 2008; Shintani, Li & Ellis, 2013). Initially, effect sizes were calculated separately for each study. When the effect sizes were aggregated to obtain an overall effect, each study contributed only one effect size, namely the average of multiple effect sizes calculated.
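The aggregation step of the shifting unit of analysis amounts to averaging the effect sizes that come from the same study before computing the overall effect. The sketch below illustrates this with a simple unweighted mean; the function name, the study labels, and the values are hypothetical, and whether a weighted or unweighted average was used in practice is not stated here.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_by_study(effect_sizes):
    """Collapse (study_id, d) pairs to one averaged effect size per study."""
    by_study = defaultdict(list)
    for study_id, d in effect_sizes:
        by_study[study_id].append(d)
    return {study_id: fmean(values) for study_id, values in by_study.items()}


# Hypothetical input: study "A" contributes two comparisons, study "B" one.
print(aggregate_by_study([("A", 0.4), ("A", 0.6), ("B", 1.0)]))
```

For moderator analyses the unit can shift back to the individual comparison, so that a study with two differently coded comparisons contributes to both subgroups.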
Another issue often subject to debate is the choice between a fixed-effects model and a random-effects model when aggregating the effect sizes (Borenstein et al., 2011). Given that participants and treatments of our primary studies varied in ways that could influence the learning outcomes, it is reasonable to assume that no single true effect size underlies all of the MALL studies, which favors a random-effects model. Another reason for us to choose the random-effects model was that the primary studies included here were sampled from published literature.
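For readers unfamiliar with random-effects aggregation, the sketch below shows one common estimator, the DerSimonian-Laird method, together with the usual large-sample variance of Cohen's d. It is offered as an illustration only: we assume this estimator for the sake of the example, the function names are hypothetical, and the software's exact settings are not reproduced.

```python
import math


def variance_of_d(d: float, n1: int, n2: int) -> float:
    """Approximate sampling variance of Cohen's d for two independent groups."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooled mean, standard error, tau^2, and 95% CI."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * di for wi, di in zip(w, effects)) / sum_w
    # Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (di - fixed_mean) ** 2 for wi, di in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * di for wi, di in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    return mean, se, tau2, ci


# Hypothetical effect sizes and variances (not data from the included studies).
mean, se, tau2, ci = random_effects_pool([0.2, 0.8], [0.04, 0.04])
print(round(mean, 2), round(se, 2), round(tau2, 2))
```

When the estimated between-study variance is zero, the weights reduce to the fixed-effects weights, so the two models coincide for homogeneous sets of studies.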

The narrative review
As mentioned, 17 MALL studies met the requirements for the qualitative and quantitative analyses. In the sections that follow, we present the results regarding characteristics of MALL research in terms of language aspects targeted, theoretical frameworks addressed, mobile technologies adopted, and multimedia components used.

Commonly investigated skills in the MALL studies
Generally, the four basic language skills (i.e. reading, listening, writing, and speaking) were evenly examined in the 17 studies selected for analysis (see Table 2). This might be due to the affordances of mobile technologies in helping learners practice and hence improve their receptive and productive skills (Wiemeyer & Zeaiter, 2015). For example, Ahn and Lee (2016) designed mobile-assisted activities that allowed learners to continuously and spontaneously access authentic learning resources and practice their speaking skill. With the maturity of voice/speech recognition technology, using mobile technologies to enhance L2 speaking has been avidly proposed (Duman et al., 2015), which was also mirrored in our review. We found that all the studies on speaking appeared from 2016 onward. The other three skills have frequently been investigated in the last decade, which may imply that MALL researchers constantly endeavor to innovate ways to enhance learners' language skills by integrating emerging mobile technologies and resources (Chapelle & Sauro, 2017).
In addition, the advancement of mobile technologies generates a strong sense of learning community among language learners, as noted by Kukulska-Hulme and Shield (2008). In this virtual community, both native speakers and language learners communicate with each other in the form of written or oral exchanges. In cases where classroom instruction focuses on vocabulary and grammar learning rather than the four basic skills, mobile technologies provide learners with opportunities to practice their listening, speaking, reading, and writing outside class (Chapelle & Sauro, 2017). These affordances have drawn many researchers who endeavor to improve language learning to the MALL field. This might be another reason why the four skills were evenly researched during the past decade.

Commonly addressed theoretical frameworks in the MALL studies
Nearly half (41%) of the MALL studies did not clarify any theoretical background, which corroborates Viberg and Grönlund's (2012) view that the MALL field is still in need of solid theoretical frameworks. Without mature frameworks to guide the discussion or interpretation of MALL results, the empirical evidence for MALL application could easily be ignored. The remaining 59% were related to varied theories (see Table 3), although most were previously established theories in language learning research: sociocultural theory (2), constructivism (2), learner-centered/self-regulated learning (2), situated learning (1), attention (2), output-driven/input-enabled model (1), and repeated reading strategy (1).
Moreover, these theories were not referenced explicitly. That is, authors of the theory-based studies (59%) presented a theory somehow related to their experiments, but did not return to it in their discussion. For instance, Lan and Lin (2016) based their study on a sociocultural perspective, accentuating the important role of context in enhancing learners' pragmatic competence. However, they did not touch upon other essential concepts of the theory such as mediation and the zone of proximal development, nor did they return to the theory when discussing their findings. It may be concluded that a lack of a clear connection between the theory and the experiment is common in MALL research. Additionally, we found one theory-generating study (Andujar, 2016), but no theory-testing study, which echoes the findings of Viberg and Grönlund's (2012) review. It seems that the research efforts in the period 2013-2017 have not markedly contributed to the theoretical framework of MALL research.
Although there is still a lack of clear theoretical foundation, innovative efforts have been made. For example, rather than focusing on mobile technologies, Hsieh, Wu and Marek (2017) turned to specific activities enabled by LINE, a smartphone app. Based on Wen's (2018) output-driven/input-enabled model, they designed a holistic oral training course that featured online written and oral interaction. Unlike the traditional focus of MALL research on innovative technologies, Hsieh et al. (2017) turned to "the way the technology was manipulated" (Burston, 2015: 16) and pushed the development of MALL in another direction by creating a tight link between MALL research and SLA research.
Our review also acknowledged a lack of field-specific theory for MALL research. The only exception is Andujar (2016). He introduced the framework for the rational analysis of mobile education (FRAME), which describes mobile learning as a process of mobile technologies converging with learners' learning and interaction. This model integrates technological characteristics of mobile devices and social and personal aspects of learning, attempting to distinguish the MALL field from other language learning areas. More research is needed to develop a MALL-specific theory (Viberg & Grönlund, 2012).

Commonly adopted mobile technologies in the MALL studies
Of the studies included, only one (Hwang et al., 2014) did not specify the mobile technology used. We categorized mobile technologies into three types: mobile devices, mobile apps, and MALL applications, following Grgurović et al.'s (2013) classification of technology. Mobile apps here refer to generic apps that are not specifically designed for language teaching and learning. The distinction between mobile apps and MALL applications was also made in consideration of Stockwell and Hubbard's (2013) attribution of the MALL novelty effect to the fact that mobile devices and various apps are primarily designed for personal and social communication purposes rather than directly for L2 learning.
As shown in Table 4, nearly half of the studies (8/17) used mobile apps to assist language learning, evidencing varied types of technology used for learning and calling for an expanded definition of mobile technology (Duman et al., 2015). Also, the rapid advances of mobile technologies influenced learners' choice of mobile devices. Previously, personal digital assistants (PDAs) were commonly used for language learning, which is no longer the case. The technological sophistication of mobile phones made PDAs completely redundant (Burston, 2014a). As a result, MALL has now essentially become synonymous with mobile phone applications (Burston, 2013).
Another trend is the emergence of varied MALL applications that make extensive use of pictures, videos, and audio. Generally, multimedia functionalities of mobile devices have only recently been exploited and more efforts are needed in this regard.

Commonly used multimedia components in the MALL studies
Among the MALL studies discussed here, a mix of multimedia components was found; that is, 13 studies (77%) used more than one type of media. The tendency of combining different media forms coincides with the advance of mobile technologies. The combination of media affords multiple modes of communication and interaction among language learners and hence markedly distinguishes MALL from the conventional paper-and-pen type of learning (Mills, 2011). Generally, audio and texts were used most frequently for content delivery. In our review, 13 studies (77%) used audio and 12 (70%) employed texts, followed by pictures (47%) and video (29%). Other forms were employed less frequently (see Table 5; for specific studies, please go to https://figshare.com/s/e58cfdea35aac491b378). Compared to the research that examined MALL implementation (Duman et al., 2015), the studies that targeted specific language skills tended to use multimedia in a more traditional manner, focusing on texts, pictures, audio, video, or a combination. Other forms like animation (Segaran, Ali & Hoe, 2014) and games (Burston, 2013) were neglected to some extent; as Burston (2013) noted, MALL research has been slow to exploit these multimedia functionalities.
Another issue worth noting is the concept of text in new literacy contexts. The concept of text becomes more ambiguous as a variety of media are mixed and integrated (Lankshear & Knobel, 2007), and it now includes diverse semiotic formats like visual, audio, and written (Park & Kim, 2016). In the current study, however, we categorized the multimedia components without further conceptualizing the terms. That is, the word text here was directly taken from the primary MALL studies.

The meta-analysis
This section reports the results concerning the effect of mobile technologies on L2 skills. Of the 17 studies that could reliably determine the effectiveness of mobile technologies, 12 appeared in 2016-2017, which echoes Burston's call for "statistically reliable measures of learning outcomes" (Burston, 2015: 16).
In order to ascertain whether our selection of primary studies was characterized by publication bias (that is, in order to check whether the selected studies all featured a large effect size), we first produced a funnel plot together with a trim-and-fill analysis. The analysis revealed one missing value on the left side of the plot, and imputing the missing value changed the mean effect size from 0.95, 95% CI [0.58, 1.30], to 0.83, 95% CI [0.45, 1.21]. Generally, the extracted effect sizes were symmetrically distributed, indicating that the retrieved studies were reliably representative (i.e. without publication bias).

Effectiveness of mobile technology on L2 skills
The overall results for the relative effects of mobile technologies on L2 skills compared to non-MALL-based treatments are presented in Table 6, which includes information concerning number of comparisons (k), mean effect size (and related p value, standard error, CI), and between-group Q test results. For L2 skills as a whole, learners exposed to mobile-mediated learning substantially outperformed those receiving other treatments like traditional classroom-based instruction (d = .95, p < .01). For the four specific skills, mobile technologies benefited writing (d = 1.33, p = .05) and listening (d = .99, p < .05) the most, with a large effect in each case. They are followed by reading (d = .52, p < .05) and speaking (d = .46, p < .05), where only small-to-medium effects were found. Basically, our findings quantitatively mirrored Burston (2015). That is, in a general sense, MALL research focusing on listening, reading, speaking, and writing all evidenced a MALL application advantage in skills related to each of these domains of communication. It should be noted that these effects may be related to specific subskills (e.g. pronunciation) that have been included in these four overarching skills.

Moderator analysis
Table 7 summarizes the results of our moderator analysis. With respect to the first moderator variable (i.e. type of activities), all activity types enabled by mobile technologies significantly improved L2 skills, with individualized practice showing the largest effect (d = 1.37, p < .01), followed by asynchronous interaction (d = 1.10, p < .05) and synchronous interaction (d = .64, p < .01). The smaller effect size for synchronous interaction, compared to the asynchronous mode, suggests that asynchronous interaction may be more effective for language learning than synchronous interaction, echoing the findings of Lin's (2014) meta-analysis on computer-mediated communication. The reason might be that offline communication gives learners more time to process the information and edit/re-edit their output, thus saving them from the embarrassment of making mistakes. Strikingly, it was individualized practice that exerted the largest effect on L2 learning, although learner collaboration has increasingly been accentuated in MALL research (Burston, 2014a; Duman et al., 2015). This finding suggests that language learning requires learners to take control of their learning and self-regulate their choice of learning activities, as proposed by Agbatogun (2014). However, no significant differences were found across the three types of activities (Q_b = 2.57, p = .27), which assures teachers and learners that both collaborative and individualized practice mediated by mobile technology can be effective.
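The between-group test above can be checked by hand. As a sketch, assuming the between-group Q statistic is referred to a chi-square distribution with (number of subgroups − 1) = 2 degrees of freedom, which is the conventional approach although the text does not state it, the upper-tail probability has the closed form exp(−Q/2). The function name is hypothetical.

```python
import math

def chi2_upper_tail_df2(q):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-q / 2), so no
    statistics library is needed.
    """
    return math.exp(-q / 2.0)

# Three activity types (individualized, asynchronous, synchronous) -> df = 2
p_value = chi2_upper_tail_df2(2.57)
print(f"p = {p_value:.3f}")
```

Under this assumption the result is about .277, in line with the reported p = .27 up to rounding of the Q statistic, and well above the .05 threshold.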
One point worthy of further exploration is the distinction between cooperative and collaborative learning activities, both of which were labeled as collaborative in our meta-analysis. In some cases, however, learners may be able to learn individually through mobile devices in ways that allow them to interact with other people without collaboration.
Analysis of the second moderator revealed that the effects of delivery modality differed according to how multimedia components were combined. It seems that providing one source of information benefited L2 skills most (d = 1.85, p < .05), followed by the combination of two sources (d = 1.26, p < .05). This finding is inconsistent with Yun's (2011) meta-analysis, which found a larger effect for visual-text combination than text only. Yun argued that learners provided with multimodal learning opportunities are more likely to adapt to their individual learning styles and strategies. The reason behind this inconsistency might be that Yun only compared single modality with double modality, both of which exerted large effects on L2 learning outcomes as evidenced in our analysis. When more media components are presented, the learning effect deteriorates, which may corroborate the redundancy effect in multimedia learning (Kalyuga & Sweller, 2014). As evidenced in our study, only a medium effect was found for multimodality (d = .59, p < .01). It appears that when several sources of input are simultaneously presented to learners, they need to coordinate and integrate these sources, which may generate heavy demands on working memory and waste cognitive resources on unnecessary information. However, this finding should be interpreted with caution, as it is not certain that, in the studies included here, the information provided in two or more modes was the same. Kalyuga and Sweller (2014) noted that, when the same information is presented in multiple forms, learners' cognitive resources might be diverted to unnecessary information, leading to their neglect of upcoming information and ineffective learning. We previously mentioned that empirical MALL research is slow to exploit these multimedia functionalities. In the future, it might be better for researchers to explore how to better implement the multimedia components in language learning, rather than focusing on delivering the same learning contents in multimedia modes.
Similar to the activity type and the delivery modality, the final moderator variable that we will discuss (i.e. treatment duration) also witnessed varied effects. Basically, mobile-mediated learning seems most effective when treatment is shorter than 4 weeks (d = 2.61, p = .22). But this should be interpreted with caution, for the probability value is larger than .05. This might be due to the small sample size (k = 2). Moreover, we found a trend that the longer the MALL implementation lasts, the less effective mobile technologies are. When a MALL project lasted longer than 8 weeks (d = .59, p < .01), the technologies only exerted a medium effect on L2 learning outcomes. One possible reason for the deterioration of the MALL effect is the fact that mobile devices and various applications are primarily designed for personal and social communication rather than L2 learning (Stockwell & Hubbard, 2013). As noted by Stockwell and Hubbard (2013), learners may be initially interested and engaged in using these technologies for their L2 learning, but their enthusiasm will decrease over time, along with their engagement. A related reason might be that L2 learners will experience boredom after their initial engagement with mobile technologies, which might further impede their L2 learning. This finding implies that in the future MALL researchers should endeavor to find ways to motivate and engage the learner so as to fully utilize authentic language input and online resources that the learner is exposed to. To do this, research efforts should focus either on examining how learners use the same MALL app over weeks and pinpointing factors that moderate MALL effectiveness, or on conducting longitudinal research to examine the process of language development that is mediated by the use of various technologies and resources. It should also be noted that, although we discussed the duration of treatment as a moderator here, it is still unclear whether the MALL treatment in the studies examined was continuously applied. In some cases, the duration of a project might not be equivalent to the duration of a treatment. Future research could look closely at what exactly happened during the treatment phases and gain some insight, as suggested by one of our reviewers.

Conclusion and implications
The meta-analysis provided an empirically based answer confirming that pedagogy supported by mobile technology can be effective in enhancing second language learning relative to pedagogy implemented in more traditional ways. Our results yielded a large effect size (d = .94) for mobile technology applications on language learning. Specifically, the four global language skills (i.e. listening, reading, speaking, and writing) all witnessed a medium to large effect of MALL in the studies analysed for this paper. The effectiveness of mobile technologies was also found to differ according to learning conditions. Our moderator analysis conducted on type of activities, modality of delivery, and duration of treatment suggests the existence of a novelty effect and a redundancy effect in MALL implementation. Specifically, short-term interventions (within 4 weeks) produced larger learning effects than longer interventions (including 4-8 weeks and above 8 weeks), and pedagogy with multiple media components seems less effective than that with single or dual medium support. Furthermore, we also found a larger effect for individualized mobile learning practices than collaborative ones. These findings offer insights and hold implications for future MALL research and practice. First, although multimedia affordances are increasingly emphasized in the MALL field, the larger effect we found for information delivered in one or two modalities implies that, instead of accentuating the provision of learning contents in multimedia modes, future MALL research should focus on exploring ways to manipulate the technology, integrate the learning material, and boost its learning potential. Second, both short- and long-term treatments merit attention in terms of research methodology and teaching practices. For example, technology-novelty effects should be noted when designing a short-term study with an intervention mediated by mobile technology, for the effect of the intervention might be confounded by the novelty effect (Stockwell & Hubbard, 2013). Meanwhile, teachers may consider diversifying the MALL practice and taking advantage of the technology-novelty effect to engage learners in effective utilization. As for longer interventions where learners' enthusiasm deteriorates, logistical support (e.g. elaboration on learning materials, guidance on technology use, and options for learning activities) is expected from teachers. Last, the large effect we found for individualized practice in the moderator analysis warrants a focus on learners' individual learning needs, learning pace, and preferred way of using technologies in the design of learning activities.
Although our study has acknowledged the efficacy of mobile technologies in language learning, some limitations of this meta-analysis should also be noted. For example, there is a scarcity of rigorously designed MALL research on communicative skills (e.g. listening, speaking), with only 17 studies included in this synthesis. Similar to the issue of limited empirical research on skills, the number of theory-driven studies is also limited. Solid frameworks are needed to establish a link between (language) learning theories and mobile-mediated learning activities, thus yielding more robust conclusions related to empirical MALL research. Additionally, although much of the research included in the present study suggests positive effects, studies vary along a variety of dimensions, such as modalities, outcomes, and both the size and the nature of the samples they include. Such variations should be noted in future research (including meta-analyses), and innovative methods are needed to examine MALL effectiveness in relation to these variations. A related limitation of this meta-analysis is that sometimes different skills and competences are researched under the same umbrella term. For example, a "speaking" study included in this meta-analysis could be concerned only with pronunciation, whereas another study could be looking at students' ability to describe a picture. Finally, it appears that few effectiveness studies documented learner-initiated learning practice mediated by technology. As Levy (2015) proposed, digging deeper into learner-initiated learning experience could contextualize the quantitative results found in experimental studies. Given the efficacy of mobile technology for individualized learning, future MALL research may consider proceeding in this direction. For instance, it would be interesting to examine how individuals engage in different learning experiences mediated by mobile technology, and how such experiences contribute to different language developmental paths over time. Innovative methods such as longitudinal clustering techniques might be employed to deal with this issue (see Scholz & Schulze, 2017).

About the authors
Hongying Peng is a PhD student in applied linguistics at the University of Groningen. Her current research interests include the use of mobile technologies in language learning and methodological issues related to complex dynamic systems theory.
Sake Jager holds a PhD in applied linguistics from the University of Groningen. He is assistant professor in applied linguistics and project manager at the Centre for Learning Innovation and Quality, Faculty of Arts, University of Groningen. He specialises in computer-assisted language learning (CALL) with a research focus on the integration of CALL in institutional environments.
Wander Lowie holds a PhD in linguistics from the University of Groningen and is chair of applied linguistics at this university. He is a research associate of the University of the Free State in South Africa and is associate editor of The Modern Language Journal. His main research interest lies in the application of dynamic systems theory to second language development (learning and teaching). He has published more than 50 articles and book chapters and (co-)authored five books in the field of applied linguistics.

Table 1. Methodological features of the MALL studies

Table 2. Distribution of commonly investigated skills in the MALL studies

Table 3. Distribution of commonly addressed theoretical frameworks in the MALL studies

Table 4. Distribution of commonly adopted mobile technologies in the MALL studies

Table 5. Distribution of commonly used multimedia components in the MALL studies

Table 6. Specific results: Comparative effects of MALL on L2 skills

Table 7. Moderator analysis
Note. Synchronous and asynchronous activities both belong to the collaborative type of activities. Statistically significant at *p < .05, **p < .01.