Yes, DI did it: the impact of Direct Instruction on literacy outcomes for Very Remote Indigenous schools

Abstract
In the journal article Did DI do it? The impact of a programme designed to improve literacy for Aboriginal and Torres Strait Islander students in remote schools, Guenther and Osborne (2020) compare schoolwide NAPLAN reading scale scores for 25 Very Remote Indigenous schools implementing Direct Instruction through the Flexible Literacy for Remote Primary Schools Program (‘Flexible Literacy’ or ‘the program’) with those for 118 Very Remote Indigenous schools not involved with the program, and assert that the program has not improved literacy outcomes. Good to Great Schools Australia (GGSA) undertook an analysis of the same school data for Reading, Writing, Spelling and Grammar and Punctuation scores. Our findings contradict theirs. In all areas, schools participating in the program show significant growth compared with all Australian and all Very Remote Indigenous schools. In Reading, schools involved in the program from 2015 to 2017 averaged 124% growth, while the average growth for comparable ages was 19% and 34% for Australian and Very Remote Indigenous schools, respectively. In Grammar and Punctuation, schools involved in the program in the same period grew 180%, whilst growth for Australian schools was 15%, and for Very Remote Indigenous schools, 28%. These contrasting results illustrate the importance of evaluating growth to assess the impact of educational programs, rather than achievement alone, particularly in the case of Very Remote Indigenous schools where achievement levels are far below Australian grade levels. Guenther and Osborne's comparison of achievement across schools rather than measuring growth within schools obscures real gains and is misleading.


Background
In 2014, the Australian Government sought tenders for the implementation of its Flexible Literacy for Remote Primary Schools Program ('Flexible Literacy' or 'the program') in schools in remote Indigenous communities, and also in some schools in regional centres with Indigenous students. The remote schools that are the focus of this paper are hereafter called 'Very Remote Indigenous schools'.
The program had two objectives: (1) increase teacher pedagogical skills in teaching literacy through the use of 'alphabetic teaching approaches', in particular Direct Instruction or Explicit Instruction; and (2) improve literacy results for students in participating schools (Commonwealth of Australia, 2013).
Good to Great Schools Australia (GGSA) was awarded the tender and offered two literacy approaches: Direct Instruction and Explicit Direct Instruction. The choice of literacy approach was made either by the relevant school system or individual schools.
The program commenced in schools in Queensland, Western Australia and the Northern Territory at the beginning of the 2015 school year.
Direct Instruction was developed by Engelmann and combines direct instruction pedagogy with a carefully sequenced curriculum set out in scripted lessons that include regular student assessment (Adams and Engelmann, 1996; GGSA, 2014). Students are grouped in learning levels rather than age or year levels. Lessons are highly structured and students are taught to mastery. Regular mastery tests are administered (every 5-10 lessons depending on the unit) and 90% proficiency is required before moving to the next set of lessons. Direct Instruction is one of the few programs whose evidence of effectiveness is included in Hattie's (2009, p. 73) original 138 interventions in Visible Learning (see also Stockard et al., 2020). Hattie (2009, p. 205) attributed an effect size of 0.59 to Direct Instruction, commenting: … 'Direct Instruction' method as first outlined by Adams and Engelmann (1996). Direct Instruction has a bad name for the wrong reasons, especially when it is confused with didactic teaching, as the underlying principles of Direct Instruction place it among the most successful outcomes.
Explicit Direct Instruction, developed by Hollingsworth and Ybarra, employs direct instruction pedagogy and may be said to be derivative of Direct Instruction (Rosenshine and Stevens, 1986; Hollingsworth and Ybarra, 2013; GGSA, 2014, p. 27). Unlike Direct Instruction, Explicit Direct Instruction does not use a teaching script and is delivered to year level classes rather than learning levels. GGSA worked with Hollingsworth and Ybarra to develop a custom-designed P-6 literacy curriculum, providing schools with a comprehensive set of ready-to-teach lessons aligned to the Australian Curriculum.
The program was delivered for 3 years from 2015 to 2017. Not all schools commenced at the same time. A number of schools left at various times, either as a result of school choice or school system policy change. The most significant departures came after the 2016 election, when the new Labor government in the Northern Territory enabled schools to leave the Flexible Literacy program and take up the Northern Territory's home-grown literacy approach, Literacy and Numeracy Essentials (LANE). School-level departures usually followed changes in school leadership (GGSA, 2017a). School leadership turnover was very high during the 3 years of Flexible Literacy, with some schools turning over four principals during 2 years of implementation (GGSA, 2017a). It was not uncommon for principals to change in the middle of the school year.
Some schools switched their learning approach during the program, the most significant being Very Remote Indigenous Catholic schools in Western Australia that shifted from Catholic Education's initial preference for Explicit Direct Instruction to Direct Instruction (GGSA, 2017a).
The funded program ended at the end of 2017. It was extended for 15 schools in 2018, 13 schools in 2019 and 8 schools in 2020. Guenther and Osborne (2020, p. 1) reported on their analysis of NAPLAN results for 25 Flexible Literacy schools falling into the definition of Very Remote Indigenous schools, compared to other remote schools which they call 'non-participating primary schools'. To their rhetorical question 'Did DI do it?', referring to whether Direct Instruction achieved the objective of improving literacy results of the Very Remote Indigenous schools, they answer an emphatic 'No'.
Although Flexible Literacy consists of two distinct instructional approaches, Guenther and Osborne confined their analysis to Direct Instruction, so we have also only considered schools utilising Direct Instruction in this analysis.
Our analysis comes up with a different answer to Guenther and Osborne's question. The answer is: 'Yes, DI did it!' We analyse the same schools using the same NAPLAN data with a growth analysis and arrive at a completely different result. Our analysis measures literacy outcomes over time (from 2015 to 2017, and from 2017 to 2019) rather than a snapshot in time of their NAPLAN scores compared to other schools.

Measuring achievement versus measuring growth
Educators familiar with schools where learning achievement gaps are profound, extending to years of learning, will be well aware that it is growth that demonstrates the effectiveness of interventions. Goss (2018) reports that '[t]he average year nine Indigenous student in a very remote area scores about the same in NAPLAN reading as the average year three non-Indigenous city student, and significantly lower in writing'. When students in such schools are way behind the national Mean Scale Scores, it will take a long time, and more than just literacy interventions, for improvements to be reflected in comparative achievement levels.
According to the Grattan Institute: Australia puts too much emphasis on student achievement at a point in time, and not enough on students' progress over the course of their schooling … Progress tells us more about the contribution schools make to student growth. The best schools in Australia are not those with the highest NAPLAN scores. The best schools are those that enable their students to make the greatest progress in learning (Goss and Sonnemann, 2018, p. 8).
This overreliance on snapshot measures of achievement has been met with persistent criticism from voices across the educational research field, with many highlighting the inappropriateness of using these measures to make judgments about educational effectiveness (e.g., Betebenner, 2009; Reynolds et al., 2014).
Research has called attention to the fact that accountability systems built around proficiency counts rather than achievement growth may not help students who are currently far above or far below standards (Buzick and Laitusis, 2010; Neal and Schanzenbach, 2010). Student growth is the measure of NAPLAN results from one point in time to the next, for example from Year 3 to Year 5. Growth shows how much students learn as they move through their schooling. Achievement, on the other hand, refers to NAPLAN results measured against nationally agreed proficiency scales at a point in time. Achievement trends show year level results over time (Goss and Sonnemann, 2018, p. 8).
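The distinction between cohort growth and an achievement trend can be sketched in a few lines of Python. This is an illustration only: the scores are invented (they are not NAPLAN results), and the function names `cohort_growth` and `achievement_change` are hypothetical labels for the two kinds of measure described above.

```python
from statistics import mean

# Hypothetical school mean scale scores keyed by (calendar_year, year_level).
# These numbers are illustrative only and are not NAPLAN results.
scores = {
    (2013, 3): 250.0, (2013, 5): 340.0,
    (2015, 3): 200.0, (2015, 5): 330.0,
    (2017, 3): 210.0, (2017, 5): 320.0,
}

def cohort_growth(scores, start_year):
    """Points gained by one cohort between Year 3 and Year 5, two years later."""
    return scores[(start_year + 2, 5)] - scores[(start_year, 3)]

def achievement_change(scores, year_level, before_years, after_years):
    """Change in the average score for one year level between two sets of years."""
    before = mean(scores[(y, year_level)] for y in before_years)
    after = mean(scores[(y, year_level)] for y in after_years)
    return after - before

# Growth follows the same cohort; achievement compares different cohorts.
print(cohort_growth(scores, 2015))                        # 120.0
print(achievement_change(scores, 5, [2013], [2015, 2017]))  # -15.0
```

With these invented numbers, the Year 5 achievement average falls while the 2015 Year 3 cohort still gains 120 points by Year 5, which is the kind of divergence between the two measures that the passage describes.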
Growth measures tell us more about the value schools add to student outcomes, because they indicate what learning takes place in the classroom, while achievement measures are more likely to reflect the influence of student backgrounds and the socioeconomic background of the school as indicated by the Index of Community Socio-Educational Advantage (ICSEA).
It is for this reason the Review to Achieve Educational Excellence in Australian Schools recommended a goal to 'deliver at least one year's growth in learning for every student every year' (Gonski et al., 2018, p. x).

Purpose of current study
Guenther and Osborne examined NAPLAN data to evaluate the impact of Direct Instruction delivered through Flexible Literacy to 25 Very Remote Indigenous schools using an achievement measure. They conclude Direct Instruction had no impact.
Using the same NAPLAN data, we evaluate academic growth in the same Very Remote Indigenous schools examined by Guenther and Osborne. We compiled three datasets.
First, we replicate Guenther and Osborne's dataset as closely as possible. This dataset is not accurate for reasons we explain below.
Second, we look at a dataset that accurately reflects schools that delivered Direct Instruction.
Third, we look at a dataset of literacy outcomes for schools that exited the program, to determine growth in those schools during the time they were involved in the program, and what happened after they exited.

Guenther and Osborne's methodology
We first address certain methodological issues with, and inconsistencies in, Guenther and Osborne's analysis. 1. As Buckingham (2020) writes in her refutation of Guenther and Osborne's article: [The] findings were uncritically welcomed and promoted by commentators and academics who are opposed to Direct Instruction and Explicit Direct Instruction. But do they have any validity?
In a word, no. The analysis has a number of important weaknesses and one basic fatal flaw: the time period studied.
The analysis compares the average NAPLAN reading scores for a 'pre-intervention' period (2012-2014) with the average NAPLAN reading scores for a 'post-intervention' period (2015-2017). The problem with this should be obvious straight away: the so-called 'post-intervention' period is not post-intervention at all.
The FLRPS program was announced in 2014 and the first full year of implementation was 2015, starting with 33 schools, and increasing to 34 schools at the end of 2017. Therefore, the 'post-intervention' data in the Guenther and Osborne study were actually collected in the first year (in fact the first four months for the 2015 data) and the two subsequent years of the intervention. When the final data set was collected in NAPLAN 2017, the program still had six months left to run.
2. Guenther and Osborne selected their sample from a list of schools found on the Australian Government website, to determine their dataset (Australian Government, 2020). Of the original list of 36 schools, they disregarded six schools listed as teaching Explicit Direct Instruction. They then disregarded a further five schools because these were either not Very Remote, not predominantly Indigenous (defined as 80% or more Indigenous students) or for which NAPLAN data were not available. This gave Guenther and Osborne a sample size of 25 schools.
One problem with this sample is that three of the schools did not start until 2016 and had been in the program for only 16 weeks when they sat NAPLAN in the second year of the program. For these schools, the classification of the 2015 NAPLAN results as post-intervention is inaccurate, although this was beyond the knowledge of Guenther and Osborne (Appendix 1).
Furthermore, five of the 25 schools selected by Guenther and Osborne started with Explicit Direct Instruction in 2015 and only changed to Direct Instruction between 12 and 24 months after. These were Very Remote Indigenous Catholic schools in Western Australia that commenced with Explicit Direct Instruction as the preference of Catholic Education, but switched to Direct Instruction on their own initiative and on advice from GGSA. Essentially, the Catholic schools in regional centres fared well with Explicit Direct Instruction whereas the Catholic schools in remote communities found Direct Instruction more compatible with their circumstances (GGSA, 2017a).
Therefore, 17 of the 25 schools included in the sample used by Guenther and Osborne were relevant schools fully implementing Direct Instruction at the time of study, and eight were not. These eight comprise almost one-third of the sample.
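The sample accounting above can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative, and it assumes the three late-starting schools and the five schools that switched from Explicit Direct Instruction do not overlap, which the 17 and eight split implies.

```python
listed_schools = 36    # schools on the Australian Government list
explicit_di = 6        # excluded: listed as teaching Explicit Direct Instruction
other_exclusions = 5   # excluded: not Very Remote, not predominantly Indigenous,
                       # or no NAPLAN data available

sample = listed_schools - explicit_di - other_exclusions

late_starters = 3      # did not start until 2016
switched_from_edi = 5  # began with Explicit Direct Instruction in 2015

# Assumption: the two groups are disjoint.
not_fully_di = late_starters + switched_from_edi
fully_di = sample - not_fully_di
share_not_fully_di = not_fully_di / sample

print(sample, fully_di, not_fully_di)   # 25 17 8
print(f"{share_not_fully_di:.0%}")      # 32%
```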

Our methodology and results
In the following section, we present GGSA's analysis of NAPLAN results to determine the impact on literacy outcomes for Very Remote Indigenous schools utilising Direct Instruction in the Flexible Literacy program from 2015 to 2019. We apply a growth rate measure that compares a cohort's Year 3 results with its Year 5 results 2 years later. There are no available data for individual students, so we assume the same cohort of students remains in the analysed schools from Year 3 to Year 5. This is not ideal, as student transience between schools is a factor; however, the assumption that core student cohorts persist between these NAPLAN tests is reasonable. Year 3 students in 2015 can be assumed to be the same students in Year 5 in 2017. The same assumption applies to Year 3 students in 2016 being the same students in Year 5 in 2018. It is common for NAPLAN studies to use school-level data rather than individual student data. Guenther and Osborne use school-level data.
However, there are limitations to NAPLAN data in remote school contexts including incomplete data and issues related to small school size.
Guenther and Osborne compare results for their 25 sample schools with their 118 non-intervention schools. The 118 schools are not identifiable so we are unable to replicate their control group. Instead, we use two benchmarks: Australian schools as a whole and Very Remote Indigenous schools as a whole. It is not altogether clear why Guenther and Osborne chose these 118 schools for comparison rather than a more readily ascertainable control group.
Our three research questions were as follows: 1. What is the difference in achievement growth between the 25 schools reported by Guenther and Osborne in comparison to Australian schools and Very Remote Indigenous schools? To compare like for like to determine the results of using a growth rather than an achievement measure, we analysed NAPLAN results for the same 25 schools that we assume Guenther and Osborne used, even though only 17 of the 25 were actually delivering Direct Instruction between 2015 and 2017, as explained above.
In their analysis, Guenther and Osborne use 2012-2014 as 'before the program' and 2015-2017 as 'during the program'. We were unable to replicate the exact time period used by Guenther and Osborne because the earliest NAPLAN data currently available on MySchool are from 2014. Instead, we use data for the same 25 schools in the Guenther and Osborne sample to determine growth from Year 3 in 2015 to Year 5 in 2017. The formula for the growth rate measure is: $\text{Growth rate}\,(\text{Year 5}_{\text{year}\,i+2} - \text{Year 3}_{\text{year}\,i}) = \frac{1}{n}\sum_{k=1}^{n} \ldots$ Given that the first full year of implementation was 2015, this timeframe offers a more suitable representation of growth before and after program implementation. Using a growth measure, these 25 schools show significantly better growth in all test areas (Reading, Writing, Spelling, and Grammar and Punctuation) in comparison to the growth of all Australian schools and Very Remote Indigenous schools overall. Figure 1 represents the growth in all NAPLAN literacy test areas, from Year 3 2015 to Year 5 2017.
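A growth rate of this kind can be sketched in Python. The term inside the sum is not recoverable from the text here, so the sketch assumes it is each school's relative change in mean scale score from Year 3 to Year 5, averaged over the n schools. That summand, the function name `growth_rate` and the example scores are all assumptions made for illustration, not the authors' confirmed method or data.

```python
def growth_rate(year3_scores, year5_scores):
    """Mean relative change from Year 3 to Year 5 across schools.

    Assumed summand: (Year 5 score - Year 3 score) / Year 3 score per school.
    """
    if len(year3_scores) != len(year5_scores) or not year3_scores:
        raise ValueError("need one Year 3 and one Year 5 score per school")
    changes = [(y5 - y3) / y3 for y3, y5 in zip(year3_scores, year5_scores)]
    return sum(changes) / len(changes)

# Hypothetical school mean scale scores for two schools (not NAPLAN data).
year3_2015 = [200.0, 250.0]
year5_2017 = [400.0, 375.0]

rate = growth_rate(year3_2015, year5_2017)
print(f"{rate:.0%}")  # 75%
```

Here the two schools grow by 100% and 50% respectively, so the averaged growth rate is 75%.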
Guenther and Osborne did not look at other test areas, instead using Reading as a proxy for literacy. In relation to Year 3 Reading, these authors found that 'for the DI schools the average NAPLAN scores declined by 23.43 points while for the non-intervention schools the results increased by 4.47 points' (Guenther and Osborne, 2020, p. 5). In relation to Year 5 Reading, they found 'for the DI schools the average NAPLAN scores declined by 19.48 points while for the non-intervention schools the results declined by 15.12 points' (Guenther and Osborne, 2020, p. 5).
Our growth analysis tells a very different story. Reading shows 89% growth for program schools, while growth is 19% for Australian schools and 34% for Very Remote Indigenous schools.
Program schools made the greatest growth in Grammar and Punctuation: 131% from 2015 to 2017. Overall Australian schools growth for Grammar and Punctuation is 15% and for Very Remote Indigenous schools, 2%.
Spelling growth for program schools is 67%, whilst growth is 23% for Australian schools and 48% for Very Remote Indigenous schools. Writing growth for program schools is 42%, whilst growth is 13% for Australian schools and 24% for Very Remote Indigenous schools.
For the second dataset, the schools that delivered Direct Instruction, the highest growth in both periods is in Grammar and Punctuation: 180% from 2015 to 2017 and 85% from 2016 to 2018. Growth is 15% for Australian schools and 21% for Very Remote Indigenous schools.
Reading also shows a high impact, with 124% growth from 2015 to 2017, while growth is 19% for Australian schools and 34% for Very Remote Indigenous schools. During 2016-2018, Reading growth is 50%, whilst growth for Australian and Very Remote Indigenous schools is 13% and 24%, respectively.
Spelling shows 67% growth from 2015 to 2017 and 60% from 2016 to 2018. In both periods, the growth was higher than the Australian and Very Remote Indigenous school averages.
Writing shows 50% growth from 2015 to 2017, compared with 13% and 24% for Australian and Very Remote Indigenous schools, respectively. For 2016-2018, growth is 25%, whilst growth for Australian and Very Remote Indigenous schools is 10% and 9%, respectively.
3. What is the impact on literacy outcomes of schools exiting the Flexible Literacy program? At the commencement of Flexible Literacy in 2015, there were 20 schools utilising Direct Instruction, 10 Indigenous schools and three mainstream schools utilising Explicit Direct Instruction. In 2016, a further five schools joined the program. During 2017 and 2018, a number of schools exited. This was sometimes due to a change in school leadership, but the change of government in the Northern Territory in August 2016 also produced a policy change that disfavoured the program. The program was extended in 2018 for 15 schools and 12 schools continued to utilise Direct Instruction. Figure 3 shows the results for the 14 schools that exited Flexible Literacy, comparing their growth between 2015 and 2017 (when they were still in the program) with their growth between 2017 and 2019 (after they exited).

Noel Pearson
In Reading, while the schools were participating in Flexible Literacy from 2015 to 2017, growth was 178%; after they left the program, it dropped to 96% between 2017 and 2019. Similarly, in Spelling and Writing, growth after the schools left Flexible Literacy dropped from 69% to 45% and from 51% to 35%, respectively, during the periods under analysis.
The analysis shows that the 14 schools that discontinued Flexible Literacy lost the opportunity to achieve higher literacy growth.
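The size of the drop in each test area can be worked out directly from the growth rates reported above. The sketch below uses only those reported figures; the dictionary layout and variable names are illustrative.

```python
# Growth rates (%) reported in the text for the 14 schools that exited:
# (while in the program, 2015 to 2017; after leaving, 2017 to 2019)
reported = {
    "Reading": (178, 96),
    "Spelling": (69, 45),
    "Writing": (51, 35),
}

# Percentage-point change in growth after leaving the program.
change = {area: after - during for area, (during, after) in reported.items()}

for area, points in change.items():
    print(f"{area}: {points} percentage points")
```

On these figures, growth fell by 82 percentage points in Reading, 24 in Spelling and 16 in Writing.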
These results contrast with Guenther and Osborne's allegation that Flexible Literacy schools had better results before the program. They write: However, the lower post-intervention results for DI school NAPLAN scores should be of some concern as they suggest that the intervention has a potential to be associated with educational harm to at least some students (Guenther and Osborne, 2020, p. 5).
As the data we are presenting here makes clear, this conclusion is startling. In light of these results, the real educational harm comes from the flawed methodology adopted by Guenther and Osborne to misrepresent the real growth experienced by schools utilising Direct Instruction, and their deteriorating performance after they discontinued.
It is a disservice to these schools for results like Guenther and Osborne's to go unchallenged when they conclude wrongly that a proven program with the evidence base accumulated by Hattie had no positive effect on the literacy growth of students from Very Remote Indigenous schools utilising Direct Instruction. Our analysis here shows rates of literacy learning growth superior to comparison Very Remote Indigenous schools and to Australian schools generally. Whilst coming off a very low base and having a long way to go before achievement gaps are closed, this growth is hopeful and consistent with worldwide evidence of the efficacy of Direct Instruction.

Discussion
The limitations of using NAPLAN for measuring literacy growth and program effectiveness
The original evaluation framework for Flexible Literacy adopted by the Advisory Committee in 2014 proposed that program data would be used, as well as any relevant assessments administered by the various school systems, such as the Progressive Achievement Test-Reading (PAT-R). The various school systems represented on the Advisory Committee committed to providing data for the evaluation.
NAPLAN was not proposed as a data point for the evaluation as it was understood from the outset that the target schools were many years behind their mainstream peers, and much more fine-tuned measures of early literacy skills were needed. A range of instruments were considered including Burt Reading Test, FELA (Foundations of Early Literacy Assessment), SPAT-R (Sutherland Phonological Awareness Test-Revised), TOWRE-R (Test of Word Reading Efficiency Second Edition) and PAT-R. However, because the Advisory Committee identified the need to minimise workload on schools, PAT-R and the Catholic Education system's literacy data set EYLND (Early Years Literacy and Numeracy Data) were used as these tests were already administered in schools.
One of the challenges with the monitoring and evaluation of interventions like Flexible Literacy, particularly in resource-constrained contexts like those prevailing in Very Remote Indigenous schools, is the preparedness of schools to take on the time and responsibilities associated with the administration and reporting of additional tests.
GGSA proposed DIBELS (Dynamic Indicators of Basic Early Literacy Skills) as an additional metric to program data, based on its experience with the Cape York Aboriginal Australian Academy (Australian Council for Educational Research, 2013; Grossen, 2013); however, this was not adopted.

The Australian Journal of Indigenous Education
In late 2016, the Australian Government Department of Education changed the evaluation framework for Flexible Literacy to make NAPLAN the primary metric. This was after almost 2 years of the program. NAPLAN is a crude tool, and its limitations were expressed in the incredulous question about the Flexible Literacy evaluation from a senior official (pers. comm., n.d.) to the department's senior executive team: 'Whose idea was it to use Year 5 Indigenous remote students' NAPLAN results as a measure?' The answer was that the department's new incoming chair of the Advisory Committee had decided unilaterally to make NAPLAN the measure of literacy progress for schools that are many years behind the mainstream in literacy and numeracy achievement. Subjecting a Year 5 student with Year 2 equivalent reading achievement to a Year 5 NAPLAN test to determine whether she is making literacy progress is not only somewhat unfair but can obscure the gains she has made since she started at K level 2 years before.

The Northern Territory Government's refusal to release PAT-R data
One of the participating jurisdictions, the Northern Territory, commenced the administration of PAT-R tests, developed by the Australian Council for Educational Research, across its schools in 2015. In the program's second year, the then Northern Territory Government promoted the PAT-R results in its Flexible Literacy schools utilising Direct Instruction, stating in a media release on 2 June 2016: Progressive Achievement Testing data indicates that the government schools that have implemented Direct Instruction in Literacy have seen positive results, particularly for students in Years 1-4. On average, Direct Instruction students have had an improvement greater than that of similar schools who are not participating in the program (Chandler, 2016). However, following the 2016 election, the new Labor government decided to withhold PAT-R data from the evaluation of Flexible Literacy, citing the 'instability' of the data as a reason for its decision. The Australian Government instructed The University of Melbourne not to use the PAT-R data, despite other school systems willingly providing their relevant datasets. GGSA was informed by the Australian Government that it accepted the position of the Northern Territory Government. This refusal to share PAT-R data as originally agreed by the Northern Territory government prior to the program's commencement, extended beyond 2016 and the data has been withheld every subsequent year to the present.
Moreover, the Northern Territory government progressively facilitated the departure of schools from the program from 2016. It was clear the new government was not supportive of the Flexible Literacy program and Direct Instruction. The pertinent and outstanding question remains: what did PAT-R data say about the literacy progress of these Northern Territory schools prior to their withdrawal from Flexible Literacy? If the data does not reflect well on Direct Instruction then why not release it for evaluation?

Findings highlight instructional and structural factors crucial to remote school improvement
At the end of the second year of Flexible Literacy, GGSA produced an Implementation Report (GGSA, 2017a) and provided this to the Australian Government and the program's evaluators. After 2 years, the program implementation brought into relief six factors crucial to successful school improvement in these remote contexts.
Three 'instructional factors' were identified:
• Instructional leadership
• Effective teaching
• Requisite time on instruction
Three 'structural factors' were also identified:
• Teaching numbers meet student need
• Stable teacher and leadership turnover
• Student attendance

Instructional factors
Instructional factors involve teaching and learning that is within the remit of the school and their core responsibility. School leaders need to provide instructional leadership to their teaching team to follow the program and to keep 'rolling the log' of continuous improvement. Effective teaching is the keystone: teachers and teaching assistants delivering effective instruction to students in the classrooms. This was the point of supporting the schools with training and coaching on the delivery of Direct Instruction and Explicit Direct Instruction lessons at the coalface. The third imperative is for schools to allocate the requisite time on instruction: students behind grade level require at least 2.5 h of literacy per day to catch up. Too many remote schools participating in Flexible Literacy consistently failed to allocate the necessary instructional time (GGSA, 2017a).

Structural factors
Structural factors go beyond the remit of individual schools and depend upon support from the wider school system. They go beyond teaching and learning and concern questions of resources and other inputs from systems. The first structural imperative is that teaching numbers meet student needs: the schools with the greatest disadvantage do not have sufficient teacher numbers. The achievement gap can only be closed with more teaching resources, ideally two teachers per classroom: one providing Tier 1 all-class instruction whilst the other provides Tier 2 instruction to small groups and Tier 3 remedial attention to individual students who require additional support. This is an argument not for smaller class sizes, but rather for more teachers in the same class. It is a key learning from GGSA's experience with Flexible Literacy (GGSA, 2017a). The second structural imperative is the need for stable teacher and school leadership turnover: an issue that has not been resolved for remote schools for decades. From the commencement of Flexible Literacy in 2015 to the end of 2016, 12 schools had the same principal, 10 had two, four had three and three had four. Seven schools had more than 100% teacher turnover in 1 year and 19 had less than 50%; nine had more than 100% teaching assistant turnover in 1 year and 10 had less than 50%.
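The principal turnover figures above can be summarised with a short calculation. The sketch uses only the counts given in the text; the variable names are illustrative, and the summary statistics it derives (total schools, average principals per school, share with at least one change) are not reported in the source.

```python
# Number of schools by how many principals they had between the start of 2015
# and the end of 2016, as stated in the text.
schools_by_principal_count = {1: 12, 2: 10, 3: 4, 4: 3}

total_schools = sum(schools_by_principal_count.values())
total_principals = sum(k * n for k, n in schools_by_principal_count.items())
mean_principals = total_principals / total_schools
share_with_change = 1 - schools_by_principal_count[1] / total_schools

print(total_schools)                 # 29
print(round(mean_principals, 2))     # 1.93
print(f"{share_with_change:.0%}")    # 59%
```

On these counts, 17 of the 29 schools changed principal at least once in 2 years, and the schools averaged close to two principals each over that period.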
More can and needs to be done to ensure greater stability in school teaching and school leadership. This means increasing retention to at least 3 years for teachers, 5 years for school leaders. Importantly, it also requires more stability in the movement of teachers in and out of schools. It is enormously debilitating for schools to lose more than a third of their teachers in 1 year.

The third self-evident factor is student attendance: no school teaching and learning program can succeed without it. Five schools in the Flexible Literacy program had annual attendance between 80 and 90%, four between 70 and 80%, six between 60 and 70%, and 14 less than 60%. Twenty-one schools had attendance rates lower than the lowest jurisdiction (Northern Territory) average for Indigenous students.
Schools can do much to increase student attendance but they cannot resolve this on their own. They require the cooperation and support of parents and community leaders. They also require school systems to invest in effective strategies to increase attendance.
Unfortunately, the current default position is that the inadequate levels of remote schools' attendance-between 60 and 80% (and lower)-are considered acceptable. Efforts aimed at increasing attendance tend to slacken off at around 3.5-4 days per week. However, missing 1 day per week in primary school is the equivalent of missing 1.5 years of primary level schooling.
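The attendance arithmetic above can be sketched as follows. The sketch assumes a 5-day school week and 7 years of primary schooling (for example, Foundation to Year 6). Both constants and the function names are assumptions made for illustration, and the length of primary schooling varies by jurisdiction.

```python
SCHOOL_DAYS_PER_WEEK = 5
PRIMARY_YEARS = 7  # assumption: Foundation to Year 6; varies by jurisdiction

def attendance_rate(days_attended_per_week):
    """Share of the school week attended."""
    return days_attended_per_week / SCHOOL_DAYS_PER_WEEK

def years_missed(days_missed_per_week, years=PRIMARY_YEARS):
    """Schooling missed over primary school, expressed in school years."""
    return days_missed_per_week / SCHOOL_DAYS_PER_WEEK * years

print(f"{attendance_rate(3.5):.0%}")   # 70%
print(f"{attendance_rate(4):.0%}")     # 80%
print(round(years_missed(1), 1))       # 1.4
```

Under these assumptions, attending 3.5 to 4 days per week corresponds to 70 to 80% attendance, and missing 1 day per week adds up to about 1.4 school years over primary school, in line with the roughly 1.5 years stated above.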
Research findings have been unequivocal about the importance of attendance as a predictor of long-term success or failure (Chang and Romero, 2008; Gottfried, 2010; Sanchez, London, and Castrechini, 2015).
A seventh important factor: implementation governance
Subsequent to its 2017 Implementation Report, GGSA identified a seventh critical success factor: implementation governance (GGSA, 2019). Literacy and other interventions in remote schools will always have mixed results that are not sustainable unless these seven factors are addressed.
Implementation governance has a bearing on whether the six other factors are addressed by the schools and the systems that own them. Unless solutions are found to implementation governance then school improvement investment by the Australian Government into remote schools will not work. Even where gains are made through the use of effective teaching, sustainability requires the continuity and care in implementation provided by proper governance. GGSA's experience with the Flexible Literacy program has been that school system owners and the Australian Government as funder simply do not, and cannot, provide the necessary governance that is needed for successful implementation that ensures all instructional and structural factors are addressed.

Guenther and Osborne's Red Dirt Education
In 2017, Professor John Halsey undertook the Independent Review into Regional, Rural and Remote Education on behalf of the Australian Government (Halsey, 2018). In a discussion paper, Halsey referred approvingly to a report produced by The Cooperative Research Centre for Remote Economic Participation (CRC-REP) (Guenther, Disbray, and Osborne, 2016). Halsey (2018, p. 51) wrote:
An important question guiding the research was 'What is education for in remote communities?'. The answer according to those who live there is that 'education is not primarily about preparing young people for work; rather, it is to ensure that their language, culture and identity remain strong and that they maintain a connection to their land'.
Guenther and Osborne, along with Disbray, were the authors of Red Dirt Education. In a submission to the review, GGSA (2017b, p. 4) wrote:
… GGSA has reviewed this report, and wishes to state that its philosophy and assumptions are diametrically opposed to it. Our work in Cape York Peninsula has been aimed at rejecting this kind of approach to thinking about Indigenous school education, particularly in remote communities. We urge the Independent Review to avoid adopting the so-called 'Red dirt' approach put forward by the authors of this report. At its core the 'Red dirt' thinking is low expectations education and compounds the tragic failure in remote education. It is thinking that both accepts and explains such low expectations by existing failure. The worst aspect of this thinking is that it attempts to harness the views and expectations of Indigenous parents and communities in remote areas, as the reason to adopt 'Red dirt' thinking. No government or society would use the victims of failed educational policies and poor school provisioning as support for such low and differential expectations of Indigenous students, compared to other students of the nation. It is far too late in the day to reprise the flawed thinking of thirty and forty years ago when it concerns Indigenous remote schooling. No contemporary Australian government would inflict such poor policy thinking on mainstream students: now is not the time to compound the disadvantage of Indigenous remote students by following the flawed thinking of the authors of Red dirt. The point is to provide school education which is inclusive of Indigenous culture and ancestral languages to remote Indigenous schools – so that Indigenous students can 'enjoy the best of both worlds' – without lowering expectations about Indigenous students gaining the skills and knowledge to make their way successfully in a global world.
The policy disagreement between the authors of Red Dirt Education and GGSA about the purpose of remote school education is important, but it is not the subject of this analysis. Rather, the focus is on literacy outcomes from the Flexible Literacy program and the use of Direct Instruction. Evidence of efficacy should trump ideological discourse in educational policy, particularly for Indigenous students. The results of our analysis show an astounding contradiction between Guenther and Osborne's achievement score analysis and our growth analysis.

Conclusion
Our analysis of NAPLAN results to determine the true impact of the Flexible Literacy program on literacy outcomes for Very Remote Indigenous schools uses a growth measure rather than an achievement measure. We compare these outcomes with outcomes from all Australian schools and all Very Remote Indigenous schools. Our findings tell a very different story from the one told by Guenther and Osborne.
Guenther and Osborne's analysis found that Reading outcomes in Flexible Literacy schools declined in comparison with other schools. In contrast, our analysis shows 124% growth in Reading for Very Remote Indigenous schools involved in Flexible Literacy from 2015 to 2017, while growth in the same period was 19% for all Australian schools and 34% for all Very Remote Indigenous schools.
Why are these findings so dramatically at odds with the findings reported by Guenther and Osborne in a paper that has been widely reported in the press and promoted in social media (Chrysanthos, 2020; Duncan, 2020; Little, 2020)? Guenther and Osborne (2020, p. 6) state that the alleged failure of Direct Instruction raises 'ethical' questions in terms of policy implementation and research and evaluation practice. In the light of the analysis presented in this paper, the ethical question really concerns why researchers concerned with the education of students from remote Australian communities would misrepresent evidence of learning progress in such an egregious way.