DIGITAL TOOLS FOR REAL-TIME DATA COLLECTION IN EDUCATION

Real-time performance data in education enable critically needed tracking of program activities and instructors’ and learners’ progress to better inform adaptations along the way. In this report, we provide an overview of some of the key considerations related to real-time data collection in education, with a particular focus on the digital tools that enable their collection. Importantly, we have developed a typology of tools and selection criteria to support policymakers, practitioners, and researchers around the globe in either developing new tools or selecting from the landscape of existing ones. Our aim is to help initiate dialogue around the use of real-time data for adaptive management in education and contribute to data-informed decisions toward ensuring that all children have access to quality teaching and learning experiences.

Introduction

The availability of—and access to—real-time performance data from classrooms and other learning spaces will be critical in addressing the increasingly dire learning crisis facing today’s children and youth. Real-time performance data focus on the ongoing tracking of program activities and progress of instructors and learners in order to better inform adaptations to program inputs, activities, and outputs along the way. Administrators (such as school or government leaders or program facilitators) can use information on performance to adaptively manage resources and tailor programs according to the needs of students and instructors. Similarly, instructors can use student data to ensure that they are teaching at the right level and customize their instruction based on individual learning patterns and needs. Students and families can use real-time data to track progress and use this information to advocate for their needs.

Collecting data in real time is inherently challenging. Paper-based systems of data collection can be slow, administratively burdensome, and prone to human error. Digital technologies potentially offer more efficient collection and analysis of real-time data, allow for more flexibility and customizability, and can provide functionalities such as automatically generated visualizations and ongoing recommendations.

In this report, we provide an overview of the critical factors related to real-time data collection in education to support policymakers, practitioners, and researchers across the globe in either developing new tools or selecting from the landscape of existing ones. After providing brief background on the state of data and measurement in the global education sector, we explore the who, why, and for whom of real-time data collection and use in education systems. Next, we focus our attention on one part of the puzzle—digital tools for data collection. Informed by survey responses from the developers of over 20 such tools, we classify tools based on a typology of their purpose. We elaborate on important factors such as tool usability, functions, and context, which can help decisionmakers select the most appropriate tool for their setting (potentially reducing the need to develop new ones where they already exist), or inform the design of a new tool. These sections are accompanied by an interactive database of digital tools for real-time education data. We conclude with key findings and policy recommendations.

We have seen a response to this need for better education data at the global policy level. As a follow-up to the recommendations of the 2018 World Development Report (World Bank, 2018), for example, the World Bank—supported by the Bill & Melinda Gates Foundation and the U.K. Department for International Development—announced the establishment of the Global Education Policy Dashboard, which aims to provide policymakers with actionable data on indicators associated with learning outcomes (inputs and infrastructure, teaching, learners, school management, and learning poverty) in basic education (World Bank, 2019).

This and similar types of policy responses are critical because real-time data can facilitate the timely decisionmaking and course corrections needed to achieve equitable, positive learning outcomes. They also represent an important shift from program monitoring data, which primarily track how closely an education program’s activities and outputs align with intended milestones, as well as possible unintended consequences. While program monitoring data are important, they rely on critical assumptions about the links among activities, outputs, and the proposed final outcomes, which may or may not hold. Real-time performance data, on the other hand, track ongoing program activities and instructors’ and learners’ progress to better inform data-driven adaptations to program inputs, activities, and outputs along the way (Black & Wiliam, 2008). In this way, real-time data can inform the design and management of a variety of education programs and initiatives (GIZ, 2017).

As shown in Table 1, there are a variety of reasons to collect real-time data in education, and these correspond to the types of data collected and the users of the data and tools. The collection of these data can encompass a variety of domains, including organization of instruction, personnel management, resource management, and administrative planning. Across these domains, data are primarily used for tracking and responding to student and instructor behavior and performance, with the goal of ensuring student learning. Broadly speaking, the types of real-time data needed to achieve these goals include: child (or student) attendance and enrollment, child (or student) performance (or assessment) data, instructor (or other service provider) attendance, and instructor (or other service provider) performance, which can take the form of self-assessment or objective assessment. These data can be collected and used by many actors in the education system, ranging from children and their parents, to instructors, coaches, and school administrators, to high-level government administrators.

TABLE 1. A taxonomy of real-time education data

Assessing student performance can take multiple forms—for example, conducting examinations to track individual student progress and inform learning strategies. Formative classroom assessments (for example, periodic quizzes, assignments, or learning checks) are one approach to tracking student progress toward learning goals, providing immediate feedback to inform classroom instruction, and guiding teacher professional development. Data can also include detailed information on students’ social and behavioral experiences, which can help instructors and students build stronger relationships and change the way children learn. Just as formative assessment can be used to improve students’ learning, tracking data on educator performance can be used to inform professional learning and development (Dillaha & Haren, 2017).

National- and international-level large-scale summative assessments are instrumental as monitoring and evaluation strategies, providing feedback and data on general trends and the state of the education system (World Bank, 2020; World Bank, 2019).

For any data to be effective at improving teaching and learning outcomes, they must reach the proper personnel, or those who can make and enforce decisions and take action. As shown in Box 2, each education decisionmaker has a different role in determining the ultimate outcomes of student learning: from the national or ministerial level and local government, to school administrators, to classroom or non-center-based instructors/coaches, to children and their parents/families. Furthermore, for data-informed decisionmaking to take place, it is not enough for stakeholders to have access to the data—they also need the capacity, time, and authority to collect and use them. The following section discusses some of the factors that facilitate real-time data collection and use, as well as some of the potential pitfalls.

Box 2
Data for whom in the education ecosystem?

Real-time data are valuable at all stages of the program lifecycle, from design to implementation, and their analysis can drive both immediate operational adaptations at the frontline of educational instruction (by teachers, school administrators, and mentors) and high-level strategic adaptations at the systemic level (USAID, 2017).

This adaptive management approach prioritizes “learning by doing” (Ramalingam et al., 2017) while relying firmly on systematically collected data. Such adjustments can be either tactical—that is, targeting immediate operational processes—or strategic—addressing systemic shortcomings in the mechanics of the program itself. One well-cited example is the Educate Girls Development Impact Bond (Gustafsson-Wright, Osborne, & Massey, 2020), in which a strong system of adaptive management was developed within the project, allowing for timely course correction and ultimately a positive impact on student outcomes.

Broadly, government administrators (in education ministries and subnational ministries) use education data for policy design and strategic planning. This may include monitoring national, subnational, or school-level attendance, attainment, and achievement levels, as well as measures of equity. This information feeds into decisions about allocation of resources and development and revision of standards and goals. It may also include decisions around the employment and training of administrators, teachers, and other staff.

At the school administrator level, education data are used to track, evaluate, and support student, teacher, and staff performance and progress. These data can also be used to develop school action plans and guide school-level practices.

Instructors and coaches use education data to assess the performance, progress, and needs of students in order to develop and revise classroom instruction. These data can come in a variety of forms, including formative and summative assessments.

Source: Author adaptation based on Gill et al. 2014 (Mathematica).

Principles and pitfalls of real-time data collection and use

The academic literature identifies a number of principles—many of which stakeholders are already employing on the ground—that can help ensure accuracy, efficiency, effectiveness, and equity in the collection and use of data. The Right-Fit Evidence unit at Innovations for Poverty Action, for instance, works to implement the CART principles identified by Gugerty and Karlan (2018) to support data collection for monitoring and evaluation. According to these principles, data should be credible and actionable, data collection should be responsible, and findings should be transportable (IPA, n.d.). These principles also feed into the “nimble evaluation” approach—whereby data are quickly collected on short-term outcomes (Karlan, 2017)—which the Strategic Impact Evaluation Fund (SIEF) at the World Bank is currently testing (World Bank, 2018). USAID also outlines five key standards to improve the consistency and accuracy of collected data: validity of the data, integrity of the data collection process, precision (a small margin of error on performance indicators), reliability (rigor in analysis), and timeliness in using data for decisionmaking.

Some potential pitfalls are also highlighted in the literature. Specific to the education sector, for instance, Pritchett (2018) argues that efforts to collect real-time data must be grounded in a cohesive theory-of-change framework to avoid being overwhelmed with unstructured data and potentially losing sight of the larger picture of child well-being. In other words, because education systems tend to be non-linear and interactive, understanding the causal connections between different inputs is essential to making sense of real-time data. Another potential pitfall is that real-time data can create an illusion of information if decisionmakers are unable to distinguish relevant trends from the noise generated by continuous tracking in a complex environment. For example, across four inventory management experiments, Lurie et al. (2009) found that when provided with real-time data, managers focused asymmetrically on the most recent data and failed to adequately compare data across multiple time periods.
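
To make the Lurie et al. finding concrete, the short sketch below (in Python, using invented attendance figures) contrasts a decision based only on the latest data point with one based on a multi-period average; reacting to the newest value alone can misread ordinary noise as a trend.

```python
import statistics

# Invented weekly attendance rates for one school (illustrative only).
weekly_attendance = [0.91, 0.88, 0.93, 0.90, 0.92, 0.85]

# The newest point alone suggests a sudden drop; averaging across several
# periods shows a fairly stable series with normal week-to-week noise.
latest = weekly_attendance[-1]
four_week_average = statistics.mean(weekly_attendance[-4:])
print(f"latest week: {latest:.0%}, 4-week average: {four_week_average:.0%}")
# Output: latest week: 85%, 4-week average: 90%
```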

Drivers of successful data collection and use

There are a number of potential drivers that can facilitate (or, when absent, hinder) the collection and use of data. These can be split broadly into three categories: capacity, logistics, and incentives. The capacity category includes understanding which data are relevant for responsive decisionmaking and when they should be collected, the ability to analyze data and apply data-driven insights to teaching practices and support, and the ability to act on data. The logistics category includes practical issues around data collection and analysis, such as the tools used to collect and analyze data and the format in which data are accessed or displayed. The final category, incentives, relates to the intrinsic or extrinsic motivation to collect and analyze data and apply the findings (author adaptation from multiple sources, including USAID, 2010; Gugerty & Karlan, 2018; Gill et al., 2018; and Kaufman et al., 2022).

FIGURE 2. Drivers of data collection and use

Furthermore, these categories can interact with one another. Research on data usage finds that instructors are more likely to use data when schools develop a culture of data use or when there is support, training, time, and/or systems in place for data use. For instructors, data timeliness, perceived validity, and capacity for instructional adaptation all affect self-reported data use; instructors are most likely to use formative data to learn about new students, group students for instruction, and assess weaknesses and strengths at the class level (Tyler, 2011).

This study focuses on one aspect of the data ecosystem: the tools used to collect and analyze data—specifically, real-time data for education. A comprehensive treatment of this category requires that we also take into account the costs, incentives, and capacity associated with the process of turning data into decisions. While our exploration of these factors is limited, our detailed description of digital tools developed for use in education systems does touch on several of them, including, for example, “how to collect the data” and “how to analyze the data” in the capacity category, and the “cost of data collection, analysis, and application” in the incentives category. The other drivers listed above are outside the scope of this study, as are attempts to analyze whether real-time data and/or the factors associated with their collection and use have any impact on learning outcomes. While related and of interest, such literature should be seen as complementary to this report and framework. In the following sections, we provide a brief overview of tools for data collection before zooming in on digital tools for the collection and analysis of real-time data.

Tool typology

Tools to collect data can take many forms, ranging from simple pen-and-paper surveys to current technologies that can collect data remotely, both online and offline—for example, using social media platforms like WhatsApp and Facebook. In low- and middle-income countries in particular, education data have historically often been collected manually and on paper. The lack of standardization in the data pipeline and the susceptibility to human error make it more challenging to analyze and/or use these data to change an intervention. However, in recent years, a boost in the availability, affordability, and accessibility of technology—and education technology specifically—has led to an increase in the digitization of data collection globally.

Digital technologies can be used at different stages of the data life cycle, from data collection to analysis and dissemination of results to decisionmakers. Digital tools may use different approaches to carry out these functions, depending on the context for which they are designed. Table 2 presents an indicative, non-exhaustive list of the methods generally used to collect, analyze, and disseminate data (author adaptation from USAID, 2021).

TABLE 2. Broad methods to collect, analyze, and disseminate data

In high-income countries in particular, the use of digital tools is commonplace; instructors, administrators, and families typically have access to education data via dashboards, allowing them to track student participation and progress and identify where students need additional support. Historically, due to various constraints in lower-income countries (lack of access to technology, software, and the internet, as well as limited capacity), fewer of these technologies have been available or utilized. However, as the demand for real-time data increases, their use is expanding rapidly.

Digital technologies have an increasing presence in many components of the education system and offer the potential to transform the management of education systems to overcome geographic disparities, personalize learning, and make information available in digital forms (West, 2012). Digital technologies can also specifically support instructors’ teaching and reach: engaging with learners in new ways; broadening access to learning materials; reducing administrative burdens; and creating tools for formative and summative assessment (Trucano, 2013). Furthermore, data digitalization enables “closed-loop” experimentation, which allows for iterative impact evaluation of an intervention (Kaufman et al., 2022).

Real-time data systems can range across the spectrum of digital technologies—from those that depend on face-to-face surveys and observations to sensor-based, automated computational systems relying on web- or application-based tools for computers, tablets, and phones (both smartphones and basic feature phones)—for collecting, sharing, managing, and reporting data (GIZ, 2017). Integrating more complex digital data systems into education practice requires robust infrastructural support, along with training of the personnel involved in data collection and use (The Economist, 2021).

Type A: Data collection

Purpose: These tools collect and store data that are likely to be analyzed with an external statistical program or spreadsheet application. The data could be used to track attendance, administer assessments, deliver household or individual surveys, conduct classroom observations, and more. They differ from other types of tools in that they do not have analysis or content delivery features, but they could be linked or integrated with other tools and systems that do possess these features.

*Note: In our survey, we did not come across a tool that performed the sole function of data collection. This type of tool may have been more common in earlier stages of the digitization process, whereas more sophisticated recent tools often have analysis functions embedded within.
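
Although no surveyed tool performed collection alone, a minimal sketch can still illustrate this category’s core function. The Python snippet below assumes a hypothetical attendance-capture schema (the field names and file format are our invention, not drawn from any surveyed tool) and simply appends records to a local file for later export to external analysis software.

```python
import csv
import datetime
import pathlib

# Hypothetical record layout for a Type A attendance-capture tool; the
# field names are illustrative, not drawn from any surveyed tool.
FIELDS = ["school_id", "student_id", "date", "present", "collected_by"]

def record_attendance(rows, out_path="attendance_export.csv"):
    """Append attendance observations to a local CSV for later export
    to an external statistical program or spreadsheet application."""
    path = pathlib.Path(out_path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)

# Example usage: one observation captured offline and stored for later sync.
record_attendance([
    {"school_id": "SCH-014", "student_id": "ST-0032",
     "date": datetime.date.today().isoformat(), "present": 1,
     "collected_by": "enumerator_07"},
])
```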

Type B: Data collection + analysis and visualization

Purpose: These data collection + analysis and visualization tools (Type B) collect and store data and also provide analysis and visualizations of the results. These could be more complex surveys or observational, formative, or summative assessment tools for children or instructors, school or classroom management, and more. The data collected can then be analyzed or visualized (often automatically), for example in the form of an integrated dashboard, and would likely be used for program monitoring and evaluation activities, such as identifying trends, measuring progress, and/or devising potential ways to address challenges.

Example: Waliku is a suite of digital tools for student learning and well-being. Along with tracking enrollment and attendance for more than 30,000 students in Indonesia and Guatemala, it also records health metrics to aid child welfare management.
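
As a rough illustration of the analysis layer that distinguishes Type B from Type A, the sketch below aggregates hypothetical attendance records by school and flags low performers, the kind of summary an integrated dashboard might display automatically. The records and the 90 percent threshold are invented for the example and do not describe Waliku or any other surveyed tool.

```python
from collections import defaultdict

# Invented attendance records; a real tool would read these from storage.
records = [
    {"school_id": "SCH-014", "present": 1},
    {"school_id": "SCH-014", "present": 0},
    {"school_id": "SCH-021", "present": 1},
    {"school_id": "SCH-021", "present": 1},
]

# Aggregate: school_id -> [present_count, total_observations].
totals = defaultdict(lambda: [0, 0])
for r in records:
    totals[r["school_id"]][0] += r["present"]
    totals[r["school_id"]][1] += 1

# Flag schools below an (assumed) 90% attendance threshold for follow-up.
for school, (present, total) in sorted(totals.items()):
    rate = present / total
    flag = "  <- follow up" if rate < 0.9 else ""
    print(f"{school}: {rate:.0%} attendance{flag}")
```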

Type C: Data collection + learning content delivery

Purpose: These tools (Type C) usually function as game-based education technology applications, student learning platforms, or as a lesson/pedagogy delivery tool aimed at instructors or school leaders. With the included data collection functions, these tools are geared toward gaining a deeper understanding of the tool’s usage and users. Content delivery is usually the primary function of these tools, and could include educational content targeted at learners, instructional or coaching content targeted at instructors, family engagement content targeted at caregivers, or others. The data collected alongside content delivery are nearly always collected automatically.

Example: Chimple is an open-source mobile application for children to learn mathematics, reading, and writing in informal environments without instructor supervision. It features an adaptive learning algorithm and a gamified environment to facilitate dynamic learning.
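
To give a sense of how content delivery and automatic data capture can be intertwined in a Type C tool, the following toy sketch pairs a simple accuracy-based item picker with a usage log written on every response. It is purely illustrative: it does not represent Chimple’s actual adaptive learning algorithm, and the item bank, window size, and thresholds are invented.

```python
import json
import random

# Invented three-level item bank for a toy math practice app.
ITEM_BANK = {1: ["1+1", "2+1"], 2: ["3+4", "5+2"], 3: ["7+6", "9+8"]}

def next_item(history, window=5):
    """Pick a difficulty level from recent accuracy, then a random item."""
    recent = history[-window:]
    accuracy = sum(r["correct"] for r in recent) / len(recent) if recent else 0.5
    level = 3 if accuracy > 0.8 else 1 if accuracy < 0.4 else 2
    return level, random.choice(ITEM_BANK[level])

def log_response(history, item, correct, path="usage_log.jsonl"):
    """Record each interaction automatically, as Type C tools typically do."""
    event = {"item": item, "correct": correct}
    history.append(event)
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")

history = []
level, item = next_item(history)
print(f"serving level-{level} item: {item}")
log_response(history, item, correct=True)
```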

Type D: Data collection + analysis and visualization + learning content delivery

Purpose: These tools (Type D) are typically quite similar to Type C, but with the addition of data analysis and visualization, these applications can provide a one-stop shop for delivering content for interventions, as well as measuring impact and presenting those data to decisionmakers in real time. These are typically more complex, system-level tools that address the holistic engagement of students, learners, and education systems, and they may include education technology applications with adaptive learning capabilities.

Example: Tangerine is an open-source tool that was previously used primarily for offline, oral student assessment and timed data collection required for early grade reading and math assessments. It has evolved to include content delivery through its “Teach” and “Coach” applications.
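
A compact sketch of the full Type D loop appears below: content is delivered, responses are captured automatically, and a running analysis feeds a dashboard-style summary. The class, field names, and the simple running-average “mastery” measure are all invented for illustration and are not modeled on Tangerine or any other surveyed tool.

```python
class TypeDSession:
    """Toy session combining delivery, capture, and on-the-fly analysis."""

    def __init__(self, student_id):
        self.student_id = student_id
        self.responses = []  # captured automatically alongside delivery

    def deliver_and_record(self, item, correct):
        self.responses.append({"item": item, "correct": correct})

    def mastery(self):
        """Running share of correct responses; a stand-in for real analytics."""
        if not self.responses:
            return None
        return sum(r["correct"] for r in self.responses) / len(self.responses)

    def dashboard_row(self):
        m = self.mastery()
        return {"student": self.student_id,
                "attempts": len(self.responses),
                "mastery": None if m is None else round(m, 2)}

session = TypeDSession("ST-0032")
session.deliver_and_record("3+4", correct=True)
session.deliver_and_record("7+6", correct=False)
print(session.dashboard_row())
# Output: {'student': 'ST-0032', 'attempts': 2, 'mastery': 0.5}
```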

There are a number of sub-elements in each of these categories, related to their varied features and applications, which can make tools more or less suited to particular uses. These sub-elements are based on a survey that we administered to tool developers (see Appendix for further details), and examples of tools from this survey can be found in our interactive Tool Finder database.

The components and sub-components are summarized in Table 3 below and further described in the section that follows.

TABLE 3. Detailed selection criteria for digital tools