Our article, "What data and analytics can and do say about effective learning", was published in the journal npj Science of Learning recently. The Nature Research team had a few questions for us about our article, which Linda Corrin and I have answered below.
What is learning analytics, and how does it differ from ‘traditional’ analysis of classroom learning through tests, essays and student–teacher interactions?
Learning analytics refers to the collection, integration and analysis of data across multiple sources (predominantly digital learning environments, student information systems etc.) for the purpose of understanding and enhancing student learning. The scope of learning analytics includes these “traditional” analyses and seeks to find new ways to understand and visualize the outcomes of the analysis. It also builds on these previous forms of analysis in education in that the data can now be derived from multiple sources and integrated to allow more sophisticated statistical analysis. The combined datasets can end up being quite large with many possible indicators of student progress and performance. These data can be used to detect patterns in student learning behaviour and performance beyond what was possible previously.
Can you talk about how computer technologies are influencing the classroom? What opportunities and datasets do they provide that aren’t obtainable via traditional means?
Computers are increasingly shaping the ways in which humans access and use information. While it is still uncertain whether these changes are fundamentally influencing how we process information and learn, the increased use of computers does provide opportunities for developing this understanding. In a practical sense, the increased use of computers in classrooms allows teachers to monitor and provide feedback in close to real time as students learn. The data captured by some learning systems can be very fine-grained, providing detail on the actions, sequences and timing of individual students’ interactions with learning resources and tasks. Effective analysis of these data can provide students with a sense of how they are progressing without requiring teacher intervention. These possibilities then allow for a more personalised experience that can help students develop their understanding no matter their starting point. The cues allowing for this level of personalisation are not always evident to teachers who are facilitating the learning of many students simultaneously.
Many people probably don’t think of education, or the study of classroom learning, as being data rich. What sort of measurements are being taken, and which are showing the most promise in guiding learning?
Any learning environment is generally information rich. Classrooms, in particular, are awash with information about how students are going and whether teaching practices are effective. Teachers are very good at using this information to provide support for students. For example, when students are confused, they tend to have very distinct facial expressions, i.e. the ‘confused face’. This information allows teachers to adapt to the conditions in the classroom. This adaptation becomes difficult because teachers cannot stay on top of all the information they receive from a complex classroom environment with many students to attend to.
There is therefore an advantage in having systems that capture student behaviours as they learn, which helps cut down the amount of information teachers must deal with. For example, if a student is confused and reaches an impasse, they might simply stop interacting with the technology, i.e. they stop clicking on interactive elements of the environment. They may also click on one part of an environment rather than a more relevant part. Information, in the form of data, can also be collected by asking students questions or getting them to interact with simulated people, animals or phenomena. Their behaviour and responses can then be flagged through predictive modelling as an instance where a student might need individual or personalised help, and an alert can be sent to the teacher. The teacher can then focus on this student individually and assist them to overcome the impasse.
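To make the idea of a student who "stops clicking" concrete, here is a minimal sketch of how inactivity might be flagged from a click log. It is an illustration only, not the approach used in our research: it applies a simple idle-time threshold rather than predictive modelling, and the names `ClickEvent` and `flag_idle_students`, along with the 120-second threshold, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class ClickEvent:
    """One logged interaction (hypothetical structure for illustration)."""
    student_id: str
    timestamp: float  # seconds since the start of the session


def flag_idle_students(events, now, idle_threshold=120.0):
    """Return sorted ids of students whose latest click is older than idle_threshold seconds."""
    last_seen = {}
    for event in events:
        # Keep only the most recent click time for each student.
        previous = last_seen.get(event.student_id, float("-inf"))
        last_seen[event.student_id] = max(previous, event.timestamp)
    return sorted(sid for sid, t in last_seen.items() if now - t > idle_threshold)


events = [
    ClickEvent("s1", 10.0),
    ClickEvent("s2", 15.0),
    ClickEvent("s1", 290.0),
    ClickEvent("s3", 100.0),
]

# At 300 seconds into the session, s1 clicked 10 seconds ago,
# while s2 and s3 have been idle for longer than the threshold.
print(flag_idle_students(events, now=300.0))  # ['s2', 's3']
```

A rule like this would only be a starting point. A pause may mean a student is thinking rather than stuck, which is why the flag is best treated as a prompt for the teacher to check in rather than a diagnosis.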
The work we are doing in the ARC-SRI Science of Learning Research Centre suggests that the trajectories students follow as they work through learning activities provide great insight into their learning processes and strategies. Data of this kind, which indicate how students are progressing, show a lot of potential for powering interactive simulations, and we are only beginning to realise that potential. Once it was just procedural tasks, like learning how to fly a plane, that could be effectively simulated. Now, adaptive environments can be used to help students understand more complex, conceptual ideas too. These systems help to provide a personalised learning experience for students where they can learn at their own pace. When combined with a capacity for teachers to better monitor and respond to student progress, the use of data in this way will lead to the most beneficial use of technologies and teachers in unison.
When I think of big datasets, I think of machine learning. Is this something currently being applied to learning analytics? How? What do you see as the pros and cons of this approach?
Yes, machine learning is being applied to learning analytics. Our understanding is that this is a relatively new approach in educational settings. Much of the focus of learning analytics when it initially emerged as a research field was on intervening when students are at risk of failing or disengaging from their studies. The power of machine learning will come to the fore as adaptive learning environments increase in sophistication. The kinds of real-time predictions about student progress that machine learning will make possible lend themselves to the creation of immersive simulated learning experiences where the environment can be responsive to students. Again, flight simulators and immersive surgical simulators are pointing to what might be possible when adaptive virtual environments mature. Machine learning is an advance that will allow for the kind of sophisticated real-time analysis of data from multiple sources that is needed to personalise the experience for students learning complex concepts and ideas.
What are your thoughts on privacy issues and student consent regarding the data collected? What guidelines exist and can you envisage any problems in the future?
There have been extensive discussions around the ethical implications of increased data collection and advanced analytical techniques in the learning analytics community since its inception. These discussions have often focused on student privacy and consent. These are complicated issues, but all the standard guidelines and processes for the ethical collection, analysis and storage of data are applied in learning analytics as they are elsewhere in research. Students who are involved in learning analytics research are fully informed about what data are being collected and why. Their informed consent is also obtained. In applied environments, students are generally also fully informed about what data are being collected. Principles and guidelines have been developed in different parts of the world to ensure that data are collected in ways that protect student privacy and, where at all possible, obtain student consent.
More broadly though, the learning analytics community is mindful of the kinds of inferences we can make about individual students on the basis of data-driven predictions. The community incorporates people from a range of disciplines including psychometrics and psychology. Given these disciplines have a long history in the ethical use of sensitive data, they are helping inform the progress in the learning analytics field. Great initiatives have been emerging around the world that provide guiding principles for institutions and teachers who engage with the use of student data for analytics. While a one-size-fits-all approach is not possible, the guiding principles set forward from these projects provide a great basis for institutions to develop policies that cover both learning analytics research and the day-to-day use of analytics to support student learning.
What are the major challenges for the field? E.g. getting enough data? Getting good quality data? Do we know which areas of learning are best suited to a data analytical approach, and which might be unsuitable?
There are a number of challenges facing the field of learning analytics, both from a research perspective and a practical implementation perspective. One of the biggest challenges is dealing with the data in a way that allows them to be used by teachers and institutions to improve education. Often data are stored in separate systems with incompatible data structures. Institutions need to work to make these data accessible to teachers in formats that they can use and in a timely way. However, despite the availability of large sets of data about students’ actions, there are a lot of data that aren’t and can’t be captured about how students learn. The challenge to the learning analytics community is to find ways to make meaning, and inform the provision of feedback on the basis of these data, despite the gaps “between clicks”. Part of this conversation needs to focus on the ethics behind access to and use of data, especially the acknowledgement of the assumptions and biases that may be inherent in the design of predictive algorithms. A key challenge is to help institutions and teachers to link the analytics, regardless of the learning area, with the learning design of the activity. This is vital to be able to make sense of the data and analysis, and to inform the actions that can be taken to improve learning approaches and environments.
Assuming research into learning analytics continues to grow, how should the findings be translated into practice? Is policy implementation likely to be difficult? Why or why not?
The challenge to all educational institutions is to provide teachers with a suite of tools and techniques that lets them apply learning analytics to practice. Research in learning analytics has resulted in the development of a number of generic and specialized tools for understanding different types of learning activities or for general predictions of student retention. Pleasingly, we are now seeing a number of these tools being made available as open source initiatives that institutions can adopt and customize to their own needs. In designing and implementing an analytics solution for an institution it is important to use existing research to inform the selection and analysis of data and tools to ensure effective and appropriate use of learning analytics. This will involve the development and management of policies to protect students and to ensure that the institution upholds its duty of care in providing a supportive educational environment. While the implementation of policy may be difficult at first, both in getting the guidelines right and in communicating them to all the relevant stakeholders, it is vital for embedding learning analytics for the long term to improve learning.