In corporate learning, learning analytics uses big data to understand learning and learner behavior. Some of its major goals are to help determine the effectiveness of a particular learning intervention, gain insight into how learning supports on-the-job performance, and decide which programs or learning initiatives to implement at an organizational level. Increasingly, learning management systems (LMSs) provide learning analytics that mine large volumes of learner data to determine return on investment (ROI) and other learning impacts.
While big data has its merits, it also comes with limitations. By placing a significant emphasis on big data, we run the risk of thinking it is the be-all and end-all of gaining insights into learning.
Here are a few things to consider when it comes to big data in learning analytics:
- Big data has too narrow a focus. Typically it tracks a single measure in massive volume, such as a particular course’s completion rate over a long period of time, and makes predictions based solely on that measure. On its own, that is not meaningful because it doesn’t answer the question of why certain people did not finish the course.
- Big data is messy. Big data is essentially raw data being collected on an ongoing basis. It is not clean, because it favors quantity over quality by collecting everything digital there is to collect, and it is therefore very hard to make actionable. How would you decide whether to change a quiz question based on the number of wrong answers without examining how the learners understood the question? Would you rephrase the wording? Get rid of the question altogether? Provide additional support on the learning material?
- Big data is not easy to analyze. It is often unclear how to analyze the data, which data sets to analyze, and how to triangulate different data sets to draw practical and useful conclusions. Such analysis can be expensive, and companies often lack the expertise (and budget) to conduct it. To further confuse the issue, LMSs now provide a large array of data collection options, and just because you can collect and measure different data sets doesn’t mean you should.
- Big data is impersonal. We are more than the sum of our parts, yet big data fails to represent that. In big data learning analysis, people are often reduced to numbers. The generalizations that are made are often incorrect and, in some instances, point to the opposite of the intervention that is needed. For example, when big data shows a learning path is being routinely skipped, we might be tempted to conclude there is a lack of motivation and interest on the learners’ end; however, it may be that the content is the problem, pitched at too low or too high a level.
The Virtue of Small Data
To bridge the gap where big data is not serving our needs, small data can offer personal insights and meaning. So what is small data?
In a nutshell, small data is small enough in size for human comprehension. It differs from big data in the sense that small data can be easily analyzed and compared. It provides a human-scale amount of information that could potentially be tailored to the end users’ needs rather than focusing on business gains, as is the case when measuring ROI.
In an instructor-led training session, small data could take the form of instructor observations, peer assessments, learner reflections and external consultants’ audits of how well the learning goes. In an e-learning environment, small data has even greater potential to provide granular insights. It allows administrators and course designers to closely examine each learner’s digital learning traces: where they click, for how long, what they skip, how they move from one piece of content to another, what questions they post and so on. It also can help examine learner-centered questions, such as who is or is not engaged in discussion forums, how people can receive better coaching feedback online, how a particular user moves through a particular course and whether that user’s trajectory varies greatly from those of others.
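As a rough illustration of the last question, whether one learner’s trajectory varies greatly from those of others, the sketch below compares ordered lists of visited content. It is only one possible approach, not a feature of any particular LMS: the learner names, content IDs, the `trajectory_similarity` and `find_outliers` helpers and the 0.5 threshold are all hypothetical, and Python’s standard `difflib.SequenceMatcher` stands in for whatever similarity measure a real tool might use.

```python
from difflib import SequenceMatcher


def trajectory_similarity(path_a, path_b):
    """Return a similarity score from 0 to 1 for two ordered lists of content IDs."""
    return SequenceMatcher(None, path_a, path_b).ratio()


def find_outliers(paths, threshold=0.5):
    """Flag learners whose average similarity to every other learner is below threshold."""
    outliers = {}
    for learner, path in paths.items():
        others = [p for name, p in paths.items() if name != learner]
        if not others:
            continue  # a single learner has nobody to be compared with
        average = sum(trajectory_similarity(path, p) for p in others) / len(others)
        if average < threshold:
            outliers[learner] = round(average, 2)
    return outliers


# Hypothetical click traces: the order in which each learner opened course items.
paths = {
    "ana": ["intro", "video1", "quiz1", "video2", "quiz2"],
    "ben": ["intro", "video1", "quiz1", "video2", "quiz2"],
    "cai": ["intro", "video1", "quiz1", "quiz2"],
    "dee": ["quiz2", "intro"],
}

print(find_outliers(paths))  # only "dee" falls below the threshold
```

A flag like this says nothing about why the learner took a different route. In keeping with the small data approach, it is a prompt to look at that person’s traces and, ideally, to ask them.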
Small data allows for a deeper dive into each learner’s activities and learning patterns. In turn, it can be used to generate real-time recommendations and adaptations. Furthermore, it could potentially create a two-way dialogue between machine and human in what is known as human-computer cooperation. Learning content can be personalized by the end user and for the end user.
Some ideas to consider for the next generation of LMSs and other learning platforms are:
- A personalized learner dashboard. Many LMSs provide dashboards to display the status of each user’s learning, including the number of courses completed, in progress and overdue. Sometimes they also recommend courses and learning materials based on a user’s learning history. I suggest that the dashboard could go one step further and dynamically map out each learner’s competencies and skills, based on the learner’s current or aspired future role rather than on individual courses. It could also invite input from each learner on their changing aspirations, external learning (e.g., courses taken at a local university) and skills acquired outside the learning platform. Giving learners such data would encourage them to be self-directed and take ownership of their learning and career paths.
- Sentiment analysis of discussion and other social learning interactions. Sentiment analysis looks at the affective component to determine the emotional tone of students based on their discussion forum postings, Q&A sections and other social interactions within the LMS. Analyzing such content would help instructors and facilitators provide timely feedback. Additionally, sentiment analysis is useful at a collective as well as an individual level. Alongside performance and test scores, people’s collective feelings toward particular material within a course are often an indication that something is either amiss or on track. Triangulating these data sets is often manageable and provides richer insight into both the learners and the learning.
- A deeper dive into learning paths. Another useful framework would be to measure the learning path of each learner with a learning pathway tool (based on the path analysis technique) that follows each individual’s route through the content. The tool allows you to answer questions such as “Does your learner learn in the sequence you intended?” “What are some surprising patterns?” and “Where do they skip and opt out?”
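To make the learning path idea more concrete, here is a minimal sketch of how the three questions above might be answered from a handful of learner paths. It is an illustration under stated assumptions rather than a description of any existing learning pathway tool: the item names, the `INTENDED` sequence and the three helper functions are hypothetical, and simple counting stands in for a fuller path analysis.

```python
from collections import Counter

# Hypothetical course design: the order the designer intended.
INTENDED = ["intro", "lesson1", "quiz1", "lesson2", "quiz2"]


def transition_counts(paths):
    """Count how often learners move directly from one item to another."""
    counts = Counter()
    for path in paths:
        counts.update(zip(path, path[1:]))
    return counts


def skip_counts(paths, intended):
    """Count, for each intended item, how many learners never visited it."""
    skipped = Counter()
    for path in paths:
        visited = set(path)
        skipped.update(item for item in intended if item not in visited)
    return skipped


def followed_intended_order(path, intended):
    """True if the items visited appear in the designed order.

    Skipping an item is allowed; going back to an earlier item is not.
    """
    positions = [intended.index(item) for item in path if item in intended]
    return positions == sorted(positions)


# Hypothetical traces for four learners.
paths = [
    ["intro", "lesson1", "quiz1", "lesson2", "quiz2"],
    ["intro", "quiz1", "lesson2", "quiz2"],
    ["intro", "lesson1", "quiz1", "quiz2"],
    ["quiz1", "intro", "lesson1"],
]

print(transition_counts(paths).most_common(3))  # the most common moves
print(skip_counts(paths, INTENDED))             # where learners skip
print([followed_intended_order(p, INTENDED) for p in paths])
```

With a data set this small, every count can be traced back to a named person, which is the point: an unusual transition or a frequently skipped lesson becomes a reason to talk to those learners, not a conclusion in itself.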
Hand in Hand
Strategically using small data in learning is a lot like a sped-up version of ethnography. It allows you to take a microscopic look at an individual level while capturing as many perspectives from as many small data sets as you can to paint a cohesive picture. Perhaps we should rename small data “human data,” with the emphasis on qualitative rather than quantitative information.
Bear in mind that not all the observations will seem relevant or even make sense at first glance. However, patterns will emerge, and it is important to always involve your end users and share your small data findings with them.
In big data analysis, learners rarely get a chance to see why certain decisions are made based on the collection of their information; it is a one-way initiative. With small data, there is an opportunity to return the data to its owners, to invite conversations and to provide value by sharing these personal learning traces with the people who generated them.
To be sure, there is no one definitive way to collect and implement small data, nor should we forgo big data in favor of small data. The key takeaway is to recognize the limitations of big data and avoid the temptation to make decisions based on big data analysis alone. As learning professionals moving forward, we must find ways for big and small data to work together to provide a more holistic view of the learning landscape.