By Kevyn Collins-Thompson

I came home to find my 9-year-old daughter in tears. She had been trying to find articles for a class project on sea urchins, but almost all the Web pages being returned by the major search engines had been too difficult for her to read. Her frustration provided that extra motivation for me to look further into the question:

How could we improve search technology to make it more effective at helping people learn?

Kevyn Collins-Thompson is an associate professor at the School of Information.

Search engines are one of the primary ways that people access the dynamic, ever-expanding information on the Internet. Major search services return fast, precise results for billions of queries per day from around the globe. To achieve this, the last 15 years have seen remarkable advances in search technology: from the complex server engineering required to guarantee reliable, split-second response times, to the algorithms that accurately predict when and where to crawl for the freshest content, how to fix spelling errors, and how to rank relevant pages for your query.

Yet as my daughter’s example shows, current technology still falls short for some important scenarios, like helping people with their learning goals. Here are four recent examples of directions that researchers are exploring to improve technology in service of the educational needs of both children and adults.

Search interfaces adapted to children and teens.

The PuppyIR project (Azzopardi et al., 2009) is a recent research effort that addresses the broad set of problems required to improve information services for young people: from better user interface design to improved algorithms for ranking, security, and content analysis.

Starting with content from commercial services such as Google, Flickr and YouTube, results in the PuppyIR system are processed by a pipeline that performs topic and age-appropriate filtering, comment moderation and other services. Results are presented using intuitive, visually oriented interfaces. Systems using the PuppyIR framework and user interfaces have been deployed and studied in museum, school, and hospital environments in the Netherlands.

Personalizing Web search using reading level prediction.

Novices to a topic, or earlier-stage readers in general, might want their search results with the easiest introductory material ranked highest and advanced material ranked much lower, while experts might prefer the reverse. Recent work has explored the use of machine learning to help solve three problems that typically need to be addressed for this scenario to work effectively:

  1. estimating a user’s expertise profile from their search preferences and behavior;
  2. estimating content difficulty; and
  3. combining these signals to train improved ranking algorithms that reliably find the right level of document difficulty for a given user.

Recent studies have shown that personalized search using readability and comprehensibility signals can improve the search experience significantly (Collins-Thompson et al., 2011; Tan et al., 2012), and it is only a matter of time before this technology is incorporated into widely used search systems.
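To make the three steps above concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not the method used in the cited studies: the difficulty score is a toy proxy based on word and sentence length, the user profile is simply the average difficulty of pages the user previously chose, and the final ranking is a fixed linear blend rather than a trained ranker. The names `Result`, `estimate_difficulty`, `estimate_user_level` and `rerank` are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    text: str
    relevance: float  # score from a base ranker, assumed to lie in [0, 1]


def estimate_difficulty(text: str) -> float:
    """Step 2 (toy proxy): longer words and sentences give a higher score in [0, 1]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    # The constants 30 and 8 are arbitrary caps chosen only for illustration.
    return 0.5 * min(avg_sentence_len / 30.0, 1.0) + 0.5 * min(avg_word_len / 8.0, 1.0)


def estimate_user_level(clicked_texts: list[str], default: float = 0.5) -> float:
    """Step 1 (assumption): a user's level is the mean difficulty of pages they chose."""
    if not clicked_texts:
        return default
    return sum(estimate_difficulty(t) for t in clicked_texts) / len(clicked_texts)


def rerank(results: list[Result], user_level: float, weight: float = 0.5) -> list[Result]:
    """Step 3 (fixed blend, not a trained model): reward results near the user's level."""
    def score(r: Result) -> float:
        gap = abs(estimate_difficulty(r.text) - user_level)
        return (1.0 - weight) * r.relevance + weight * (1.0 - gap)
    return sorted(results, key=score, reverse=True)
```

With `weight=0.0` the function returns the base ranking unchanged; raising `weight` increasingly favors pages whose estimated difficulty matches the reader. A real system would learn both the difficulty model and the blend from data.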

Search engines for intelligent learning systems.

What if applications like intelligent tutoring systems accessed the Web to help them perform their tasks just as readily as people do? For example, an online language tutor might find authentic examples of high-quality Web content that were tailored to individual student goals in order to help them learn new vocabulary in realistic contexts. Like people, intelligent systems would need an ability to find relevant material quickly and precisely. Unlike people, an application might use long, complex queries that expressed multiple specific constraints that good pages should satisfy: using the right target vocabulary, at the right level of difficulty, without too many other unknown words, etc.

The REAP system at Carnegie Mellon University is an example of such a system: it uses sophisticated filtering and ranking technology to deliver personalized language instruction in English, French and Portuguese. REAP has helped hundreds of second-language learners in classrooms for almost a decade, while also providing a fascinating experimental platform to study what helps students learn most effectively. In a controlled study, for example, using the underlying REAP search engine to personalize the topics of example material led to consistent gains in student performance (Heilman et al., 2010).
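The kind of multi-constraint query an application might issue can be sketched in a few lines of Python. This is a simplified illustration, not REAP's actual implementation: `PassageQuery`, `satisfies` and `rank_passages` are hypothetical names, the student's known vocabulary is assumed to be available as a plain set of words, and difficulty is approximated only by the share of words the student does not know.

```python
import re
from dataclasses import dataclass


@dataclass
class PassageQuery:
    target_words: set[str]    # vocabulary the student is meant to practise
    known_words: set[str]     # words the student is assumed to know already
    max_unknown_ratio: float  # tolerated share of other unknown words


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def satisfies(passage: str, query: PassageQuery) -> bool:
    """Check every constraint: contains target vocabulary, few other unknown words."""
    tokens = tokenize(passage)
    if not tokens:
        return False
    if not (set(tokens) & query.target_words):
        return False
    unknown = [t for t in tokens
               if t not in query.known_words and t not in query.target_words]
    return len(unknown) / len(tokens) <= query.max_unknown_ratio


def rank_passages(passages: list[str], query: PassageQuery) -> list[str]:
    """Keep passages that satisfy the query; those with more target words come first."""
    candidates = [p for p in passages if satisfies(p, query)]
    return sorted(candidates,
                  key=lambda p: len(set(tokenize(p)) & query.target_words),
                  reverse=True)
```

A tutor could build one `PassageQuery` per student and lesson, which is what makes such queries longer and more specific than those a person would type.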

Search engines that help you explore topics.

Current search engines are optimized to give highly relevant results for individual queries. However, interactions like exploring and learning about a new topic typically require issuing multiple queries and reading multiple documents over time, sometimes using repeated sessions over hours or days. The field of exploratory search (White & Roth, 2009) deals with these more open-ended quests for knowledge. In particular, researchers are exploring algorithms that accurately predict users’ future queries to recommend sets of Web pages that cover topics effectively and efficiently (Raman et al., 2013). Future systems will understand how concepts depend on one another in order to augment difficult sections of textbooks, and recommend effective resources for background reading.
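One simple way to think about recommending a small set of pages that covers a topic is greedy coverage: repeatedly pick the page that addresses the most aspects not yet covered. The Python sketch below illustrates that idea only; it is a standard greedy heuristic swapped in for illustration, not the algorithm of the cited work. The function name `select_pages` is hypothetical, and the aspects each page covers (standing in for a user's predicted future queries) are assumed to be known in advance.

```python
def select_pages(page_aspects: dict[str, set[str]],
                 wanted: set[str],
                 budget: int) -> list[str]:
    """Greedily choose up to `budget` pages that together cover the wanted aspects."""
    chosen: list[str] = []
    uncovered = set(wanted)
    for _ in range(budget):
        # Sorting the remaining pages makes tie-breaking deterministic.
        remaining = [p for p in sorted(page_aspects) if p not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda p: len(page_aspects[p] & uncovered))
        if not (page_aspects[best] & uncovered):
            break  # no remaining page adds anything new
        chosen.append(best)
        uncovered -= page_aspects[best]
    return chosen
```

The `budget` parameter reflects the efficiency side of the trade-off: a learner should not have to read every page to see every aspect of a topic.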

These four examples show how learning and searching are intertwined, for both people and machines. With rising participation in online education, the advance of more intelligent data mining and machine learning methods, and the pervasiveness of mobile and cloud computing, the future holds much promise for improved tools for understanding and learning.


L. Azzopardi, R. Glassey, M. Lalmas, T. Polajnar, I. Ruthven. “PuppyIR: Designing an Open Source Framework for Interactive Information Services for Children.” 3rd Annual Workshop on Human-Computer Interaction and Information Retrieval (HCIR 2009).

K. Collins-Thompson, P. N. Bennett, R. W. White, S. de la Chica, D. Sontag. “Personalizing Web Search Results by Reading Level.” Proceedings of the Twentieth ACM International Conference on Information and Knowledge Management (CIKM 2011). Glasgow, Scotland. Oct. 2011.

M. Heilman, K. Collins-Thompson, M. Eskenazi, A. Juffs, L. Wilson. “Personalization of reading passages improves vocabulary acquisition.” International Journal of Artificial Intelligence in Education, 20(1), 2010.

K. Raman, P. N. Bennett, K. Collins-Thompson. “Toward Whole-Session Relevance: Exploring Intrinsic Diversity in Web Search.” Proceedings of SIGIR 2013, pp. 463-472.

C. Tan, E. Gabrilovich, B. Pang. “To Each His Own: Personalized Content Selection based on Text Comprehensibility.” In Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM 2012).

R. W. White, R. A. Roth. Exploratory Search: Beyond the Query-Response Paradigm. San Rafael, CA: Morgan and Claypool, 2009.