The problem: AI and the decline of cognitive engagement
Artificial intelligence (AI) is rapidly transforming how we access information. Large Language Models (LLMs) can generate answers to complex questions, compose essays, and assist with coding. While this convenience is remarkable, it may also come at a cost to our cognitive skills.
According to recent studies [1] [2] [3], people increasingly rely on AI-generated responses without engaging in deep thinking, critical analysis, or problem solving. Instead of searching for, evaluating, and synthesizing information, as is common when searching the Web, users tend to accept the AI’s answers, often without questioning their accuracy. Studies suggest that excessive dependence on AI can lead to cognitive offloading, where individuals become less engaged in reasoning and problem solving and instead rely on AI to do the work for them.
The decline in cognitive engagement is notable across several key areas:
- Critical Thinking – People accept AI-generated responses without questioning their accuracy or considering alternative viewpoints.
- Reasoning Skills – Instead of analyzing and synthesizing information, users passively consume AI-generated content.
- Memory Retention – When information is readily available, there is less incentive to remember facts or develop recall strategies.
While AI has clear advantages, it is important to recognize these cognitive risks and work on solutions that encourage users to engage with information rather than passively accept it. So, how can we ensure AI enhances human intelligence rather than diminishing it? Although this problem is complex and multi-dimensional, one promising mitigation approach is the use of hints – subtle clues that guide users toward the right answers without directly revealing them [4]. In this blog post, we explore how hints can help, our approach to implementing hints, and how researchers could contribute to HintEval [5], a framework designed for automatic hint generation and evaluation.
The solution: how hints could help
Instead of direct answers, hints nudge users toward discovering solutions on their own. This approach, common in education, uses guiding questions rather than providing solutions. What if AI worked the same way?
Hints have the potential to sustain cognitive effort by encouraging users to actively engage with information. However, not all hints are equally effective. The usefulness of a hint depends on various factors, such as how well it guides users toward an answer without directly revealing it, how familiar the hint feels to the user, and whether it avoids unnecessary complexity. A well-designed hint thus stimulates:
- Problem-Solving Skills – Users must think critically about the hint and how it relates to the question.
- Memory Retrieval – Hints encourage users to recall relevant facts rather than relying on instant AI-generated answers.
- Analytical Reasoning – Users must evaluate how the hint connects to potential answers.
This shift from "AI as an answering machine" to "AI as a thinking assistant" could allow users to remain in control of their cognitive processes while still benefiting from AI’s vast knowledge base.
Putting it to the test: using hints to solve this problem
To explore the usefulness of hints, we have been working on a project to generate and evaluate hints for fact-based questions. We developed a large-scale dataset of hints designed to help users arrive at answers without directly revealing them [4]. We prompted LLMs to generate hints based solely on information from the Internet and to provide their sources; this grounding helps prevent hallucinations. A minimal sketch of such a prompting setup appears after the example below. Each hint was crafted to balance:
- Convergence – Leading users toward an answer.
- Familiarity – Using recognizable concepts.
For example, instead of answering "Which city is home to the International Monetary Fund?" directly, AI-generated hints might include:
- The city is the capital of the USA, located on the East Coast.
- The city is located on the Potomac River.
- The city is known for its neoclassical architecture.
These hints encourage users to piece together information, reinforcing cognitive engagement.
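To make this concrete, below is a minimal sketch of how such grounded hints could be requested from an LLM. The `call_llm` helper and the exact prompt wording are hypothetical stand-ins for whichever chat-completion client and instructions one prefers; this illustrates the idea rather than reproducing the prompting setup used for the dataset in [4].

```python
# Minimal sketch of grounded hint generation (illustrative only; not the exact
# prompt or pipeline used for the dataset in [4]). `call_llm` is a hypothetical
# helper standing in for any chat-completion client.

def build_hint_prompt(question: str, passages: list[str]) -> str:
    """Ask for hints that guide toward the answer without revealing it,
    grounded only in the supplied web passages (cited as sources)."""
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Using ONLY the sources below, write 3 hints for the question.\n"
        "Each hint must point toward the answer without stating it, rely on "
        "familiar concepts, and cite the source it is based on, e.g. '(source [2])'.\n\n"
        f"Question: {question}\n\nSources:\n{sources}"
    )

def generate_hints(question: str, passages: list[str], call_llm) -> list[str]:
    """Return one hint per non-empty line of the model's reply."""
    reply = call_llm(build_hint_prompt(question, passages))
    return [line.strip("- ").strip() for line in reply.splitlines() if line.strip()]

# Usage (with any LLM client wrapped as `call_llm(prompt) -> str`):
# hints = generate_hints(
#     "Which city is home to the International Monetary Fund?",
#     retrieved_web_passages,
#     call_llm,
# )
```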
Testing the impact of hints
We conducted experiments where users answered questions using hints instead of direct answers. Results showed that hints significantly improved users’ ability to recall information and arrive at correct answers without feeling overwhelmed. Even for difficult questions, users found hints useful in guiding them toward the answer while keeping them actively engaged. Figure 1 presents the users' results.
This experiment confirmed that AI can assist with knowledge retrieval while still promoting independent thinking – a win-win situation.
Introducing HintEval: a research toolkit for hints
Encouraged by these findings, we developed HintEval [5], a framework for researchers interested in studying and creating hints.
HintEval serves two key purposes:
- Generating Hints – Researchers can experiment with different methods to create effective hints for various types of questions.
- Evaluating Hints – The framework includes evaluation tools to measure how useful and effective hints are in guiding users toward answers.
One challenge in hint generation is ensuring quality. HintEval introduces standardized evaluation metrics such as the following (a simplified illustration of one such check follows the list):
- Relevance – How closely the hint relates to the question.
- Readability – How clear, concise, and easy to understand the hint is.
- Convergence – Whether the hint guides users toward the correct answer.
- Familiarity – How well users recognize the concepts in the hint.
- Answer Leakage – Whether the hint avoids directly revealing the answer.
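As a deliberately simplified illustration of what such a metric can measure, the snippet below scores answer leakage as the fraction of answer tokens that reappear in the hint. This toy lexical check is our own sketch of the idea; HintEval’s actual metrics are more sophisticated (for example, model-based scoring), so it should not be read as the framework’s implementation.

```python
# Toy answer-leakage check: fraction of answer tokens that also appear in the hint.
# This is only a sketch of the idea behind the metric, not HintEval's implementation.
import re

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def answer_leakage(hint: str, answer: str) -> float:
    """Return a score in [0, 1]; higher means the hint gives away more of the answer."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & _tokens(hint)) / len(answer_tokens)

print(answer_leakage("The city is located on the Potomac River.", "Washington, D.C."))  # 0.0
print(answer_leakage("The answer is Washington, D.C.", "Washington, D.C."))             # 1.0
```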
By providing a unified toolkit, shown in Figure 2, HintEval allows researchers to develop effective hint-generation techniques, test them across different datasets, and standardize evaluations. Our goal is to make hint research more accessible so AI can support cognitive skills rather than replace them.

Figure 2: Workflow of HintEval: (1) Questions are loaded and converted into a structured dataset using the Dataset module. (2) Alternatively, users can load preprocessed datasets directly. (3) Hints can be generated for each question using the Model module and stored in the dataset object. (4) The Evaluation module assesses all generated hints and questions using various evaluation metrics, storing the results in the dataset object. (5) The updated dataset can be saved and reloaded as needed.
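For orientation, the five steps in the caption can be mirrored in a short, self-contained Python sketch. The toy `Instance` dataclass, the fixed hint, and the hard-coded answer below are ours, not HintEval’s classes or API (see [5] and the project documentation for the real interface), and the `answer_leakage` function from the earlier sketch stands in for a proper metric.

```python
# Toy walk-through of the Figure 2 workflow with made-up minimal structures.
# These are NOT HintEval's classes or API (see [5] for the real framework);
# `answer_leakage` from the earlier sketch stands in for a real metric.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Instance:
    question: str
    answer: str
    hints: list = field(default_factory=list)   # filled in step (3)
    scores: dict = field(default_factory=dict)  # filled in step (4)

# (1)/(2) Load questions into a structured dataset (or reuse a preprocessed one).
dataset = [Instance("Which city is home to the International Monetary Fund?", "Washington, D.C.")]

# (3) Generate hints for each question; a fixed hint stands in for an LLM call here.
for item in dataset:
    item.hints = ["The city is located on the Potomac River."]

# (4) Evaluate every hint with the chosen metrics and store the results.
for item in dataset:
    item.scores["answer_leakage"] = [answer_leakage(h, item.answer) for h in item.hints]

# (5) Save the updated dataset so it can be reloaded later.
with open("hints_with_scores.json", "w") as f:
    json.dump([asdict(item) for item in dataset], f, indent=2)
```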
Conclusion
LLMs are here to stay, and their impact on human cognition cannot be ignored. While they make information more accessible than ever before, they also introduce a risk – reducing cognitive effort and discouraging independent thinking.
Hints could offer a solution by balancing AI’s capabilities with human cognitive engagement. By shifting from direct answers to subtle guidance, hints encourage users to:
- think critically,
- recall knowledge,
- reason through problems.
Our research demonstrates that hints significantly improve AI interactions – helping people learn, remember, and solve problems without becoming passive consumers of AI-generated content. With HintEval, researchers now have a toolkit to refine and expand hint-based AI interactions.
References
- [1] Lee, H., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., & Wilson, N. (2025). The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. In CHI Conference on Human Factors in Computing Systems (CHI ’25), April 26–May 01, 2025, Yokohama, Japan. ACM, New York, NY, USA, 23 pages. Retrieved from https://www.microsoft.com/en-us/research/wp-content/uploads/2025/01/lee_2025_ai_critical_thinking_survey.pdf
- [2] Jošt, G., Taneski, V., & Karakatič, S. (2024). The Impact of Large Language Models on Programming Education and Student Learning Outcomes. Applied Sciences, 14(10). doi:10.3390/app14104115
- [3] Heersmink, R. (2024). Use of large language models might affect our cognitive skills. Nature Human Behaviour, 8(5), 805-806. doi:10.1038/s41562-024-01859-y
- [4] Mozafari, J., Jangra, A., & Jatowt, A. (2024). TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2060-2070). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3626772.3657855
- [5] Mozafari, J., Piryani, B., Abdallah, A., & Jatowt, A. (2025). HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions. doi:10.48550/arXiv.2502.00857
DOI: tba
This work is licensed under a Creative Commons Attribution 4.0 International License
Written by Jamshid Mozafari in February/March 2025
PhD candidate at the Department of Computer Science (Data Science Group)
University of Innsbruck
About the author
I am a PhD candidate in the Data Science Group, supervised by Prof. Adam Jatowt. My research focuses on natural language processing and information retrieval, with an emphasis on automatic hint generation, hint evaluation, and open-domain question answering systems. I have published articles in top-tier IR/NLP conferences and leading journals.
Research area
Natural Language Processing and Information Retrieval