What is it about?

Creating high-quality exam questions from textbooks is a time-consuming task that usually requires subject experts. While recent AI systems can generate questions automatically, they often produce content that is inaccurate, inconsistent, or poorly aligned with what students are actually taught. In this work, we present a method that helps AI generate reliable, curriculum-aligned multiple-choice questions directly from textbooks. The key idea is to guide the AI step by step: first by following the textbook’s structure, then by summarising each section twice and checking that both summaries agree, and finally by generating questions only from content that passes this consistency check. Using this approach, we automatically produced thousands of high-quality exam-style questions and evaluated them using carefully designed criteria that measure accuracy, clarity, and educational value. Our results show that it is possible to make AI-generated questions both scalable and trustworthy, provided that strong quality-control mechanisms are built into the process.
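The double-summary check described above can be illustrated with a minimal sketch. This is not the paper's implementation: the summary does not say how agreement between the two summaries is measured, so the token-overlap (Jaccard) score, the 0.6 threshold, and the `summarise` callable (a hypothetical stand-in for a call to a language model) are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two texts.

    Assumed agreement measure; the source does not specify one.
    """
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty summaries trivially agree
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def consistent_sections(
    sections: Iterable[Tuple[str, str]],
    summarise: Callable[[str], str],
    threshold: float = 0.6,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Summarise each (title, text) section twice and keep it only if
    the two summaries agree at or above the threshold."""
    kept = []
    for title, text in sections:
        first = summarise(text)
        second = summarise(text)
        if jaccard(first, second) >= threshold:
            # Only sections that pass the check would go on to
            # question generation; the first summary is kept.
            kept.append((title, first))
    return kept
```

In a pipeline of this shape, question generation would run only on the sections returned by `consistent_sections`, so content the model summarises inconsistently never reaches the question bank.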

Why is it important?

As AI tools are increasingly used in education, training, and assessment, trustworthiness has become a critical concern. Question banks generated at scale are only useful if their answers are correct, their explanations are clear, and their content aligns with formal learning materials. This work introduces a practical framework for improving the reliability of AI-generated educational content. By combining structural guidance from textbooks with consistency-based filtering, our approach significantly reduces errors while improving the depth and clarity of generated questions. Although demonstrated in a specialised domain, the ideas behind this framework are broadly applicable to many fields where expert-validated knowledge is essential. The methods and evaluation criteria proposed here can help support safer, higher-quality use of AI in education and knowledge-intensive applications.

Perspectives

This project grew out of a simple concern: powerful AI systems can now generate educational content at scale, but scale alone is not enough. Accuracy and alignment matter just as much. Working on this paper reinforced my belief that effective AI systems should respect existing knowledge structures rather than bypass them. By anchoring generation in textbooks and explicitly checking for consistency, we aimed to show that reliability can be designed into AI pipelines rather than treated as an afterthought. I hope this work encourages more research on quality-controlled AI generation and helps bridge the gap between impressive language models and the practical standards required in real educational settings.

Haimo Lu
Tsinghua University

Read the Original

This page is a summary of: TCM-Align: Curriculum-Aligned MCQ Generation for Traditional Chinese Medicine, October 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3714394.3756270.
