What is it about?

We took a state-of-the-art tool designed to detect when a language learner mispronounces a word and put it in an app. The app taught beginner French using dialogues. We had absolute beginners in French use the app for about an hour and graded their improvement. We found that the state-of-the-art mispronunciation detection tool and a tool that merely guessed at mispronounced words led to similar improvements in learners' French. The guessing tool's feedback also agreed more closely with a teacher's.


Why is it important?

When building AI software, engineers often focus on technical goals that resemble their real-life counterparts but are easier to conceptualize and measure. For pronunciation training, we might end up trying to detect a learner's mispronunciations with 100% accuracy. When a teacher engages a learner, however, she must balance a variety of language goals while keeping her feedback consistent and focused. That the guessing tool kept up with the state-of-the-art system is evidence that those technical goals aren't as closely aligned with real-world performance as we might have thought. We advocate instead for evaluating language learning tools by how well they perform their prospective task, rather than by what is convenient from an engineering perspective.

Perspectives

When I use language learning apps day-to-day, I am often frustrated by the pass-or-fail nature of their feedback. Any mistake means an automatic fail, even if it has nothing to do with the content of the lesson. A teacher, seeing my frustration, would set aside her minor qualms and focus on the lesson content in order to teach me the concept more quickly. To me, this exemplifies the difference in approach between engineers and teachers: the former focus on what's correct; the latter on what's appropriate. That said, I don't think the two approaches are irreconcilable: we need only redefine correctness from something universal to something task-based. That is, the most correct thing is whatever best leads to the desired outcome of the task. If those outcomes are learning goals, they can often be measured directly (e.g. did learners remember the words a week later?). This may make traditional AI engineers uncomfortable, but it is exactly the sort of research that practitioners in Human-Computer Interaction love to do.

Sean Robertson
University of Toronto

Read the Original

This page is a summary of: Designing Pronunciation Learning Tools, April 2018, ACM (Association for Computing Machinery).
DOI: 10.1145/3173574.3173930
