What is it about?

Keystroke dynamics has been shown to be a promising method for user authentication based on a user's typing rhythms. Over the years, it has seen growing use in applications such as preventing transaction fraud, account takeovers, and identity theft. However, because typing behavior is inherently variable, a user's typing patterns may differ on a different keyboard or under a different keyboard language setting, which may affect system accuracy. In other words, an algorithm modeled with data collected using a mechanical keyboard may perform significantly differently when tested with an ergonomic keyboard. Similarly, an algorithm modeled with data collected in one language may perform significantly differently when tested with another language. Hence, there is a need to study the impact of multiple keyboards and multiple languages on keystroke dynamics performance. This motivated us to develop two free-text keystroke dynamics datasets. The first is a multi-keyboard keystroke dataset covering four physical keyboards - mechanical, ergonomic, membrane, and laptop - and the second is a bilingual keystroke dataset in English and Chinese. Data were collected from a total of 86 participants using a non-intrusive web-based keylogger in a semi-controlled setting. To the best of our knowledge, these are the first multi-keyboard and bilingual keystroke datasets, together with the data collection software, to be made publicly available for research purposes. We demonstrated the usefulness of the datasets by evaluating the performance of two state-of-the-art free-text algorithms on them.
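To make "typing rhythm" concrete, the sketch below computes two timing features commonly used in keystroke dynamics: hold time (how long a key stays pressed) and flight time (the gap between releasing one key and pressing the next). It is only an illustration under stated assumptions: the `KeyEvent` layout, the function names, and the sample timestamps are hypothetical and do not reflect the datasets' actual format or the algorithms evaluated in the paper.

```python
from typing import List, NamedTuple, Tuple


class KeyEvent(NamedTuple):
    """One keystroke (hypothetical layout, not the datasets' real schema)."""
    key: str          # key label, e.g. "h"
    press_ms: int     # key-down timestamp in milliseconds
    release_ms: int   # key-up timestamp in milliseconds


def hold_times(events: List[KeyEvent]) -> List[Tuple[str, int]]:
    """Hold (dwell) time: how long each key stays pressed."""
    return [(e.key, e.release_ms - e.press_ms) for e in events]


def flight_times(events: List[KeyEvent]) -> List[Tuple[str, int]]:
    """Flight time: gap between releasing one key and pressing the next.

    A negative value means the next key went down before the
    previous key was released (overlapping keystrokes).
    """
    ordered = sorted(events, key=lambda e: e.press_ms)
    return [
        (a.key + b.key, b.press_ms - a.release_ms)
        for a, b in zip(ordered, ordered[1:])
    ]


# Made-up sample: the word "the" typed once.
sample = [
    KeyEvent("t", 0, 95),
    KeyEvent("h", 140, 230),
    KeyEvent("e", 210, 300),
]

print(hold_times(sample))    # per-key hold times
print(flight_times(sample))  # per-digraph flight times
```

Features like these depend on the physical keyboard and on the language being typed, which is why a model built on one keyboard or language may behave differently on another.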

Why is it important?

An algorithm modeled with data collected using a mechanical keyboard may perform significantly differently when tested with an ergonomic keyboard. Similarly, an algorithm modeled with data collected in one language may perform significantly differently when tested with another language. Hence, there is a need to study the impact of multiple keyboards and multiple languages on keystroke dynamics performance.

Perspectives

We developed two novel datasets - a multi-keyboard dataset and a bilingual dataset - for research purposes. To demonstrate their usefulness, we replicated two state-of-the-art algorithms on them. The results of our initial evaluation indicate that cross-keyboard and cross-language settings significantly affect keystroke dynamics performance.

Ahmed Wahab
Clarkson University

Read the Original

This page is a summary of: Shared Multi-Keyboard and Bilingual Datasets to Support Keystroke Dynamics Research, April 2022, ACM (Association for Computing Machinery), DOI: 10.1145/3508398.3511516.
