AuthAttLyzer: A Robust defensive distillation-based Authorship Attribution framework

Abhishek Chopra; Nikhill Vombatkere; Arash Habibi Lashkari

doi:10.1145/3586102.3586109

What is it about?

In this paper, we draw on recent advances in cybersecurity and propose a new architecture for source code authorship attribution built on defensive distillation, which makes the model more robust to adversarial attacks by lowering its misclassification rate. Through empirical analysis, we show that defensive distillation, combined with varied feature selection, reduces misclassifications on perturbed source code files from the Google Code Jam and GitHub databases while maintaining 95% accuracy on legitimate source code files.
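
For readers unfamiliar with defensive distillation, the general idea can be sketched as follows: a teacher model's logits are passed through a softmax at a high temperature T to produce soft labels, and a second model is trained against those soft labels. This is only a minimal illustration of the generic technique, not the architecture from the paper; the function names, the logits, and the temperature value are all assumptions made for the example.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T; a larger T gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_cross_entropy(soft_targets, student_logits, temperature):
    """Cross-entropy between teacher soft labels and the student's softened prediction."""
    student_probs = softmax_with_temperature(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(soft_targets, student_probs))

# Hypothetical teacher logits for one source file over three candidate authors.
teacher_logits = [6.0, 2.0, 1.0]
T = 20.0  # assumed distillation temperature, chosen only for illustration

hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=T)

# A student whose logits match the teacher's incurs a lower loss than one that disagrees.
loss_match = soft_label_cross_entropy(soft, [6.0, 2.0, 1.0], T)
loss_other = soft_label_cross_entropy(soft, [1.0, 2.0, 6.0], T)
```

The soft labels keep the teacher's top-ranked author but spread probability across the other candidates, which is what smooths the trained model's response to small input perturbations.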

Why is it important?

Although it poses a privacy threat to open-source programmers, Source Code Authorship Attribution (SCAA) has proven immensely valuable in forensic applications, including ghostwriting detection, copyright dispute resolution, identifying the authors of malicious applications from their source code, and other code analysis tasks. Recent SCAA techniques have demonstrated exceptional performance on diverse datasets. However, new challenges have emerged: gradient-based attacks and universal perturbations can adversarially modify source code, reducing the accuracy of state-of-the-art classification techniques based on deep neural networks to as low as 10%.

This page is a summary of: AuthAttLyzer: A Robust defensive distillation-based Authorship Attribution framework, December 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3586102.3586109.

Source Code Authorship Attribution (SCAA) identifies the actual author of source code in a corpus.
