What is it about?

Modern software systems are large and complex, which makes it difficult to ensure that they stay reliable during failures. Chaos engineering tests how these systems behave by introducing controlled disruptions and observing the results. This paper reviews 96 academic and industry sources to understand how chaos engineering is defined, how it is used, and what tools support it. We identify the main features of chaos engineering platforms and propose a taxonomy that organizes tools by their environment, automation level, and deployment style. We also apply this taxonomy to compare commonly used tools, and we examine current research trends and open problems in the field.

Why is it important?

Chaos engineering is being used more widely, but research and industry often describe it in different ways. This paper brings these perspectives together by reviewing 96 academic and industry sources published over the last eight years. We provide a unified definition of chaos engineering and identify the core functions shared across tools and practices. A key contribution is a taxonomy that shows how tools differ in their environments and automation features. The taxonomy lets organizations choose tools that match their systems and gives researchers a common basis for comparing approaches. Our findings also reveal gaps in current work, showing where further research and tool development are needed.

Perspectives

Writing this paper was a rewarding experience because it allowed me to bring together work from both researchers and practitioners in an area I care about. I enjoyed the collaboration involved in shaping the review, and the process deepened my appreciation for how much the field has grown. I hope the article helps others navigate chaos engineering more confidently and encourages further work on the open challenges we identified.

JOSHUA OWOTOGBE
Tilburg University

Read the Original

This page is a summary of: Chaos Engineering: A Multi-Vocal Literature Review, ACM Computing Surveys, November 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3777375.
