What is it about?

Big data powers everything from chatbots and sports analytics to medical research, but its massive quantity and mixed quality make it easy to draw the wrong conclusions without time-consuming and expensive checking of data and results by humans. This paper demonstrates, with three real-world applications, how crowdsourcing tools like Amazon Mechanical Turk can be used to augment big data and improve its value.

Why is it important?

Big data undergirds more critical decisions every day, many of which connect to complex social issues. We define evidence-based best practices for improving its value and accuracy with less cost and effort, focusing on recommendations that anyone can implement to maximize the impact of the augmentation and the data itself.

Perspectives

The more people try tools like ChatGPT, which rely on big data and promise to solve big problems, the more obvious it becomes that biases and limitations in our data can have huge real-world implications. My hope for this paper is to make it easier for researchers to address some of those biases and limitations in ways that are practical, even on short timelines and with limited budgets, regardless of their background or expertise. Again and again, we've discovered that the collective intelligence of many everyday people can sometimes exceed even that of experts, but it still takes work to harness it well. This paper can hopefully make that work easier while also helping ensure workers are treated fairly.

Dr. Nathaniel D. Porter
Virginia Polytechnic Institute and State University

Read the Original

This page is a summary of: Enhancing big data in the social sciences with crowdsourcing: Data augmentation practices, techniques, and opportunities, PLoS ONE, June 2020, PLOS,
DOI: 10.1371/journal.pone.0233154.
