What is it about?
Precise and comprehensive measurements of socioeconomic status are crucial for both academic research and policymaking. However, in developing countries such measures are available at the household level only at extremely low frequencies, for example through a decadal census. A number of papers have attempted to predict economic status at aggregated geographical levels, such as district or neighborhood, using Deep Learning on images, with varying degrees of success. However, the utility of such an approach at the household level remains an open question. In this study we apply Deep Learning models to household images collected from four northeastern states in India to assess the feasibility and ethics of household-level income status prediction. We categorize households into classes based on income and then train a Swin Transformer model with cross-entropy loss and triplet loss to predict the socioeconomic class of each household. We then compare the prediction accuracy of our model with predictions based on a simple list of household assets and with predictions from a set of expert human annotators. We find that the use of Deep Learning on images does not lead to any substantial gains in prediction accuracy. Further, we note that human accuracy on this prediction task is low, raising questions about how much information the images actually contain. Our study raises important questions regarding the ethical implications of utilizing household images for predicting socioeconomic status. We explore these ethical implications, emphasizing the importance of a cautious and considerate approach to incorporating image-based techniques.
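For readers unfamiliar with the two training losses named above, the following is a minimal sketch of how a cross-entropy term and a triplet term can be combined into a single objective. It is not the authors' implementation: the summary does not say how the two losses are weighted or which margin is used, so the `weight` and `margin` parameters, the function names, and the use of plain Python lists in place of model outputs are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy for one example, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Penalize the anchor unless it is closer to the same-class example
    (positive) than to the different-class example (negative) by `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def combined_loss(logits, label, anchor, positive, negative,
                  margin=1.0, weight=1.0):
    """Hypothetical combination: classification loss plus a weighted
    triplet term. The weighting scheme is an assumption."""
    return cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin
    )
```

In this sketch the cross-entropy term rewards assigning the correct income class, while the triplet term encourages images of households in the same class to have nearby embeddings; in practice both would be computed on the outputs of the image model rather than on hand-written lists.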
Why is it important?
Automating the prediction of household socioeconomic status from images belongs to the broader class of algorithmic solutions to high-stakes decision-making problems. It can potentially help governments allocate scarce resources for sustainable development by identifying low-income households in a timely and cost-effective manner. Further, such predictions can provide a foundation for studying inequality and the determinants of economic growth at scale, particularly in developing countries with inconsistent income reporting and monitoring practices. Capturing images for socioeconomic prediction can have several advantages over the traditional method of collecting detailed lists of asset ownership. One of the primary benefits is the speed at which data can be collected: with image capture, enumerators may gather data more quickly, without extensive surveys or interviews. This would not only save time but also reduce fatigue for both data collectors and participants, which can often lead to errors. Moreover, capturing images requires less human capital than traditional methods, making it a cost-effective option. Beyond saving time and cost, images may also help in comparative analysis over time. Whereas retrospective data reported by household members can easily suffer from recall bias, images can be taken over a period of time and stored, eliminating any human recall bias.
Read the Original
This page is a summary of: Assessing the Feasibility and Ethics of Economic Status Prediction using Deep Learning on Household Images, ACM Journal on Computing and Sustainable Societies, June 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3675160.