26.04.2024

How Data Work for AI Is Counted (and Accounted For), and Why We Should Be Concerned

by Srravya Chandhiramowuli, PhD Researcher, University of Edinburgh, UK

4 min read

Data workers in the AI industry are expected to spend no more than six seconds drawing boxes around objects, an essential task for training the computer vision models used widely in self-driving cars, X-ray analysis and facial recognition systems. A senior data annotation manager in India, whom I interviewed during my fieldwork in outsourced data annotation centres, referred to this six-second target as the ‘industry standard’ for this type of task. Such stringent targets form the basis of how data work, the indispensable work of compiling, cleaning and annotating datasets with crucial information for training AI systems, is described, discussed and decided upon in the AI industry today. And this approach adversely impacts both data workers and the AI systems they help build.

In the wake of burgeoning interest in AI, the global data work industry is growing rapidly. Millions of workers in regions marked by low incomes, high inflation and growing unemployment, as well as in rural areas, migrant communities, prisons and refugee camps, produce the datasets that are foundational to AI systems. Yet a model-centric outlook on AI innovation devalues data work as nothing more than a set of repetitive, mundane tasks. This characterisation of data work as mundane is evident in targets like the one mentioned above and in the other reductive metrics that govern it.

One of the most widely used metrics in data work is average handling time (AHT), the time allotted to complete a given task. The AHT varies by task type and determines deliverable timelines and deadlines. Its value is agreed upon by the client (the AI company providing the dataset) and the vendor (the annotation centre annotating it) at the start of a project.
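To make the arithmetic behind such targets concrete, here is a minimal, purely illustrative Python sketch of how an agreed AHT cascades into the hourly targets and shift quotas that workers are measured against. The six-second figure comes from the example above; the eight-hour shift and the function names are hypothetical assumptions, not details taken from any actual annotation platform.

```python
# Illustrative sketch only: how an agreed AHT (average handling time)
# translates into hourly targets and per-shift quotas for annotators.
# All figures other than the six-second AHT quoted in this article
# (e.g. the eight-hour shift) are hypothetical assumptions.

SECONDS_PER_HOUR = 3600

def hourly_target(aht_seconds: float) -> float:
    """Number of tasks a worker is expected to complete per hour."""
    return SECONDS_PER_HOUR / aht_seconds

def shift_quota(aht_seconds: float, shift_hours: float = 8) -> float:
    """Expected tasks per shift, assuming no breaks or slack time."""
    return hourly_target(aht_seconds) * shift_hours

if __name__ == "__main__":
    aht = 6  # the 'industry standard' for bounding boxes cited above
    print(f"Hourly target: {hourly_target(aht):.0f} boxes")        # 600
    print(f"Quota per 8-hour shift: {shift_quota(aht):.0f} boxes")  # 4800
```

Even without accounting for breaks, ambiguous images or edge cases, a six-second AHT implies several thousand judgements per shift; that is the scale at which the counting regime described below operates.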

And this value is deeply contested: AI companies negotiate for a lower AHT to push for faster turnarounds, while annotation vendors seek to push back against unrealistic expectations without jeopardising the client relationship. In the global supply chain of AI, counting aids the assertion of the client’s power and authority over annotation processes. Underlying these AHT values and negotiations is a presumption of total countability: the notion that everything, from tasks, datasets and deliverables to workers, work time, quality and performance, can be managed by applying the logic of counting.

This spawns more counting, such as computing and tracking hourly targets, accuracy rates and productivity scores. In annotation centres, these metrics are closely monitored by managers, placing stressful demands on workers to meet the target count every hour. Instead of valuing people’s judgement and discretion and allowing machines to learn from people, counting regimes drive human workers to behave more like machines. Their work and expertise are devalued, resulting in difficult working conditions, meagre wages, limited autonomy and little scope for skill development.

Efforts to promote safety and responsibility in AI would be incomplete without acknowledging and responding to the ground realities of data production for AI. Empowering worker representation and autonomy, and strengthening regulatory scrutiny and accountability in the AI supply chain, are crucial to envisioning more just futures of data work and AI.

 

Further reading:

US lawmakers demand answers from AI companies on use of underpaid, overworked data workers

The Exploited Labor Behind Artificial Intelligence

Fairwork Cloudwork Ratings 2023: Work in the Planetary Labour Market

 

About the Author

Srravya Chandhiramowuli is a PhD researcher in Design Informatics at the University of Edinburgh. Her PhD research examines the work of data annotation for AI, paying particular attention to systemic challenges and frictions, to envision and inform just, equitable futures of AI.

Technology, Employment and Wellbeing is a new FES blog that offers original insights on the ways new technologies impact the world of work. The blog focuses on bringing different views from tech practitioners, academic researchers, trade union representatives and policy makers.

