In this article, Joshua Blumenstock argues for a “humbler” approach to data science, one that aims to aid human development while tempering the overpromised silver bullets that have missed their mark in recent decades. He argues that the promise of data-based development lies in machine learning algorithms that can analyze mobile phone data to develop solutions for humanitarian issues in a timely and cost-effective manner. For example, the advertising algorithms that corporations like Google and Facebook use to match products to customers could, with slight tweaks, be used to match resources to people living in poverty. Data-driven solutions range from generating high-resolution maps of crop yields and childhood malnutrition to analyzing digital footprints in ways that can improve public-health interventions during an epidemic or assist national and international responses to crises.
However, Blumenstock also points out four major problems with such tools. Primary among them are unanticipated effects: solutions built on big data can bolster those who are already in positions of power rather than those who need aid, because the ability to extract valuable information from the data lies in the hands of the powerful. Second, he highlights that new approaches to collecting data have not been validated the way older, time-tested (but less efficient) approaches have, and he cites evidence suggesting that patterns cannot always be generalized, since the conditions under which the data was collected can change significantly. Third, Blumenstock underlines the fundamentally biased nature of algorithms trained on mobile phone data; owning and using a mobile phone to provide data requires a baseline level of electricity, connectivity, and social literacy that many people in developing countries do not have. Fourth, he argues that there is a major lack of regulation in data collection, since private companies have little incentive to do anything other than maximize profit. Finally, Blumenstock points to ways forward: new sources of data should be validated and used as complements to existing data sets, solutions should be customized by taking both specific local contexts and existing algorithms into account, and collaboration should be deepened between data scientists, development experts, governments, and private corporations.
The ideas of good intent, transparency, and the balancing act are the pillars at the intersection of data science and human development. Like a system of checks and balances, each pillar poses a set of unique challenges for both data scientists and development advocates to overcome. To clarify, good intent cannot be the only factor motivating data scientists to advance development, as the lives of real people are at stake. Similarly, increasing transparency among those who collect the data, those who provide it, those who analyze it, and those who implement solutions based on it would streamline the process, since both data scientists and human development advocates would have access to the information they need.