Ongoing Projects

Modeling Climate-Induced Societal Adaptation and Population Displacement with New Machine-Coded Environmental Event Data

This project will generate original data to study (1) how climate and environmental stressors shape a broad range of individual and group adaptation behaviors and (2) which government-level adaptations reduce the prevalence of socially problematic adaptations, such as household migration or local inter-group violence, which often precede wider societal disruptions. To generate original data on climate adaptation policies and behaviors at the individual, social group, and governmental levels, we will apply a series of fine-tuned Large Language Models (LLMs) to a unique corpus of 120 million articles published from 2012-2024 by local news outlets based in more than 60 developing countries. This will allow us to create a monthly, sub-national (ADM1 [province] or ADM2 [district]) dataset on 10 distinct types of environmental adaptations never before measured at such scale. We will also use this method to generate new media-derived data on reporting of sudden-onset weather and environmental events, as well as slow-onset environmental change, that we expect to cause adaptation behaviors. We will complement these data with forthcoming data from the International Organization for Migration (IOM), publicly available project-level data on adaptation-focused development aid from the OECD, and high-resolution data on climate and environmental stressors.

Machine Learning for Peace

The Machine Learning for Peace (MLP) project provides digital tools that help donors and civil society monitor and anticipate major political events and influence from geopolitical competitors, along with a flexible infrastructure for policy research. MLP uses recent advances in big data and machine learning to provide actionable data at an unprecedented scale and frequency across more than 60 countries. MLP builds continuously on a repository of more than 120 million news articles capturing more than 12 years of daily coverage from a curated sample of more than 350 online newspapers publishing in more than 37 languages.

Applying cutting-edge tools, MLP identifies the events and locations reported in each article and uses predictive analytics to detect historical patterns and forecast likely changes in political conditions and foreign influence up to six months into the future. MLP’s digital tools provide access to monthly data tracking 20 events bearing on civic space, including censorship, corruption, and legal changes, and 22 events indicative of foreign influence by geopolitical competitors, ranging from security engagements to domestic interference. Importantly, these data are updated every 90 days, ensuring their timeliness for analyzing current events in addition to longer-term trends.

Central American Regional Media Project (ReMedios)

Under the ReMedios activity, DevLab is designing and implementing an evaluation of USAID programming meant to strengthen the capacity and security of journalists and media outlets working on issues related to corruption and government transparency. This project will involve a three-wave panel survey of journalists across four countries in Central America, scraping and machine classification of online news, and training a large language model to identify markers of quality in corruption reporting. See the Design Report for more details.

Distinguishing Legitimate, Superficial, and Fabricated Anti-corruption Campaigns

Under this two-year activity with Internews and the International Foundation for Electoral Systems (IFES), DevLab@Penn is developing a system to detect reporting on anti-corruption campaigns and classify each campaign as Legitimate, Superficial, or Fabricated. Specifically, we will draw on the Machine Learning for Peace project’s infrastructure, which scrapes articles from more than 350 high-quality domestic media outlets based across 60 countries and uses large language models to detect reporting on corruption-related legal measures (including arrests, legal actions, purges, and raids). Under this project, DevLab will identify anti-corruption campaigns by detecting large increases in reporting on corruption-related legal measures. DevLab will then measure the extent to which these legal measures indicate genuine or politically motivated campaigns.
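
As a rough illustration of what "detecting large increases in reporting" could look like, the sketch below flags months whose article count sits far above a trailing baseline. It is not the project's actual method: the function name `detect_spikes`, the six-month window, the threshold of three standard deviations, and the example counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_spikes(counts, window=6, threshold=3.0):
    """Return indices of months whose count exceeds the trailing-window
    mean by more than `threshold` standard deviations.

    Hypothetical helper: the window and threshold are illustrative
    assumptions, not values taken from the project.
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]  # the preceding `window` months
        mu = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline does not
        # turn a one-article change into a "spike".
        sigma = max(stdev(baseline), 1.0)
        if (counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Made-up monthly counts of articles on corruption-related legal measures
# for a single country; the values are invented for this example.
monthly_counts = [4, 5, 3, 6, 4, 5, 4, 30, 28, 5, 4, 6]
print(detect_spikes(monthly_counts))
```

With these invented counts, only the first month of the surge is flagged, because that month then enters the trailing baseline and inflates it. A rule for grouping adjacent high-count months into a single campaign would have to be layered on top.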