Data in fragile contexts – Enhancing context monitoring in Anbar

EXPERIMENT 3 Data in Fragile Contexts
P2 Define

February 8, 2021
Holger Kötzle, Moritz Mang, Heike Wintershoff, Michael Hillebrecht, Erik Lehmann, Christian Merz

0578 v3


65 % of GIZ partner countries are classified as fragile, and the number of projects in these countries is increasing. Fragile countries often lack security, basic infrastructure and economic opportunities due to destruction. Transitional aid in international cooperation aims to re-establish the conditions for a peaceful process of economic and social reconstruction.

COVID-19 has exacerbated existing and sometimes deeply rooted political, economic, social and security challenges, especially in fragile contexts.

Accordingly, the GIZ project Restoration of Peace, Livelihoods and Economic Cycles in Anbar, Iraq focuses on rehabilitating basic productive and social infrastructure in order to reactivate local economic cycles and create income opportunities. It also promotes peaceful and inclusive coexistence in Anbar for the most vulnerable people.


The project relies on detailed information for planning, steering, monitoring and evaluation (both internally and for reporting to our commissioning parties). Up-to-date information is key in these dynamic environments, and reliable, granular ground-level data is crucial for the quality of the services provided. But data is often unavailable, outdated or unreliable. Most data sources are outdated (e.g. household censuses) or highly aggregated (e.g. population structure provided only on a macro/national scale). Security and resource constraints in fragile environments make third-party monitoring with granular data a time-consuming, expensive and sometimes impossible task.

The monthly Context Monitoring in Anbar and Ninewa Governorates, conducted in cooperation with CLIC & Baastel, an M&E service provider, regularly delivers updates on the situation in Anbar and other conflict regions. It combines information from focus group discussions, key informant interviews, manual web and (to some extent) social media analyses, local NGOs, local administrative representatives and international agencies.

The heterogeneity of beneficiaries (the most vulnerable members of host communities, small-scale farmers, farmers’ associations, women and youth) is a major challenge when it comes to comprehensively understanding the status quo and potential conflicts among them. It is key to understand beneficiary needs at sub-district level, to identify societal and economic trends and to correlate potential conflicts with the dynamic characteristics of individual beneficiary groups. For example, current conflicts are related to territorial claims, land ownership, ISIS affiliation, limited available resources etc. Accordingly, CLIC & Baastel interviews key informants from various groups, such as senior officials, influencers, secondary influential decision makers and beneficiaries themselves, to collect information on concerns, (mutual) perceptions, economic needs, desires etc.

Other groups of interest are community leaders, change makers, students and youths. These groups are well represented in social media and conduct directional and trend-setting discussions (humanitarian, civil society related etc.). Consequently, the key question that we want to address with our experiment is: “How can we utilize alternative data sources in order to better understand contextual realities on the ground that are correlated with conflict dynamics?”


We propose to extend the project’s context monitoring by leveraging alternative data sources and systematically (and automatically) merging and cross-referencing them. The aim is to provide reliable, even real-time, information for assessing beneficiary behavior and perceptions and their correlation with potential conflicts.

Secondary data of interest include social media, websites, open data, citizen-generated data, contributor networks etc. We want to test our hypothesis that integrating alternative data supports the validation of findings (e.g. from key informant interviews, KIIs) and provides further details on beneficiaries’ behavior beyond the current context monitoring report, e.g. on hot topics, concerns, perceptions and sentiments of communities.

Data collection in fragile countries is disproportionately difficult and expensive, and can itself be a source of conflict. However, during the COVID-19 crisis, public and private institutions have gradually adapted their data strategies and become more willing to develop concepts for data sharing. We see this as an opportunity to develop a cross-sectional database that merges and cross-references various data sources and provides new and more accurate insights into national dynamics and the real needs of the urban and rural population.

The technologies and methodologies we want to use comprise text mining and analytics, Natural Language Processing and computational linguistics (e.g. unsupervised text classification), as well as crowdsourcing from contributor networks that are engaged through, for example, micro tasks advertised via Facebook. All technologies shall be applied in a variety of languages, including Arabic and local languages.
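To make the idea of unsupervised text classification more concrete, the following is a minimal sketch, not the project's actual pipeline. It groups short posts by topic without any labelled training data, using TF-IDF vectors and a simple "leader" clustering. The function names, the similarity threshold of 0.2 and the sample posts are illustrative assumptions; a real setup for Arabic would additionally need script normalization and stemming, which are omitted here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # \w matches Unicode letters, so Arabic-script words are kept as tokens too.
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, so no weight is ever zero.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(docs, threshold=0.2):
    """Assign each document to the most similar existing cluster leader,
    or open a new cluster if no leader is similar enough (hypothetical threshold)."""
    vectors = tfidf_vectors(docs)
    leaders = []  # index of the first document of each cluster
    labels = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster_id, leader_idx in enumerate(leaders):
            sim = cosine(vec, vectors[leader_idx])
            if sim >= best_sim:
                best, best_sim = cluster_id, sim
        if best is None:
            leaders.append(i)
            best = len(leaders) - 1
        labels.append(best)
    return labels


# Invented sample posts, for illustration only.
posts = [
    "water shortage in the village farmers need water",
    "farmers report water shortage for irrigation",
    "dispute over land ownership between families",
    "land ownership dispute reported near the market",
]
labels = leader_cluster(posts)
```

With these sample posts, the two water-related posts end up in one cluster and the two land-ownership posts in another. The clusters are unnamed; assigning a meaningful topic label to each cluster would still be a manual or separately modelled step.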

Beyond higher frequency and quantity, we want to increase the quality of information. Analyzing metadata of social media posts, such as location and category, can help to detect bias in these posts, since social media users do not represent overall demographics and societal structures. For example, social media users are younger on average than the overall population. In addition, influencer networks and political activists use social media more often than others to push a political agenda. To some extent, users are systematically employed (fake groups, ‘human bots’) to promote certain ideas or topics. We want to systematically detect such contributions and validate tweets and posts by dynamically identifying the source account and the origin of the information, thereby assessing its reliability.
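One simple way to use post metadata for spotting possibly coordinated contributions is sketched below. It is an illustrative heuristic under stated assumptions, not a validated detection method: it flags posts whose (normalized) text is published by several distinct accounts within a short time window. The record fields `id`, `account`, `time` and `text`, the thresholds `min_accounts` and `window`, and the sample data are all hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta


def normalize(text):
    # Ignore case, punctuation and spacing so trivially edited copies match.
    return " ".join(re.findall(r"\w+", text.lower()))


def flag_coordinated(posts, min_accounts=3, window=timedelta(minutes=30)):
    """Return ids of posts whose text was published by at least
    `min_accounts` distinct accounts within `window` of each other."""
    groups = defaultdict(list)
    for post in posts:
        groups[normalize(post["text"])].append(post)

    flagged = set()
    for same_text in groups.values():
        same_text.sort(key=lambda p: p["time"])
        start = 0
        # Sliding window over the time-sorted posts that share the same text.
        for end in range(len(same_text)):
            while same_text[end]["time"] - same_text[start]["time"] > window:
                start += 1
            in_window = same_text[start:end + 1]
            if len({p["account"] for p in in_window}) >= min_accounts:
                flagged.update(p["id"] for p in in_window)
    return flagged


# Invented sample data, for illustration only.
sample = [
    {"id": 1, "account": "a", "time": datetime(2021, 2, 1, 10, 0), "text": "Vote for the new council!"},
    {"id": 2, "account": "b", "time": datetime(2021, 2, 1, 10, 5), "text": "vote for the NEW council"},
    {"id": 3, "account": "c", "time": datetime(2021, 2, 1, 10, 10), "text": "Vote for the new council"},
    {"id": 4, "account": "d", "time": datetime(2021, 2, 1, 13, 0), "text": "Vote for the new council"},
    {"id": 5, "account": "e", "time": datetime(2021, 2, 1, 10, 2), "text": "The road to the market is open again"},
]
flagged = flag_coordinated(sample)
```

In this sample, posts 1 to 3 are flagged, while the later repost (4) and the unrelated post (5) are not. A flag is only a hint for closer review: genuine users also share identical slogans or news, so such a heuristic would need to be combined with other metadata, such as account age or location, before drawing conclusions about reliability.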

Finally, on top of alternative data collection, we consider data integration and innovative visualization to support informed decision-making. For this purpose, additional secondary data sources (e.g. remote sensing imagery integrated in GIS) could be useful.

In summary, we will

  • enhance current data gathering through social media listening, web scraping and crowdsourcing
  • validate information from social media through metadata analysis
  • convert and integrate raw data into infographics, data charts and dashboards for decision support