DATASHIFT CGD VISUALISATIONS

WHERE ARE CITIZEN-GENERATED DATA PROJECTS HAPPENING AND HOW DO THEY RELATE TO THE SUSTAINABLE DEVELOPMENT GOALS (SDGS)?

Ari Sahagún and DataShift have created a series of network visualisations to map and analyse the countries and themes that citizen-generated data (CGD) initiatives are exploring globally. This is part of our work to see how CGD could be used to track the progress being made against the United Nations Sustainable Development Goals (SDGs).

Visualising connections, instead of looking at a database or table, reveals different qualities of the data sets. For example, we can quickly see which SDGs are well represented and which aren’t, and where there may be gaps in the data. By adding some geographic metadata, we can dig a little deeper to see which SDGs are more prominently addressed in different parts of the world and within DataShift’s focus areas.

The three tools we’ve produced compile the CGD projects that we were aware of up to June 2016, but this is not an exhaustive list. If you would rather see the initiatives in text format, browse the list of initiatives.

NETWORK VISUALISATION 101: THE TERMS AND TOOLS

TERMS, ALL THAT WONKY JARGON

Before using the tools, an explanation of the components might be helpful. All network visualisations are composed of two elements: circles and lines. The circles, sometimes called nodes, are connected to each other by lines, also called edges. For you data nerds out there, this is generally how network data is stored as well: in two separate lists, one of nodes and one of edges.
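As a rough illustration of the two-list storage described above, here is a minimal Python sketch. The initiative names and goal labels are invented for the example and are not taken from DataShift’s database.

```python
# Hypothetical example data: these names are made up for illustration only.
nodes = [
    {"id": "initiative_a", "type": "initiative"},
    {"id": "initiative_b", "type": "initiative"},
    {"id": "sdg_4", "type": "sdg"},
    {"id": "sdg_16", "type": "sdg"},
]

# Each edge records that an initiative was tagged with a goal.
edges = [
    ("initiative_a", "sdg_4"),
    ("initiative_a", "sdg_16"),
    ("initiative_b", "sdg_16"),
]


def neighbours(node_id, edges):
    """Return the ids of all nodes directly connected to node_id."""
    linked = set()
    for source, target in edges:
        if source == node_id:
            linked.add(target)
        elif target == node_id:
            linked.add(source)
    return sorted(linked)


print(neighbours("sdg_16", edges))
```

Keeping nodes and edges in separate lists means a new initiative can be added by appending one node and one edge per tag, without restructuring anything else.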

Many network visualisations, also called network maps or graphs, are displayed with a particular layout; in other words, where the circles are relative to each other is important. All of the graphs in this series use a force-directed layout. This means that connected circles are pulled toward each other while all circles push each other apart, so the closer two circles end up, the more related they tend to be.
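To give a feel for what “force-directed” means, the sketch below implements a deliberately simplified layout in plain Python. It is not the algorithm used by the live maps (the tools behind them use more refined methods); the function names, parameters and toy data are all assumptions made for illustration.

```python
import math


def force_layout(nodes, edges, steps=200, step_size=0.05):
    """Toy force-directed layout: every pair of nodes repels,
    and every edge pulls its two endpoints together like a spring."""
    # Deterministic starting positions on a circle (no randomness).
    pos = {
        n: [math.cos(2 * math.pi * i / len(nodes)),
            math.sin(2 * math.pi * i / len(nodes))]
        for i, n in enumerate(nodes)
    }
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes (weaker at larger distances).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-6)
                push = 1.0 / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Attraction along each edge (stronger at larger distances).
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            force[a][0] -= dx
            force[a][1] -= dy
            force[b][0] += dx
            force[b][1] += dy
        # Move every node a small step in the direction of its net force.
        for n in nodes:
            pos[n][0] += step_size * force[n][0]
            pos[n][1] += step_size * force[n][1]
    return pos


def distance(pos, a, b):
    """Straight-line distance between two laid-out nodes."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


# Hypothetical data: sdg_13 has no initiatives in this toy example.
nodes = ["initiative_a", "sdg_4", "sdg_16", "sdg_13"]
edges = [("initiative_a", "sdg_4"), ("initiative_a", "sdg_16")]
pos = force_layout(nodes, edges)
print(distance(pos, "initiative_a", "sdg_4"))
print(distance(pos, "initiative_a", "sdg_13"))
```

In this toy run the initiative ends up much closer to the goals it is tagged with than to the unconnected goal, which is the intuition behind reading closeness as relatedness.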

This barely scratches the surface of network analysis. As we become more and more data literate, interesting ways to collect, analyse, visualise, and eventually act upon data are emerging.

Now that you know some jargon, let’s move on to looking at some maps.

NETWORK VISUALISATION 1: CGDIS AND UN SUSTAINABLE DEVELOPMENT GOALS

This map represents all citizen-generated data initiatives in DataShift’s database as they are connected to UN Sustainable Development Goals.

Network visualisation showing citizen-generated data initiatives (blue dots) as they are connected to UN Sustainable Development Goals (orange dots). Click to explore the live map.

Legend for this specific map:

  • The circles, or nodes, represent a combination of citizen-generated data initiatives (blue) and UN Sustainable Development Goals (orange).
  • The lines between them, or edges, show that an initiative was tagged with that goal.

What to look for:

  • Orange nodes (representing UN Sustainable Development Goals) are sized according to the number of initiatives connected to them. The bigger the node, the more initiatives are focused on that topic. Being very critical of the data, you could also question whether there is a bias in the coding of the initiatives – perhaps some sustainable development goals seem broader or less defined, and therefore could attract initiatives with a vaguer mission. Which nodes are the biggest and smallest, and what does that say about this dataset?
  • Notice the closeness of orange nodes to each other: how close an orange node is to another orange node indicates their co-occurrence in initiatives. This could suggest relatedness of UN goals. Are there two goals that are particularly close to or far from each other? Further, SDGs that are closer to the center of the network map have more initiatives that are connected to other SDGs. These SDGs could be more interdisciplinary. On the other hand, SDGs on the margins have fewer initiatives with co-occurring SDGs.
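The two quantities discussed above, node size and co-occurrence, can both be computed directly from the tagging data. The following is a minimal sketch using invented tags, not DataShift’s actual records.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagging data: which goals each initiative was tagged with.
tags = {
    "initiative_a": ["sdg_4", "sdg_16"],
    "initiative_b": ["sdg_16"],
    "initiative_c": ["sdg_4", "sdg_16", "sdg_5"],
}

# Node size: the number of initiatives connected to each goal.
goal_size = Counter(goal for goals in tags.values() for goal in goals)

# Co-occurrence: how often two goals are tagged on the same initiative.
co_occurrence = Counter()
for goals in tags.values():
    for pair in combinations(sorted(set(goals)), 2):
        co_occurrence[pair] += 1

print(goal_size.most_common())
print(co_occurrence.most_common())
```

Goals with a high count would be drawn as large nodes, and pairs with a high co-occurrence count are the ones a force-directed layout tends to place near each other.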

NETWORK VISUALISATION 2: CGDIS AROUND THE WORLD

This map represents all citizen-generated data initiatives in DataShift’s database as they are connected to UN Sustainable Development Goals and tagged with either a specific continent or as working at the global level.

Network visualisation showing citizen-generated data initiatives (blue dots) as they are connected to UN Sustainable Development Goals (orange dots) and continents (magenta dots). Click to explore the live map.


Legend for this map:

  • The nodes are a combination of citizen-generated data initiatives (blue), UN Sustainable Development Goals (orange), and geographic location – either a continent an initiative is based in or global (magenta).
  • The edges show that an initiative was tagged with that goal and/or continent. Many of the initiatives are tagged with both UN SDGs and specific continents.

What to look for:

  • This map may at first appear rather confusing and cluttered. To begin, look just at the orange nodes – the SDGs. The positions of the goals in the center show how closely related they are to each continent (magenta nodes). For example, if a goal is closer to the lower right, it’s better represented in Africa. If it’s closer to the lower center, it tends to be represented by initiatives that work at the global level rather than within one specific continent.
  • With a quick glance and a trained eye, you can now easily see which continents have fewer initiatives related to them, which SDGs are being focused on, and which are underrepresented in this dataset.
  • Looking at just the magenta and orange nodes, you’ll notice they’re different sizes. These nodes are sized based on the number of their relationships – so smaller continent nodes have fewer associated initiatives. Remember though that this is limited to the data that DataShift has gathered, and doesn’t represent all initiatives — consider contributing to the data set by adding an initiative to the maps!
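The pull of a goal toward a continent comes from how many initiatives share both tags. A minimal sketch of that count, assuming made-up records with a single location tag per initiative:

```python
from collections import Counter, defaultdict

# Hypothetical records: each initiative has goals and one location tag.
initiatives = [
    {"name": "initiative_a", "goals": ["sdg_4"], "location": "Africa"},
    {"name": "initiative_b", "goals": ["sdg_4", "sdg_16"], "location": "Africa"},
    {"name": "initiative_c", "goals": ["sdg_16"], "location": "Global"},
    {"name": "initiative_d", "goals": ["sdg_16"], "location": "Global"},
]

# For every goal, count how many initiatives link it to each location.
goal_by_location = defaultdict(Counter)
for item in initiatives:
    for goal in item["goals"]:
        goal_by_location[goal][item["location"]] += 1

for goal, counts in sorted(goal_by_location.items()):
    location, n = counts.most_common(1)[0]
    print(goal, "is most represented in", location, "with", n, "initiative(s)")
```

A table like this is a useful cross-check on the visual impression, since a cluttered map can make positions hard to judge by eye.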

NETWORK VISUALISATION 3: CGDIS AND UN SDGS IN DATASHIFT’S FOCUS AREAS

This map represents only those citizen-generated data initiatives in DataShift’s database that are concerned with one of our focus countries: Nepal, Tanzania and Kenya (East Africa), or Argentina. These initiatives are also tagged with the UN Sustainable Development Goals they relate to.

Network visualisation showing citizen-generated data initiatives (blue dots) as they are connected to UN Sustainable Development Goals (orange dots) and DataShift’s four focus areas (green dots). Click to explore the live map.

Legend for the map:

  • The circles, or nodes, are a combination of initiatives (blue), UN Sustainable Development Goals (orange), and geographic location – one of the DataShift focus areas (green).
  • The lines between them, or edges, show that an initiative was tagged with that goal and/or the DataShift focus area. Notice that most of the initiatives have more than one link – one or more to a UN goal, and one to one of the focus areas.

What to look for:

  • This map may at first appear rather confusing and cluttered. To begin, look just at the orange nodes – the SDGs. Similar to the map of the continents, the closer an SDG is to a geographic area, the better represented it is in that area (in this data set). Are there any SDGs that aren’t represented well in a particular focus area? This line of questioning can lead toward targeted support for these initiatives in a given area.
  • You can quickly spot any orange nodes (SDGs) that aren’t connected to anything else on this map. This means that they are not represented by initiatives in any of DataShift’s focus areas.
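Unconnected goals can also be found programmatically from the edge list. The sketch below assumes the 17 SDGs are labelled sdg_1 to sdg_17 and uses invented edges for illustration.

```python
# The 17 SDGs, labelled sdg_1 ... sdg_17 for this sketch.
all_goals = [f"sdg_{n}" for n in range(1, 18)]

# Hypothetical edges: each initiative links to goals and to one focus area.
focus_area_edges = [
    ("initiative_a", "sdg_4"),
    ("initiative_a", "Nepal"),
    ("initiative_b", "sdg_16"),
    ("initiative_b", "Argentina"),
]

# Anything that appears as an edge target is connected to an initiative.
connected = {target for _, target in focus_area_edges}

# Goals that never appear are the isolated orange nodes on the map.
unrepresented = [goal for goal in all_goals if goal not in connected]
print(unrepresented)
```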

ABOUT THE DATA – THE COLLECTION AND ANALYSIS METHODOLOGIES

AS WITH ANY SET OF DATA, WE NEED TO BE AWARE OF ITS LIMITS.

Data are an artifact of their collection: the quality of every data set you encounter is influenced by how, when, where and by whom it was gathered. In other words, it depends on what collection methodology was used, how accurately it was applied and what inherent bias may be present.

For our DataShift CGD Visualisations, this initial set of citizen-generated data initiatives was collected through internet searches or drawn from our growing DataShift community. Many of these initiatives were then categorized by DataShift staff against the SDGs that were deemed the most appropriate.

Using this approach to categorizing data brings in possible bias. For example, the goal of Peace, Justice and Strong Institutions was the most represented amongst the CGD initiatives. Is this because the initiatives tend to be more focused on this goal, or because the DataShift initiative tends to attract this type of initiative, or is it because this goal seems more vague and is easily tagged when an initiative doesn’t fit easily into other categories? Acknowledging bias allows us to appropriately qualify any insights and conclusions as well as refine and improve future collection methodologies.

CONCLUSIONS

Network visualisations can be used to quickly understand connections within a data set. With a trained eye and some data literacy, you can glance at a network map and get a feel for what this group of citizen generated data initiatives is up to: from where they’re focused to what sustainable development goals they work on.

Questioning why certain connections you expected aren’t there, why some goals are strongly connected to specific places, and why some goals are over-represented is how you start getting your hands dirty with this data set.

From here, we can start to think about how to strategically intervene and change how the map looks – intentionally. For example, if we want to see more initiatives about a specific SDG at the global level, we can use snapshots of these network maps to measure progress over time.
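One simple way to compare snapshots is to count initiatives per goal at a given level in each one and take the difference. The sketch below uses two invented snapshots; the record layout and the function name goal_counts are assumptions for illustration.

```python
from collections import Counter


def goal_counts(records, location):
    """Count initiatives per goal among records tagged with `location`."""
    counts = Counter()
    for record in records:
        if record["location"] == location:
            counts.update(set(record["goals"]))
    return counts


# Hypothetical snapshots of the same database at two points in time.
earlier = [
    {"goals": ["sdg_16"], "location": "Global"},
    {"goals": ["sdg_4"], "location": "Africa"},
]
later = earlier + [
    {"goals": ["sdg_4", "sdg_16"], "location": "Global"},
    {"goals": ["sdg_4"], "location": "Global"},
]

before = goal_counts(earlier, "Global")
after = goal_counts(later, "Global")

# Change in the number of global-level initiatives for each goal.
change = {
    goal: after[goal] - before[goal]
    for goal in sorted(set(before) | set(after))
}
print(change)
```

A rising count only shows that more initiatives were recorded, so the collection caveats discussed earlier still apply when reading the change as progress.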
