Conversations about responsible data collection (Part 3)

This is the third in a series of blogs sharing lessons learned from a collaboration between DataShift and the SPEAK! campaign and the resulting conversations about data management practices among diverse organisations working to overcome social divisions around the world. The series aims to show that sound data management is built on common sense and available to everyone, no matter their level of technical expertise; to get readers thinking and talking about data; and to encourage conscious decisions about its creation, use, protection and disposal. Click here to read the earlier blog posts.


Responsible data management means acting responsibly at every stage of the data lifecycle: creation, use and disposal. Taking a step back to consider data more holistically helps it become part of daily work processes, rather than just a technical issue for the experts.

Most of us are familiar with the need to protect data from theft, but we may not always look at the stages before and after. Was the data ethically collected in the first place? Is some of it superfluous to our needs? What happens to it once it’s out of date? Before thinking of how to protect your data, it’s useful to consider what you choose to collect or create, the reasons behind those choices, and how you ensure that your collection processes are ethical.

During SPEAK! 2018, campaign partners organised dialogue events to overcome division around the world. We used a loose script of questions designed to get them talking about how they work with data, and to help us design support that would meet their needs. The first set of questions in the script dealt with the first stage in the ‘data lifecycle’ – collecting and creating data.

What data do you collect? Is it sensitive or personal?

Data comes in many shapes and forms – numbers, stories, words, videos, images – but is not always recognised as such. Almost all organisations work with at least some data that is personal (meaning it can be used alone or in combination with other information to identify, contact, or locate a single person) or sensitive (meaning it has the potential to harm individuals or communities if exposed unduly).

When people recognise a wider spectrum of content as data, and consider the differing sensitivity of different types, they are more likely to include it in their strategies for protecting data.

Did you obtain the data subject’s consent to use their data? How?

A person may consent to the use of their data for one purpose, but not for another. If you intend to store data, or to use it for purposes beyond the one for which it was originally collected, you should obtain the data subject’s informed consent for that further use.
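For teams that keep their records in a spreadsheet or small database, one way to make purpose-specific consent concrete is to store the agreed purposes next to each person's record and check them before any new use. The sketch below is only an illustration: `ConsentRecord`, `may_use` and the purpose names are hypothetical and not taken from the SPEAK! campaign.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Hypothetical record of what one data subject agreed to."""
    subject_id: str
    purposes: set = field(default_factory=set)


def may_use(consent, purpose):
    """Return True only if the subject consented to this specific purpose."""
    return purpose in consent.purposes


# The participant agreed to appear in the event report, and nothing else.
consent = ConsentRecord("participant-42", {"event_report"})

print(may_use(consent, "event_report"))            # the purpose they agreed to
print(may_use(consent, "fundraising_newsletter"))  # a new purpose: ask again first
```

A check like this does not replace a proper conversation about consent; it simply makes it harder to reuse data for a new purpose by accident.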

Sensitive data is not limited to data that can identify or harm individuals. For example, releasing violent crime data at a neighbourhood level can cause property prices to drop and harm businesses in a particular neighbourhood identified as having high crime rates; this can then affect employment and exacerbate the causes of violent crime.

Is all of it necessary for the project?

It’s common to assume that the more data we collect, the deeper our insights will be, but less is more. An important principle here is data minimisation: the idea that we should collect the minimum amount of data necessary to complete the task at hand. Take a look at everything that is collected – does all of it serve a purpose? If you can’t give a good reason, perhaps collecting it is counterproductive. Details that aren’t helpful to you can still be harmful if they fall into the wrong hands. Collecting only the data that is really needed is more efficient, and the less data is collected, the less is put at risk in case of loss, theft or another compromise. The best way to secure data is to not have it in the first place.
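Where sign-up or survey data passes through a script before it is stored, data minimisation can be applied with a simple allow-list: decide which fields the project needs and discard everything else before saving. This is a minimal sketch under assumed field names; `NEEDED_FIELDS`, `minimise` and the example record are hypothetical.

```python
# Hypothetical allow-list: the only fields this project has a stated reason to keep.
NEEDED_FIELDS = {"event_id", "country", "attended"}


def minimise(record, needed=NEEDED_FIELDS):
    """Return a copy of the record containing only the allow-listed fields."""
    return {key: value for key, value in record.items() if key in needed}


# Example sign-up form entry (invented data).
signup = {
    "event_id": "dialogue-017",
    "country": "Kenya",
    "attended": True,
    "full_name": "Example Person",
    "phone": "+000 000 0000",
    "religion": "not needed for this project",
}

stored = minimise(signup)
print(sorted(stored))  # only the allow-listed field names remain
```

Using an allow-list rather than a list of fields to remove means that any new field added to a form is dropped by default until someone gives a reason to keep it.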

In 2016, Open Whisper Systems, the company behind the encrypted messaging app Signal, was obliged by a US court to hand over details about a number of its users. But because it had applied the principle of data minimisation strictly in its operations, the company held almost no information about its users and had no access to their messages. So when the company complied with the court’s request, it was unable to hand over any useful information and its users’ privacy remained secure.




These blogs are based on the publication How to talk about data? Learnings on responsible data for social change from the SPEAK! campaign, and this work was made possible through a Digital Impact Grant by the Stanford Center on Philanthropy and Civil Society.