In 2014, the data unit of La Nacion newspaper in Argentina had access to thousands of documents detailing the expenses claimed by senators in the country. All of them were scans of low-quality copies; the only technology that could scrape them was the human eye. So they built VozData, a platform to crowdsource the task, whose code could also be used in other contexts (under the name CrowData).
As this platform was used to crowdscrape different files, La Nacion Data realised the importance of engaging users in universities in this activity. What features could the platform include to make this happen?
La Nacion Data in Argentina was one of the first data units to emerge in Latin America. Having started out a few years ago by sitting together in coffee shops learning how to use Tableau Public, they have grown, won numerous GEN data journalism awards, been granted Open News fellows by Mozilla, and continued to find creative ways to make data-driven journalism happen in Argentina.
One of these ways was the development of VozData, a crowdscraping platform where files can be shared with an audience, who then answer questions about the files (as a way to extract information from them). This platform was instrumental in the completion of three projects in 2014, including Senate spending (by opening up over 10,000 files), as well as a classified investigation in 2015 (involving over 20,000 audio files).
Given the success of VozData, La Nacion decided to open the code for implementation in other contexts (CrowData). So far, it has been implemented by The Herald in New Zealand for the Money in Politics project.
Since the first use of VozData/CrowData, La Nacion has had an outreach strategy to promote participation on the platform and has partnered with organisations and institutions whose goals align with this exercise in transparency. They’ve also realised that educational institutions can be great allies in the process. What design features should the platform have to encourage the participation of students?
The La Nacion team wanted CrowData to include a ‘teams’ feature like the one in VozData; however, migrating it to the open source platform was too complex and time-consuming.
We introduced La Nacion Data to a developer, who migrated the teams feature from VozData to CrowData. The idea behind prioritising this feature for the open source version was that the sense of belonging to a group promoted accountability and healthy competition on the platform, which in turn promoted the engagement of users in crowdscraping opportunities.
With CrowData, La Nacion aims to spare other newsrooms and organisations from having to walk the same path they did.
For an organisation already seasoned in the challenges around data-driven work, some of the main lessons from the process came from something that would surprise many: the difficulty in translating the goals of the exercise into English for the developer! Communication within a team in development projects can never be taken for granted.
As this development was underway, La Nacion continued to promote VozData/CrowData in different settings, increasingly gaining wisdom on the different ways to promote adoption as they went along. Today, they believe in adding others’ logos and being thorough in attribution in communications materials; organising many in-person activities (such as civic hackathons and events in universities), partly so that everyone can meet the data team of La Nacion and break the potential barriers between themselves and a newspaper of national circulation; and producing VozData-driven articles for the print version of the newspaper.
How has La Nacion Data been able to succeed in ways that other newsrooms could only dream of today? The forces behind VozData/CrowData are senior staff of La Nacion. They started the newspaper’s website in 1995, and years later they decided to start a data section and put it on the online homepage. This support from senior staff has been instrumental for institutional buy-in, which in turn has allowed La Nacion Data to grow into the data unit it is today.
As the developer concludes the migration of the teams feature to CrowData, La Nacion has gained plenty of ideas to keep data-driven journalism coming out of this tool. In their next steps, they will be working on structured data visualisations of the crowdscraped data. For example, in the case of the Senate expenses, who are the top contractors? Where are the highest expenses being made: travel, ceremonies?
La Nacion Data will continue to spread the word about the importance of working on opening data to create demand for more data and foster its usage, especially among students. Putting potentially open, currently closed data in their hands will speed up the process of data awareness, and the demand for better quality, open data.
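To make those two questions concrete, here is a minimal sketch in Python of the kind of aggregation that could sit behind such visualisations. It is not CrowData's code or export format: the field names (`contractor`, `category`, `amount`), the helper `totals_by`, and the sample records are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical crowdscraped records. Field names and values are
# invented for illustration; they are not real Senate expense data.
records = [
    {"contractor": "Contractor A", "category": "travel", "amount": 1200.0},
    {"contractor": "Contractor B", "category": "ceremonies", "amount": 800.0},
    {"contractor": "Contractor A", "category": "ceremonies", "amount": 300.0},
    {"contractor": "Contractor C", "category": "travel", "amount": 500.0},
]


def totals_by(records, key):
    """Sum the 'amount' field grouped by `key`, largest total first."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Who are the top contractors?
top_contractors = totals_by(records, "contractor")

# Where are the highest expenses being made?
top_categories = totals_by(records, "category")
```

Each result is a ranked list of (name, total) pairs, which is roughly the shape a bar chart or ranking table would need.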
Read La Nacion Data’s DataShift community essay here.