Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks

Lecturer

Information School

University of Sheffield

About

In this research, we wanted to understand how people discussed citizen science and crowdsourcing topics on online platforms such as Twitter, which we studied over a period of several months.

Citizen science is an area of research where members of the community, the public, or people who are non-professional scientists work on scientific problems to address specific issues that they are particularly concerned about.

One of the main topics of interest is how citizens and communities work together to conduct scientific work. Increasingly, many discussions about citizen science (as well as recruitment for citizen science projects) take place online. Citizen science also involves collecting data from communities and citizens, which is done through crowdsourcing.

How conversations around citizen science and crowdsourcing occur online is therefore crucial to understanding how to engage online users and communities in citizen science research.

So what we wanted to study is how people communicate about crowdsourcing and citizen science on Twitter over a longer term.

If we have a better understanding of the discussions around citizen science, project owners might be able to use this knowledge to engage more with communities and get more people interested in it. After all, citizen science is mainly used to address issues that are of importance and interest to members of the public.

Key Findings

The top hashtags identified in the data we collected were: #citizenscience, #citsci, #crowdsourcing, #crowdsource, #data, #blockchain, #crowdfunding, #innovation, #cswglobal18, #ai, #decentralization and #crypto.
The majority of interactions are inbound, primarily retweets of popular users or tweets. The following illustrates the different activities that we identified in the data collected:
Crowdsourcing/funding initiatives to: find a stolen laptop; support a defamation case against a politician; seek stories on sharing/hiring/borrowing assets or persons; seek lawyers willing to help Philippine Long Distance Telephone Company (PLDT) employees with their contract terminations.
Promotion of: a hackathon to crowdsource mobility solutions; a homebuyer’s app; a marine mammal surveyor course on citizen science; maternity t-shirts; a non-profit organisation to promote Democratic candidates.
News/Opinions regarding: how crowdsourcing (WhatsApp) was used to disrupt anti-terrorist operations in Kashmir; the release of a book on crowdsourcing for filmmakers; US DOJ’s actions on election integrity.
Some accounts have a high number of outgoing and incoming links and, when these are weighted by the number of connections (weighted degree), emerge as key influencers within the community. Such accounts are primarily organisations, research projects and citizen science portals.
From the data we collected, discussions in the citizen science community largely revolve around the following themes: Discussions on citizen science projects, platforms, organisations, personal appeals and courses; Citizens and organisations sharing (ecological) observations; Sharing news and current affairs on politics and public policy; Sharing examples where crowdsourcing has made an impact; Discussions on emerging topics such as Bitcoin, blockchain, decentralization and data.
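As a rough illustration of how a hashtag ranking like the one in the first finding can be produced, here is a minimal Python sketch. It is not the pipeline used in the study: the sample tweets, the regular expression and the `top_hashtags` helper are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample tweets; the study's dataset is not reproduced here.
SAMPLE_TWEETS = [
    "Join our #CitizenScience bioblitz this weekend! #citsci",
    "New #crowdsourcing platform launches #blockchain #data",
    "Great talks at #CSWGlobal18 on #citizenscience",
    "#CitSci volunteers mapped 300 ponds #data",
]

# Assumption: a hashtag is '#' followed by letters, digits or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags, lowercased so #CitSci == #citsci."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(text))
    return counts.most_common(n)


if __name__ == "__main__":
    for tag, count in top_hashtags(SAMPLE_TWEETS, n=3):
        print(f"#{tag}: {count}")
```

Lowercasing before counting is a design choice: it merges variants such as #CitSci and #citsci, which would otherwise split one hashtag's count across several entries.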

How to use

It would be useful for citizen science project owners to identify networks of support and influential people who could share and disseminate their projects. Project owners looking to build their presence on Twitter could ask people who are well known within the community to share information about their projects.
Citizen science projects should share information that is valuable to wide audiences: rather than talking only about the project itself, they should also talk about its wider context, in order to increase awareness and reach a wider audience.
Using Twitter as an opportunity to connect with different types of users will be valuable for project owners, because people will be able to talk to them about how their work is being informed. To do so, it will be useful to hold online discussions during live events, which can also help project owners build their presence on Twitter.

Want to read the full paper? It is available open access

Mazumdar, Suvodeep & Thakker, Dhavalkumar. (2020). ‘Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks’. Future Internet, 12(210), pp. 1-22

About this research

This journal article was part of a collaborative effort

Dhavalkumar Thakker

This research was independently conducted and did not receive funding from outside of the university.



What it means

One of the ways we analysed the data was to study the networks that emerged from the ‘@’ mentions users make when referring to another user. This analysis helped us observe the different types of networks and clusters of users that were being created.

Some of these clusters were based on small, isolated topics that only a few people were discussing, which led to small clusters being formed. These users were not participating much in conversations but just happened to read about, or talk to someone about, a topic around citizen science. We observed a large number of such networks, as would be expected in online social networks.

We found another type of network, which showed strong connections between a large number of users. We referred to this as an inner core, because it consisted of many people talking about many topics and very much engaged in the conversations. These networks were strong and tightly knit, and key users often emerged who would disseminate information to a large number of users. We refer to these users as key influencers within the community.
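The mention-network analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the study: the account names and tweets are invented, and the helper functions (`build_mention_edges`, `weighted_degree`, `connected_components`) are hypothetical. An edge runs from a tweet's author to each user they mention, the edge weight is the number of such mentions, and a user's weighted degree is the sum of incoming and outgoing weights.

```python
import re
from collections import defaultdict

MENTION_PATTERN = re.compile(r"@(\w+)")

# Hypothetical (author, text) pairs; account names are invented for illustration.
SAMPLE_TWEETS = [
    ("alice", "Great dataset from @sci_portal, thanks @bob"),
    ("bob", "@sci_portal is recruiting volunteers"),
    ("carol", "RT @sci_portal: new bird survey is live"),
    ("sci_portal", "Thanks @alice for the observations"),
    ("dave", "@erin have you tried crowdsourcing this?"),
]


def build_mention_edges(tweets):
    """Count directed edges author -> mentioned user; the count is the edge weight."""
    edges = defaultdict(int)
    for author, text in tweets:
        source = author.lower()
        for mentioned in MENTION_PATTERN.findall(text):
            target = mentioned.lower()
            if target != source:  # ignore self-mentions
                edges[(source, target)] += 1
    return dict(edges)


def weighted_degree(edges):
    """Sum of incoming and outgoing edge weights for every user."""
    degree = defaultdict(int)
    for (source, target), weight in edges.items():
        degree[source] += weight
        degree[target] += weight
    return dict(degree)


def connected_components(edges):
    """Group users into clusters, treating mentions as undirected links."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            user = stack.pop()
            if user in cluster:
                continue
            cluster.add(user)
            stack.extend(neighbours[user] - cluster)
        seen |= cluster
        components.append(cluster)
    # Largest cluster first: a rough stand-in for the "inner core".
    return sorted(components, key=len, reverse=True)


if __name__ == "__main__":
    edges = build_mention_edges(SAMPLE_TWEETS)
    degree = weighted_degree(edges)
    print(sorted(degree.items(), key=lambda item: item[1], reverse=True))
    print(connected_components(edges))
```

In this toy data, the invented `sci_portal` account has the highest weighted degree and sits in the largest cluster, while `dave` and `erin` form a small isolated pair, mirroring the two kinds of network described above.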

So different networks facilitated discussions on social media around citizen science. Some of the key influencers are organisations that are very well known within the community, and they often share information about citizen science observations or about their projects.

Another question we asked was what the predominant discussions around citizen science were, and we picked up about five or six themes. For example, there were discussions around citizen science projects, platforms and organisations, as well as citizens and organisations sharing observations. Furthermore, users were sharing news about where crowdsourcing has made an impact, as well as discussing many topics that were emerging at the time, such as Bitcoin, decentralisation and blockchain.

Methodology

In order to understand social interactions among the citizen science and crowdsourcing community, we analysed Twitter data before, during and after the European Citizen Science Association (ECSA) Conference 2018, held in Geneva, Switzerland from 3 to 5 June. Our approach primarily involved identifying keywords and hashtags relevant to the events, topics and themes of study.

As the keywords and hashtags we followed indicate, we collected tweets relevant to general citizen science topics as well as ECSA2018 tweets. To analyse the tweets, we used natural language processing methods and also looked at how often people were talking about citizen science.

We also used statistical methods to understand the different topics being discussed. In total, 238,063 tweets were collected over 124 days, an average of around 2,000 tweets per day, using an automated approach. To identify topics, we used topic modelling, specifically a method called LDA (Latent Dirichlet Allocation), which allows us to take a large amount of content and break it apart into different clusters of topics.
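To give a feel for what LDA topic modelling does, here is a toy Python sketch that fits Latent Dirichlet Allocation with collapsed Gibbs sampling on a handful of made-up documents. It is an assumption-laden illustration rather than the study's implementation: the source does not say which library or settings were used, so the sample documents, the number of topics, the `alpha` and `beta` values and the function names are all hypothetical. A real analysis of hundreds of thousands of tweets would normally rely on an established library rather than a hand-written sampler.

```python
import random

# Hypothetical tokenised documents standing in for pre-processed tweets.
SAMPLE_DOCS = [
    "bird survey volunteers record bird sightings".split(),
    "volunteers record sightings for the survey".split(),
    "blockchain token crowdfunding launch".split(),
    "crowdfunding blockchain token sale".split(),
]


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for Latent Dirichlet Allocation.

    docs is a list of token lists. Returns (vocab, probs), where
    probs[k][i] is the estimated probability of vocab[i] in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, word in enumerate(doc):
                w = word_id[word]
                k = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    probs = [
        [(topic_word[k][i] + beta) / (topic_total[k] + vocab_size * beta)
         for i in range(vocab_size)]
        for k in range(num_topics)
    ]
    return vocab, probs


def top_words(vocab, probs, n=3):
    """Return the n most probable words for each topic."""
    result = []
    for row in probs:
        ranked = sorted(range(len(vocab)), key=lambda i: row[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result


if __name__ == "__main__":
    vocab, probs = fit_lda(SAMPLE_DOCS)
    for k, words in enumerate(top_words(vocab, probs)):
        print(f"topic {k}: {words}")
```

Each topic comes out as a probability distribution over words, and the highest-probability words are what a researcher reads to give the topic a label. The topics themselves depend on the random seed and on the chosen number of topics.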

However, it’s important to note the limitations. We only focused on Twitter as a platform, which limits the audiences that can be reached and the number of people who know about specific citizen science projects.

Another limitation is that this was an observational study, and some of the aspects we studied were collected and analysed using automated methods to identify the topics being discussed. Had we used a manual approach, we might have had slightly different results.

What we did was more of a holistic study of the conversations that were happening. A more detailed and thorough study could go down to individual tweets and individual users to see how they interact, because different users might have different patterns of interaction. It would be interesting to see that kind of study in the future.


Esther Feeken prepared this research following an interview with Dr Suvodeep Mazumdar.