Lecturer/Assistant Professor | University of Sheffield

Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks

Sustainable Cities and Communities

In this research, we wanted to understand how people discussed citizen science and crowdsourcing topics on online platforms such as Twitter, which we studied over a period of several months.


Journal article: Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks (2020)
Peer Reviewed


This empirical study used a mixed methods approach, combining natural language processing methods, topic modeling with LDA (latent Dirichlet allocation), and social media monitoring and analytics.

In order to understand social interactions among the citizen science and crowdsourcing community, we analyzed Twitter data before, during and following the European Citizen Science Association (ECSA) Conference 2018, held in Geneva, Switzerland from 3 to 5 June. Our approach primarily involved identifying keywords and hashtags relevant to the events, topics and themes of study.

As the keywords and hashtags we followed indicate, we collected tweets relevant to generic citizen science topics alongside tweets about ECSA2018. To analyze the tweets, we used natural language processing methods and also looked at how often people were talking about citizen science.

We also used statistical methods to understand the different topics being discussed. In total, 238,063 tweets were collected over 124 days, an average of around 2,000 tweets per day. We analysed them with an automated approach called topic modeling, specifically a method called LDA (latent Dirichlet allocation), which allows us to take a large amount of content and break it apart into different clusters of topics.
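To make the idea of topic modeling more concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the Python standard library only. It is an illustration under stated assumptions, not the pipeline used in the study: the function names, the toy tweets, the whitespace tokenisation and the parameter values (`num_topics`, `alpha`, `beta`) are all hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit a tiny LDA model by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (topic_word, doc_topic), where
    topic_word[k] counts the words assigned to topic k and doc_topic[d][k]
    counts the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            topic_word[k][word] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                # Weight each topic by how common it is in this document
                # and how strongly it is associated with this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][word] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word, doc_topic


def top_words(topic_word, topic, n=5):
    """Return up to n of the most frequent words assigned to one topic."""
    return [w for w, c in topic_word[topic].most_common(n) if c > 0]


if __name__ == "__main__":
    # Hypothetical toy tweets, tokenised by splitting on whitespace.
    toy_tweets = [
        "volunteers count birds for citizen science survey",
        "blockchain crypto crowdfunding token launch",
        "citizen science volunteers record birds and insects",
        "crypto blockchain decentralization token",
    ]
    tokenised = [tweet.split() for tweet in toy_tweets]
    topic_word, doc_topic = lda_gibbs(tokenised, num_topics=2)
    for topic in range(2):
        print(topic, top_words(topic_word, topic))
```

On a real collection one would normally remove stop words, URLs and user handles first and rely on an established library rather than a hand-written sampler. The sketch only shows the mechanics of assigning each word to a topic and regrouping the counts.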

However, it's important to note the limitations. We focused only on Twitter as a platform, which limits the audiences that can be reached and the number of people who learn about specific citizen science projects.

Another limitation is that this was an observational study, and some of the aspects we studied were collected and analysed using automated methods to identify the topics being discussed. Had we used a manual approach, we might have had slightly different results.

What we did was more of a holistic study of these conversations. A more detailed and thorough study could go down to individual tweets and individual users to see how those users interact, because different users might have different patterns of interaction. It would be interesting to see that kind of study in the future.



This research was independently conducted and did not receive funding from outside of the university.

Key points

  • Our observational study of citizen science discussions on Twitter revealed a vast range of topics, many of which we didn’t expect. It improved our understanding of what kinds of networks facilitate these conversations.

Citizen science is an area of research where members of the community, the public, or people who are non-professional scientists work on scientific problems to address specific issues that they are particularly concerned about.

One of the main topics of interest is how citizens and communities work together to conduct scientific work. Increasingly, a lot of discussion about citizen science (as well as recruitment for citizen science) takes place online. Citizen science also relates to collecting data from communities and from citizens, which is done through crowdsourcing.

How conversations around citizen science and crowdsourcing occur online is therefore crucial to understanding how to engage online users and communities in citizen science research.

So what we wanted to study is how people communicate about crowdsourcing and citizen science on Twitter over a longer term.

If we have a better understanding of the discussions around citizen science, citizen science project owners might be able to use this knowledge to engage more with communities and get people more interested in it. Ultimately, citizen science is mainly used to address issues that are of importance and of interest to members of the public.


  • The top hashtags identified from the data that we collected were: #citizenscience, #citsci, #crowdsourcing, #crowdsource, #data, #blockchain, #crowdfunding, #innovation, #cswglobal18, #ai, #decentralization and #crypto.
  • The majority of the interactions are in-bound, and primarily retweets of popular users/tweets.

    The following illustrates the different activities that we identified from the data collected:

      • Crowdsourcing/funding initiatives to: find a stolen laptop; support a defamation case against a politician; seek stories on sharing/hiring/borrowing assets or persons; seek lawyers willing to help Philippine Long Distance Telephone Company (PLDT) employees with their contract termination.
      • Promotion of: a hackathon to crowdsource mobility solutions; a homebuyer’s app; a marine mammal surveyor course on citizen science; maternity t-shirts; a non-profit organisation to promote Democratic candidates.
      • News/opinions regarding: how crowdsourcing (WhatsApp) was used to disrupt anti-terrorist operations in Kashmir; the release of a book on crowdsourcing for filmmakers; US DOJ’s actions on election integrity.

  • There exist users with a high number of outgoing and incoming links who, when weighted by the number of connections (weighted degree), emerge as key influencers within the community.

    Such accounts are primarily organisations, research projects and citizen science portals.

  • From the data we collected, the citizen science community's discussions largely centre on the following themes: Discussions on citizen science projects, platforms, organisations, personal appeals and courses; Citizens and organisations sharing (ecological) observations; Sharing news and current affairs on politics and public policy; Sharing examples where crowdsourcing has made an impact; Discussions on emerging topics such as bitcoin, blockchain, decentralization, data.
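The hashtag ranking reported in the key points above can be approximated with a simple frequency count. The following is a minimal sketch, assuming tweets are available as plain text strings; the function name, the regular expression and the sample tweets are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by letters, digits or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags as (hashtag, tweet_count) pairs."""
    counts = Counter()
    for text in tweets:
        # Lowercase so #CitSci and #citsci are merged, and use a set so a
        # hashtag repeated inside one tweet is counted only once.
        counts.update({tag.lower() for tag in HASHTAG_PATTERN.findall(text)})
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "Join our #CitizenScience bird survey #citsci",
        "#citsci volunteers needed this weekend",
        "New #crowdsourcing app for mapping #CitSci projects",
    ]
    print(top_hashtags(sample, n=3))
```

Counting each hashtag once per tweet is a design choice. Counting every occurrence instead would give more weight to tweets that repeat the same tag.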

What it means

One of the ways we analysed the data was to study the networks that emerged by following the ‘@’ actions users would make while referring to another user. This analysis helped us observe the different types of networks and cluster of users that were being created.
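As a rough illustration of this kind of analysis, the sketch below builds a weighted mention network from '@' references and computes each user's weighted degree, the measure mentioned in the key points. It assumes tweets are available as (author, text) pairs; the function names, the regular expression and the sample data are hypothetical, and the study's own tooling may have worked differently.

```python
import re
from collections import Counter

# Assumption: a mention is "@" followed by letters, digits or underscores.
MENTION_PATTERN = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Build weighted directed edges from (author, text) pairs.

    Returns a Counter mapping (author, mentioned_user) to the number of
    times the author mentioned that user. Self-mentions are ignored.
    """
    edges = Counter()
    for author, text in tweets:
        source = author.lower()
        for mentioned in MENTION_PATTERN.findall(text):
            target = mentioned.lower()
            if target != source:
                edges[(source, target)] += 1
    return edges


def weighted_degree(edges):
    """Sum the weights of all incoming and outgoing edges for each user."""
    degree = Counter()
    for (source, target), weight in edges.items():
        degree[source] += weight
        degree[target] += weight
    return degree


if __name__ == "__main__":
    sample = [
        ("alice", "RT @ecsa: great talk on citizen science"),
        ("bob", "thanks @ecsa and @alice"),
        ("carol", "@ecsa @ecsa looking forward to the conference"),
    ]
    for user, score in weighted_degree(mention_edges(sample)).most_common():
        print(user, score)
```

Users with the highest weighted degree in such a network are candidates for the key influencers discussed in this article, although a high score on its own does not say whether an account mostly sends or mostly receives mentions.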

Some of these clusters were based on very small, isolated topics that only a few people were discussing, which led to small clusters being formed. These users were not really participating much in conversations but just happened to read about or talk to someone on a topic around citizen science. A large number of such networks were observed, as we would expect in online social networks.

We found another type of network, which demonstrated strong connections between a large number of users. We referred to it as an inner core, because it consisted of lots of people talking about many topics and very much engaged in the conversations. These networks were strong and tightly knit, and key users often emerged who would disseminate information to a large number of users. We refer to these users as key influencers within the community.
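One simple way to separate small isolated clusters from a large connected core is to compute connected components of the mention network. This is a swapped-in simplification for illustration only: the article does not say which clustering technique was used, and the function name and sample edges below are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Group users into components, treating every edge as undirected.

    edges is an iterable of (user_a, user_b) pairs. Returns a list of sets
    of users, sorted from the largest component to the smallest.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first search from an unvisited user collects its component.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(component)
    return sorted(components, key=len, reverse=True)


if __name__ == "__main__":
    sample_edges = [
        ("alice", "ecsa"), ("bob", "ecsa"), ("bob", "alice"), ("carol", "ecsa"),
        ("dan", "erin"),  # a small isolated conversation
    ]
    for component in connected_components(sample_edges):
        print(len(component), sorted(component))
```

In this toy example the largest component plays the role of the inner core and the two-user component corresponds to a small isolated cluster. A real analysis would likely need a finer measure, such as community detection, to find tightly knit groups inside one large component.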

So different networks facilitated discussions on social media around citizen science. Some of the key influencers are organisations that are very well known within the community, and they often share information about citizen science observations or about their projects.

Another question we asked was what the predominant discussions around citizen science were, and we picked up about five or six themes. For example, there were discussions around citizen science projects and platforms, and around citizens and organisations sharing observations. Users were also sharing news about where crowdsourcing has made an impact, as well as discussing quite a lot of topics that were emerging at that point in time, such as Bitcoin, decentralisation and blockchain.

How to use

  • It would be useful for citizen science project owners to identify networks of support, and people with influence who could share and disseminate their projects
  • Citizen science projects should share information that is valuable to wide audiences: rather than talking only about their projects, they should also talk about the contextual aspects of the project in order to increase awareness and reach wider audiences
  • Project owners will find it valuable to use Twitter as an opportunity to connect with different types of users, because people will be able to talk to them about how their work is being informed

Mazumdar, Suvodeep. 'Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks'. Acume. https://www.acume.org/r/citizen-science-on-twitter-using-data-analytics-to-understand-conversations-and-networks/