by Alessandra Maia Terra de Faria, Carlos Trucíos, and Marcelo Castañeda de Araujo

Reviewed by Matheus Lucas Hebling


This research aims to follow the tweets related to the three main presidential candidates according to the opinion polls available for the 2022 elections in Brazil.


Daily tweets spanning from May 1st to May 31st were collected for each of the three main candidates in the Brazilian presidential election. Tweets were collected both from the candidates’ timelines and from Twitter users mentioning the candidates, totaling more than 13 million tweets. Data were extracted through a Twitter API used exclusively for academic purposes and analyzed using R software. The authors thank Twitter for the academic accounts granted to them.


General context


Below are the updated Twitter follower counts for each candidate (May versus April, data as of June 15th, 2022).

  • Bolsonaro – from 7.8 up to 8.2 million (5.1% increase in followers compared to the previous month)

  • Lula – from 3.4 up to 3.6 million (5.9% increase in followers compared to the previous month)

  • Ciro – 1.3 million (no change verified)
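The percentages above can be reproduced from the rounded follower counts. The short Python sketch below is only an illustration (the authors worked in R, and the helper name `pct_increase` is our own); because it starts from figures rounded to the nearest 0.1 million, it may differ slightly from values computed on exact counts.

```python
def pct_increase(before, after):
    """Percentage change from `before` to `after`, rounded to one decimal place."""
    return round((after - before) / before * 100, 1)

# Follower counts in millions (April, May), as reported in the list above.
followers_millions = {
    "Bolsonaro": (7.8, 8.2),
    "Lula": (3.4, 3.6),
    "Ciro": (1.3, 1.3),
}

growth = {name: pct_increase(april, may)
          for name, (april, may) in followers_millions.items()}

for name, pct in growth.items():
    print(f"{name}: {pct}% increase")
```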


Candidates’ tweets

Image 1 shows the number of tweets posted on the timelines of the three candidates covered by our survey (Ciro, Lula, and Bolsonaro), that is, how often each candidate tweeted in May.

Image 1: Timelines

Images 2 and 3 present the most frequent words in the candidates’ timeline tweets and the most frequent words in the candidates’ timeline tweets weighted by the inverse document frequency (TF-IDF).

Image 2: Most frequently used words in the candidates’ timeline.

Image 3: TF-IDF by candidates’ timeline.

The most frequent words in the candidates’ timeline tweets (Image 2) give an overview of the dominant subjects each candidate deals with. Bolsonaro’s profile shows continuity with April, with emphasis on the government’s achievements and spending in “Reais (r)”, expressed in figures of “billions” [“bilhões”] and “millions” [“milhões”]. The novelty consolidated this month, relative to April, is the mention of the term “taxes” [“impostos”] associated with these figures. Lula’s profile highlights words such as “country” [“país”], “Brazil”, “people” [“povo”], “everyone” [“todos”], “persons” [“pessoas”] and a focus on the present: “today” [“hoje”]. Ciro’s profile remains concerned with mentioning the other two candidates by name; the data consolidated for the month also show mentions of “people” [“povo”], “today” [“hoje”], “Brazil”, “folk” [“gente”] and “follow” [“acompanhe”]. Finally, “Brazil” is a common mention across the profiles of all three candidates.

In Image 3, the TF-IDF (term frequency-inverse document frequency) highlights words that are frequent in one candidate’s timeline tweets but infrequent across the three candidates overall. Accordingly:

  • There is continuity in Lula’s actions, with a more developed profile, through the emphasis given to verbs such as “spread” [“espalhar”], “take care” [“cuidar”], “study” [“estudar”], “work” [“trabalhar”] and “we want” [“queremos”]. New terms emerge, such as “love” [“amor”], “university” [“universidade”] and “universities” [“universidades”]. It is interesting to note that the two terms, singular and plural, were the main topic of the month.

  • Bolsonaro’s profile highlights the words “taxes” [“impostos”], “hiring” [“contratação”], “measures” [“medidas”], “federal”, “km” and “delivered” [“entregues”], words characteristic of the ongoing government’s achievements. The reference to the year “2019” carries over from April; the novelty is the mention of the year “2021” and the number “4”.

  • In Ciro’s profile, emphasis is given to the words “gregorio”, “debate” and “candidate” [“candidato”] (referring to the debate between the candidate and Gregório Duvivier, a Brazilian TV show presenter, that took place in the second half of May). The remaining words (“let’s go” [“vamos”], “participate” [“participe”], “because” [“porque”], “my” [“minha”], “youtube” and “7:30” [“19:30”]) refer to the Ciro Games live stream hosted by the candidate every Tuesday at 7:30 pm.
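To make the TF-IDF weighting concrete, here is a minimal Python sketch. It assumes that each candidate’s timeline is treated as a single document and uses the common definition tf × ln(N / df); the report does not state the exact variant used in R, and the toy tokens below are invented for illustration only. Under this definition, a word used by all three candidates scores zero, which is consistent with “Brazil” appearing in Image 2 but not standing out in Image 3.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs maps a document name to its list of tokens.

    Returns {name: {word: tf-idf score}}, where tf is the word's share of
    the document's tokens and idf is ln(number of docs / docs containing word).
    """
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}

    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())

    scores = {}
    for name, word_counts in counts.items():
        total = sum(word_counts.values())
        scores[name] = {
            word: (k / total) * math.log(n_docs / doc_freq[word])
            for word, k in word_counts.items()
        }
    return scores

# Hypothetical toy data: one "document" per candidate timeline.
toy = {
    "Bolsonaro": ["brasil", "impostos", "bilhões", "impostos"],
    "Lula": ["brasil", "povo", "amor", "universidade"],
    "Ciro": ["brasil", "povo", "debate", "gregorio"],
}
scores = tf_idf(toy)
top_bolsonaro = max(scores["Bolsonaro"], key=scores["Bolsonaro"].get)
print(top_bolsonaro, scores["Lula"]["brasil"])
```

In this toy data, “povo” is shared by two of the three timelines, so it receives a lower weight than a word unique to one candidate, such as “amor”.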

Overall, Bolsonaro’s tweets focus on actions taken by his ongoing government, Lula’s tweets focus on his proposals as a candidate, and Ciro’s tweets highlight recent events involving the candidate.

Tweets about the candidates

The total number of tweets mentioning each candidate is displayed in Image 4 and the daily evolution in Image 5.

Image 4: Number of tweets mentioning the candidates.

Compared with April, the ranking is unchanged. However, the gap that separated Bolsonaro from Lula has narrowed: in April it was almost 2 million tweets, while in May it dropped to less than a million. Compared to the previous month, the number of tweets mentioning the candidates increased by approximately 5% (Bolsonaro), 33% (Lula), and 39% (Ciro).

Image 5: Daily evolution of tweets mentioning the candidates.

In the daily evolution of the number of tweets (Image 5), Ciro generally has the lowest number of mentions (except on May 21st, when his count slightly exceeded Lula’s). The number of tweets mentioning Bolsonaro, in turn, is generally higher than that of the other candidates throughout the month, the only exception being about 7 days in the first half of the month, when Lula had a higher daily number of mentions.

Word clouds

Finally, we present below three word clouds showing, with stop words excluded, the top 100 words used in the interactions of Twitter users in May. For better visualization, each candidate’s name was removed from his own cloud.
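The word lists behind such clouds can be obtained by a simple frequency count. The Python sketch below is an illustration under our own assumptions: the stop-word set is a tiny hand-picked subset, the tokenizer is a plain regular expression, and the example tweets are invented; the authors’ actual R pipeline is not described in that level of detail.

```python
import re
from collections import Counter

# Tiny illustrative subset of Portuguese stop words (an assumption,
# not the list used by the authors).
STOP_WORDS = {"a", "o", "e", "de", "do", "da", "que", "em", "para"}

def top_words(tweets, candidate, n=100, stop_words=STOP_WORDS):
    """Return the n most frequent words, skipping stop words and the candidate's name."""
    excluded = set(stop_words) | {candidate.lower()}
    counts = Counter()
    for text in tweets:
        tokens = re.findall(r"\w+", text.lower())
        counts.update(tok for tok in tokens if tok not in excluded)
    return counts.most_common(n)

# Invented example tweets.
tweets = ["Lula e o povo do Brasil", "O Brasil quer Lula", "Lula hoje no Brasil"]
top = top_words(tweets, "Lula", n=3)
print(top)
```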

Image 6: Wordcloud for Bolsonaro

Bolsonaro: in the foreground, the words “Lula”, “Brazil” and “government” [“Lula”, “Brasil”, “governo”] are consolidated. In the background appear “now” [“agora”], “against” [“contra”], “Moraes”, “today” [“hoje”], “people” [“povo”], “round” [“turno”] and “Supreme Court” [“STF”].

Image 7: Wordcloud for Lula

Lula: in the foreground appear “president” [“presidente”], “Brazil”, “PT”, “Ciro” and “round” [“turno”]; in the background, “wedding” [“casamento”], “campaign” [“campanha”], “first” [“primeiro”], “round” [“turno”], “Time”, “survey” [“pesquisa”], “vote” [“votar”], “vote” [“voto”], “because” [“porque”], “want” [“quer”], “today” [“hoje”], “day” [“dia”], “folk” [“gente”], “to have” [“ter”] and “now” [“agora”]. It is worth noting that mentions of the word “Time” were not limited to the week in which former President Lula was on the cover of Time magazine but remained present throughout the month.

Image 8: Wordcloud for Ciro

Ciro: the earlier trend continues, with references to “Lula”, “Bolsonaro”, “Gregorio” and “round” [“turno”] standing out on their own in the foreground; in the background, “Brazil”, “to vote” [“votar”], “now” [“agora”], “today” [“hoje”], “Duvivier”, and “debate”.

Sentiment analysis

The sentiment of each tweet was built up from the sentiments of its basic units (the words), using the Oplexicon v3.0 and Sentilex dictionaries from the lexiconPT package. Each word in the dictionaries receives a score of 1, -1, or 0, depending on whether its sentiment is positive, negative, or neutral, respectively. Words not found in the dictionaries also receive a score of 0. The scores of the words within a tweet were added up, and the tweet was classified as positive, negative, or neutral according to whether the sum was positive, negative, or zero. Image 9 presents the sentiments (Negative, Neutral, and Positive) as percentages per candidate. A balance can be seen among the sentiments expressed in the tweets about the three candidates. These data will be monitored and compared over time. This is a portrait, a sentimental snapshot of May on Twitter.
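The scoring rule described above can be sketched in a few lines of Python. This is an illustration only: the lexicon below is a made-up stand-in for the dictionaries, its polarities are assumptions, and the tokenizer is a plain regular expression rather than whatever preprocessing the authors applied in R.

```python
import re

# Hypothetical mini-lexicon: word -> polarity (1, -1, or 0).
TOY_LEXICON = {"bom": 1, "melhor": 1, "culpa": -1, "preso": -1, "hoje": 0}

def tweet_sentiment(text, lexicon):
    """Sum the word scores of a tweet and classify it by the sign of the sum."""
    tokens = re.findall(r"\w+", text.lower())
    # Words missing from the lexicon contribute 0, as in the description above.
    total = sum(lexicon.get(tok, 0) for tok in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

examples = ["O melhor governo", "culpa e preso", "bom mas culpa", "agora 2022"]
labels = [tweet_sentiment(t, TOY_LEXICON) for t in examples]
print(labels)
```

Note that under this rule a tweet can be labeled neutral either because none of its words carry a score or because its positive and negative words cancel out, as in the third example.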

Image 9: Sentiments of tweets per candidate

Next, Images 10, 11, and 12 show the word cloud of each candidate separately, according to the sentiment attributed to each tweet. Words in pink appear in tweets rated as positive, words in blue appear in tweets rated as negative, and words in beige appear in tweets rated as neutral. Each word cloud contains the 200 most frequent words.

Image 10: Word cloud Sentiments for Bolsonaro

Image 11: Word cloud Sentiments for Lula

Image 12: Word cloud Sentiments for Ciro

  • Bolsonaro: Tweets related to candidate Bolsonaro that were classified as associated with positive sentiments are characterized by words such as “world” [“mundo”], “president” [“presidente”], “good” [“bom”], “better” [“melhor”] and “re-elected” [“reeleito”]. Tweets classified as associated with negative sentiments are characterized by words such as “to vote” [“votar”], “guilt” [“culpa”], “guy” [“cara”] and “arrested” [“preso”]. Finally, tweets considered as neutral are characterized by words such as “Moraes”, “now” [“agora”], “urgent” [“urgente”], “government” [“governo”] and “2022”.

  • Lula: Tweets related to candidate Lula that were classified as associated with positive sentiments are characterized by words such as “good” [“bom”], “know” [“sabe”], “world” [“mundo”], “to see” [“ver”], “better” [“melhor”] and “favor”. Tweets classified as negative are characterized by words such as “vote” [“votar”], “guy” [“cara”], “XP” and “inmate” [“presidiário”]. Finally, tweets with neutral sentiments are characterized by words such as “leaves” [“sai”], “president” [“presidente”], “first” [“primeira”], “13”, “convict” [“preso”], “Bolsonaro” and “showmício”. (There is no exact English equivalent for this Brazilian term: “showmício” combines two words [“show + comício”] to describe a type of rally in which one or more election candidates talk about their proposals, with moments reserved for performances by artists, usually singers of great popular appeal, to attract audiences.)

  • Ciro: Tweets related to candidate Ciro that were classified as associated with positive sentiments are characterized by words such as “better” [“melhor”], “to know” [“sabe”], “good” [“bom” and “boa”, the latter being the feminine form of “good” in Portuguese] and “to see” [“ver”]. Tweets classified as negative are characterized by words such as “to vote” [“votar”], “guy” [“cara”] and “to debate” [“debater”]. Finally, tweets with neutral sentiment are characterized by words such as “fucking” [“fudendo”, vulgar language], “Bolsonaro”, “Doria”, “behind” [“atrás”] and “Lula”.


The 25 most frequent bigrams in tweets mentioning each of the candidates are shown in Images 13 to 15. The direction of the arrow reveals the order in which the bigram appears and the greater the intensity of the arrow, the greater the frequency of the bigram.
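A bigram is simply a pair of consecutive words, so the counts behind such a network can be sketched as follows. This Python example is illustrative: the tweets are invented, no stop words are removed, and pairs are counted only within a tweet (never across tweet boundaries); whether the authors made the same choices in R is not stated.

```python
import re
from collections import Counter

def top_bigrams(tweets, n=25):
    """Count ordered pairs of consecutive words within each tweet."""
    counts = Counter()
    for text in tweets:
        tokens = re.findall(r"\w+", text.lower())
        # zip pairs each token with its successor: (w1, w2), (w2, w3), ...
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(n)

# Invented example tweets.
tweets = [
    "Jair Bolsonaro no primeiro turno",
    "presidente Jair Bolsonaro",
    "primeiro turno agora",
]
bigrams = dict(top_bigrams(tweets))
for (first, second), freq in bigrams.items():
    print(f"{first} => {second}: {freq}")
```

Because each pair is ordered, “jair => bolsonaro” and “bolsonaro => jair” would be counted separately, which is what the arrow direction in the images encodes.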

Image 13: Bolsonaro’s Bigrams

Bolsonaro: among the most frequent bigrams we have “president => Bolsonaro”, “Jair => Bolsonaro” and “government => Bolsonaro”. Directly related to these, the sequence “Bolsonaro => against => Alexandre” caught our attention; it refers to the clash between the current president and Supreme Court Justice Alexandre de Moraes. Similarly, the sequences “Flavio => Bolsonaro” and “Eduardo => Bolsonaro” are also frequent. These tweets may refer not to the presidential candidate but to his two sons, who are also elected representatives (Flávio Bolsonaro and Eduardo Bolsonaro, respectively a senator for Rio de Janeiro and a federal deputy for São Paulo, both elected in 2018, the same year as their father). Secondary mentions include “Armed => Forces”, “Elon => Musk” and “first => round”.

Image 14: Lula’s Bigrams

Lula: among the most frequent bigrams we have “ex => president/prisoner => Lula”, which unfolds, from the name, into “Lula => wants”, “government => Lula”, “Lula => leaves”, “Lula => says”, “Lula => spoke”, “Lula => said”, “Lula => thief”, “Lula => never” and “Lula => against”. Secondary bigrams include “first => round”, “second => round”, “Daniela => Mercury”, “middle => class”, “I will => vote”, “magazine => Time”, “fake => news” and “public => money”, among others.

Image 15: Ciro’s Bigrams

Ciro: the most frequent bigrams are, predominantly, “Ciro => Gomes”, “Ciro => games”, “Ciro => says”, “Ciro => Lula” and “Sérgio => Moro”, followed by “Bolsonaro => 31”, “third => way”, “second => round”, “every => body”, “Ipespe* => research”, “Gregorio => Duvivier” and “useful => vote” (or tactical vote), among others.

*IPESPE: the Institute for Social, Political and Economic Research [Instituto de Pesquisas Sociais, Políticas e Econômicas] is one of the oldest and most respected centers for public opinion and market research in Brazil.

Final comments

The presentation of this dataset aims to contribute to interpretations of the activity on Twitter of the possible candidates in the 2022 elections, as well as of what users of the platform said about them in their interactions throughout May, in comparison to April. This is ongoing research and will be refined over the months leading up to the 2022 election.

Alessandra Maia Terra de Faria, Carlos Trucíos and Marcelo Castañeda de Araujo (2022) "May: Brazilian presidential candidates on Twitter". Brazilian Research and Studies Blog. ISSN 2701-4924. Vol. 3 Num. 1. available at:, accessed on: April 12, 2024.