Tutorial: Twitter Bot Detection in Communalytic with Botometer API

The Twitter Bot Analysis Module detects potential use of automation with Botometer, a machine learning API developed by the Observatory on Social Media (OSoMe) at Indiana University. It analyzes the accounts in your dataset and generates probability scores for several types of automated and/or fraudulent activity.

Step 1

Go to the “My Datasets” page and click on the icon under “Twitter Bot Analysis” for the dataset you want to detect bots in.

Step 2

Next, choose between analyzing the top 100 accounts or all accounts in your dataset. Click the corresponding button to start the analysis.

Note: To perform the analysis on all accounts, you will require a Botometer API key. The video tutorial (at the end of the page) shows an overview of the steps required to acquire your key.

Step 3

From here, you can track the progress of your bot analysis. You can also close the window and come back later to check on it. The load bar updates automatically after 1 minute, and you can refresh it manually by clicking the “Check Progress” button. If you no longer want to run the analysis, click the “Cancel Analysis” button.

Step 4

After running the Bot Analysis, you can view the Botometer scores for the accounts. Along with the bot analysis measures, Botometer also displays the most used language for each account under the column “Majority Language”. As described by the Botometer team, the API calculates the following scores (each score ranges from 0, not likely, to 1, highly likely):

Echo-chamber: accounts that engage in follow-back groups and share and delete political content in high volume

Fake follower: bots purchased to increase follower counts

Financial: bots that post using cashtags

Self declared: bots from botwiki.org

Spammer: accounts labeled as spambots from several datasets

Other: miscellaneous other bots obtained from manual annotation, user feedback, etc.

Overall: this is a summary score based on several models trained by Botometer

In addition to the scores listed above, this page also includes Complete Automation Probability (CAP) values. CAP is the probability (between 0 and 1) that a Twitter account with a given overall score or greater is automated (i.e., a bot). Read more about CAP and how to interpret CAP values on the Botometer FAQ page.
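Because every score lies between 0 and 1, one simple way to read a single account's results is to find the bot type with the highest score and compare the overall score against a cutoff. The Python sketch below illustrates the idea. The dictionary keys, the `summarize_scores` helper, the sample values, and the 0.5 cutoff are all hypothetical choices made for illustration; they are not names or thresholds taken from Botometer or Communalytic.

```python
# Hypothetical labels for the six bot-type scores; real column names may differ.
BOT_TYPES = ["echo_chamber", "fake_follower", "financial",
             "self_declared", "spammer", "other"]


def summarize_scores(scores, cutoff=0.5):
    """Return (highest-scoring bot type, whether 'overall' meets the cutoff).

    `scores` maps score names to values between 0 and 1.
    `cutoff` is an arbitrary illustrative threshold, not an official one.
    """
    # Reject values outside the documented 0-1 range.
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} is outside the 0-1 range: {value}")
    # Missing bot types are treated as 0 for the comparison.
    top_type = max(BOT_TYPES, key=lambda name: scores.get(name, 0.0))
    flagged = scores.get("overall", 0.0) >= cutoff
    return top_type, flagged


# Made-up scores for one account, for demonstration only.
example = {"echo_chamber": 0.10, "fake_follower": 0.72, "financial": 0.05,
           "self_declared": 0.02, "spammer": 0.30, "other": 0.41,
           "overall": 0.66}
top_type, flagged = summarize_scores(example)
print(top_type, flagged)
```

Any cutoff is a judgment call, so it is worth consulting the Botometer FAQ before drawing conclusions about individual accounts.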

On Communalytic, you can also use the search bar to find scores for a specific account.

Step 5

To download the dataset with the scores, click the “download the full dataset” link in the text under “Results”.

Next, click “Download CSV file” to download the data as a CSV file. In the CSV file, the variables beginning with “botometer_” contain the scores from the bot analysis.
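Once the CSV file is downloaded, the score columns can be picked out by their “botometer_” prefix. The following Python sketch shows one way to do this with the standard library. The `read_botometer_columns` helper and the sample column names (`username`, `botometer_overall`, `botometer_cap`) are assumptions for illustration only; check the header row of your own export for the actual names.

```python
import csv
import io


def read_botometer_columns(handle, prefix="botometer_"):
    """Return (score column names, rows restricted to those columns).

    `handle` is an open text file (or file-like object) containing the CSV.
    Values are returned as strings, exactly as they appear in the file.
    """
    reader = csv.DictReader(handle)
    columns = [name for name in (reader.fieldnames or [])
               if name.startswith(prefix)]
    rows = [{name: row[name] for name in columns} for row in reader]
    return columns, rows


# Small in-memory sample standing in for a downloaded file (made-up data).
sample = io.StringIO(
    "username,botometer_overall,botometer_cap\n"
    "account_a,0.12,0.40\n"
    "account_b,0.87,0.93\n"
)
columns, rows = read_botometer_columns(sample)
print(columns)
print(rows[0])
```

To use it on a real export, open the downloaded file with `open(path, newline="", encoding="utf-8")` and pass the resulting handle to the function in place of the sample.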

Additional reference:

Botometer 101: Social bot practicum for computational social scientists (by Kai-Cheng Yang, Emilio Ferrara, Filippo Menczer)

Note: This pre-print is from the team that developed the Botometer API.