The Mastodon Recent Data Collector retrieves recent public posts and replies from a specified Mastodon server (also known as an instance). Communalytic EDU can retrieve up to 5,000 recent posts, while Communalytic Pro can retrieve up to 50,000. Unlike platforms such as Reddit and Telegram, Mastodon does not require you to create an account to collect data through its API.
The following steps show how to collect data from Mastodon using Communalytic. The procedure is identical for the EDU and Pro versions.
Step 1 #
Go to the “My Datasets” page and click the “Server: Recent Posts” button.
Step 2 #
Name your dataset, then enter the name of the public Mastodon server you wish to collect from, for example, “mastodon.social” or “library.love”. Click here to see a list of Mastodon servers.
If you attempt to collect data from a private or non-existent Mastodon server, an error message will appear.
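For readers curious about what a server name turns into at the API level, here is a minimal Python sketch. It assumes the collector reads the public timeline endpoint of the Mastodon API (`/api/v1/timelines/public`); the source does not say which endpoint Communalytic actually calls. The helper names `normalize_server` and `public_timeline_url` are hypothetical, and the `local=true` filter is likewise an assumption.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit


def normalize_server(name: str) -> str:
    """Reduce input such as ' https://Mastodon.social/ ' to 'mastodon.social'."""
    name = name.strip()
    # urlsplit only recognises the host when a scheme is present, so add one.
    if "://" not in name:
        name = "https://" + name
    host = urlsplit(name).hostname
    if not host or "." not in host:
        raise ValueError(f"not a valid server name: {name!r}")
    return host


def public_timeline_url(server: str, limit: int = 40,
                        max_id: Optional[str] = None) -> str:
    """Build a public-timeline request URL (hypothetical helper).

    Assumptions: local=true restricts results to posts from this server,
    and max_id asks for posts older than the given post ID.
    """
    params = {"local": "true", "limit": limit}
    if max_id is not None:
        params["max_id"] = max_id
    host = normalize_server(server)
    return f"https://{host}/api/v1/timelines/public?{urlencode(params)}"


print(public_timeline_url("library.love"))
```

A request to a URL built this way would be expected to fail for a server that does not exist or does not expose its public timeline, which is presumably the situation the error message covers.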
Step 3 #
Select the maximum number of recent posts to collect from the drop-down menu.
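To illustrate how a maximum post count might be enforced, the sketch below pages backwards through a timeline until the cap is reached or the server runs out of posts. It is not Communalytic's implementation. The function `collect_recent` and its `fetch_page` callback are hypothetical; the callback stands in for one API request and is assumed to return posts newest first, each with an `id` field, as Mastodon timelines do.

```python
from typing import Callable, Dict, List, Optional

Page = List[Dict]


def collect_recent(fetch_page: Callable[[Optional[str]], Page],
                   max_posts: int) -> Page:
    """Collect up to max_posts posts, newest first (hypothetical helper).

    fetch_page(max_id) is assumed to return one page of posts strictly
    older than max_id, or the newest page when max_id is None.
    """
    posts: Page = []
    max_id: Optional[str] = None
    while len(posts) < max_posts:
        page = fetch_page(max_id)
        if not page:
            break  # the server has no older posts to offer
        # Keep only as many posts as the cap still allows.
        posts.extend(page[: max_posts - len(posts)])
        # The oldest post on this page becomes the cursor for the next one.
        max_id = page[-1]["id"]
    return posts
```

Passing the fetch step in as a callback keeps the paging logic separate from networking, so the cap can be checked against a fake server without making real requests.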
Step 4 #
Click the “Start Data Collection” button.
Step 5 #
To confirm that data collection is underway, check that your new dataset is listed on the “My Datasets” page.
The Mastodon API limits data collection to one data request per second per IP address. Currently, up to five users can collect data at once. As a result, collection may take longer than it does for Reddit or Telegram data on Communalytic. If Communalytic is at maximum capacity, your request is placed in a queue. Your dataset will still be listed on “My Datasets”.
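As a rough illustration of why a limit of one request per second stretches collection time, the sketch below estimates a lower bound on the wait and shows a simple client-side throttle. The page size of 40 posts per request is an assumption about the Mastodon timeline endpoints, not something stated here, and `min_collection_seconds` and `Throttle` are hypothetical names. Queueing and network latency are ignored.

```python
import math
import time


def min_collection_seconds(max_posts: int, page_size: int = 40,
                           requests_per_second: float = 1.0) -> float:
    """Lower bound on collection time under a fixed request rate."""
    requests_needed = math.ceil(max_posts / page_size)
    return requests_needed / requests_per_second


class Throttle:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval: float = 1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


print(min_collection_seconds(5_000))   # 125.0
print(min_collection_seconds(50_000))  # 1250.0
```

Under these assumptions, a 5,000-post collection needs at least about two minutes and a 50,000-post collection at least about 21 minutes, before any time spent waiting in the queue.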
Data Structure #
Visit the Mastodon Data Structure page to learn more about the types of data provided by the Mastodon API and collected by Communalytic.