
Python Code to Download Twitter Data

  • Writer: Robbie Geoghegan
  • Jan 1, 2020
  • 2 min read

Using the Python package "GetOldTweets3" to access Twitter Data - no developer license needed.


This article has also been published on Medium here.


I should start by saying that the most robust approach to downloading Twitter data is to go to the source: sign up for a developer license with Twitter and access their API directly using Tweepy. However, there is a much faster way to get your hands on Twitter data.



This guide is instead intended for those wanting to do one of the following:

  • Conduct some quick and simple analysis with Twitter data (this code can be executed in less than 10 minutes)

  • Access Tweets older than 1 week (the Twitter API only serves Tweets from the past week)

  • Download a large volume of Tweets (the Twitter API caps downloads at around 3,000 Tweets)


The guide below shows how to download Twitter data using the Python package "GetOldTweets3" (documentation can be found here). The package lets you set many useful filters for more targeted Tweet downloads, including filtering by keywords, Twitter usernames, locations and date ranges. To get started, install the package:

pip install GetOldTweets3 

FILTERING BY KEYWORD AND LOCATION


First, define the keyword and location to filter Tweets by, along with the date range and the maximum number of Tweets.

Next, use the package's functions to download the Twitter data and set up a DataFrame to hold the Tweet information.


Specific Twitter information needs to be extracted from the filtered data we've stored in the DataFrame above. We can extract several data points for each Tweet, including:

  • Tweet text

  • Username

  • Date of Tweet

  • Hashtags

  • Links to each Tweet

  • Retweets

  • Favorites

  • Mentions

Let's define a function to extract text, dates, hashtags and links to Tweets.
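A sketch of such a function (the name get_twitter_info follows the article; the column names are my own choice), using the text, date, hashtags and permalink attributes that GetOldTweets3 exposes on each Tweet object:

```python
import pandas as pd

def get_twitter_info(tweets):
    """Build a DataFrame with one row per Tweet, extracting the text,
    date, hashtags and permalink from each GetOldTweets3 Tweet object."""
    return pd.DataFrame(
        {
            "text": [tweet.text for tweet in tweets],
            "date": [tweet.date for tweet in tweets],
            "hashtags": [tweet.hashtags for tweet in tweets],
            "link": [tweet.permalink for tweet in tweets],
        }
    )
```

Passing the list of downloaded Tweet objects to this function returns the tabular data directly, e.g. `df = get_twitter_info(tweets)`.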

Finally, execute the get_twitter_info function to return a DataFrame containing the 10 Tweets we searched for, with a column for each of the data points extracted above.

FILTERING BY USERNAME


The package also lets you filter by specific usernames, in much the same way as keywords and locations. Simply define the username and run the code below.

FILTERING BY MULTIPLE LOCATIONS + EXPORT TO CSV


To gather Tweets from multiple locations we can build a simple loop that leverages what we've defined above. First define a list of the locations of interest. This is useful if you want to analyze multiple cities or if you are searching variations of a specific location (e.g. New York, NY, Big Apple).


CONCLUSION


While this package isn't perfect in its data coverage of Tweets, it enables quick and easy access to Twitter data using Python, and it has some advantages over the Twitter API, as outlined in the introduction. The full Python code is accessible on my Github. I hope you enjoy!

GITHUB REPOSITORY

 
 
 
