rehydratoR
This R package facilitates replication of Twitter-based research by providing a convenient function to download lists of tweets.
The input for the package is a list of tweet ID numbers. See https://archive.org/details/gaza-tweets for an example.
The output of the package are the tweets downloaded as a tibble or as JSON files. Examples for both are below.
This package limits the rate of tweet downloading so Twitter’s 90,000 tweet/15 minute limit is not exceeded. If you choose to download the tweets to JSON files, then a new JSON file will be created for every 90,000 tweet ID numbers.
Tweets that have been deleted or made private cannot be downloaded.
Getting Started
Users must acquire a consumer key, consumer secret, access token, and access token secret from https://developer.twitter.com on their own.
Examples
Contributing
kevincoakley, with zacharyst sending annoying e-mails.
Version History
0.5.2
-
Added a parameter called group_start that takes the list of split tweet IDs and keeps only those from group_start to the final list. That way, if a download is interrupted, which is likely for large corpuses, the user can restart the download at the group_start chunk, not from the beginning.
-
Added a line to print an estimate of how long a download will take
0.5.1
- Prepping for CRAN release
0.5.0
- Renamed Project to rehydratoR
0.4.0
- Save tweets to JSON files
0.3.0
- Changed rate limiting from waiting 915 seconds after every loop to timing the length of every loop and ensuring every loop is at least 915 seconds
0.2.0
- Rate limiting
- Added example file
0.1.0
- Initial Release
Authors
- Kevin Coakley - programming, testing
- Zachary Steinert-Threlkeld - requirements, testing, promoting
License
This project is licensed under the BSD License - see the LICENSE.md file for details
Acknowledgments
- Thank you to the support Christine Kirkpatrick and the San Diego Supercomputer Center at UCSD.