Tweetalyzer

A Twitter sentiment analysis module that searches and organizes millions of tweets retrieved through the Web Archive API. It improved JSON retrieval speed from the Internet Archive by 90% and uncovered a bug in the Web Archive request system.

While exploring the Internet Archive, I set out to build a tool that retrieves archived records and organizes them into structured Parquet files. Along the way, I found a bug in the Web Archive API request system. Rather than let it slow me down, I dug into the problem and optimized the retrieval process, improving efficiency by 90%. To tie everything together, I designed and built a multithreaded Textual UI application from scratch so the whole workflow can be driven from a single interface. The project streamlined data processing and was a useful exercise in creative problem-solving.
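As a rough illustration of the "records into Parquet" step, the sketch below assumes the archive response is CDX-style JSON: a list of rows whose first row holds the column names. The function names `cdx_rows_to_records` and `write_parquet` are hypothetical and not taken from this project, and the Parquet step assumes the third-party `pyarrow` package is installed.

```python
from typing import Dict, List


def cdx_rows_to_records(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Turn header-first JSON rows into a list of dict records.

    Assumption: rows[0] is the header (e.g. "timestamp", "original"),
    and every following row lines up with that header.
    """
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def write_parquet(records: List[Dict[str, str]], path: str) -> None:
    """Write records to a Parquet file (requires the pyarrow package)."""
    # Imported lazily so the conversion above works without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pylist(records)
    pq.write_table(table, path)


if __name__ == "__main__":
    sample = [
        ["timestamp", "original"],
        ["20200101000000", "https://twitter.com/example/status/1"],
    ]
    print(cdx_rows_to_records(sample))
```

Keeping the conversion separate from the file write means the record shaping can be checked without any Parquet dependency.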

Enter any archived Twitter user, a number of tweets, and a time frame to collect that user's tweets. You can also specify an output file where the collected tweets are organized and stored for further analysis.
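The three inputs above map naturally onto a query against the Wayback Machine CDX server. The following is a minimal sketch, assuming the tool queries the public CDX endpoint; `build_cdx_query` is a hypothetical helper, and the URL prefix used to match a user's tweets is an illustrative guess rather than this project's actual query.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(user: str, limit: int, start: str, end: str) -> str:
    """Build a CDX query URL for archived captures of one user's tweets.

    start and end are timestamps such as "20200101" (YYYYMMDD).
    """
    if limit <= 0:
        raise ValueError("limit must be a positive number of tweets")
    params = {
        # Assumed URL shape for a user's tweets; matched as a prefix.
        "url": f"twitter.com/{user}/status/",
        "matchType": "prefix",
        "output": "json",
        "from": start,
        "to": end,
        "limit": limit,
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_cdx_query("example_user", 100, "20200101", "20201231"))
```

Building the URL separately from sending the request makes it easy to inspect a query before any traffic reaches the archive.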
The tool in action. Tweets are collected one at a time to avoid overloading the API, but speeds can be increased tenfold within the archive HQ.
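The one-at-a-time collection described above can be sketched as a generator that pauses between requests. This is an illustrative pattern, not the project's code: `collect_sequentially`, the `fetch` callable, and the one-second default delay are all assumptions. The fetch and sleep functions are passed in so the pacing can be exercised without touching the network.

```python
import time
from typing import Callable, Iterable, Iterator


def collect_sequentially(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Fetch each URL in order, pausing between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first.
            sleep(delay_seconds)
        yield fetch(url)


if __name__ == "__main__":
    # Stand-in fetcher; a real one would perform an HTTP GET.
    def fake_fetch(url: str) -> str:
        return f"body of {url}"

    for body in collect_sequentially(["a", "b"], fake_fetch, delay_seconds=0.1):
        print(body)
```

Because the function is a generator, a UI can display each tweet as it arrives instead of waiting for the full batch.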