Skip to content

Simple Java code for storing the Twitter stream in local gzipped files.

License

Notifications You must be signed in to change notification settings

girishsaraf/TwitterStreamFilter

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Compress and Store the Twitter Public Sample

Intro

This project allows you to run a set-it-and-forget-it system for streaming the Twitter public sample stream to a local set of gzipped files. Each file will contain the JSON for all the tweets from the 1% stream.

Building

This project was written with Maven, so you should be able to do mvn package and use the resulting jar file TwitterStreamFilter-1.0-SNAPSHOT-jar-with-dependencies.jar in the target directory.

Running

You have a few options with this code. If you run it without arguments, it should download the unfiltered 1% stream. You can run code for that as follows:

  • java -Xmx1536m -jar TwitterStreamFilter-1.0-SNAPSHOT-jar-with-dependencies.jar

You can also provide a file with keywords to track (separated by a newline) using the --keywords/-k flag, a GeoJSON file to get tweets within a geographic area using the --bounds/-b flag, or the path to a file with a list of user IDs to track with the --users/-u flag.

While running, this code will produce several files: warnings.log.YYYY-MM-DD-HH, statuses.log.YYYY-MM-DD-HH, and many .gz files. At the end of every hour, the current warnings.log. and statuses.log.* will be gzipped automatically.

About

Simple Java code for storing the Twitter stream in local gzipped files.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 98.1%
  • Shell 1.9%