A puppeteer-core powered scraper for Twitter.
npm i @lilyn/twitter-scraper
It's still in a minimal state; new features will be added soon.
// Assumes both classes are exported from the package root.
import { CustomBrowser, TwitterUser } from "@lilyn/twitter-scraper";

async function main() {
  const browser = new CustomBrowser();
  await browser.init({
    headless: true,
    execPath: "C:/Program Files/Google/Chrome/Application/chrome.exe"
  });
  const lilyn = new TwitterUser(browser, "LilynHana");
  await lilyn.init();
  // Get the profile info.
  await lilyn.getProfile();
  // Get tweets across n scroll-downs (pages).
  await lilyn.getTweetsbyPage(2);
  // https://twitter.com/LilynHana/status/1624553417120022531
  // Get every tweet until that tweet is encountered.
  await lilyn.getTweetsUntilID("1624553417120022531");
  // Don't forget to close the browser.
  await browser.close();
}
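To make the "until ID" behaviour concrete, here is a minimal, self-contained sketch of the idea. It is not the package's implementation: the `ScrapedTweet` shape, the `collectUntilId` helper, and the choice to exclude the marker tweet from the result are all assumptions made for illustration.

```typescript
// Hypothetical tweet shape; the package's real return type may differ.
interface ScrapedTweet {
  id: string;
  text: string;
}

/**
 * Walks pages of tweets (newest first) and stops as soon as the tweet
 * with `stopId` is seen. The marker tweet itself is not included
 * (an assumption of this sketch).
 */
function collectUntilId(pages: ScrapedTweet[][], stopId: string): ScrapedTweet[] {
  const collected: ScrapedTweet[] = [];
  for (const page of pages) {
    for (const tweet of page) {
      if (tweet.id === stopId) {
        return collected; // reached the marker tweet, stop scanning
      }
      collected.push(tweet);
    }
  }
  return collected; // marker never found: everything scanned is returned
}

// Two "scroll pages" of made-up data.
const samplePages: ScrapedTweet[][] = [
  [{ id: "30", text: "newest" }, { id: "20", text: "middle" }],
  [{ id: "10", text: "marker" }, { id: "5", text: "older" }],
];
const untilMarker = collectUntilId(samplePages, "10");
console.log(untilMarker.map((t) => t.id).join(", ")); // prints "30, 20"
```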
$ twitter-scraper
Commands:
  bin.js getProfile [at]                         Get the profile of the user/s.
  bin.js getTweetsByPage [number-of-pages] [at]  Get the tweets of a user/s by pages.
  bin.js getTweetsUntilID [at] [id]              Scan every tweet until it encounters the id.
                                                 Example: getTweetsUntilID -@ LilynHana -id 9342084 -@ Soleil -id 89734
  bin.js trackUser [at] [msRefresh]              Tracks a user's profile and prints when there's a tweet posted.

Options:
  --version      Show version number  [boolean]
  --path         The file path of your chrome browser!
                 [string] [default: "C:/Program Files/Google/Chrome/Application/chrome.exe"]
  --filepath     The file path to write the results on! Note: Needs full file path!  [string]
  --headless     If puppeteer would show the window it's working on.  [boolean] [default: true]
  --concurrency  How much tabs puppeteer opens in a run.  [number] [default: 3]
  --help         Show help  [boolean]

Not enough non-option arguments: got 0, need at least 1
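
The `trackUser` command polls a profile every `msRefresh` milliseconds and prints newly posted tweets. A rough sketch of how that kind of polling can work is shown here. It is not taken from the package: `TrackedTweet`, `diffNewTweets`, `pollForNewTweets`, and the `fetchTweets` callback are hypothetical names, and `fetchTweets` stands in for whatever scraping call you use.

```typescript
// Hypothetical minimal tweet shape used only for this sketch.
interface TrackedTweet {
  id: string;
  text: string;
}

/** Returns tweets whose IDs are not in `seen` yet, and records them in `seen`. */
function diffNewTweets(seen: Set<string>, fetched: TrackedTweet[]): TrackedTweet[] {
  const fresh = fetched.filter((tweet) => !seen.has(tweet.id));
  for (const tweet of fresh) {
    seen.add(tweet.id);
  }
  return fresh;
}

/**
 * Calls `fetchTweets` every `msRefresh` milliseconds and reports tweets
 * that were not present on an earlier poll. The first poll only records
 * a baseline. Returns a function that stops the polling.
 */
function pollForNewTweets(
  fetchTweets: () => Promise<TrackedTweet[]>, // assumption: your own scraping call
  msRefresh: number,
  onNew: (tweet: TrackedTweet) => void
): () => void {
  const seen = new Set<string>();
  let primed = false;
  let busy = false;

  const tick = async (): Promise<void> => {
    if (busy) return; // previous poll still running, skip this round
    busy = true;
    try {
      const fresh = diffNewTweets(seen, await fetchTweets());
      if (primed) fresh.forEach(onNew);
      primed = true;
    } catch (err) {
      console.error("poll failed:", err);
    } finally {
      busy = false;
    }
  };

  void tick();
  const timer = setInterval(tick, msRefresh);
  return () => clearInterval(timer);
}
```

Keeping the ID comparison in its own function makes it easy to check without a browser; the timer wrapper only decides when to call it.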
Feel free to send feature requests & issues in the Issues tab.
If you want to ask me about anything, send me a DM on Discord: realalice#5699.
Latest Update: 1.2.0
- Image & Video Parsing in Twitter Posts
- Live Tracking
  - Periodically checks whether the Twitter user has posted something new
- Major Performance Boost:
  - Load times cut from ~20-30s to ~7-8s (for 8 users)
  - New cache for Puppeteer
    - faster loading times of twitter.com when warmed up
  - Bluebird promise handling (only available in the CLI; you'd need to implement this yourself in TS)
    - better concurrency
    - new concurrency option added to determine how many tabs Puppeteer handles in one go
- Minor bug-fixes:
  - changed resolution values (for Twitter scrolling to work)
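
Since the Bluebird-based concurrency handling is only available in the CLI, here is one possible way to cap concurrency yourself in TypeScript. It is a plain-Promise sketch, not the package's code and not Bluebird: `mapWithConcurrency` is a hypothetical helper that roughly mirrors what Bluebird's `Promise.map` does with its `concurrency` option.

```typescript
/**
 * Runs `worker` over `items` with at most `limit` calls in flight at once.
 * Results keep the order of `items`.
 */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner pulls items one at a time until none are left.
  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  };

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Usage would look something like `await mapWithConcurrency(["LilynHana", "Soleil"], 3, scrapeUser)`, where `scrapeUser` is a placeholder for your own per-user function (for example, one that creates a `TwitterUser` and calls `getProfile()`), and `3` matches the CLI's default concurrency.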
1.0.0 is broken, don't download it. Go with 1.0.1.
Ctrl + Shift + V to preview markdown in VSCode. Not related, but I need a reminder here.