This repository has been archived by the owner on Jul 8, 2024. It is now read-only.

Could the program run multiple queries in parallel? #9

Open
JaimeBadiola opened this issue Nov 12, 2018 · 6 comments · May be fixed by #23
Labels
question Further information is requested

Comments

@JaimeBadiola

Hello!!

The program is awesome, congrats!

I am using the program to run very heavy queries that take a long time to complete, and I was wondering whether it could run queries in parallel. If so, could you point me to what needs to be changed?

@Mottl
Owner

Mottl commented Nov 12, 2018

It depends on your queries.

The algorithm is: query for results, get the cursor for the next page of results, and repeat:

    while active:
        json = TweetManager.getJsonReponse(tweetCriteria, refreshCursor, cookieJar, proxy, user_agent, debug=debug)
        if len(json['items_html'].strip()) == 0:
            break
        refreshCursor = json['min_position']

So in the general case you can't parallelize a single query, because each request needs the cursor (min_position) returned by the previous one.
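
To illustrate why the loop above is inherently sequential, here is a minimal sketch of cursor-based pagination. FAKE_PAGES and fetch_page are hypothetical stand-ins invented for this example; they are not part of GetOldTweets3 and only mimic the shape of a response that carries the next cursor.

```python
# Hypothetical stand-in for a paginated endpoint: every page carries the
# cursor of the next page, mirroring json['min_position'] in the loop above.
FAKE_PAGES = {
    None: {"items": ["t1", "t2"], "min_position": "c1"},
    "c1": {"items": ["t3"], "min_position": "c2"},
    "c2": {"items": [], "min_position": None},
}


def fetch_page(cursor):
    """Return the fake page for a cursor (illustrative only)."""
    return FAKE_PAGES[cursor]


def fetch_all():
    """Walk the pages in order; request N needs the cursor from request N-1."""
    items, cursor = [], None
    while True:
        page = fetch_page(cursor)
        if not page["items"]:
            break  # an empty page marks the end of the results
        items.extend(page["items"])
        cursor = page["min_position"]  # only known after the response arrives
    return items
```

Because the cursor for the next request is unknown until the current response arrives, the requests of one query cannot be issued concurrently.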

But you can run queries in parallel if you can split your single query into independent ones. For example, if you use the --since or --until parameters, split the long period into smaller chunks and run multiple GetOldTweets3 processes with different --since and --until params. You can also split a big query by username.

Twitter seems to apply some per-IP limits. Keep this in mind.
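
The date-splitting approach described above could be scripted roughly as follows. This is a sketch under assumptions, not the project's own tooling: split_range, build_command and run_all are hypothetical helper names, the small max_workers default is only a guess at what stays under the per-IP limits, and --output is assumed to be available in your installed version so that parallel runs do not write to the same file.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def split_range(since, until, days):
    """Split [since, until) into consecutive chunks of at most `days` days.

    Each chunk starts exactly where the previous one ends.
    """
    chunks = []
    start = since
    while start < until:
        end = min(start + timedelta(days=days), until)
        chunks.append((start, end))
        start = end
    return chunks


def build_command(query, since, until):
    """Build one GetOldTweets3 command line for a single chunk."""
    return [
        "GetOldTweets3",
        "--querysearch", query,
        "--since", since.isoformat(),
        "--until", until.isoformat(),
        # Assumption: a separate output file per chunk avoids clobbering.
        "--output", f"tweets_{since.isoformat()}.csv",
    ]


def run_all(commands, max_workers=2):
    """Run the commands with at most `max_workers` processes at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda cmd: subprocess.run(cmd, check=False), commands)
        return [r.returncode for r in results]
```

A possible use would be to build the list with [build_command("bitcoin", s, u) for s, u in split_range(date(2018, 2, 18), date(2018, 2, 21), 1)] and pass it to run_all. Threads are enough here because each worker only waits on an external process.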

@Mottl Mottl added the question Further information is requested label Nov 12, 2018
@JaimeBadiola
Author

Ok awesome! My queries are date based, so I will try to run multiple queries at the same time, as you said.

Thanks a lot!

@Mottl
Owner

Mottl commented Nov 12, 2018

If you split by dates, the --since param of each interval must equal the --until param of the previous interval (do not add 1 day!)

@JaimeBadiola
Author

JaimeBadiola commented Nov 12, 2018

You mean that I will have to send queries like:

GetOldTweets3 --lang en --querysearch "bitcoin" --since 2018-02-18 --until 2018-02-19
GetOldTweets3 --lang en --querysearch "bitcoin" --since 2018-02-19 --until 2018-02-20

?

@Mottl
Owner

Mottl commented Nov 12, 2018

Yes, exactly.
That's because the --until date is NOT INCLUDED in the results; this is Twitter's behaviour.
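
A small sketch of the interval arithmetic may help here. It treats --until as exclusive, as stated above, and compares adjacent intervals that share a boundary date with intervals where a day was added. days_covered and covered are hypothetical helpers written for this illustration only.

```python
from datetime import date, timedelta


def days_covered(since, until):
    """Days matched by one query when --until is treated as exclusive."""
    return [since + timedelta(days=i) for i in range((until - since).days)]


def covered(intervals):
    """All days matched by a list of (since, until) queries, in order."""
    return [day for since, until in intervals for day in days_covered(since, until)]


# Next --since equals previous --until: every day is matched exactly once.
contiguous = [
    (date(2018, 2, 18), date(2018, 2, 19)),
    (date(2018, 2, 19), date(2018, 2, 20)),
]

# One day added to the next --since: 2018-02-19 is never queried.
gapped = [
    (date(2018, 2, 18), date(2018, 2, 19)),
    (date(2018, 2, 20), date(2018, 2, 21)),
]
```

With the contiguous intervals, covered returns 2018-02-18 and 2018-02-19 once each; with the gapped ones, 2018-02-19 is missing from the result.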

@JaimeBadiola
Author

Ok perfect !!

Thanks a lot!


2 participants