Still a problem with fetching threads #243
Labels
bug
Something isn't working
Comments
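I'll take a look once I'm back in the country.
This is probably because you didn't specify `sort` (the sort order). The default ordering is by reply, so while you are fetching one page after another, the reply state of the threads keeps changing. A thread that was on the page you are about to fetch may have moved to a different page by the time you request it, which produces the "gain some, lose some" effect.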
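A small simulation may make the effect described above concrete. It is only a sketch under stated assumptions: the feed is a plain list of thread IDs ordered by most recent reply, and `paginate` and `naive_crawl_with_bump` are hypothetical helpers written for this illustration, not part of any library.

```python
from typing import List


def paginate(order: List[int], page: int, size: int) -> List[int]:
    """Return the thread IDs on a 1-indexed page of the current ordering."""
    start = (page - 1) * size
    return order[start:start + size]


def naive_crawl_with_bump(order: List[int], size: int,
                          bump_tid: int, bump_before_page: int) -> List[int]:
    """Crawl page by page while one thread receives a reply mid-crawl.

    Just before `bump_before_page` is fetched, `bump_tid` moves to the
    front of the feed, as it would under reply-based sorting.
    """
    order = list(order)  # work on a copy so the caller's list is untouched
    collected: List[int] = []
    total_pages = -(-len(order) // size)  # ceiling division
    for page in range(1, total_pages + 1):
        if page == bump_before_page:
            order.remove(bump_tid)
            order.insert(0, bump_tid)
        collected.extend(paginate(order, page, size))
    return collected


if __name__ == "__main__":
    # Thread 5 is bumped after page 1 was fetched: it is never seen,
    # while thread 2 is pushed onto page 2 and fetched twice.
    print(naive_crawl_with_bump([1, 2, 3, 4, 5, 6], size=2,
                                bump_tid=5, bump_before_page=2))
```

In this toy run the crawl ends with the same number of items as there are threads, which is why the counts can look about right even though one thread is missing and another is duplicated.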
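Since this has come up, I'll raise another problem here: with the code otherwise unchanged, fetching sorted by post time cuts the number of threads retrieved roughly in half.
Got it, thank you very much @Dilettante258
I think you could crawl the threads in chronological order, then re-crawl the first few pages once you are done, and deduplicate. After all, while you are crawling a given page, a thread that "disappears" can only have moved toward the front. 🤔 Once you've solved it, could you share your approach? I'm also working on bulk thread crawling. @Misaka19327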
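The re-crawl-and-deduplicate idea above could be sketched roughly like this. It is an illustration only: `fetch_page` is a hypothetical callable standing in for whatever call returns one page of threads, and the `tid` key is an assumed unique thread ID; neither is taken from the library's actual API.

```python
from typing import Callable, Dict, List

Thread = Dict[str, int]  # assumed record shape: at least {"tid": <unique id>}


def crawl_with_recheck(fetch_page: Callable[[int], List[Thread]],
                       total_pages: int,
                       recheck_pages: int = 3) -> List[Thread]:
    """Crawl every page once, then re-fetch the first few pages and dedupe.

    Threads bumped toward the front during the first pass are picked up
    by the second pass; the dict keyed on `tid` drops the duplicates.
    """
    seen: Dict[int, Thread] = {}

    # First pass: walk all pages in order.
    for page in range(1, total_pages + 1):
        for thread in fetch_page(page):
            seen.setdefault(thread["tid"], thread)

    # Second pass: revisit the front of the feed, where bumped threads land.
    for page in range(1, min(recheck_pages, total_pages) + 1):
        for thread in fetch_page(page):
            seen.setdefault(thread["tid"], thread)

    return list(seen.values())
```

How many pages to re-check is a judgment call: it depends on how many threads are likely to be bumped while the first pass runs, so a busy forum or a slow crawl would need a larger `recheck_pages`.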
Closed
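I'm not sure whether this has been raised before; if it has, my apologies orz
The code is below; I'm still trying to fetch all threads and then filter them.
Earlier the counts looked about right, so I didn't pay much attention. But today, when I tried to filter for threads with more than 10,000 likes, I found that at least three threads I know for certain have over 10,000 likes had not been fetched. I then checked the data from earlier runs and found that each run's filtered result quite literally gains some and loses some: a thread missing this time may show up on the next fetch, but then a different thread goes missing. I'd like to know the likely cause.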