list index out of range #31

Open
hanyunxuan opened this issue Nov 9, 2017 · 4 comments

Comments

@hanyunxuan

Traceback (most recent call last):
  File "crawler.py", line 163, in <module>
    crawler.run()
  File "crawler.py", line 90, in run
    for index, url in enumerate(self.parse_menu(self.request(self.start_url))):
  File "crawler.py", line 116, in parse_menu
    menu_tag = soup.find_all(class_="uk-nav uk-nav-side")[1]

@wzming

wzming commented Nov 15, 2017

I'm hitting the same out-of-range error:
Traceback (most recent call last):
  File "crawler.py", line 163, in <module>
    crawler.run()
  File "crawler.py", line 90, in run
    for index, url in enumerate(self.parse_menu(self.request(self.start_url))):
  File "crawler.py", line 116, in parse_menu
    menu_tag = soup.find_all(class_="uk-nav uk-nav-side")[1]
IndexError: list index out of range

@daolanfler

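Set a breakpoint at `return response` in the request function. At that point response.content contains "...503 Service Temporarily Unavailable...", which suggests the site is rejecting the traffic, so find_all returns an empty list.
That's my understanding, anyway. But when I downloaded the page source and parsed it locally, soup.find_all(class_="uk-nav uk-nav-side")[1] still raised the error, and that part I don't understand...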
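
A minimal sketch of the check described above, so that a 503 page (or any page without the expected menu) fails with a readable message instead of a bare IndexError. The names `PageError` and `pick_match` are hypothetical and not part of crawler.py; the helper only assumes it is handed the list returned by find_all and, optionally, the HTTP status code.

```python
class PageError(RuntimeError):
    """Raised when the fetched page is not the one the parser expects."""


def pick_match(matches, index, status_code=None):
    """Return matches[index], or raise PageError explaining what went wrong.

    `matches` is any sequence, e.g. the result of soup.find_all(...).
    `status_code` is optional, e.g. response.status_code from requests.
    """
    # A non-200 answer (such as 503) means the body is an error page,
    # so there is no point in looking for the menu in it.
    if status_code is not None and status_code != 200:
        raise PageError(f"server answered HTTP {status_code}, not the expected page")
    # Guard the index before using it; this is where the IndexError came from.
    if len(matches) <= index:
        raise PageError(
            f"expected at least {index + 1} matching tags, found {len(matches)}"
        )
    return matches[index]


# Hypothetical use inside parse_menu (selector taken from the traceback):
# menu_tag = pick_match(soup.find_all(class_="uk-nav uk-nav-side"), 1)
```

If the locally saved page also fails this check, the message reports how many tags matched, which helps tell a changed page layout apart from a blocked request.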

@afetmin

afetmin commented Dec 15, 2017

Mr. Liao's site has anti-crawling measures: send too many requests and it answers with a 503.
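
If the 503 really is a response to request volume, one common mitigation is to slow down and retry. The sketch below is an assumption about how that could look, not something taken from crawler.py: `fetch_with_backoff` is a hypothetical helper, and `fetch` stands for any callable that returns an object with a `status_code` attribute, such as requests.get.

```python
import time


def fetch_with_backoff(fetch, url, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with growing pauses while the answer is HTTP 503.

    Returns the first non-503 response, or the last 503 response if every
    attempt was rejected, so the caller can decide how to handle it.
    """
    for attempt in range(retries + 1):
        response = fetch(url)
        if response.status_code != 503:
            return response
        if attempt < retries:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return response


# Hypothetical use with requests:
# response = fetch_with_backoff(requests.get, url)
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default time.sleep is what you want.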

@fw6669998

> Mr. Liao's site has anti-crawling measures: send too many requests and it answers with a 503.

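Adding a request header where the request is sent fixes it:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3704.400 QQBrowser/10.4.3588.400'
}
# Send a browser-like User-Agent along with whatever other options the caller passed in.
response = requests.get(url, headers=headers, **kwargs)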
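
One caveat with passing `headers=headers` next to `**kwargs`: if a caller already supplies `headers` in kwargs, Python raises a TypeError for the duplicate keyword. A small sketch of one way around that, assuming a hypothetical helper `merge_headers` (not part of crawler.py) and a shortened User-Agent string used purely as a placeholder:

```python
# Placeholder value; any browser-like User-Agent string would do here.
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36"
    )
}


def merge_headers(kwargs, defaults=DEFAULT_HEADERS):
    """Return a copy of kwargs whose 'headers' entry includes the defaults.

    Headers supplied by the caller win over the defaults, and the
    original kwargs dict is left untouched.
    """
    merged = dict(kwargs)
    merged["headers"] = {**defaults, **(kwargs.get("headers") or {})}
    return merged


# Hypothetical use in the request function:
# response = requests.get(url, **merge_headers(kwargs))
```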
