Fetching web pages encoded in gbk/gb2312 #18
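A crawler will sometimes run into pages encoded in gbk/gb2312. After such a page is fetched, all of the Chinese text in it comes out garbled. The fix is to transcode the response with iconv-lite. For example, http://1212.ip138.com/ic.asp is gb2312-encoded, so the data fetched from it contains garbled Chinese. The transcoding procedure is as follows: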
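The code originally posted with this issue is not preserved here, so what follows is only a minimal sketch of the idea, not the author's exact code. It assumes the page is downloaded with Node's built-in `http` module and that the body is kept as raw `Buffer` chunks until it is complete, then decoded once with `iconv.decode`. The helper names `decode` and `fetchPage` are hypothetical. So that the snippet still runs where iconv-lite is not installed, it falls back to the built-in `TextDecoder`, which is an addition of this sketch and needs a Node build with full ICU.

```javascript
const http = require('http');

// Prefer iconv-lite, the library recommended above. The TextDecoder
// fallback is an assumption of this sketch, not part of the original post.
let decode;
try {
  const iconv = require('iconv-lite');
  decode = (buffer, encoding) => iconv.decode(buffer, encoding);
} catch (err) {
  decode = (buffer, encoding) => new TextDecoder(encoding).decode(buffer);
}

// Hypothetical helper: download a page as raw bytes and decode it only
// after the whole body has arrived, so no multi-byte character is split.
function fetchPage(url, encoding, callback) {
  http.get(url, (res) => {
    const chunks = [];
    // No res.setEncoding() call, so each chunk is a Buffer, not a string.
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('end', () => callback(null, decode(Buffer.concat(chunks), encoding)));
    res.on('error', callback);
  }).on('error', callback);
}

// Offline check with the GBK bytes for "中文".
const sample = Buffer.from([0xd6, 0xd0, 0xce, 0xc4]);
const garbled = sample.toString();      // read as UTF-8: replacement characters
const decoded = decode(sample, 'gbk');  // read as GBK
console.log(decoded); // 中文

// Usage against the page mentioned above (needs network access):
// fetchPage('http://1212.ip138.com/ic.asp', 'gb2312', (err, html) => {
//   if (err) throw err;
//   console.log(html);
// });
```

The important part is that the response is never turned into a string before decoding. Once the bytes have been read as UTF-8, the original gbk/gb2312 bytes are lost and iconv-lite cannot recover them.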