This repository has been archived by the owner on Dec 8, 2023. It is now read-only.

Price element has not appeared yet #16

Open
cesaryuan opened this issue Aug 3, 2020 · 3 comments

Comments

@cesaryuan

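I looked into this on my side, and it seems that every request to the JD (京东) site gets redirected to the login page. I'm not sure whether you see the same problem. I ran crawler_selenium.py directly and also tested several other products myself, and they all behave this way.
In addition, I tested with several other headless browsers, and all of them jump to the login page as soon as headless mode is enabled.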
DevTools listening on ws://127.0.0.1:1846/devtools/browser/97cd85d8-2f68-4b74-be31-e2304fbfc7a4
2020-08-03 16:40:19 | INFO | crawler_selenium.py 46 | Crawl: https://item.jd.com/6287165.html
2020-08-03 16:40:19 | INFO | crawler_selenium.py 61 | 价格元素未出现
[0803/164020.312:INFO:CONSOLE(1)] "Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help, check https://xhr.spec.whatwg.org/.", source: https://passport.jd.com/new/misc/js/common_login_v20180829.js (1)
[0803/164020.314:INFO:CONSOLE(1)] "Access to XMLHttpRequest at 'chrome-extension://gfgkebiommjpiaomalcbfefimhhanlfd/static/priceChart.js' from origin 'https://passport.jd.com' has been blocked by CORS policy: Cross origin requests are only supported for protocol schemes: http, data, chrome, https.", source: https://passport.jd.com/new/misc/js/common_login_v20180829.js (1)
[0803/164020.327:ERROR:web_contents_delegate.cc(279)] WebContentsDelegate::CheckMediaAccessPermission: Not supported.
[0803/164020.328:ERROR:web_contents_delegate.cc(279)] WebContentsDelegate::CheckMediaAccessPermission: Not supported.
[0803/164020.590:INFO:CONSOLE(1)] "The AudioContext was not allowed to start. It must be resumed (or created) after a user gesture on the page. https://goo.gl/7K7WLu", source: (1)
[0803/164020.976:INFO:CONSOLE(1)] "Uncaught TypeError: Cannot read property '1' of null", source: https://passport.jd.com/new/misc/js/common_login_v20180829.js (1)
2020-08-03 16:40:21 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:23 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:25 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:27 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:29 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:31 | INFO | crawler_selenium.py 61 | 价格元素未出现
[0803/164033.306:ERROR:socket_udp.cc(219)] Received unexpected data packet from [::ffff:5f1c:ac57]:49849 before STUN binding is finished.
2020-08-03 16:40:33 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:35 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:37 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:39 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:41 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:43 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:45 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:47 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:49 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:51 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:53 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:55 | INFO | crawler_selenium.py 61 | 价格元素未出现
[0803/164057.306:ERROR:socket_udp.cc(219)] Received unexpected data packet from [::ffff:5f1c:ac57]:49849 before STUN binding is finished.
2020-08-03 16:40:57 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:40:59 | INFO | crawler_selenium.py 61 | 价格元素未出现
[0803/164100.788:ERROR:stun_port.cc(96)] Binding request timed out from 0.0.0.x:62983 (any)
2020-08-03 16:41:01 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:03 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:05 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:07 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:09 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:12 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:14 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:16 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:18 | INFO | crawler_selenium.py 61 | 价格元素未出现
2020-08-03 16:41:20 | WARNING | crawler_selenium.py 80 | Crawl name failure: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@class='name']"} (Session info: headless chrome=80.0.3987.163)
2020-08-03 16:41:20 | WARNING | crawler_selenium.py 91 | Crawl plus_price failure: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@class='p-price-plus']"} (Session info: headless chrome=80.0.3987.163)
2020-08-03 16:41:20 | WARNING | crawler_selenium.py 105 | Crawl subtitle failure: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@class='name-s']"} (Session info: headless chrome=80.0.3987.163)
2020-08-03 16:41:20 | WARNING | crawler_selenium.py 117 | Crawl price failure: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@class='p-price']"} (Session info: headless chrome=80.0.3987.163)
2020-08-03 16:41:22 | INFO | crawler_selenium.py 134 | Found body element: 404 Not Found /productSense was not found on this server. Resin-3.0.21 (built Thu, 10 Aug 2006 12:03:19 PDT)
2020-08-03 16:41:22 | WARNING | crawler_selenium.py 151 | Crawl failure: Extra data
执行时间: 78.0674524307251
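
In the log above, the console messages come from scripts on passport.jd.com, while the crawler keeps logging 价格元素未出现 ("price element has not appeared") for about a minute. As a rough sketch only, and not the repository's actual code, a polling loop could stop early once the browser is on the login host. The names `wait_for_price`, `get_url`, and `find_price` are hypothetical; with Selenium they might wrap `driver.current_url` and an element lookup.

```python
import time
from urllib.parse import urlparse

LOGIN_HOST = "passport.jd.com"  # host seen in the log's console messages


def wait_for_price(get_url, find_price, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll for the price; give up at once if the browser is on the login host.

    get_url and find_price are hypothetical callables (assumptions, not part
    of the original crawler): get_url() returns the browser's current URL,
    find_price() returns the price text or None if the element is missing.
    Returns a (status, value) tuple.
    """
    waited = 0.0
    while waited < timeout:
        # A redirect to the login host means the price will never show up.
        if urlparse(get_url()).hostname == LOGIN_HOST:
            return ("redirected_to_login", None)
        price = find_price()
        if price is not None:
            return ("ok", price)
        sleep(interval)
        waited += interval
    return ("timeout", None)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without a real browser or real delays.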

@qqxx6661
Owner

qqxx6661 commented Aug 3, 2020

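Being redirected to the login page or to the homepage basically means the IP has been banned. I haven't looked at this crawler for a while; later I'll check whether JD has updated its anti-crawling strategy. You can try it locally. If the result is the same there, it means JD has changed its anti-crawling strategy.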
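
The rule of thumb above (a redirect to the login page or the homepage suggests a ban) can be sketched as a small URL check. This is only an illustration under assumptions: `classify_landing` is a hypothetical helper, the login host is taken from the log in this issue, and `www.jd.com` is assumed to be the homepage host. A "login" or "home" result is a hint, not proof, that the IP was banned.

```python
from urllib.parse import urlparse

LOGIN_HOST = "passport.jd.com"  # taken from the log in this issue
HOME_HOST = "www.jd.com"        # assumption: JD homepage host


def classify_landing(requested_url, final_url):
    """Label where a product request ended up, based only on the final URL."""
    requested = urlparse(requested_url)
    final = urlparse(final_url)
    if final.hostname == LOGIN_HOST:
        return "login"
    if final.hostname == HOME_HOST and final.path in ("", "/"):
        return "home"
    if final.hostname == requested.hostname and final.path == requested.path:
        return "product"
    return "other"


if __name__ == "__main__":
    # The product URL comes from the log; the final URL here is illustrative.
    print(classify_landing("https://item.jd.com/6287165.html", "https://www.jd.com/"))
```

With Selenium, the second argument would typically be the browser's current URL after the page load.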

@cesaryuan
Author

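> Being redirected to the login page or to the homepage basically means the IP has been banned. I haven't looked at this crawler for a while; later I'll check whether JD has updated its anti-crawling strategy. You can try it locally. If the result is the same there, it means JD has changed its anti-crawling strategy.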

I did test it locally. Could it be that my IP has been banned? Your website still seems to be working normally.

@qqxx6661
Owner

qqxx6661 commented Aug 6, 2020

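I tried it locally myself. Setting aside some instability, the page is reachable and the price can be retrieved. Could you try again? However, the Huihui (慧慧) lowest/highest price API no longer works, and I will remove it later.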
image
