How to Choose a Suitable CSS Selector for a Website #53

kongjining · 2023-11-22T07:57:09Z

Inspecting Web Page Structure:

Open the target website (e.g., https://www.google.com.hk/webhp?hl=zh-CN&sourceid=cnhp/).
Right-click on the page element you wish to crawl (such as a specific text or area) and select "Inspect" to open the browser's developer tools.

Analyzing the Element:

In the developer tools, examine the HTML code of the element.
Look for attributes that uniquely identify the element or its container, such as class, id, or other attributes.

Building a CSS Selector:

Create a CSS selector based on the attributes you observed.
For example, if an element has class="content", the selector could be .content.
If the element has multiple classes, you can combine them like .class1.class2.
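
As a rough sketch of the rules above, a selector can be assembled from the tag, id, and classes you see in the developer tools. `buildSelector` is a hypothetical helper written for this example, not part of any crawler, and it assumes the names are plain identifiers that need no escaping.

```javascript
// Hypothetical helper: build a CSS selector from attributes seen in DevTools.
// Assumes tag, id, and class names are plain identifiers (no escaping needed).
function buildSelector({ tag = "", id = "", classes = [] } = {}) {
  const idPart = id ? `#${id}` : "";
  // Chained classes (.class1.class2) match elements that carry ALL of them.
  const classPart = classes.map((c) => `.${c}`).join("");
  return `${tag}${idPart}${classPart}`;
}

console.log(buildSelector({ classes: ["content"] }));          // .content
console.log(buildSelector({ classes: ["class1", "class2"] })); // .class1.class2
console.log(buildSelector({ tag: "div", id: "main" }));        // div#main
```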

Testing the Selector:

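In the "Console" tab of the developer tools, run document.querySelector('YOUR_SELECTOR') to check that the selector returns the target element. It returns only the first match (or null if nothing matches), so document.querySelectorAll('YOUR_SELECTOR').length is a quick way to see how many elements the selector matches.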
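
To make that check a bit more informative, something like the following can be pasted into the console. It is only a sketch: `checkSelector` is a made-up name, and the `root` parameter exists just so the function can be tried outside a browser. It reports how many elements match and shows the start of the first match's text.

```javascript
// Hypothetical console helper: report how many elements a selector matches.
// `root` defaults to the page's document; any object with querySelectorAll works.
function checkSelector(selector, root = globalThis.document) {
  const matches = root.querySelectorAll(selector);
  return {
    selector,
    count: matches.length,
    // Trimmed, shortened text of the first match, or null when nothing matches.
    preview:
      matches.length > 0
        ? String(matches[0].textContent).trim().slice(0, 80)
        : null,
  };
}

// Only runs where a DOM is available (for example, the browser console).
if (typeof document !== "undefined") {
  console.log(checkSelector(".content"));
}
```

A count of 0 means the selector does not exist on that page, and a count above 1 means it is probably too broad.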

Applying the Selector:

Once a suitable selector is found, apply it in the selector field of your crawler configuration.

Make sure the chosen CSS selector matches exactly the content you want to extract from the page. With an incorrect selector, the crawler may fail to retrieve the desired data.
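
For illustration only, a configuration with the selector filled in might look roughly like this. Apart from `selector`, the field name and the URL are assumptions made for this example, so check them against your crawler's own configuration file.

```javascript
// Hypothetical crawler configuration. Only `selector` is mentioned above;
// `url` and its value are placeholders assumed for this example.
const config = {
  url: "https://example.com/docs", // placeholder start page
  selector: ".content",            // the selector verified in the console
};

console.log(config.selector); // .content
```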

bigshirtjonny · 2023-12-02T20:22:28Z

Something I've seen is that if the selector doesn't exist on one page (or the first page) of the crawl, the crawl ends with an error. How can we configure the crawl so that, if a selector doesn't exist on one page, GPT will continue on to the next page?
