Skip to content

Commit

Permalink
Merge pull request #3216 from seleniumbase/cdp-mode-for-uc-mode-and-more
Browse files Browse the repository at this point in the history
Add CDP Mode to UC Mode, and more
  • Loading branch information
mdmintz authored Oct 23, 2024
2 parents e6e3c97 + 52a5822 commit c92181b
Show file tree
Hide file tree
Showing 50 changed files with 7,136 additions and 169 deletions.
20 changes: 10 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,30 +19,30 @@
<p align="center">
<a href="#python_installation">🚀 Start</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/features_list.md">🏰 Features</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/ReadMe.md">📚 Examples</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/customizing_test_runs.md">🎛️ Options</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/ReadMe.md">📚 Examples</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/console_scripts/ReadMe.md">🌠 Scripts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/mobile_testing.md">📱 Mobile</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/method_summary.md">📘 APIs</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/syntax_formats.md"> 🔡 Formats</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/example_logs/ReadMe.md">📊 Dashboard</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/syntax_formats.md"> 🔠 Formats</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/recorder_mode.md">🔴 Recorder</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/example_logs/ReadMe.md">📊 Dashboard</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/locale_codes.md">🗾 Locales</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/utilities/selenium_grid/ReadMe.md">🌐 Grid</a>
<a href="https://seleniumbase.io/devices/?url=seleniumbase.com">💻 Farm</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/commander.md">🎖️ GUI</a> |
<a href="https://seleniumbase.io/demo_page">📰 TestPage</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/case_plans.md">🗂️ CasePlans</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/uc_mode.md">👤 UC Mode</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/master_qa/ReadMe.md">🧬 Hybrid</a> |
<a href="https://seleniumbase.io/devices/?url=seleniumbase.com">💻 Farm</a>
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/cdp_mode/ReadMe.md">🐙 CDP Mode</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/chart_maker/ReadMe.md">📶 Charts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/utilities/selenium_grid/ReadMe.md">🌐 Grid</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/how_it_works.md">👁️ How</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/migration/raw_selenium">🚝 Migrate</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/boilerplates">♻️ Templates</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/integrations/node_js">🚉 NodeGUI</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/chart_maker/ReadMe.md">📶 Charts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/case_plans.md">🗂️ CasePlans</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/boilerplates">♻️ Template</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/master_qa/ReadMe.md">🧬 Hybrid</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/tour_examples/ReadMe.md">🚎 Tours</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/integrations/github/workflows/ReadMe.md">🤖 CI/CD</a> |
Expand Down
301 changes: 301 additions & 0 deletions examples/cdp_mode/ReadMe.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,301 @@
<!-- SeleniumBase Docs -->

## [<img src="https://seleniumbase.github.io/img/logo6.png" title="SeleniumBase" width="32">](https://github.com/seleniumbase/SeleniumBase/) CDP Mode 🐙

🐙 <b translate="no">SeleniumBase</b> <b translate="no">CDP Mode</b> (Chrome Devtools Protocol Mode) is a special mode inside of <b><a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/uc_mode.md" translate="no"><span translate="no">SeleniumBase UC Mode</span></a></b> that lets bots appear human while controlling the browser with the <b translate="no">CDP-Driver</b>. Although regular <span translate="no">UC Mode</span> can't perform <span translate="no">WebDriver</span> actions while the <code>driver</code> is disconnected from the browser, the <span translate="no">CDP-Driver</span> can still perform actions (while maintaining its cover).

👤 <b translate="no">UC Mode</b> avoids bot-detection by first disconnecting WebDriver from the browser at strategic times, calling special <code>PyAutoGUI</code> methods to bypass CAPTCHAs (as needed), and finally reconnecting the <code>driver</code> afterwards so that WebDriver actions can be performed again. Although this approach works for bypassing simple CAPTCHAs, more flexibility is needed for bypassing bot-detection on websites with advanced protection. (That's where <b translate="no">CDP Mode</b> comes in.)

🐙 <b translate="no">CDP Mode</b> is based on <a href="https://github.com/HyperionGray/python-chrome-devtools-protocol" translate="no">python-cdp</a>, <a href="https://github.com/HyperionGray/trio-chrome-devtools-protocol" translate="no">trio-cdp</a>, and <a href="https://github.com/ultrafunkamsterdam/nodriver" translate="no">nodriver</a>. <code>trio-cdp</code> was an early implementation of <code>python-cdp</code>, whereas <code>nodriver</code> is a modern implementation of <code>python-cdp</code>. (Refactored CDP code is imported from <a href="https://github.com/mdmintz/MyCDP" translate="no">MyCDP</a>.)

🐙 <b translate="no">CDP Mode</b> includes multiple updates to the above, such as:

* Sync methods. (Using `async`/`await` is not necessary!)
* The ability to use WebDriver and CDP-Driver together.
* Backwards compatibility for existing UC Mode scripts.
* More configuration options when launching browsers.
* More methods. (And bug-fixes for existing methods.)
* Faster response time for support. (Eg. [Discord Chat](https://discord.gg/EdhQTn3EyE))

--------

### 🐙 <b translate="no">CDP Mode</b> initialization:

* `sb.activate_cdp_mode(url)`

> (Call that from a **UC Mode** script)
--------

### 🐙 <b translate="no">CDP Mode</b> examples:

> [SeleniumBase/examples/cdp_mode](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode)
### 🔖 Example 1: (Pokemon site using Incapsula/Imperva protection with invisible reCAPTCHA)

> [SeleniumBase/examples/cdp_mode/raw_pokemon.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_pokemon.py)
<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
url = "https://www.pokemon.com/us"
sb.activate_cdp_mode(url)
sb.sleep(1)
sb.cdp.click_if_visible("button#onetrust-reject-all-handler")
sb.cdp.click('a[href="https://www.pokemon.com/us/pokedex/"]')
sb.sleep(1)
sb.cdp.click('b:contains("Show Advanced Search")')
sb.sleep(1)
sb.cdp.click('span[data-type="type"][data-value="electric"]')
sb.cdp.click("a#advSearch")
sb.sleep(1)
sb.cdp.click('img[src*="img/pokedex/detail/025.png"]')
sb.cdp.assert_text("Pikachu", 'div[class*="title"]')
sb.cdp.assert_element('img[alt="Pikachu"]')
sb.cdp.scroll_into_view("div.pokemon-ability-info")
sb.sleep(1)
sb.cdp.flash('div[class*="title"]')
sb.cdp.flash('img[alt="Pikachu"]')
sb.cdp.flash("div.pokemon-ability-info")
name = sb.cdp.get_text("label.styled-select")
info = sb.cdp.get_text("div.version-descriptions p.active")
print("*** %s: ***\n* %s" % (name, info))
sb.sleep(2)
sb.cdp.highlight_overlay("div.pokemon-ability-info")
sb.sleep(2)
sb.cdp.click('a[href="https://www.pokemon.com/us/play-pokemon/"]')
sb.cdp.click('h3:contains("Find an Event")')
location = "Concord, MA, USA"
sb.cdp.type('input[data-testid="location-search"]', location)
sb.sleep(1)
sb.cdp.click("div.autocomplete-dropdown-container div.suggestion-item")
sb.cdp.click('img[alt="search-icon"]')
sb.sleep(2)
events = sb.cdp.select_all('div[data-testid="event-name"]')
print("*** Pokemon events near %s: ***" % location)
for event in events:
print("* " + event.text)
sb.sleep(2)
```

</details>

### 🔖 Example 2: (Hyatt site using Kasada protection)

> [SeleniumBase/examples/cdp_mode/raw_hyatt.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_hyatt.py)
<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
url = "https://www.hyatt.com/"
sb.activate_cdp_mode(url)
sb.sleep(1)
sb.cdp.click_if_visible('button[aria-label="Close"]')
sb.sleep(0.5)
sb.cdp.click('span:contains("Explore")')
sb.sleep(1)
sb.cdp.click('a:contains("Hotels & Resorts")')
sb.sleep(2.5)
location = "Anaheim, CA, USA"
sb.cdp.press_keys("input#searchbox", location)
sb.sleep(1)
sb.cdp.click("div#suggestion-list ul li a")
sb.sleep(1)
sb.cdp.click('div.hotel-card-footer button')
sb.sleep(1)
sb.cdp.click('button[data-locator="find-hotels"]')
sb.sleep(4)
hotel_names = sb.cdp.select_all(
'div[data-booking-status="BOOKABLE"] [class*="HotelCard_header"]'
)
hotel_prices = sb.cdp.select_all(
'div[data-booking-status="BOOKABLE"] div.rate-currency'
)
sb.assert_true(len(hotel_names) == len(hotel_prices))
print("Hyatt Hotels in %s:" % location)
print("(" + sb.cdp.get_text("ul.b-color_text-white") + ")")
if len(hotel_names) == 0:
print("No availability over the selected dates!")
for i, hotel in enumerate(hotel_names):
print("* %s: %s => %s" % (i + 1, hotel.text, hotel_prices[i].text))
```

</details>

### 🔖 Example 3: (BestWestern site using DataDome protection)

* [SeleniumBase/examples/cdp_mode/raw_bestwestern.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_bestwestern.py)

<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
url = "https://www.bestwestern.com/en_US.html"
sb.activate_cdp_mode(url)
sb.sleep(1.5)
sb.cdp.click_if_visible("div.onetrust-close-btn-handler")
sb.sleep(0.5)
sb.cdp.click("input#destination-input")
sb.sleep(1.5)
location = "Palm Springs, CA, USA"
sb.cdp.press_keys("input#destination-input", location)
sb.sleep(0.6)
sb.cdp.click("ul#google-suggestions li")
sb.sleep(0.6)
sb.cdp.click("button#btn-modify-stay-update")
sb.sleep(1.5)
sb.cdp.click("label#available-label")
sb.sleep(4)
print("Best Western Hotels in %s:" % location)
summary_details = sb.cdp.get_text("#summary-details-column")
dates = summary_details.split("ROOM")[0].split("DATES")[-1].strip()
print("(Dates: %s)" % dates)
flip_cards = sb.cdp.select_all(".flipCard")
for i, flip_card in enumerate(flip_cards):
hotel = flip_card.query_selector(".hotelName")
price = flip_card.query_selector(".priceSection")
if hotel and price:
print("* %s: %s => %s" % (
i + 1, hotel.text.strip(), price.text.strip())
)
```

</details>

(<b>Note:</b> Extra <code translate="no">sb.sleep()</code> calls have been added to prevent bot-detection because some sites will flag you as a bot if you perform actions too quickly.)

(<b>Note:</b> Some sites may IP-block you for 36 hours or more if they catch you using regular <span translate="no">Selenium WebDriver</span>. Be extra careful when creating and/or modifying automation scripts that run on them.)

--------

### 🐙 CDP Mode API / Methods

(Some method args have been left out for simplicity. Eg: <code translate="no">timeout</code>)

```python
sb.cdp.get(url)
sb.cdp.reload()
sb.cdp.refresh()
sb.cdp.add_handler(event, handler)
sb.cdp.find_element(selector)
sb.cdp.find_all(selector)
sb.cdp.find_elements_by_text(text, tag_name=None)
sb.cdp.select(selector)
sb.cdp.select_all(selector)
sb.cdp.click_link(link_text)
sb.cdp.tile_windows(windows=None, max_columns=0)
sb.cdp.get_all_cookies(*args, **kwargs)
sb.cdp.set_all_cookies(*args, **kwargs)
sb.cdp.save_cookies(*args, **kwargs)
sb.cdp.load_cookies(*args, **kwargs)
sb.cdp.clear_cookies(*args, **kwargs)
sb.cdp.sleep(seconds)
sb.cdp.bring_active_window_to_front()
sb.cdp.get_active_element()
sb.cdp.get_active_element_css()
sb.cdp.click(selector)
sb.cdp.click_active_element()
sb.cdp.click_if_visible(selector)
sb.cdp.mouse_click(selector)
sb.cdp.nested_click(parent_selector, selector)
sb.cdp.get_nested_element(parent_selector, selector)
sb.cdp.flash(selector)
sb.cdp.focus(selector)
sb.cdp.highlight_overlay(selector)
sb.cdp.remove_element(selector)
sb.cdp.remove_from_dom(selector)
sb.cdp.remove_elements(selector)
sb.cdp.scroll_into_view(selector)
sb.cdp.send_keys(selector, text)
sb.cdp.press_keys(selector, text)
sb.cdp.type(selector, text)
sb.cdp.evaluate(expression)
sb.cdp.js_dumps(obj_name)
sb.cdp.maximize()
sb.cdp.minimize()
sb.cdp.medimize()
sb.cdp.set_window_rect()
sb.cdp.reset_window_size()
sb.cdp.get_window()
sb.cdp.get_text()
sb.cdp.get_title()
sb.cdp.get_current_url()
sb.cdp.get_origin()
sb.cdp.get_page_source()
sb.cdp.get_user_agent()
sb.cdp.get_cookie_string()
sb.cdp.get_locale_code()
sb.cdp.get_screen_rect()
sb.cdp.get_window_rect()
sb.cdp.get_window_size()
sb.cdp.get_window_position()
sb.cdp.get_element_rect(selector)
sb.cdp.get_element_size(selector)
sb.cdp.get_element_position(selector)
sb.cdp.get_gui_element_rect(selector)
sb.cdp.get_gui_element_center(selector)
sb.cdp.get_document()
sb.cdp.get_flattened_document()
sb.cdp.get_element_attributes(selector)
sb.cdp.get_element_html(selector)
sb.cdp.set_attributes(selector, attribute, value)
sb.cdp.internalize_links()
sb.cdp.is_element_present(selector)
sb.cdp.is_element_visible(selector)
sb.cdp.assert_element(selector)
sb.cdp.assert_element_present(selector)
sb.cdp.assert_text(text, selector="html")
sb.cdp.assert_exact_text(text, selector="html")
sb.cdp.save_screenshot(name, folder=None, selector=None)
```

--------

### 🐙 CDP Mode WebElement API / Methods

```python
element.clear_input()
element.click()
element.flash()
element.focus()
element.highlight_overlay()
element.mouse_click()
element.mouse_drag(destination)
element.mouse_move()
element.query_selector(selector)
element.querySelector(selector)
element.query_selector_all(selector)
element.querySelectorAll(selector)
element.remove_from_dom()
element.save_screenshot(*args, **kwargs)
element.save_to_dom()
element.scroll_into_view()
element.select_option()
element.send_file(*file_paths)
element.send_keys(text)
element.set_text(value)
element.type(text)
element.get_position()
element.get_html()
element.get_js_attributes()
```

--------

<img src="https://seleniumbase.github.io/cdn/img/sb_text_f.png" alt="SeleniumBase" title="SeleniumBase" align="center" width="335">

<div><a href="https://github.com/seleniumbase/SeleniumBase"><img src="https://seleniumbase.github.io/cdn/img/sb_logo_gs.png" alt="SeleniumBase" title="SeleniumBase" width="335" /></a></div>
Empty file added examples/cdp_mode/__init__.py
Empty file.
47 changes: 47 additions & 0 deletions examples/cdp_mode/raw_async.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
import asyncio
import time
from seleniumbase.core import sb_cdp
from seleniumbase.undetected import cdp_driver


async def main():
driver = await cdp_driver.cdp_util.start()
page = await driver.get("https://www.priceline.com/")
time.sleep(3)
print(await page.evaluate("document.title"))
element = await page.select('[data-testid*="endLocation"]')
await element.click_async()
time.sleep(1)
await element.send_keys_async("Boston")
time.sleep(2)

if __name__ == "__main__":
# Call an async function with awaited methods
loop = asyncio.new_event_loop()
loop.run_until_complete(main())

# Call everything without using async / await
driver = loop.run_until_complete(cdp_driver.cdp_util.start())
page = loop.run_until_complete(driver.get("https://www.pokemon.com/us"))
time.sleep(3)
print(loop.run_until_complete(page.evaluate("document.title")))
element = loop.run_until_complete(page.select("span.icon_pokeball"))
loop.run_until_complete(element.click_async())
time.sleep(1)
print(loop.run_until_complete(page.evaluate("document.title")))
time.sleep(1)

# Call CDP methods via the simplified CDP API
page = loop.run_until_complete(driver.get("https://www.priceline.com/"))
sb = sb_cdp.CDPMethods(loop, page, driver)
sb.sleep(3)
sb.internalize_links() # Don't open links in a new tab
sb.click("#link_header_nav_experiences")
sb.sleep(2)
sb.remove_element("msm-cookie-banner")
sb.sleep(1)
sb.press_keys('input[data-test-id*="search"]', "Amsterdam")
sb.sleep(2)
sb.click('span[data-test-id*="autocomplete"]')
sb.sleep(5)
print(sb.get_title())
Loading

0 comments on commit c92181b

Please sign in to comment.