Rod provides various ways to emulate the environment for pages.
To set the viewport, user agent, orientation, etc. for a page all at once, you can use one of the predefined devices:
page.MustEmulate(devices.IPhone6or7or8Plus)
Or define your own device:
page.MustEmulate(devices.Device{
	Title:          "iPhone 4",
	Capabilities:   []string{"touch", "mobile"},
	UserAgent:      "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X)",
	AcceptLanguage: "en",
	Screen: devices.Screen{
		DevicePixelRatio: 2,
		Horizontal: devices.ScreenSize{
			Width:  480,
			Height: 320,
		},
		Vertical: devices.ScreenSize{
			Width:  320,
			Height: 480,
		},
	},
})
Check the source code of the predefined devices; the fields should be self-explanatory.
You can also set the default device for all pages by using Browser.DefaultDevice.
Emulation is enabled by default (using the devices.LaptopWithMDPIScreen device). It overrides some of the browser's default settings, which makes the environment more consistent and helps make tests reproducible.
You can disable the device emulation feature by passing the special Clear device to Browser.DefaultDevice:
browser.DefaultDevice(devices.Clear)
Or you can just use the Browser.NoDefaultDevice helper.
If you want to specify a user-agent for a specific page, use Page.SetUserAgent.
If you want to specify the viewport for a specific page, use Page.SetViewport.
You can use the launch env to set the timezone for all pages:
u := launcher.New().Env(append(os.Environ(), "TZ=America/New_York")...).MustLaunch()
rod.New().ControlURL(u).MustConnect()
Or you can use EmulationSetTimezoneOverride or EmulationSetLocaleOverride to set the timezone or locale for a specific page:
page := browser.MustPage()
_ = proto.EmulationSetTimezoneOverride{TimezoneID: "America/New_York"}.Call(page)
Use EmulationSetGeolocationOverride to override the geolocation for a specific page.
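Use EmulationSetEmulatedMedia to emulate the media type and media features such as the color scheme:
page := browser.MustPage()
_ = proto.EmulationSetEmulatedMedia{
	Media: "screen",
	Features: []*proto.EmulationMediaFeature{
		{Name: "prefers-color-scheme", Value: "dark"},
	},
}.Call(page)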
Usually it's better to make the headless browser completely transparent for the page so that the page cannot tell if it's controlled by a human or a robot. In some cases, a page could use client-side JS to detect whether it is controlled by a human or a robot, for example through WebGL, WebDriver, or HTTP request headers. You can handcraft a JS lib to hide all the traces, or just use the stealth lib: code example.
If the stealth lib doesn't work for you, you can just launch the regular user browser with launcher.NewUserMode (see User mode).
You can use tools like https://bot.sannysoft.com to test your configuration.
Browser fingerprinting is not bot detection. It uses various tricks to collect unique browser attributes to identify browsers. Websites can use it to track users even when they are not logged in, and it's also widely used to flag headless scrapers. For example, different users usually install different fonts on their OS, which can be used to tell them apart. Another example is using the canvas to render text: different users usually have different GPUs, graphics drivers, or OSes, all of which affect the rendered image.
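To illustrate the idea, here is a rough sketch of how collected attributes could be combined into a single identifier. Real fingerprinting scripts gather the attributes in the browser with JS; this Go sketch covers only the combining step, and the function `fingerprint` as well as the attribute names and values are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// fingerprint hashes a set of browser attributes into a stable identifier.
// Keys are sorted so the result does not depend on map iteration order.
func fingerprint(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	for _, k := range keys {
		// Quote keys and values so separators inside them cannot collide.
		fmt.Fprintf(&sb, "%q=%q;", k, attrs[k])
	}

	sum := sha256.Sum256([]byte(sb.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two hypothetical visitors that differ only in their installed fonts.
	first := map[string]string{
		"fonts":  "Arial,Helvetica",
		"canvas": "hash-of-rendered-text-1",
	}
	second := map[string]string{
		"fonts":  "Arial,Helvetica,Fira Code",
		"canvas": "hash-of-rendered-text-1",
	}

	fmt.Println(fingerprint(first) == fingerprint(second)) // prints "false"
}
```

A single differing attribute is enough to produce a different identifier, which is why small differences in fonts or rendering make browsers distinguishable.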
Usually you can launch multiple browser instances to get different fingerprints. If you want to use a single browser to save memory and CPU, you have to manually override the APIs for canvas, fonts, etc.
You can use open-source projects like FingerprintJS to test your configuration.