Ora

How do I set useragent in Puppeteer?

Published in Puppeteer User-Agent

To set the user agent in Puppeteer, call page.setUserAgent() with the string you want to send. The override applies to that specific page instance only.

The Primary Method: page.setUserAgent()

The page.setUserAgent() method is designed to set a custom user agent for an individual page object within your Puppeteer script. This is particularly useful when you need different pages to present different identities to web servers or when you want to emulate various devices or browsers for specific testing or scraping tasks.

How to Implement page.setUserAgent()

To use this method, you simply call it on your page object, passing the desired user agent string as an argument.

Example:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set a custom user agent for this specific page
  await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36');

  // Navigate to a page that displays your user agent
  await page.goto('https://www.whatismybrowser.com/detect/what-is-my-user-agent');

  // Read the user agent the page itself reports; unlike a CSS selector,
  // navigator.userAgent does not depend on the site's markup
  const userAgentOnPage = await page.evaluate(() => navigator.userAgent);
  console.log('User Agent reported by the page:', userAgentOnPage);

  await browser.close();
})();

Practical Insights:

  • The user agent string should accurately reflect the browser, operating system, and device you wish to emulate.
  • You can find valid user agent strings by inspecting network requests in your browser's developer tools or by using online databases.
  • The override only affects requests made after the call, so call page.setUserAgent() right after creating the page and before page.goto().
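
One way to confirm that the override was in place for the very first request is to read the User-Agent header from the navigation request itself. The sketch below assumes that page.goto() resolves to the main response and that the headers Puppeteer reports for that request include user-agent; sentUserAgent is a hypothetical helper name, not part of Puppeteer.

```javascript
// Hypothetical helper: set the UA, navigate, and report the User-Agent header
// that Puppeteer recorded for the main navigation request.
async function sentUserAgent(page, url, userAgent) {
  // Set the override before navigating so the first request already carries it.
  await page.setUserAgent(userAgent);

  const response = await page.goto(url);
  // page.goto() can resolve to null (for example, on a same-page hash change).
  if (!response) return null;

  // Header names reported by Puppeteer are lower-cased.
  const headers = response.request().headers();
  return headers['user-agent'] ?? null;
}
```

With a real page object this could be called as `await sentUserAgent(page, 'https://example.com', 'MyCrawler/1.0')`; comparing the result with the string you passed in shows whether the override was applied in time.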

Setting User-Agent Globally for a Browser Instance

While page.setUserAgent() applies to individual pages, you might want a default user agent for every page created within a browser instance.

Option 1: Using puppeteer.launch() arguments

You can pass a --user-agent argument when launching Puppeteer to set a global user agent for the entire browser session. This will be the default for all new pages unless overridden by page.setUserAgent().

Example:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: [
      '--user-agent=Mozilla/5.0 (iPad; CPU OS 13_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/83.0.4103.88 Mobile/15E148 Safari/604.1'
    ]
  });
  const page = await browser.newPage(); // This page will use the iPad UA

  await page.goto('https://www.whatismybrowser.com/detect/what-is-my-user-agent');
  // navigator.userAgent reflects the launch flag and does not depend on the site's markup
  const userAgentOnPage = await page.evaluate(() => navigator.userAgent);
  console.log('Global User Agent:', userAgentOnPage);

  await browser.close();
})();
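
If launch options are assembled in several places, a small helper can keep the --user-agent flag from being passed twice. This is only a sketch: withUserAgentArg is a hypothetical name, and it assumes the options object is later handed to puppeteer.launch() unchanged.

```javascript
// Hypothetical helper: merge a --user-agent flag into Puppeteer launch options.
function withUserAgentArg(userAgent, launchOptions = {}) {
  const existing = launchOptions.args ?? [];
  // Drop any earlier --user-agent flag so only one value is passed to the browser.
  const kept = existing.filter((arg) => !arg.startsWith('--user-agent='));
  // Each array entry is passed as a single argument, so spaces in the UA need no quoting.
  return { ...launchOptions, args: [...kept, `--user-agent=${userAgent}`] };
}
```

It would be used as `puppeteer.launch(withUserAgentArg(myUserAgent, { headless: true }))`, where myUserAgent is whichever string you want as the browser-wide default.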

Option 2: Setting the User-Agent on Every New Page

Puppeteer's BrowserContext does not expose a setUserAgent() method, so the user agent cannot be set on the default browser context directly. To apply one user agent to every page your script creates, wrap page creation in a small helper that calls page.setUserAgent() before the page is used.

const puppeteer = require('puppeteer');

const CUSTOM_UA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.88 Safari/537.36';

// Create a page and set its user agent before any navigation happens
async function newPageWithUserAgent(browser, userAgent) {
  const page = await browser.newPage();
  await page.setUserAgent(userAgent);
  return page;
}

(async () => {
  const browser = await puppeteer.launch();

  const page = await newPageWithUserAgent(browser, CUSTOM_UA); // Every page created this way uses CUSTOM_UA

  await page.goto('https://www.whatismybrowser.com/detect/what-is-my-user-agent');
  // navigator.userAgent reflects the override and does not depend on the site's markup
  const userAgentOnPage = await page.evaluate(() => navigator.userAgent);
  console.log('Shared User Agent:', userAgentOnPage);

  await browser.close();
})();
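
For pages the script does not open itself (popups, window.open(), links with target="_blank"), one possible approach is to listen for the browser's targetcreated event and set the user agent on each new page target. The sketch below is an assumption-laden illustration: applyUserAgentToNewPages is a hypothetical helper, and because the handler runs after the target already exists, a popup's first request may go out before the override is applied.

```javascript
// Hypothetical helper: apply one user agent to every page target the browser opens.
function applyUserAgentToNewPages(browser, userAgent) {
  const onTarget = async (target) => {
    // Skip service workers, background pages, and other non-page targets.
    if (target.type() !== 'page') return;
    const page = await target.page();
    if (page) await page.setUserAgent(userAgent);
  };
  browser.on('targetcreated', onTarget);
  // Return a function that removes the listener again.
  return () => browser.off('targetcreated', onTarget);
}
```

Calling `const stop = applyUserAgentToNewPages(browser, myUserAgent)` right after launch registers the listener, and `stop()` removes it. When the first request matters, the --user-agent launch flag is the safer choice.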

Why Change the User-Agent?

Modifying the user agent in Puppeteer is a powerful technique for various web automation and testing scenarios:

  • Reducing Basic Bot Detection: Many websites employ measures to detect and block automated scripts. Changing the user agent to mimic a regular browser can help pass some of the simpler checks. It has no effect on IP-based blocking, which requires other measures such as proxies.
  • Emulating Different Devices: Test how a website renders and behaves on various mobile devices, tablets, or specific desktop browser versions without needing the actual hardware or software.
  • Accessing Mobile-Specific Content: Some websites serve different content or layouts based on whether the request comes from a desktop or mobile user agent. Changing it allows access to the desired version.
  • Web Scraping and Data Collection: To prevent being identified as a bot, scrapers often rotate user agents to appear as diverse, genuine users.
  • Testing Browser Compatibility: Verify that your web application functions correctly across a range of browser versions and operating systems.

Crafting Effective User-Agent Strings

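A user agent string is the value of the User-Agent header that a browser sends with every HTTP request. It describes the browser, its version, the operating system, and sometimes the device type. It is not unique to a user: everyone running the same browser build on the same platform sends the same string.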

Where to Find User Agent Strings

  • Browser Developer Tools: In Chrome, open DevTools (F12), go to the "Network" tab, refresh a page, click on any request, and look for the User-Agent header in the "Headers" section.
  • Online Databases: Websites like WhatIsMyBrowser.com or UserAgentString.com provide extensive lists of current and historical user agent strings.
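
Another source is the browser Puppeteer launched: browser.userAgent() returns its default string, which matches the installed browser version. In headless mode that default has typically carried a HeadlessChrome token, so a common trick is to swap it for Chrome. The sketch below assumes that token format; both helper names are hypothetical.

```javascript
// Hypothetical helper: turn a headless Chrome UA into its regular-Chrome form.
// Strings without the HeadlessChrome token are returned unchanged.
function toRegularChromeUserAgent(defaultUserAgent) {
  return defaultUserAgent.replace('HeadlessChrome/', 'Chrome/');
}

// Hypothetical helper: reuse the browser's own UA, minus the headless marker, on a page.
async function useBrowsersOwnUserAgent(browser, page) {
  const defaultUserAgent = await browser.userAgent();
  const userAgent = toRegularChromeUserAgent(defaultUserAgent);
  await page.setUserAgent(userAgent);
  return userAgent;
}
```

This keeps the version number in step with the actual browser build, which a hard-coded string cannot do.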

Common User Agent Examples

Here's a table of common user agent strings you might use:

Device/Browser Type | User Agent String (Example)
--- | ---
Desktop Chrome | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Desktop Firefox | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0
iPhone | Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/120.0.6099.119 Mobile/15E148 Safari/604.1
Android Phone | Mozilla/5.0 (Linux; Android 14; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36
iPad | Mozilla/5.0 (iPad; CPU OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/120.0.6099.119 Mobile/15E148 Safari/604.1
Bing Bot | Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

Best Practices and Troubleshooting

  • Match User Agent with Viewport: For realistic emulation, combine setting the user agent with page.setViewport() to match the screen dimensions of the emulated device.
  • Rotate User Agents: For scraping at scale, avoid using the same user agent for all requests. Implement a rotation strategy using a list of diverse user agents.
  • Avoid Common Bot User Agents: Some user agents are commonly associated with bots (e.g., cURL, Python requests). Using these will likely result in immediate blocking.
  • Check Other Headers: Sometimes, websites look beyond just the User-Agent header. Consider also setting Accept-Language, Referer, and other headers via page.setExtraHTTPHeaders() for more comprehensive stealth.
  • User Agent String Accuracy: Ensure the user agent string is syntactically correct and represents a real browser/device combination to avoid raising suspicion.
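
Several of these practices can be combined by treating the user agent, viewport, and extra headers as one profile and rotating through a list of profiles. The following is a minimal sketch, not a definitive implementation: PROFILES, pickProfile, and applyProfile are hypothetical names, and the viewport sizes and Accept-Language values are illustrative assumptions rather than measured device specifications.

```javascript
// Illustrative profiles: each pairs a user agent with a plausible viewport and headers.
// The viewport numbers and header values are assumptions, not measured device specs.
const PROFILES = [
  {
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    viewport: { width: 1366, height: 768, isMobile: false, hasTouch: false },
    headers: { 'Accept-Language': 'en-US,en;q=0.9' },
  },
  {
    userAgent: 'Mozilla/5.0 (Linux; Android 14; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36',
    viewport: { width: 412, height: 915, deviceScaleFactor: 2.625, isMobile: true, hasTouch: true },
    headers: { 'Accept-Language': 'en-US,en;q=0.9' },
  },
];

// Round-robin rotation: request number n gets profile n modulo the list length.
function pickProfile(profiles, n) {
  return profiles[n % profiles.length];
}

// Apply UA, viewport, and extra headers together so they describe the same device.
async function applyProfile(page, profile) {
  await page.setUserAgent(profile.userAgent);
  await page.setViewport(profile.viewport);
  await page.setExtraHTTPHeaders(profile.headers);
}
```

A scraping loop might call `await applyProfile(page, pickProfile(PROFILES, i))` on each freshly created page before navigating. When only the user agent and viewport need to change, page.emulate({ userAgent, viewport }) applies both in a single call.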

By understanding these methods and best practices, you can effectively manage and modify the user agent in your Puppeteer projects, enhancing your automation capabilities.