Headless browser
A headless browser is a web browser without a graphical user interface.
Headless browsers provide automated control of a web page in an environment similar to popular web browsers, but they are driven through a command-line interface or over a network connection. They are particularly useful for testing web pages because they render and interpret HTML the same way a regular browser would, including styling (page layout, color, font selection) and the execution of JavaScript and Ajax, which are usually not available with other testing methods.[1]
Since version 59 of Google Chrome[2][3] and version 56[4] of Firefox,[5] there is native support for remote control of the browser. This made earlier efforts obsolete, notably PhantomJS.[6]
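Remote control of this kind typically means exchanging JSON messages with the browser over a network connection. The following is a minimal sketch for Chrome's DevTools Protocol, assuming the browser was started separately with the `--headless` and `--remote-debugging-port` flags. The port number and the helper names `cdp_command` and `page_websocket_url` are illustrative, not part of any library. The Python standard library has no WebSocket client, so the sketch stops at building the messages and locating the socket.

```python
import itertools
import json
import urllib.request

# Assumption: Chrome is already running with
#   --headless --remote-debugging-port=9222
# The port is illustrative.
DEBUG_ENDPOINT = "http://127.0.0.1:9222"

_message_ids = itertools.count(1)


def cdp_command(method, **params):
    """Serialize one DevTools Protocol command as a JSON text message."""
    return json.dumps({"id": next(_message_ids), "method": method, "params": params})


def page_websocket_url(endpoint=DEBUG_ENDPOINT):
    """Return the WebSocket URL of the first open page target."""
    with urllib.request.urlopen(f"{endpoint}/json/list", timeout=5) as response:
        targets = json.load(response)
    for target in targets:
        if target.get("type") == "page":
            return target["webSocketDebuggerUrl"]
    raise LookupError("no page target is open")


if __name__ == "__main__":
    # Messages a client would send over the page's WebSocket connection.
    print(cdp_command("Page.navigate", url="https://example.com"))
    print(cdp_command("Page.captureScreenshot", format="png"))
```

A real client would open the URL returned by `page_websocket_url` with a WebSocket library and match each reply to its request through the `id` field.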
Use cases
The main use cases for headless browsers are:
- Test automation in modern web applications (web testing)
- Taking screenshots of web pages
- Running automated tests for JavaScript libraries
- Automating interaction with web pages
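As an illustration of the screenshot use case, the sketch below builds a one-shot command line for headless Chrome. The binary name `google-chrome` is an assumption and varies between platforms, and the helper names, output path and window size are illustrative defaults.

```python
import shutil
import subprocess


def screenshot_command(url, output="page.png", width=1280, height=720,
                       binary="google-chrome"):
    """Build the argument list for a one-shot headless Chrome screenshot."""
    return [
        binary,
        "--headless",
        f"--screenshot={output}",
        f"--window-size={width},{height}",
        url,
    ]


def take_screenshot(url, **options):
    """Run the command if the browser is installed; return True on success."""
    command = screenshot_command(url, **options)
    if shutil.which(command[0]) is None:
        return False  # browser binary not found on PATH
    return subprocess.run(command, check=False).returncode == 0


if __name__ == "__main__":
    print(" ".join(screenshot_command("https://example.com")))
```

Separating the command builder from the function that runs it keeps the flags easy to inspect on machines where no browser is installed.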
Other uses
Headless browsers are also useful for web scraping.[7] Google stated in 2009 that using a headless browser could help their search engine index content from websites that use Ajax.[8]
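The difficulty with Ajax-driven pages can be seen with a small sketch that uses only Python's standard `html.parser` module. The page content and the `static_text` helper are invented for illustration. A parser that does not execute scripts sees only the static markup, so text that a script would insert at run time is missing from the result.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def static_text(html):
    """Return the text a non-scripting parser finds in the markup."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the price only appears after the script has run.
PAGE = """<html><body>
<h1>Price</h1>
<p id="price"></p>
<script>document.getElementById("price").textContent = "42 EUR";</script>
</body></html>"""

if __name__ == "__main__":
    print(static_text(PAGE))
```

A headless browser loading the same page would run the script and expose the filled-in paragraph, which is why it is better suited to scraping or indexing such content.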
Headless browsers have also been misused in various ways:
- Performing DDoS attacks on web sites.[9]
- Increasing advertisement impressions.[10]
- Automating web sites in unintended ways,[11] e.g. for credential stuffing.[12]
However, a 2018 study of browser traffic found no preference by malicious actors for headless browsers.[3] It found no indication that headless browsers are used more frequently than non-headless browsers for malicious purposes such as DDoS attacks, SQL injection or cross-site scripting attacks.
Usage
Because several major browsers natively support headless mode through APIs, software exists that performs browser automation through a unified interface. Examples include:
- Selenium WebDriver - a W3C-compliant implementation of WebDriver[13]
- Playwright - a Node.js library to automate Chromium, Firefox and WebKit[14]
- Puppeteer - a Node.js library to automate Chrome[15]
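The idea of a unified interface can be sketched with the W3C WebDriver protocol, in which every command is an HTTP request carrying JSON. The example below assumes a WebDriver server (such as chromedriver or geckodriver) is already listening at the given address, which is illustrative. The helper names are hypothetical, and only the vendor-specific option that requests headless mode differs between browsers.

```python
import json
import urllib.request

# Assumption: a WebDriver server is listening here; the address is illustrative.
SERVER = "http://127.0.0.1:4444"

# Vendor-specific capability key and argument that request headless mode.
HEADLESS_OPTIONS = {
    "chrome": ("goog:chromeOptions", ["--headless"]),
    "firefox": ("moz:firefoxOptions", ["-headless"]),
}


def new_session_payload(browser):
    """Body of POST /session asking for a headless browser of the given kind."""
    key, args = HEADLESS_OPTIONS[browser]
    return {
        "capabilities": {
            "alwaysMatch": {"browserName": browser, key: {"args": args}}
        }
    }


def call(method, path, body=None, server=SERVER):
    """Send one WebDriver command and return the "value" field of the reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        server + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["value"]


def page_title(browser, url):
    """Open a headless session, load the URL, and return the page title."""
    session_id = call("POST", "/session", new_session_payload(browser))["sessionId"]
    try:
        call("POST", f"/session/{session_id}/url", {"url": url})
        return call("GET", f"/session/{session_id}/title")
    finally:
        call("DELETE", f"/session/{session_id}")
```

With a server running, `page_title("firefox", "https://example.com")` and `page_title("chrome", "https://example.com")` would issue the same sequence of requests. Libraries such as Selenium wrap these calls behind a higher-level API.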
Test automation
Some test automation software and frameworks include headless browsers as part of their testing apparatus.[3]
- Capybara uses headless browsing, via either WebKit or Headless Chrome, to mimic user behavior in its testing protocols.[16]
- Jasmine uses Selenium by default, but can use WebKit or Headless Chrome to run browser tests.[17]
- Cypress, a frontend testing framework.
- QF-Test, a tool for automated testing of programs through their graphical user interface, which can also use a headless browser for testing.
Alternatives
Another approach is to use software that provides browser APIs. For example,
Another is HtmlUnit, a headless browser written in Java. HtmlUnit uses the Rhino engine to provide JavaScript and Ajax support as well as partial rendering capability.[22][23]
List of headless browsers
The following software provides headless browser APIs.
- Splash is a headless web browser written in
- Zombie.js is a simulated browser environment for Node.js.[26]
- SimpleBrowser is a headless web browser written in C# supporting .NET Standard 2.0[27]
- DotNetBrowser is a proprietary .NET Chromium-based library that provides an off-screen rendering mode and can be used without embedding or displaying windows.[28][29]
Another notable earlier effort was envjs, released in 2008 by John Resig, a simulated browser environment written in JavaScript for the Rhino engine.[30]
References
- ^ "What is a headless browser?". arhg.net. 7 October 2009.
- ^ "Getting Started with Headless Chrome". developers.google.com. 27 April 2017.
- ^ a b c Bekerman, Dima (2018-11-28). "Headless Chrome: DevOps Love It, So Do Hackers, Here's Why | Imperva". Blog. Retrieved 2021-02-22.
- ^ "Firefox 56 release notes". developer.mozilla.org. 26 February 2023.
- ^ "Headless mode - browser support". developer.mozilla.org. Archived from the original on 2018-06-03. Retrieved 2017-08-31.
- ^ "Quick Start". phantomjs.org.
- ^ Staff Reporter (2024-04-04). "Browsing Without a Browser: The Rise of Headless Technology". thearabianpost.com.
- ^ Mueller, John (2009-10-07). "Official Google Webmaster Central Blog: A proposal for making AJAX crawlable". Official Google Webmaster Central Blog.
- ^ Rawlings, Matt (2013-11-20). "Headless Browser Botnet Used in 150 hour DDoS attack". Business 2 Community.
- ^ Mello Jr., John P. (2014-03-25). "Headless Web Traffic Threatens Internet Economy". ecommercetimes.com.
- ^ Raywood, Dan (2014-04-01). "Headless browsers: legitimate software that enables attack". ITProPortal.
- ^ Mueller, Neal. "Credential stuffing". owasp.org.
- ^ Sheth, Himanshu (2020-11-17). "Selenium 4 Is Now W3C Compliant: All You Need To Know".
- ^ "GitHub - Playwright". GitHub. Retrieved 2021-04-11.
- ^ "GitHub - Puppeteer". GitHub. Retrieved 2021-04-11.
- ^ Silva, Francisco (2019-05-29). "From capybara-webkit to Headless Chrome and ChromeDriver". Blog | Imaginary Cloud. Retrieved 2021-02-22.
- ^ Bintz, John. "jasmine-headless-webkit -- The fastest way to run your Jasmine specs!". johnbintz.github.io. Retrieved 2021-02-22.
- ^ "JSDOM at GitHub - Pretending to be a visual browser". GitHub. Retrieved 2021-04-18.
- ^ "assaf/zombie". GitHub.
- ^ "ヘルペスが口や目からうつる?感染した時の症状と病院の治療方法とは". www.envjs.com. Archived from the original on 2015-02-23. Retrieved 2015-03-13.
- ^ "JavaScriptMVC - EnvJS". javascriptmvc.com.
- ^ Mike Bowler. "HtmlUnit – Welcome to HtmlUnit". sourceforge.net.
- ^ "Platform (Vaadin 7.3.4 API)". vaadin.com. 6 November 2014.
- ^ "scrapinghub/splash". GitHub. 20 December 2021.
- ^ "DARPA - Open Catalog". Archived from the original on 2015-05-28. Retrieved 2015-05-28.
- ^ "Zombie". labnotes.org.
- ^ SimpleBrowserDotNet/SimpleBrowser, SimpleBrowserDotNet, 2021-02-10, retrieved 2021-02-22
- ^ DotNetBrowser Examples, TeamDev, 2021-03-12, retrieved 2021-03-12
- ^ "DotNetBrowser". TeamDev. 2021-05-05.
- ^ Resig, John (2008-10-12). "env-js: A pure-JavaScript browser environment" – via GitHub.