URL redirection
URL redirection, also called URL forwarding, is a World Wide Web technique for making a web page available under more than one URL address.
URL redirection is done for various reasons:
- for URL shortening;
- to prevent broken links when web pages are moved;
- to allow multiple domain names belonging to the same owner to refer to a single web site;
- to guide navigation into and out of a website;
- for privacy protection (such as redirecting YouTube and Twitter links to Invidious and Nitter respectively, or turning AMP links into normal links); and
- for hostile purposes such as phishing attacks or malware distribution.
Purposes
There are several reasons to use URL redirection:
Forcing HTTPS
A website may be accessible over both the secure HTTPS URI scheme and plain HTTP (an insecure URI beginning with "http://").
If a user types in a URI or clicks on a link that refers to the insecure variant, the browser will automatically redirect to the secure version if the website is included in the HSTS preload list shipped with the browser or if the user has already visited the origin in the past.
Otherwise the website will be contacted over HTTP. A website operator may decide to serve such requests by redirecting the browser to the HTTPS variant instead and hopefully also priming HSTS for future accesses.
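The operator-side behaviour described above can be sketched roughly as follows. This is a minimal illustration, not a production handler: the function name force_https and the HSTS policy value are assumptions, and the sketch assumes default ports (an explicit ":80" in the URL would need extra handling).

```python
from urllib.parse import urlsplit

# Hypothetical policy: remember the HTTPS-only rule for one year.
HSTS_VALUE = "max-age=31536000; includeSubDomains"


def force_https(url):
    """Return (status, headers) for a request to the given absolute URL."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Keep host, path and query; only the scheme changes.
        target = parts._replace(scheme="https").geturl()
        return 301, {"Location": target}
    # Already on HTTPS: serve normally and prime HSTS for future visits.
    return 200, {"Strict-Transport-Security": HSTS_VALUE}
```

The HSTS header is only attached to the HTTPS response because browsers ignore it when it arrives over plain HTTP.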
Similar domain names
A user might mistype a URL. Organizations often register these misspelled domains and redirect them to the intended location. This technique is often used to "reserve" other top-level domains (TLD) with the same name, or make it easier for a ".edu" or ".net" site to accommodate users who type ".com".
Moving pages to a new domain
Web pages may be redirected to a new domain for three reasons:
- a site might desire, or need, to change its domain name;
- an author might move their individual pages to a new domain;
- two web sites might merge.
With URL redirects, incoming links to an outdated URL can be sent to the correct location. These links might be from other sites that have not realized that there is a change or from bookmarks/favorites that users have saved in their browsers. The same applies to search engines. They often have the older/outdated domain names and links in their database and will send search users to these old URLs. By using a "moved permanently" redirect to the new URL, visitors will still end up at the correct page. Also, in the next search engine pass, the search engine should detect and use the newer URL.
Logging outgoing links
The access logs of most web servers keep detailed information about where visitors came from and how they browsed the hosted site. They do not, however, log which links visitors left by. This is because the visitor's browser has no need to communicate with the original server when the visitor clicks on an outgoing link. This information can be captured in several ways. One way involves URL redirection. Instead of sending the visitor straight to the other site, links on the site can direct to a URL on the original website's domain that automatically redirects to the real target. This technique bears the downside of the delay caused by the additional request to the original website's server. As this added request will leave a trace in the server log, revealing exactly which link was followed, it can also be a privacy issue.[1] The same technique is also used by some corporate websites to implement a statement that the subsequent content is at another site, and therefore not necessarily affiliated with the corporation. In such scenarios, displaying the warning causes an additional delay.
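The outgoing-link technique can be sketched as follows. The /out path, the url parameter and the in-memory clicks list are assumptions made for illustration; a real site would write to its access log and should also validate the target, since an unchecked redirector of this kind is an open redirect.

```python
from urllib.parse import parse_qs, quote, urlsplit

clicks = []  # stands in for the server's access log


def tracking_link(target):
    """Build the on-site link that is shown instead of the external URL."""
    return "/out?url=" + quote(target, safe="")


def handle_out(request_path):
    """Record the click, then redirect to the real target."""
    query = parse_qs(urlsplit(request_path).query)
    target = query.get("url", [""])[0]
    if not target:
        return 400, {}
    clicks.append(target)
    return 302, {"Location": target}
```

For example, tracking_link("https://other.example/page") yields a link on the original site, and requesting that link leaves a record in clicks before the visitor is sent on.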
Short aliases for long URLs
Web applications often include lengthy descriptive attributes in their URLs which represent data hierarchies, command structures, transaction paths and session information. This practice results in a URL that is aesthetically unpleasant and difficult to remember, and which may not fit within the size limitations of microblogging sites. URL shortening services provide a solution to this problem by redirecting a user to a longer URL from a shorter one.[1]
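A URL shortening service can be sketched as a lookup table keyed by a short token. The base-62 counter scheme, the Shortener class and the short.example domain below are illustrative assumptions; real services differ in how they generate and store keys.

```python
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols


def encode(n):
    """Encode a non-negative integer in base 62."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class Shortener:
    def __init__(self):
        self._targets = {}

    def shorten(self, long_url):
        # The key is simply the next counter value in base 62.
        key = encode(len(self._targets) + 1)
        self._targets[key] = long_url
        return "https://short.example/" + key  # hypothetical domain

    def resolve(self, key):
        target = self._targets.get(key)
        if target is None:
            return 404, {}
        return 301, {"Location": target}
```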
Meaningful, persistent aliases for long or changing URLs
Sometimes the URL of a page changes even though the content stays the same. Therefore, URL redirection can help users who have bookmarks. This is routinely done on Wikipedia whenever a page is renamed.
Post/Redirect/Get
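Post/Redirect/Get (PRG) is a web development design pattern that prevents duplicate form submissions when the user refreshes the page after submitting a web form.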
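The Post/Redirect/Get flow can be sketched with a toy request handler. The /orders paths and the in-memory orders list are hypothetical; the point is only that the POST is answered with a redirect, so a later page refresh repeats the harmless GET rather than the POST.

```python
orders = []  # in-memory stand-in for a database


def handle(method, path, form=None):
    """Return (status, headers, body) for one request."""
    if method == "POST" and path == "/orders":
        orders.append(dict(form or {}))
        # 303 See Other makes the browser fetch the result with GET.
        return 303, {"Location": f"/orders/{len(orders)}"}, ""
    if method == "GET" and path.startswith("/orders/"):
        order_id = path[len("/orders/"):]
        if order_id.isdecimal() and 1 <= int(order_id) <= len(orders):
            return 200, {}, f"Order {order_id} received"
    return 404, {}, ""
```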
Device targeting and geotargeting
Redirects can be used effectively for targeting purposes such as geotargeting. Device targeting has become increasingly important with the rise of mobile clients. There are two approaches to serving mobile users: make the website responsive or redirect to a mobile version of the website. If a mobile version is offered, users with mobile clients are automatically forwarded to the corresponding mobile content. For device targeting, client-side redirects or non-cacheable server-side redirects are used. Geotargeting is the approach of offering localized content and automatically forwarding the user to a localized version of the requested URL. This is helpful for websites that target audiences in more than one location or language. Server-side redirects are usually used for geotargeting, but client-side redirects might be an option as well, depending on requirements.[2]
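A non-cacheable server-side targeting redirect might look roughly like the sketch below. The m.example.com host, the "Mobi" substring heuristic and the language-prefixed paths are assumptions for illustration, and the language picker ignores q-values for brevity.

```python
MOBILE_SITE = "https://m.example.com"  # hypothetical mobile host


def mobile_redirect(path, user_agent):
    """Redirect mobile clients; return None for everyone else."""
    # Crude heuristic: many mobile browsers include "Mobi" in User-Agent.
    if "Mobi" not in user_agent:
        return None
    headers = {
        "Location": MOBILE_SITE + path,
        "Cache-Control": "no-store",  # keep the redirect non-cacheable
        "Vary": "User-Agent",
    }
    return 302, headers


def language_redirect(path, accept_language, supported=("en", "de", "fr")):
    """Redirect to the first supported language listed by the client."""
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]
        if primary in supported:
            return 302, {"Location": f"/{primary}{path}", "Vary": "Accept-Language"}
    return None
```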
Manipulating search engines
Redirects have been used to manipulate search engines with unethical intentions.
Manipulating visitors
URL redirection is sometimes used as a part of phishing attacks that confuse visitors about which web site they are visiting.[5] Because modern browsers always show the real URL in the address bar, the threat is lessened. However, redirects can also take users to sites that attempt to attack them in other ways. For example, a redirect might take a user to a site that tries to trick them into downloading what purports to be antivirus software but instead installs a Trojan.
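Removing referrer information
When a link is clicked, the browser sends along in the HTTP request a Referer header field, which indicates the source of the link. A redirection page that hides the referrer can be placed in front of external URLs, transforming, for example, https://externalsite.com/page into https://redirect.company.com/https://externalsite.com/page. This technique also eliminates other potentially sensitive information from the referrer URL, such as the session ID, and can reduce the chance of phishing.
Implementation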
Several different kinds of response to the browser will result in a redirection. These vary in whether they affect HTTP headers or HTML content.
Manual redirect
The simplest technique is to ask the visitor to follow a link to the new page, usually using an HTML anchor like:
Please follow <a href="https://www.example.com/">this link</a>.
This method is often used as a fallback: if the browser does not support the automatic redirect, the visitor can still reach the target document by following the link.
HTTP status codes 3xx
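In the HTTP protocol used by the World Wide Web, a redirect is a response with a status code beginning with 3 that causes a browser to display a different page.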
HTTP/1.1 defines several status codes for redirection (RFC 7231):
- 300 multiple choices (e.g. offer different languages)
- 301 moved permanently (redirects permanently from one URL to another, passing link equity to the redirected page)
- 302 found (originally "temporary redirect" in HTTP/1.0 and popularly used for CGI scripts; superseded by 303 and 307 in HTTP/1.1 but preserved for backward compatibility)
- 303 see other (forces a GET request to the new URL even if the original request was POST)
- 305 use proxy (indicates that the requested resource is only available through a proxy)
- 307 temporary redirect (provides a new URL to which the browser resubmits the request with the same method, GET or POST)
- 308 permanent redirect (provides a new, permanent URL to which the browser resubmits the request with the same method, GET or POST)
Status codes
HTTP status code | HTTP version | Temporary / permanent | Cacheable | Request method of subsequent request |
---|---|---|---|---|
301 | HTTP/1.0 | Permanent | yes | GET / POST may change |
302 | HTTP/1.0 | Temporary | not by default | GET / POST may change |
303 | HTTP/1.1 | Temporary | never | always GET |
307 | HTTP/1.1 | Temporary | not by default | must not change |
308 | HTTP/1.1 | Permanent | by default | must not change |
All of these status codes require the URL of the redirect target to be given in the Location: header of the HTTP response. The 300 multiple choices will usually list all choices in the body of the message and show the default choice in the Location: header.
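The method rules in the table can be summarized in a small function. This is a sketch of commonly observed client behaviour, assuming that clients rewrite POST to GET on 301 and 302 (which the specification permits but does not require); the function name is made up for illustration.

```python
def next_request_method(status, method):
    """Return the method a typical client uses for the redirected request."""
    if status == 303:
        # Always GET, except that a HEAD request stays HEAD.
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Historical behaviour: POST is usually rewritten to GET.
        return "GET" if method == "POST" else method
    if status in (307, 308):
        # The method and body must be preserved.
        return method
    raise ValueError(f"not a redirect status handled here: {status}")
```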
Example HTTP response for a 301 redirect
An HTTP response with the 301 "moved permanently" redirect looks like this:
HTTP/1.1 301 Moved Permanently
Location: https://www.example.org/
Content-Type: text/html
Content-Length: 174

<html>
<head>
<title>Moved</title>
</head>
<body>
<h1>Moved</h1>
<p>This page has moved to <a href="https://www.example.org/">https://www.example.org/</a>.</p>
</body>
</html>
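A response of this shape can be assembled programmatically, which also keeps the Content-Length header in step with the body. The helper below is a sketch under the assumption that the target is an ASCII URL; its name and the exact body markup are illustrative.

```python
from html import escape


def build_301(location):
    """Build the raw bytes of a 301 response pointing at location."""
    if "\r" in location or "\n" in location:
        raise ValueError("line breaks are not allowed in a header value")
    link = escape(location, quote=True)
    body = (
        "<html><head><title>Moved</title></head><body>"
        f'<p>This page has moved to <a href="{link}">{link}</a>.</p>'
        "</body></html>"
    ).encode("utf-8")
    head = (
        "HTTP/1.1 301 Moved Permanently\r\n"
        f"Location: {location}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line separating headers from body
    ).encode("ascii")
    return head + body
```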
Using server-side scripting for redirection
Web authors producing HTML content can't usually create redirects using HTTP headers as these are generated automatically by the web server program when serving an HTML file. The same is usually true even for programmers writing CGI scripts, though some servers allow scripts to add custom headers (e.g. by enabling "non-parsed-headers"). Many web servers will generate a 3xx status code if a script outputs a "Location:" header line. For example, in PHP, one can use the "header" function:
header('HTTP/1.1 301 Moved Permanently');
header('Location: https://www.example.com/');
exit();
More headers may be required to prevent caching.[7] The programmer must ensure that the headers are output before the body. This may not fit easily with the natural flow of control through the code. To help with this, some frameworks for server-side content generation can buffer the body data. In the ASP scripting language, this can also be accomplished using response.buffer=true and response.redirect "https://www.example.com/".
HTTP/1.1 allows for either a relative URI reference or an absolute URI reference.[8] If the URI reference is relative the client computes the required absolute URI reference according to the rules defined in RFC 3986.[9]
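The resolution of a relative Location value against the request URL can be tried out with Python's standard library, whose urljoin function follows the RFC 3986 rules. The wrapper name below is only for illustration.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Compute the absolute redirect target for a Location header value."""
    return urljoin(request_url, location)


base = "https://www.example.com/a/b.html"
relative_target = resolve_location(base, "c.html")
rooted_target = resolve_location(base, "/new")
absolute_target = resolve_location(base, "https://other.example/")
```

Here relative_target is resolved against the directory of the request URL, rooted_target against the site root, and absolute_target is used unchanged.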
Apache HTTP Server mod_rewrite
The Apache HTTP Server mod_alias extension can be used to redirect certain requests. Typical configuration directives look like:
Redirect permanent /oldpage.html https://www.example.com/newpage.html
Redirect 301 /oldpage.html https://www.example.com/newpage.html
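For more flexible URL rewriting and redirection, Apache mod_rewrite can be used. For example, to redirect requests for an old host name to a new one: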
RewriteEngine on
RewriteCond %{HTTP_HOST} ^([^.:]+\.)*oldsite\.example\.com\.?(:[0-9]*)?$ [NC]
RewriteRule ^(.*)$ https://newsite.example.net/$1 [R=301,L]
Such configuration can be applied to one or all sites on the server through the server configuration files or to a single content directory through a .htaccess file.
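To see what the host pattern in the RewriteCond above accepts, the same regular expression can be exercised in Python. This is only a model of the rule, not of Apache itself: it assumes per-directory behaviour, where the leading slash is not part of the matched path, and the function name is invented.

```python
import re

# Same pattern as the RewriteCond; [NC] corresponds to IGNORECASE.
OLD_HOST = re.compile(
    r"^([^.:]+\.)*oldsite\.example\.com\.?(:[0-9]*)?$", re.IGNORECASE
)


def rewrite(host, path):
    """Return (status, location) if the host matches, else None."""
    if OLD_HOST.match(host):
        return 301, "https://newsite.example.net/" + path.lstrip("/")
    return None
```

The pattern matches the bare host name, any subdomain of it, an optional trailing dot and an optional port, but not unrelated hosts that merely end in the same characters.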
nginx rewrite
Nginx has an integrated http rewrite module,[10] which can be used to perform advanced URL processing and even web-page generation (with the return directive). A showcase example of such advanced use of the rewrite module is mdoc.su, which implements a deterministic URL shortening service entirely in the nginx configuration language.[11][12]
For example, if a request for /DragonFlyBSD/HAMMER.5 were to come along, it would first be redirected internally to /d/HAMMER.5 with the first rewrite directive below (only affecting the internal state, without any HTTP replies issued to the client just yet), and then with the second rewrite directive, an HTTP response with a 302 Found status code would be issued to the client to redirect it to the external web-man CGI script:
location /DragonFly {
    rewrite ^/DragonFly(BSD)?([,/].*)?$ /d$2 last;
}
location /d {
    set $db "https://leaf.dragonflybsd.org/cgi/web-man?command=";
    set $ds "&section=";
    rewrite ^/./([^/]+)\.([1-9])$ $db$1$ds$2 redirect;
}
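The two rewrite steps can be traced in Python using the same regular expressions. This is a model of the rules only, assuming that the $ds variable holds the query-string fragment "&section=" and that the redirect flag produces a 302 response; the function names are made up.

```python
import re

DB = "https://leaf.dragonflybsd.org/cgi/web-man?command="
DS = "&section="


def internal_rewrite(uri):
    """First rule: map /DragonFly... onto the short /d prefix."""
    m = re.match(r"^/DragonFly(BSD)?([,/].*)?$", uri)
    if m is None:
        return uri
    return "/d" + (m.group(2) or "")


def external_redirect(uri):
    """Second rule: turn /d/NAME.SECTION into an external 302 redirect."""
    m = re.match(r"^/./([^/]+)\.([1-9])$", uri)
    if m is None:
        return None
    return 302, DB + m.group(1) + DS + m.group(2)
```

Feeding /DragonFlyBSD/HAMMER.5 through internal_rewrite and then external_redirect reproduces the two stages described above.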
Refresh Meta tag and HTTP refresh header
Netscape introduced the meta refresh feature which refreshes a page after a certain amount of time. This can specify a new URL to replace one page with another. This is supported by most web browsers.[14][15] A timeout of zero seconds effects an immediate redirect. This is treated like a 301 permanent redirect by Google, allowing transfer of PageRank to the target page.[16]
This is an example of a simple HTML document that uses this technique:
<html>
<head>
<meta http-equiv="Refresh" content="0; url=https://www.example.com/" />
</head>
<body>
<p>Please follow <a href="https://www.example.com/">this link</a>.</p>
</body>
</html>
This technique can be used by web authors because the meta tag is contained inside the document itself.
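From the client's point of view, a meta refresh has to be discovered by parsing the document. The sketch below uses Python's standard HTML parser to extract the delay and target; the class name is an assumption, and real browsers accept more variations of the content syntax than this handles.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Collect the (delay, url) pair of the last meta refresh tag seen."""

    def __init__(self):
        super().__init__()
        self.refresh = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag != "meta" or (attr.get("http-equiv") or "").lower() != "refresh":
            return
        delay, _, rest = (attr.get("content") or "").partition(";")
        try:
            seconds = int(delay.strip())
        except ValueError:
            return  # ignore malformed delays
        rest = rest.strip()
        url = rest[4:].strip() if rest.lower().startswith("url=") else None
        self.refresh = (seconds, url)
```

After calling feed() with the example document, the refresh attribute holds the delay in seconds and the target URL.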
The same effect can be achieved with an HTTP refresh header:
HTTP/1.1 200 OK
Refresh: 0; url=https://www.example.com/
Content-Type: text/html
Content-Length: 78

Please follow <a href="https://www.example.com/">this link</a>.
This response is easier to generate by CGI programs because one does not need to change the default status code.
Here is a simple CGI program that effects this redirect:
#!/usr/bin/perl
# Emit the CGI headers, a blank line, then the fallback body.
print "Refresh: 0; url=https://www.example.com/\r\n";
print "Content-Type: text/html\r\n";
print "\r\n";
print "Please follow <a href=\"https://www.example.com/\">this link</a>!";
Note: Usually, the HTTP server adds the status line and the Content-Length header automatically.
The W3C discourages the use of meta refresh, since it does not communicate any information about either the original or the new resource to the browser (or search engine). The W3C's Web Content Accessibility Guidelines (7.4)[17] discourage the creation of auto-refreshing pages, since most web browsers do not allow the user to disable or control the refresh rate. W3C articles on the issue include W3C Web Content Accessibility Guidelines (1.0): Ensure user control of time-sensitive content changes, Use standard redirects: don't break the back button![18] and Core Techniques for Web Content Accessibility Guidelines 1.0 section 7.[19]
JavaScript redirects
JavaScript can cause a redirect by setting the window.location attribute, e.g.:
window.location='https://www.example.com/'
Normally JavaScript pushes the redirector site's URL to the browser's history. This can cause redirect loops when users hit the back button. The following call replaces the current history entry instead, which prevents this behaviour.[20]
window.location.replace('https://www.example.com/')
However, HTTP headers or the refresh meta tag may be preferred for security reasons and because JavaScript will not be executed by some browsers and many web crawlers.
Frame redirects
A slightly different effect can be achieved by creating an inline frame:
<iframe height="100%" width="100%" src="https://www.example.com/">
Please follow <a href="https://www.example.com/">link</a>.
</iframe>
One main difference to the above redirect methods is that for a frame redirect, the browser displays the URL of the frame document and not the URL of the target page in the URL bar. This cloaking technique may be used so that the reader sees a more memorable URL or to fraudulently conceal a phishing site as part of website spoofing.[21]
Before HTML5, the same effect could be achieved with an HTML frame that contains the target page:
<frameset rows="100%">
<frame src="https://www.example.com/">
<noframes>
<body>Please follow <a href="https://www.example.com/">link</a>.</body>
</noframes>
</frameset>
Redirect chains
One redirect may lead to another in a redirect chain. If a redirect leads to another redirect, this may also be known as a double redirect.[23] For example, the URL "https://wikipedia.com" (with "*.com" as domain) is first redirected to https://www.wikipedia.org/ (with domain name in .org), where you can navigate to the language-specific site. This is unavoidable if the different links in the chain are served by different servers though it should be minimised by rewriting the URL as much as possible on the server before returning it to the browser as a redirect.
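Where all links of a chain are under one operator's control, the chain can be shortened by rewriting the redirect table so that every source points directly at its final destination. The following is a minimal sketch with an invented function name and a plain dictionary as the redirect table.

```python
def flatten(redirects):
    """Map every source straight to the end of its redirect chain."""
    flat = {}
    for source, target in redirects.items():
        seen = {source}
        # Walk the chain until the target is no longer redirected itself.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop via {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat
```

With the table {"/a": "/b", "/b": "/c"}, a request for /a is then answered with a single redirect to /c rather than two hops.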
Redirect loops
Sometimes a mistake can cause a page to end up redirecting back to itself, possibly via other pages, leading to an infinite sequence of redirects. Browsers should stop redirecting after a certain number of hops and display an error message.
The HTTP/1.1 Standard states:[24]
A client SHOULD detect and intervene in cyclical redirections (i.e., "infinite" redirection loops).
Note: An earlier version of this specification recommended a maximum of five redirections ([RFC 2068], Section 10.3). Content developers need to be aware that some clients might implement such a fixed limitation.
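A client-side guard against cyclical redirections can be sketched as follows. The fetch callback is a stand-in for a real HTTP request that returns the next location or None, and the default limit of five hops simply mirrors the historical recommendation quoted above; actual clients choose their own limits.

```python
MAX_HOPS = 5


def follow(start, fetch, max_hops=MAX_HOPS):
    """Follow redirects from start and return the list of URLs visited."""
    visited = [start]
    url = start
    while True:
        target = fetch(url)
        if target is None:
            return visited  # final page reached
        if target in visited:
            raise RuntimeError(f"redirect loop detected at {target}")
        if len(visited) - 1 >= max_hops:
            raise RuntimeError("too many redirects")
        visited.append(target)
        url = target
```

For a quick trial, the get method of a dictionary such as {"/a": "/b", "/b": "/c"} can serve as the fetch callback.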
Services
There exist services that can perform URL redirection on demand, with no need for technical work or access to the web server on which the site is hosted.
URL redirection services
A redirect service is an information management system, which provides an internet link that redirects users to the desired content. The typical benefit to the user is the use of a memorable domain name, and a reduction in the length of the URL or web address. A redirecting link can also be used as a permanent address for content that frequently changes hosts, similarly to the Domain Name System. Hyperlinks involving URL redirection services are frequently used in spam messages directed at blogs and wikis. Thus, one way to reduce spam is to reject all edits and comments containing hyperlinks to known URL redirection services; however, this will also remove legitimate edits and comments and may not be an effective method to reduce spam. Recently, URL redirection services have taken to using
History
The first redirect services took advantage of
Referrer masking
Redirection services can hide the referrer by placing an intermediate page between the page the link is on and its destination.
Here is a simplistic example of such a service, written in PHP.
<?php
// Deliberately simplistic: the target is not validated in any way.
$target = 'https://' . ($_GET['url'] ?? '');
// Escape the target for HTML output; the header takes the raw value.
$escaped = htmlspecialchars($target, ENT_QUOTES);
header('Refresh: 0; url=' . $target);
?>
<!-- Fallback using meta refresh. -->
<html>
<head>
<title>Redirecting...</title>
<meta http-equiv="refresh" content="0;url=<?= $escaped ?>">
</head>
<body>
Attempting to redirect to <a href="<?= $escaped ?>"><?= $escaped ?></a>.
</body>
</html>
The above example does not check who called it (e.g. by referrer, although that could be spoofed). Also, it does not check the URL provided. This means that a malicious person could link to the redirection page from any page, using a URL parameter of their own choosing, and thereby consume the web server's resources.
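One common way to address the missing URL check is to accept only targets on an allow-list of hosts. The sketch below illustrates the idea; the host names, the function name and the fallback of "/" are assumptions, and a real deployment would need a policy suited to its own links.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts the service is willing to redirect to.
ALLOWED_HOSTS = {"www.example.com", "docs.example.com"}


def safe_target(raw, default="/"):
    """Return raw if it points at an allowed host, else the default."""
    try:
        parts = urlsplit(raw)
    except ValueError:
        return default  # malformed URL
    if parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS:
        return raw
    return default
```

Comparing the parsed host name, rather than searching the raw string for an allowed name, avoids being fooled by targets such as https://www.example.com@evil.example/.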
Security issues
URL redirection can be abused by attackers to perform phishing attacks, such as open redirect and covert redirect attacks.
URL redirection also provides a mechanism to perform
References
- ^ ISSN 1797-1993. Archived from the original on 17 August 2011.
- ^ "Redirects & SEO - The Total Guide". Audisto. Retrieved 29 November 2015.
- ^ "SEO advice: discussing 302 redirects". Matt Cutts, former Head of Google Webspam Team. 4 January 2006.
- ^ "Sneaky Redirects". Google Inc. 3 December 2015.
- ^ "Unvalidated Redirects and Forwards Cheat Sheet". Open Web Application Security Project (OWASP). 21 August 2014.
- ^ "Redirects & SEO - The Complete Guide". Audisto. Retrieved 29 November 2015.
- ^ "PHP Redirects: 302 to 301 Rock Solid Robust Solution". WebSiteFactors.co.uk. Archived from the original on 12 October 2012.
- .
- .
- ^ "Module ngx_http_rewrite_module - rewrite". nginx.org. Retrieved 24 December 2014.
- ^ Murenin, Constantine A. (18 February 2013). "A dynamic web-site written wholly in nginx.conf? Introducing mdoc.su!". [email protected] (Mailing list). Retrieved 24 December 2014.
- ^ Murenin, Constantine A. (23 February 2013). "mdoc.su – Short manual page URLs for FreeBSD, OpenBSD, NetBSD and DragonFly BSD". Retrieved 25 December 2014.
- ^ Murenin, Constantine A. (23 February 2013). "mdoc.su.nginx.conf". Retrieved 25 December 2014.
- ^ "HTML meta tag". www.w3schools.com.
- ^ "An Exploration of Dynamic Documents". 2 August 2002. Archived from the original on 2 August 2002.
- ^ "Google and Yahoo accept undelayed meta refreshs as 301 redirects". Sebastian's Pamphlets. 3 September 2007.
- ^ "Web Content Accessibility Guidelines 1.0". www.w3.org.
- ^ Team, the QA. "Use standard redirects". www.w3.org.
- ^ "Core Techniques for Web Content Accessibility Guidelines 1.0". www.w3.org.
- ^ "Cross-browser client side URL redirect generator". Insider Zone. Archived from the original on 26 July 2020. Retrieved 27 August 2015.
- ^ Aaron Emigh (19 January 2005). "Anti-Phishing Technology" Archived 27 September 2007 at the Wayback Machine (PDF). Radix Labs.
- ^ "HTML 5.2: 11. Obsolete features". www.w3.org.
- ^ Schwartz, Barry (18 December 2007). "Double Redirects May Take Google More Time To Pick Up On". Search Engine Roundtable. Retrieved 28 January 2024.
- .
- ^ "Net gains for tiny Pacific nation". BBC News. 14 September 2007. Archived from the original on 12 May 2014. Retrieved 27 May 2010.
- ^ ISBN 979-8-4007-0886-2.
- ^ "Open Redirect". OWASP. 16 March 2014. Retrieved 21 December 2014.
- ^ "Covert Redirect". Tetraph. 1 May 2014. Retrieved 21 December 2014.
- ^ "Serious security flaw in OAuth, OpenID discovered". CNET. 2 May 2014. Retrieved 21 December 2014.
- ^ Mike Williams (5 June 2022). "What is an Open Redirect vulnerability, why is it dangerous and how can you stay safe?". TechRadar. Retrieved 8 April 2024.
- ^ "CWE - CWE-601: URL Redirection to Untrusted Site ('Open Redirect') (4.14)". cwe.mitre.org. Retrieved 8 April 2024.
- ISBN 978-1-4503-8454-4.
External links
- Mapping URLs to Filesystem Locations - Apache HTTP Server Version 2.4
- Taxonomy of JavaScript Redirection Spam (Microsoft Live Labs)
- Security vulnerabilities in URL Redirectors The Web Application Security Consortium Threat Classification