Archive site
In web archiving, an archive site is a website that stores information on webpages from the past for anyone to view.
Common techniques
Two common techniques for archiving websites are using a web crawler or soliciting user submissions:
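- Using a web crawler: A crawler gathers pages automatically, without depending on contributors, but website operators can block crawlers from accessing certain pages (using a robots.txt file).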
- User submissions: Submission services can be difficult to start because submission rates may be low at first, but this approach can yield some of the best results. A crawler can obtain only the information the public has chosen to post online, and potential content providers may not post certain information because they assume no one would be interested in it, because they lack a proper venue in which to post it, or because of copyright concerns.[1] Users who see that someone wants their information may be more apt to submit it.
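As an illustration of the crawler approach, a crawler-based archiver typically consults a site's robots.txt rules before fetching a page. The following is a minimal sketch using Python's standard `urllib.robotparser` module; the robots.txt content, the `example-archiver` user agent, the `allowed_urls` helper, and the example.org URLs are all hypothetical and not taken from any particular archive site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this
# file from the site it is about to archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical candidate pages for the archive.
candidates = [
    "https://example.org/index.html",
    "https://example.org/private/notes.html",
]

archivable = allowed_urls(ROBOTS_TXT, "example-archiver", candidates)
print(archivable)
```

Under these assumed rules, only the page outside `/private/` remains in the list, which mirrors how a site operator can keep parts of a website out of a crawler-built archive.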
Examples
Google Groups
On 12 February 2001,
Internet Archive
The Internet Archive is building a compendium of websites and digital media. Since 1996, the Archive has used a web crawler to build up its database. It is one of the best-known archive sites.
NBCUniversal Archives
NBCUniversal Archives offers access to exclusive content from NBCUniversal and its subsidiaries. Its website provides easy viewing of past and recent news clips, and it is a prime example of a news archive.[3]
Nextpoint
Nextpoint offers an automated, cloud-based SaaS for marketing, compliance, and litigation-related needs, including electronic discovery.
PANDORA Archive
PANDORA (
textfiles.com
See also
- Internet Archive
- Pandora Archive
- WebCite
- Web archiving