Frequently Asked Questions
- What is included in the archives?
- What file types can be stored and accessed?
- What does it mean when there's an asterisk (*) next to a date on the search results page?
- Is there an easy way to compare two archived versions of a web site?
- How do I know the date a site was archived?
- Why does an archived page display today's date?
- Why isn't the site I am looking for in the archives?
- What types of web content cannot be harvested?
- Can I suggest a site for inclusion in the archives?
You can find terms used on this site in the glossary.
Visit our contact page here.
The collection consists of over 360 domain names that have been identified and appraised for frequent capture. We crawl every link within the originating domain, including images, text, and video.
Typically, if a file can be downloaded from the web without direct user intervention, it can be stored and accessed. We often cannot provide password-protected files, databases, or files that require filling out a form for access. To learn more about file types that cannot be harvested, click here.
Some web pages are rarely updated while others change often. When our automated system crawls the web, it finds that only about half of all pages have changed since our previous visit. The asterisk indicates that the content has been updated since the previously archived copy. If you don't see an asterisk next to an archived page, then the content on the archived page is probably identical to the previously archived copy.
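As an illustration only, the sketch below shows one way a changed capture could be flagged with an asterisk: compare a hash of each capture with a hash of the one before it. The function name `mark_changed`, the sample dates, and the rule that the first capture always gets an asterisk are assumptions made for this example, not a description of how the archive actually detects changes.

```python
import hashlib

def mark_changed(captures):
    """Return one label per capture, adding ' *' when the content differs
    from the previous capture (hypothetical rule for this example)."""
    labels = []
    previous_digest = None
    for date, content in captures:
        digest = hashlib.sha256(content).hexdigest()
        # Assumption: the first capture has nothing to compare against,
        # so it is treated as changed.
        changed = digest != previous_digest
        labels.append(f"{date} *" if changed else date)
        previous_digest = digest
    return labels

# Made-up sample captures of a single page.
captures = [
    ("2009-01-05", b"<p>Hours: 9-5</p>"),
    ("2009-02-05", b"<p>Hours: 9-5</p>"),
    ("2009-03-05", b"<p>Hours: 8-5</p>"),
]
labels = mark_changed(captures)
print(labels)
```

In this sample, the February capture is identical to the January one, so it is the only date listed without an asterisk.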
Yes. First, search for a page. On the results page for a particular URL, click “compare archive pages” at the top of the screen.
The page will reload with check boxes next to each date the site was archived. Choose the two versions you would like to compare and click the “compare two dates” button. Remember that if you don't see an asterisk next to an archived document, then the content on the archived page is probably identical to the previously archived copy. Deletions will appear in blue with a line through the text, and additions will appear in green.
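To show the idea behind a comparison view, here is a minimal sketch that uses Python's standard `difflib` module to split two versions of a page's text into unchanged, deleted, and added runs of words. The function name `compare_versions` and the sample text are made up for this example; the comparison tool in the archives may work quite differently.

```python
import difflib

def compare_versions(old_text, new_text):
    """Return (kind, text) pairs, where kind is 'unchanged', 'deleted', or 'added'."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            changes.append(("unchanged", " ".join(old_words[i1:i2])))
        if tag in ("delete", "replace"):
            # Words only in the older capture (shown in blue with a line through them).
            changes.append(("deleted", " ".join(old_words[i1:i2])))
        if tag in ("insert", "replace"):
            # Words only in the newer capture (shown in green).
            changes.append(("added", " ".join(new_words[j1:j2])))
    return changes

changes = compare_versions("Office hours are 9 to 5",
                           "Office hours are 8 to 5 daily")
for kind, text in changes:
    print(f"{kind:>9}: {text}")
```

A display layer could then render the "deleted" runs with a strikethrough and the "added" runs in a highlight color.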
The yellow band at the top of an archived page lists the date and time when a site was captured.
If a site contains code to calculate the current date, the current date will appear on the site regardless of the date it was actually added to the collection. You should check the yellow band at the top of an archived page for the date and time when a site was captured.
There are several reasons why the site may not be in the archives.
- It may not meet our criteria for capture.
- Content or technological reasons may impede harvest.
- The site owner may have requested that the site not be included in the collection.
If you believe that the site should be in the collection, click here for information on how you can recommend the site.
As a crawler visits a site, it will gather and organize the contents it encounters. This is known as harvesting. However, there are certain types of content that our crawler cannot harvest. These are:
- Robots.txt: A site owner can put a robots.txt file on a site to keep crawlers from crawling some or all of it. Our crawler will not harvest content that a robots.txt file excludes.
- Date Displays: If a site contains code to calculate the current date, the current date will appear on the site regardless of the date it was actually added to the collection. You should check the yellow band at the top of the archived site to determine the date the page was archived.
- Server-Side Image Maps: An image map that needs to contact the originating server in order to work will fail when archived.
- Streaming Media: This is a one-way transmission over a data network that is played as it is received and is not stored permanently on the requesting computer. While we can't harvest streaming media, we can harvest downloadable media files.
- Password-Protected Sites: The crawler cannot collect any site that requires a password or that is database driven, because such sites require user input. This includes https sites.
- Form-Driven Content: If you need to fill in a form to get access to the content, the crawler typically cannot retrieve this content.
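A robots.txt file lists the paths that crawlers are asked to stay out of. The sketch below uses Python's standard `urllib.robotparser` module to check which URLs such a file would leave open to a crawler. The robots.txt rules, the crawler name `example-crawler`, and the example.com URLs are all invented for illustration; this is not the configuration of our crawler.

```python
from urllib.robotparser import RobotFileParser

def allowed_urls(robots_lines, user_agent, urls):
    """Return the URLs that the given robots.txt rules allow user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules directly, without any network access
    return [url for url in urls if parser.can_fetch(user_agent, url)]

# Hypothetical robots.txt that blocks only the images directory.
robots_txt = [
    "User-agent: *",
    "Disallow: /images/",
]
urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/images/seal.gif",
]
harvestable = allowed_urls(robots_txt, "example-crawler", urls)
print(harvestable)
```

With these sample rules, the page itself can be harvested but the image cannot.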
Yes. To do so, please contact us with the proposed URL and an explanation of why you believe the site should be included. We will evaluate the site using the Collection Procedures for State Government Web Sites Using Archive-It (pdf) and then contact you regarding your request.
The pages in the archives are made available to the public for use in research, teaching, and private study, pursuant to the U.S. Copyright Law. The user must assume full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. Click here to see our full copyright statement.
We provide full-text search capability for the archives. Alternatively, if you know the site you are looking for, you can enter the URL into the search box and view all instances of that archived URL. For more help with searching, consult this document.
We do not prohibit downloading from our collection; however, the user must assume full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. View our full copyright statement here. Whenever materials from our collection are used in a publication or other product, we request that the copy carry a credit line stating “Courtesy of the North Carolina Department of Cultural Resources.”
You will need the following information to set up proxy mode to browse the NC Web Site Archives.
Host = wayback.archive-it.org
Port = 9194
Now, follow the link below that corresponds to the type of browser you wish to use.
- Firefox (HTTP Proxy), with the foxyproxy add-on
- Internet Explorer 7 or 8
- Internet Explorer 6 (or earlier)
- Safari (Web Proxy HTTP)
After you have changed to proxy mode, open a browser window and type in the URL whose archived version you are interested in viewing. You will see the "archived website disclaimer" bar at the top of the screen. Keep in mind that (1) browsing in proxy mode will only display the most recent capture date of the website you are browsing and (2) using the settings given in this FAQ will only work with sites from the NC Web Site Archives.
To return your browser to normal, follow the same instructions used above and then deselect the option to enter proxy mode.
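The same host and port can also be used from a script instead of a browser. The sketch below, which assumes the proxy accepts ordinary HTTP proxy requests, builds a `urllib.request` opener that routes HTTP traffic through it. The function names are made up for this example, and the actual request is left commented out because it needs network access and a proxy that is still running.

```python
import urllib.request

PROXY_HOST = "wayback.archive-it.org"  # host given in this FAQ
PROXY_PORT = 9194                      # port given in this FAQ

def proxy_url(host=PROXY_HOST, port=PROXY_PORT):
    """Format the proxy address the way urllib expects it."""
    return f"http://{host}:{port}"

def build_archive_opener():
    """Return an opener that sends HTTP requests through the archive proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url()})
    return urllib.request.build_opener(handler)

opener = build_archive_opener()

# Usage (hypothetical URL; requires network access and a working proxy):
# html = opener.open("http://www.example.com/").read()
```

Because the opener is a separate object, other requests made by the script are unaffected, which mirrors switching a browser back out of proxy mode.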
Most of the captured sites display best in either Mozilla Firefox or Internet Explorer 7, so check the page in both browsers. Download Firefox here and IE7 here. Other display issues result from frames on a web page; these are caused by a bug in the archives. Please note, however, that we are regularly working on resolving these issues.
Below is a list of common error messages you may encounter while searching the archives. If you see an error message that does not have the Internet Archive Wayback Machine logo in the upper left corner, you are most likely looking at an archived error page or the live web.
- Failed Connection: The server that the particular piece of information is stored on is down. Generally these errors clear up within two weeks.
- Robots.txt Query Exclusion: A site owner can put a robots.txt file on a site to keep crawlers from crawling some or all of it. Our crawler will not harvest content that a robots.txt file excludes.
- Blocked Site Error: Site owners or copyright holders have requested that the site be excluded from the collection. It is possible that the State Archives obtained a copy of the web site you are looking for directly from the agency without using the automated crawler. Please contact us to determine if the web site is available.
- Path Index Error: A path index error message refers to a problem in our database. These errors may take time to fix. If you encounter this error message, please alert us to the problem by contacting us and identifying the link that you were trying to reach and the page that you were trying to link from.
- Not in Archive: The page you are trying to access is not part of the archives. Refer to this question for reasons why a site may not be included in the archives.
If you are following links from one domain to another domain, both in the collection, it is possible the new domain was captured on a different date. In that case, we will display the closest available capture date of the new domain. To make sure you know what version of the web you are viewing, pay attention to the date listed in the yellow band at the top of the archived page.
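Choosing the "closest available capture date" can be pictured as picking the capture with the smallest time difference from the date you were viewing. The sketch below is only an illustration of that idea; the function name `closest_capture` and the sample dates are invented, and the archive's own selection logic may differ.

```python
from datetime import datetime

def closest_capture(requested, available):
    """Return the capture time in `available` that is nearest to `requested`."""
    return min(available, key=lambda captured: abs(captured - requested))

# Made-up capture dates for the domain being linked to.
available = [
    datetime(2009, 1, 10),
    datetime(2009, 6, 1),
    datetime(2009, 12, 20),
]
# The date of the page you were viewing when you followed the link.
requested = datetime(2009, 6, 15)

best = closest_capture(requested, available)
print(best.date())
```

The chosen capture can be earlier or later than the page you came from, which is why the date in the yellow band is the reliable guide.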
Most images display properly in the archives. A small red "x" where the image should be means that technological issues prevented the capture of the image content. A grayed-out image means that the site owner used robots.txt exclusions to block access to the images directory.
All web sites in the collection have been carefully selected using the Collection Procedures for State Government Web Sites Using Archive-It (pdf). In general we will not honor requests to remove sites unless they come directly from the site owner. Site owners can request manual exclusions of their web sites by contacting us. However, if site owners choose not to participate through this automated method, they will need to arrange for a copy of the site to be delivered to the State Archives. Please see the Guidelines for Maintaining and Preserving Records of Web-Based Activities for more information.