Old web pages

Steve steve at advocate.net
Mon Nov 30 08:55:01 PST 1998


Web Pages Must Live Forever

Jakob Nielsen's Alertbox 11/29/98


Once you have put a page on the Web, you need to keep it there
indefinitely: 

	Other sites may link to your page, so removing it will cause
linkrot and lost business opportunities as you turn away new users.

	Users may have bookmarked the page because they want to go directly
to a relevant part of your site instead of starting at the home page
every time.

	Search engines are slow in updating their databases, so they too
will lead users astray if you remove pages.

	Old content adds value to your site: some users will benefit from
the old pages, so why not keep serving these customers?

The first three reasons are really arguments why URLs must stay
active forever: any URL that has ever been exposed to the outside
world must continue to bring up something reasonable when people go
to it. Because they will. It is common experience among webmasters
that they keep getting hits on URLs that were put out of service
several years ago. 

Even if you believe that the old page has zero value, the old URL
should be supported and made into a redirect to the closest related
page on the site. 
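
The redirect rule above can be sketched as a simple lookup table. This is only a minimal illustration: the paths, the REDIRECTS and LIVE_PAGES tables, and the resolve function are hypothetical names invented for the example, not part of any particular server.

```python
# Hypothetical table mapping retired URLs to the closest related live page.
REDIRECTS = {
    "/products/1995-printer.html": "/products/printers/",
    "/events/conf97.html": "/events/",
}

# Hypothetical set of pages that still exist on the site.
LIVE_PAGES = {"/", "/products/printers/", "/events/"}


def resolve(path):
    """Return (HTTP status, location) for a requested path."""
    if path in LIVE_PAGES:
        return 200, path
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 is a permanent redirect: the old URL keeps working.
        return 301, target
    # The path was never published, so there is nothing to preserve.
    return 404, None


print(resolve("/events/conf97.html"))
```

The design point is that an entry is added to the table whenever a page is retired, so no URL that was ever exposed ends in an error.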

Value of Old Content

Most old pages do have value for users, so I recommend keeping the
pages themselves alive forever. Sure, new content is probably more
valuable than old content, but there is more old content to choose
from. As an example, consider a site that publishes new content on a
weekly basis. After a year, this site will consist of 51 old editions
and one new edition. Assuming that new content is ten times as
valuable as old content, 84% of the site's value comes from old
content. 
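
The arithmetic behind the 84% figure can be checked with a short calculation. The function name and its parameters are made up for this sketch; the numbers are the ones from the example above.

```python
def old_content_share(old_editions, new_editions, new_to_old_value_ratio):
    """Fraction of total site value contributed by old editions.

    Each old edition counts as 1 unit of value; each new edition
    counts as new_to_old_value_ratio units.
    """
    old_value = old_editions * 1.0
    new_value = new_editions * new_to_old_value_ratio
    return old_value / (old_value + new_value)


# 51 old editions, 1 new edition, new content worth ten times as much.
share = old_content_share(51, 1, 10)
print(f"{share:.0%}")  # 51 / 61, prints 84%
```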

I still get about 50 visitors per week who follow the link to my site
from an article about usability in The New York Times four months
ago. Since only a fraction of readers click through on a link, the
newspaper is providing extra value to many more readers than that
simply by leaving this old article on its server. This is a great way
to establish a reputation as a substantial online service of record.

A typical Alertbox accumulates about 80,000 page views over time,
only 20,000 of which are received while it is the "current" column. 

Users benefit from old content because: 

	It may be intrinsically interesting and worth reading even when
it's not news (say, a well-written essay)

	It can become of renewed interest due to later events (what did the
new CEO of your main competitor do two jobs ago?)

	It can have historical interest (how did reviewers view Gone With
the Wind when it opened?)

	It helps with old products (your neighbor has an HP printer from
1995 for sale: will it satisfy your needs?)

	It provides background information and a richer texture for a
website: the true killer app for the Web is diversity (Amazon.com
gets many sales from listing a huge number of old books that each
sell only a few copies per year; listing out-of-print books that
they don't sell adds to the value of the service and makes users more
likely to come back) 

Cost of Old Content

From a site management perspective, the cost of keeping old content
is trivial: the cost of hard disk space is close to zero, and the
cost of maintaining old files can be very low if they are developed
according to the HTML standards or kept in a publishing database. 

To enhance the value of the old content, I recommend investing a
small amount of resources in content gardening:

	Have new articles link to old content for background or
supplementary information: since the new content may be written by
people who don't know the old stuff, it is often an editorial
function to add these links.

	Maintain the links in the old files: kill or replace outdated ones.

	Add forward links to the old pages so that they point to newer
pages: otherwise users will never discover follow-on products and
more recent developments in a case.

	Remove obsolete or misleading information and replace with current
data or a current link (for example, the announcement of a
conference or product launch may be replaced with the proceedings or
a report from the event; also add a forward link to this year's
conference).

The cost of maintaining old content may be about 10% of the original
cost of developing this content, but since doing so more than doubles
the value of the website, it is a good investment. 

Make Time Explicit

It can be confusing for users to stumble across old content if they
are looking for current information. Confusion can be minimized by:

	Explicitly mentioning the date the page was originally written.

	Adding a prominent disclaimer to point out ways in which the page
no longer applies (e.g., "This product is no longer being
manufactured").

	Providing forward-pointing links to the most recent pages about the
same topic.
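
As a rough sketch, the three measures above could be generated as a short notice placed at the top of each old page. The time_notice function, its parameters, and the wording of the notice are assumptions made for this illustration only.

```python
from datetime import date


def time_notice(written, disclaimer=None, newer_url=None):
    """Build a plain-text notice that makes a page's age explicit."""
    # Always state the date the page was originally written.
    lines = [f"Originally written: {written.isoformat()}"]
    if disclaimer:
        # Point out ways in which the page no longer applies.
        lines.append(f"Note: {disclaimer}")
    if newer_url:
        # Point forward to the most recent page on the same topic.
        lines.append(f"Most recent page on this topic: {newer_url}")
    return "\n".join(lines)


print(time_notice(
    date(1995, 6, 1),
    disclaimer="This product is no longer being manufactured",
    newer_url="/products/printers/",  # hypothetical path
))
```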

Downplay Old Content in Search Listings

After a few years of accumulating old content, search results
listings can be dominated by pointers to old stuff unless steps are
taken to increase the priority of new content. 

The simplest solution is to have the search engine give a lower
weight to old pages. Note that the weight should be computed
relative to the creation date and not to the latest modification date
(which will often be very recent if the old content has been
maintained properly).
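
One possible way to lower the weight of old pages is sketched here. The half-life decay is an assumption chosen for the example (the column only says that old pages should get a lower weight), and the Page fields and function names are hypothetical. Note that the age is taken from the creation date, never from the modification date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    url: str
    relevance: float  # raw keyword score from the search engine
    created: date     # date the page was originally written
    modified: date    # latest edit; deliberately ignored below


def score(page, today, half_life_days=365.0):
    """Halve a page's weight for every half_life_days of age."""
    age_days = max((today - page.created).days, 0)
    return page.relevance * 0.5 ** (age_days / half_life_days)


def rank(pages, today):
    """Order search results by age-adjusted score, best first."""
    return sorted(pages, key=lambda p: score(p, today), reverse=True)


today = date(1998, 11, 30)
pages = [
    # An old page that was gardened last week still counts as old.
    Page("/old.html", 1.0, date(1996, 11, 30), date(1998, 11, 23)),
    Page("/new.html", 0.8, date(1998, 11, 1), date(1998, 11, 1)),
]
print([p.url for p in rank(pages, today)])
```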

A more advanced solution is to think of search as more of an index to
the site and less of a simple keyword-counting operation. In this
model, the search weight of old content will change based on its
changing value as a resource for each query. It is a difficult
research challenge to fully do this, but a manual approximation would
be to have the content gardener change the search weights for each
meta-keyword based on its current relevance. 
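
The manual approximation might look something like the following table of hand-set weights. The keyword_weights table, the example URL, and the weighted_score function are hypothetical; a real search engine would have its own way of storing such overrides.

```python
# Weight used when the content gardener has not set one.
DEFAULT_WEIGHT = 1.0

# Hypothetical weights maintained by hand, keyed by (page, meta-keyword).
# Here an old conference announcement is downplayed for "conference"
# queries but promoted for "proceedings" queries.
keyword_weights = {
    ("/events/conf97.html", "conference"): 0.2,
    ("/events/conf97.html", "proceedings"): 1.5,
}


def weighted_score(url, keyword, base_score):
    """Scale the engine's base score by the gardener's weight."""
    weight = keyword_weights.get((url, keyword), DEFAULT_WEIGHT)
    return base_score * weight


print(weighted_score("/events/conf97.html", "conference", 10.0))
print(weighted_score("/events/conf97.html", "proceedings", 10.0))
```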



