Web Counters and Server Logs

Web Counters

As Jeff Goldberg says in a slightly different context, there is a wrong way and a very wrong way to use web counters.

The very wrong way, from the standpoint of accuracy, is to use a counter which relies on Java or JavaScript. The numbers from such counters will wildly underestimate the number of times your page has been viewed and the variety of browsers that your visitors are using. Many people do not or cannot use Java or JavaScript.

Counters which rely on cookies are flawed for a similar reason. Some browsers cannot accept cookies, and some visitors refuse all cookies.

The wrong way to gauge the number of your page's visitors is to use a counter which depends on loading an image. What if the image breaks? What if the visitor is using a text browser, or has images turned off? What if the visitor interrupts the data transfer or simply leaves before the image is loaded? I had a graphical counter on my page "HTML Editors for Windows 3.x Reviewed," and it undercounted by about half in comparison to our web server statistics.

Server Access Statistics

Server logs, and the statistics extrapolated from them, are more accurate than web counters. But they are meant to assess the load on the server, and cannot tell you how many people saw your page, how many times your page was viewed, or which browsers your visitors are using and in what proportion.

Server logs cannot tell you which browsers visitors are using because, quite simply, many browsers lie about their identity. Most versions of Microsoft Internet Explorer, Opera, and MSN TV announce themselves as Netscape. Lynx users can configure their browser to call itself anything they want. Now there are even browsers pretending to be Internet Explorer pretending to be Netscape. And why this subterfuge? Because historically some sites have barred or discriminated against the "wrong" browsers.
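To make the point concrete, here is a minimal sketch of a naive log analyzer that trusts the leading product token of the User-Agent string. The sample strings are illustrative examples of the formats these browsers have sent, and naive_browser_guess is a hypothetical helper, not part of any real log tool.

```python
# Illustrative User-Agent strings (assumed examples, not taken from a real log).
SAMPLE_AGENTS = {
    "Netscape 4.x": "Mozilla/4.79 [en] (Windows NT 5.0; U)",
    "Internet Explorer 6": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Opera 7 posing as IE": (
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera 7.23 [en]"
    ),
}


def naive_browser_guess(user_agent: str) -> str:
    """Guess the browser from the leading product token only (hypothetical helper)."""
    product = user_agent.split("/", 1)[0]
    # "Mozilla" was Netscape's product token, so a naive tool reports Netscape.
    return "Netscape" if product == "Mozilla" else product


guesses = {name: naive_browser_guess(ua) for name, ua in SAMPLE_AGENTS.items()}

for name, guess in guesses.items():
    print(f"{name:22} -> counted as {guess}")
```

All three samples are counted as Netscape, although only one of them is. Looking deeper into the string helps only until a browser imitates that part as well.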

Server access logs cannot tell you how many people have seen your page, for several reasons.

  1. There can be multiple users per computer, even simultaneously.
  2. Your visitors may have access to more than one computer that's connected to the Net.
  3. Your visitors may have dynamic rather than static IP addresses. In that case their computers do not have fixed Internet addresses, but are assigned one from a pool as needed.
  4. Your visitors may be going through proxy servers--more on which momentarily.
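The four reasons above can be sketched with a small made-up example. The names and addresses below are invented for illustration (the addresses come from ranges reserved for documentation). A real log records only the address, never the person.

```python
# Hypothetical (person, client address) pairs; a real log holds only the address.
requests = [
    ("alice", "192.0.2.10"),     # shared family computer
    ("bob", "192.0.2.10"),       # same computer, different person
    ("carol", "198.51.100.7"),   # dial-up session, address drawn from a pool
    ("carol", "198.51.100.42"),  # same person, later session, new address
    ("dave", "203.0.113.1"),     # behind a corporate proxy
    ("erin", "203.0.113.1"),     # same proxy, different person
]

distinct_addresses = {address for _, address in requests}
distinct_people = {person for person, _ in requests}

print("addresses seen by the server:", len(distinct_addresses))
print("people who actually visited: ", len(distinct_people))
```

In this toy data the server sees four addresses for five people. With other data the address count could just as easily come out too high, which is why it cannot stand in for a head count.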

Because of caching, server statistics cannot tell you how many times your page was viewed. Most browsers keep copies of recently visited pages in a cache, either on disk or in memory. If someone fetches a page from their browser's cache, no contact is made with the server, and thus there is no access for the server logs to record.

Similarly, some corporations and Internet Service Providers use proxy servers. You could think of a proxy server as, in part, a regional cache shared by many individual users. It acts as an intermediary between groups of browsers and web page servers. Rather than going directly from the browser to the web page's host server, page requests go from browsers to the proxy server, which relays a request only if it does not already have the page cached. So a web page server's logs will record the first request for a particular page from a proxy server, but the web page server is not even contacted for subsequent requests of the same page. Again, no access to the web page server, and no additional entry in the server logs. Incidentally, this is another reason you cannot rely on server logs to tell you which browsers your visitors are using, or in what proportion.
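The effect of a shared cache on the logs can be shown with a toy simulation. OriginServer and CachingProxy are hypothetical classes written for this sketch. A real proxy also honors expiry times and may revalidate pages with the server, which this sketch ignores.

```python
class OriginServer:
    """Stands in for the web page server; logs every request it receives."""

    def __init__(self, pages):
        self.pages = pages
        self.access_log = []

    def get(self, path, client):
        self.access_log.append((client, path))
        return self.pages[path]


class CachingProxy:
    """Relays a request to the origin only when the page is not yet cached."""

    def __init__(self, origin, name="proxy.example.net"):
        self.origin = origin
        self.name = name
        self.cache = {}

    def get(self, path, client):
        if path not in self.cache:
            # The origin sees the proxy as the client, not the browser behind it.
            self.cache[path] = self.origin.get(path, client=self.name)
        return self.cache[path]


origin = OriginServer({"/index.html": "<html>hello</html>"})
proxy = CachingProxy(origin)

browsers = ["browser-a", "browser-b", "browser-c"]
for browser in browsers:
    proxy.get("/index.html", client=browser)

views = len(browsers)
logged = len(origin.access_log)
print(f"page views: {views}, entries in server log: {logged}")
```

Three people viewed the page, but the server log holds a single entry, and that entry names the proxy rather than any of the browsers.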

So what good are server logs to web page owners? Though they are not a reliable gauge of popularity, server statistics can provide several kinds of useful information. You can tell from which domains your visitors are coming, find out some of the pages that link to yours, learn when the search engines have indexed your page, and spot errors. In some cases you can determine which words were used as search terms to locate your site. Though the caveats about caching apply, you can sometimes track a visitor's path through your pages.
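As a rough sketch of how such information can be pulled from a log, the following assumes the server writes the widely used "combined" log format. The two sample lines, the host names, and the search parameter name "q" are made up for illustration. Real search engines use various parameter names, and many requests carry no referrer at all.

```python
import re
from urllib.parse import parse_qs, urlparse

# host ident user [time] "request" status size "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical log lines, invented for this example.
SAMPLE_LINES = [
    'dialup-12.example.net - - [04/Jan/2004:10:15:32 -0600] '
    '"GET /~user/page.html HTTP/1.0" 200 5120 '
    '"http://search.example.com/find?q=web+counters" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    'crawler.example.org - - [04/Jan/2004:11:02:10 -0600] '
    '"GET /~user/missing.html HTTP/1.0" 404 210 "-" "ExampleBot/1.0"',
]


def summarize(lines):
    """Collect referring pages, error responses, and search terms."""
    referrers, errors, search_terms = [], [], []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        referrer = match["referrer"]
        if referrer != "-":
            referrers.append(referrer)
            query = parse_qs(urlparse(referrer).query)
            search_terms.extend(query.get("q", []))  # "q" is an assumption
        if match["status"].startswith(("4", "5")):
            errors.append((match["status"], match["request"]))
    return referrers, errors, search_terms


referrers, errors, search_terms = summarize(SAMPLE_LINES)
print("linked from: ", referrers)
print("errors:      ", errors)
print("search terms:", search_terms)
```

The first line yields a referring page and the search phrase that led to it, and the second exposes a broken link.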

Comments to: eknuth@unix.csbsju.edu

Web Counters and Server Logs / Revised 4 January 2004 / © Copyright 1998-2004, Elizabeth T. Knuth / URL: http://www.users.csbsju.edu/~eknuth/obcomp/hitcount.html