WEB STATISTICS DESCRIPTIONS
--------------------------------------------------------------------------------

The yearly (index) report shows statistics for a 12-month period and
links to each month. The monthly report has detailed statistics for that
month, with additional links to any URLs and referrers found. The various
totals shown are explained below.

Hits

Any request made to the server that is logged is considered a 'hit'. The
requests can be for anything: HTML pages, graphic images, audio files and
CGI scripts. Each valid line in the server log is counted as a hit. This
number represents the total number of requests that were made to the
server during the specified report period.

Files

Some requests made to the server require that the server send something
back to the requesting client, such as an HTML page or graphic image.
When this happens, it is considered a 'file' and the files total is
incremented. The relationship between 'hits' and 'files' can be thought
of as 'incoming requests' and 'outgoing responses'.

Pages

Pages are any HTML documents, or anything that generates an HTML
document. This does not include graphic images, audio clips, etc. For
example, URLs with the extensions 'htm', 'html', 'php', 'cdip', 'csh*'
or 'cgi' are all considered pages.

Sites

Each request made to the server comes from a unique 'site', which can be
referenced by a name or, ultimately, an IP address. Unique sites are
broken down by month and day. This DOES NOT mean the number of unique
individual users (real people) that visited, which is impossible to
determine using just logs and the HTTP protocol (however, this number
might be about as close as you will get).

Visits

Whenever a request for a page (see above) is made to the server from a
given IP address (site), the amount of time since a previous request by
the address is calculated (if any).
If the time difference is greater than 30 minutes, or if the address has
made no previous visit during the current month, the request is
considered a 'new visit', and this total is incremented both for the
site and for the IP address.

Note that you cannot add up the daily visit totals and compare them to
the monthly total, because they cover different reporting periods. For
example, if someone visits the site at 11:45pm and stays until 12:15am,
the monthly total would show one visit, while the daily totals will show
two (one for each day).

Note also that our site often shows more unique sites than visits in the
daily totals. Visits are only triggered when a valid request is found for
a page (see Pages above for a definition of a page). Since we host some
non-page-type URLs that are linked to from outside sites (e.g.
/models/socal_now.gif), this is a common occurrence for us.

Also note that the monthly total of visits is greater than the monthly
total of unique sites. This is because visits approximately add up across
days, whereas the number of unique sites for the complete month will be
much less than the sum of the unique sites for each day.

KBytes

The KBytes (kilobytes) value shows the amount of data, in KB, that was
sent out by the server during the specified reporting period. Note: a
kilobyte is 1024 bytes, not 1000.

Top Entry and Exit Pages

The Top Entry and Exit tables give a rough estimate of which URLs are
used to enter the site, and which pages are viewed last. Because of
limitations in the HTTP protocol, log rotations, etc., these figures
should be considered a good "rough guess" of the actual numbers; they
will, however, give a good indication of the overall trend in where users
come into, and exit, the site.

Notes on Visits/Entry/Exit Figures
----------------------------------

The majority of data analyzed and reported on by The Webalizer is as
accurate and correct as possible based on the input log file.
However, due to the limitations of the HTTP protocol, the use of
firewalls, proxy servers, multi-user systems, the rotation of log files,
and a myriad of other conditions, some of these numbers cannot be
calculated with absolute accuracy. In particular, Visits, Entry Pages and
Exit Pages are subject to random errors due to the above and other
conditions. The reason for this is twofold: 1) log files are finite in
size and time interval, and 2) there is no way to tell individual users
apart given only an IP address.

Because log files are finite, they have a beginning and an ending, which
can be represented as a fixed time period. There is no way of knowing
what happened before this time period, nor is it possible to predict
future events based on it.

Also, because it is impossible to tell individual users apart, multiple
users that have the same IP address all appear to be a single user, and
are treated as such. This is most common where corporate users sit behind
a proxy/firewall to the outside world, and all requests appear to come
from the same location (the address of the proxy/firewall itself).
Dynamic IP assignment (used with dial-up internet accounts) also presents
a problem, since the same user will appear to come from multiple places.

For the most part, the numbers shown for visits, entry and exit pages are
pretty good 'guesses', even though they may not be 100% accurate. They do
provide a good indication of overall trends, and should not be far enough
from the real numbers to matter much. You should probably consider them
as the 'minimum' amount possible, since the actual (real) values should
always be equal to or greater than the reported ones.
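
The visit rule described under Visits above can be illustrated with a
short sketch. This is not The Webalizer's actual code; it is a minimal
Python example that assumes the log has already been reduced to
time-ordered (IP address, timestamp) pairs for page requests only. The
function name count_visits, the per-day bookkeeping, and the sample
address and dates are all hypothetical, chosen to reproduce the
11:45pm to 12:15am example.

```python
from datetime import datetime, timedelta

# 30-minute timeout, as stated in the Visits description.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(page_requests):
    """Count visits per month and per day.

    page_requests: iterable of (ip, timestamp) pairs, sorted by
    timestamp, containing page requests only (assumed pre-filtered).
    Returns (monthly_visits, daily_visits) where daily_visits maps a
    date to that day's visit count.
    """
    last_in_month = {}  # ip -> time of its previous page request
    last_in_day = {}    # (ip, date) -> previous page request that day
    monthly_visits = 0
    daily_visits = {}

    for ip, ts in page_requests:
        # Monthly period: new visit if the address is unseen this
        # month or has been idle for longer than the timeout.
        prev = last_in_month.get(ip)
        if prev is None or ts - prev > VISIT_TIMEOUT:
            monthly_visits += 1
        last_in_month[ip] = ts

        # Daily period (assumption): the same rule, but each day
        # starts with no knowledge of earlier requests.
        day = ts.date()
        prev_today = last_in_day.get((ip, day))
        if prev_today is None or ts - prev_today > VISIT_TIMEOUT:
            daily_visits[day] = daily_visits.get(day, 0) + 1
        last_in_day[(ip, day)] = ts

    return monthly_visits, daily_visits


# One visitor browsing from 11:45pm until 12:15am (hypothetical data).
requests = [
    ("192.0.2.1", datetime(2024, 3, 10, 23, 45)),
    ("192.0.2.1", datetime(2024, 3, 10, 23, 58)),
    ("192.0.2.1", datetime(2024, 3, 11, 0, 10)),
    ("192.0.2.1", datetime(2024, 3, 11, 0, 15)),
]
monthly, daily = count_visits(requests)
# monthly is 1, while each of the two days records 1 visit.
```

Because every reporting period starts with no memory of earlier
requests, the daily totals here add up to two while the monthly total is
one, which is why the two figures cannot be compared directly.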