Web Server Log Analyzer

ABOUT THE PROJECT

Vision

The vision of the Web Server Log Analyzer with on-demand reporting is to record network traffic details in a log file and to derive from that file indicators about who visits a web server, when, and how.

Scope

The scope of the Real-time Web server Log Analyzer with on-demand Reporting is as follows:

•  Daily Traffic - Displays downloads per day.

•  Hourly Traffic - Displays downloads per hour.

•  Referrer - Displays URLs that were active before files were downloaded.

•  Browser - Displays the referring URL when you click on a row in the Referrer report.

•  DLs (Downloads) - Displays the number of times that files have been downloaded.

•  UAs (User Agents) - Displays the User Agents that are accessing the website. A User Agent is the name of the program that is requesting pages on a website. Usually User Agents are web browsers.

•  Accesses - Displays the number of accesses and bytes downloaded by each user.

•  Searches - Displays the search queries that users have submitted to search engines.

•  Search Words - Displays words used in searches.

•  Visitors - Displays the number of unique visitors to the website.

•  Countries - Displays visitors' countries and the number of requests and bytes downloaded.

•  Status Codes - Displays status codes for HTTP requests.

•  Errors - Displays error codes for HTTP requests.
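The Status Codes and Errors reports can be illustrated with a minimal Python sketch. It assumes the status codes have already been extracted from the log as integers; the function names and the rule "errors are codes of 400 and above" are illustrative assumptions, not the project's actual implementation.

```python
from collections import Counter

def status_report(status_codes):
    """Count requests per HTTP status code."""
    return Counter(status_codes)

def error_report(status_codes):
    """Keep only client (4xx) and server (5xx) error codes."""
    return Counter(code for code in status_codes if code >= 400)

# Hypothetical status codes, for illustration only.
codes = [200, 200, 404, 500, 304, 404]
statuses = status_report(codes)
errors = error_report(codes)
```

Calling `statuses.most_common()` or `errors.most_common()` lists the codes ordered by how often they occur, which is the natural row order for such a report.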

Definitions, Acronyms, and Abbreviations

URL – Uniform Resource Locator

Browser – A software application that enables a user to display and interact with text, images, videos, music and other information typically located on a web page at a website on the World Wide Web or a local area network.

Web server – A computer program that accepts HTTP requests from clients (typically web browsers) and serves them HTTP responses, usually together with content such as HTML documents and linked objects (images, etc.).

Log – Web servers can usually record detailed information about client requests and server responses in log files. This allows the webmaster to collect statistics by running log analyzers on those files.

Overview

There are two types of log analyzers:

Post-parsing reporting – The log files are parsed and all the reports are generated afterwards, usually on a scheduled basis. This can put great strain on a computer because parsing and reporting are done in one go.

Real-time, on-demand reporting – The log files are parsed into a database in the background, and a report is generated only when it is requested. This type of analyzer is usually better suited to many users because it places less strain on the server.

The term web server can mean one of two things:

A computer program that accepts HTTP requests from clients (typically web browsers) and serves them HTTP responses, usually together with content such as HTML documents and linked objects (images, etc.).

A computer that runs a computer program as described above.

Common features

Although web server programs differ in detail, they all share some basic common features.

HTTP: every web server program operates by accepting HTTP requests from a client and returning an HTTP response. The response usually consists of an HTML document, but it can also be a raw file, an image, or some other type of document (identified by its MIME type). If an error is found in the client request or occurs while serving it, the web server sends an error response, which may include custom HTML or text messages that explain the problem to end users.

Logging: web servers can usually record detailed information about client requests and server responses in log files. This allows the webmaster to collect statistics by running log analyzers on those files.

In practice, many web servers also implement the following features:

Authentication: an optional authorization request (user name and password) before allowing access to some or all resources.

Handling of static content (files stored in the server's file system) and dynamic content, by supporting one or more related interfaces (SSI, CGI, SCGI, FastCGI, JSP, PHP, ASP, ASP.NET, server APIs such as NSAPI and ISAPI, etc.).

HTTPS support (through SSL or TLS) to allow secure (encrypted) connections to the server on the standard port 443 instead of the usual port 80.

Content compression (e.g. gzip encoding) to reduce the size of responses and so lower bandwidth usage.

Virtual hosting to serve many websites from one IP address.

Large file support to serve files larger than 2 GB on 32-bit operating systems.

Bandwidth throttling to limit the speed of responses so that the network is not saturated and more clients can be served.

The key performance parameters (measured under a varying load of clients and requests per client) are:

•  Number of requests per second (depending on the type of request, etc.);

•  Latency, that is, the response time in milliseconds for each new connection or request;

•  Throughput in bytes per second (depending on file size, cached or uncached content, available network bandwidth, etc.).
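Two of these parameters, requests per second and throughput, can be estimated from log data. The Python sketch below assumes each log entry has already been reduced to a (timestamp, bytes sent) pair; the function name `load_metrics` and the sample values are hypothetical. Latency is left out because the default access log fields do not record it (Apache can log the time taken per request with the `%D` format directive, if it is configured to do so).

```python
from datetime import datetime, timedelta

def load_metrics(entries):
    """entries: list of (timestamp, bytes_sent) tuples.

    Returns (requests per second, bytes per second), averaged over the
    time window between the first and the last entry.
    """
    if not entries:
        return 0.0, 0.0
    times = [when for when, _ in entries]
    # Use at least one second so a single-instant window does not divide by zero.
    window = max((max(times) - min(times)).total_seconds(), 1.0)
    total_bytes = sum(size for _, size in entries)
    return len(entries) / window, total_bytes / window

# Hypothetical entries spanning two seconds, for illustration only.
base = datetime(2024, 1, 1, 12, 0, 0)
sample = [
    (base, 500),
    (base, 1500),
    (base + timedelta(seconds=1), 1000),
    (base + timedelta(seconds=2), 1000),
]
requests_per_second, bytes_per_second = load_metrics(sample)
```

Averaging over the whole window is the simplest choice; a per-second or per-minute histogram would show peaks that an average hides.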

Existing System

In the current system, the log files are parsed and all the reports are generated afterwards, usually on a scheduled basis. This can put great strain on a computer because parsing and reporting are done in one go.

Limitations in Existing System

The limitations of the existing system:

•  Doesn’t find out the number of requests per second

•  Doesn’t find the causes of overload

•  Doesn’t provide security

Proposed System

The proposed system is a Web Server Log Analyzer that parses a log file from a web server (such as Apache) and, based on the values contained in the log file, derives indicators about who visits the web server, when, and how.
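As an illustration of the parsing step, the following Python sketch reads one line in Apache's Combined Log Format and turns it into named fields. It is only a sketch under stated assumptions: the project itself targets C#, the log is assumed to use the Combined Log Format, and the names `LOG_PATTERN` and `parse_line` and the sample line are hypothetical.

```python
import re
from datetime import datetime

# Apache Combined Log Format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps look like 10/Oct/2000:13:55:36 -0700
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Apache writes "-" when no response body bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Hypothetical sample line, for illustration only.
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
entry = parse_line(sample)
```

The host and time fields answer "who" and "when", while the request, referrer and user-agent fields describe "how" the server was visited. Lines that do not match the pattern return `None`, so a caller can count or skip malformed entries.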

Problem Definition

This project aims to develop a web server log analyzer that can analyze web server access information. It records network traffic details in a log file and derives from that file indicators about who visits the web server, when, and how.

Advantages over Existing System

•  Daily Traffic - Displays downloads per day.

•  Hourly Traffic - Displays downloads per hour.

•  Referrer - Displays URLs that were active before files were downloaded.

•  Browser - Displays the referring URL when you click on a row in the Referrer report.

•  Downloads - Displays the number of times that files have been downloaded.

•  User Agents - Displays the User Agents that are accessing the website. A User Agent is the name of the program that is requesting pages on a website. Usually User Agents are web browsers.

•  Accesses - Displays the number of accesses and bytes downloaded by each user.
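The Daily Traffic, Hourly Traffic and Accesses reports listed above amount to simple aggregations over parsed log entries. The Python sketch below assumes each entry is a (host, timestamp, bytes sent) tuple and treats the client host as the "user"; both are assumptions made for illustration, as are the function name and the sample data.

```python
from collections import Counter, defaultdict
from datetime import datetime

def traffic_reports(entries):
    """entries: iterable of (host, timestamp, bytes_sent) tuples.

    Returns (requests per day, requests per hour of day,
    {host: [request count, total bytes]}).
    """
    daily = Counter()
    hourly = Counter()
    accesses = defaultdict(lambda: [0, 0])  # host -> [requests, bytes]
    for host, when, size in entries:
        daily[when.date().isoformat()] += 1
        hourly[when.hour] += 1
        accesses[host][0] += 1
        accesses[host][1] += size
    return daily, hourly, dict(accesses)

# Hypothetical entries, for illustration only.
sample = [
    ("192.0.2.1", datetime(2024, 1, 1, 9, 15), 500),
    ("192.0.2.2", datetime(2024, 1, 1, 9, 45), 1500),
    ("192.0.2.1", datetime(2024, 1, 2, 14, 0), 1000),
]
daily, hourly, accesses = traffic_reports(sample)
```

All three reports are built in a single pass over the entries, so the cost grows linearly with the size of the log.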

MODULE DESCRIPTION

Real-time, on-demand Reporting

The log files are parsed into a database in the background, and a report is generated only when it is requested. This type of analyzer is usually better suited to many users because it places less strain on the server.
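The split between background parsing and on-demand reporting can be sketched as follows. This Python example uses an in-memory SQLite database as a stand-in for the project's Access database; the table layout, function names and sample rows are all hypothetical.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open the database and create the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "host TEXT, day TEXT, status INTEGER, size INTEGER)"
    )
    return conn

def store_entries(conn, entries):
    """Background step: insert parsed log entries as they arrive."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", entries)

def daily_traffic(conn):
    """On-demand step: the report is computed only when it is requested."""
    rows = conn.execute(
        "SELECT day, COUNT(*), SUM(size) FROM hits GROUP BY day ORDER BY day"
    )
    return rows.fetchall()

# Hypothetical parsed entries, for illustration only.
conn = open_store()
store_entries(conn, [
    ("192.0.2.1", "2024-01-01", 200, 512),
    ("192.0.2.2", "2024-01-01", 404, 0),
    ("192.0.2.1", "2024-01-02", 200, 2048),
])
report = daily_traffic(conn)
```

Because the rows are already in the database, each report is a single query and the log files do not have to be parsed again for every request.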

Logging:

Web servers can usually record detailed information about client requests and server responses in log files. This allows the webmaster to collect statistics by running log analyzers on those files.

Functions

The functions involved in the development of Real-time Web server Log Analyzer with on-demand Reporting are:

•  Daily Traffic Report

•  Countries Report

•  Accesses Report

•  Searches Report

•  User Agents Report
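As one example of these functions, a Searches report can be derived from the referrer field of each request. The Python sketch below assumes the search phrase is carried in a `q` query parameter, which several search engines use but which is not universal; the function names and sample referrers are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

def search_query(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None."""
    values = parse_qs(urlparse(referrer).query).get(param)
    return values[0] if values else None

def search_reports(referrers):
    """Count whole search phrases and the individual words used in them."""
    searches = Counter()
    words = Counter()
    for referrer in referrers:
        phrase = search_query(referrer)
        if phrase:
            phrase = phrase.lower()
            searches[phrase] += 1
            words.update(phrase.split())
    return searches, words

# Hypothetical referrer values, for illustration only.
referrers = [
    "http://search.example.com/search?q=log+analyzer",
    "http://search.example.com/search?q=Log+Analyzer",
    "http://search.example.com/search?q=apache+log",
    "http://www.example.org/page.html",
    "-",
]
searches, words = search_reports(referrers)
```

Referrers without the parameter, including the "-" placeholder that servers log when no referrer is sent, are ignored. Supporting another search engine would mean passing a different `param` value.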

Software Requirements

Operating System       :  Windows XP/2003
Programming Language   :  C#.NET
Framework              :  ASP.NET
Workbench              :  Visual Studio
Database               :  Microsoft Access

Hardware Requirements

Processor              :  Pentium IV
Hard Disk              :  40 GB
RAM                    :  256 MB
