Monday, June 15, 2009

Web server







The Basic Process
Let's say that you are sitting at your computer, surfing the Web, and you get a call from a friend who says, "I just read a great article! Type in this URL and check it out. It's at http://www.howstuffworks.com/web-server.htm." So you type that URL into your browser and press return. And magically, no matter where in the world that URL lives, the page pops up on your screen.
--------------------------------------------------------------------------------------------
Your browser formed a connection to a Web server, requested a page and received it.
Behind the Scenes
If you want to get into a bit more detail on the process of getting a Web page onto your computer screen, here are the basic steps that occurred behind the scenes:
The browser broke the URL into three parts:
The protocol ("http")
The server name ("www.howstuffworks.com")
The file name ("web-server.htm")
The browser communicated with a name server to translate the server name "www.howstuffworks.com" into an IP address, which it used to connect to the server machine.
The browser then formed a connection to the server at that IP address on port 80. (We'll discuss ports later in this article.)
Following the HTTP protocol, the browser sent a GET request to the server, asking for the file "http://www.howstuffworks.com/web-server.htm." (Note that cookies may be sent from browser to server with the GET request -- see How Internet Cookies Work for details.)
The server then sent the HTML text for the Web page to the browser. (Cookies may also be sent from server to browser in the header for the page.)
The browser read the HTML tags and formatted the page onto your screen.
If you've never explored this process before, that's a lot of new vocabulary. To understand this whole process in detail, you need to learn about IP addresses, ports, protocols... The following sections will lead you through a complete explanation.
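To make the first step above concrete, here is a minimal sketch of how a URL can be broken into the three parts the article names. It uses Python's standard urllib.parse module. The helper name split_url is made up for this illustration, and real browsers do considerably more work than this.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (protocol, server name, file name) for a simple URL.

    split_url is a hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # parts.path looks like "/web-server.htm"; drop the leading slash
    # to match the "file name" wording used in the article.
    return parts.scheme, parts.hostname, parts.path.lstrip("/")


protocol, server, filename = split_url(
    "http://www.howstuffworks.com/web-server.htm"
)
print(protocol)   # the protocol part
print(server)     # the server name part
print(filename)   # the file name part
```

Note that the server name is just the host portion, without the "http://" prefix or any slashes.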
------------------------------------------------------------------------------------------------
Name Servers
The whois Command
On a UNIX machine, you can use the whois command to look up information about a domain name. You can do the same thing using the whois form at VeriSign. If you type in a domain name, like "howstuffworks.com," it will return to you the registration information for that domain, including its IP address.
A set of servers called domain name servers (DNS) maps the human-readable names to the IP addresses. These servers are simple databases that map names to IP addresses, and they are distributed all over the Internet. Most individual companies, ISPs and universities maintain small name servers to map host names to IP addresses. There are also central name servers that use data supplied by VeriSign to map domain names to IP addresses.
If you type the URL "http://www.howstuffworks.com/web-server.htm" into your browser, your browser extracts the name "www.howstuffworks.com," passes it to a domain name server, and the domain name server returns the correct IP address for www.howstuffworks.com. A number of name servers may be involved to get the right IP address. For example, in the case of www.howstuffworks.com, the name server for the "com" top-level domain will know the IP address for the name server that knows host names, and a separate query to that name server, operated by the HowStuffWorks ISP, may deliver the actual IP address for the HowStuffWorks server machine.
On a UNIX machine, you can access the same service using the nslookup command. Simply type a name like "www.howstuffworks.com" into the command line, and the command will query the name servers and deliver the corresponding IP address to you.
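A program can ask for the same translation that nslookup performs. The sketch below uses socket.gethostbyname from Python's standard library, which hands the name to the system's resolver and returns an IPv4 address as a string. The function name resolve is an assumption made for this example. The address returned for any real host depends on when and where you run it, so none is shown here.

```python
import socket


def resolve(server_name):
    """Ask the system resolver for the IPv4 address of server_name.

    resolve is a hypothetical helper name. A string that is already an
    IPv4 address is returned unchanged, without consulting a name server.
    """
    return socket.gethostbyname(server_name)


# A name lookup needs a working network connection, for example:
#     resolve("www.howstuffworks.com")
# An address that is already numeric comes straight back:
print(resolve("127.0.0.1"))
```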
So here it is: The Internet is made up of millions of machines, each with a unique IP address. Many of these machines are server machines, meaning that they provide services to other machines on the Internet. You have heard of many of these servers: e-mail servers, Web servers, FTP servers, Gopher servers and Telnet servers, to name a few. All of these are provided by server machines.

-----------------------------------------------------------------------------------------------
Ports
Any server machine makes its services available to the Internet using numbered ports, one for each service that is available on the server. For example, if a server machine is running a Web server and an FTP server, the Web server would typically be available on port 80, and the FTP server would be available on port 21. Clients connect to a service at a specific IP address and on a specific port.
Each of the most well-known services is available at a well-known port number. Here are some common port numbers:
echo 7
daytime 13
qotd 17 (Quote of the Day)
ftp 21
telnet 23
smtp 25 (Simple Mail Transfer, meaning e-mail)
time 37
nicname 43 (Who Is)
nameserver 53
gopher 70
finger 79
WWW 80
If the server machine accepts connections on a port from the outside world, and if a firewall is not protecting the port, you can connect to the port from anywhere on the Internet and use the service. Note that there is nothing that forces, for example, a Web server to be on port 80. If you were to set up your own machine and load Web server software on it, you could put the Web server on port 918, or any other unused port, if you wanted to. Then, if your machine were known as xxx.yyy.com, someone on the Internet could connect to your server with the URL http://xxx.yyy.com:918/. The ":918" explicitly specifies the port number, and would have to be included for someone to reach your server. When no port is specified, the browser simply assumes that the server is using the well-known port 80.
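The port rule just described can be sketched in a few lines: use the port written in the URL if there is one, and fall back to 80 otherwise. The names DEFAULT_HTTP_PORT and port_for are invented for this illustration. The parsing itself is done by Python's standard urllib.parse module.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # the well-known port for WWW in the list above


def port_for(url):
    """Return the port a browser would connect to for an http URL.

    port_for is a hypothetical helper for illustration only.
    """
    explicit = urlsplit(url).port  # None when the URL has no ":port" part
    return explicit if explicit is not None else DEFAULT_HTTP_PORT


print(port_for("http://xxx.yyy.com:918/"))
print(port_for("http://www.howstuffworks.com/web-server.htm"))
```

The first call picks up the explicit ":918"; the second has no port in the URL, so it falls back to the default.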
------------------------------------------------------------------------------------------------

Most protocols are more involved than the daytime service and are specified in Request for Comments (RFC) documents that are publicly available (see http://www.howstuffworks.com/framed.htm?parent=web-server.htm&url=http://sunsite.auc.dk/RFC/ for a nice archive of all RFCs). Every Web server on the Internet conforms to the HTTP protocol, summarized nicely in The Original HTTP as defined in 1991. The most basic form of the protocol understood by an HTTP server involves just one command: GET. If you connect to a server that understands the HTTP protocol and tell it to "GET filename," the server will respond by sending you the contents of the named file and then disconnecting. Here's a typical session:
% telnet www.howstuffworks.com 80
Trying 216.27.61.137...
Connected to howstuffworks.com.
Escape character is '^]'.
GET http://www.howstuffworks.com/
...
Connection closed by foreign host.
In the original HTTP protocol, all you would have sent was the actual filename, such as "/" or "/web-server.htm." The protocol was later modified to handle the sending of the complete URL. This has allowed companies that host virtual domains, where many domains live on a single machine, to use one IP address for all of the domains they host. It turns out that hundreds of domains are hosted on 209.116.69.66 -- the HowStuffWorks IP address.
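The same kind of session can be driven from a program instead of telnet. The sketch below is only an illustration under stated assumptions: it starts a tiny stand-in Web server on the local machine (so it runs without touching the Internet), opens a plain TCP connection to it, sends a GET request by hand, and reads until the server disconnects. HelloHandler, http_get and demo are invented names, and the request uses the HTTP/1.0 form with a Host header rather than the bare 1991 form.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: answers every GET with a tiny page."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def http_get(host, port, path="/"):
    """Send a hand-written GET over a TCP socket and return the raw reply."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server has disconnected
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run one request against the local stand-in server."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return http_get("127.0.0.1", server.server_port, "/web-server.htm")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the server's full reply as bytes: a status line, the headers (where cookies would travel), a blank line, and then the HTML text. Pointing http_get at a real host on port 80 would work the same way, given a network connection.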

-----------------------------------------------------------------------------------------
Putting It All Together
Now you know a tremendous amount about the Internet. You know that when you type a URL into a browser, the following steps occur:
The browser breaks the URL into three parts:
The protocol ("http")
The server name ("www.howstuffworks.com")
The file name ("web-server.htm")
The browser communicates with a name server to translate the server name, "www.howstuffworks.com," into an IP address, which it uses to connect to that server machine.
The browser then forms a connection to the Web server at that IP address on port 80.
Following the HTTP protocol, the browser sends a GET request to the server, asking for the file "http://www.howstuffworks.com/web-server.htm." (Note that cookies may be sent from browser to server with the GET request -- see How Internet Cookies Work for details.)
The server sends the HTML text for the Web page to the browser. (Cookies may also be sent from server to browser in the header for the page.)
The browser reads the HTML tags and formats the page onto your screen.
