Thursday, January 20, 2005

Lazy Man's Ethernet Tutorials 1

This article and the next one are written for my friend Esther Effiom, who is trying to turn what she’s learnt in school into real-life practice. I had similar issues when I first started reading about, and then practicing, networking. I found that the best approach is to have a working mental model that is more or less accurate; once you have that model, you can easily add more information to it. These articles are written with that goal in mind: not to be precise on every little technical detail, but to give a mid-level overview with enough low-level information to assemble into a mental picture. They will stay low-level enough to cover the basics, but high-level enough to assemble easily.


In the course of this tutorial series, we’ll attempt to answer some basic questions about Ethernet.

1. What is Ethernet?
2. What is a packet and how does it carry information?
3. How does a system send information to another system on its subnet?
4. How does a system send information to another system on another subnet?
5. Network borders - Proxies, Packet Filters, Gateways and Routers.


1. What is Ethernet?

Ethernet, in a nutshell, is a broadcast mechanism that allows networked systems to talk to one another. The keyword here is broadcast. Of course Ethernet is much more than this if you're trying to get extremely technical, but for the purpose of this tutorial, this definition will do all right.

In Ethernet networks, all nodes have:
a. A network interface card, which has a unique address called a MAC address.
b. Network interface card driver software, responsible for interpreting all the data that comes in on the card and passing it up to the operating system's networking stack.
c. An IP address, which identifies the system at a higher level of communication than the MAC address can. This may not be the original reason, but I have observed that IP addresses, by their very design, allow addresses to be grouped into networks, subnetworks, etc. This (in fact, any kind of grouping) cannot be done with MAC addresses, so for any serious communication a higher level of addressing is needed. But as we will see, when push comes to shove, we ALWAYS need to know the MAC address of the systems involved.
d. A direct or indirect connection to all the other systems, through either a HUB or a SWITCH (a Layer 2 switch, if you've heard the term before).

Keeping this in mind, let us attempt to answer the next question.


2. What is a packet and how does it carry information?

First and foremost, the term packet in this document is generic, and refers to all incarnations of data at the various layers of the network stack. Usually, data is given a different name depending on which layer of the network stack it is currently at. Hence it can be called a frame, a packet, a datagram, etc. In this tutorial, I’ll just call them all packets, to make things easier to understand (well, actually, to make it easier for me not to make a mistake :) ).

The packet is interesting because it has to be self-sufficient on the network. Self-sufficient in that any node receiving it should know where it came from and where it is going. From what we've seen, every Ethernet host has at least two addresses that are unique to it: the IP address and the MAC address. This means that the packet will carry at least 4 addresses: the IP addresses of both source and destination, and the MAC addresses of both source and destination. For the purposes of illustration:

[src-mac-addr]
[dst-mac-addr]
[other-info ]
[src-ip-addr ]
[dst-ip-addr ]
[other-info ]
[Data Carried]



Well... forgive the diagram :)

What is going on here, in brief, is that before the packet leaves a network node, the IP addresses of its source and destination and the MAC addresses of its source and destination are prepended to it in the HEADER information area. This means that the header area must be built (when sending) before the Data Carried goes out over the network, and interpreted/parsed (when receiving) before the Data Carried can be retrieved. We'll see how this works below.
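To make the build/parse idea concrete, here is a minimal Python sketch. It follows the simplified layout in my diagram above, not the real wire format (a real Ethernet frame puts the destination MAC first and carries several more fields). The function names, the fixed 20-byte header and the example addresses are all assumptions made up for illustration.

```python
import socket
import struct

# Simplified header following the tutorial's diagram, NOT the real wire format:
# source MAC (6 bytes), destination MAC (6 bytes), source IP (4), destination IP (4).
HEADER = struct.Struct("!6s6s4s4s")


def mac_to_bytes(mac):
    """Turn '11:22:33:44:55:66' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw):
    """Turn 6 raw bytes back into '11:22:33:44:55:66'."""
    return ":".join(f"{b:02x}" for b in raw)


def build_packet(src_mac, dst_mac, src_ip, dst_ip, data):
    """Sending side: build the header and prepend it to the data."""
    header = HEADER.pack(
        mac_to_bytes(src_mac),
        mac_to_bytes(dst_mac),
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )
    return header + data


def parse_packet(packet):
    """Receiving side: interpret the header, then hand back the data."""
    src_mac, dst_mac, src_ip, dst_ip = HEADER.unpack(packet[:HEADER.size])
    return {
        "src_mac": bytes_to_mac(src_mac),
        "dst_mac": bytes_to_mac(dst_mac),
        "src_ip": socket.inet_ntoa(src_ip),
        "dst_ip": socket.inet_ntoa(dst_ip),
        "data": packet[HEADER.size:],
    }
```

Calling build_packet with two example MAC addresses, two example IP addresses and some bytes, then passing the result to parse_packet, should give back the same four addresses and the same data.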

Now onto the next question.



3. How does a system send information to another system on its subnet?

Firstly, what is a subnet? Well, if you have any experience with IP addresses, you probably know that they take the form X.X.X.X. But that's not all there is to them. These IP addresses have a kind of visual pattern, for instance, 192.168.1.1, 192.168.1.2, 192.168.3.4, 192.168.3.7, 10.0.1.1, 10.0.2.4, etc. By visual inspection, you can pick out that 192.168.1.1 and 192.168.1.2 differ only at the .1 and .2, while 10.0.1.1 and 10.0.2.4 are the same up to 10.0. At this point, without delving into the mechanisms of how this is worked out, believe me when I say that there are two parts to every IP address. These two parts are:

a. The part that looks alike for various machines.
b. The part that actually is unique for various machines.

Actually, this is not purely visual, as these comparisons are really done at the binary level, but forget about that for right now. Just think of it simply: (a) is the network address and (b) is the host address, and when you combine (a) and (b), you have a complete address for a host on a network. For the curious, there is a number called the subnet mask, which mathematically separates these two parts. But for now, you can forget about the details.
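If you are curious about what "mathematically separates" means, here is a small Python sketch using the standard ipaddress module. The function name split_address and the masks used in the examples are my own assumptions; the text above never says which masks these addresses use.

```python
import ipaddress


def split_address(ip, mask):
    """Split an IPv4 address into (network part, host part) using a subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Bits where the mask is 1 belong to the network part (a).
    network_int = ip_int & mask_int
    # Bits where the mask is 0 belong to the host part (b).
    host_int = ip_int & ~mask_int & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(network_int)), host_int
```

With an assumed mask of 255.255.255.0, split_address("192.168.1.2", "255.255.255.0") gives the network part 192.168.1.0 and the host part 2. With an assumed mask of 255.255.0.0, 10.0.2.4 splits into 10.0.0.0 and 516 (that is, 2 * 256 + 4).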

The main point here is that machines that have the same network part in their addresses are said to be on the same subnet, and these are the only machines that can talk to one another without help.

So any time a machine wants to talk to another machine, it checks whether they are on the same subnet. If they are, communication can go on; if not, some help will be needed.
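That check can be sketched in a few lines of Python. This is only an illustration: the function name same_subnet is made up, and the masks in the examples are assumed, since the mask is something each machine is configured with.

```python
import ipaddress


def same_subnet(ip_a, ip_b, mask):
    """Return True if both addresses have the same network part under this mask."""
    # strict=False lets us pass a host address and get back the network it sits in.
    net_a = ipaddress.ip_network(f"{ip_a}/{mask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{mask}", strict=False)
    return net_a == net_b
```

Assuming a mask of 255.255.255.0, 192.168.1.1 and 192.168.1.3 come out on the same subnet, while 192.168.1.1 and 192.168.3.4 do not, so the second pair would need some help to talk.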

Let's paint a little picture.


[A-----B-----C]<----D----->[E-----F-----G]

A = 192.168.1.1 (11:22:33:44:55:66)
B = 192.168.1.2 (22:33:44:55:66:77)
C = 192.168.1.3 (33:44:55:66:77:88)

(FYI: the numbers with dots are the IP addresses, while the numbers with colons are the MAC addresses. To get this info, run 'ifconfig' on UNIX or 'ipconfig /all' on Windows.)



Forgive this diagram ( :P ), but imagine the scenario with me. A, B and C are on a single network. From point (d) in section 1 above, this means that they are connected via a single hub or switch, or a series of them.


For our explanation, assume A wants to send data to C. The first thing A does (at least in our context) is to verify that the two systems are on the same subnet, then to build the HEADER and prepend it to the actual data to send.

Before sending the packet, A knows all its own addresses, but it probably knows nothing about C except its IP address. It has to relate that IP address to a MAC address. To do this, A first checks its ARP table, which contains a mapping of known IP address to MAC address pairs. If it doesn't find an entry for 192.168.1.3 there, A issues an ARP broadcast request.

ARP means Address Resolution Protocol. In a nutshell, A sets loose a packet that everyone will identify as a broadcast packet (which means they can all accept it). The packet contains a single question: "Who owns IP address 192.168.1.3?" C, the system that owns that IP address, will reply with its own ARP Reply packet, saying "192.168.1.3 is at MAC 33:44:55:66:77:88", thus supplying its MAC address. Now A knows both C's IP address and the MAC address that belongs to it. A also updates its ARP table, so if it has to send a packet to C again before the cached entry expires, it won't need to do a full ARP request again.
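The "check the table first, ask only if needed" logic can be sketched in Python like this. It is a toy simulation, not real ARP: nothing is sent on a wire, the dictionary hosts_on_subnet merely stands in for the machines that would hear the broadcast, and the class name, function name and 60-second expiry are all assumptions of mine.

```python
import time


class ArpTable:
    """A toy ARP cache: maps IP address -> MAC address, with entries that expire."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # assumed lifetime of an entry
        self.clock = clock          # injectable so expiry can be tested
        self.entries = {}           # ip -> (mac, expires_at)

    def lookup(self, ip):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[ip]    # stale entry: forget it
            return None
        return mac

    def learn(self, ip, mac):
        self.entries[ip] = (mac, self.clock() + self.ttl)


def resolve(ip, table, hosts_on_subnet):
    """Return (mac, broadcast_was_needed) for the given IP address."""
    mac = table.lookup(ip)
    if mac is not None:
        return mac, False           # found in the table, no broadcast needed
    # Simulated broadcast: "Who owns this IP?" Only the owner answers.
    mac = hosts_on_subnet.get(ip)
    if mac is not None:
        table.learn(ip, mac)        # remember the answer for next time
    return mac, True
```

The first call to resolve for 192.168.1.3 needs the simulated broadcast; a second call right after it is answered straight from the table.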


The final packet will loosely resemble this:

[11:22:33:44:55:66]
[33:44:55:66:77:88]
[.. other info ..]
[ 192.168.1.1 ]
[ 192.168.1.3 ]
[.. other info ..]
[.. Data Carried..]

This packet will now be broadcast onto the A-B-C network. What this means is that A, B and C will all see the packet. (Strictly speaking, this is what a hub does; a switch that has already learned where C is will pass the packet only to C, but the checks described below work the same way.) The network interface cards on A, B and C will receive this packet and send it to the network interface card driver software. (Is it not odd that though A sends the packet, A also receives the packet? Well... such is the broadcast world of Ethernet.)

The driver software on each of the network nodes will proceed to examine the packet. A will discover that though the source MAC address is its own, the destination MAC address belongs to another node. Under normal circumstances (uhh... there are some abnormal circumstances), A will ignore the packet, and will not send it further up the network stack for processing.

When B examines the packet, it will see that neither the source nor the destination MAC address is its own. It will also ignore the packet.

C will examine the packet and see that the destination MAC address is indeed its own. It will then proceed to send the packet up the network stack.
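The accept-or-ignore decision each node makes can be sketched in Python as below. This is a simplification under my own assumptions: the packet is just a dictionary with a dst_mac key, and the function names are made up. The all-ones address ff:ff:ff:ff:ff:ff is the Ethernet broadcast address, which is what lets every node accept a broadcast packet such as an ARP request.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def accepts(own_mac, packet):
    """A node keeps a packet only if it is addressed to it, or to everyone."""
    dst = packet["dst_mac"].lower()
    return dst == own_mac.lower() or dst == BROADCAST_MAC


def deliver_on_hub(nodes, packet):
    """Every node sees the packet; return the names of those that pass it upward."""
    return [name for name, mac in nodes.items() if accepts(mac, packet)]


# The three machines from the picture above.
nodes = {
    "A": "11:22:33:44:55:66",
    "B": "22:33:44:55:66:77",
    "C": "33:44:55:66:77:88",
}
```

With these nodes, a packet whose destination MAC is 33:44:55:66:77:88 is kept only by C, while a packet sent to the broadcast address is kept by A, B and C.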

For the purposes of this illustration, let's assume that a reply is generated. To send the reply packet, C will examine the original packet to get its source IP address. This source IP address will become the destination IP address of the reply.

C then proceeds to do the same thing that A did above.
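In code form, building the reply amounts to turning the addresses around. The Python sketch below assumes the packet is a plain dictionary and that arp_lookup is some function that maps an IP address to a MAC address (checking the ARP table and broadcasting if needed); both are stand-ins I made up for illustration.

```python
def build_reply(request, reply_data, arp_lookup):
    """Build a reply packet from the addresses found in the original packet."""
    # The original sender's IP address becomes the reply's destination.
    dst_ip = request["src_ip"]
    return {
        "src_mac": request["dst_mac"],   # the replier's own MAC address
        "dst_mac": arp_lookup(dst_ip),   # resolved from the IP, as the sender did
        "src_ip": request["dst_ip"],     # the replier's own IP address
        "dst_ip": dst_ip,
        "data": reply_data,
    }
```

For the packet from A to C described above, the reply comes out with 192.168.1.3 as its source and 192.168.1.1 as its destination, and with the two MAC addresses swapped to match.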

If you don't understand the preceding, stop at this point and read it again slowly, pausing to make sure you get a mental picture of the scenario described. It's important, since all other networking information builds on it. In the next installment, we'll answer the remaining questions.

Obviously, there may be errors in this document caused by my own misinformation, oversimplification or plain omissions. So questions, suggestions, kudos, flames and 'are you nuts?' should be directed to essiene at datavibe dot net or essiene at gmail dot com.

