Showing posts with label router. Show all posts

Friday, September 7, 2012

Tunneling traffic through your OpenFlow controller - Building a POX-based OpenFlow router

Why would you do this?

If we want to make an OpenFlow router, we need to be able to communicate with other non-OpenFlow routers. Normally, you would assign an IP address to your router, turn on BGP/OSPF, and then configure these protocols to talk to other routers using this IP address. With OpenFlow, the controller has the brains, but no obvious way to talk to other network devices. If only we could pretend that the controller was in the router somehow...

Can't we just look at the OpenFlow messages?

Sure, and we looked at this last week, but it's clumsy, and it means reinventing the wheel to make software routers talk to POX. RouteFlow abstracts this by loading software routers in virtual machines, and last week's demonstration hardcodes everything into the controller. Tunnelling gives us a middle-of-the-road solution: no virtual machines needed, but we can still bind stuff to a network interface on the controller and let the Linux network stack handle already-solved problems like TCP.

Building a tunnel

Linux has a fantastic tool called TUN/TAP, which lets you create virtual network interfaces. One end talks to the Linux network stack and lets any application use it, and the other end talks to our program. In the spirit of keeping things modular, and minimising opportunities for me to write bad code, I've used the PyTap library to set this up. PyTap has a PIP package, which means we can easily add it to a virtualenv and continue to keep everything self-contained.

Protip: TUN interfaces take IP packets, TAP interfaces take Ethernet packets

If you haven't used virtualenvs, here's the basic idea:

virtualenv tundemo
cd tundemo
source bin/activate
pip install python-pytun  # the package that provides the pytun module imported below
git clone http://github.com/noxrepo/pox

This will set you up with a virtualenv that has POX and PyTap ready to go. Despite being in a virtualenv, PyTap still needs root privileges, so you'll need to become root before you source bin/activate. If anyone can show me how to make this work without root privileges, I'll be happy to hear it (presumably some trickery with the /dev/net/tun device).

As with my other modules, I've hacked code into a copy of forwarding.l2_learning - this time I've renamed it to tundemo, and changed the name of the class all through the source.

Here are all my imports, add these at the top:

from pytun import TunTapDevice, IFF_TAP
from pox.lib.addresses import *
from pox.lib.packet import *
from threading import Thread
import struct      # used by SendToTap() further down
import subprocess

In the __init__() function, I've put the following code to make the TAP device:

    # Our table
    self.macToPort = {}
    
    # TAP device
    self.tap = TunTapDevice(flags=IFF_TAP)
    self.tap.addr = '10.1.1.13'
    self.tap.netmask = '255.255.255.0'
    self.tap.mtu=1300
    print "hwaddr for " + self.tap.name + ": " + str(EthAddr(self.tap.hwaddr))
    
    # Bring tap interface up
    subprocess.check_call(["ifconfig", self.tap.name, "up"])

PyTap chooses a random MAC address when it creates the interface, so printing it out lets us debug things a bit easier.

Tunneling from TAP to switch

Once we have our TAP interface up, we need to handle packets that we receive on it. Let's set up a thread to handle this:

    # Create thread to read from tap and send to switch
    # (args must be a tuple, hence the trailing comma)
    self.th = Thread(target=handle_tap_in, args=(self,))
    self.th.daemon = True
    self.th.start()

    # Set max packet size to 1400 bytes
    self.connection.send(of.ofp_set_config(miss_send_len=1400))

Our handler function is fairly straightforward

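def handle_tap_in(switch):
  while True:
    # Each read returns one frame: a 4-byte TAP header, then the Ethernet frame
    packettap = switch.tap.read(switch.tap.mtu+24)
    print "Packet read from tap"
    e = ethernet()
    e.parse(packettap[4:])

    # Flood unless we've already learned which port this MAC lives on
    port = of.OFPP_ALL
    if e.dst in switch.macToPort:
      port = switch.macToPort[e.dst]

    # Send the raw Ethernet frame (minus the TAP header) out of the switch
    msg = of.ofp_packet_out()
    msg.data = packettap[4:]
    msg.actions.append(of.ofp_action_output(port=port))
    switch.connection.send(msg)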

This sends every packet that comes up on the tap0 interface to the switch, either flooding it or sending it out the right port, depending on which MAC addresses we've already learned.
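If you're wondering what the [4:] slices in the handler are skipping, here's a minimal standalone sketch that splits a frame read from a TAP device into its parts. It assumes the device was opened without the IFF_NO_PI flag (which is what flags=IFF_TAP on its own gives you), so the kernel puts a 4-byte packet-information header in front of every frame. The function name split_tap_frame and the dictionary layout are made up for illustration, and it only uses the standard library, not POX.

```python
import struct

TAP_PI_LEN = 4    # packet-information header: 2 bytes flags + 2 bytes protocol
ETH_HDR_LEN = 14  # dst MAC (6) + src MAC (6) + ethertype (2)


def split_tap_frame(frame):
    """Split one frame read from a TAP device opened without IFF_NO_PI.

    Hypothetical helper for illustration; not part of POX or the TAP library.
    """
    if len(frame) < TAP_PI_LEN + ETH_HDR_LEN:
        raise ValueError("frame too short: %d bytes" % len(frame))

    # The flags field is in host byte order, the protocol in network order
    flags, = struct.unpack("=H", frame[0:2])
    proto, = struct.unpack("!H", frame[2:4])

    # Everything after the first 4 bytes is an ordinary Ethernet frame
    ethernet_frame = frame[TAP_PI_LEN:]
    dst, src, ethertype = struct.unpack("!6s6sH", ethernet_frame[:ETH_HDR_LEN])

    return {
        "flags": flags,
        "proto": proto,
        "dst": dst,
        "src": src,
        "ethertype": ethertype,
        "ethernet": ethernet_frame,
    }
```

The "ethernet" entry is the same thing the handler passes to ethernet.parse() and puts in msg.data.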

Tunneling from switch to TAP

We already get sent packets from the switch by default, and these go to the _handle_PacketIn() function. We just need to get the raw data out and send it to the TAP interface.

My switch always sends VLAN-tagged packets, so if yours doesn't then you'll want to change this a bit. Here is the SendToTap() function:

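def SendToTap():
      # Nested inside _handle_PacketIn() (like flood()), so packet and self
      # come from the enclosing function.
      # Remove the VLAN header and rebuild the Ethernet frame
      print "Forwarding packet"
      v = packet.next   # the VLAN header
      i = v.next        # whatever the VLAN header was carrying
      eth = ethernet(src=packet.src, dst=packet.dst, type=v.eth_type)
      print type(i)
      eth.set_payload(i)
      # first 4 bytes are 00 00 08 00 (null short, then IPv4 ethertype)
      totap = struct.pack('!bbbb', 0, 0, 8, 0) + eth.pack()
      #print totap.encode('hex')
      self.tap.write(totap)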
And we call this when a packet comes to us with a multicast MAC or our MAC:

if packet.dst == EthAddr(self.tap.hwaddr):
      print "Packet for us!"
      SendToTap()
      return

if packet.dst.isMulticast():
      SendToTap()
      flood() # 3a
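If your switch sends a mix of tagged and untagged packets, one option is to do the VLAN stripping on the raw bytes instead of on POX packet objects. The sketch below is a standalone illustration under that assumption, not the code I actually ran: strip_vlan_tag and add_tap_header are hypothetical helper names, and it only handles a single 802.1Q tag (TPID 0x8100).

```python
import struct

VLAN_TPID = 0x8100  # ethertype value that marks an 802.1Q tag


def strip_vlan_tag(frame):
    """Return the Ethernet frame without its 802.1Q tag.

    Untagged frames are returned unchanged. Hypothetical helper.
    """
    if len(frame) < 14:
        raise ValueError("not a complete Ethernet header")
    tpid, = struct.unpack("!H", frame[12:14])
    if tpid != VLAN_TPID:
        return frame
    if len(frame) < 18:
        raise ValueError("truncated VLAN tag")
    # Bytes 12-15 are the tag (TPID + TCI); the real ethertype follows them
    return frame[:12] + frame[16:]


def add_tap_header(frame, proto=0x0800):
    """Prefix a frame with the 4-byte TAP header: zero flags, then protocol."""
    return struct.pack("!HH", 0, proto) + frame
```

With the default proto, add_tap_header() produces the same 00 00 08 00 prefix as the struct.pack('!bbbb', 0, 0, 8, 0) call in SendToTap().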

Now the tunnel is good to go. Just make sure any devices plugged into the switch have an MTU of 1300, and you can talk to the controller and transfer files with SCP (it took 30 minutes to copy an Ubuntu ISO, at around 4Mb/s).

A couple of hiccups


Packet sizes

My switch doesn't seem to handle having the packet-size value changed. By default, POX tells the switch to send only the first 128 bytes of each packet, and while we can send messages to increase this, my switch ignores them. The work-around is to change DEFAULT_MISS_SEND_LEN to 1400 in pox/openflow/libopenflow_01.py
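If you want to check in Wireshark whether the switch is even receiving the request, it helps to know what the message looks like on the wire. Here's a minimal sketch of an OpenFlow 1.0 OFPT_SET_CONFIG message built by hand with struct: an 8-byte OpenFlow header followed by a 16-bit flags field and the 16-bit miss_send_len. The function name build_set_config is made up for illustration; in POX you would just send of.ofp_set_config(miss_send_len=1400) as shown earlier.

```python
import struct

OFP_VERSION = 0x01     # OpenFlow 1.0
OFPT_SET_CONFIG = 9    # message type for "set switch config"
OFPC_FRAG_NORMAL = 0   # no special handling of IP fragments


def build_set_config(miss_send_len, xid=0, flags=OFPC_FRAG_NORMAL):
    """Build the raw bytes of an OpenFlow 1.0 OFPT_SET_CONFIG message.

    Hypothetical helper for illustration; POX builds this for you.
    """
    length = 12  # 8-byte header + 2 bytes flags + 2 bytes miss_send_len
    return struct.pack("!BBHIHH",
                       OFP_VERSION, OFPT_SET_CONFIG, length, xid,
                       flags, miss_send_len)
```

A value of 1400 shows up as 05 78 in the last two bytes of the message.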

Jitter

Latency varies from 1ms to 50ms, and TCP really, really doesn't like this. Routing protocols that don't use TCP, like OSPF (which runs directly over IP), shouldn't notice, and even TCP-based routing protocols like BGP should be fine since they move very little data. Bulk TCP transfers get really confused, though, so you shouldn't expect any large data flows to work well over this.

MTU sizes

This stuff confuses me. I'm a network engineer, and I'm supposed to know this stuff, but I don't. When we read from the TAP device, we read the MTU + 24 bytes. There's 14 bytes for the Ethernet header, 4 bytes for the TAP header, and another 6 bytes in there for no obvious reason. 24 bytes just seems to work, and I have no idea why.
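For what it's worth, here is the arithmetic for the bytes that can be accounted for, as a small sketch. The constant and function names are made up for illustration, and the VLAN case is only a guess at where some of the spare bytes could go (an 802.1Q tag is 4 bytes); nothing above establishes that tagged frames ever reach the TAP device.

```python
TAP_PI_LEN = 4     # TAP header in front of every frame
ETH_HDR_LEN = 14   # Ethernet header (not counted in the interface MTU)
VLAN_TAG_LEN = 4   # optional 802.1Q tag (an assumption, see above)
READ_SLACK = 24    # the extra bytes added to the MTU in handle_tap_in()


def max_tap_frame(mtu, vlan_tagged=False):
    """Largest read we'd expect from a TAP device with the given MTU."""
    size = mtu + TAP_PI_LEN + ETH_HDR_LEN
    if vlan_tagged:
        size += VLAN_TAG_LEN
    return size


def spare_bytes(mtu, vlan_tagged=False):
    """How much of the mtu + 24 read buffer is left unexplained."""
    return (mtu + READ_SLACK) - max_tap_frame(mtu, vlan_tagged)
```

With an MTU of 1300 this gives 1318 bytes for an untagged frame, so the 1324-byte read has 6 bytes to spare, which matches the 6 unexplained bytes above.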

TAP device

Two things bug me about this - there doesn't seem to be a nice way to bring it up (apart from using ifconfig), and you need root to create it in the first place. I'd want to fix both of these for a nicer solution.

Next steps

  • TAP devices could be created for each physical port on an OpenFlow device, or as routed interfaces for each VLAN - limitless opportunities here
  • BIRD or Quagga could bind to a TAP device, and the controller could turn routes into flows. BIRD has a Python interface, but since both use standard routing protocols, you could easily sniff the traffic and build routing tables out of it. Sniffing BGP updates is still way easier than trying to build a Python TCP stack
  • VRFs? Traffic injection? Just another example of how easy it is to grab POX and do novel things with inexpensive hardware

Friday, August 31, 2012

ARP and ping in POX - Building a POX-based OpenFlow router

What are we doing?

Today, we're going to look at how to handle ARP and ICMP ping messages in the OpenFlow controller POX. The results aren't amazing - latency is between 5 and 50 milliseconds (using PyPy makes no difference) - but it's an important feature for any layer 3 device.

Why is it important?

If we want to make native router modules in OpenFlow, we need to be able to assign IP addresses to interfaces on our device. This means the router can talk IP to other devices on the network, a vital step towards building an OpenFlow router.

What about RouteFlow?

RouteFlow is a fully-functional OpenFlow router that you can use today, that translates the physical ports on your OpenFlow device to interfaces on a virtual machine. This virtual machine runs a software router daemon like Quagga or BIRD, meaning you can leverage a mature software router instead of making your own.

RouteFlow represents an important step in OpenFlow routing, but I think we can do better. RouteFlow polls the RIB on a virtual machine and translates that to OpenFlow, which means the router daemons don't know that they're talking to the controller.

If we build a clean interface, we can write POX modules for OSPF, IS-IS, BGP and the like, and let them talk directly to the controller.

How to make packets in POX

I love the packet library in POX: it's clean and easy to use. To make your own packets, just do what your network stack normally does - create the payload, wrap that in the layer below, then the layer below that, and once you're at Ethernet you're finished.

Step 1: ARP replies

I've started with the forwarding.l2_learning module from POX, and added some code to the _handle_PacketIn function, just under self.macToPort[packet.src] = event.port (so that MAC addresses are still stored for each new port).

match = of.ofp_match.from_packet(packet)
if ( match.dl_type == packet.ARP_TYPE and
     match.nw_proto == arp.REQUEST and
     match.nw_dst == IPAddr("10.1.1.253")):
  self.RespondToARP(packet, match, event)
  return

This checks for ARP requests for our hardcoded IP 10.1.1.253, and responds. The code to respond is as follows:

  def RespondToARP(self, packet, match, event):
    # reply to ARP request
    r = arp()
    r.opcode = arp.REPLY
    r.hwdst = match.dl_src
    r.protosrc = IPAddr("10.1.1.253")
    r.protodst = match.nw_src
    r.hwsrc = EthAddr("00:12:34:56:78:90")
    e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)
    e.set_payload(r)
    log.debug("%i %i answering ARP for %s" %
     ( event.dpid, event.port,
       str(r.protosrc)))
    msg = of.ofp_packet_out()
    msg.data = e.pack()
    msg.actions.append(of.ofp_action_output(port =
                                          of.OFPP_IN_PORT))
    msg.in_port = event.port
    event.connection.send(msg)

We build an ARP packet by instantiating the arp class from pox.lib.packet, whose constructor initialises the packet as follows:

def __init__(self, raw=None, prev=None, **kw):
        packet_base.__init__(self)

        self.prev = prev

        self.hwtype     = arp.HW_TYPE_ETHERNET
        self.prototype  = arp.PROTO_TYPE_IP
        self.hwsrc      = ETHER_ANY
        self.hwdst      = ETHER_ANY
        self.hwlen      = 6
        self.opcode     = 0
        self.protolen   = 4
        self.protosrc   = IP_ANY 
        self.protodst   = IP_ANY
        self.next       = b''

        if raw is not None:
            self.parse(raw)

        self._init(kw)

We just need to set the OPCODE, HWSRC, HWDST, PROTOSRC and PROTODST fields of this. I've done this in the body of the code, but we can simplify it by passing extra arguments as follows:

r = arp( opcode=arp.REPLY, 
         hwsrc=EthAddr("00:12:34:56:78:90"),
         hwdst=match.dl_src,
         protosrc = IPAddr("10.1.1.253"),
         protodst = match.nw_src)

Once we've created the ARP packet, we need to create an Ethernet packet to put this into. This isn't perfect (we should check for VLAN tags and add them, or steal the body of the original packet and modify that), but it works if we're just dealing with a straight Ethernet network.

e = ethernet(type=packet.ARP_TYPE, src=r.hwsrc, dst=r.hwdst)
e.set_payload(r)

Then we send this off to the switch in a packet-out message, and the switch sends it out the port the request came in on. Now that we have an IP address that people can find, let's make it respond to something.
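To see what this reply looks like as bytes on the wire (handy when comparing against a Wireshark capture), here's a standalone sketch that builds the same Ethernet + ARP reply with nothing but struct. It's an illustration of the frame layout rather than a replacement for the POX packet library, and the helper names mac_bytes and build_arp_reply are made up.

```python
import socket
import struct

ETH_TYPE_ARP = 0x0806
ARP_HW_ETHERNET = 1
ARP_PROTO_IPV4 = 0x0800
ARP_REPLY = 2


def mac_bytes(mac):
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(bytearray(int(part, 16) for part in mac.split(":")))


def build_arp_reply(our_mac, our_ip, their_mac, their_ip):
    """Build a complete Ethernet frame carrying an ARP reply."""
    # ARP body: hwtype, prototype, hwlen, protolen, opcode,
    # then sender MAC/IP followed by target MAC/IP (28 bytes in total)
    arp_body = struct.pack("!HHBBH6s4s6s4s",
                           ARP_HW_ETHERNET, ARP_PROTO_IPV4, 6, 4, ARP_REPLY,
                           mac_bytes(our_mac), socket.inet_aton(our_ip),
                           mac_bytes(their_mac), socket.inet_aton(their_ip))
    # Ethernet header: destination, source, ethertype (14 bytes)
    eth_header = struct.pack("!6s6sH",
                             mac_bytes(their_mac), mac_bytes(our_mac),
                             ETH_TYPE_ARP)
    return eth_header + arp_body
```

Calling build_arp_reply("00:12:34:56:78:90", "10.1.1.253", requester_mac, requester_ip) gives a 42-byte frame with the same fields we set on the arp and ethernet objects above.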

Step 2: Ping replies

If we can reply to ARP requests, we can reply to pings. This has a few more layers, but that just makes the code a little longer, not any more complicated.

The ARP reply is easy - we make an ARP packet, then put that in an Ethernet packet. For ping reply, this is what we do:
  1. Get the payload from the echo request (ping)
  2. Create an echo reply packet, insert the old payload
  3. Create an ICMP packet, insert the echo reply
  4. Create an IPv4 packet, insert the ICMP
  5. Create an Ethernet packet, insert the IPv4
Here's what the code looks like:

  def RespondToPing(self, ping, match, event):
    p = ping
    # we know this is an ICMP Echo packet, so loop through
    # maybe this needs a try... except?
    while not isinstance(p, echo):
      p = p.next
    
    r = echo(id=p.id, seq=p.seq)
    r.set_payload(p.next)
    i = icmp(type=0, code=0)
    i.set_payload(r)
    ip = ipv4(protocol=ipv4.ICMP_PROTOCOL,
              srcip=IPAddr("10.1.1.253"),
              dstip=match.nw_src)
    ip.set_payload(i)
    e = ethernet(type=ping.IP_TYPE,
                 src=match.dl_dst,
                 dst=match.dl_src)
    e.set_payload(ip)
    log.debug("%i %i answering PING for %s" % (
              event.dpid, event.port,
              str(match.nw_src)))
    msg = of.ofp_packet_out()
    msg.data = e.pack()
    msg.actions.append(of.ofp_action_output(port =
                                          of.OFPP_IN_PORT))
    msg.in_port = event.port
    event.connection.send(msg)

Simple, just slightly longer than the ARP code.
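One thing the code above never mentions is the ICMP checksum, which has to be right or the other end will drop the reply; since the replies above work, the packing step is evidently filling it in for us. If you ever build one by hand, a minimal standalone sketch looks like this. It covers only the ICMP layer (steps 1 to 3 in the list above), and the names internet_checksum and build_echo_reply are made up for illustration.

```python
import struct

ICMP_ECHO_REPLY = 0


def internet_checksum(data):
    """16-bit one's-complement checksum used by ICMP (and IPv4 headers)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_reply(ident, seq, payload):
    """Build an ICMP echo reply that echoes the request's id, seq and payload."""
    # Checksum is computed over the whole message with the field set to zero
    header = struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, csum, ident, seq) + payload
```

A quick sanity check is that running internet_checksum() over the finished message returns 0.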

Pictures

Here's a look at the controller output, and the view from Wireshark.

The OpenFlow dissector for Wireshark is part of the OpenFlow reference switch. It's a few years old, and uses an obsolete API call - I can put up a patch if anyone gets stuck.

Next steps

  • ARP tables - if we're going to route traffic, we need to find the MAC addresses of destination IPs so that we can send traffic to them
  • Routing protocol - RIP and OSPF will be fairly easy, BGP will be a bit harder due to relying on TCP. These can all be added to POX as modules
  • TUN/TAP support - we can create TUN/TAP interfaces and let the Linux TCP stack do the hard work for us. This means a BGP module would create the TUN/TAP interface and handle OpenFlow encapsulation/decapsulation, but could offload the TCP to the Linux stack.