Thursday, March 29, 2012

Polishing pyswitch

I've had my modified version of pyswitch running on NOX for a couple of weeks, and it's working fine. The key to OpenFlow is the controller - if your controller is processing a lot of packets, then it's a bottleneck; but if all your traffic is matching flows in the switch, then it will work at line speed.

As I've been using the switch for more and more test servers, I've noticed that my modifications have oversimplified things a little. Here's a summary of the current pyswitch logic:

  1. If a packet doesn't match a flow in the switch, send to the controller
  2. For each packet sent to the controller, save the source address and source port
  3. If the controller gets a packet with a destination address it knows, it sends it to that port and installs a new flow into the switch
Do you see the problem? It's fine with two computers on the switch, but here's how it works with three:
  1. PC A sends a packet to PC B. No flows in the switch so the controller gets the packet, saves the address and port of A, and floods the packet
  2. PC B replies. No flow matched, controller gets the packet, saves the address and port of B, and recognises PC A. Controller then forwards the packet to the port that PC A was seen on, and installs a flow into the switch
  3. PC A sends another packet to PC B. No flow matched, controller gets the packet, recognises address of PC B so it forwards the packet and stores a flow in the switch.
  4. Flows are in the switch for both PC A and PC B, so packets to them are sent at line speed without touching the controller
What happens when PC C comes along?
  1. PC C sends a packet to PC A. There is a flow for this, so it is forwarded at line speed in the switch
  2. PC A replies to PC C. No flow, so the controller gets the packet, saves the source details (address and port of PC A), doesn't have details of PC C so it floods the packet
Do you see the problem? The source details of PC C never get stored, because all its outbound packets match flows in the switch. This is a serious problem - it means that all of the traffic back to PC C goes through the OpenFlow controller at about 10 packets per second, breaking the network.
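The failure mode above can be reproduced with a toy simulation (a hypothetical model of the simplified logic, not the NOX API): because hardware flows bypass the controller, the controller only ever learns the sources of packets it actually sees.

```python
# Toy model: hardware flows bypass the controller, so the controller
# only learns the source of packets that miss the flow table.
mac_table = {}   # src MAC -> port, learned by the controller
flow_table = {}  # dst MAC -> out port, installed in the switch

def send(src, dst, src_port):
    if dst in flow_table:
        return "switched"          # matched in hardware; controller never sees it
    mac_table[src] = src_port      # controller learns the source
    if dst in mac_table:
        flow_table[dst] = mac_table[dst]  # install a flow for the destination
        return "forwarded"
    return "flooded"

send("A", "B", 1)   # flooded; A learned
send("B", "A", 2)   # forwarded; B learned, flow to A installed
send("A", "B", 1)   # forwarded; flow to B installed
send("C", "A", 3)   # hits the flow to A, so C is never learned
print("C" in mac_table)  # False: everything back to C floods forever
```

Running this shows PC C's packets to A get switched in hardware, so the controller never records C's port and every reply to C keeps flooding through the controller.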

The original pyswitch didn't have this problem - it created very specific flows based on all the source and destination attributes. I could have fixed it up to handle VLANs better (by making it recognise ethertype 0x8100 as VLAN and move up the header for the actual ethertype), but this isn't efficient - a connection to a website would have 2 flows for the original ARP requests, another 2 for the DNS lookup, and another 2 for the TCP connection - 6 flows for a single web page?

We could strike a compromise and set flows based on the source and destination MAC addresses, but I still don't like that. It means that for N MAC addresses on the switch, you go from N flows to NxN flows; for a 48-port switch, this is from 48 flows to 2,304 flows. It may be a case of trading extra flows for simpler code, but I think I have a better solution.
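To put numbers on that trade-off, here's the arithmetic from the text (the text rounds the N×(N-1) ordered pairs up to N×N):

```python
# Flow-table growth for N hosts: destination-only flows need N entries,
# source+destination MAC pairs need roughly N*N entries.
def dst_only_flows(n):
    return n

def src_dst_flows(n):
    return n * n  # rounding N*(N-1) up to N*N, as in the text

print(dst_only_flows(48), src_dst_flows(48))  # 48 2304
```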

My new addition to pyswitch adds a flow to the switch whenever it has to flood a packet. The idea is, when PC C comes along and sends a packet, we want that to go to the controller, even if we know the destination. Here's the new code:

# --
# If we've learned the destination MAC set up a flow and
# send only out of its inport.  Else, flood.
# --
def forward_l2_packet(dpid, inport, packet, buf, bufid):    
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and inst.st[dpid].has_key(dstaddr):
        prt = inst.st[dpid][dstaddr]
        if  prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
            logger.info('**warning** learned port = inport')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + mac_to_str(packet.dst), system="pyswitch")
            logger.info('installing flow for ' + mac_to_str(packet.dst))
            # delete src flow if exists
            delflow = {}
            delflow[core.DL_SRC] = packet.dst
            inst.delete_datapath_flow(dpid, delflow)
            # sam edit - just load dest address, the rest doesn't matter
            flow = create_l2_out_flow(packet)
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT, 
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)
    else:    
        # haven't learned destination MAC. Flood 
        if ord(dstaddr[0]) & 1:
            logger.info('broadcast/multicast packet to ' + mac_to_str(packet.dst) + ', flooding')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            logger.info('no MAC known for ' + mac_to_str(packet.dst) + ', flooding')
            # set up flow to capture source packet
            flow = {}
            flow[core.DL_SRC] = packet.dst
            actions = [[openflow.OFPAT_OUTPUT, [65535, openflow.OFPP_CONTROLLER]]]
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       1, actions,
                                       None, openflow.OFP_DEFAULT_PRIORITY+1,
                                       None, None)

Pay attention to the install_datapath_flow() calls. If we start from the bottom, you'll see that the else statement is a lot larger. Broadcast/multicast packets get flooded, but unknown packets also install a flow (at default priority+1) so that any packets from this unknown host come to the controller. This is matched by a delete_datapath_flow() call further up the function, so that when a new flow is installed, it tries to delete any flows that match the source address.

How does it perform? Each new flow sends roughly 3 packets to the controller (the first unknown, and a couple because of our source-match flow - it doesn't get deleted before the next queued packet comes through), but we keep the flow table at O(N) entries. If we look at our ARP + DNS + TCP example from before: the old, specific pyswitch needed 6 flows and sent 6 packets to the controller, while the new version also sends 6 packets to the controller but needs only 2 flows. In other words, it loads the controller no more than the old pyswitch, but uses a fraction of the flows.

OFPP_FLOOD vs OFPP_ALL

One extra note for those of you who haven't spotted it - I've changed the action from OFPP_FLOOD to OFPP_ALL. The Pronto 3290 we have at work has always responded to FLOOD messages weirdly - it looks like it sets up individual flows for each active port, and after trawling through the OpenFlow spec I've figured out why:

OpenFlow-only switches support only the required actions below, while OpenFlow-enabled switches, routers, and access points may also support the NORMAL action. Either type of switch can also support the FLOOD action.

Required Action: Forward. OpenFlow switches must support forwarding the packet to physical ports and the following virtual ones:
• ALL: Send the packet out all interfaces, not including the incoming interface.
• CONTROLLER: Encapsulate and send the packet to the controller.
• LOCAL: Send the packet to the switch's local networking stack.
• TABLE: Perform actions in flow table. Only for packet-out messages.
• IN PORT: Send the packet out the input port.

Optional Action: Forward. The switch may optionally support the following virtual ports:
• NORMAL: Process the packet using the traditional forwarding path supported by the switch (i.e., traditional L2, VLAN, and L3 processing.) The switch may check the VLAN field to determine whether or not to forward the packet along the normal processing route. If the switch cannot forward entries for the OpenFlow-specific VLAN back to the normal processing route, it must indicate that it does not support this action.
• FLOOD: Flood the packet along the minimum spanning tree, not including the incoming interface.

See the difference? FLOOD is an optional action that activates any spanning-tree code in the switch. It's not as intensive as NORMAL (which only true hybrid switches will support), but it isn't what pyswitch is supposed to do. Changing the code to use OFPP_ALL instead of OFPP_FLOOD seems to make the switch work less on each packet that comes back from the controller - and this means the controller can handle even more flows per second!

Here's a code dump of my latest version; I may polish it and send it back to the NOX dudes later if I get the time:

# Copyright 2008 (C) Nicira, Inc.
# This file is part of NOX. Additions from Sam Russell for
# compatibility with OVS on Pronto 3290
# NOX is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# NOX is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with NOX.  If not, see <http://www.gnu.org/licenses/>.
# Python L2 learning switch 
#
# ----------------------------------------------------------------------
#
# This app functions as the control logic of an L2 learning switch for
# all switches in the network. On each new switch join, it creates 
# an L2 MAC cache for that switch. 
#
# In addition to learning, flows are set up in the switch for learned
# destination MAC addresses.  Therefore, in the absence of flow-timeout,
# pyswitch should only see one packet per flow (where flows are
# considered to be unidirectional)
#

from nox.lib.core     import *

from nox.lib.packet.ethernet     import ethernet
from nox.lib.packet.packet_utils import mac_to_str, mac_to_int

from twisted.python import log

import logging
from time import time
from socket import htons
from struct import unpack

logger = logging.getLogger('nox.coreapps.examples.pyswitch')

# Global pyswitch instance 
inst = None

# Timeout for cached MAC entries
CACHE_TIMEOUT = 5

# Modified extract_flow except just dest address - another sam edit
def create_l2_out_flow(ethernet):
    attrs = {}
    attrs[core.DL_DST] = ethernet.dst
#    attrs[core.DL_SRC] = ethernet.src
    return attrs

# --
# Given a packet, learn the source and peg to a switch/inport 
# --
def do_l2_learning(dpid, inport, packet):
    global inst 
    logger.info('learning MAC for incoming packet...' + mac_to_str(packet.src))
    # learn MAC on incoming port
    srcaddr = packet.src.tostring()
    if ord(srcaddr[0]) & 1:
        log.msg('MAC is null', system='pyswitch')
        logger.info('MAC is null')
        return
    if inst.st[dpid].has_key(srcaddr):
        dst = inst.st[dpid][srcaddr]
        if dst[0] != inport:
            log.msg('MAC has moved from '+str(dst[0])+' to '+str(inport), system='pyswitch')
            logger.info('MAC has moved from '+str(dst[0])+' to '+str(inport))
        else:
            return
    else:
        logger.info('learned MAC '+mac_to_str(packet.src)+' on %d %d'% (dpid,inport))

    # learn or update timestamp of entry
    inst.st[dpid][srcaddr] = (inport, time(), packet)

    # Replace any old entry for (switch,mac).
    mac = mac_to_int(packet.src)

# --
# If we've learned the destination MAC set up a flow and
# send only out of its inport.  Else, flood.
# --
def forward_l2_packet(dpid, inport, packet, buf, bufid):    
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and inst.st[dpid].has_key(dstaddr):
        prt = inst.st[dpid][dstaddr]
        if  prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
            logger.info('**warning** learned port = inport')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + mac_to_str(packet.dst), system="pyswitch")
            logger.info('installing flow for ' + mac_to_str(packet.dst))
            # delete src flow if exists
            delflow = {}
            delflow[core.DL_SRC] = packet.dst
            inst.delete_datapath_flow(dpid, delflow)
            # sam edit - just load dest address, the rest doesn't matter
            flow = create_l2_out_flow(packet)
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT, 
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)
    else:    
        # haven't learned destination MAC. Flood 
        if ord(dstaddr[0]) & 1:
            logger.info('broadcast/multicast packet to ' + mac_to_str(packet.dst) + ', flooding')
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
        else:
            logger.info('no MAC known for ' + mac_to_str(packet.dst) + ', flooding')
            # set up flow to capture source packet
            flow = {}
            flow[core.DL_SRC] = packet.dst
            actions = [[openflow.OFPAT_OUTPUT, [65535, openflow.OFPP_CONTROLLER]]]
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_ALL, inport)
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       1, actions,
                                       None, openflow.OFP_DEFAULT_PRIORITY+1,
                                       None, None)
        
# --
# Responsible for timing out cache entries.
# Is called every 1 second.
# --
def timer_callback():
    global inst

    curtime  = time()
    for dpid in inst.st.keys():
        for entry in inst.st[dpid].keys():
            if (curtime - inst.st[dpid][entry][1]) > CACHE_TIMEOUT:
                log.msg('timing out entry'+mac_to_str(entry)+str(inst.st[dpid][entry])+' on switch %x' % dpid, system='pyswitch')
                inst.st[dpid].pop(entry)

    inst.post_callback(1, timer_callback)
    return True

def datapath_leave_callback(dpid):
    logger.info('Switch %x has left the network' % dpid)
    if inst.st.has_key(dpid):
        del inst.st[dpid]

def datapath_join_callback(dpid, stats):
    logger.info('Switch %x has joined the network' % dpid)

# --
# Packet entry method.
# Drop LLDP packets (or we get confused) and attempt learning and
# forwarding
# --
def packet_in_callback(dpid, inport, reason, len, bufid, packet):

    if not packet.parsed:
        log.msg('Ignoring incomplete packet',system='pyswitch')
        return CONTINUE
        
    if not inst.st.has_key(dpid):
        log.msg('registering new switch %x' % dpid,system='pyswitch')
        inst.st[dpid] = {}

    # don't forward lldp packets    
    if packet.type == ethernet.LLDP_TYPE:
        return CONTINUE

    # learn MAC on incoming port
    do_l2_learning(dpid, inport, packet)

    forward_l2_packet(dpid, inport, packet, packet.arr, bufid)

    return CONTINUE

class pyswitch(Component):

    def __init__(self, ctxt):
        global inst
        Component.__init__(self, ctxt)
        self.st = {}

        inst = self

    def install(self):
        inst.register_for_packet_in(packet_in_callback)
        inst.register_for_datapath_leave(datapath_leave_callback)
        inst.register_for_datapath_join(datapath_join_callback)
        inst.post_callback(1, timer_callback)

    def getInterface(self):
        return str(pyswitch)

def getFactory():
    class Factory:
        def instance(self, ctxt):
            return pyswitch(ctxt)

    return Factory()

Monday, March 26, 2012

Thugs at New Zealand skateparks

This is totally off-topic, but I saw this and it made me really angry. I used to love skateboarding when I was a kid, and the thought of grown adults shoulder-charging young kids off skateboards is just disgusting.

 
Today At Vic Park from NZskate.com on Vimeo.

In other news, I've been playing with the source code to BIRD to see if I can put in some hooks to make it work with NOX. It has a set of OS-specific "kernel" modules which install routes into the OS routing tables, so it shouldn't be hard to make a NOX version of this.

Another option would be to monitor the MRT dump file, or point it at a named pipe - then get NOX to use that to pick up updates.

One thing that Nick Buraglio pointed out was that BIRD is in need of an IS-IS implementation. IPv6 has been a good excuse to move from OSPF to IS-IS because it all runs in one instance - not requiring OSPFv2 and v3 instances for IPv4 and IPv6. Time will tell if there's still a place for decentralised IGPs in an OpenFlow world, but if we're going to see an influx of software routers, then IS-IS will definitely be added in the next year or two. (edit - somebody's beaten me to it and added this to Quagga: http://code.google.com/p/google-quagga/)

More details once I get BIRD talking to NOX though!

Saturday, March 17, 2012

Multicasts and Broadcasts and Flows, oh my!

Background
If you've set up pyswitch and NOX with Open vSwitch then you'll notice that any packets that don't match a flow get sent to the Openflow controller. If you set no flows, the controller receives every packet, until either you or the controller adds flows. Pyswitch will set flows for unicast traffic, but what happens when you start getting a substantial amount of background multicast/broadcast traffic?

A standard NOX setup can handle 10 flows per second. This means it can set up flows to handle 10 new hosts, or 10 different protocols, or it can simply return 10 packets to the switch and tell it to flood them.

Can you see the potential problem here? Any medium-to-large sized network will have all sorts of background multicast/broadcast traffic; here are some of the things that will generate broadcast/multicast traffic on your network:

  • ARP requests
  • DHCP requests
  • SSDP messages (from any UPnP-enabled device)
  • SMB/NetBIOS (windows machines)
  • Bonjour/mDNS (Apple / anything with iTunes)
  • IGP routing protocols
  • Spanning tree
  • IPv6 router-advertisement messages
Taking a closer look
If you fire up Wireshark you can filter on these messages.

Just right-click on the IG bit, then go Apply as Filter -> Selected, and from now on, you'll only see multicast/broadcast packets. Here are some examples of what you might see on your network



What's worse is that if you sit and watch, you'll see packets show up in bursts - SSDP, mDNS and NBNS all send 5-10 packets at a time, and with a standard Openflow controller-switch setup, a 10-packet burst will pause your network for a whole second.

The solution
With Open vSwitch, you have a few options - you could add all your flows manually, or you can delegate that to an Openflow controller. For something like this however, you can add a flow that makes your switch automatically flood any multicast/broadcast traffic, leaving your Openflow controller to focus on unicast traffic.

The ovs-ofctl documentation gives us an easy answer - set a flow that masks the group address bit as follows:

ovs-ofctl add-flow br0 priority=65500,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,actions=flood

That was easy! If you want to do IGMP or MLD snooping, you can add flows with higher priorities - but first have a look at how much IP multicast traffic is on your network already - remember, 10 flows per second is probably your limit.
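The mask in the flood flow above works because an Ethernet destination address marks multicast/broadcast with the least-significant bit of its first octet; the 01:00:00:00:00:00/01:00:00:00:00:00 match tests exactly that one bit. A quick sketch of the check:

```python
# The dl_dst mask 01:00:00:00:00:00/01:00:00:00:00:00 tests only the
# group (multicast/broadcast) bit: the low bit of the first octet.
def matches_flood_flow(mac):
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

print(matches_flood_flow("ff:ff:ff:ff:ff:ff"))  # True  (broadcast)
print(matches_flood_flow("01:00:5e:00:00:fb"))  # True  (IPv4 multicast)
print(matches_flood_flow("00:1b:21:3a:b2:10"))  # False (unicast)
```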

ovs-ofctl add-flow br0 priority=65500,dl_dst=01:00:5e:00:00:16,actions=controller
ovs-ofctl add-flow br0 priority=65500,dl_dst=33:33:00:00:00:16,actions=controller

The first flow will match IGMP traffic, and the second will match MLDv2 (the IPv6 version of IGMPv3) traffic. Unfortunately, both versions of MLD need more complicated flows: MLDv1 uses the all-local-nodes address, and even though MLDv2 has its own address, the MAC address 33:33:00:00:00:16 is valid for any IPv6 multicast address that ends in :0:16.
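That overlap exists because IPv6 multicast addresses map to MACs by prefixing 33:33 to the last 32 bits of the address (per RFC 2464), so different groups can share a MAC. A small demonstration:

```python
# RFC 2464 mapping: an IPv6 multicast address maps to the MAC
# 33:33 followed by the last 32 bits of the address.
import ipaddress

def ipv6_multicast_mac(addr):
    low32 = ipaddress.IPv6Address(addr).packed[-4:]
    return "33:33:" + ":".join("%02x" % b for b in low32)

print(ipv6_multicast_mac("ff02::16"))      # 33:33:00:00:00:16 (MLDv2 routers)
print(ipv6_multicast_mac("ff05::0:16"))    # 33:33:00:00:00:16 (collides!)
```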

Has anyone done IGMP/MLD snooping on an Openflow controller yet? It's probably outside the scope of my current project, but it should be easy enough to build into Pyswitch if someone had the time. Let me know if you've done this, my twitter is @samrussellnz

Thursday, March 15, 2012

Dead drops: breaking open USB flash drives

I love it when people come up with interesting ways of using technology, and when I came across dead drops, I was immediately hooked. The idea is to concrete USB flash drives into walls in public places, and then see what people use them for. Unlike conventional networks, such as the internet, it's not immediately obvious what dead drops would be useful for, but given the last few years of restrictive new laws, teenagers and fat Germans being extradited for running websites, and now NTIA playing silly buggers about who gets to run DNS, dead drops are a green-field opportunity that hasn't yet been tainted by money and lawyers.

Here's how it works. You buy a USB key (this one was $20NZD and is 8GB), and admire it in its shiny new packaging
Once you're satisfied with how awesome your purchase was, you use scissors or a screwdriver to pry off the cover.

 For protection, we'll tape around the circuit board

And finally, we test that it still works (it does). Your dead drop is ready to be installed somewhere - just make sure you get permission first! Given it will cost your local council nothing, and it's a novel new type of street art, it's quite possible the answer will be yes.


Wednesday, March 14, 2012

Pyswitch bugfix, and DoS vulnerability in open vSwitch

Pyswitch
I had a bit of time to work on Pyswitch today, and I've cut it back so that it only sets the destination MAC and out port, and that was enough for it to start setting flows properly. You can look at the source if you like, or just focus on the part I've changed:

The function I've modified is forward_l2_packet - as the name suggests, it either floods all ports with the packet it has received, or sends the packet out the correct port and installs a flow in the switch. Here is the function:


def forward_l2_packet(dpid, inport, packet, buf, bufid):    
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and inst.st[dpid].has_key(dstaddr):
        prt = inst.st[dpid][dstaddr]
        if  prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_FLOOD, inport)
        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + str(packet), system="pyswitch")
            flow = extract_flow(packet)
            flow[core.IN_PORT] = inport
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT, 
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)
    else:    
        # haven't learned destination MAC. Flood 
        inst.send_openflow(dpid, bufid, buf, openflow.OFPP_FLOOD, inport)

The key to creating the flow is the extract_flow function from util.py:


def extract_flow(ethernet):
    """
    Extracts and returns flow attributes from the given 'ethernet' packet.
    The caller is responsible for setting IN_PORT itself.
    """
    attrs = {}
    attrs[core.DL_SRC] = ethernet.src
    attrs[core.DL_DST] = ethernet.dst
    attrs[core.DL_TYPE] = ethernet.type
    p = ethernet.next


    if isinstance(p, vlan):
        attrs[core.DL_VLAN] = p.id
        attrs[core.DL_VLAN_PCP] = p.pcp
        p = p.next
    else:
        attrs[core.DL_VLAN] = 0xffff # XXX should be written OFP_VLAN_NONE
        attrs[core.DL_VLAN_PCP] = 0


    if isinstance(p, ipv4):
        attrs[core.NW_SRC] = p.srcip
        attrs[core.NW_DST] = p.dstip
        attrs[core.NW_PROTO] = p.protocol
        p = p.next


        if isinstance(p, udp) or isinstance(p, tcp):
            attrs[core.TP_SRC] = p.srcport
            attrs[core.TP_DST] = p.dstport
        else:
            if isinstance(p, icmp):
                attrs[core.TP_SRC] = p.type
                attrs[core.TP_DST] = p.code
            else:
                attrs[core.TP_SRC] = 0
                attrs[core.TP_DST] = 0
    else:
        attrs[core.NW_SRC] = 0
        attrs[core.NW_DST] = 0
        attrs[core.NW_PROTO] = 0
        attrs[core.TP_SRC] = 0
        attrs[core.TP_DST] = 0
    return attrs

Now, if we're just making a basic switch, this does way more than we need - why would a switch care about layer 4 protocols? Fortunately, open vSwitch on the Pronto ignores most of it because it uses DL_TYPE=0x8100 (which means the packet is 802.1q VLAN tagged, and the actual ethertype is 4 bytes further up), but having the wrong DL_TYPE is why nothing ends up matching the flow...
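The 0x8100 offset problem is just the standard 802.1Q frame layout (this is a sketch of the framing, not NOX code): when the ethertype field holds 0x8100, the real ethertype sits 4 bytes further into the header, after the tag.

```python
# Read the real ethertype behind an optional 802.1Q tag.
import struct

def real_ethertype(frame):
    (etype,) = struct.unpack_from("!H", frame, 12)  # after dst+src MACs
    if etype == 0x8100:                             # VLAN tag present
        (etype,) = struct.unpack_from("!H", frame, 16)  # skip TPID+TCI
    return etype

# 6B dst + 6B src + 0x8100 tag (VLAN 1) + inner ethertype 0x0806 (ARP)
frame = bytes(12) + struct.pack("!HHH", 0x8100, 1, 0x0806)
print(hex(real_ethertype(frame)))  # 0x806
```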

Util.py needs to be fixed to interpret VLANs properly, but in the meantime, pyswitch will work fine as a simple layer two switch if we use a cut-down version of the extract_flow function. And here it is:

def create_l2_out_flow(ethernet):
    attrs = {}
    attrs[core.DL_DST] = ethernet.dst
    return attrs

Simple, right? Now we use this instead of extract_flow, and then we can walk through what the function does in detail:

def forward_l2_packet(dpid, inport, packet, buf, bufid):    
    dstaddr = packet.dst.tostring()
    if not ord(dstaddr[0]) & 1 and inst.st[dpid].has_key(dstaddr):
[...]

    else:  
        # haven't learned destination MAC. Flood
        inst.send_openflow(dpid, bufid, buf, openflow.OFPP_FLOOD, inport)


This pulls the destination MAC address out of the packet, converts it to a string, and checks that the low bit of the first octet is 0, meaning unicast. If so, it checks whether it has learnt that MAC before, and if it has, then we can proceed. Otherwise, it floods to all ports - correct for both broadcast/multicast and unknown MAC addresses.

        prt = inst.st[dpid][dstaddr]
        if  prt[0] == inport:
            log.err('**warning** learned port = inport', system="pyswitch")
            inst.send_openflow(dpid, bufid, buf, openflow.OFPP_FLOOD, inport)

If the destination MAC is assigned to the source port then something is weird (either a spoof or a loop in the network), so behave like a hub for this packet

        else:
            # We know the outport, set up a flow
            log.msg('installing flow for ' + str(packet), system="pyswitch")
            # sam edit - just load dest address, the rest doesn't matter
            flow = create_l2_out_flow(packet)
            actions = [[openflow.OFPAT_OUTPUT, [0, prt[0]]]]
            inst.install_datapath_flow(dpid, flow, CACHE_TIMEOUT,
                                       openflow.OFP_FLOW_PERMANENT, actions,
                                       bufid, openflow.OFP_DEFAULT_PRIORITY,
                                       inport, buf)

This is the switch part - we create our very specific flow with our new function (just the destination MAC address - not all 10 or so parts to match on), set the action to output to the correct port, then call install_datapath_flow (part of nox::lib::core::Component), which sends back the new flow and instructions on where to send the packet. All done, and works well, except for one thing:

Open vSwitch DoS (probably one of many)
The problem with OpenFlow that everybody points out is that you can only really send 10 packets per second to your controller. You can try and optimise this if you want, but this switch-controller connection is where the battle will be fought to make OpenFlow perform better. I didn't think this would be a problem with the Pronto, because I assumed that open vSwitch would process packets somewhat like this:


  1. Find flow for packet - if found, follow the actions and go to next packet
  2. Send packet to controller
  3. Get packet and flow back from controller, follow instruction for this packet and install flow
  4. Go back to 1 for next packet.
Unfortunately, it appears that open vSwitch does things a little differently:

  1. Find flow for packet - if found, follow the actions and go to next packet
  2. Send packet to controller
  3. Get packet and flow back from controller, follow instruction for this packet and add flow to some queue somewhere
  4. Go back to 1 for next packet
  5. If no more packets waiting, look at the queue and install the flow
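The difference between the two models above can be illustrated with a toy simulation (a hypothesis about the observed behaviour, not Open vSwitch internals): if flow installs are deferred until the input goes quiet, a sustained burst to one destination sends every packet to the controller.

```python
# Toy model of immediate vs deferred flow installation.
def packets_to_controller(packets, install_immediately):
    flows, pending, hits = set(), set(), 0
    for dst in packets:
        if dst in flows:
            continue                 # matched in hardware, line speed
        hits += 1                    # packet goes to the controller
        if install_immediately:
            flows.add(dst)           # model 1: flow active right away
        else:
            pending.add(dst)         # model 2: queued, installed later
    flows |= pending                 # queue only drains after the burst
    return hits

burst = ["B"] * 1000                 # back-to-back UDP to one host
print(packets_to_controller(burst, True))   # 1
print(packets_to_controller(burst, False))  # 1000
```

With immediate installs only the first packet misses; with deferred installs all 1000 packets hit the controller, which matches what the UDP iperf test below shows.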
Surprisingly enough, this works fine for TCP - the 3-way handshake gives the switch enough downtime to install the flow, and get ready for the influx of data. However, if you surprise it with 500Mb/s of UDP iperf, you find the receiving server only getting ~150Kb/s, every single packet going to the controller, and no flow being installed!

Fortunately, the staff at Pronto have been awesome to work with, so I'm hoping we'll get a solution soon, and in the meantime, I'll try to find a workaround myself. If you're testing and stuck in a similar situation, either start off with a little UDP test first, or even ping the other host before starting your iperf - this will set the flows, and then you can send as much data as you like!

Monday, March 12, 2012

Openflow with NOX & Pronto/Pica8

We've got a Pronto 3290 at work, and with Josh Bailey's help I've been getting it talking Openflow to a NOX controller running pyswitch.

I figure the more I write about it, the more sense it'll make, so here's a summary of how far I've come:


  • The Pronto runs Open vSwitch, which lets you add your own flows manually - makes it easy to see what flows your controller has added too. They're supposedly going to add Openflow v1.2 support soon, which means IPv6!
  • NOX doesn't find the Python bindings for OpenSSL on Ubuntu 11.10 (oneiric) in its current branch, but the destiny branch does - a bit of Git skill will sort this out for you
  • Wireshark has an OpenFlow dissector which is part of the OpenFlow code, but it doesn't work with newer versions of Wireshark, you'll need this patch to make it build - confirmed working on Ubuntu 11.10
  • Pyswitch (included as part of NOX) doesn't send back the right flows to the pronto - it sets the ethertype as 0x8100, so the flows look like this: idle_timeout=5,priority=65535,in_port=8,dl_vlan=1,dl_vlan_pcp=0,dl_src=00:XX:XX:XX:XX:XX,dl_dst=00:YY:YY:YY:YY:YY,dl_type=0x8100 actions=output:3 - this is where I'm going to start modding pyswitch
And this is where I am now. The plan for the next few weeks (which will probably change) is going to be something like this:
  1. Make pyswitch send correct Openflow data
  2. Mod pyswitch (or a demo router app) to do some basic routing and ACL
  3. Hope that someone has written a BGP Openflow app so that I don't have to - otherwise, look at options for this
I'll be back with more details