Network Operating System?

I’ve just begun dealing with Software Defined Networks (SDN) for my Master’s thesis, and I’m experimenting on top of Floodlight, an open-source OpenFlow controller from Big Switch Networks. In OpenFlow, a logically centralised entity known as the controller controls the forwarding tables of a set of switches that speak OpenFlow. OpenFlow applications then talk to the controller through a controller-specific API to ‘program’ the network (that is, to manipulate the forwarding tables on the switches). The high-level architecture looks something like this:

Just like an operating system abstracts away the complexities of the underlying hardware for a user-space application, the controller abstracts away the complexities of the network for OpenFlow applications. For this reason, the controller is often referred to as a “network operating system”. Applications have some API to talk to the network-OS, and it translates those APIs into OpenFlow commands that control the switches.
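To make the analogy a little more concrete, here is a minimal sketch of that translation step. It is a toy model, not Floodlight's API: `ToyNetworkOS`, `FlowMod`, `route_to` and the `next_hop_port` table are all names invented for illustration, and the "OpenFlow commands" are just Python objects collected in a list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowMod:
    """A simplified stand-in for an OpenFlow flow-table modification."""
    switch: str
    match: tuple   # e.g. (("ip_dst", "10.0.0.2"),)
    action: str    # e.g. "output:2"


@dataclass
class ToyNetworkOS:
    """Hypothetical controller: maps a high-level request to per-switch rules."""
    # next_hop_port[switch][destination] -> output port on that switch
    next_hop_port: dict
    sent: list = field(default_factory=list)

    def route_to(self, dst_ip: str) -> list:
        """Application-facing call: forward traffic for dst_ip on every switch."""
        mods = [
            FlowMod(switch, (("ip_dst", dst_ip),), f"output:{ports[dst_ip]}")
            for switch, ports in sorted(self.next_hop_port.items())
            if dst_ip in ports
        ]
        # A real controller would serialise these and send them over OpenFlow.
        self.sent.extend(mods)
        return mods


nos = ToyNetworkOS({"s1": {"10.0.0.2": 2}, "s2": {"10.0.0.2": 1}})
mods = nos.route_to("10.0.0.2")
```

The application only ever says "route to this address"; which switches get which entries is the controller's problem, much as a user-space program calls `write()` without caring about disk blocks.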

For my thesis, the plan for my architecture was to have two applications that provide different services to the network and are expected to run simultaneously. Both of them collect information from the OpenFlow switches, and from some framework-specific agents situated at the edges of the network, to make optimisation decisions. But as soon as I implemented one of the applications, it was clear that I had no straightforward way of ensuring that the two applications wouldn’t make decisions that counteract each other. Although I really don’t like the idea of doing this, the easiest way to solve this is to wrap both applications into one. And from the looks of it, this is a problem that hasn’t been solved yet.

Controllers like NOX and Onix make the assumption that only one OpenFlow application is running on a given network at any point in time. This is a reasonable assumption from a systems perspective. But what’s gotten me confused is how OpenFlow applications fit into the “SDN for enterprises” picture. I was under the impression that a network operator using a particular controller could choose between different third-party OpenFlow applications to handle different complexities in the network: a load-balancing application from vendor A for the edge, a routing daemon application from vendor B, and so forth. While these are relatively orthogonal applications, it looks like it’s possible for two OpenFlow applications to make decisions that adversely affect each other (leading to oscillations in switch state). Floodlight allows you to run multiple applications at the same time, but leaves it to the developer (or user?) to ensure that applications can safely co-exist with each other.
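A rough sketch of the kind of clash I have in mind: two applications install rules on the same switch whose matches overlap but whose actions differ. The model below is deliberately simplified and hypothetical (`Rule`, `overlaps` and `conflicts` are my own names). It handles only exact-match fields, treats an absent field as a wildcard, and ignores OpenFlow rule priorities, which in a real switch would decide which entry wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    app: str
    switch: str
    match: frozenset   # (field, value) pairs; an absent field is a wildcard
    action: str


def rule(app, switch, action, **match):
    """Convenience constructor: keyword arguments become match fields."""
    return Rule(app, switch, frozenset(match.items()), action)


def overlaps(a: Rule, b: Rule) -> bool:
    """True if some packet could match both rules (exact-match fields only)."""
    fa, fb = dict(a.match), dict(b.match)
    return all(fa[k] == fb[k] for k in fa.keys() & fb.keys())


def conflicts(rules):
    """Yield pairs of rules from different apps that overlap but disagree."""
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            if (a.app != b.app and a.switch == b.switch
                    and a.action != b.action and overlaps(a, b)):
                yield a, b


rules = [
    rule("load_balancer", "s1", "output:2", ip_dst="10.0.0.2"),
    rule("routing", "s1", "output:3", ip_dst="10.0.0.2", tcp_dst=80),
    rule("routing", "s2", "output:1", ip_dst="10.0.0.2"),
]
found = list(conflicts(rules))
```

Here the load balancer and the routing application both claim web traffic to 10.0.0.2 on switch s1 and send it out of different ports. If each application reacts to the other's change by reinstalling its own rule, the switch state oscillates.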

So again, if my observation isn’t mistaken, how do OpenFlow applications fit cleanly into the SDN ecosystem? How can I manage my network using building blocks of applications from different vendors? Will I need to rely on OneBigApplianceFromBigBadVendor per network? Does this necessitate something analogous to per-process resource allocation as in traditional operating systems? I can see that FlowVisor-style slicing is one way to go about it, but will that suffice?
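For reference, the slicing idea can be sketched as a controller that admits a rule only if it falls inside the part of the flowspace assigned to the requesting application. This is an illustration of the concept under heavy simplification, not FlowVisor's actual interface: `SlicedController` and `install` are hypothetical names, and a slice here is just a list of IPv4 destination prefixes.

```python
import ipaddress


class SlicedController:
    """Hypothetical wrapper that only admits rules inside an app's slice."""

    def __init__(self, slices):
        # slices: app name -> list of IPv4 destination prefixes it may touch
        self.slices = {
            app: [ipaddress.ip_network(p) for p in prefixes]
            for app, prefixes in slices.items()
        }
        self.installed = []

    def install(self, app, dst_prefix, action):
        """Install the rule if dst_prefix lies within one of the app's prefixes."""
        wanted = ipaddress.ip_network(dst_prefix)
        allowed = any(wanted.subnet_of(p) for p in self.slices.get(app, []))
        if allowed:
            self.installed.append((app, str(wanted), action))
        return allowed


ctl = SlicedController({
    "load_balancer": ["10.0.1.0/24"],
    "routing": ["10.0.2.0/24"],
})
accepted = ctl.install("load_balancer", "10.0.1.16/28", "output:2")
rejected = ctl.install("load_balancer", "10.0.2.0/25", "output:2")
```

Slicing like this rules out overlap by construction, but it does nothing for two applications that legitimately need to act on the same traffic, which is exactly the case I am worried about.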

So what *should* the network operating system do here? Let the applications run wild and fight it out? Or provide some mechanism to enforce policies between applications?
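To give the second option some shape, here is one possible mechanism, purely as a sketch: the operator ranks the applications, and a layer in front of the controller refuses a rule when a higher-ranked application already owns the same table entry. `Arbiter`, `request` and the priority numbers are all hypothetical, and the sketch only catches rules with an identical match string, not partially overlapping ones.

```python
class Arbiter:
    """Hypothetical layer between applications and the controller."""

    def __init__(self, app_priority):
        self.app_priority = app_priority   # app name -> int, higher wins
        self.table = {}                    # (switch, match) -> (app, action)

    def request(self, app, switch, match, action):
        """Accept the rule unless another app of equal or higher rank owns it."""
        key = (switch, match)
        owner = self.table.get(key)
        if owner is not None and owner[0] != app:
            if self.app_priority[owner[0]] >= self.app_priority[app]:
                return False
        self.table[key] = (app, action)
        return True


arb = Arbiter({"routing": 10, "load_balancer": 5})
first = arb.request("routing", "s1", "ip_dst=10.0.0.2", "output:3")
second = arb.request("load_balancer", "s1", "ip_dst=10.0.0.2", "output:2")
```

A static ranking is the bluntest policy imaginable; the open question is what a general policy language between applications would look like.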

If I am indeed mistaken in my assumption, please do let me know what I’m missing here! 🙂


8 thoughts on “Network Operating System?”

  1. Mark Berly

    You have hit on one of many issues that need to be worked out with OpenFlow/SDN. There are different schools of thought on how to solve this problem.

    1) Create a network with tons of bandwidth and let the hypervisor provide the intelligence, tunneling everything over the network. This is a server-centric view of things: the smart virtual edge solution.

    2) Create a network with intelligence at the edge that allows policies to be pushed to the first-hop switch for enforcement: the smart physical edge.

    3) Create tight integration at all layers: aggregation and edge (both physical and virtual).

    The difficulty increases as you move from #1 to #3. As you are finding out, you need either to have the controller aware of not only the network but also the application requirements, OR to have integration with the network that allows policy enforcement, to ensure required SLAs are met.

      1. Jason Edelman (@jedelman8)

        I’ll admit, I read your post earlier from the road, but now that I’ve re-read it: what Rexford was working on was making changes across multiple nodes of a distributed network, not handling changes from different applications to the same controller. However, there may be similarities.

  2. md1clv

    It seems to me that there will need to be an arbitrator application which talks to the controller, and the other applications will talk to the arbitrator instead of directly to the controller.

    The arbitrator’s job would be to make sure that the “right” switching decision gets passed to the controller when there are multiple conflicting options.

    1. lalithsuresh Post author

      This is pretty much what my architecture seems to have boiled down to. It looks something like this:

      | App1 | App2 |
      Master App
      network of switches

      The master app exposes a set of primitives and interfaces (for statistics collection, for instance) to the ‘upper’ applications, which operate solely based on those. Of course, I have no clue how to generalise this, because all the applications in my system, including the master, are doing something very scenario-specific and are doing more than changing flow tables (which I guess is making it more complicated :)).

