I ran into a switching problem in a vSphere 4.1 environment that highlights the challenges an IT organization faces when trying to deliver next-generation Software Defined Networks (SDNs). SDNs have the potential to revolutionize data centers, cloud services, and the applications riding on those networks. SDNs offer applications the convenience of hardware abstraction, much the way an operating system abstracts memory management. In theory, applications will make calls to the network the same way they make calls to the OS, and the network will handle the details. A potential use case: an enterprise orders an access link and is able to run multiple vendor products over that link without actually doing any of the provisioning work. An application might request that a VPN be established with a new business partner; the network team would not actually have to go and establish that VPN, it would be done via SDN. I'll save some detailed use cases for another blog post.
Before we get to that point, most cloud providers are looking at SDN for their own data centers. VMware's approach to SDN is highlighted by its purchase of Nicira, which develops a virtual switch called Open vSwitch that runs on the hypervisor. From a logical perspective, a virtual switch is just like a physical switch. All of your virtual hosts connect to the switch, and you can create Access Control Lists, multicast configurations, and VLANs, for example. In addition, you can "trunk" your virtual switch to your physical network via the Ethernet port of your physical server. The Cisco Nexus 1000V is another example of a virtual switch. The Nexus 1000V integrates into your network management stack just as a physical switch would. If a Network Operations Center (NOC) engineer were to telnet to the switch and view the configuration, it would look just like any of the other switches on your network.
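To make the "logically just like a physical switch" point concrete, here is a minimal sketch of what that configuration looks like with Open vSwitch's `ovs-vsctl` tool. The bridge name (`br0`), uplink NIC (`eth0`), VM interface (`vnet0`), and VLAN tag are all assumptions for illustration; your names will differ.

```shell
# Create the virtual switch (a "bridge" in Open vSwitch terms)
ovs-vsctl add-br br0

# "Trunk" the virtual switch to the physical network
# by attaching the server's physical Ethernet port
ovs-vsctl add-port br0 eth0

# Attach a VM's virtual interface as an access port on VLAN 10
ovs-vsctl add-port br0 vnet0 tag=10

# View the resulting configuration, much as you would
# run "show running-config" on a physical switch
ovs-vsctl show
```

The parallel to a physical top-of-rack switch is deliberate: ports, VLAN tags, and a viewable configuration, just with no dedicated hardware underneath.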
Simple concept – until something breaks. Most IT organizations have separate teams that manage the network and server hardware. This has been a challenge since before the concept of virtual switches: as vendors started to integrate switches into their blade servers, integration and support issues have bounced from one stack to the other, and finding a single resource to take ownership has been difficult. Virtual switches and SDNs complicate the issue further, as now there is no physical hardware directly tied to the switch.
What happens when your network team is trying to troubleshoot a performance issue that looks like a hardware bottleneck, and your virtualization solution moves the control plane to another node in your cluster? What about when the ARP cache of your physical switch is corrupt, and the server team has no insight into that environment, or has to prove that its virtual switches are not the issue? The bottom line is that organizations have to put some serious thought into training and organizational structure before implementing next-generation virtual switch environments. More than ever, these teams need to be highly integrated and virtualization-savvy to support software-controlled data centers. What challenges has your IT organization faced with virtualizing the data center?
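As a concrete illustration of that visibility gap, here is a sketch of the first-pass checks each team might run when tracing a suspect ARP or MAC-learning problem across the virtual/physical boundary. The bridge name `br0` is an assumption, and the Cisco commands are shown as comments since they run on the physical switch, not the hypervisor host.

```shell
# Server/virtualization team, on the hypervisor host:

# Inspect the virtual switch's MAC learning table (Open vSwitch)
ovs-appctl fdb/show br0

# Inspect the host's kernel ARP/neighbor cache
ip neigh show

# Network team, on the physical top-of-rack switch (Cisco IOS):
#   show ip arp        <- view the switch's ARP cache
#   clear ip arp       <- flush a suspect ARP cache
```

Each side can only see its own half of the path, which is exactly why ownership disputes arise when the fault sits at the boundary.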