InCNTRE Blog

May 7, 2013

April 11, 2013
BLOOMINGTON, Ind. — Indiana University shares its expertise in advanced networking with the launch of a new lecture series for state business and industry leaders. Hosted by IU's Indiana Center for Network Translational Research and Education (InCNTRE) and Global Research Network Operations Center (GlobalNOC), the IU NetTalk Series will kick off at 6 p.m. on May 7 in the ICTC building on the IUPUI campus.
"With the help of IU experts, Indiana companies will gain a greater understanding of computer networking and how it can drive their business," said Steve Wallace, InCNTRE acting executive director. "New technologies are constantly emerging, so we are excited to share our experiences and build relationships with business leaders across the state."
The first no-cost, public lecture will focus on OpenFlow and Software-Defined Networking (SDN)—both hot topics in the computing field. Companies like Google and Facebook have already adopted this technology to create more dynamic and personalized networks.
As a leader in the study and commercial application of SDN technology, IU will help Indiana businesses translate research into practical solutions that improve their networking capabilities. After the lecture, speakers will offer a guided tour of InCNTRE's SDN Interoperability Lab and the GlobalNOC.
The next lecture in the series will take place on July 30. Speakers will explore the challenges of sending large files over the Internet at high speeds and offer advice on how to alleviate these issues. Future topics will include cloud computing, building fiber networks and emerging network protocols.
Slides can be downloaded from http://goo.gl/1tXY0.

February 2, 2013

FOR IMMEDIATE RELEASE
January 22, 2013
 
INDIANAPOLIS — The Indiana Center for Network Translational Research and Education (InCNTRE) at Indiana University will host its first Software-Defined Networking (SDN) Lab PlugFest Feb. 19-21 in Indianapolis. The event will allow SDN Lab members to test the interoperability of their products with Big Switch Networks’ platform for SDN, the Big Network Controller. Big Switch Networks is a founding member of the SDN Lab.
 
“The PlugFest is another way that InCNTRE works to advance SDN technologies while fostering community among vendors,” said Steve Wallace, acting InCNTRE executive director. “Big Switch's recent product launch is garnering a great deal of industry attention, and our members are expressing interest in testing their switches with the new controller. We anticipate this event being the first of many highlighting our members’ technology.”
 
InCNTRE seeks to advance SDN and to enable researchers, students and IT staff within the academic community to innovate with SDN. In recent years, SDN has emerged as a way for organizations to save money and time by building customized and dynamic networks. The center’s research, education and testing activities are advancing the industry while training the network engineers and developers of the future. In addition, InCNTRE's SDN Interoperability Lab is the world's only neutral, third-party SDN testing facility.
 
"The InCNTRE team has been providing independent interoperability testing since before the industry heard of software-defined networking or its effect on network virtualization,” said Rob Sherwood, principal architect at Big Switch Networks. “We are gratified that InCNTRE has selected the Big Network Controller for its first single controller interoperability showcase in response to an unprecedented number of requests.”
 
About Big Switch Networks
Big Switch Networks is a leader in open software-defined networking. The company’s open SDN platform embraces industry standards, open APIs, open source and vendor-neutral support for both physical and virtual networking infrastructure. The Big Switch Networks product suite supports a broad range of networking applications, including network virtualization for public and private cloud data centers built upon OpenStack, CloudStack and other platforms. For more, see http://www.bigswitch.com/.
 
Related links
http://incntre.iu.edu/

December 1, 2012

Press releases

InCNTRE SDN Lab to open doors for non-ONF members

December 8, 2014

The Indiana Center for Network Translational Research and Education (InCNTRE) Software-Defined Networking (SDN) Lab is set to reap the benefits of the Open Networking Foundation’s (ONF) recent decision to extend the OpenFlow Conformance Testing Program to non-members.

The conformance test gives networking vendors an opportunity to show that their product works with a particular version of OpenFlow. Verification by InCNTRE will now authorize non-ONF members to use the OpenFlow conformance logo and publicize their OpenFlow conformance.

"The more vendors that enter this space, the more momentum SDN and OpenFlow will have behind it — and that is good for the entire industry," said InCNTRE Director Ron Milford. The ONF selected the InCNTRE SDN Lab as the first independent lab approved for testing in 2012.

Read more on IT News


Open Networking Foundation Drives Commercialization of SDN and the OpenFlow™ Protocol at Third PlugFest

ONF Sees Increasing Implementation of OpenFlow 1.3 in Commercial and Test Controllers and Switches

PALO ALTO, Calif., June 19, 2013 — The Open Networking Foundation (ONF), a non-profit organization dedicated to promoting Software-Defined Networking (SDN), completed its third semi-annual PlugFest designed to drive interoperability, deployment, and commercialization of SDN and the OpenFlow™ protocol. Hosted June 3-7 at the Indiana Center for Network Translational Research and Education (InCNTRE), the first ONF certified lab for conformance testing, the event was attended by nearly 50 network engineers from 20 member companies with the common goal of ensuring that new SDN protocols work across all of their products. This year's event saw more than 90 percent of member companies participating in testing of OpenFlow 1.3.

Read more on OpenNetworkingFoundation.org


Indiana University and Orange Silicon Valley researchers unveil faster, more efficient method to move big data

December 3, 2012

BLOOMINGTON, Ind. — From predicting the path of severe weather to creating drugs that combat disease, big data is critical to the discoveries that improve human life. However, the current production of digital data exceeds the ability to move it over computer networks. A new Indiana University-business collaboration is changing that dynamic.

A recent networking breakthrough from IU researchers, in collaboration with Orange Silicon Valley and DataDirect Networks, showed that data sharing can be faster and more efficient over wide area networks (WAN). The team performed the world's first demonstration of RDMA (Remote Direct Memory Access) over Converged Ethernet (RoCE) across a wide area network using the Lustre file system.

The advancement came at the recent Supercomputing 12 (SC12) conference in Salt Lake City. SC12 is one of the most important events in the field of advanced computing, attracting thousands of attendees from around the world.

Read full story at the IU Newsroom


IU recognized by Internet2 for role in new 100G network

November 8, 2012

BLOOMINGTON, Ind.—Indiana University networking experts were recently recognized by Internet2 for their efforts to enhance broadband connectivity and support advanced services and cloud applications across the United States. Their efforts will help provide advanced networking features for more than 200,000 of the country's community anchor institutions, including libraries, hospitals, K-12 schools, community colleges and public safety organizations.

Leaders at Internet2, operators of the nation's fastest, coast-to-coast research and education network, praised IU's contributions at a recent ceremony that launched the United States' first 100 Gigabit per second (Gbps) open, transcontinental, software-defined network. IU's Chris Robb and Steve Wallace were both singled out for their efforts.

"We are honored that Internet2 chose to commend IU," said IU Associate Vice President of Networks David E. Jent, who was also recognized at the event. "The entire IU team has done a fantastic job helping to advance research and innovation across the United States with the lighting of this new network, and we look forward to many more collaborations with Internet2."

Read more on the IU Newsroom

 

April 24, 2012

Attending a side meeting during the Open Networking Summit, I asked an industry insider (principal of a startup specializing in OpenFlow controllers) which Ethernet switches we should be considering for our next campus upgrade. His response included not vendors or model numbers, but a description of the capabilities of a new chipset. He was talking about merchant silicon, not the product road map of any switch vendor. This conversation starts to mirror those between a hypervisor vendor (e.g., VMware) and someone planning for their next data center upgrade. It's not which Dell or HP model to consider, but rather which CPU architecture provides the best performance and features.

Today, Ethernet switch vendors are more likely to obscure their use of merchant silicon, putting their baked-in network features and packaging at the forefront. SDN, and open standards such as OpenFlow, are likely to change both vendor and customer behavior. Want to see the future dialog between switch vendors and customers? Observe how the data center plans its next upgrade. I wonder if the big vendors will carry a logo such as "Broadcom inside" in the near future :-)

April 24, 2012

During the plenary session at the Open Networking Summit, Urs Hölzle, Google Senior Vice President, Technical Infrastructure & Google Fellow, described Google's 100 percent transition to OpenFlow for their inter-data center network. Google maintains two networks: one through which users access Google services such as search and Google Apps, and one dedicated to transferring data among their data centers. The inter-data center network is under Google's control end-to-end, making it the first choice for transitioning to a new technology.

Dr. Hölzle laid out Google's case for the transition to OpenFlow (e.g., better control and greater efficiency), but Google's purpose for such a detailed disclosure became clearer during the question and answer session. Google's presentation included a rare photo of a Google-built Ethernet switch. For years many have speculated that Google was building its own network equipment, and displaying a photograph of a Google-built switch was all the bait the audience needed. Shortly into the Q&A session, the focus shifted to Google's switch. With the hook set and an audience of 900 attendees at center stage, Dr. Hölzle signaled that the market was expecting high-density, OpenFlow-equipped switches. Google only built these switches because they couldn't buy them; they plan to buy switches, rather than build their own, the next time around.

October 26, 2011
I spent last week at the Open Networking Summit and Open Networking Foundation Member Workday and came home with a head full of ideas to write about.   Here's my first set....

At ONS I heard the term DevOps mentioned many times.  In fact, Stacy  from GigaOM (who btw was an excellent moderator for the panel I participated in) wrote a post about it as well (See "Does networking need a devops movement").  Perhaps it's because I live in the corn fields of Indiana or because I really live in my own little world most of the time, but I actually had to Wikipedia the term DevOps (Can you use Wikipedia as a verb ?).   In any case, when I read the Wikipedia article I thought, DUH OF COURSE !!  

You see, this is how we've been developing network management software at the GlobalNOC at Indiana University for the past 12 years.  For those of you who don't know us, the GlobalNOC partners with universities and non-profits to help them manage their networks by providing NOC and network engineering support.  Very early on, as we grew from working with 1 network to 3 or 4 networks, it became painfully obvious that we would need a very good, highly customized set of software to help us manage these very diverse networks.

This started out with a few network engineers (like myself) who had CS backgrounds hacking together some open-source software with way too much homegrown Perl scripting for our own good.  We wrote the software, we supported the software and we used the software to manage networks !  Over time, as the team grew and the software became much more sophisticated, it was necessary to split the developer/sysadmin team from the network engineering team, but they are still very closely linked.

I'm no longer directly involved with the developer/sysadmin team (SYSENG as we call them), but the parts of the development process described in the DevOps article on Wikipedia, such as reduced release scope with smaller changes more often and automation of deployment, sound very familiar :)  I'm told we rolled out something like 40 software releases into production last year.  In addition, the "users" of the software, i.e. the network operators and engineers, have a very close link to the developers/sysadmins, making it easy to exchange ideas for new features and to track down bugs.

One of the big reasons I've been excited about SDN is the possibility that it could be used to apply these same DevOps principles to the development of control-plane software for networks.  If I'm not mistaken, this is how IP networking started in the first place (or so I'm told) !  I'm certainly not old enough to have been around during the early days of the Internet.  However, I was very fortunate to start my career with a company called ANS, which operated the NSFnet, and had the privilege of working with many of the people who were.

From what I've heard, the early IP networks were run by computer scientists who both operated the network and wrote the software that powered them.  The process of getting new features into the software didn't involve hundreds (or thousands) of network engineers submitting ideas for new features to sales reps, product managers trying to distill these requests into the actual features that will go into the software, developers building the software, QA teams testing the software, etc - only to have the QA teams at the customer sites do their own extensive QA tests to verify the product works properly in their unique environments - before the feature could be put into production.

Now, I don't know for sure, but I suspect the process "in the good ole days" led to a few more bumps in the road than would be acceptable in today's production networks.  However, I think SDN at least (re)opens the possibility of a tighter "loop" between the people who manage networks and the people who build the software that powers them, and IMO that would be a good thing !

At the Open Networking Summit last week, the Indiana University team demo'd our first example of SDN software developed using this methodology, and it's already deployed on a nationwide, production backbone network called NDDI.  We'll be demo'ing our next piece of SDN software at GEC12 next week, which solves a specific issue on campus LANs.  Our plan is to have it running in production in the Indiana University network by the end of November.  From the internal demo session this morning, it looks like we're very much on track to make that happen.

In the GigaOM article I referenced earlier, the question was posed as to "where in the networking world these developers would come from".   Well, Stacy, we're growing them today, right here in the corn fields of Indiana !! 

  
March 16, 2011
After multiple cab rides and a scenic tour of several areas of San Juan I wouldn't dare go after dark, I made it to the Sheraton Conference Center (not to be confused with the Sheraton Old Town) in time to set up for the Tuesday night demos.

IU had 3 demos last night: GENI Meta Operations Center (GMOC), NetKarma and Openflow Campus Trials.  I was responsible for the Openflow Campus Trials demo and, as luck would have it, nearly all of the monitoring work Chris Small has done is now rolled into the GMOC tools and was on display in the GMOC demo.  Also, John Meylor and Camillo Viecco were on hand and fully prepared to answer questions about Measurement Manager, so I was able to spend quite a bit of time checking out the other demos.

Ericsson had a really compelling demo of their MPLS implementation for Openflow that utilizes NOX and Quagga.  UQAM unveiled an EZchip network-processor-based 100G switch that fully implements the Openflow 1.1 spec.  As far as I know, this is the first complete implementation of Openflow 1.1, and at 100Gbps no less !!  Sounds like they still have some more work to do before this is ready for use, but it looks like they could have a really compelling platform.


I had a number of really good discussions about Openflow that continued through dinner and a drink in Old San Juan.  Now on to the real work of the conference on Wednesday !
March 14, 2011
Openflow is really just one piece in a much broader architecture known as Software-Defined Networking (SDN).  The concept of SDN is actually very simple and is explained quite clearly in a number of presentations by Nick McKeown from Stanford - amongst others.  See http://tinyurl.com/4ekzb9w

The basic idea is that today's networking products mirror the mainframe computer industry of decades past.  Vendors deliver packages of proprietary hardware, operating systems and applications bundled together.  Unlike the PC industry of today, you cannot choose to buy network hardware from one vendor, an operating system from another vendor and applications from a third vendor.  In fact, in many cases it becomes difficult to distinguish a product's hardware from its operating system from its applications.

So the idea of SDN is to create open interfaces between these layers so the networking market resembles the PC market with competition at each layer of the stack.  Openflow is the open interface between hardware and the network operating system - perhaps roughly analogous to x86.  As network operating systems develop, as several currently are, there will hopefully be open APIs that application developers can use to build functionality on top of the operating systems.  In theory, this should be good for the consumers of networking products because they should have more options and more competition should lead to reduced costs.  As a consumer of networking products, I'm all in !   But will theory and reality actually match up or will something get lost in between ?
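To make that layering concrete, here is a toy sketch in Python (illustration only; this is not the OpenFlow wire protocol or any real controller API): the "hardware" is nothing but a match/action table, and a separate "network operating system" decides what gets installed in it.

```python
# Toy model of the SDN layering described above (not the OpenFlow protocol):
# dumb hardware exposes a match/action flow table; the network OS programs it.
from dataclasses import dataclass

@dataclass
class FlowEntry:
    match: dict   # e.g. {"dst_mac": "aa:bb:cc:dd:ee:ff"}
    action: str   # e.g. "output:2" or "drop"

class Switch:
    """Forwarding hardware: just a table that a controller programs."""
    def __init__(self):
        self.table = []

    def install(self, entry):
        # This call is the open interface between the layers (OpenFlow's role)
        self.table.append(entry)

    def forward(self, packet):
        for entry in self.table:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return entry.action
        return "send-to-controller"   # a table miss gets punted up to the network OS

# The "network operating system" decides policy and programs the hardware
sw = Switch()
sw.install(FlowEntry({"dst_mac": "aa:bb:cc:dd:ee:ff"}, "output:2"))
print(sw.forward({"dst_mac": "aa:bb:cc:dd:ee:ff"}))   # -> output:2
print(sw.forward({"dst_mac": "11:22:33:44:55:66"}))   # -> send-to-controller
```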

I'm a firm believer in learning from history, so let's take a look for a minute at what happened in the world of wireless controllers to see if we can glean any useful knowledge from that experience.

Of course there was CAPWAP which, in some ways, is analogous to Openflow in that it was meant to be an open interface between controllers and APs.  As I understand it, if CAPWAP had achieved success as a multi-vendor interoperable standard, you could have had a controller from vendor A and APs from vendors A, B, and C and they would have played nicely together.  Of course this didn't happen and consumers have to purchase APs and controllers from the same vendor.

However, the lack of controller-to-AP interoperability is not the most troubling interoperability problem associated with controller-based wireless systems.  What happens when I have 30 controllers and 4,000+ APs deployed on a large university campus and then decide to switch wireless vendors ?  Sure, the fact that there's no competition at both the AP and controller layers means I'm probably paying more than I otherwise might.  But, technically, it's not really a problem deploying the new controllers and having the new APs associate with the new controllers.  That's easy enough.

The real problem is that the controllers from vendor A and vendor B do not interoperate in any meaningful way.  Sure, because they both do basic IP forwarding, a wireless client on one system can communicate with a client on the other system.  But none of the features (i.e. applications) that drove me to choose a controller-based system over stand-alone APs in the first place are supported across both systems.  Features like RF management and layer-3 roaming do not work across wireless systems.  So, if during the transition between vendors two adjacent buildings are on different systems, users can't roam seamlessly between buildings, there can be RF interference issues, and captive portal logins aren't maintained, forcing users to re-authenticate simply because they moved from one building to another.  As a result, many organizations have become locked in to their current controller-based wireless vendor.  The amount of effort required to switch vendors is high enough that they're willing to stay with their current vendor unless they're REALLY unhappy with them or can save a TON of money by switching.  Anecdotally, I hear lots of people complaining about their wireless systems and almost none of them are considering changing vendors !!

So what does this have to do with Software-Defined Networking and Openflow ?  Well, there are in fact a lot of similarities.  Openflow allows a single, software-based network operating system to control potentially hundreds or thousands of hardware devices - not unlike a wireless controller.  Novel new applications can be rapidly developed on top of the network operating system to allow new functionality and more efficient management of networks - not unlike wireless controllers.  But, if the applications are tightly coupled to the network operating system (as is the case with wireless controllers) and if customers do not sufficiently compel the software vendors to make their products interoperate at an "application" level, consumers could be left in the same vendor lock-in situation they're currently experiencing with controller-based wireless systems.

At this point, those of you who know me are probably thinking, "Man, when did Matt get so down on Openflow ?"  Certainly it's not all gloom and doom.  The SDN product space is developing much differently than the controller-based wireless market did.  Open-source projects are flourishing, which will hopefully help drive the market in an open-standards direction.  But we, the consumers of networking equipment, need to be vigilant.  Don't just assume that creating competition at both the hardware and operating system layers is going to be good for the consumer.  What happens at the layers above that - the operating system and application layers - is probably much more important in the long run !!
January 31, 2011
Indiana University hosted a BoF session on Openflow this afternoon at the Internet2 Joint Techs Workshop in Clemson, SC.   Chris Small did an excellent job organizing the session and pulled off a great demo of VM mobility.  I presented the intro slides and did a short demo of an Openflow controller from Big Switch Networks.

We counted 60+ people in the room they scheduled for us.  It was packed and there were people standing in the hall.  I asked for a show of hands of people who had never heard of Openflow (0 hands) and of people who knew very little or nothing about Openflow (2-3 hands).  60 people, primarily campus network engineers, in the room and nearly all of them knew something about Openflow.  I was completely blown away !  They ended up moving us to the auditorium so more people could join.  I counted 88 people once we were settled in the auditorium !

Overall it was a good session with some excellent side discussions afterwards.  Next up is GEC10 in Puerto Rico !
January 27, 2011
Wow, it's been over 5 months since I made a post !   As I think back, most of my long gaps in blog posts are when I've been the most busy and particularly when I've been busy working on things I can't post on my blog - and this time was no different.
August 19, 2010
That pretty much sums up the last 4 weeks of my life.   GENI Solicitation 3 proposals are due by 5pm tomorrow.  The IU GlobalNOC is leading or partnering on several different proposals for various parts of the solicitation.  I'm personally working on 2 proposals, PI on one and Co-PI on the other.

I'm looking forward to getting back to "normal" after tomorrow and there's no shortage of other work to be done.  We're evaluating our options for 2011 router refreshes for both IU and I-Light.  Our pilot of the Summer of Networking internship program went extremely well and we're already preparing to continue, improve and hopefully expand the program for next year.  We're also working on a plan to expand the hands-on network training opportunities for networking staff both at IU and other universities.  GEC9 (9th GENI Engineering Conference) will be here before you know it and I suspect the preparations will kick into high gear after proposals are submitted tomorrow !
July 13, 2010
Internet2 is holding their summer Joint Techs Workshop at Ohio State this week, and Openflow was featured prominently on yesterday afternoon's agenda.  At 3:00 Srini Seetharaman from Stanford gave an excellent overview of Openflow.  I followed that up with a talk at 4:30 that focused on the practical aspects of potential Openflow applications in R&E networks and what network engineers can do to get started.  That was immediately followed by a presentation from Heidi Picher Dempsey from the GENI Project Office, who talked about GENI and Openflow's application within the GENI infrastructure.  GENI and Openflow were also the primary topic among the regional networks at the Gigapop Geeks BoF, with both Heidi and me leading discussions.  There were many good questions and excellent discussion about Openflow and GENI.

The slides and archived video from all of the presentations are available on the Internet2 Joint Techs Workshop agenda page:


http://tinyurl.com/24rzjn5    
June 17, 2010
There are very smart people involved in the development of Openflow.  However, I suspect very few of them actively manage networks on a day-to-day basis.  Now that the code is in the hands of network engineers, we can see what's needed to actually get this running in production networks.

When it comes to emerging technologies, this space between development and actual production use - between developers and the network engineers in the trenches - is something I find incredibly interesting.  It's great to be involved in the development at the point where you can provide substantive feedback into the actual product or technology.

And that is where we are today with Openflow.  We have Openflow deployed to 4 "production" switches and have a wireless SSID in 3 buildings across campus that feeds into an Openflow switch.  The cool thing is that it all pretty much works.  The problem is that, when it doesn't work, it's a pretty big pain to figure out why.  Yesterday I compared it to the early days of the GSRs when the tables on one of the linecards would get out of sync, but it's a bit worse because the "linecards" are spread across the whole campus and there are very few debugging tools available.

There are a number of debugging features that would be useful, but I think the most useful one would be a way to see the dataplane and control-plane packets at the same time.  One way to do this would be for the switch vendors to allow you to add Openflow control-plane packets into a port-mirroring configuration.  This would allow me to hook a sniffer up to a switch port and mirror both the traffic to/from a switch port and the Openflow control messages to the sniffer.  

Why would this be useful ?  One problem we're having right now is that some laptops take 1-2 minutes to get a DHCP lease on the Openflow network.  Is the switch taking a long time to encapsulate the first DHCP message into an Openflow message and send it to the controller ?  Is the controller taking a long time to send the packet-out and flow-add messages to the switch ?  Are the Openflow messages getting lost along the way ?  Today I have to run a tcpdump on the Openflow controller to capture control-plane packets and Wireshark on a laptop to capture the dataplane packets and then try to compare them without synchronized timestamps.   This one little feature would have saved us a lot of headaches !
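For what it's worth, you can get partway there today if you have a vantage point that sees both kinds of traffic (say, a mirror of the user port plus visibility of the controller channel).  Here's a rough sketch in Python with scapy; the assumptions are that scapy is installed, the controller listens on TCP 6633 (the common OpenFlow default at the time), and the vantage point is a placeholder.  The win is that both planes share one capture clock:

```python
# Sketch: log OpenFlow control-plane and DHCP dataplane packets from one
# vantage point so their timestamps are directly comparable. Assumes scapy
# and an interface that sees both kinds of traffic; requires root.
from scapy.all import sniff, TCP, UDP

OF_PORT = 6633  # default OpenFlow controller port at the time (placeholder)

def classify(pkt):
    if pkt.haslayer(TCP) and OF_PORT in (pkt[TCP].sport, pkt[TCP].dport):
        return "control"
    if pkt.haslayer(UDP) and pkt[UDP].sport in (67, 68):
        return "dhcp"
    return None

def log(pkt):
    kind = classify(pkt)
    if kind:
        # pkt.time is the capture timestamp; both planes share the same clock
        print(f"{float(pkt.time):.6f}  {kind:7s}  {pkt.summary()}")

# A BPF filter keeps the kernel from handing us unrelated traffic
sniff(filter=f"tcp port {OF_PORT} or udp port 67 or udp port 68",
      prn=log, store=False)
```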
May 27, 2010
A little trivia about myself. When I was much younger,  I taught myself how to juggle for a talent show.  The grand finale was to juggle 3 apples while taking a bite out of one of them each time it came around.  Well, right now I feel a bit like a juggler with a dozen apples in the air, desperately trying not to drop one, so here are some snippets of what I'm up to...

- Openflow.  We have our "production" Openflow network up which includes a switch in the Comm Services building, two in Wrubel and one that supports an Openflow SSID that is deployed in Lindley Hall (Computer Science), Informatics and the Innovation Center.  We need to get more "testers" moved onto the switches and SSID to stress test the system.  My laptop and IP phone have been on the Openflow network for almost 6 weeks without any problems.  We're hoping to have an Informatics grad student working on the project with us starting in September.

- Wireless.  When HP bought 3COM/H3C this spring they got what appears to be a very good controller-based wireless product from H3C.  We received some eval equipment yesterday which we'll be testing.  We're trying to determine the right path moving forward between HP's two different wireless systems.

- GlobalNOC Summer of Networking.  Students have started arriving with the remaining students starting on June 1st.   Our weekly training program will start the following week.  I think we have 8 total students.  Most of them are in the syseng area, but we also have one in the Service Desk and one in my area (Network Architecture), to help with the test lab.

- Test Lab.  Use of the testlab is really picking up.  We have a bunch of new equipment coming in on eval as well as some permanent equipment and a lot of people who want to do testing.  Ed Furia and I have been supporting this in our spare time (with help from an intern this summer), but we really need to get a full-time lab admin hired.

- Training.  At the GlobalNOC retreat last week there was a lot of interest in developing a training program.  We're hoping to develop a curriculum of hands-on network training that could benefit GlobalNOC staff, other UITS staff, interns, and potentially others.

- IU Health Science Network:  This is a design I proposed in 2007 to resolve many of the issues surrounding the IU School of Medicine and the Clarian hospital system.  It's finally gaining momentum and implementation will start very soon.

Well, those are the highlights !
May 4, 2010
Well, I'm spending another week in a hotel in California (Sunnyvale this time).  Juniper yesterday, HP today, and Cyan Optics tomorrow.  Cyan makes an interesting product in the packet/optical space.  It's potentially a very good match for RONs and Statenets that don't need a lot of wave services…
 
April 20, 2010
I started my morning with a nice walk across campus to Lindley Hall for a meeting with Rob Henderson.  I have to interject that it was a great morning to walk across campus - 60 degrees and sunny !  It sure beats 10th and the Bypass !!  Anyway, the primary topic of the meeting was our Openflow testbed, but we wandered off onto a number of different topics.  We need to get people to help us test Openflow, and Informatics & Computer Sciences seemed like a logical place to start.  Rob is onboard and we're ready to start rolling !

Our first step will be to deploy a wireless SSID for Openflow.  The SSID will function exactly like our 802.1x SSID (IU Secure) except the user traffic will be plumbed through a couple of Openflow-enabled switches before it hits the first router.  The key advantages are (1) we can easily deploy an Openflow SSID to thousands of APs to get a lot of users and (2) users can opt in and out easily simply by switching SSIDs (which will be helpful if something breaks).

In parallel with the wireless deployment, we'll be deploying Openflow on production switches for wired users, first in the UITS complex at WCC and then at Informatics and CS.  If all goes well, we could have Openflow enabled on 15-20 switches by the end of July along with the wireless deployment.

    
April 8, 2010
I spent yesterday at the IUPUI Conference Center attending an NSF "Campus Bridging" workshop.  Your first response will likely be the same as everyone I've spoken to - "What the heck is that ?".

Well, this was the first one I attended, but I believe this is one track in a series of workshops to help the NSF decide how to structure its future Cyberinfrastructure (CI) funding programs.  The focus was on how to get campuses ramped up to support the data deluge generated by scientific instruments from gene sequencers to the LHC.  Obviously networking is a big part of that equation, but certainly not the only part.  There was a lot of discussion about data storage and indexing, metadata, federated identity and so on.

Here are a couple of good presentations that I think hit the nail on the head in terms of how we should be building campus networks to handle big data science applications...

Network Architecture for High Performance (Joe Metzger - ESNET)
The Data Intensive Network (Guy Almes - TAMU)

Incidentally, IU started building our campus networks this way in about 2003-04 and I think this is one of the reasons we've been so successful with projects like the Data Capacitor.
April 1, 2010
I spent yesterday afternoon at Ball State University in Muncie, IN. For those non-Hoosiers out there, Ball State is named after the Ball family, as in the jars you can tomatoes in ! I met with Steve Jones, who is the director of their CICS program. Hopefully I don't butcher the acronym, but IIRC it stands for Center for Information and Communication Sciences. It's a very cool program, and as someone who grew up right down the road from the university, I had no idea it existed. They have some very bright and motivated students and hopefully some of them will eventually come join the team at the GlobalNOC !


-- Post From My iPhone
March 24, 2010
The GlobalNOC has a long history of hiring students to work on projects during the summer.  In fact, many of our software developers and system administrators started with us as students.  This summer we anticipate having about 8-10 students working in multiple areas of the GlobalNOC including Systems Engineering, Service Desk and Network Architecture.

With such a large group of students, we decided to pilot a program to provide additional training opportunities in a group forum.  Our plan is to have group training/seminar sessions one afternoon a week.  The initial sessions will include presentations and training by GlobalNOC staff.  Towards the end of the summer, the sessions will shift to the students presenting to each other, covering either their summer work or a networking topic they've been researching.  Since the students will be split between the IUPUI and IUB campuses, most of the group sessions will be conducted via high-definition video conferencing, but at least two of the sessions will be conducted face-to-face with all the students.

There will also be opportunities for students to shadow GlobalNOC staff in areas other than the area in which they are working.  So a student working on software development in the Systems Engineering group would have a chance to learn about the Service Desk, Network Engineering and Network Architecture groups by shadowing someone in each of those areas.

This is a pilot, so we may need to make adjustments during the summer, but I think this will be a great opportunity for students to get hands-on experience managing large-scale networks.
March 22, 2010
Nothing like back-to-back weeks of travel !   This week we're headed to Silicon Valley for a series of meetings related to Openflow including stops at Stanford University and HP Labs.  Actually, last week's trip to GEC7 at Duke University was related to Openflow as well.   If you haven't checked out Openflow yet, I'd encourage you to do so (www.openflowswitch.org).  It's a standard API that allows external systems (think PC servers) to manipulate the forwarding ASICs in switches and routers.  IU was recently awarded an NSF grant through the GENI program to help get Openflow deployed on campuses.
March 15, 2010
I'm heading to GEC 7, the 7th GENI Engineering Conference, tomorrow with several colleagues from the IU GlobalNOC.  IU has received multiple GENI grants so far including one for the Openflow Campus Trials which I'm working on along with the PI, Chris Small.  Tomorrow night we'll be doing a demo of our current Openflow deployment that includes 6 HP switches running Openflow capable code along with the NOX and SNAC Openflow controllers.  You can check out our project page on the GENI Wiki for more information.
March 9, 2010
Looks like we're going to end up with somewhere between 5 and 10 students working/interning at the GlobalNOC this summer.  These will be spread across the organization in the Systems area (programming and system administration), in the Service Desk (24x7 NOC), and potentially in my area helping with the testlab.  I'm working on plans fo
January 5, 2010
Happy New Year !   Hopefully everyone enjoyed the holidays.  I hardly looked at email for 2 full weeks which was very nice !  

2010 promises to be as busy and eventful as 2009, if not more !  We are in the midst of two separate beta testing programs right now along with an RFP.  I'm actively working on two grant proposals, a major project to provide a more seamless networking experience across the Clarian (hospital) and IU facilities, and I'm trying to finish up a Legacy RSA with ARIN.  In all, my group has about 20 active projects on our plate right now !  
December 10, 2009
A while back, HP was in town for a meeting and brought along a new product that was just about to go into beta and which is now shipping.  It's called the MSM317.  It's part of their wireless product family, but it does more than just wireless.  It's a 5-port 10/100 wall-plate switch that also includes an 802.11b/g access point.  Wall-plate switches are nothing new; 3Com (who HP recently announced an intention to buy) has been selling them for years.  It fits inside an electrical box, has 4 front-side ports for users and one port on the back to terminate the cable coming from the Ethernet switch in the wiring closet.  It's powered by PoE, and because it only uses 7.5 watts itself, it can provide 7.5 watts of PoE to one of the front-side ports.

So if there's nothing new about this then why am I writing about it ?

What I found very intriguing is that it's managed through the same controllers that manage HP's wireless access points.  So in addition to being able to configure wireless parameters such as SSID settings through the controllers, you can also configure the wired switch settings, such as 802.1x and VLAN tagging, through the controllers.  The HP controllers allow you to group devices and configure all the devices in a group at once.  Also, like the wireless access points, when an MSM317 boots up it automatically finds its controller, typically through DHCP options, and then downloads the appropriate firmware and configuration.  And the user traffic doesn't have to go through the controller at all; it can pass right onto the wired network in the building.  I thought, man, if you're going to deploy hundreds or thousands of these, it makes management a whole lot easier !

But wait - we have over 1,500 enterprise-class Ethernet switches in our wiring closets.  Couldn't something like this make it a whole lot easier to manage all of those switches too ?  So what if we had controllers for all our Ethernet switches ?  In a wireless environment where traffic is all forwarded locally instead of centrally through the controllers, what are controllers really used for ?

The answer is that the controller provides a tightly integrated mechanism for managing a large group of devices (wireless access points today) almost as a single device.  I can put the 80 access points in a building into a group and then simply configure a new SSID for 1 group and have it applied to all 80 access points.  Or I can upgrade code on 1 controller and have that push automatically to 200+ access points.  

The other powerful feature of wireless controllers is that they understand topology.  A controller talks to all the APs and knows which APs each AP can see (i.e. their topology).  It can then do very cool things like tell all the APs the best channel to select or help control which clients associate with which AP.  Why couldn't this be applied to wired networks ?  For example, if a wired controller knew the switch topology, wouldn't it be very easy to provision new VLANs in a building ?  What about features like DHCP snooping, where you need to manually configure uplink and/or downlink ports ?  Or what about processes like upgrading firmware, where you want to upgrade the "edge" switches first and move towards the core, making sure each switch comes up before you reboot its upstream switch and potentially cut yourself off ?
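That edge-first upgrade order, by the way, is just a graph problem once something knows the topology.  A toy sketch (made-up topology, not any vendor's API): compute each switch's distance from the core with a breadth-first search, then upgrade the farthest switches first.

```python
# Toy sketch: derive a firmware-upgrade order from switch topology so no
# switch is rebooted before the switches "behind" it. Topology is made up.
from collections import deque

def upgrade_order(topology, core):
    """Return switches sorted farthest-from-core first (BFS distance)."""
    dist = {core: 0}
    queue = deque([core])
    while queue:
        sw = queue.popleft()
        for neighbor in topology[sw]:
            if neighbor not in dist:
                dist[neighbor] = dist[sw] + 1
                queue.append(neighbor)
    return sorted(dist, key=dist.get, reverse=True)  # edges first, core last

topology = {
    "core":  ["dist1", "dist2"],
    "dist1": ["core", "edge1", "edge2"],
    "dist2": ["core", "edge3"],
    "edge1": ["dist1"], "edge2": ["dist1"], "edge3": ["dist2"],
}
print(upgrade_order(topology, "core"))
# -> ['edge1', 'edge2', 'edge3', 'dist1', 'dist2', 'core']
```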

The bottom line is that we need controllers for wired switches for all the same reasons we have controllers for wireless access points !   In order to manage hundreds or thousands of devices that are all nearly identical, you need to manage them as groups (e.g. all the devices in a building) as if they were a single device.  

A few years ago I had the pleasure of having lunch with Dr. Doug Comer from Purdue University.  He said something that day that stuck with me.  He said that we need to get to the point that we manage whole networks the way we currently manage individual nodes in the network.  In theory, a centralized NMS package could do this, but in practice that has never happened.  The controller-based model, when done well, is a big step in the right direction.  Perhaps we need to think about expanding the model to include wired switches as well ?
December 10, 2009



I've mentioned our new testlab in a couple of tweets, so I thought I'd post some more information about what we're doing. The MDF in our new data center is quite spacious and well equipped. It includes 45 heavy-duty 2-post Panduit racks, overhead infrastructure for power cables, low-voltage copper cables (i.e. Cat5/6) and fiber, a 36-inch raised floor and 1,800 amps of DC power. The production equipment is being built out from the front of the room toward the back, so we reserved the last couple of rows (10 racks total) for "test" equipment.

We've compiled a fair amount of equipment that can be used for testing, and we also have a lot of equipment that moves through here to be "burned in" and configured before it's sent into the field. All this equipment needs a place to live, either temporarily or permanently. We have equipment from Ciena, Juniper, Infinera, Cisco, HP and others. Up until now it's been spread across several facilities, most of which had inadequate space, power and/or cooling. So we're very excited about having a wonderful new facility !



It's been amazing how much demand there is for this kind of testing environment. Equipment has been moved in quickly and as soon as people found out it was there, they wanted to use it. It's very clear that we'll need to designate a "lab czar" to make sure we maintain some semblance of organization in the lab - and it's clear that the lab czar better not be me ! The grand vision is to have a lab environment where engineers can "check out" specific devices, automatically build cross-connects between devices to create the topology they need and have the device configs reset to default when their work is completed. We're a long way from this, but will hopefully keep moving steadily in that direction over the next 12-24 months.



November 20, 2009
Do you ever read about a new technology and go, "Man, that's so cool ! We should be doing that !" - only to be disappointed once you start digging into it a bit ?

That's exactly what happened to me after I read the following whitepaper...

Connecting to the Cloud with F5 BIG-IP Solutions and VMware VMotion

Some of you may have read my post a while back about how cool Application Delivery Controllers (aka load-balancers) are. Everything I said is probably true (note to self: reread that post and edit if necessary), but man, once you start digging into what you can do with one of those things - it strikes fear into the heart of every decent network engineer !

And now it looks like these things may bring us the holy grail of virtualization - live migration across a wide-area network ! I'm onboard !!

F5 demo'd this at VM World in late August and it's now late November. We have 4 brand new F5's that aren't in production yet, 2 in each of our data centers separated by about 60 miles. And we have plenty of VM's to throw into the mix. So I figured I'd download the configuration guide and see what it takes to set this up....oh, there's no configuration guide. Hmmm, maybe the documentation is on F5's devcentral site.....no. Okay, well our F5 sales engineer is coming in today so I'll just ask him....well he didn't have very many details and he referenced the documentation on their website...which of course I can't find. And what he did tell me made me realize just how many moving parts are involved and how complex the whole setup really is. Well, this could end up being really cool stuff, but it looks like it's not quite soup yet.

And then there's the issue of whether this is the right way to solve this problem. I'm left with the feeling that this is a really ingenious solution to a problem using the tools we already have but that what we really need are some new tools !

In our case we can theoretically bridge VLANs between our data centers since we have dark fiber. This would theoretically simplify things, but we haven't done this yet because of concerns about bridging loops and broadcast storms taking down BOTH of our data centers! If we could essentially route Ethernet MAC addresses using TRILL, or the similar functionality being developed by the IEEE, perhaps that would offer a simpler solution to this problem !
October 21, 2009
I'm in Dearborn Michigan this week for the NANOG and ARIN meetings. NANOG = North American Network Operators Group. NANOG is very much like the Internet2 Joint Techs Workshops for the commercial sector. It's where network engineers get together to discuss cool new things they're doing. And, like most of these things, it's a lot about social networking - a chance to meet face-to-face with the people you email and IM with every day. ARIN = American Registry of Internet Numbers. ARIN is the non-profit that is responsible for handing out Internet number resources - primarily IP addresses.

IPv6 is a huge topic of discussion this week. Yahoo presented on their IPv6 roll-out which they completed last week. Comcast just presented on their deployment. Google has IPv6 deployed as well. I saw a news story last week that the number of ISPs requesting IPv6 addresses from ARIN has gone way up. In fact, in the last quarter (last month maybe) ARIN received more requests for IPv6 addresses than IPv4 addresses for the first time ever. It seems that IPv6 is *finally* getting some traction. My sense is that this is the real deal and IPv6 is really going to happen now.

It's funny though to see all the hype around IPv6 in the commercial sector. We rolled out IPv6 on the Internet2 network in 2000 and had IPv6 enabled on every data jack at IU around 2001. WRT IPv6, attending a NANOG in 2009 is much like attending an Internet2 Joint Techs Workshop in 2000 or 2001.
September 28, 2009
I remember the first time I was in a meeting about the deployment of a computer system and there were plumbers at the meeting ! Now there's more plumbing under the raised floors than anything else. Well, last week I got to work with the guys from the sheet metal shop while they fabricated duct work for our Cisco Nexus 7018 switches. This turns the side-to-side airflow into front-to-back airflow. The sheet metal shop did a great job on very short notice !!

-- Post From My iPhone
September 21, 2009
Okay, now I'm sure you're thinking: what the heck does that great Buggles hit from 1979 have to do with networking ? Or you may be cursing me because you'll be hearing "oh, uh oh" in that annoying high-pitched female voice ALL DAY !!

Regardless, after years of anticipation (7 to be exact), on Friday, Sept. 11th, the IEEE finally ratified the 802.11n standard. Of course, quite a few enterprises, including many university campuses, have been deploying 802.11n since at least 2007, when the Wi-Fi Alliance started certifying equipment to draft 2 of the standard. But long before the standard was ratified, and even before there were many enterprise deployments, there was no shortage of articles heralding the end of wired Ethernet. I can't count the number of times I've been asked if we would stop pulling wiring into new buildings and go 100% wireless. My emphatic response has always been "No, wireless will be hugely popular, but wires are not going away any time soon".

So when I received an email notification from The Burton Group last week about a report entitled "802.11n The End of Ethernet", I was pretty sure what I would find inside the report. Still, I knew there was a good chance I would have to field questions about the report, so I thought I better check it out. What I found is that the report basically supported what I've been saying, although that may not be apparent on the surface.

One key thing to keep in mind is that network usage and requirements at a research university are NOT the same as at your typical business. For example, the report points out that 802.11n will probably not have sufficient bandwidth for "large" file transfers. But how do they define "large" ? The report defines "moderate" file sizes as 2-8 MB, so presumably anything larger than 8-10 MB or so would be considered "large". This is probably accurate for a corporate network where you typically have relatively small connections (1-10 Mbps) to the Internet. At IU we have a 10 Gbps (that's a 'G') connection to the Internet, and it's quite common for people to download very large (100 MB+) files from the Internet. It's also common for people to load very large (100 MB+) files such as Microsoft Office or Adobe Photoshop over the local network. The last time I downloaded Microsoft Office from IUWare (MacBook Pro on a Gigabit Ethernet data jack), I got well over 400 Mbps and it only took about 15-20 seconds to download ! Never mind the researchers who want to upload and download files that are 50-100 GB and larger, or IPTV with streams of 6-8 Mbps per user !
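To put rough numbers on that, here's a quick back-of-the-envelope calculator (pure illustration, ignoring protocol overhead; the sizes and rates are the ones discussed in this post):

```python
# Transfer time = size in megabits / rate in megabits per second.
def transfer_seconds(size_mb, rate_mbps):
    return size_mb * 8 / rate_mbps  # megabytes -> megabits, then divide

for size_mb, rate_mbps, label in [
    (8, 10, "'large' corporate file on a 10 Mbps link"),
    (100, 400, "100 MB install over wired Gigabit (~400 Mbps observed)"),
    (100, 130, "100 MB over typical 802.11n (~130 Mbps, shared)"),
    (50_000, 400, "50 GB research data set at 400 Mbps"),
]:
    print(f"{label}: {transfer_seconds(size_mb, rate_mbps):,.0f} s")
# ~6 s, 2 s, ~6 s, and 1,000 s (about 17 minutes), respectively
```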

Typical, real-world performance for 802.11n is around 120-150 Mbps. But, keep in mind, this is half-duplex, shared bandwidth for each Access Point (AP), so performance for individual users can vary greatly depending on how many users are in the same area and what they are doing. At a recent Internet2 workshop in Indianapolis where we supplied 802.11n wireless, I often saw 50+ Mbps on downloads over 802.11n, but sometimes performance dropped down to around 10-15 Mbps. And if you're further away from the AP with lower signal strength, you could see even lower throughput.

Another important factor is that 802.11 uses unlicensed spectrum and is therefore subject to interference. Microwaves, baby monitors, cordless phones - there are many sources of potential interference. In a corporate environment it might be easier to prevent sources of interference, but at a university, especially in student residences, it is quite difficult. I've been told that most students in our dorms connect their game systems to the wired network, even though they have wireless capabilities, because they have experienced drops in wireless connectivity that interrupted online games at inopportune moments. A 30-second wireless drop-out while your neighbor heats up some leftover pizza at 3am may not seem like a big deal, unless you've been playing an online game for the last 8 hours and are just about to win when the connection drops !

The third important factor, IMO, is the use of IP for what I'll generically call "appliances". Cash registers, card readers, security cameras, building automation systems, parking gates, exercise equipment... the list goes on and on, and they all use wired connections. If the use of wired Ethernet for PCs decreases, it's possible the increase in wired connections for these "appliances" will more than make up for it !

IMO networking is not a fixed sized pie that is divided between wired and wireless such that when one slice gets bigger the other slice gets smaller. The pie is getting much bigger all the time - it just so happens that going forward, growth in the wireless slice will probably dwarf the growth in the wired slice !

So, just as radio is still alive and well almost 30 years after video supposedly killed it, I suspect wired Ethernet will be alive and well many years from now.
April 13, 2009
I suppose the term "load-balancer" is out of date and has been replaced by the term "Application Delivery Controller", but regardless of what you call them, they are pretty powerful and can do a lot of cool things ! Sysadmin types have known this for years, but as a network guy who just recently started digging into these, I'm a bit geeked about what you can do with these.

The background here is that we use load-balancers from both Zeus and F5 depending on the application. In preparing for the move to our new data center, we're testing some new F5 hardware and software and reconsidering how these things get connected into the network.

One goal we have is to enable failover between our data centers in Indianapolis and Bloomington (see my previous post on this). We had been looking at DNS-based solutions (Global Server Load-Balancers), but for a number of reasons Route Health Injection (RHI) is a much better option for us. A couple of weeks ago we got together with our Messaging team to set up and test RHI. Without too much manual reading and just a little bit of poking around, we were able to get RHI working within about 15 minutes, and boy was it slick. We injected a /32 BGP route for a DNS Virtual IP from our F5's at Indy and Bloomington and weighted the routing preferences so the Bloomington path was preferred. DNS queries resolved on the Bloomington server until we shut down 'named' on the Bloomington server. Within a few seconds, queries to the same IP address were resolved by the server in Indy. Turned 'named' back up in Bloomington, and queries went back to Bloomington. One problem solved !
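If you want to watch a failover like that as it happens, one handy trick is the CHAOS-class "id.server" query, which name servers can be configured to answer with their own identity. Here's a hypothetical probe using dnspython (assumptions: dnspython is installed, the VIP below is a placeholder, and the servers in both data centers answer id.server):

```python
# Poll the RHI-announced VIP once a second and print which server answers,
# so a failover shows up as a change in the id.server response.
import time
import dns.exception, dns.message, dns.query, dns.rdataclass, dns.rdatatype

VIP = "10.10.10.53"  # placeholder: the /32 announced from both data centers

while True:
    query = dns.message.make_query("id.server.", dns.rdatatype.TXT,
                                   dns.rdataclass.CH)
    try:
        reply = dns.query.udp(query, VIP, timeout=2)
        answer = reply.answer[0][0] if reply.answer else "no answer"
        print(time.strftime("%H:%M:%S"), answer)
    except dns.exception.Timeout:
        print(time.strftime("%H:%M:%S"), "timeout")
    time.sleep(1)
```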

Operationally this points out how load-balancers are both network and server ! Server-wise they do things like SSL-offload so your SSL certs actually live on the load-balancer --- so your server admins probably want to manage these. Network-wise, they're now running BGP routing with your core routers and the routing configuration on the F5 (based on Zebra) looks a lot like Cisco's IOS --- so your network admins probably want to have some control of these functions.

Now, what if I want to add IPv6 support to those DNS servers ? Well, I could go and enable IPv6 on all my DNS servers, but with a load-balancer, I could just enable IPv6 on the load-balancers and have them translate between v6 and v4. After all, the load-balancer is essentially acting like an application-layer proxy server. In under 2 minutes I added a new Virtual IP (IPv6 in this case) and associated it with the pool of DNS servers we already configured in our test F5s and, without touching the servers, I was resolving DNS queries over IPv6 transport ! According to their documentation, Zeus supports this IPv6 functionality as well. So, instead of hampering IPv6 deployment, as is the case with many network appliances such as firewalls and IDPs, these load-balancers are actually making it easy to support IPv6 !
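Conceptually, that's all the load-balancer is doing here: terminating the v6 connection and opening a v4 one to the existing pool. A bare-bones sketch of the idea in Python (illustration only; this is not how F5 or Zeus implement it, and the addresses and ports are placeholders):

```python
# Minimal application-layer proxy: listen on an IPv6 VIP, forward each
# connection to an IPv4-only backend. Addresses/ports are placeholders.
import socket, threading

V6_VIP, V6_PORT = "::1", 8080        # where v6 clients connect
V4_BACKEND = ("127.0.0.1", 8080)     # existing v4-only pool member

def pump(src, dst):
    """Copy bytes one way until the connection closes."""
    while data := src.recv(4096):
        dst.sendall(data)
    dst.close()

def handle(client):
    backend = socket.create_connection(V4_BACKEND)
    threading.Thread(target=pump, args=(client, backend), daemon=True).start()
    threading.Thread(target=pump, args=(backend, client), daemon=True).start()

listener = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
listener.bind((V6_VIP, V6_PORT))
listener.listen()
while True:
    conn, _addr = listener.accept()
    handle(conn)
```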
February 24, 2009
I started working when I was 15 years old. It was at the trucking company my dad worked for. I finished all the manual labor they gave me by the middle of the summer, so I got sent into the office to do data entry. That got me started on computers and the rest is history.

I bet you're wondering what this has to do with networking? Well, I remember clearly taking my timecard to the time clock to punch in and out every day. That machine was built like a tank that would last forever ! You know the kind I'm talking about - the big grey metal box with the clock on the front and the big button on the top !

Well, I'm guessing they must look a bit different these days since I found out today that time clocks are getting connected to our IP network ! Time clocks !

Here's the list of devices I know are connected to our network (off the top of my head):

Phones, cellphones, security cameras, heating and air conditioning systems, electric meters, door locks, parking gates, cash registers, laundry machines, fire alarms, MRI machines, game systems, TVs, digital signs, clocks, and probably many more I'm not aware of.

Crazy stuff !
February 24, 2009
First things first. Yes, I realize it's been almost 3 months since my last post...shame on me ! The good news is that we've been quite busy working on lots of new things, so I have plenty of material to keep me writing for a while !

I'd like to start with a topic I've been thinking about a lot lately (today in particular) that I think many people are interested in. That topic is how to provide automatic, transparent fail-over between servers located in different data centers. Ever since the I-Light fiber between Indianapolis and Bloomington and the ICTC building were completed, we've been receiving requests to enable server redundancy between the two campuses. Seems easy enough, so why haven't we done this yet ?

There are really 3 main options available:

(1) Microsoft Network Load-Balancing or similar solutions. These solutions require the 2 servers to be connected to the same broadcast domain. They usually work by assigning a "shared" MAC or IP address to the two servers along with various tricks for getting the router to direct traffic destined for a single IP address to 2 different servers. Some of these packages also include software that handles the server synchronization (eg synchronizing databases, config files, etc).

(2) Global Server Load Balancing (GSLB). These are DNS based solutions whereby the GSLB DNS server is the authoritative DNS server for a domain and returns different A records based on the IP address of the client (or rather the client's recursive DNS server) and the "health" of the servers. In many cases, "servers" are actually virtual IP addresses on a local load-balancing appliance.

(3) Route Health Injection. These solutions involve a local load-balancing appliance that "injects" a /32 route via BGP or OSPF into the network for the virtual IP address of a server. Typically you have a load-balancing appliance in each data center that injects a /32 for the server's virtual IP address. The key is that the virtual IP address is the *SAME* IP address in both data centers. It's NOT the same broadcast domain, just the same IP address, and the actual servers are typically on private IP addresses "behind" the load-balancing appliances. You can achieve an active-passive configuration by setting the routing metrics so that the announcement at one data center is preferred over the other. *OR* you can set equal route metrics and clients will follow the path to the "closest" data center based on network paths -- this is referred to as "anycast".
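The anycast flavor is worth a tiny illustration. A hedged IOS-style sketch with invented numbers - and the whole point is that it's identical at both data centers:

```
! Identical config at BOTH data centers (invented numbers): each site's
! load-balancer injects the same /32; equal preference = anycast.
router bgp 64512
 neighbor 10.20.1.5 remote-as 64513          ! this site's load-balancer
 neighbor 10.20.1.5 route-map RHI-ANYCAST in
!
ip prefix-list VIPS seq 5 permit 192.0.2.80/32
!
route-map RHI-ANYCAST permit 10
 match ip address prefix-list VIPS
 set local-preference 100                    ! same value everywhere
```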

So you're thinking "these all sound like good options, surely there must be some gotchas?"....

The issue with option #1 is that you have to extend a broadcast domain between the two data centers - in our case between Indianapolis and Bloomington. As I think I covered in an earlier post, "broadcast domain" == "failure domain". Many types of failures are contained within a single broadcast domain and by extending broadcast domains across multiple facilities, you increase the risk of a single failure bringing down multiple systems. Especially in a university environment where management of servers is very decentralized, this can become very problematic. I can recount numerous occasions where someone made a change (ie did something bad) that created a failure (eg loop, broadcast storm, etc) and all the users in multiple buildings were affected because a VLAN had been "plumbed" through multiple buildings for whatever reason. However, these solutions are typically very inexpensive (often free), so they are very attractive to system owners/administrators.

There are 2 main issues with option #2. First, in order to provide reasonably fast failover, you have to reduce the TTL on the DNS records to a relatively small value (eg 60 seconds). If you have a very large number of clients querying a small set of recursive DNS servers, you may significantly increase the load on your recursive DNS servers. The other issue is with clients that ignore the DNS TTL and cache the records for an extended period of time. GSLB solutions are also significantly more expensive than option #1 solutions. One big advantage of GSLB is that the servers can literally be anywhere on the Internet.

Option 3 is actually quite attractive in many ways. One downside is that the servers must reside behind a local load-balancing appliance. Strictly speaking that's not required --- you could install routing software on the servers themselves --- but with many different groups managing servers, that raises concerns about who is injecting routes into your routing protocols. The need for load-balancing appliances significantly increases the cost of the solution and limits where the servers can be located. In order to reduce costs you could place multiple systems behind a single load-balancing appliance (assuming there's sufficient capacity), but that raises the issue of who manages the appliance. Some load-balancers offer virtualization options that allow different groups to manage different portions of the configuration, so there are some solutions to this.

We are currently exploring both the Global Server Load-Balancing and Route Health Injection options in the hope of developing a service that provides automatic, transparent (to the clients) failover between at least the two UITS data centers and possibly (with GSLB) between any two locations.
November 20, 2008
Once you learn how to use the MPLS hammer, you'll suddenly see a million nails you could whack with your shiny new hammer !

We deployed MPLS and MPLS Layer3 VPNs on the IU campus network this past Monday morning. It was VERY anticlimactic ! We cut and pasted some relatively minimal configs into the routers and it all just worked. What is probably the single largest change in how we do networking on campus since the advent of VLANs happened with no one noticing (except Damon and me, who were up entirely too early for a Monday morning). Of course, under the covers all kinds of fancy stuff is happening and we now have a powerful new tool in our tool chest !
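To give you a sense of just how minimal, the changes boil down to enabling label switching on the core-facing links and adding the VPNv4 address family to BGP. Here's a hedged IOS-style sketch with invented interface names and addresses - not the exact config we pasted:

```
! Turn on MPLS label distribution (LDP) on a core-facing link
mpls label protocol ldp
interface TenGigabitEthernet1/1
 mpls ip
!
! Carry VPN routes between core routers over the existing iBGP sessions
router bgp 64512
 neighbor 10.0.0.2 remote-as 64512
 neighbor 10.0.0.2 update-source Loopback0
 address-family vpnv4
  neighbor 10.0.0.2 activate
  neighbor 10.0.0.2 send-community extended
```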

Weeks before we actually configured the first MPLS VPN on the network (btw - we won't be putting a production system into this MPLS VPN until Dec. 2nd), we had already planned to make MPLS VPNs the centerpiece of the network for the new data center in Bloomington ! Your first thought is probably, why the heck would we want MPLS VPNs in the data center network ?

Our current data center network design has "end-of-row" or "zone" switches (layer-2 only) with cat6 cables running to servers in the adjacent racks. The zone switches each have a 10GbE uplink into a distribution switch (again, layer-2 only). The distribution switch has an 802.1q trunk (2x 10GbE) into the core router. This .1q trunk between the distribution switch and the core router has a Juniper firewall in the middle of it - running in layer-2 mode. Those of you who know this setup in detail will know this is not exactly correct and is over-simplified, but the point is the same.

One problem with this design is that, with over 30 VLANs in the machine room, there is a lot of traffic going in and out of the firewall just to get between 2 VLANs in the same room - perhaps between 2 servers in the same row or same rack or 2 virtual servers inside the same physical server. This causes a couple of problems:

1) It adds significant extra load on the firewall unnecessarily in many cases. Think about DNS queries from all those servers...
2) It makes it very difficult to do vulnerability scanning from behind the firewall because the scanners have to be on 30+ different VLANs.

The solution to "fix" this is to place a router behind the firewall - ie turn the distribution switch into a layer-3 routing switch. However, if we did this all 30+ VLANs would be in the same security "zone" - ie there would be no firewall between any of the servers in the machine room. This is not good either. For one, we offer a colocation service and virtual server services, so there are many servers that do not belong to UITS. So we don't want those in the same security zone as our critical central systems. It's probably also not a good idea to put servers with sensitive data in the same security zone as, say, our FTP mirror server. One solution then would be to place a router behind the firewall for each security zone. But of course that gets very expensive....if you want 5 security zones you need 10 routers (redundant routers for each zone).

And this is where the MPLS VPN hammer gets pulled out to plunk this nail on the head !! You use MPLS VPNs to put 5 virtual routers on each physical router and put a firewall between each virtual router and the physical router and your problem is solved. And actually, if you can virtualize the firewall, you can create a virtual firewall for each virtual router and you have 5 completely separate security zones with a pair of high-end routers and firewalls supporting all 5 - all for no extra cost *except* for all the added operational complexity. Those are the costs we need to figure out before we go too crazy whacking nails with our shiny new hammer !!
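If you're wondering what one of those virtual routers actually looks like, here's a hypothetical IOS-style sketch - the VRF name, route-target values and addresses are all invented for illustration:

```
! One VRF per security zone (repeat with a different name and
! route-target for each zone)
ip vrf COLO-ZONE
 rd 64512:10
 route-target export 64512:10
 route-target import 64512:10
!
! Server-facing VLAN interfaces are bound to the zone's VRF, so the
! only path between zones is through the (possibly virtual) firewall
interface Vlan110
 ip vrf forwarding COLO-ZONE
 ip address 10.110.0.1 255.255.255.0
```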
November 9, 2008
We had 5 major network maintenances planned in order to complete the core upgrade project and deploy MPLS VPNs. The first 2 are done: The first was disabling the IS-IS routing protocol (OSPF has been deployed alongside IS-IS for some time). This was completed last Thursday. The second was replacing our primary border router (a Juniper M10) with a Cisco 6500. This was completed this morning and was the change that was giving me the most heartburn !

The next change is to swap out the secondary border router with a Cisco 6500 on Tuesday. We'll deploy BGP to all our core routers on Thursday. Currently only the border routers run BGP. BGP is needed on the core routers in order to support MPLS VPNs. The following Monday we will deploy MPLS and our first MPLS VPN.
November 7, 2008
In case you were worried about my untimely demise, no worries, I'm still alive. I've just been so busy doing that I haven't been writing about what I'm doing :) I'll attempt to catch you up and then will try to get a post out at least once a week from now on.

Wireless:

We deployed about 3,000 Access Points over the summer - roughly an average of 200-250 every week. We also rolled out WPA2 Enterprise (aka 802.1x) during the same timeframe. The majority of the Bloomington Residence Halls have wireless coverage with a few more buildings coming up later this month and around the first of the year. We're now turning our attention to 802.11n to prepare for upgrades next summer. As of yesterday we have 802.11n APs in hand to start testing.

The wireless rollout wasn't without its bumps, but there were very few user-impacting problems. We've been getting a lot of positive feedback from users. When users make a point to call the NOC just to let us know how happy they are with the wireless service, you know it must be going well ! We went out on a limb just a bit by choosing a vendor (HP) that was not a household name in the area of large-scale, controller-based enterprise wireless, but it's worked out extremely well.

Core Upgrade and MPLS VPNs

We also completed the vast majority of the core network upgrade over the summer. The last parts of that upgrade are happening this coming week. We'll be replacing the Juniper M10i Border Routers with Cisco 6500's. That greatly increases the capacity on our Border Routers. As a result, we will be upgrading our primary link to the Internet from 2Gbps to 10Gbps at the same time as the swap out, which will happen the day after tomorrow. Once this is completed all our core routers will be Cisco 6500's. Since we had this planned, we had been holding off on deploying MPLS so we didn't have to deal with vendor interoperability issues. Not that this wouldn't have worked with both Juniper and Cisco routers, but this saved us quite a bit of testing. We plan to have our first MPLS VPN live and fully tested before the Thanksgiving holiday. This will be the VPN for PCI-DSS systems.

PCI-DSS Compliance

This is really coming together although there is still a lot of work to be done to meet the internal deadline of December 31st of this year. We should be ready to start transitioning systems into the PCI-DSS MPLS VPN the week following Thanksgiving. The last network requirement we're still struggling with is 2-factor remote-access. This is just a matter of getting our current Safeword token system working with our Cisco VPN servers. It looks like we may have to wait on an upgrade of the Safeword system, but we're trying to find alternatives because that is not likely to happen before 12/31.

New Data Center

This project is really coming together as well. We're hoping to nail down the final network design for the new data center in a meeting this afternoon. I'll have a post devoted just to the data center network design issues. I think the industry is on the cusp of a major shift in data center networking. Top-of-rack switches are clearly the future in the data center, but products are only just now starting to become available. Fibre Channel over Ethernet is a promising technology, but its day in the sun is probably still 18-24 months out. Also in the 18-24 month time horizon are 40G and 100G ethernet.
August 6, 2008
PCI as in PCI-DSS as in Payment Card Industry Data Security Standards

We met with a QSA on Monday. Don't ask me what QSA stands for - they're the official PCI auditors. The killer statement from the meeting was that every network device we manage which forwards a packet with payment card data in it - even if that data is encrypted - is within scope for PCI compliance. My understanding is that this means that requirements like regular password rotation, quarterly config reviews, and file integrity monitoring apply to all our network equipment. We run a very secure network, but security != compliance, so we will end up spending a lot of time dotting our I's and crossing our T's. And a lot more time showing auditors that we dotted and crossed !
July 29, 2008
Okay, this is not really about networking or IU, but I thought it was pretty cool so I figured I'd share it with all of you (which hopefully includes a few more people than I've already told this to in person). *AND* it did involve 1 piece of network equipment owned by IU, so....

Like many people, I'm amazed by many of the 3rd party applications for the iPhone. I was very busy preparing for the Joint Techs workshop last week, so I didn't have much time to "play" with all the new applications for my iPhone. I did, however, download the AOL Radio application a couple of days before leaving for Lincoln. It worked fairly well and I quickly thought it would be quite cool if I could use it in my car while driving ! I'm too cheap to pay for satellite radio, so the idea of being able to listen to radio stations from all over the country in my car caught my eye !

Of course, the first thing I thought was *DOH* - what about that darn GSM interference ? All that buzzing and popping coming through the radio from the streaming audio over the EDGE network wouldn't do. Luckily, I've been testing a Linksys Mobile Broadband router with a Sprint EV-DO card. So I could plug this into the power outlet in my trunk and connect my iPhone to it via Wifi. Note: with the iPhone 2.0 release, you can put the iPhone in "airplane mode" - shutting down the cellular radio - and then enable the Wifi radio :) Problem #1 solved ! BTW - I've been told that HSDPA (AT&T's 3G technology) does not have the same interference issues, but alas I don't have one to test with :-(

The next problem was that Sprint doesn't have 3G in Bloomington yet. So how well would this work over the "slow-as-molasses" 1xRTT network ?

Before I left for the airport, I tossed the Linksys into my trunk (not literally) and plugged it into the power outlet. I dropped (again not literally) my iPhone into the dock in my car and headed out. Shortly after I passed the Bloomington bypass on highway 37, I fired up AOL Radio to see what would happen. The station started, but the audio was in and out, stopping and starting --- unusable :-( I turned it off and went back to listening to a podcast. When I reached Martinsville - safely within Sprint's EV-DO coverage - I tried it again -- tuning into the Jack FM station in Chicago. This time it worked fairly well. Every few minutes there would be a short audio drop as it rebuffered, but all-in-all it worked reasonably well.

While I was in Lincoln, I had some free time to play with my iPhone. I downloaded a bunch of 3rd party apps including Pandora. For those of you who haven't used Pandora, it's a personal radio station application. You pick an artist and it selects songs from that artist and other similar artists. You can give songs a thumbs up or thumbs down and it supposedly adjusts to your tastes.

While in Lincoln, I used Pandora over the EDGE network from my hotel room and walking around town. I was amazed by how well it worked over the EDGE network. Excellent sound quality and almost no rebuffering. I couldn't wait to try it out on the drive home from the airport.

So, last Thursday night while driving home from the airport I tried it out. Amazing ! The quality over both EV-DO and 1xRTT networks was excellent ! Presumably it would be just as good using the cellular radio internal to the iPhone - assuming there wasn't a GSM interference issue. I've been using it for the past several days and have been amazed at how well it works - even down by my house in the southern part of the county where there are definitely some dead spots !

If I ran a satellite radio company, I'd definitely be paying attention to this. It seems to me the major cost for the satellite radio companies is transport - ie getting the signal from the head-end to the users. The reason people want satellite radio is the large selection of content that is available anywhere - not just within your local broadcast area. Exchanging satellite transport for IP transport (either over wired or wireless networks) could drastically reduce their costs and increase their availability - ie you can get an IP-based connection in places you can't easily get satellite - like in basements !
July 23, 2008
I'm at the Internet2 Joint Techs Workshop in Lincoln, Nebraska this week. The primary reason I'm attending is actually for 2 events that were "tacked-on" to the main workshop: The MPLS Hands-On Workshop on Sunday and the Netguru meeting today and tomorrow.

The MPLS workshop was a 1-day workshop meant to educate campus network engineers about MPLS and its application on campus networks. The morning was spent on presentations and the afternoon on hands-on configuration of MPLS in a lab setting. This was the first MPLS workshop and it went extremely well. There were 22 people in attendance. I was an instructor for the workshop and gave about a 1 hour talk on the control-plane for MPLS VPNs. I plan to reuse the material to provide some MPLS instruction for the networking staff at IU.

The second event I'm attending is the Netguru meeting. Netguru is a small group of network architects from universities around the country. As you might imagine, campus network architects often have lots of challenging problems they're trying to solve and find it very helpful to discuss these with other people who are facing the same challenges. I think it's typical for these folks to have 1 or 2 network architect friends that they discuss issues with on a fairly regular basis. A few years ago I shared a cab ride to an airport with David Richardson and Mark Pepin. David and I got together to discuss networking issues on a fairly regular basis - whenever we were in the same city (David worked at the Univ. of Washington before leaving to work for Amazon). We somehow started talking about how network architects share information and Mark Pepin brought up the idea of starting a small group (10-15 people) of network architects that met in conjunction with the I2 Joint Techs workshop to discuss issues of the day. Thus Netguru was born ! We have a full agenda for this afternoon, dinner tonight and all day tomorrow. I've missed the last 2 meetings, so I'm looking forward to the discussions today and tomorrow.
July 17, 2008
Well, it's been 3 weeks since my last post, but I assure you we have not been sitting around twiddling our thumbs ! Here's a summary of what's been going on...

The wireless and core upgrade projects are moving along smoothly. About 1,000 of the 1,200 APs in Bloomington have been replaced. We're also starting to complete some of the dorms in Bloomington - so some of the dorm rooms will have wireless by the start of the fall semester. At IUPUI, we're not quite as far along as in Bloomington, but we will have completed wireless upgrades in all the on-campus buildings by the time the UITS change freeze goes into effect on August 18th.

We're finishing up the preparations for adding the "IU Guest" SSID to all the APs. This will be the SSID that guests who have been given Network Access Accounts will use to access the network. This will allow us to shut down our old web portal authentication system. That system has a scaling limitation related to the number of MAC addresses on wireless, and we've been putting band-aids in place for 2 years to get it to scale to the number of wireless users we have. The "IU Guest" SSID will use the web-portal authentication built into the HP WESM modules - these do not have the same scaling limitations.

With these projects moving along smoothly, Jason and I have shifted our attention to the *next* set of projects. Here's a bit about what we've been up to...

We spent a day at IU-Northwest talking with them about the major network upgrade they're planning. During the next 12 months they'll be upgrading all their wiring to Cat6e, consolidating IDFs, improving their outside fiber plant, upgrading all their switches to HP5400's, and deploying over 150 new 802.11n APs.

Jason spent a day at IU-Kokomo helping them setup their new HP wireless gear and discussing their future use of HP's Identity Driven Management product. IU-Kokomo undertook a major upgrade of their network earlier this year, replacing all their switches with HP 5400's, and as part of that they purchased HP's Identity Driven Management system. I could devote a whole post just to this (and probably will eventually), but essentially this is a policy engine that lets you decide when and where users can connect to your network and what type of network service they get - which is done by placing them on different VLANs or applying ACLs to their connection. We've been interested in getting our feet wet with a system like this for some time and Kokomo has agreed to be a guinea pig of sorts :) Thanks Chris !

We had our yearly retreat with the IT Security Office - now called the University Information Security Office. This is something we've been doing for a few years now. A couple people from ITSO and a couple people from Networks get together off-campus and spend several hours thinking strategically about improving security - instead of the tactical thinking we usually do. Tom Zeller hosted the event again - Tom has a large screened-in porch in the woods and we were able to watch some wildlife in addition to discussing security !

We met with the University Place Conference Center staff at IUPUI to discuss their unique wireless and guest access needs. They have web-portal authentication on both their wireless network and their wired network. The new web-portal system on the HP WESMs only works for wireless users, so when we upgrade wireless in the hotel and conference center, we'll have to do a bit of a one-off for them.

I've been very busy preparing for the upcoming MPLS Workshop at the Internet2 Joint Techs workshop in Lincoln, Nebraska. MPLS VPNs are becoming a hot-button topic for campuses as they struggle to meet the divergent networking needs of their different constituents - from the business aspect of the university, to student housing, to researchers. In fact, we're planning to roll-out MPLS VPNs this fall, so when I was asked to be an instructor for this workshop, I figured it would be a great opportunity to sharpen my skills on MPLS VPNs *AND* I could reuse the materials I develop to provide training for all the UITS networking staff that will need to learn how to support MPLS VPNs ! As part of this process, I put together a small MPLS test lab with 3 routers and, when I return, will use this to start preparing for our MPLS VPN deployment.

We've also continued to develop our plans for networking in the new data center. I'll share more about that later, once I get past the Joint Techs workshop in Lincoln !
June 27, 2008
Shortly after my last post - actually it was the next morning in the shower, which is where I do my best thinking - I was hit by the thought, "doesn't CX4 reach 15 meters instead of 10?". So when I got to work that morning, I looked it up and, sure enough, 10GBASE-CX4 has a 15 meter (49 foot) reach. And that, my friends, makes all the difference !

A 15-meter reach makes it possible to use CX4 as the uplink between all the TOR switches in a 26-cabinet row and a distribution switch located in the middle of said row. This reduces the cost of each TOR switch 10GbE uplink by several thousand dollars.

The result is that, for a server rack with 48 GbE ports, it's substantially less expensive to deploy a TOR switch with 2 10GbE (CX4) uplinks to a distribution switch than to deploy two 24-port patch panels with preterminated Cat6e cables running back to the distribution switch. For a server rack with 24 GbE ports, it's a wash in terms of cost - the TOR switch option being a few percent more expensive. This also means that the cost of a 10G server connection is significantly lower than I originally calculated.

The only remaining issue is that, in the new data center, the plan was to distribute power down the middle aisle (13 racks on each side of the aisle) and out to each rack, but to distribute the fiber from the outsides of the rows in. One thing that makes the TOR model less expensive is that you only need 1 distribution switch per 26 racks (13 + 13) whereas with the patch panel model you'd need multiple distribution switches on each side of the aisle (2 or 3 switches per 1/2 row or 13 racks). But having only 1 distribution switch per row means that there would be CX4 cables and fiber crossing over the power cables running down the middle aisle. We have 36" raised floors though, so hopefully there's plenty of vertical space for separating the power cables and network cables.

The other consideration is that it appears to me vendors will be converging on SFP+ as a standard 10G pluggable form-factor - moving away from XENPAK, XFP and X2. If this happens, SFP+ Direct Attach will become the prevalent 10G copper technology, and that, I believe, only has a 10 meter reach. That would lead us back to placing a distribution switch on each side of the aisle (1 per 13 racks instead of 1 per 26 racks) - which will raise the overall cost slightly.
June 24, 2008
As if we had nothing else to do this year, we're busy planning for our new data center in Bloomington that will come on-line in the spring of 2009. I spent the better part of the afternoon yesterday working through a rough cost analysis of the various options to distribute network connectivity into the server racks, so I thought I'd share some of that with all of you. I'll start with a little history lesson :)

The machine rooms originally had a "home-run" model of networking. All the switches were located in one area and individual ethernet cables were "home-run" from servers directly to the switches. If you're ever in the Wrubel machine room, just pick up a floor tile in the older section to see why this model doesn't scale well ;-)

When the IT building @ IUPUI was built, we moved to a "zone" model. There's a rack in each area or "zone" of the room dedicated to network equipment. From each zone rack, cables are run into each server rack with patch panels on each end. All the zone switches had GbE uplinks to a distribution switch. We originally planned for a 24-port patch panel in every other server rack - which seemed like enough way back when - but we've definitely outgrown this ! So, when we started upgrading the Wrubel machine room to the zone model, we planned for 24 ports in every server rack. 24 ports of GbE is still sufficient for many racks, but the higher-density racks are starting to have 48 ports and sometimes 60 or more ports. This is starting to cause some issues !!

But first, why so many ports per rack ? Well, it's not outrageous to consider 30 1RU servers in a 44 RU rack. Most servers come with dual GbE ports built-in and admins want to use one port for their public interface and the second for their private network for backups and such. That's 60 GbE ports in a rack. - OR - In a large VMware environment, each physical server may have 6 or 8 GbE NICs in it: 2 for VMkernel, 2 for console, 2 for public network and maybe 2 more for private network (again backups, front-end to back-end server communications, etc). 8 NICs per physical server, 6 or 8 physical servers per rack and you have 48 to 64 GbE ports per rack.

So, why doesn't the zone model work ? In a nutshell, it's cable management and too much rack space consumed by patch panels. If you figure 12 server racks per "zone" and 48-ports per rack, you end up with 576 Cat6e cables coming into the zone rack. If you use patch panels, even with 48-port 1.5RU patch panels, you consume 18 RU just with patch panels. An HP5412 switch, which is a pretty dense switch, can support 264 GbE ports in 7 RU (assuming you use 1 of the 12 slots for 10G uplinks). So you'll need 2 HP5412s (14 total RU) PLUS an HP5406 (4 RU) to support all those ports. 18 + 14 + 4 = 36 - that's a pretty full rack - and you still need space to run 576 cables between the patch panels and the switches. If you don't use patch panels, you have 576 individual cables coming into the rack to manage. Neither option is very attractive !

Also, if you manage a large VMware environment, with 6 or 8 ethernet connections into each physical server, 10GbE starts looking like an attractive option (at least until you get the bill ;-). Can you collapse the 8 GbE connections into 2 10GbE connections ? The first thing that pops out when you look at this is that the cost to run 10GbE connections across the data center on fiber between servers and switches is simply prohibitive ! 10GBASE-SR optics are usually a couple grand (even at edu discounts), so the cost of a single 10GbE connection over multimode fiber is upwards of $4,000 *just* for the optics - not including the cost of the switch port or the NIC !

For both these reasons (high-density 1G and 10G) a top-of-rack (TOR) switch model starts looking quite attractive. The result is a 3-layer switching model with TOR switches in each rack uplinked to some number of distribution switches that are uplinked to a pair of core switches.

The first downside that pops out is that you have some amount of oversubscription on the TOR switch uplink. With a 48-port GbE switch in a rack, you may have 1 or 2 10GbE uplinks for either a 4:1 or 2:1 oversubscription rate. With a 6-port 10GbE TOR switch with 1 or 2 10GbE uplinks, you have a 6:1 or 3:1 ratio. By comparison, with a "zone" model, you have full line-rate between all the connections on a single zone switch although the oversubscription rate on the zone switch uplink is likely to be much higher (10:1 or 20:1). Also, the TOR switch uplinks are a large fraction of the cost (especially with 10G uplinks), so there's a natural tendency to want to skimp on uplink capacity. For example, you can save a LOT of money by using 4 bonded 1G uplinks (or 2 bundles of 4) instead of 1 or 2 10G uplinks.

My conclusion so far is that, if you want to connect servers at 10GbE, you *absolutely* want to go with a TOR switch model. If you need to deliver 48 or more GbE ports per rack, you probably want to go with a TOR model - even though it's a little more expensive - because it avoids a cable management nightmare. If you only need 24 ports (or fewer) per rack, the "zone" model still probably makes the most sense.
June 18, 2008
One thing I've done quite a bit of since taking on the network architect role last summer is meet with LSPs to discuss their networking needs. Just yesterday we met with the Center for Genomics and Bioinformatics, this morning we're meeting with the Computer Science department, and Friday with University College @ IUPUI. What I've learned is that there are many excellent LSPs and that they know their local environment better than we ever will.

As the network becomes more complex with firewalls, IPSes, MPLS VPNs and such, I think we (UITS) need to find ways to provide LSPs with more direct access to effect changes to their network configurations and with direct access to information about their network. For example, if an LSP knows they need port 443 open in the firewall for their server, what benefit does it add to have them fill out a form, which opens a ticket, which is assigned to an engineer, who changes the firewall config, updates the ticket and emails the LSP to let them know it's completed ?
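To put that in perspective, the change the LSP waits days for is often a one-liner. Our firewalls aren't IOS boxes, but in IOS-style ACL syntax (with an invented name and address) the entire request looks something like:

```
! The whole change request: allow inbound HTTPS to one server
ip access-list extended DEPT-INBOUND
 permit tcp any host 10.50.3.20 eq 443
```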

Okay, it sounds easy enough to just give LSPs access to directly edit their firewall rules (as one example) - why not just do this ?

First, you have to know which LSPs are responsible for which firewall rules. To do that you first need to know who the "official" LSPs are, but then you also need to know which IP addresses they're "officially" responsible for. It turns out this is a pretty challenging endeavor. I've been told we now have a database of "authoritative" LSPs that is accomplished by an official contact from the department (e.g. dean) designating who their LSPs are. But then you need to associate LSPs with IP addresses - and doing this by subnet isn't sufficient since there can be multiple departments on a subnet. The DHCP MAC registration database has a field for LSP, but that only works for DHCP addresses and is an optional user-entered field.

Second, you have to have a UI into the firewall configuration that has an authentication/authorization step that utilizes the LSP-to-IP information. None of the commercial firewall management products I've seen address this need, so it would require custom development. The firewall vendors are all addressing this with the "virtual firewall" feature. This would give each department their own "virtual firewall" which they could control. This sounds all fine and good, but there are some caveats.... There are limitations to the number of virtual firewalls you can create. If you have a relatively small number of large departments, this is fine, but a very large number of small departments might be an issue. Also, one advantage of a centrally managed solution is the ability to implement minimum across-the-board security standards. None of the virtual firewall solutions I've seen provide the ability for a central administrator to set base rules for security policy that the virtual firewall admins cannot override.

Third, it is possible to screw things up and, in some rare cases, one person's screw up could affect the entire system. High-end, ASIC-based firewalls are complex beasts and you should really know a bit about what you're doing before you go messing around with them. So would you require LSPs to go through training (internal, vendor, SANS, ?) before having access to configure their virtual firewall ? Would they have to pass some kind of a test ?

I don't think any of these hurdles are show-stoppers, but it will take some time to work through the issues and come up with a good solution. And this is just one example (firewalls) of many. Oh, and people have to actually buy in to the whole idea of distributing control !
June 18, 2008
Well, the last two weeks were very hectic on a number of fronts and I didn’t get a chance to post.

The Friday before last was the networking “all-hands” meeting. This was a meeting for *everyone* in UITS that is involved in supporting networks. My back-of-the-envelope head-count was up over 70, but with vacations, 24x7 shift schedules and whatnot, we ended up with about 40. I babbled on for a solid 90 minutes about the draft 10-year network plan, the discussion and work that went into developing it, and how that will translate into changes over the next year or so. After questions, we were supposed to have some great cookies and brownies, but much to my dismay, the caterers showed up an entire hour late - after most people had left.

After the snackless break, we asked everyone who had a laptop to come back for some wireless testing. We had 3 APs set up in the auditorium and during the presentation had done some testing to see how well clients balanced out across the APs (not very well, as we expected) and what throughput/latency/loss looked like with certain numbers of users on an AP. The dedicated testing was all done with 1 AP for all users. Using speed tests and file downloads, we tried to take objective and subjective measurements of performance with different numbers of users associated with that AP (10, 20, & 40). The goal was to set a cap on the number of users per AP that, when reached by a production AP, would trigger a notification so we can proactively track which locations are reaching capacity.

I spent last week out in Roseville California meeting with HP ProCurve. I don’t know about you, but trips to the west coast just KILL me ! Doesn’t matter what I do, I always wake up at 4am (7am Eastern) and, inevitably, my schedule is booked until 9-10pm Pacific. The meeting was excellent and useful - although you’re not going to get any details here because of our non-disclosure agreement.

Okay, now throw on top of this that I’ve had multiple contractors tearing up my house for the last 2 weeks, and it’s been NUTS !
May 29, 2008
We're in the middle of our 2 weeks of "downtime" before the full-scale deployment starts. So far everything is running fairly smoothly. We did a fail-over test of a primary WESM controller with 151 APs associated with it and that went very well. It's summer, of course, so although we have 200+ of the new APs deployed, the total number of simultaneous users is still quite low. We won't truly stress the system until fall classes start.

This Friday Jason and I are doing a Tech Talk for LSPs on the new wireless system. I'll give an overview of the project, and probably the core upgrade project as well, and Jason will cover the details that LSPs will need to know.

Next Friday we're having an "all-hands" meeting of the networking staff. The idea is to present the 10-year network master plan to everyone supporting the network, from the installers to the IP engineers, so they understand how this will impact them and what role they will play. There will be a particular focus on the wireless and core upgrade work this summer, the MPLS VPN deployment this fall and support for IPTV and Voice/IP. We're also going to use this meeting as an opportunity to do some performance testing of the new wireless system. We'll be in a fairly large auditorium and should have 60-70 people. We're going to do things like set up 3 APs and watch how clients balance across the 3 (or not) and put all the users on a single AP to see how many users and how much bandwidth a single AP can support.

The deployment schedule is complete, with approximately 200 APs being installed every week, starting June 9th and wrapping up on August 1st.
May 22, 2008
...or at least it feels like we are ! If you couldn't guess from my posts, last week was a tough week for us. But the team put in a lot of hard work and preparation and our first full week of wireless upgrades is going very smoothly ! (He says as he raps heartily on a nearby wooden table)

We upgraded Geology and Geological Survey on Monday, Psychology and Business Grad on Tuesday, Business (and SPEA Library) on Wednesday and they'll start connecting new APs in SPEA and Informatics in another hour or two. The Wells Library will be upgraded tomorrow, then we'll take a couple of weeks off to let the dust settle and resolve any issues before continuing the upgrades.

We did have a couple of small hiccups this week - both of which were resolved quickly. In Business, some of the power injectors that supply power on the ethernet cables running to the APs were older models that only supported "pre-standard" Power over Ethernet. The new APs do not support this pre-standard version, so we had to install some standard 802.3af power injectors quickly on the morning of the change over. Also, we discovered 2 APs in the SPEA Library whose data jacks are connected to an IDF in the Business School. This meant the old APs there went down yesterday morning when we upgraded the Business School instead of this morning when we're upgrading SPEA. We were able to get new APs installed there fairly quickly to get service back up. For those who don't know, the 2 buildings are joined at the hip and apparently some of the jacks in the SPEA building are too far away from the SPEA IDF, but close enough to an IDF in the Business School. Considering we have almost 1,000 IDFs and 60,000 data jacks, these minor oversights will happen every now and then.

We also ran into a minor bug on one of our WESM controllers yesterday (nothing service impacting). We got word last night that HP engineers were able to reproduce the bug and we're waiting to hear when the fix will be available. Thanks goes to our HP TAM (Technical Account Manager) for helping us pin this down quickly !
May 16, 2008
[Photo: 29 cartons of newly delivered access points - 348 APs in all]
This morning we completed the core upgrade transition and wireless upgrade for essentially 2 buildings - the UITS Comm Services building and WCC. The team put in 3 very long days getting ready, but it was definitely worth the effort as the changes this morning went off without a hitch (well, almost) !!

The 2 things we caught this morning were that the WESM's (aka controllers) had redundancy configured, but it was not enabled. Also, the radio port adoption settings had the right default power levels, but were set for random channel assignment instead of automatic (intelligent) channel assignment. I caught both of these before they started connecting the new APs, so they all came up with the right settings. Charlie pushed these config changes to the other 10 WESMs, so we're set to go.

On another topic, if you ever wondered what 348 wireless access points looked like, that's what you're looking at up above. They come 12 in a carton and there are 29 cartons there. Now just imagine what 4,800 would look like :) This is why we're trying to pace the delivery - otherwise we'd need a LOT of storage space !!
May 15, 2008
Unfortunately yesterday dealt us mostly the latter two !! After 12 years in the networking field, I've found some days everything seems to just go your way and some days...well...some days everything that can go wrong does and you wish you'd just stayed in bed !

For example, what's the chance of a compact flash card - one that worked perfectly fine all day long the day before - being corrupted when you try to boot a switch from it the next morning ??!! What's the chance of an upgrade to one switch causing another switch across campus to crash - repeatedly ??!! Sometimes even when you've taken every preparation possible, things just don't go your way ! The only consolation after a day like that is that the next one HAS to be better !

Looking on the bright side though, when we finally left the office about 8pm last night, 3 of the 4 core switches had been upgraded and 2 of them had their routing configuration about 75% complete. If you remember, we're converting our layer-2 only aggregation switches into layer-3 switches that will be the default gateway routers for the subnets on campus. Once the global routing configuration is completed on these switches and tested, we will transition the routing for each building to these switches. This routing transition for each building needs to happen first thing in the morning on the day the wireless in the building is upgraded. Therefore, the upgrade of these switches and the configuring of routing on them is a prerequisite for starting the wireless upgrade.
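For the curious, "turning a layer-2 switch into a layer-3 switch" mostly means giving each building VLAN a routed interface (SVI) that becomes the subnet's default gateway. A hedged IOS-style sketch with invented numbers, and assuming an HSRP-style shared gateway between the redundant switches:

```
! The building VLAN gets an SVI on the core switch; HSRP shares the
! gateway address with the redundant peer switch.
interface Vlan120
 description Building X user subnet
 ip address 10.120.0.2 255.255.255.0
 standby 1 ip 10.120.0.1          ! the default gateway clients use
 standby 1 priority 110           ! higher priority = primary on this box
```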

We should have the routing configuration completed this morning and routing for the VLANs that support the network engineers transitioned this morning. We will test throughout the day to make sure everything is working properly. The next step is to transition routing for the UITS buildings in Bloomington tomorrow morning. This will be the first test of the process of converting a building's routing followed by the wireless upgrade of the building and will give us a chance to work out kinks in the process before we start on the first 8 "pilot" buildings next week. By the time we finish those 8 pilot buildings, we should have the process running like a well-oiled machine !
May 9, 2008
OUT FOR DELIVERY

I don't know about you, but when I'm waiting for that next cool gadget to arrive in the mail, those 3 words always trigger a little burst of excitement...sort of like Christmas morning when I was 6 years old :) But today it's 255 new little gadgets !
May 8, 2008
That's right, we just received confirmation that our first 255 Access Points arrived at FedEx in Columbus, Indiana just a couple of hours ago. We'll take a few of these APs to install in all the UITS buildings later next week and another 190 of them to install in 8 buildings in Bloomington starting on May 19th. We also received 14 WESM modules (aka controllers) earlier in the week, so we have enough hardware for the complete 8-WESM deployment at IUPUI and 1 of the 2 12-WESM deployments at IUB.

In case you're interested in the technology, the WESM modules can be deployed in groups of 12 (or fewer). Wireless clients can "roam" seamlessly across all the APs that are associated to any of the 12 WESMs in the "Mobility Group". At IUPUI, we will initially have 8 WESM modules (some of which are for redundancy purposes only) in a Mobility Group to support the 600 APs we plan to deploy there. At IUB, we will have 24 WESM modules (again, some are just back-ups) in 2 separate Mobility Groups to support the 4,200 APs we plan to deploy there.
May 6, 2008
The core upgrade project is moving along smoothly as well. We received the last few bits of the equipment yesterday - a shipment of XENPAK modules - so we have everything we need to move full steam ahead. Thanks to some hard work over the weekend, all of the new 24-port and 48-port gigabit ethernet modules are installed in the core switches in Bloomington.

This Thursday and Friday they'll be swapping out the Supervisor 720a cards for Supervisor 720bxl cards in the core switches. Essentially this will allow us to turn the core switches into full blown routers with MPLS capabilities. Once this is completed, we'll connect the core switches up to the backbone, configure routing and prepare for migrating the routing function for all the campus buildings to these switches. These transitions will happen building by building as we upgrade the wireless equipment in the buildings. Once we transition the routing for all buildings to the core switches (ie complete the replacement of all existing wireless APs), the core upgrade will be complete. Sounds easy, right ?
May 6, 2008
I know....it's been a whole week since my last post...I'm a slacker !

Jason and I attended a meeting of the CIC schools in Chicago last week to talk about wireless. For those of you who don't know, the CIC is *roughly* the Big Ten Conference schools. It was great to hear what other schools are doing and to share information and ideas. One thing was definitely clear from the meeting - there is no perfect controller-based wireless product. No matter which controller-based wireless vendor you choose, you'll run into bugs, limitations and shortcomings. Heck, that's true for any network equipment !!

Yesterday we received 16 of the WESM controllers. This brings the total to 22 which is enough for the entire IUPUI installation (8), 1 of the 2 mobility groups at IUB (12), and a couple of WESMs to test with. We started yesterday getting these installed in the switches and configured. Now we're just waiting on the shipment of APs that are due to arrive early next week.

Today our Technical Account Manager (TAM) from HP is in town for our first meeting. He will be our primary support contact and will know our network well, so we don't have to bring a support person up to speed on what we're doing every time we open a case. Given the size (4,800 APs) and the short timeframe (3-4 months) of our deployment, having a dedicated support contact that knows our network will be extremely important !
April 28, 2008
I've alluded to our other big summer project a couple of times, but now I'm finally going to give you the low-down on it !

When I returned from vacation in early January, I fully anticipated that the core of our network would remain fairly static until the summer of 2009. This is when, according to the 10-year network plan we had been developing (a topic for another post), we would do a "fork-lift" upgrade of the core. Since that was only 16 months away, I set to work developing requirements, scheduling initial discussions with vendors, etc. As part of this process, I also started meeting with various departments and groups to get a little better handle on what they would need - in addition to the UITS projects I already knew the network would have to support such as VoIP and IPTV. What I learned over the course of January and early February led to a slight change of plans :)

One of the important things I learned is that there are several departments looking to take advantage of the high-capacity, high-bandwidth networked storage systems we've deployed over the last 2 years. Our MDSS tape storage system now has 24 10GbE connected "front-end" servers, each capable of moving data in and out of the system at around 3-4Gbps. The Data Capacitor is a disk-based storage system that also has 24 10GbE connected servers, each capable of moving data between the IP network and disk at around 7Gbps. I met or spoke with at least 4 departments during January that were all looking to move large data sets between their buildings and these central storage systems at very high bandwidths.

Our current architecture aggregates all the 1GbE connections from the buildings into layer-2 ethernet switches, applies 802.1Q VLAN tags and trunks all those VLANs over 10GbE links to the routers. This architecture provides a lot of flexibility and works fine for large numbers of 1GbE connected buildings using a few hundred Mbps of bandwidth each, but not so well for a dozen 10GbE buildings bursting up to 3-4Gbps each. In addition, those layer-2 ethernet switches are out of empty ports and module slots.

The solutions to this are:

(1) Move the layer-3 routing function onto the aggregation switches that terminate the fiber connections to the buildings. This removes the bottleneck between the layer-2 aggregation switches and the layer-3 routing switches. It also frees up quite a few 4-port 10GbE modules that can be reused to support 10GbE connections to buildings.

(2) Upgrade the 16-port 1GbE modules to 24 or 48 port 1GbE modules. This frees up slots in the aggregation switches (now layer-3 switches) to install the 4-port 10GbE modules to support 10GbE connections to buildings.

The other big thing that I learned about in January was PCI-DSS !! For those of you who haven't heard about PCI-DSS, that stands for Payment Card Industry - Data Security Standards. Think HIPAA on steroids for merchants that accept credit/debit cards :) PCI-DSS has a laundry list of network and process requirements that must be met in order to be compliant.

As I dug deeper into what it would take to support the PCI-DSS requirements, it became clear (to me at least) that MPLS Layer-3 VPNs were the way to go. We had already been discussing MPLS VPNs for a while and several other universities have already deployed MPLS VPNs to solve problems like this. The general problem is that there are many different groups (or groups of systems) that each have unique network requirements and that have users/machines spread across many different buildings on campus. In addition to PCI-DSS compliant systems, you have building control systems (e.g. HVAC, security cameras, door access systems, etc), IP phones, and units like the School of Medicine and Auxiliary Services that support users/systems across many buildings. In a nutshell, MPLS Layer-3 VPNs allow you to "group" these systems into separate virtual routers, each of which can have different network services and policies (firewall, NAT, IPS, etc).
April 28, 2008
I thought I should introduce some of the people working on the project so, when I say Jason did this or Dwight did that, you'll know who I'm talking about !

There's a core engineer group that is working through the myriad of engineering issues involved in getting the project from the RFP stage to the full-scale deployment phase. This is by no means a complete list of people working on the project !! A deployment of this scale involves *many* people from all parts of the organization !!

Ed Furia, who rose to fame as part of the video group, is the project manager for the wireless project. Ed's also involved in a lot of the engineering work, especially related to WPA2 Enterprise. Ed's done an excellent job of getting up to speed on the project in just a few weeks !

Jason Mueller, who hails from Iowa and the University of Iowa, started his tenure at IU just 5 short (or long) weeks ago. Jason has seriously "mad" wireless skills - (How's that for modern pre-teen lingo!) - and really great experience from deploying Iowa's wireless network.

Dwight Hazen and Charlie Escue round out the group and have loads of experience and great ideas !
April 22, 2008
It's always an exciting part of a project when those emails titled "package at the dock" start rolling into my inbox :)

Last Friday a small box with about 80 LX SFPs arrived. These are for the upgrade of the core switches we're working on (I promise I'll put up a post describing what we're doing there soon). Yesterday I got a "package at the dock" email that said there were 16 boxes from Cisco - WOO HOO !!! This got my hopes up that the new Cisco 6500 interface cards we're waiting on had started showing up early - which would be awesome ! Hans was nice enough to run over to the dock for me (remember, I'm hanging out in northern Virginia) only to find they were just the daughter cards that attach to the interface cards :-( So we'll enter them into the inventory DB, put them in the storage room and wait *patiently* for the cards they mate to. The new server hardware to upgrade the RADIUS servers showed up yesterday too !

Keep it coming !
April 21, 2008
I know you're all dying to hear what's been going on...and a LOT has happened since my last post on Thursday morning. I'm at the Internet2 Member Meeting in Arlington, VA this week and will try to make use of what free time I have to catch up again.

We met with the Messaging team last week to discuss the impact of the wireless project on DHCP and ADS. The biggest issue perhaps is the need to configure DHCP option 189 on the subnets the APs are on. Option 189 can pass up to 3 IP addresses to the APs, which is how the APs figure out which controllers to associate with. The APs hold no configuration across reboots. Each time they boot, they will learn the IP addresses of the primary and backup controllers via DHCP option 189 and will contact the controller to get their configuration.
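If you haven't bumped into raw DHCP options before, here's the idea in IOS-style syntax. This is a hypothetical sketch with invented addresses - our production DHCP actually runs on the Messaging team's servers, not on routers:

```
! DHCP scope for an AP subnet; option 189 carries the controller IPs
ip dhcp pool AP-SUBNET
 network 10.30.0.0 255.255.255.0
 default-router 10.30.0.1
 option 189 ip 10.30.255.10 10.30.255.11   ! primary and backup WESM
```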

The engineering team met on Friday morning. The primary topic was nailing down a tentative schedule, especially for the early part of the deployment. We plan to allow 1-2 weeks of testing after the equipment arrives. Then we want to deploy around 200 APs and let them "burn in" for a couple of weeks before starting deployment in earnest. In addition to testing, these first 200 APs (about 7 buildings) will give us a chance to document and verify the deployment procedures, so we can move quickly and smoothly with the remaining buildings. We're tentatively scheduling the first 7 buildings during the week of May 12th with full-scale deployment starting the first week of June.
April 17, 2008
Okay, I know all the smart kids out there are screaming, "But why aren't you using NAT?" The short answer is *time* - or the lack thereof. The HP WESMs (Wireless Edge Services Modules) do have NAT support built in. We could also use an external firewall placed in front of the WESMs to perform NAT - or rather PAT, since we really want all the wireless users to share a small pool of public IPs. We also realize that, as the number of simultaneous wireless users grows extremely large (say more than 16,000) and as our overall pool of unused IP blocks dwindles, we will absolutely need to consider NAT on wireless in order to conserve public IPv4 addresses.

HOWEVER ! We also need to deploy a few thousand APs in the next 2-3 months *AND* roll out WPA2 Enterprise *AND* roll out a new guest access portal. Oh, and we have this other little project to completely overhaul the core of the network and deploy MPLS VPNs before August (I'll dive into that project in future posts). SO, since we have an unused /16 block at our disposal, we think that's the best course of action. We won't allow incoming TCP connections (no wirelessly connected servers), and wireless clients are transient by nature, so switching to NAT later on should be fairly painless - well, for users at least :)
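If the NAT-vs-PAT distinction is fuzzy, here's a toy sketch of the PAT idea - many inside clients sharing a couple of public IPs by being handed unique (public IP, port) pairs. This is purely conceptual; the real translation would live in the firewall or WESM, and all the addresses are made up:

    # Toy PAT (port address translation) table: every inside (IP, port)
    # pair maps to a unique (public IP, port) pair drawn from a small pool.
    import itertools

    PUBLIC_POOL = ["192.0.2.1", "192.0.2.2"]  # hypothetical public IPs
    next_port = itertools.count(1024)         # next unused public port
    table = {}                                # (inside ip, port) -> (public ip, port)

    def translate(inside_ip, inside_port):
        key = (inside_ip, inside_port)
        if key not in table:
            port = next(next_port)
            table[key] = (PUBLIC_POOL[port % len(PUBLIC_POOL)], port)
        return table[key]

    # two wireless clients with private addresses share the public pool
    print(translate("10.10.1.25", 49152))
    print(translate("10.10.2.77", 49152))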

April 17, 2008
...to all the wireless users, of course ! Several weeks ago I went looking for IP subnets to assign to our new wireless SSIDs. What I discovered is that we do NOT have the vast amounts of unused IP space we once thought.

But first things first... the first step was to move our IP allocation documentation from spreadsheets and flat files into a database. Fortunately, we had already developed a nice IP allocation database for our support of networks like Internet2 and NLR, so we had a database ready to go. Now that the IP allocations for all our campuses are documented in a single place, we can look at overall IP utilization, delegate authorization to allocate addresses from specific IP blocks, and do better planning of our IP allocations. This will become very important as our IPv4 address space becomes more scarce !

Once I started looking at this, I found that, especially in Bloomington, we don't have a whole lot of unused subnets - and especially not contiguous ones. And it turns out a LOT of these are eaten up by wireless users !

According to our monitoring software, we are seeing about 5,000 simultaneous wireless users in Bloomington these days. However, our DHCP lease timers are in the 90-120 minute range [see note below]. So if someone uses wireless for 10 minutes and then shuts their laptop, their IP address is reserved for another 80-110 minutes. This means we actually have about 10,000 total host IP addresses assigned to our wireless subnets ! That's 1/6th of a whole /16 or legacy Class B block. But it gets worse !! Since users must use VPN to get full access to wireless, most of these users are also consuming an IP in the VPN address pool. So we have several thousand more IPs assigned to those pools, for a total of nearly 16,000 host IPs assigned for wireless users. That's 1/4th of an entire /16 or Class B !!!

Note: On DHCP lease timers, we'd love to decrease them, but there's an issue with some VPN clients: when they have a VPN connection up, they don't renew their lease properly, because they send DHCP packets over the VPN tunnel instead of to their local subnet. So when their DHCP lease expires, they lose their network connection until the VPN tunnel drops and they renew their lease over the local subnet. We used to have shorter lease times, but many users complained that their VPN connections kept dropping in the middle of meetings and they had to reconnect. This won't be an issue on the new WPA2 Enterprise SSID !

Even with shorter lease times on the WPA2 Enterprise network, given the level of growth we're seeing in wireless usage and all the new wireless clients from the expansion into the dorms, we think we need to allocate at least 16,000 host IPs to the new wireless network. Since we can't reclaim the IP space from the current wireless network until users transition to the new one, we need to come up with a new /18 block of IPs. The *ONLY* block we can take this from is 140.182.0.0/16, which is the last unused Class B network we have. Since we've never used this block, we need to give ample warning to all system administrators in case they have host firewalls that need to be updated. And THAT, my friends, is at the top of my to-do list for today !
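If you want to check my math, here's the back-of-the-envelope version (a quick sketch using Python's ipaddress module; the specific /18 shown is just for illustration - we haven't picked one yet):

    # Back-of-the-envelope check of the wireless address math.
    import ipaddress

    class_b = ipaddress.ip_network("140.182.0.0/16")
    print(class_b.num_addresses)              # 65536 addresses in a /16

    # ~5,000 simultaneous users, but 90-120 minute leases keep addresses
    # reserved long after laptops close, roughly doubling what we consume:
    leased = 2 * 5000                         # ~10,000 wireless host IPs
    print(leased / class_b.num_addresses)     # ~0.15, about 1/6 of the /16

    # add the VPN pools and we're near 16,000, which is why the new
    # network needs a /18 (which /18 we carve out is TBD):
    new_block = ipaddress.ip_network("140.182.0.0/18")
    print(new_block.num_addresses)            # 16384, 1/4 of a /16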

April 16, 2008
Back in the good old days, when you had to carry around a PCMCIA card in order to use Wifi - even before the term "Wifi" was coined - Wireless Access Points (WAPs or just APs) provided all the functionality of 802.11 wireless in a single device. Each AP minded its own business and did its own thing: communicating with clients over radio frequencies, encrypting and decrypting packets if necessary, and passing those packets onto the wired network. This was all well and good when Wifi hotspots were - well, just that - "spots" - individual, isolated locations. But as companies and universities started deploying very large areas of contiguous coverage - sometimes with thousands of APs - some issues surfaced with this model.

For example, with individual autonomous APs, someone has to manually tune the power of the radio signal on each AP so that, together, the APs cover an entire area. Also, as clients roam between APs with encryption enabled, each client needs to establish a new encryption key with each new AP, which takes time and causes a short "outage" during the transition.

But, what if there was a central "controller" that controlled all the APs and knew everything all the APs knew ? Then the controller could tell each AP how strong its radio signal needs to be in order to "fill" an area. And the encryption keys could reside on the controller instead of the APs, so that as clients roam between APs they don't need to renegotiate an encryption key.

Thus the terms "thick" or "fat" APs and "thin" APs were born ! With these new "controller-based" systems, the functionality of the traditional AP is split between the APs (or, in HP-speak, Radio Ports) and one or more central controllers. This architecture provides a number of advantages beyond the ones I mentioned, and nearly all the enterprise-class wireless systems on the market today use this model.
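If it helps to see the roaming benefit in code, here's a very hand-wavy sketch of the controller-holds-the-keys idea. Every name in it is hypothetical, and a real controller does enormously more than this:

    # Hand-wavy sketch of the thin-AP model: the controller owns each
    # client's session key, so roaming to a new AP reuses the key instead
    # of renegotiating it. All names and behavior are illustrative only.
    import os

    class Controller:
        def __init__(self):
            self.keys = {}  # client MAC -> session key

        def key_for(self, client_mac):
            if client_mac not in self.keys:
                self.keys[client_mac] = os.urandom(16)  # negotiate once
            return self.keys[client_mac]

    class ThinAP:
        def __init__(self, name, controller):
            self.name, self.controller = name, controller

        def associate(self, client_mac):
            had_key = client_mac in self.controller.keys
            key = self.controller.key_for(client_mac)
            action = "reused existing key" if had_key else "negotiated new key"
            print(f"{self.name}: {client_mac} associated, {action}")
            return key

    ctrl = Controller()
    ap1 = ThinAP("rp-1", ctrl)
    ap2 = ThinAP("rp-2", ctrl)
    k1 = ap1.associate("00:11:22:33:44:55")
    k2 = ap2.associate("00:11:22:33:44:55")  # roam: same key, no rekey
    assert k1 == k2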
April 16, 2008
Well, we really have a few main goals associated with upgrading the wireless network.

1) Replace the old wireless hardware that is now 6+ years old with a "modern" system that is much more capable. If you stay tuned, I'll cover what a lot of those new capabilities are...

2) Expand coverage area especially in the Bloomington Halls of Residence, but also in other areas of the IUB and IUPUI campuses. Our current wireless deployment in the IUB Halls only covers common areas such as lounges. We will be expanding to cover nearly every square inch of the Halls of Residence (ok, don't quote me on that every square inch thing) - that means coverage in every student room as well as all common areas. To give you an idea of the scale, we currently have a total of about 1,200 Access Points (APs) for the entire Bloomington campus. We will be adding 2,700 new APs *just* in the Halls of Residence ! That's a LOT of wifi !

3) Deploy WPA2 Enterprise to replace VPN as the mechanism for accessing wireless securely. I'll have one or more posts dedicated to discussing what WPA2 Enterprise is and how it works, but in a nutshell it provides an authenticated and encrypted wireless connection in a way that is MUCH more user-friendly than VPN. I've been using it in our test environment for several months, and I can tell you, once you use WPA2 Enterprise, you'll never want to use VPN to connect to wireless again !
April 16, 2008
After MANY months of working on the RFP for IU's next-generation wireless network, we FINALLY made an award this past Monday !! Of course that means now the REAL work starts !

Starting very soon now (soon = about 3-4 weeks from today) we will start deploying a new, improved and much larger wireless network using HP ProCurve's ZL wireless system. I know, I know - you probably have all sorts of questions like "why are we doing this ?" and "what do I get out of it ?". Hang tight ! I'm well over 30 and new to this newfangled blogging thing, but I'll try to get everyone up to speed over the next week or so. So stay tuned and you'll learn everything you wanted to know about wireless and probably a few things you didn't want to know !

If you want the Reader's Digest version, hang on for a few more days and I'll post a link to the video podcast we just shot about an hour ago over at the Wells Library. I think I covered all the topics I had in my outline, so this should give you a reasonably decent overview.