"Your Entertainment Source" – WWW.INSIDEBOXMAG.COM
When Facebook started work on its new data center in Forest City, North Carolina, the idea was to create pretty much an exact copy of the new-age facility the company had just built in the high desert of central Oregon. “The blueprint we’d put together was pretty good,” says Jay Parikh, the man who oversees Facebook’s entire data center infrastructure. “We felt that all we needed to do was lather, rinse, and repeat.”
But about two months into the project, Parikh and company decided this was a poor idea — not because the Oregon facility was deficient in any way, but because Facebook’s network traffic had changed in a big way and, as is always the case in the internet world, more changes were on the horizon. “We decided to change everything,” Parikh says. “We realized that we have to make sure our infrastructure is several steps ahead of what we need now.”
What Facebook noticed was a significant jump in the traffic generated by its internal services — software systems that generate things like friend recommendations and real-time notifications. These services work in tandem to build each new Facebook page, and the traffic traveling between them was growing exponentially faster than the traffic to and from the internet.
So, in building the North Carolina facility, Parikh and his team overhauled the entire network to accommodate this trend. And just for good measure, they revamped the servers as well. They kept the basic design of the data center building used in Oregon. Though they’ve installed additional cooling hardware for those summer days when North Carolina temperatures exceed what’s typical in the Oregon high desert, the Forest City facility still cools its server rooms with outside air. But inside the data center, nothing is the same.
With its Prineville, Oregon facility, Facebook joined a small group of internet giants that now build their own data centers and, in some cases, their own servers and other hardware. Like Google, Microsoft, Amazon, eBay, and Yahoo, the social networking behemoth aims not only to significantly reduce the cash, power, and hassle needed to operate one of the web’s most popular services, but also to maintain the speed of that service amid competition from a host of rivals. And with its late decision to revamp the hardware in its North Carolina data center, Facebook shows just how important it is to keep pushing the proverbial envelope.
Facebook has previously discussed the new server designs used at its North Carolina facility. These have been “open sourced” under the aegis of the Open Compute Foundation, an organization founded by Facebook in an effort to improve hardware designs across the computing industry. But this is the first time the company has revealed its change in network topology. Jay Parikh — who took over as Facebook’s head of infrastructure engineering in November 2009 — discussed the new data center with us this week, before detailing the Forest City changes during a keynote speech at a tech conference in Silicon Valley on Tuesday.
According to Parikh, Facebook has completely revamped its internal network, from the network cards installed in the servers to the switches that connect racks of servers to the core data center network to the switches and routers that make up that core to the cables that connect everything together. For the first time, the company is running its entire network at 10 gigabits per second, which boosts the raw speed of the network tenfold, and this required all new hardware.
Facebook is not unusual in moving to 10Gbps. We suspect that Google — which designs its own networking gear — has already moved to 40 or 100Gbps. But according to Matthias Machowinski — a directing analyst with Infonetics, a research firm that tracks the networking market — the official market for 10-gigabit Ethernet is still relatively small. In 2011, he says, the official market spanned only about 9 million “ports,” or connections to servers.
At the same time, Facebook has overhauled the topology of the network. Previously, the company used what’s called a “layer 2” network — which means it routed traffic using the basic Ethernet protocol — and all servers used the same core network to connect to each other as well as the outside world. But the company decided this needed to change when it realized that traffic between its servers was growing so quickly. According to Parikh, “inter-cluster” traffic has more than doubled over the past seven months.
Over the past seven months, the traffic moving between Facebook’s servers has nearly doubled, while the traffic between the servers and the outside world has grown at a far steadier rate. Image: Facebook
“There are so many services behind Facebook. Whether you get a friend recommendation or a real-time notification or an ad, all of those are driven by different services running on the back-end,” he says. “Because of the way these services are connected to each other, we saw this exponential growth in inter-cluster bandwidth — the servers inside of Facebook talking to other servers inside of Facebook.”
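To make that dynamic concrete, here’s a minimal sketch of how one user request can fan out into many back-end calls. The service names, call graph, and byte counts below are entirely hypothetical — an illustration of the pattern Parikh describes, not Facebook’s actual architecture.

```python
# Illustrative sketch only. Models how one page request fans out across
# hypothetical back-end services, so internal (server-to-server) traffic
# grows much faster than traffic to and from the user.

REQUEST_BYTES = 10_000  # assumed size of a single service-to-service request

# Hypothetical call graph: each service lists the services it calls.
CALL_GRAPH = {
    "page_builder": ["friend_recs", "notifications", "ads"],
    "friend_recs": ["social_graph", "ranking"],
    "notifications": ["social_graph"],
    "ads": ["ranking"],
    "social_graph": [],
    "ranking": [],
}

def internal_calls(service: str) -> int:
    """Count every back-end call triggered by one request to `service`."""
    calls = 0
    for dep in CALL_GRAPH[service]:
        calls += 1 + internal_calls(dep)  # the call itself, plus its own fan-out
    return calls

fan_out = internal_calls("page_builder")
print(f"1 user request -> {fan_out} internal service calls")
print(f"~{fan_out * REQUEST_BYTES:,} bytes of internal traffic per request")
```

Even in this toy graph, a single page request triggers seven internal calls, and adding one more layer of services multiplies the internal traffic again while the user-facing side remains a single request and response.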
So, the company moved to a “layer 3” network — where traffic is routed using the high-level Border Gateway Protocol, which is also used to route traffic in the heart of the internet — and it installed a new set of network routers dedicated to moving data between servers. “We had to rethink the entire topology,” Parikh says. “It separates the traffic going out to our users from the traffic going out across the data centers.”
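The practical difference between the two approaches can be sketched in a few lines. The snippet below is an illustration with made-up addresses and next hops, not Facebook’s configuration: a layer 2 switch keeps a flat table with an entry per machine, while a layer 3 router does a longest-prefix match against IP ranges, so an entire cluster collapses into one route entry and inter-cluster traffic can be steered onto its own path.

```python
import ipaddress

# Layer 2: a flat table keyed on exact MAC addresses -- one entry per
# machine on the network. (Addresses here are made up.)
mac_table = {
    "aa:bb:cc:00:00:01": "port 1",
    "aa:bb:cc:00:00:02": "port 2",
}

# Layer 3: routes keyed on IP prefixes, so whole clusters collapse into a
# single entry. Prefixes and next hops are hypothetical.
route_table = [
    (ipaddress.ip_network("10.1.0.0/16"), "cluster-A spine switch"),
    (ipaddress.ip_network("10.2.0.0/16"), "cluster-B spine switch"),
    (ipaddress.ip_network("0.0.0.0/0"), "edge router (internet)"),
]

def route(dst: str) -> str:
    """Longest-prefix match: the most specific matching route wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in route_table if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(route("10.2.33.7"))      # server-to-server -> cluster-B spine switch
print(route("93.184.216.34"))  # user-facing      -> edge router (internet)
```

In a real deployment, BGP’s job is to populate a table like `route_table` automatically, with each router advertising the prefixes it can reach; the lookup itself is the forwarding step shown here.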
In North Carolina, Facebook has installed a new sub-network that handles only traffic traveling between servers. Previously, the company did not use this sort of “cluster network,” and all traffic was handled by the same network that connected the servers to the outside world. Image: Facebook
With the servers themselves, Facebook continued its effort to reduce costs by streamlining data center hardware. Though the company had just designed a new breed of server for its Oregon data center, its engineers put together an entirely new design for North Carolina.
The basic idea is to strip these machines down to their bare essentials — Facebook calls it “vanity-free” engineering — but the company also works to reduce the electrical power needed to run the servers and the manpower needed to repair and replace them.
With its latest server designs, Facebook has packed two server motherboards into each chassis — not just one — letting them share other hardware, such as power supplies. Plus, the company has moved each hard drive to the front of the chassis, so that techs can more easily remove and replace it. According to Parikh, the company has improved the design of its servers to the point where it needs only a single data center technician for every 15,000 servers.