A look at Facebook’s efficient information architecture

Facebook is a success story not only in social networking but across the technology sector. It dwarfs IBM by market capitalization (US$212 billion versus US$161.6 billion for IBM), keeps itself relevant through acquisitions, and is constantly innovating. It is the back-end innovation that holds valuable lessons for enterprises: by keeping its backbone running efficiently and cheaply, Facebook ensures it meets its long-term infrastructure needs. So what is Facebook doing?

Meeting transactional needs

Facebook processes 930 million photo uploads daily from 1.35 billion users, along with six billion likes and 12 billion messages sent each day. That puts a clearly high load on its data centres, which means significant heat generation. Facebook lowers its cooling costs by using cold outdoor air.
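To put the daily totals above in perspective, a quick back-of-envelope conversion turns them into sustained per-second rates (illustrative arithmetic only, assuming load were spread evenly across the day):

```python
# Convert the article's daily event totals into sustained per-second rates.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_events = {
    "photo uploads": 930_000_000,
    "likes": 6_000_000_000,
    "messages": 12_000_000_000,
}

for name, per_day in daily_events.items():
    per_second = per_day / SECONDS_PER_DAY
    print(f"{name}: ~{per_second:,.0f}/s sustained")
```

Even averaged evenly, that works out to tens of thousands of photo uploads and over a hundred thousand messages every second, with real peaks higher still.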

Open Compute Project

In 2011, Facebook started the Open Compute Project. Though as of 2013 the designs were not yet running in live data centres, the project's goal is to improve energy efficiency. Another way Facebook has driven its efficiency goals is through HHVM, which uses a just-in-time (JIT) compilation approach to obtain superior performance. Together with HPHP, which optimizes PHP source code, Facebook caches database and web data so that code runs very efficiently. The company generates a five- to six-fold efficiency advantage with HHVM over Apache. Overall, even at peak demand, the data centres are still able to perform when servers are under full load.
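The caching half of that strategy can be sketched in a few lines. This is a minimal Python illustration of the general pattern, not Facebook's actual code; `render_page` and its contents are hypothetical stand-ins for expensive database-backed page rendering:

```python
import functools
import time

# Hypothetical stand-in for an expensive database query plus template render.
@functools.lru_cache(maxsize=1024)
def render_page(user_id: int) -> str:
    time.sleep(0.01)  # simulate database and rendering work
    return f"<html>profile for user {user_id}</html>"

render_page(42)  # first call pays the full cost
render_page(42)  # repeat calls are served from the in-memory cache
```

The same idea, at vastly larger scale, is why serving repeated reads from memory rather than recomputing them keeps per-request cost low even under full load.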


Efficiency with the news feed rack

Facebook keeps recent activity stored in RAM. When the aggregator receives a request, its query is sent out in parallel to 40 servers in a rack. Each leaf aggregator, which holds the data, takes the ranking algorithm's results into account and consolidates the information in its response. The system then gathers this subset of data, ranks it by user interest, and displays the result.
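The fan-out described above is a classic scatter-gather pattern. The sketch below shows the shape of it in Python; the leaf data, scores, and story names are invented placeholders, and a real rack would make network RPCs to its ~40 leaf servers rather than local calls:

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Placeholder leaf data: each "server" holds (score, story) pairs in RAM.
LEAF_SERVERS = [
    [(0.9, "photo A"), (0.4, "status B")],
    [(0.8, "video C"), (0.7, "link D")],
    [(0.6, "status E")],
]

def query_leaf(leaf):
    # In a real rack this would be a parallel network RPC to one leaf server.
    return leaf

def aggregate(top_n=3):
    # Scatter: query every leaf in parallel. Gather: merge by ranking score.
    with ThreadPoolExecutor(max_workers=len(LEAF_SERVERS)) as pool:
        results = pool.map(query_leaf, LEAF_SERVERS)
    merged = [item for leaf in results for item in leaf]
    return heapq.nlargest(top_n, merged)  # highest-scored stories first

print(aggregate())
```

Because each leaf returns only its locally ranked slice, the aggregator merges small, pre-filtered result sets instead of scanning everything itself.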

Data storage

Facebook uses Hadoop for the data warehouse involved with data processing. Since data I/O (input/output) is intense for functions like graph search, Facebook uses flash sled storage. Each unit has between 256 and 512 GB of RAM. A 20-server rack could then have 320 CPU cores, three terabytes of RAM, and 30 TB of flash.
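Dividing those quoted rack totals evenly across the 20 servers gives the implied per-server figures (simple arithmetic only, using decimal units):

```python
# Per-server resources implied by the quoted 20-server rack totals.
servers = 20
rack = {"cpu_cores": 320, "ram_gb": 3_000, "flash_gb": 30_000}

per_server = {resource: total / servers for resource, total in rack.items()}
print(per_server)  # {'cpu_cores': 16.0, 'ram_gb': 150.0, 'flash_gb': 1500.0}
```

That is roughly 16 cores, 150 GB of RAM, and 1.5 TB of flash per server in this configuration.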




No virtualization

Facebook does not appear to use virtualization in its architecture. The company's site is constantly active, so its servers are hardly ever idle. Virtualization makes sense when workloads are idle: the system can reduce server count by consolidating or shutting down the idle machines. Facebook instead builds with scale in mind, balancing utilization across servers for greater efficiency.
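A toy model makes the trade-off concrete. Virtualization pays off when average utilization is low enough that workloads can be packed onto fewer hosts; with constant near-full load there is almost nothing to reclaim. The utilization figures below are invented for illustration:

```python
def hosts_needed(util_pct, capacity_pct=100):
    # Lower bound on hosts if workloads could be perfectly consolidated:
    # ceiling of total demand over per-host capacity (integer math).
    total = sum(util_pct)
    return -(-total // capacity_pct)

idle_fleet = [10] * 20  # bursty enterprise fleet, ~10% utilization each
busy_fleet = [95] * 20  # constant-load fleet, near full utilization

print(hosts_needed(idle_fleet))  # 2  -> consolidation shrinks 20 servers to 2
print(hosts_needed(busy_fleet))  # 19 -> virtualization saves almost nothing
```

For a fleet that is always busy, the virtualization layer adds overhead without reducing the server count, which is consistent with Facebook's choice.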

Future expectations

There is a trend toward faster NICs (network interface controllers). Network bandwidth will become a limiting factor for data centres, so network speeds will need to advance. This will transform the way data centres run. For Facebook, faster network speeds will mean changing the way it writes the code for its services.


Chris Lau