List of Software Used by Facebook - The big challenge for Facebook's engineers is to keep the site up and running smoothly while handling nearly half a billion active users. This article looks at some of the software and techniques they use to achieve this.
Before we get into the details, here are some facts that illustrate the scale of the challenge Facebook has to deal with:
Facebook serves 570 billion page views per month (according to Google Ad Planner).
There are more photos on Facebook than on all other photo sites combined.
More than 3 billion photos are uploaded each month.
Facebook serves 1.2 million photos per second. This does not include images served by Facebook's CDN.
More than 25 billion pieces of content (status updates, comments, etc.) are shared each month.
Facebook has more than 30,000 servers (and this figure is from last year!)
In some ways Facebook is still a LAMP-based website, but it has had to change and extend its operation to include many other elements and services, and to modify its approach to the existing ones.
For example:
- Facebook still uses PHP, but has built a compiler for it so that it can be turned into native code on the web servers, thus improving performance.
- Facebook uses Linux, but has optimized it for its own purposes (especially on the networking side).
- Facebook uses MySQL, but primarily as a persistent key-value store, moving joins and logic onto the web servers, since optimizations are easier to perform there (on the "other side" of the Memcached layer).
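The MySQL point above can be illustrated with a minimal sketch. It assumes a single key-value table and uses Python's built-in sqlite3 module as a stand-in for MySQL; the table layout, key names and helper functions are hypothetical, not Facebook's actual schema.

```python
import sqlite3

# sqlite3 stands in for MySQL; the table and key names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
db.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [
        ("user:1", "Alice"),
        ("user:2", "Bob"),
        ("friends:1", "2"),  # comma-separated ids of user 1's friends
    ],
)


def get(key):
    """Single-key lookup: the only kind of query the database sees."""
    row = db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


def friend_names(user_id):
    """The 'join' is done here, in application code, not in SQL."""
    raw = get(f"friends:{user_id}") or ""
    friend_ids = [fid for fid in raw.split(",") if fid]
    return [get(f"user:{fid}") for fid in friend_ids]
```

Because every database access is a plain lookup by key, each one is also easy to put behind a cache.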
Then there are the custom-written systems, like Haystack, a highly scalable object store used to serve Facebook's enormous number of photos, or Scribe, a logging system that can operate at Facebook's scale.
Now let's look at (some of) the software Facebook uses to run the world's largest social networking site.
Memcached.
Memcached is by now one of the most widely used pieces of software on the internet. It is a distributed memory caching system that Facebook (and many other sites) uses as a caching layer between the web servers and the MySQL servers (since database access is relatively slow).
Over the years, Facebook has made many optimizations to Memcached and the surrounding software (such as optimizing the network stack).
Facebook runs thousands of Memcached servers with tens of terabytes of cached data at any one point in time. It is probably the world's largest Memcached installation.
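A caching layer of this kind is often used in a "cache-aside" pattern: look in the cache first, fall back to the database on a miss, then store the result. The sketch below only illustrates that pattern. A plain dictionary stands in for the Memcached cluster, and the class and its loader callback are hypothetical, not Facebook's code or the Memcached client API.

```python
class CacheAside:
    """Read-through helper: a dict stands in for a Memcached cluster."""

    def __init__(self, load_from_db):
        self._cache = {}
        self._load_from_db = load_from_db  # hypothetical slow database lookup
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key):
        if key in self._cache:             # cache hit: the database is not touched
            return self._cache[key]
        self.db_reads += 1                 # cache miss: go to the database once
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)
```

Repeated reads of the same key reach the database only once until the key is invalidated.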
HipHop for PHP.
PHP, being a scripting language, is relatively slow compared to code that runs natively on the server. HipHop converts PHP into C++ code, which can then be compiled for better performance.
This has allowed Facebook to get more out of its web servers, since it relies heavily on PHP to serve content.
A small team of engineers (initially just three people) at Facebook spent 18 months developing HipHop, and it is now in production.
Haystack.
Haystack is Facebook's high-performance photo storage and retrieval system (strictly speaking, Haystack is an object store, so it does not necessarily have to store photos). It has a heavy workload: more than 20 billion photos have been uploaded to Facebook, and each one is stored in four different resolutions, resulting in more than 80 billion photos.
And it is not just about being able to handle billions of images; performance is also critical. As mentioned earlier, Facebook serves around 1.2 million photos per second, a figure that does not include images served by Facebook's CDN.
BigPipe.
BigPipe is a dynamic web page serving system developed by Facebook. Facebook uses it to serve each web page in sections (called "pagelets") for optimal performance.
For example, the chat window is retrieved separately, the news feed is retrieved separately, and so on. Pagelets can be retrieved in parallel, which keeps performance up, and also lets users keep using the site even if some part of it is deactivated or broken.
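The idea of independent pagelets can be sketched with a thread pool. This is only an illustration of rendering sections in parallel and tolerating a broken one; it is not how BigPipe itself is implemented, and the function and pagelet names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(pagelets):
    """Render every pagelet in parallel; a failing pagelet is left empty."""

    def safe_render(item):
        name, render = item
        try:
            return name, render()
        except Exception:
            return name, ""  # this section is skipped, the rest of the page survives

    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(safe_render, pagelets.items()))


def broken_pagelet():
    raise RuntimeError("backend unavailable")


page = render_page(
    {
        "chat": lambda: "<div>chat</div>",
        "news_feed": lambda: "<div>news feed</div>",
        "ads": broken_pagelet,
    }
)
```

Here the failing "ads" pagelet ends up as an empty string while the chat and news feed sections are still rendered.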
Cassandra.
Cassandra is an open source distributed database management system. Facebook developed it and uses it as the NoSQL store behind its Inbox Search feature. Besides Facebook, a number of other services use it as well, such as Digg.
Scribe.
Scribe is a flexible logging system that Facebook uses internally for many purposes. It was built to handle logging at Facebook's scale, and it automatically handles new logging categories as they appear.
Hadoop and Hive.
Hadoop is an open source map-reduce implementation that makes it possible to perform calculations on massive amounts of data. Facebook uses it for data analysis (and as we all know, Facebook has massive amounts of data).
Hive was developed by Facebook, and makes it possible to run SQL queries against Hadoop, making it easier for non-programmers to use.
Both Hadoop and Hive are open source and are used by a large number of services, such as Yahoo and Twitter.
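To give a feel for what map-reduce means, here is a tiny in-memory sketch of the three phases (map, shuffle, reduce). It does not use Hadoop or Hive; the helper names and the sample records are hypothetical. The job corresponds roughly to the SQL query shown in the comment, which is the kind of query Hive lets people write instead.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as happens between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each group of values into one result per key."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Roughly: SELECT country, COUNT(*) FROM visits GROUP BY country
visits = [
    {"user": 1, "country": "ID"},
    {"user": 2, "country": "SE"},
    {"user": 3, "country": "ID"},
]


def mapper(visit):
    yield visit["country"], 1


def reducer(country, counts):
    return sum(counts)


result = reduce_phase(shuffle(map_phase(visits, mapper)), reducer)
```

In a real cluster the map and reduce steps run on many machines at once; the sketch only shows the data flow.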
Thrift.
Facebook uses several different languages for its different services. PHP is used for the front end, Erlang is used for Chat, and Java and C++ are also used in several places (and perhaps other languages as well).
Thrift is a cross-language framework developed internally to tie all of these different languages together, making it possible for them to talk to each other. This makes cross-language development much easier for Facebook.
Varnish.
Varnish is an HTTP accelerator that can act as a load balancer and also cache content, which can then be served lightning fast. Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day. Like almost everything Facebook uses, Varnish is open source.
Other things that help Facebook run smoothly
We have mentioned some of the software that makes up Facebook's system and helps the service run properly. But operating such a system is a complex task. Here are some of the things Facebook does to keep its service running smoothly.
Gradual releases and dark launches.
Facebook has a system called Gatekeeper that lets it run different code for different sets of users. This lets Facebook do gradual releases of new features, A/B testing, activating certain features only for Facebook employees, and so on.
Gatekeeper also lets Facebook do something called "dark launches", i.e. activating elements of a certain feature behind the scenes before it goes live (without users noticing).
This acts as a real-world test and helps expose bottlenecks and other problems before a feature is officially launched. Dark launches are usually done two weeks before the official launch.
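Gatekeeper itself is internal to Facebook, so the following is only a sketch of how such a gate could work in principle, assuming a percentage rollout based on a stable hash of the user id plus an employee override. The `Gate` class, the `bucket` helper and the feature names are all hypothetical.

```python
import hashlib


def bucket(feature, user_id):
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


class Gate:
    """Hypothetical feature gate: percentage rollout plus employee override."""

    def __init__(self, feature, percent=0, employees=False):
        self.feature = feature
        self.percent = percent      # share of users who get the new code path
        self.employees = employees  # if True, employees always get it

    def allows(self, user_id, is_employee=False):
        if self.employees and is_employee:
            return True
        return bucket(self.feature, user_id) < self.percent


# Employees only: a typical first step before any public rollout.
internal = Gate("new_chat", percent=0, employees=True)

# Gradual release: raise percent step by step (10, 50, 100).
gradual = Gate("new_chat", percent=10)
```

Hashing the feature name together with the user id keeps each user in the same group from one request to the next, which is also what A/B testing needs.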
Profiling of the live system.
Facebook carefully monitors its systems, including the performance of every single PHP function in the live production environment. This PHP profiling is done using an open source tool called XHProf.
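XHProf is a PHP tool, so the sketch below is not its API. It only illustrates, in Python, the general idea of recording call counts and wall-clock time per function while the code runs. The `profiled` decorator, the `stats` table and the sample function are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Per-function totals: number of calls and accumulated wall-clock seconds.
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def profiled(func):
    """Record how often func is called and how long it takes in total."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = stats[func.__name__]
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start

    return wrapper


@profiled
def render_profile(user_id):
    return f"<h1>user {user_id}</h1>"


render_profile(1)
render_profile(2)
```

After the two calls, `stats["render_profile"]` holds the call count and the total time spent in that function.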
Gradual feature disabling for added performance.
If Facebook runs into performance problems, there are a number of levers that let it gradually disable less important features to boost the performance of Facebook's core features.
As we have seen, Facebook relies heavily on open source software to build its systems. And it does not just use it: Facebook has also contributed to open source software such as Linux, Memcached, MySQL, Hadoop, and others.
Beyond that, Facebook has also released software it developed internally as open source. Examples of open source projects that originated at Facebook are HipHop, Cassandra, Thrift and Scribe.
A list of the open source software that Facebook is involved in developing can be found on Facebook's Open Source page.
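The "levers" idea can be sketched as a priority list from which the least important features are switched off first as load rises. The feature names, their ordering and the `enabled_features` function are hypothetical and only illustrate the principle.

```python
# Hypothetical features, ordered from least important to most important.
FEATURES_BY_PRIORITY = ["suggestions", "chat", "photos", "news_feed"]


def enabled_features(load_level):
    """Return the features that stay on at a given load level.

    Level 0 keeps everything on. Each further level switches off the least
    important feature that is still enabled. The most important feature is
    never switched off.
    """
    drop = min(max(load_level, 0), len(FEATURES_BY_PRIORITY) - 1)
    return FEATURES_BY_PRIORITY[drop:]
```

For example, `enabled_features(2)` keeps only the two most important features in the list.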
Are there other sites like this?
Incredible.
:D