How does Facebook manage the insane amount of data that one billion users pour into the service nearly every day? Wired explains how the company manages and analyzes it using custom-built software like Prism and Scuba. Facebook has the world's largest Hadoop cluster (a group of servers connected using Hadoop's open-source software), with more than 4,000 machines containing over 100 petabytes of data. Even more impressive, it isn't Facebook's only cluster. Managing the constantly swelling system takes some of the greatest engineering and computing minds, but as database administration and storage systems manager Santosh Janardhan told Wired, "if you're a technical guy, this is like Candy Land."