Data too big for your algorithm? Use more machines
Data science strives to build applications that help solve modern, complex business problems. Often these problems require solutions that scale through distributed, parallel computing. However, many of the analysis techniques we already know do not appear to scale directly in a distributed setting. This post discusses how we can take advantage of the Central Limit Theorem to scale some of our more advanced analysis tools.