What is Apache Spark Used For?

This article was originally posted by Edd Dumbill of Silicon Valley Data Science and is re-posted here with Edd’s permission. We thought it would be a great complement to these other StampedeCon presentations: Using Multiple Persistence Layers in Spark to Build a...
Will big data allow the right to be forgotten?

In a decision with implications for big data usage, the Court of Justice of the European Union has ruled that users have the right to request that Google remove search results that contain private or otherwise sensitive data. The search engine giant will be obliged to...
Scaling R with Hive via Pluggable Query Generation

Apache Hive is a good tool for performing ETL and basic analytics, but its capabilities for statistical analysis and data exploration are limited. R, on the other hand, has become a preferred language for analytics, as it offers a wide variety of statistical and graphical...
Piloting Big Data: Where To Start?

You know you have data, perhaps in various data silos. You may even feel that there is more useful data that you could be collecting. You know there are problems it can solve. But how do you bridge the gaps in between? Even after experimenting with Big Data...