Defining The Data Lake

The following is a guest post from John Mallory, CTO of Analytics in EMC’s Emerging Technology Division.  John will be in St. Louis at the StampedeCon 2016 Big Data Conference presenting Best Practices For Building & Operating A Managed Data Lake. Due to the...
Apache NiFi Not From Scratch

This is a guest post from Paul Boal, Big Data Practice Lead at Amitech Solutions.  Paul will be in St. Louis at the StampedeCon 2016 Big Data Conference presenting Using The Internet of Things for Population Health Management.  You can also get hands-on experience...
What is Apache Spark Used For?

This article was originally posted by Edd Dumbill of Silicon Valley Data Science and re-posted here with Edd’s permission.  We thought it would be a great complement to these other StampedeCon presentations: Using Multiple Persistence Layers in Spark to Build a...
Solving Large-scale Offline Data Ingestion Challenges

New mobile, social, sensor and click-stream data from consumers, the Internet of Things and even the “Enterprise of Things” is creating both opportunities and challenges for organizations that are trying to harness this data’s potential. One...
Will big data allow the right to be forgotten?

In a decision with implications for big data usage, the European Union Court of Justice has ruled that users have the right to request that Google remove search results containing private or otherwise sensitive data. The search engine giant will be obliged to...
Scaling R with Hive via Pluggable Query Generation

Apache Hive is a good tool for performing ETL and basic analytics but is limited in statistical analysis and data exploration capabilities. R, on the other hand, has become a preferred language for analytics, as it offers a wide variety of statistical and graphical...