StampedeCon Big Data Sessions
Investigate the world of big data and what that means for the present and future as industries and technologies evolve.
Slides are now available from almost all of our sessions at StampedeCon 2016. To view the slide decks, click on the individual session in the agenda below.
Sessions selected for StampedeCon 2016 cover the full range from big data strategy to big data architecture, with deep dives into big data technology in technical workshop sessions. Each session is led by expert speakers from big data technology providers like Hortonworks, EMC, and Cloudera or from big data powered companies like Pandora, Monsanto, and Graybar Electric. We have two presentation tracks which allow attendees to select from Business and Strategy or Architecture and Technical presentations throughout the entire 2-day event. Business/Strategy sessions are held in the Colonnade Ballroom while Architecture/Technical sessions are held in the Breckenridge Ballroom.
July 27, 2016
The Internet of (Human) Things is just beginning to take shape. The human body is an inexhaustible source of data about personal health, and the healthcare industry is just beginning to scratch the surface of the potential insights and value that will come from that data. While much of healthcare traditionally focuses on the episodic […]Click for more information on 'Using The Internet of Things for Population Health Management'
Hadoop adoption is a journey. Depending on the business the process can take weeks, months, or even years. Hadoop is a transformative technology so the challenges have less to do with the technology and more to do with how a company adapts itself to a new way of thinking about data. There are challenges for […]Click for more information on 'The Big Data Journey – How Companies Adopt Hadoop'
The Twitter data firehose delivers hundreds of millions of Tweets every day. This data flood comes with many ‘big data’ challenges in terms of both data volumes and velocities. This presentation will focus on tools that help you find your data ‘signal’ of interest, and will include several demos that focus on using Twitter for […]Click for more information on 'Floods of Twitter Data – Keynote'
This session will detail best practices for architecting, building, operating and managing an Analytics Data Lake platform. Key topics will include: 1) Defining next-generation Data Lake architectures. The de facto standard has been commodity DAS servers with HDFS, but there are now multiple solutions aimed at separating compute and storage, virtualizing or containerizing Hadoop applications, and utilizing Hadoop […]Click for more information on 'Best Practices For Building & Operating A Managed Data Lake'
Spark 2.0 includes many exciting new features including Structured Streaming, and the unification of Datasets (new in 1.6) with DataFrames. Structured Streaming allows one to define recurrent queries on a stream of data that is handled as an infinite DataFrame. This query is incrementally updated with new data. This allows for code reuse between batch and streaming […]Click for more information on 'What’s New in Spark 2.0: Structured Streaming and Datasets'
The collection and use of Big Data has become an important part of modern business practice. The Internet of Things (IoT) movement promises to provide new opportunities for businesses interested in the intersection of people and technology. It is also fraught with pitfalls for practitioners and researchers who struggle to make sense of an increasing […]Click for more information on 'Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyzing'
As Hadoop becomes a mainstream data platform across organizations, securing a vast and growing volume of critical information, especially financial and healthcare data, is more essential than ever. In this presentation, Derek will elaborate on how to leverage Big Data technologies without sacrificing security and compliance, and will focus especially on how comprehensive security mechanisms should be put […]Click for more information on 'Hadoop Security and Compliance'
Big Data and IoT are changing the world. The big question is how Big Data and IoT are related. This presentation explores the synergy of Big Data and IoT. We will anatomize Big Data and IoT separately, in terms of what, which, why, where, when, who, how and how much. We then analyze the relationship between IoT and Big Data, specifically […]Click for more information on 'Interplay of Big Data and IoT'
Want to run queries in Impala as fast as possible without choking other workloads and services? If you are a Hadoop cluster administrator or a big data application developer, this course will help you understand how Impala Admission Control can help you make good use of available resources, avoid bad performance issues, and provide better […]Click for more information on 'Resource Management in Impala'
Enterprise Holdings first started with Hadoop as a POC in 2013. Today, we have clusters on premises and in the cloud. This talk will explore our experience with Big Data and outline three common big data architectures (batch, lambda, and kappa). Then, we’ll dive into the decision points necessary for your own cluster, for example: cloud vs on premises, […]Click for more information on 'Innovation in the Data Warehouse'
NOTE: Slides not available. The healthcare industry is going through a very exciting time of change. A growing population, changing economic models, and ever-increasing technological innovation are all converging to make a truly data-driven healthcare system a near-term reality. At Hewlett Packard Enterprise we are working at the forefront of the big data revolution […]Click for more information on 'Next Generation Healthcare Analytics'
Spark has come a long way since it first showed up in our landscape. Let’s spend a few moments looking at how far Spark has come. Let’s look at Spark with Hadoop and Spark without Hadoop. In this discussion we will look at deploying Spark the way that best suits your business and solves your data challenges. Taking a look at […]Click for more information on 'Apache Spark With or Without Hadoop?'
This session addresses the first problems of Big Data & Analytics–Identifying, Indexing, Connecting and Gaining Insight of Existing Data to Drive Value. HPE’s Chief Field Technologist will give her perspectives on Enterprise Search as a Fundamental Cornerstone of Building a Data Driven Enterprise.Click for more information on 'Enterprise Search: Addressing the First Problem of Big Data & Analytics'
July 28, 2016
Apache Hadoop is commonly used as the core of massive data pipelines. Due to its popularity and strong community of contributors, the ecosystem of related software has grown to include as many as 140* projects. While having such a wide range of tools can be convenient, the sheer volume of options can also be very overwhelming. To address […]Click for more information on 'Building a Data Pipeline With Tools From the Hadoop Ecosystem'
Companies today are all focused on finding new consumption models to better utilize the data they produce. This presentation will provide insights and best practices for creating the organization and sponsorship necessary to set the foundation for success. For this session, Dan will provide an overview of the process and methodologies he employs to establish and sustain a Data Driven Culture. Key […]Click for more information on 'Creating a Data Driven Organization'
Keeping up with the Big Data analytics landscape is challenging as new tools and architectures constantly emerge to support the demand for real-time, scalable data analytics. StampedeCon has brought together a panel of Big Data experts and thought leaders to get their perspectives on what is coming next in the Future of Data Analytics.Click for more information on 'The Future of Data Analytics – Panel Keynote'
At Monsanto, emerging technologies such as IoT, advanced imaging and geo-spatial platforms; molecular breeding, ancestry and genomics data sets have made us rethink how we approach developing, deploying, scaling and distributing our software to accelerate predictive and prescriptive decisions. We created a Cloud based Data Science platform for the enterprise to address this need. Our primary […]Click for more information on 'Turn Data Into Actionable Insights'
Have you ever wanted to analyze sensor data that arrives every second from across the world? Or maybe you want to analyze intra-day trading prices of millions of financial instruments? Or take all the page views from Wikipedia and compare the hourly statistics? To do this or any other similar analysis, you will need to analyze large […]Click for more information on 'Analyzing Time-Series Data with Apache Spark and Cassandra'
This session will be a detailed recount of the design, implementation, and launch of the next-generation Shutterstock Data Platform, with strong emphasis on conveying clear, understandable learnings that can be transferred to your own organizations and projects. This platform was architected around the prevailing use of Kafka as a highly-scalable central data hub for shipping […]Click for more information on 'Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy Wins'
As the data world undergoes its Cambrian explosion phase, our data tools need to become more advanced to keep pace. Deep Learning and Recurrent Neural Networks have proven to be valuable tools in next-generation machine learning. Applications in text, sensor processing (IoT), image processing, and audio processing have all emerged as prime deep learning applications. In this session […]Click for more information on 'Deep Learning and Recurrent Neural Networks in the Enterprise'
Over the past several years, the Hadoop ecosystem has made great strides in its real-time access capabilities, narrowing the gap compared to traditional database technologies. With systems such as Impala and Spark, analysts can now run complex queries or jobs over large datasets within a matter of seconds. With systems such as Apache HBase and […]Click for more information on 'Introduction to Kudu'
Looking to implement Hadoop but haven’t pulled the trigger yet? You are not alone. Many companies have heard the hype about how Hadoop can solve the challenges presented by big data, but few have actually implemented it. What’s preventing them from taking the plunge? Can it be done in small steps to ensure project success? […]Click for more information on 'How to get started in Big Data without Big Costs'
NOTE: Slides not available. From a Data Governance perspective, this presentation will give examples of what to consider when transitioning from traditional RDBMS environments into the world of Big Data / Hadoop. A discussion of how vendors use Apache Projects such as Atlas, Falcon, NiFi and Navigator to apply the principles of Data Governance, as […]Click for more information on 'Applying Data Governance in a Big Data World'
This session will touch upon two visual languages, one to describe the context around what is being asked from the data, and the other, to describe what is quantifiable. From these two visual constructs we will go specifically into the following topics: Grids, Balance, Proximity, Contextual Kernels and Hierarchy.Click for more information on 'Visualizing Big Data – The Fundamentals'