Get hands-on experience building a scalable, real-time Big Data analytics platform.
This technical workshop will be led by Jon Haddad, one of the industry’s most sought-after technical gurus. He was a primary developer and maintainer of cqlengine, the Python object mapper for Cassandra, which is now part of the DataStax Python driver for Cassandra. He also works with DataStax as their Technical Evangelist.
Deep Dive into Apache Cassandra & Apache Spark
This will be a hands-on deep dive into Apache Cassandra & Apache Spark. We’re going to roll up our sleeves and get our hands dirty. In this three-hour session we’ll start by covering the core concepts of Cassandra that will allow us to build data models that scale our OLTP application from gigabytes in a single data center to petabytes across half a dozen data centers distributed around the world. After that, we’ll dig into Spark, running a local cluster for bulk processing, followed by live streaming. We’ll wrap things up by visualizing our data using IPython notebooks, making it easy to quickly analyze our datasets.
Prerequisites and System Requirements:
- You must be familiar with using Linux
- Install VirtualBox; a Linux-based VM will be provided.
- The VM requires 4 GB of RAM and at least 2.2 GB of disk space.
About Jon Haddad
Jon has 15 years of experience in both development and operations. For the last 10 years, he has worked at various startups in Southern California. He was a primary developer and maintainer of cqlengine, the Python object mapper for Cassandra, which is now part of the DataStax Python driver for Cassandra. He’s now a Technical Evangelist at DataStax, continuing to focus on advancing Cassandra in the Python, operations, and data science communities.