The collection and use of Big Data have become an important part of modern business practice. The Internet of Things (IoT) movement promises new opportunities for businesses interested in the intersection of people and technology. It is also fraught with pitfalls for practitioners and researchers who struggle to make sense of an increasing cacophony of signals. How should they poll and collect data from millions of signals in a way that is manageable, scalable, and statistically valid? How should they analyze and predict using these data? This presentation discusses these challenges with applied examples from monitoring and managing one of the world's largest computers.
 
The polling strategy is one of the most critical parts of an IoT data strategy. Precisely because it seems so straightforward, polling is a common pitfall for practitioners. Polling needs to take place at the appropriate rate for each type of signal, and it needs a centralized management strategy while operating as a set of distributed processes across thousands of devices. Polling can also augment the collection and analysis phases. To further scale the entire data platform, researchers and practitioners alike are increasingly pushing the intelligence of their platforms from the core to the edges. Distributed monitoring and polling allows selective message passing based upon local criteria, which compresses the volume of data sent over the wire.
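As a minimal sketch of that edge-side selective message passing (the names and the deadband criterion here are illustrative assumptions, not the system described in the talk), a local filter can poll at the full rate but forward a reading only when it differs enough from the last value actually sent:

```python
class EdgeFilter:
    """Suppress readings that changed less than `deadband` since the
    last value actually sent over the wire (a simple local criterion)."""

    def __init__(self, deadband: float):
        self.deadband = deadband
        self.last_sent = None  # nothing sent yet

    def should_send(self, value: float) -> bool:
        # Always send the first reading; afterwards, send only on a
        # change that exceeds the deadband.
        if self.last_sent is None or abs(value - self.last_sent) >= self.deadband:
            self.last_sent = value
            return True
        return False

# Polling still happens locally at full rate...
readings = [10.0, 10.1, 10.05, 12.0, 12.2, 12.1, 9.0]
f = EdgeFilter(deadband=1.0)
# ...but only significant excursions cross the wire.
sent = [r for r in readings if f.should_send(r)]
# sent == [10.0, 12.0, 9.0]
```

The same structure accepts any local criterion (rate of change, threshold crossing, anomaly score) in place of the deadband test.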
 
Big Data has physical limits. Traditionally these limits have related to the amount of storage available; in the age of IoT, as we move from the core to the edge, bandwidth becomes the tighter constraint. Latency is a crucial factor in real-time systems. To handle these constraints we build a processing framework that contains periodic buffers throughout the system, sized to account for expected workloads. On top of these buffers we can build streaming, kinetic algorithms. To enable subsequent analysis, message schemas should conform to internal and external standards. The responsibility for creating conformant messages rests at the edge, so that a common set of analytical processes can take action on these messages.
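One way to picture the combination of periodic buffers and edge-side schema conformance (a sketch under assumed names and an assumed schema, not the actual framework) is a buffer that absorbs raw readings and flushes a single summary message in a common format, so only the streaming summary, not every sample, crosses the wire:

```python
import json

class PeriodicBuffer:
    """Buffer raw readings at the edge; flush one schema-conformant
    summary message once the buffer reaches capacity."""

    def __init__(self, signal_id: str, capacity: int):
        self.signal_id = signal_id
        self.capacity = capacity
        self.values = []

    def add(self, value: float):
        """Absorb a reading; return a serialized message on flush, else None."""
        self.values.append(value)
        if len(self.values) >= self.capacity:
            return self.flush()
        return None

    def flush(self):
        if not self.values:
            return None
        # Streaming summary computed at the edge; the raw samples are
        # dropped here, compressing the bandwidth footprint.
        msg = {
            "signal_id": self.signal_id,
            "schema_version": "1.0",  # stand-in for an internal standard
            "count": len(self.values),
            "mean": sum(self.values) / len(self.values),
            "max": max(self.values),
        }
        self.values = []
        return json.dumps(msg)

buf = PeriodicBuffer("cpu_temp_c", capacity=3)
buf.add(40.0)           # buffered, nothing sent
buf.add(42.0)           # buffered, nothing sent
out = buf.add(44.0)     # buffer full: one conformant message emitted
```

Because every device emits the same schema, the downstream analytical processes need no per-device logic.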
 
Data analysis in IoT is challenging because of the number of different types of signals, compounded by the need to build a platform that makes automatic use of new signals. To handle this complexity, segment messages into sets based upon type. Two common types are events and time series. Event messages are trigger-based and occur immediately after a pre-defined set of logic fires. The time between events is important; this is another reason why latency is critical, and why multiple timestamps are necessary. Time series messages are polling-based and arrive periodically at the selected interval. The numerical value associated with a time series message is important for determining whether a system is behaving as expected. Events and time series draw from different distributions, and in many cases one distribution is the conjugate prior for the other. For this reason, we can use event-driven messages to inform predictions created from time series values.
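One concrete instance of such a conjugate pairing (our illustration, with arbitrary prior parameters, not necessarily the pairing used in the system described) is the Gamma-Poisson model: if event counts per polling interval are modeled as Poisson, a Gamma prior over the event rate is conjugate, so each batch of event messages updates the rate estimate in closed form and that estimate can then inform predictions over the time series:

```python
def gamma_poisson_update(alpha: float, beta: float, event_counts):
    """Conjugate update: given a Gamma(alpha, beta) prior on a Poisson
    rate and observed per-interval event counts, return the posterior
    Gamma parameters (alpha + sum of counts, beta + number of intervals)."""
    return alpha + sum(event_counts), beta + len(event_counts)

alpha, beta = 2.0, 1.0        # weakly informative prior (assumed)
counts = [3, 5, 4, 6]         # events observed in four polling intervals
alpha_post, beta_post = gamma_poisson_update(alpha, beta, counts)

# Posterior mean of the event rate; usable as a prior expectation when
# scoring the accompanying time series values.
expected_rate = alpha_post / beta_post
```

The closed-form update is what makes this practical at IoT scale: each incoming batch of event messages refreshes the prediction with two additions, no refitting required.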
 
The goal of this presentation is to introduce practitioners to some of the engineering and analytical challenges we have faced in building a distributed system for polling, collecting, and analyzing one of the world's largest computers.