The Internet of Things (IoT) generates a large amount of data that can be used to provide real-time and historical insights into sensor measurements. The design and operating environment of IoT devices present a unique set of challenges for ingesting data into big data systems.
Managing data ingestion is a serious challenge: the variety of sources and processing platforms keeps expanding, while the demand for immediately consumable data never lets up.
This white paper describes a reference architecture for using StreamSets Data Collector to move IoT sensor data into Hadoop. Specifically, we will cover two patterns:
- Real-Time Streaming of IoT Events into Hadoop
- Batch-Oriented Transfer of Historical or Master Data