In the pre-streaming era, IT departments mastered a variety of technological approaches to extract value from data. Data warehouses, analytical platforms, and databases of every kind filled data centres, drawing on storage devices where records were safely preserved on disk for their historical value.
By contrast, says Kelly Herrell, CEO of Hazelcast, data today is being generated and streamed by Internet of Things (IoT) devices at an unprecedented rate. The “Things” in IoT are innumerable — sensors, mobile apps, connected vehicles, etc. — which by itself is explosive. Add to that the “network effect” where the degree of value is directly correlated to the number of attached users, and it’s not hard to see why firms like IDC project the IoT market will reach US$745 billion (€665 billion) next year and surpass the $1 trillion (€0.89 trillion) mark in 2022.
This megatrend is disrupting the data processing paradigm. The historical value of stored data is being superseded by the temporal value of streaming data. In the streaming data paradigm, value is a direct function of immediacy, for two reasons:
Difference: Just as the unique water molecules passing through a length of hose are different at every point in time, so is the unique data streaming through the network for each window of time.
Perishability: The opportunity to act on insights found within streaming data often dissipates shortly after the data is generated.
Difference and perishability together define the streaming data paradigm. Sudden changes detected in a data stream demand immediate action, whether it's a pattern match in real-time facial recognition or drilling-rig vibration sensors registering abnormalities that could prove disastrous if preventive steps aren't taken immediately.
In today’s time-sensitive era, IoT and streaming data are accelerating the pace of change in this new data paradigm. Stream processing itself is rapidly changing.
Two generations, same problems
The first generation of stream processing was based largely on batch processing using complex Hadoop-based architectures. Data was loaded, often long after it was generated, and only then pushed as a stream through the data processing engine. The combination of complexity and delay rendered this approach largely inadequate.
The second generation (still largely in use) shrank batch sizes to "micro-batches." The complexity of implementation did not change, and while smaller batches take less time to process, there is still a delay in assembling each batch. The second generation can identify difference but doesn't address perishability: by the time it discovers a change in the stream, that change is already history.
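To make the perishability gap concrete, here is a minimal, framework-agnostic sketch in Python; the names source, detect_anomaly, and BATCH_INTERVAL_S are illustrative and not taken from any particular product. Because readings sit in a buffer until the micro-batch window closes, a reading that arrives just after a window opens is not examined until the window ends.

```python
import time
from collections import deque

BATCH_INTERVAL_S = 5  # hypothetical micro-batch window, in seconds

def micro_batch_loop(source, detect_anomaly):
    """Collect events for BATCH_INTERVAL_S seconds, then process the batch.

    A reading that arrives just after a window opens waits almost the
    full interval before detect_anomaly() ever sees it -- by then the
    opportunity to act on it may already have passed.
    """
    buffer = deque()
    window_start = time.monotonic()
    for event in source:                       # iterator over incoming readings
        buffer.append(event)                   # the event only waits here
        if time.monotonic() - window_start >= BATCH_INTERVAL_S:
            for e in buffer:                   # the batch is examined only now
                detect_anomaly(e)
            buffer.clear()
            window_start = time.monotonic()
```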
Third-generation stream processing
The first two generations highlight the hurdles facing IT organisations: how can stream processing be made easier to implement while processing data at the moment it is generated? The answer: the software must be simpler, must not be batch-oriented, and must be small enough to be placed extremely close to the stream sources.
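As a rough illustration only, and not a description of any vendor's engine, the sketch below shows what per-event processing can look like when the logic is small enough to sit next to the source: each reading is checked against simple running statistics the instant it arrives, so an abnormal value can be acted on while it still matters. All names here (VibrationMonitor, observe, act) are hypothetical.

```python
class VibrationMonitor:
    """Hypothetical per-event monitor: flags readings far from a running mean."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def observe(self, x):
        # Update running mean/variance, then check the new reading immediately.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        if self.count < 10:
            return False  # not enough history to judge yet
        std = (self.m2 / (self.count - 1)) ** 0.5
        return std > 0 and abs(x - self.mean) > self.threshold * std

def run(source, monitor, act):
    """React the moment an abnormal reading arrives -- no batch to assemble."""
    for reading in source:
        if monitor.observe(reading):
            act(reading)
```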
The first two generations of stream processing require installing and integrating multiple components, which results in too large a footprint for most edge and IoT infrastructures. A lightweight footprint allows the streaming engine to be installed close to [...]