Hazelcast IMDG and Apache Cassandra combine to deliver real-time smart meter IoT platform
Hazelcast, the open source in-memory data grid (IMDG) with hundreds of thousands of installed clusters and over 39 million server starts per month, announced details of a new IoT platform designed by Future Grid – the developer of an operational intelligence data platform that gives electric power generation utilities a real-time, streamlined view of their Internet of Things (IoT) data. The platform combines the in-memory capability of Hazelcast IMDG with Apache Cassandra to process extreme volumes of data cost-effectively.
Future Grid works with several Australian utility companies to automate the processing of sensor and smart meter data that crosses energy networks. Their customers are collecting approximately 3 billion data points every day. In terms of daily post-processing this equates to 20 billion records as each record has multiple, individual data points – a massive scaling challenge. To make the most of this information, customers need a real-time data aggregation and processing solution that enables them to make complex decisions as the data arrives.
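To put those daily volumes in perspective, the short calculation below converts them to average per-second rates. It assumes the load is spread evenly across the day, which real meter traffic is unlikely to do, so treat the results as a lower bound on peak throughput rather than a measured figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(items_per_day: float) -> float:
    """Average throughput, assuming an even spread over 24 hours."""
    return items_per_day / SECONDS_PER_DAY


# Daily volumes quoted in the article.
data_points_per_day = 3_000_000_000
records_per_day = 20_000_000_000

print(f"{average_rate_per_second(data_points_per_day):,.0f} data points/s")
print(f"{average_rate_per_second(records_per_day):,.0f} records/s")
```

On these assumptions the platform has to sustain roughly 35,000 incoming data points and about 230,000 post-processed records every second, on average.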
When Future Grid first tried to solve this problem it used traditional relational databases. However, it soon became apparent that traditional databases couldn’t cope with huge volumes of data in real time, the main issue being that they can’t execute algorithms against incoming data fast enough. Future Grid therefore decided to build its own solution, combining Hazelcast IMDG with Apache Cassandra’s persistent data store capabilities.
Chris Law, co-founder and managing director at Future Grid, explained: “We implemented Hazelcast IMDG at the core of our product’s in-memory capability, while also integrating it with a range of purpose-built technologies to deliver the platform our customers required.
“For example, Hazelcast IMDG is integrated with Apache Cassandra which provides internal data storage in regard to reference data while maintaining a distributed grid architecture. We found integrating Hazelcast with Cassandra was a very straightforward process.”
For Future Grid, Cassandra’s persistence capabilities were pivotal. In the context of storing data in a computer system, persistence means that data survives after the process that created it has ended. Future Grid therefore combined the strengths of the two open source solutions for its energy customers.
Integrating Hazelcast IMDG with Cassandra makes more data readily available for processing. Importantly, the combined solution maintains the high availability and horizontal scalability of Cassandra, while delivering performance that is 1000x faster than disk-based approaches thanks to Hazelcast IMDG.
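The general pattern of an in-memory layer in front of a persistent store can be sketched as follows. This is an illustration only, not Future Grid’s implementation and not the Hazelcast or Cassandra API: the class names `PersistentStore` and `ReadThroughCache` are hypothetical, and a plain dictionary stands in for the durable store. In Hazelcast itself, the `MapStore` interface plays a comparable role by connecting a map to an external store.

```python
from typing import Dict, Optional


class PersistentStore:
    """Hypothetical stand-in for a durable store such as Cassandra."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}
        self.reads = 0  # counts how often the slow path is taken

    def load(self, key: str) -> Optional[str]:
        self.reads += 1
        return self._rows.get(key)

    def store(self, key: str, value: str) -> None:
        self._rows[key] = value


class ReadThroughCache:
    """In-memory layer: read-through on a miss, write-through on put."""

    def __init__(self, store: PersistentStore) -> None:
        self._store = store
        self._memory: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if key in self._memory:          # fast path: served from memory
            return self._memory[key]
        value = self._store.load(key)    # slow path: fall back to the store
        if value is not None:
            self._memory[key] = value
        return value

    def put(self, key: str, value: str) -> None:
        self._store.store(key, value)    # persist first so the data survives
        self._memory[key] = value

    def evict(self, key: str) -> None:
        self._memory.pop(key, None)      # drop from memory only


store = PersistentStore()
cache = ReadThroughCache(store)
cache.put("meter-001", "substation-A")   # hypothetical reference data
cache.evict("meter-001")                 # simulate memory loss
first = cache.get("meter-001")           # reloaded from the store
second = cache.get("meter-001")          # now served from memory
```

Because every write reaches the store before it is cached, evicting an entry from memory loses nothing, and repeated reads of the same key touch the store only once.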
For the utility companies, these are the use cases covered by the solution:
- Power quality, interval and event derivations: clean and de-duplicate five-minute power quality data and produce a daily per-device “rollup” that includes pre-calculations to make further analysis faster and more accurate.
- Loss of neutral detection: using machine learning and fast data processing to monitor and predict safety issues, reducing shock instances significantly.
- Phase-based substation aggregation: transformer modelling using aggregate meter interval data to provide better visibility of per-phase substation usage. Used for long-term asset planning, phase balancing and alerting when the designed rating is exceeded.
- Customer phase cross-referencing: using machine learning to investigate data correctness of meter-to-substation mappings, including a responsive, real-time visualisation solution.
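As an illustration of the first use case, the sketch below de-duplicates five-minute readings and builds a daily per-device rollup. It is a minimal example under stated assumptions, not Future Grid’s pipeline: the `Reading` record, its `voltage` field, the choice of (device, timestamp) as the duplicate key, and the summary statistics are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Reading:
    device_id: str
    timestamp: datetime
    voltage: float  # assumed power quality measure


def deduplicate(readings):
    """Keep the first reading seen for each (device, timestamp) pair."""
    seen = {}
    for r in readings:
        seen.setdefault((r.device_id, r.timestamp), r)
    return list(seen.values())


def daily_rollup(readings):
    """Summarise de-duplicated readings per device and calendar day."""
    grouped = defaultdict(list)
    for r in deduplicate(readings):
        grouped[(r.device_id, r.timestamp.date())].append(r.voltage)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for key, values in grouped.items()
    }


readings = [
    Reading("meter-001", datetime(2017, 1, 1, 0, 0), 240.0),
    Reading("meter-001", datetime(2017, 1, 1, 0, 0), 240.0),  # duplicate
    Reading("meter-001", datetime(2017, 1, 1, 0, 5), 242.0),
]
rollup = daily_rollup(readings)
```

Pre-computing the count, minimum, maximum and mean once per day means later analysis can read a single summary row per device instead of rescanning every five-minute interval.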
Law continued: “Using Hazelcast IMDG has enabled our customers to realise the dream of real-time data without the significant cost of traditional relational database models. Out-of-the-box speed and resilience have helped our customers deliver operationally critical production systems.”
Greg Luck, CEO of Hazelcast, said: “Hazelcast IMDG has been designed to continuously process big data volumes, while ensuring low end-to-end latency. Our technology is inherently quick and solves storage issues by forming storage clusters. It can also transmit reactive access patterns to notify analysts when values change. Therefore, it can be used as a cache for big datasets during processing, while forming in-memory data lakes for frequently used data. Importantly, it is also very easy to deploy.”