With an estimated 29 billion connected devices expected to be in operation by 2022 – and over 75 billion Internet of Things (IoT) devices anticipated to be in use by 2025 worldwide – the Internet of Things is a major consideration for forward-thinking enterprises.
The abundance of IoT devices currently in use offers enterprises extensive quantities of data that can be used to create powerful insights, and this is only expected to grow in the coming years, says Shivnath Babu, chief technology officer, Unravel Data. However, as enterprises deploy increasing numbers of smart devices and the quantity of data generated increases, centralised cloud systems will play a fundamental role in ensuring these insights are used smartly. As such, the proliferation of IoT poses considerable DataOps challenges.
Difficulties handling data
With a great number of IoT devices come great quantities and types of data. For instance, IoT devices can provide types of data as varied as: customer sales, miles driven, GPS coordinates, humidity, number of persons present, vehicle speed, temperature and air quality. Many businesses are having difficulty handling the complexity and sheer quantity of data created by IoT and are finding that their data pipelines are becoming inefficient. For app-driven services that rely on real-time streaming, this is a significant issue.
To this end, systems such as Kafka, Spark, Kudu, Flink or HBase are needed to manage the heavy big data requirements of personalised, real-time streaming applications and modern cloud-delivered services. That being said, analysing streaming traffic data and generating statistical features requires complex and resource-consuming monitoring methods.
Although analysts can apply multiple detection methods simultaneously to the incoming data, this inevitably results in complexity and performance challenges. This is especially the case when applications span multiple systems (e.g. interacting with Spark for computation, with YARN for resource allocation and scheduling, with HDFS or S3 for data access, or with Kafka or Flink for streaming). These deployments can become even more complex if they contain independent, user-defined programs, such as data preprocessing or feature generation steps that are repeated across multiple applications.
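One way to limit the repeated preprocessing described above is to compute features once per window of readings and pass the same result to every detection method. The following is a minimal sketch in plain Python, not a reference to any particular product: the feature names, the three detectors and their thresholds are hypothetical, chosen only to illustrate the idea.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List

Features = Dict[str, float]


def extract_features(window: List[float]) -> Features:
    """Compute the shared statistics once per window (hypothetical feature set)."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "last": window[-1],
        # Largest jump between two consecutive readings in the window.
        "max_step": max(abs(b - a) for a, b in zip(window, window[1:])),
    }


def zscore_detector(f: Features, threshold: float = 2.0) -> bool:
    """Flag the latest reading if it sits far from the window mean."""
    if f["std"] == 0:
        return False
    return abs(f["last"] - f["mean"]) / f["std"] > threshold


def range_detector(f: Features, low: float = -10.0, high: float = 30.0) -> bool:
    """Flag the latest reading if it falls outside an assumed valid range."""
    return not (low <= f["last"] <= high)


def step_detector(f: Features, max_allowed: float = 5.0) -> bool:
    """Flag the window if any consecutive jump exceeds an assumed limit."""
    return f["max_step"] > max_allowed


DETECTORS: Dict[str, Callable[[Features], bool]] = {
    "zscore": zscore_detector,
    "range": range_detector,
    "step": step_detector,
}


def run_detectors(window: List[float]) -> Dict[str, bool]:
    """Extract features once, then apply every detector to the same features."""
    features = extract_features(window)
    return {name: detect(features) for name, detect in DETECTORS.items()}


if __name__ == "__main__":
    # Hypothetical temperature readings with a spike in the final value.
    print(run_detectors([20.0, 20.1, 19.9, 20.0, 20.1, 35.0]))
```

Adding a further detection method then means adding one function to the table, while the cost of feature generation stays the same.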
Explosive IoT growth
Current data management tools and processes aren’t up to the task of creating the cloud infrastructure necessary to sustain the explosive growth of IoT devices. To manage the challenge presented by the sheer number of IoT devices, many businesses are beginning to recognise the need for AI or ML integrations.
These integrations augment the capabilities of data teams in making sense of all this data by enabling intelligent data operations that reduce the burden of manually sorting data. This helps teams route data to the right place faster, keep pace with business needs and sustain the real-time element of their DataOps.
Often in these scenarios, the streaming application can lag behind in processing data in real time, and determining the root cause can be cumbersome in such a complex system. As such, a data deployment that relies on machine learning and artificial intelligence (AI) is far more likely to provide the performance, predictability and reliability needed when compared to alternatives.
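To make the idea of a lagging streaming application concrete, lag in log-based systems such as Kafka is usually measured per partition as the gap between the latest offset written and the offset the consumer has processed. The sketch below works on plain dictionaries rather than a real client library; the function names and the sample offsets are hypothetical, and a real deployment would read these values from its own monitoring.

```python
from typing import Dict, List


def partition_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition: latest written offset minus last processed offset."""
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def worst_partition(lag: Dict[int, int]) -> int:
    """Partition with the largest backlog, a first hint at where to look."""
    return max(lag, key=lag.get)


def lag_is_growing(total_lag_samples: List[int]) -> bool:
    """True when total lag rises across every consecutive pair of samples."""
    pairs = zip(total_lag_samples, total_lag_samples[1:])
    return len(total_lag_samples) >= 2 and all(b > a for a, b in pairs)


if __name__ == "__main__":
    # Hypothetical offsets for a three-partition topic.
    end_offsets = {0: 1500, 1: 1480, 2: 1510}
    committed = {0: 1495, 1: 900, 2: 1500}

    lag = partition_lag(end_offsets, committed)
    print("lag per partition:", lag)
    print("worst partition:", worst_partition(lag))
    print("growing:", lag_is_growing([120, 340, 595]))
```

A single partition with a large backlog points towards skewed data or a slow consumer, whereas lag rising on every partition suggests the application as a whole is under-resourced. Telling these cases apart is only the first step in the root-cause analysis the article describes.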
Machine learning algorithms have proven essential to the efficient and continuous collection of data from IoT devices: they enable scrutiny of application execution, identify the cause of potential failures, and generate recommendations for improving performance and resource usage. Another key benefit is that implementing such processes allows organisations to enjoy lower costs and increased reliability.
Consider each use case
As such, it’s key to consider each individual use case and identify the specific IoT challenge it answers. By first understanding the environment and the problems it presents for the organisation, IT teams can move faster towards implementing the necessary solutions. Whether that means machine learning or AI, delivering an IoT-based deployment is contingent on augmenting the data team with automation to manage the complexity that emerges.
The author is Shivnath Babu, chief technology officer, Unravel Data.