How data quality makes IoT projects more profitable

Global technology spending on the Internet of Things (IoT) is expected to reach $1.2 trillion (€1 trillion) in 2022, led by industries such as discrete manufacturing at $119 billion (€108 billion), process manufacturing at $78 billion (€70.8 billion), transportation at $71 billion (€64.5 billion) and utilities at $61 billion (€55.4 billion).

Indeed, the market for Industry 4.0 products and services is expected to grow significantly over the next few years – and over 60% of manufacturers are expected to be fully connected by that time, utilising a range of technologies such as RFID, wearables and automated systems, says Ramya Ravichandar, VP Products, FogHorn.

Although the industry anticipates positive growth in current and upcoming IoT and IIoT projects, some significant challenges still need to be addressed in order to fully win customer trust and move pilot projects into successful, large-scale IoT productions. While many see connectivity limitations, security risks, and data bias (including data quantity issues) as roadblocks to IoT success, we have found that data quality also plays a critical role in delivering effective IoT projects.

What is data quality – and how does it impact deployment success?

Data quality plays a vital role in the increasing adoption of IoT devices in three main ways:

Organisations can only make the right data-driven decisions if the data they use is correct and suitable for the use case at hand.
Poor-quality data is practically useless – and can lead to severe issues, such as inaccurate machine learning models, flawed decision-making, or poor ROI.
Specifically, the classic problem of garbage in, garbage out has resurfaced with the rise of artificial intelligence and machine learning applications.

High-quality data feeds, trains, and tunes machine learning (ML) models to empower IoT-enabled factories to make informed data-driven decisions.

For example, the unexpected failure of a steam turbine can cause critical disruption, damage, and economic loss to both the power plant and the downstream power grid. Predictive machine learning models, trained on high-quality data sets, help these industrial organisations maximise the reliability of their equipment by detecting potential failures before significant problems arise.

However, dirty data, including data that is missing, incomplete, or error-prone, leads organisations to make inconvenient, time-consuming, and expensive mistakes. In fact, according to The Data Warehousing Institute (TDWI), dirty data costs U.S. companies around $600 billion (€545 billion) every year. By common estimates, about 80% of a data scientist’s job is focused on data preparation and cleansing to ensure that the ML models provide the right insights.

Looking ahead, organisations must incorporate methodologies to ensure the completeness, validity, consistency, and correctness of their data streams to enhance insight quality, deploy effective IoT projects, and realise optimal ROI.
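To make those dimensions concrete, here is a minimal Python sketch of how a batch of sensor readings might be scored for completeness, validity, and consistency. It is an illustration only, not a method described by the author: the `Reading` type, the `quality_report` function, the valid range, and the choice to treat "consistency" as non-decreasing timestamps are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """Hypothetical record for one sensor sample."""
    sensor_id: str
    timestamp: float          # seconds since some reference point
    value: Optional[float]    # None marks a missing measurement


def quality_report(readings, valid_range):
    """Return simple quality ratios (0.0 to 1.0) for a batch of readings."""
    low, high = valid_range
    total = len(readings)
    if total == 0:
        return {"completeness": 0.0, "validity": 0.0, "consistency": 0.0}

    # Completeness: share of readings that actually carry a value.
    present = [r for r in readings if r.value is not None]
    # Validity: share of readings whose value lies in the plausible range.
    valid = [r for r in present if low <= r.value <= high]
    # Consistency (assumed definition): share of adjacent pairs whose
    # timestamps do not go backwards.
    ordered = sum(
        1 for a, b in zip(readings, readings[1:]) if b.timestamp >= a.timestamp
    )
    consistency = 1.0 if total == 1 else ordered / (total - 1)

    return {
        "completeness": len(present) / total,
        "validity": len(valid) / total,
        "consistency": consistency,
    }


batch = [
    Reading("t1", 0.0, 20.0),
    Reading("t1", 1.0, None),    # missing value
    Reading("t1", 0.5, 500.0),   # out of range and out of order
    Reading("t1", 2.0, 21.0),
]
report = quality_report(batch, valid_range=(-40.0, 125.0))
print(report)
```

In practice each organisation would define these ratios, and the thresholds at which a batch is rejected, according to its own use case.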

So, what role does edge computing play in data quality?

Industrial sensors come in many different types and collect high volumes, varieties, and velocities of data, including video, audio, acceleration, vibration, and acoustic data. If an organisation is able to successfully align, clean, enrich and fuse all these various data streams, it can significantly improve the efficiency, health, and safety of its operations. However, to paint a complete, accurate picture of factory operations, organisations must gather, marry and process the raw insights delivered by these varied, remote data sources.
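As a rough illustration of what "aligning and fusing" two streams can mean, the Python sketch below pairs each sample from one sensor with the nearest-in-time sample from another. The function name `align_nearest`, the `(timestamp, value)` tuple layout, and the tolerance value are assumptions for this example; real deployments typically also need clock synchronisation and resampling, which are not shown.

```python
from bisect import bisect_left


def align_nearest(primary, secondary, tolerance):
    """Fuse two time-sorted streams of (timestamp, value) tuples.

    Each primary sample is paired with the secondary sample closest in time.
    If the closest one is further away than `tolerance`, None is used instead.
    """
    sec_times = [t for t, _ in secondary]
    fused = []
    for t, value in primary:
        i = bisect_left(sec_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(secondary)]
        best = min(candidates, key=lambda j: abs(sec_times[j] - t), default=None)
        if best is not None and abs(sec_times[best] - t) <= tolerance:
            fused.append((t, value, secondary[best][1]))
        else:
            fused.append((t, value, None))
    return fused


vibration = [(0.0, 0.1), (1.0, 0.2), (2.0, 0.9)]    # illustrative values
temperature = [(0.1, 70.0), (1.9, 75.0)]            # illustrative values
for row in align_nearest(vibration, temperature, tolerance=0.25):
    print(row)
```

The middle vibration sample has no temperature reading within the tolerance, so it is kept with a gap rather than silently matched to a stale value.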

Edge computing thrives in these types of environments, as it can gather and process real-time data at its source, and then create a structure within the data to help identify the value.

Edge-enabled machines help clean and format dirty data locally, which improves the training and deployment of accurate and effective machine learning models. Indeed, industry researchers believe edge-based use cases for IoT will be a powerful catalyst for growth across the key vertical markets – and that data will be processed (in some form) by edge computing in 59% of IoT deployments by 2025.

For example, using edge computing, factories can improve product quality by analysing sensor data in real-time to identify any values that fall outside of previously defined thresholds, build and train an ML model to identify root problem causes, and, if desired, deploy the ML model to automatically stop the production of defective parts.
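The first step of that workflow, checking readings against previously defined thresholds, can be sketched in a few lines of Python. This is a simplified, assumed design rather than any vendor's implementation: the `ThresholdMonitor` class, its window size, and the rule "stop after three violations in the last five readings" are hypothetical, and the ML-based root-cause step is deliberately left out.

```python
from collections import deque


class ThresholdMonitor:
    """Flags a production stop when too many recent readings are out of range."""

    def __init__(self, low, high, window=5, max_violations=3):
        self.low = low
        self.high = high
        self.max_violations = max_violations
        # Keeps only the most recent `window` in-range/out-of-range flags.
        self.recent = deque(maxlen=window)

    def update(self, value):
        """Record one reading; return True when the line should stop."""
        out_of_range = not (self.low <= value <= self.high)
        self.recent.append(out_of_range)
        return sum(self.recent) >= self.max_violations


# Illustrative part diameters in millimetres with a 9.8 to 10.2 mm tolerance.
monitor = ThresholdMonitor(low=9.8, high=10.2, window=5, max_violations=3)
for diameter in [10.0, 10.5, 10.1, 10.6, 10.7]:
    if monitor.update(diameter):
        print(f"Stop production: reading {diameter} triggered the limit")
```

Using a short rolling window instead of reacting to a single reading is one way to avoid halting the line on an isolated sensor glitch.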

For these and similar use cases, edge-enabled solutions transform real-time machine data (low-quality data) into actionable insights (high-quality data) related to production efficiency and quality metrics that operations managers can use to reduce unplanned downtime, maximise yield and increase machine utilisation.

Many organisations are beginning to understand the value edge computing can bring to their IoT and IIoT projects, as edge solutions turn raw, streaming sensor data into actionable insights using real-time data processing and analytics. By cleansing and enriching dirty data at the point of its creation, edge computing can significantly enhance data quality and refine repetitive machine data for better operational efficiencies.

The author is Ramya Ravichandar, VP Products, FogHorn.
