Improving the reliability of the Internet of Things

Johan Kraft of Percepio

Over-the-air updates are enabling a dramatic change in the way systems in the Internet of Things (IoT) operate. Here, Johan Kraft, CEO and founder of Percepio, explains the benefits.

The obvious advantage is, of course, easier updates, often downloaded and installed transparently. When this is coupled with software tracing, it becomes a powerful mechanism for improving the quality and reliability of a wide range of embedded IoT systems.

Systems still deployed with bugs

Despite the best efforts of developers, these systems are still deployed with bugs remaining in their code. A development team introduces on average about 120 bugs per 1,000 lines of code during development and about 5%, or 6 bugs per 1,000 lines of code, typically remain in the shipped software. When there are thousands of IoT devices deployed in the field, relying on users to report the problems caused by these bugs is neither reliable nor scalable. User reports also tend to be vague and unhelpful for solving the problem. When there are millions of devices, this matters even more.
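As a rough illustration of the arithmetic above, the sketch below applies the quoted figures (about 120 bugs introduced per 1,000 lines, about 5% remaining) to a firmware image. The 50,000-line code size and the function name are assumptions made for the example, not figures from the article.

```python
def residual_bugs(lines_of_code, introduced_per_kloc=120, escape_rate=0.05):
    """Estimate bugs remaining in shipped code from the figures quoted above."""
    introduced = lines_of_code / 1000 * introduced_per_kloc
    return introduced * escape_rate


# Hypothetical 50,000-line firmware image (assumed size, for illustration only).
print(round(residual_bugs(50_000)))
```

Even a modest code base therefore ships with hundreds of latent defects under these assumptions, and every one of them is replicated on each deployed device.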

These missed bugs probably won’t show up right away. They cause problems only under certain circumstances; otherwise they would have been found before the product shipped. While an over-the-air (OTA) update can solve the problem in the field, developers need some kind of feedback system to know about issues in the deployed devices, and they need to know quickly. This approach has long been standard in the development of mobile and cloud applications (DevOps), and it has now become viable for embedded development as well.

Identify new and important issues

The key to finding out about, and solving, problems in the field is the combination of software tracing, cloud management and OTA updates, but this is a complex challenge. The tracing code needs to be as efficient as possible in a system that is already constrained in resources. The link back to the cloud needs to be secure and transparent, and it must transfer the right data to help developers identify any problems quickly and easily. The cloud service has to identify what issues are new and important, and then notify the developers that there is a problem that they need to fix. Once it’s fixed, the updated software must be distributed to all devices via an OTA update. And all of this needs to scale across millions of devices.

The information flow starts in the error handling code of the IoT device, such as already existing sanity checks and fault exception handlers. Using a software agent, firmware issues are uploaded as alerts to a customer’s cloud account. An alert may include an error message and any other information relevant to the specific issue, such as software state variables and hardware registers. Depending on the severity of the issue, the alert is either uploaded directly or after a device restart, once the cloud connection has been restored.
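The flow described above can be modelled in a few lines. This is a minimal sketch in Python rather than firmware code, and every name in it (`Severity`, `Alert`, `Agent`, `on_restart_connected`) is hypothetical. It assumes that a fatal issue forces a restart, so its alert is held back until the connection returns, while a non-fatal alert is uploaded directly.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    WARNING = 1   # device keeps running, alert can be sent right away
    FATAL = 2     # device must restart before the connection is usable


@dataclass
class Alert:
    message: str
    severity: Severity
    context: dict = field(default_factory=dict)  # e.g. state variables, registers


class Agent:
    """Toy model of a firmware agent; 'uploaded' stands in for the cloud account."""

    def __init__(self):
        self.uploaded = []   # alerts that have reached the cloud
        self.pending = []    # alerts held until after a restart

    def report(self, alert):
        if alert.severity is Severity.FATAL:
            # On a real device this would need memory that survives a reset.
            self.pending.append(alert)
        else:
            self.uploaded.append(alert)

    def on_restart_connected(self):
        """Called once the cloud connection is back after a restart."""
        self.uploaded.extend(self.pending)
        self.pending.clear()


agent = Agent()
agent.report(Alert("Sensor read timeout", Severity.WARNING, {"retries": 3}))
agent.report(Alert("Hard fault", Severity.FATAL, {"pc": "0x08001234"}))
print(len(agent.uploaded), len(agent.pending))   # one sent, one waiting
agent.on_restart_connected()
print(len(agent.uploaded), len(agent.pending))   # both sent after the restart
```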

The alerts may also include a trace of the most recent software events in the device, which is recorded automatically by the agent. The trace provides both the details of the error and the context, making it easier for developers to identify the bug.
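One common way to keep "the most recent events" in a fixed amount of memory is a ring buffer that overwrites its oldest entry. The sketch below shows the idea in Python. The class name, event names and capacity are assumptions for illustration, and the article does not say that the agent is implemented this way.

```python
from collections import deque


class TraceBuffer:
    """Keeps only the most recent events, discarding the oldest when full."""

    def __init__(self, capacity):
        self._events = deque(maxlen=capacity)

    def record(self, timestamp, event):
        self._events.append((timestamp, event))

    def snapshot(self):
        """Copy of the buffered events, oldest first, to attach to an alert."""
        return list(self._events)


trace = TraceBuffer(capacity=3)
names = ["task_switch", "isr_enter", "isr_exit", "queue_send", "assert_failed"]
for t, name in enumerate(names):
    trace.record(t, name)

# Only the last three events survive; they form the context for the alert.
print(trace.snapshot())
```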

Encoding efficiency is key here, to ensure that a minimal amount of memory is needed to store a trace that provides developers with the context they need to identify the real problem. This matters for several reasons: it allows traces of sufficient length to be collected even from memory-constrained IoT systems, it reduces the upload time to a fraction of a second, and it minimises the cloud-side operational costs of alert messaging and storage. This encoding efficiency makes it possible to use the trace technology out in the field, even in small IoT devices, bringing dramatic advantages.
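To give a feel for why encoding matters, the sketch below packs each trace event into four bytes and compares the result with the same events written as JSON text. The record layout (16-bit time delta, 8-bit event code, 8-bit argument) is an assumption invented for this example and is not the format used by any particular tracing tool.

```python
import json
import struct

# Assumed record layout (hypothetical): 16-bit time delta in ticks,
# 8-bit event code, 8-bit argument, little-endian, no padding.
RECORD = struct.Struct("<HBB")

events = [(12, 1, 0), (3, 2, 7), (250, 3, 1)]  # (delta_ticks, event_code, arg)


def encode(events):
    """Pack each event into a fixed-size binary record."""
    return b"".join(RECORD.pack(*e) for e in events)


def decode(blob):
    """Recover the event tuples from the packed records."""
    return [RECORD.unpack_from(blob, i) for i in range(0, len(blob), RECORD.size)]


binary = encode(events)
text = json.dumps([{"dt": d, "event": e, "arg": a} for d, e, a in events]).encode()

print(len(binary), "bytes packed vs", len(text), "bytes as JSON")
assert decode(binary) == events  # the compact form loses no information
```

Storing time deltas rather than absolute timestamps is one of the choices that keeps each record small. The trade-off is that a reader has to know the layout to decode the trace.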

Alerts from the firmware agent are uploaded to the customer’s cloud service, which is configured to store the alerts and to also notify an engine that handles classification, statistics and sending of notifications to the developers. It also offers configuration options, for example identifying the conditions under which notifications should be sent and to whom.
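The classification and notification step can be sketched as follows. This is a toy model under stated assumptions: alerts arrive already reduced to a signature string, the only rule is "notify on the first occurrence of a signature", and the class name, signature strings, device IDs and address are all hypothetical.

```python
from collections import Counter


class AlertEngine:
    """Toy classifier: counts alerts per signature and notifies on new ones."""

    def __init__(self, recipients):
        self.recipients = recipients   # who gets notified (configurable)
        self.counts = Counter()        # occurrences per issue signature
        self.sent = []                 # notifications "sent", for illustration

    def handle(self, signature, device_id):
        self.counts[signature] += 1
        if self.counts[signature] == 1:   # assumed rule: first occurrence only
            for who in self.recipients:
                self.sent.append((who, signature, device_id))


engine = AlertEngine(recipients=["firmware-team@example.com"])
engine.handle("assert:queue.c:88", device_id="dev-001")
engine.handle("assert:queue.c:88", device_id="dev-042")   # repeat, no new notification
engine.handle("hardfault:0x08001234", device_id="dev-007")

print(engine.counts)
print(engine.sent)
```

Counting repeats while notifying only on new signatures is what keeps developers from being flooded when the same bug fires on thousands of devices.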

Notifications received

When developers receive notification about a new issue, they can access alerts and traces to see what the problem is.

Privacy is also key here. The software trace never needs to leave the customer’s cloud account. Only an anonymised signature of the alert is required for the cloud processing, which can be provided in an external cloud service. This information can be made completely transparent, configurable, and meaningless on its own. The communication and storage are provided by the existing capabilities in the developer’s IoT platform using best practices for authentication and encryption.
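One way to picture an anonymised signature is as a hash over the few fields that identify an issue, with the trace itself left out. The sketch below is an assumption about how such a signature could be built, not a description of the actual service. The field names and the choice of a truncated SHA-256 digest are illustrative.

```python
import hashlib


def alert_signature(firmware_version, alert_type, location):
    """Hash the fields that identify an issue; the trace itself is not included."""
    material = "|".join([firmware_version, alert_type, location]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


sig_a = alert_signature("1.4.2", "assert", "queue.c:88")
sig_b = alert_signature("1.4.2", "assert", "queue.c:88")
sig_c = alert_signature("1.4.2", "hardfault", "0x08001234")

# Same issue gives the same signature; different issues give different ones.
print(sig_a == sig_b, sig_a == sig_c)
```

The external service can then group and count alerts by signature without seeing any device data. A plain hash of guessable inputs can be reversed by trial, so a keyed hash (for example Python's `hmac` module with a customer-held key) would be a stronger choice where that matters.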

Lab testing is not enough

Testing in the lab just isn’t enough to eradicate all software issues due to the complexity of today’s embedded IoT systems. Real-time tracing and alerts can identify bugs in the field as they happen, with automatic notifications to the developers to speed up resolution.

Such a system has to be scalable, secure and transparent to the developers. Once in place, it can provide immediate awareness at the very first occurrence of an issue, before many users have been affected, and let developers take full advantage of OTA updates to rapidly improve their product.

The author, Dr. Johan Kraft, is CEO and founder of Percepio AB.


About the author

Dr. Johan Kraft is CEO and founder of Percepio AB. Dr. Kraft is the original developer of Percepio Tracealyzer, a tool for visual trace diagnostics that provides insight into runtime systems to accelerate embedded software development. His applied academic research, in collaboration with industry, focused on embedded software timing analysis. Prior to founding Percepio in 2009, he worked in embedded software development at ABB Robotics. Dr. Kraft holds a PhD in computer science.

The DevAlert cloud service provides immediate feedback when something unexpected happens in the software of deployed IoT devices.
