The public internet today is inexpensive and ubiquitous. For the most part, it’s fairly reliable, too: a customer can make in-app purchases within a game from a smartphone, and an employee has no problem establishing a VPN and logging into SharePoint from a coffee shop.
Even high-bandwidth streaming video and real-time games experience relatively few dropouts, even over cellular connections. While nobody likes buffering pauses, for most users the network is plenty fast, no matter where they connect.
All is calm on the surface… until the network edge is pummeled by an onslaught of unexpected tiny packets flowing from IoT/M2M devices and applications. IoT traffic patterns can be hard to predict, except within controlled LAN environments like hospitals or warehouses. When M2M and IoT hit the public internet – think fitness bands, weather telemetry, geolocation, medical monitors, security systems, smart watches, and even smartphone apps – connections come in bursts, and can come from anywhere and everywhere.
With strong, unpredictable traffic growth comes the potential for non-determinism and chaos: packet loss, jitter, delays. When services degrade, the result is unhappy end users, who take to Twitter to rant about their IoT devices’ “connection failure” messages. Those devices’ application providers then pick up the phone, call their carriers and start using words like “SLA” and “lawsuit.”
Fortunately, carriers constantly monitor the back-end data services using analytics to predict when to switch to alternative circuits or routes, add new short-term or long-term capacity to existing dedicated links, or provision new dedicated links.
The monitoring starts with instrumentation out on the network edge. All carrier-class routers and switches include robust monitoring, but that tends to be single-point: it reports on “what” is happening with that device, and not as much on “why” things are going wrong across the entire network. Yes, administrators can be alerted when ports are overloaded or optical fibres are saturated, but what’s the root cause? How does the fault affect the end-to-end service? What is the service load end-to-end, across the network, and what can be done to ensure that the network is robust enough to handle IoT/M2M bursts?
Lifecycle service orchestration solutions sit on top of the physical network equipment to offer that critical big-picture view. They collect comprehensive utilisation data per circuit, port and segment by aggregating and correlating historical performance data across multiple data sources. That analysed data can then be compared against SLA and other defined thresholds to generate utilisation alerts, and to draw attention to anomalies and potential problem spots for predictive analytics.
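As a rough illustration of that aggregate-then-compare step, here is a minimal Python sketch. The sample data, the source and circuit names, and the 80% threshold are all hypothetical, and a real orchestration platform would correlate far richer data than a simple per-circuit average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (data source, circuit id, utilisation as a fraction of capacity)
SAMPLES = [
    ("router-a", "ckt-1", 0.62),
    ("probe-1", "ckt-1", 0.70),
    ("router-b", "ckt-2", 0.91),
    ("probe-2", "ckt-2", 0.95),
    ("router-c", "ckt-3", 0.30),
]


def utilisation_alerts(samples, threshold=0.8):
    """Average utilisation per circuit across all sources and
    return the circuits whose average exceeds the threshold."""
    per_circuit = defaultdict(list)
    for _source, circuit, utilisation in samples:
        per_circuit[circuit].append(utilisation)

    alerts = {}
    for circuit, values in per_circuit.items():
        average = mean(values)
        if average > threshold:  # assumed SLA-derived limit
            alerts[circuit] = average
    return alerts


if __name__ == "__main__":
    for circuit, average in utilisation_alerts(SAMPLES).items():
        print(f"ALERT {circuit}: average utilisation {average:.0%}")
```

With this made-up data, only the circuit reported as heavily loaded by both of its sources is flagged; the other two stay below the limit.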
For short-term traffic overloads caused by IoT and M2M, real-time monitoring and analysis can enhance fault isolation and help the carrier meet SLA requirements by letting them quickly find and resolve the fault, and add just-in-time capacity to problem spots. For the long term, that type of analysis helps carriers predict where their network is showing a traffic growth trend, and thus where the capacity needs capital investment.
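The long-term case can be sketched in a similar way. The example below fits a straight line to a circuit's weekly peak utilisation and estimates how many weeks remain before it crosses a planning threshold. The function name, the weekly sampling and the 80% threshold are assumptions made for illustration; a least-squares line is only one simple way to model a growth trend, and bursty IoT/M2M traffic may well call for something more sophisticated.

```python
def weeks_until_threshold(history, threshold=0.8):
    """Fit a least-squares line to weekly peak utilisation (fractions of
    capacity, oldest first) and estimate the number of weeks after the
    last observation at which the fitted line reaches the threshold.

    Returns None when the trend is flat or falling."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")

    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))

    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth trend, so no projected crossing

    intercept = y_mean - slope * x_mean
    crossing_week = (threshold - intercept) / slope
    # Report time remaining relative to the most recent observation.
    return max(0.0, crossing_week - (n - 1))


if __name__ == "__main__":
    # Hypothetical circuit growing two percentage points per week.
    history = [0.50, 0.52, 0.54, 0.56, 0.58]
    print(weeks_until_threshold(history))
```

A result of zero means the fitted trend has already reached the threshold, which is the point at which capacity planning has turned into firefighting.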
A key to analysing network traffic for IoT/M2M is to continuously audit network inventory, capacity and system features, as well as the data integrity of M2M platforms, paths and even devices. Big Data techniques help the analytics platform visualize service paths between customers, wireless gateways and physical networks – and predict when customers may experience data outages or service degradation. That includes tracking all connections, along with their operational state and circuit details, for fault isolation and resolution.
Carriers are increasingly asked to service IoT and M2M traffic – and that traffic looks different, and acts differently, from typical business-to-business or consumer connectivity. It’s critical that carriers and service providers understand how IoT/M2M traffic impacts their customers and the network, and use the proper lifecycle service orchestration data gathering and analytics tools to proactively monitor connections, predict outages, and ensure SLAs and customer satisfaction. After all, the last thing the fast-growing IoT industry needs is “connection failure.”
By Chris Purdy, CTO, CENX
Chris leads the development of CENX’s technology vision, as well as its professional services and product management. A leading technical contributor to the development of Carrier Ethernet, Chris is specification editor of Carrier Ethernet Service Constructs in the MEF’s technical committee. Before CENX, Chris was CTO at Nakina Systems, a telecom operations software vendor, and helped take the company from early stage to one deployed in multiple Tier 1 providers managing thousands of network elements. Prior to Nakina, Chris spent 19 years with Nortel in numerous roles including Professional Services, Senior Network Operations and IT consultant, and Director of Optical Ethernet Management. Chris holds a BASc in Electrical Engineering from the University of Toronto.