Maybe congestion on the network has slowed a key business app to a crawl. Maybe remote workers are having trouble logging in. Maybe an entire section of the network has crashed. Whatever the crisis, when the phones start ringing and the alerts start flashing, all eyes often turn to the one person who’s seen it all and fixed it all many times in the past. Like a firefighter rushing into a burning building to save a toddler, the network hero puts out the fire, drawing on a unique understanding of the organization’s arcane IT infrastructure and on expertise in wielding network management software tools.

This so-called “hero culture” is no way to run a network. While the internet may have been designed to withstand a nuclear blast, many CIOs and other IT executives know they’d be hard-pressed to uphold their service commitments if this hero were to leave the company. And with today’s increasingly severe shortage of IT talent, the odds of that happening are probably higher than ever.

Fortunately, a new generation of network data analytics tools can help solve this hero problem. Analysts say these AI-infused predictive technologies mark a significant leap forward on the march to automation, giving companies the ability to troubleshoot problems based more on structured workflows and less on the knowledge and intuition of a single individual.

These systems aggregate network data from historic and real-time sources, then apply the analytics capabilities of machine learning and other forms of AI to help predict and prevent network problems, identify security-related anomalies, optimize performance, and assist with capacity planning. They provide data-driven recommendations on how to remediate network issues that do crop up. Many are designed to automatically respond to problems, although most network pros are still reluctant to hand over total control to an automated system. At this point, it’s still best to have humans devise the temporary workarounds.

Nevertheless, these tools can quickly provide options, and can form the basis of a modern troubleshooting approach that provides actionable information to a broad range of people. Done right, such a process can help companies improve productivity and security while reducing downtime and other network problems.


Identifying the Critical Need

In the past, enterprise networks were relatively simple. They were typically hard-wired and didn’t change much from year to year, making it feasible to create thorough documentation and sets of network diagrams. More often than not, organizations squeaked by thanks to the expertise and seat-of-the-pants processes created by humans with intimate knowledge of these systems.

This state of affairs doesn’t cut it anymore. Studies show that networking teams do a lousy job of maintaining proper documentation. Forty-four percent of networking professionals surveyed by NetBrain in 2017 said they hadn’t updated their diagrams in more than one month, and 61% said more than half of their network documentation was out-of-date.

More to the point, many of today’s enterprise networks are too complex and dynamic for traditional troubleshooting approaches. They’re wired and wireless. Applications are running in data centers and in multi-cloud environments, delivered via containers, as microservices and on serverless platforms. End users can log in from almost anywhere on a variety of devices, and IoT devices are generating a vast new flood of network traffic. No set of manual procedures, no matter how well documented, can keep up.

Of course, many companies have been investing in technology to help address this problem. According to Shamus McGillicuddy, an analyst at Enterprise Management Associates (EMA), the typical company has three to five network performance management systems, which it uses to track a wide variety of data types, including network flow data, packet flows, log data, SNMP data and synthetic network data generated by test tools. But in crunch time, this fragmented approach doesn’t spit out the answers networking teams need.

“People want smarter tools to help them understand what’s happening on the network and they want it more now because more things are changing,” says McGillicuddy.


Use Cases and Benefits

One key benefit of the new predictive tools is that they can be deployed as a holistic overlay to aggregate data from these other systems. According to a recent EMA survey of 150 enterprise network professionals, the most popular use case for advanced data analytics is network security monitoring (38%), followed by network optimization (32%) and business process optimization (27%).

Here is a more detailed description of the most popular use cases:

  • Security: Data analytics systems that take advantage of machine learning and AI can create a picture of what “normal” network traffic looks like and then continuously monitor the network to identify suspicious behavior. AI-based tools have the intelligence to sift through vast amounts of data in a way that’s far beyond the capabilities of IT staffers. One of the thorniest issues associated with typical security alert systems is how to prioritize what could be an overwhelming number of alerts. AI-based systems have the potential to prioritize the anomalies that pose the highest risk to the organization.
  • Predictive Analysis: In the old days, a call from a frustrated user was often the network team’s first indication of a problem. By monitoring and analyzing key performance metrics deep within the network, data analytics software can sometimes identify potential network failures before a business disruption occurs. And these predictive algorithms can be extended beyond network traffic flows in order to identify trends in device usage or user behavior.
  • Performance Optimization: Machine learning systems can recognize potential capacity problems well before a human operator and can suggest ways to re-route traffic or re-balance loads on specific devices in order to keep the network humming at top efficiency.
  • Cost Savings: Predictive maintenance is most commonly associated with manufacturing, such as the ability to anticipate when a motor is about to fail. In the networking domain, it’s far more cost-effective to shift traffic flows or to upgrade a switch or router before the device crashes and an outage occurs.
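
To make the security and performance use cases above a little more concrete, here is a minimal Python sketch of the underlying ideas: learn a "normal" baseline from historical samples, score new samples by how far they stray from it, rank the resulting alerts by severity, and extrapolate a utilization trend to estimate when a link might hit its limit. It is an illustration only, not how any particular vendor's product works. The function names, the threshold of three standard deviations and the sample figures are all assumptions made for this example; commercial tools rely on far richer models.

```python
from statistics import mean, stdev


def anomaly_scores(baseline, recent):
    """Score each recent sample by its distance from the baseline mean,
    measured in baseline standard deviations (a simple z-score)."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline samples
    if sigma == 0:
        # Perfectly flat baseline: any deviation at all is maximally unusual.
        return [0.0 if x == mu else float("inf") for x in recent]
    return [abs(x - mu) / sigma for x in recent]


def prioritize(alerts, threshold=3.0):
    """Keep (name, score) alerts at or above the threshold (an assumed
    cutoff), sorted so the most severe deviation comes first."""
    flagged = [(name, score) for name, score in alerts if score >= threshold]
    return sorted(flagged, key=lambda alert: alert[1], reverse=True)


def samples_until_limit(utilization, limit):
    """Fit a least-squares line to evenly spaced utilization samples and
    estimate how many more sampling intervals remain before the trend
    reaches `limit`. Returns None if there are fewer than two samples or
    the trend is flat or falling."""
    n = len(utilization)
    if n < 2:
        return None
    x_bar = (n - 1) / 2
    y_bar = mean(utilization)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(utilization))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_bar - slope * x_bar
    crossing = (limit - intercept) / slope  # sample index where the line hits limit
    return max(0.0, crossing - (n - 1))


if __name__ == "__main__":
    # Hypothetical traffic figures (e.g. Mbps) for one link.
    baseline = [100, 102, 98, 101, 99]
    scores = anomaly_scores(baseline, [100, 110])
    alerts = [("steady-link", scores[0]), ("spiking-link", scores[1])]
    print(prioritize(alerts))

    # Hypothetical daily utilization percentages for a switch port.
    print(samples_until_limit([50, 52, 54, 56], limit=60))
```

A z-score against a fixed baseline is about the simplest possible notion of "normal." It ignores time-of-day patterns and correlations across devices, which is precisely where the machine learning in commercial analytics platforms is meant to add value.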

Data Analytics Reality Check

While the technology behind data analytics is advancing at a rapid pace, adoption among enterprises has been slowed by a number of speed bumps.

For starters, there’s considerable technology risk, says McGillicuddy. Many customers are understandably hesitant to pull the trigger on expensive, bleeding-edge products. Vendor selection is also a challenge. Incumbent network management system vendors are adding AI-based enhancements to their existing products. Traditional vendors such as IBM, BMC and CA argue their new analytics tools are the safest way forward, while new but unproven “AIOps” start-ups such as BigPanda and Moogsoft say it’s time for a change. Since this is a relatively new area, there are no good market research studies to help customers choose.

Cultural issues also stand in the way. Since they’re not ready to fully trust automated systems with a decision that could spark a disastrous outage or slowdown, networking pros tend to only use emerging automation features for low-level, low-risk functions.

So what’s the best way to get buy-in?

One good angle is the promise of reduced stress. Oftentimes, our hero is the last one who wants to give up control in tough situations. But by helping to make sure an analytics tool is deployed properly, heroes can have a hand in creating a more manageable work existence for themselves, one in which they can trust teammates to handle some of the cardiac moments.

Adoption of these tools can also reduce the amount of time networking pros spend on break-fix. Rather than be forever under the gun to perform despite outdated tools, they can spend more of their time working on more interesting, forward-looking projects.

Finally, these new systems enable networks to transition from being reactive to proactive. Networks can take further steps towards becoming self-configuring, self-optimizing and self-healing. And, most importantly, intelligent networks can provide a secure, reliable, resilient platform that lets business workers stay focused on business.

With these new data analytics platforms, everyone can become a networking hero.