The Power of Provenance: The Path from Reactive to Proactive Cyberdefense

Author: Yunchuan Wei, Ph.D
Date Published: 28 July 2022

While next-generation firewalls, extended detection and response (XDR) and other security solutions do a great job of detecting and thwarting cyberattacks, it is all too common for a sneaky or camouflaged threat to slip through into the network. An organization’s security team is then required to put forth heroic effort to mitigate and remediate the danger. Combined with the high volume of inaccurate security alerts and the routine maintenance and management tasks that security solutions demand, this means that cybersecurity professionals are too often forced into a reactive mode. Essentially, security teams must act as firefighters, rather than fire prevention specialists.

The hackers behind malware, advanced persistent threats (APTs) and other exploits are only becoming more sophisticated. Often, cybercriminals use evasive techniques to obfuscate an attack’s origins in an attempt to elude first-line security defenses. For example, they might use anonymous networks, mask their Internet Protocol (IP) address, or deploy a botnet.

These methods obscure the true origins, or provenance, of the threat. Certain security methods, such as blacklists and malware signature matching, rely at least in part upon being able to identify the origination point of an attack. The devious methods of disguise used by hackers might, therefore, escape detection by certain baseline security techniques.

To aid in shifting from a reactive to a proactive mindset, academic and industry researchers are beginning to investigate provenance analysis, which can theoretically help identify sources of security risk and enable proactive, automatic tactics (e.g., blocking, sandboxing, termination) to be performed before any harm is done.

Become Proactive Through Provenance Analysis

Provenance analysis is a relatively new field of research in the security realm. Put simply, it takes the vast amounts of log data collected by various network devices, standardizes and analyzes that data, and peels back the layers of obfuscation to identify the source of an attack. Once identified, a network attack can be blocked and/or terminated in real time.
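To make the idea concrete, the following is a minimal sketch of backward tracing over standardized log events, written in Python. It is not a description of any particular product: the event format (timestamp, source, target, action), the entity names, and the function trace_origins are all hypothetical and chosen only for illustration. The sketch assumes the logs have already been standardized into a single list of events and that timestamps are comparable across devices.

```python
from collections import defaultdict, deque

# Hypothetical, already-standardized events: (timestamp, source, target, action).
EVENTS = [
    (1, "203.0.113.7", "proxy-1", "connect"),
    (2, "proxy-1", "web-01", "forward"),
    (3, "web-01", "proc:sh", "spawn"),
    (4, "proc:sh", "file:/tmp/payload", "write"),
    (5, "10.0.0.5", "web-01", "connect"),  # happens after the spawn, so it is not a cause
]


def trace_origins(events, alert_entity, alert_time):
    """Walk backward from an alerted entity and return the earliest known sources."""
    # Index events by their target so that causes can be looked up quickly.
    incoming = defaultdict(list)
    for ts, src, dst, _action in events:
        incoming[dst].append((ts, src))

    # latest_cutoff[node] is the latest time at which the node influenced the alert.
    latest_cutoff = {}
    queue = deque([(alert_entity, alert_time)])
    while queue:
        node, cutoff = queue.popleft()
        if node in latest_cutoff and latest_cutoff[node] >= cutoff:
            continue  # already explored with an equal or wider time window
        latest_cutoff[node] = cutoff
        for ts, src in incoming.get(node, []):
            if ts <= cutoff:  # only earlier events can be causes
                queue.append((src, ts))

    # An origin is a visited entity with no earlier event pointing at it.
    return {
        node
        for node, cutoff in latest_cutoff.items()
        if not any(ts <= cutoff for ts, _src in incoming.get(node, []))
    }


origins = trace_origins(EVENTS, "file:/tmp/payload", alert_time=4)
print(origins)
```

In this toy data set, the trace passes through the proxy and stops at the external address that started the chain, while the later, unrelated connection is ignored because it occurred after the event it would have had to cause.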

Rather than detecting an attack only after it has locked or corrupted network components, provenance analysis allows security teams to become proactive and identify threats before they can cause damage.

Currently, a number of existing security technologies such as next-generation firewalls (NGFWs), intrusion prevention systems (IPSs), and web application firewalls (WAFs) support log aggregation, which allows log data to be examined across multiple dimensions. Security personnel can use log aggregation to identify suspicious anomalies or detect false positives and then tune policies or take other actions.
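As a rough illustration of what examining log data across multiple dimensions can look like, the Python sketch below counts records by any combination of fields and flags sources reported by more than one kind of device. The record fields (device, src_ip, rule), the sample values, and both function names are assumptions made for this example, not the log schema of any specific NGFW, IPS, or WAF.

```python
from collections import Counter

# Hypothetical log records collected from several security devices.
LOGS = [
    {"device": "ngfw", "src_ip": "198.51.100.4", "rule": "port-scan"},
    {"device": "ips", "src_ip": "198.51.100.4", "rule": "sql-injection"},
    {"device": "waf", "src_ip": "198.51.100.4", "rule": "sql-injection"},
    {"device": "waf", "src_ip": "192.0.2.10", "rule": "bad-user-agent"},
]


def aggregate(logs, *dimensions):
    """Count records grouped by the given fields, e.g. ("src_ip",) or ("device", "rule")."""
    return Counter(tuple(record[d] for d in dimensions) for record in logs)


def seen_by_multiple_devices(logs, min_devices=2):
    """Return source addresses reported by at least min_devices distinct devices."""
    devices_by_ip = {}
    for record in logs:
        devices_by_ip.setdefault(record["src_ip"], set()).add(record["device"])
    return {ip for ip, devs in devices_by_ip.items() if len(devs) >= min_devices}


print(aggregate(LOGS, "src_ip").most_common())
print(aggregate(LOGS, "device", "rule").most_common())
print(seen_by_multiple_devices(LOGS))
```

A source that appears in the logs of several independent devices is a reasonable candidate for closer review, whereas a single isolated hit may turn out to be a false positive. Deciding which is which remains a judgment call for the analyst.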

While log aggregation is a valuable security tool, it still requires a human to analyze the data and trace an attack to its root. It is worth considering when choosing security solutions, but it falls short of fully automated provenance analysis.

Challenges Ahead

Before the full potential of provenance analysis can be realized, a number of significant challenges must be overcome. For example, a monumental amount of storage is required to house all the data that provenance analysis intakes. Computing and network bandwidth overhead are also immense challenges that directly affect the practicality of provenance analysis from an engineering viewpoint.

Likewise, traditional network constructs and protocols are typically not designed to support provenance analysis. For example, the traditional Transmission Control Protocol/Internet Protocol (TCP/IP) suite has no means of marking packets as suspicious or malicious, which makes human-performed provenance analysis a very high-cost proposition: security analysts spend a significant amount of time manually sifting through bulk packet dumps to find risky or malevolent traffic.
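One way to picture what packet marking would add is the simplified Python model below, in which each device that forwards a packet records its own identifier and may flag the packet as suspicious. This is purely conceptual: the MarkedPacket type, the forward function, the device names, and the detection rule are all hypothetical, and standard TCP/IP headers carry no such fields. Real marking schemes have to work within very limited header space, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class MarkedPacket:
    """A packet plus hypothetical marking metadata that standard TCP/IP does not carry."""
    payload: bytes
    path: list = field(default_factory=list)  # identifiers of devices that forwarded it
    suspicious: bool = False


def forward(packet, device_id, is_suspicious=lambda pkt: False):
    """Record this device on the packet's path and apply an optional detection rule."""
    packet.path.append(device_id)
    if is_suspicious(packet):
        packet.suspicious = True
    return packet


def looks_like_sql_injection(pkt):
    # Deliberately naive rule, used only to show where a device would set the flag.
    return b"' OR 1=1" in pkt.payload


pkt = MarkedPacket(b"GET /login?user=admin' OR 1=1")
forward(pkt, "edge-fw")
forward(pkt, "core-switch")
forward(pkt, "waf", is_suspicious=looks_like_sql_injection)

print(pkt.path, pkt.suspicious)
```

With marks like these, an analyst or an automated engine could read a packet's route and status directly instead of reconstructing them from bulk packet dumps.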

Another area of concern that must be addressed is sensor network structures, such as those used by the Internet of Things (IoT). While these architectures typically feed log data into the provenance analysis engine, pushing mitigation and enforcement measures back out to IoT devices can be a challenge. With the proliferation of IoT devices in enterprise settings—and their noteworthy vulnerabilities—effective cybersecurity for these devices only becomes more urgent.

Looking ahead, certain newer network architectures such as cybersecurity mesh architecture (CSMA) and software-defined networking (SDN) can help make provenance analysis not only feasible, but more widely available regardless of the legacy network infrastructure. In the interim, several provenance analysis techniques (e.g., log storage query, data packet marking) are likely to be packaged and become available sooner. These transitional solutions can provide a pathway to more complete provenance analysis deployments in the future.

Next Steps for Provenance Analysis

While provenance analysis is still in its infancy, it is an area of active research for academics, the network security industry, and institutions such as the Alan Turing Institute.1 Although there are several obstacles and impediments to achieving a complete provenance analysis solution, the level of risk posed by APTs and other cyberthreats mandates a new way to defend against them. Provenance analysis is a way for cybersecurity professionals to move from a reactive to a proactive stance—and to transform from constantly fighting fires to preventing them from occurring in the first place.

Endnotes

1 The Alan Turing Institute, “Provenance, Security and Machine Learning,” England

Yunchuan Wei, Ph.D

Is a research analyst at the Research Institute of New Technology (RINT) at Hillstone Networks. His scope of research spans deep neural networks, advanced threat correlation analysis, threat intelligence enrichment, and intrusion detection, among other topics in the network and cybersecurity space. Wei has published 6 academic papers (2 SCI-indexed), including a paper accepted at the Institute of Electrical and Electronics Engineers (IEEE) INFOCOM conference, and has been awarded 9 patents. He can be reached at https://www.linkedin.com/in/frank-yunchuan-wei-226723144/.