Log analysis is the process of reviewing and interpreting logs generated by networks, operating systems, applications, servers, and other hardware and software components to gain visibility into the performance and health of IT infrastructure and application stacks.
Logs typically contain time-series data that is either streamed to collectors in real time or stored for later review. Log analysis offers insight into system performance and can surface potential problems such as security breaches or imminent hardware failure.
Since logs offer visibility into application performance and health, log analysis lets operations and development teams understand and remedy performance issues as they arise during business operations.
Log analysis serves many important functions, two of the most significant being regulatory compliance and cybersecurity.
Some regulatory bodies require organizations to perform log file analysis to be certified as compliant with their regulations, and any organization that wants to improve its cybersecurity posture needs log analysis expertise to uncover and remediate cyber threats of all kinds. Regulatory requirements that log analysis helps meet include ISO/IEC 27002:2013, the code of practice for information security controls; PCI DSS v3.1, which covers the protection of credit card and other payment data; and NIST 800-137, which addresses continuous monitoring for federal information systems and organizations.
Logs are time-series records of actions and activities generated by applications, networks, devices (including programmable and IoT devices), and operating systems. They are typically stored in a file or database, or sent to a dedicated application called a log collector for real-time analysis.
A log analyst's task is to interpret the full range of log data and messages in context, which requires normalizing the log data so that a common set of terminology is used. This prevents the confusion that might arise if one function signals ‘normal’ and another signals ‘green’ when both mean that no action is required.
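As a minimal sketch of this kind of terminology normalization, the mapping below collapses vendor-specific status labels into one shared vocabulary; the specific terms and labels are hypothetical examples, not a standard.

```python
# Minimal sketch: map vendor-specific status terms to one shared vocabulary.
# The terms and canonical labels here are hypothetical examples.
STATUS_MAP = {
    "normal": "ok",
    "green": "ok",
    "warn": "warning",
    "yellow": "warning",
    "crit": "critical",
    "red": "critical",
}

def normalize_status(raw_status: str) -> str:
    """Return a common status label, falling back to 'unknown'."""
    return STATUS_MAP.get(raw_status.strip().lower(), "unknown")

print(normalize_status("GREEN"))   # -> ok
print(normalize_status("Normal"))  # -> ok
```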
Generally, log data is collected for the log analysis program, cleansed, structured or normalized, and then made available to analysts to detect patterns or uncover anomalies such as a cyber-attack or data exfiltration. Performing log file analysis therefore follows these broad steps: collection, cleansing, structuring or normalization, and analysis.
Here are some components of an effective log analysis system:
Normalization: Converting log data from different sources into a consistent format helps ensure that ‘apples to apples’ comparisons can be made, and that data can be centrally stored and indexed regardless of the log source.
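For illustration, here is a small sketch, assuming two hypothetical input formats (an Apache-style access line and a JSON application event), that normalizes both into one common record layout so they can be stored and indexed together:

```python
import json
from datetime import datetime

# Sketch: normalize two hypothetical log formats into one shared record layout.
def from_access_line(line: str) -> dict:
    # e.g. '10.0.0.5 - - [12/Mar/2024:10:15:32 +0000] "GET /health" 200'
    ip, _, _, ts, *_ = line.split()
    timestamp = datetime.strptime(ts.lstrip("["), "%d/%b/%Y:%H:%M:%S")
    return {"source": "web", "timestamp": timestamp.isoformat(),
            "host": ip, "message": line}

def from_json_event(raw: str) -> dict:
    # e.g. '{"time": "2024-03-12T10:15:40", "host": "app01", "msg": "login ok"}'
    event = json.loads(raw)
    return {"source": "app", "timestamp": event["time"],
            "host": event["host"], "message": event["msg"]}
```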
Pattern recognition: Modern machine learning (ML) tools can be applied to uncover patterns in log data that may point to anomalies, for instance by comparing incoming messages against an external threat list to determine whether a threat is hidden in the traffic. This helps filter out routine log entries so analysis can focus on those that might indicate abnormalities of some kind.
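A rule-based stand-in for that kind of pattern matching might look like the following sketch; the threat indicators and routine patterns are hypothetical, and a production system would typically learn such patterns with ML rather than hard-code them.

```python
import re

# Rule-based sketch: flag messages containing known threat indicators, set
# aside messages matching known-routine patterns, and keep the rest for review.
# Both lists are hypothetical placeholders.
THREAT_INDICATORS = {"203.0.113.99", "evil-domain.example"}
ROUTINE_PATTERNS = [re.compile(r"health check ok"), re.compile(r"session expired")]

def triage(message: str) -> str:
    if any(indicator in message for indicator in THREAT_INDICATORS):
        return "alert"       # matches an external threat indicator
    if any(pattern.search(message) for pattern in ROUTINE_PATTERNS):
        return "routine"     # known-good pattern, safe to filter out
    return "review"          # unrecognized pattern, keep for the analyst

print(triage("outbound connection to 203.0.113.99"))  # alert
print(triage("health check ok on web01"))             # routine
```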
Tagging and classification: Tagging log entries with keywords and classifying them by type enables filters that accelerate the discovery of useful data. For example, all entries tagged “LINUX” could be set aside when tracking a virus that attacks Windows servers.
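A small sketch of such tag-based filtering, using made-up entries and tags:

```python
# Sketch: tag entries at ingest, then filter by tag during an investigation.
# The entries, tags, and messages below are hypothetical.
entries = [
    {"tags": {"LINUX", "ssh"}, "message": "Accepted publickey for admin"},
    {"tags": {"WINDOWS", "smb"}, "message": "Failed logon from 198.51.100.7"},
    {"tags": {"WINDOWS", "av"}, "message": "Threat quarantined: Trojan.Win32"},
]

# Investigating Windows-only malware: set aside everything tagged LINUX.
relevant = [entry for entry in entries if "LINUX" not in entry["tags"]]
for entry in relevant:
    print(entry["message"])
```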
Correlation: Analysts can combine logs from multiple sources to help decode an event not readily visible in data from a single log. This can be particularly useful during and after cyber-attacks, where correlating logs from network devices, servers, firewalls, and storage systems can surface data relevant to the attack and reveal patterns that were not apparent from any single log.
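The sketch below correlates hypothetical firewall and server entries by source IP within a five-minute window; the field names and records are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Sketch: correlate firewall and server logs by source IP within a short time
# window to surface activity that no single log reveals on its own.
firewall = [{"time": datetime(2024, 3, 12, 10, 15), "ip": "198.51.100.7", "action": "allow"}]
server   = [{"time": datetime(2024, 3, 12, 10, 16), "ip": "198.51.100.7", "event": "failed login"}]

window = timedelta(minutes=5)
for fw in firewall:
    for sv in server:
        if fw["ip"] == sv["ip"] and abs(fw["time"] - sv["time"]) <= window:
            print(f"{fw['ip']}: firewall {fw['action']} followed by server {sv['event']}")
```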
Artificial Intelligence: Artificial intelligence and machine learning (AI/ML) tools incorporated into modern log analysis systems can automatically recognize and discard or ignore log entries that do not help uncover anomalies or security breaches. Sometimes referred to as “artificial ignorance,” this approach also enables log analysis to send alerts about scheduled routine events that did not occur when they should have, since their absence stands out once all expected entries are accounted for.
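One way to sketch “artificial ignorance” is to ignore lines matching known-routine patterns, surface everything else, and alert when an expected scheduled event never appears; the patterns and expected events below are hypothetical.

```python
import re

# Sketch: ignore known-routine lines, keep unrecognized lines for review, and
# report expected scheduled events that never appeared. Patterns are made up.
ROUTINE = [re.compile(p) for p in (r"^cron: job .* finished$", r"^backup completed$")]
EXPECTED = {"backup completed"}

def review(lines):
    seen_expected = set()
    unusual = []
    for line in lines:
        if any(pattern.search(line) for pattern in ROUTINE):
            seen_expected.update(e for e in EXPECTED if e in line)
            continue  # routine: ignore
        unusual.append(line)  # not recognized: surface for analysis
    missing = EXPECTED - seen_expected
    return unusual, missing

unusual, missing = review(["cron: job rotate-logs finished", "disk error on /dev/sda"])
print(unusual)   # ['disk error on /dev/sda']
print(missing)   # {'backup completed'} -> alert: scheduled event did not occur
```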
Structured: To give the most value, all log data should be held in a central repository and structured so it is understandable to both humans and machines. Thanks to advances in log analysis tools, much of this heavy lifting can be done automatically. Organizations should therefore practice full-stack logging across all system components to get the most complete view of activities and anomalies.
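As a simple illustration of structured logging, the sketch below uses Python’s standard logging module to emit JSON records that both people and machines can parse; the field names and service name are illustrative assumptions.

```python
import json
import logging

# Sketch: emit structured (JSON) log records instead of free-form text.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout-service")  # hypothetical service name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order placed")  # -> {"time": "...", "level": "INFO", ...}
```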