Visual Correlation of Network Alerts

Transcription

Visualization for Cybersecurity

Visual Correlation of Network Alerts

Stefano Foresti, James Agutter, Yarden Livnat, and Shaun Moon, University of Utah
Robert Erbacher, Utah State University
The VisAlert visual correlation tool facilitates situational awareness in complex network environments by providing a holistic view of network security to help detect malicious activities.

Society’s dependence on information systems has made cybersecurity an increasingly important issue. Computer networks transport financial transactions, sensitive government information, power plant operations, and personal health information. The spread of malicious network activities poses great risks to the operational integrity of many organizations and imposes heavy economic burdens on life and health.

Of particular concern is the identification of sophisticated attacks. Naive attacks are easily detected and have a small likelihood of success—for instance, system administrators and network analysts aren’t very concerned with script kiddies or unsophisticated vulnerability exploits because intrusion detection systems (IDSs) readily detect them. Port scans are another example. These attacks try to identify the services running on a system by sending network packets to that service (a specified network port). A naive scan uses simple TCP connect packets sent as quickly as possible. An IDS can easily detect port scans because of their close proximity in time and high volume. Sophisticated attacks are harder to detect because they use stealthy mechanisms and more capable techniques. A sophisticated port scan can use alternatives to TCP connect packets or dilute the scan over time such that there is a delay of 0.4 seconds, 15 seconds, 5 minutes, or even longer between packets. This delay prevents easy algorithmic identification and can cause activities to be lost in the noise.
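To make this concrete, the following minimal sketch shows the kind of time-window heuristic a simple scan detector might use, and why spreading probes minutes apart slips under it. The thresholds, addresses, and class name are illustrative assumptions, not values taken from any particular IDS.

    from collections import defaultdict, deque

    # Hypothetical thresholds: flag a source that probes more than
    # MAX_PORTS distinct ports within any WINDOW-second interval.
    WINDOW = 10.0      # seconds
    MAX_PORTS = 20

    class NaiveScanDetector:
        def __init__(self):
            # per-source sliding window of (timestamp, port) probes
            self.history = defaultdict(deque)

        def observe(self, src_ip, dst_port, timestamp):
            """Record one probe; return True if the source looks like a scanner."""
            window = self.history[src_ip]
            window.append((timestamp, dst_port))
            while window and timestamp - window[0][0] > WINDOW:
                window.popleft()  # discard probes that fell out of the window
            return len({port for _, port in window}) > MAX_PORTS

    detector = NaiveScanDetector()
    # A burst scan (one port every 0.01 seconds) is flagged almost immediately...
    burst = any(detector.observe("10.0.0.5", p, p * 0.01) for p in range(1, 200))
    # ...but probing the same 199 ports once every 5 minutes never keeps more
    # than one probe in any window, so this detector stays silent.
    slow = any(detector.observe("10.0.0.6", p, p * 300.0) for p in range(1, 200))
    print(burst, slow)  # True False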
IDSs analyze network traffic and host-based processes in an attempt to detect malicious activity. When they
identify anomalous activity or activity matching known
malicious activity, these systems generate an alert to
notify the administrators or analysts of their impending
doom. Each alert identifies the threat type using the
alert type classification system. IDSs often store these
alerts in stove-piped databases that aren’t easily correlated to other alerts or logs on the network. Thus, network analysts must use a myriad of tools that show different information in different formats, making it
difficult for them to gain an overall understanding of
the network’s security status. The high rate of false positives that these systems generate compounds this complexity.
Because attacks are dynamic, if analysts can’t absorb
and correlate the available data, it’s difficult for them to
detect sophisticated attacks. Developing tools that
increase the situational awareness and understanding of
all those responsible for the network’s safe operation
can increase a computer network’s overall security. System administrators are typically limited to textual or
simple graphical representations of network activity
(Bejtlich1 describes many available capabilities and their
applications).
Information visualization techniques and methods in
many applications have effectively increased operators’
situational awareness, letting them more effectively
detect, diagnose, and treat anomalous conditions.2 A
growing body of research validates the use of visualization to solve complex data problems3-5 (see the “Previous
Work” sidebar). Visualization elevates information comprehension by fostering rapid correlation and perceived
associations. To that end, the display’s design must support the decision-making process: identify problems,
characterize them, and determine appropriate responses. It must also present information in a way that’s easy
for the user to process. Our visualization technique integrates the information in log and alert files into an intuitive, flexible, extensible, and scalable visualization
tool—VisAlert—that presents critical information concerning network activity in an integrated manner,
increasing the user’s situational awareness.
Objectives and assumptions
We based our research and development on several
premises to ensure that visualization for cybersecurity
reflects the needs of operational environments. In general, the visualization techniques must be scalable,
robust, and effectively and intuitively represent the data
and relationships that are relevant to decision making.
The objective is to overcome the limitations of existing
Previous Work

Historically, visualization has been applied to network monitoring and analysis, primarily for monitoring network health and performance. Initial visualization techniques for intrusion detection system (IDS) environments focused on simple scales and color representations to indicate state or threat level. The need for better analysis mechanisms for security and IDS-related data has motivated the exploration of more advanced visualization techniques. Many of these techniques effectively visualize malicious activities such as worm or denial-of-service (DoS) attacks. However, these visualization techniques tend to focus on specific problems rather than general alert correlation for an entire enterprise.

Other techniques have focused on visual pattern matching—that is, the representation of known attacks. Teoh et al.1,2 analyze worms and other large-scale attacks on Internet routing data. Similarly, McPherson et al.3 developed a technique for visualizing port activity that’s geared toward monitoring large-scale networks for naive port scans and DoS attacks.

Yin et al.4 and Lakkaraju et al.5 focus on representing netflows and associated link relationships. Such techniques are critical for analyzing attacks and IDS data, but they quickly suffer scalability issues and are limited as to the number of representable parameters.

Wood6 describes basic graph-based visualization techniques, such as pie charts and bar graphs, and how analysts can apply them to typical network data available to all system administrators. This work describes how users can implement visualization and apply it to such data, as well as the meaning behind the identified results. The technique is limited only by the visualization’s simplicity, which currently can’t analyze the high-volume, high-dimensional data generated by today’s environments. This remains a major challenge for IDS data analysis in general.

Traditional representations and network alert-reporting techniques tend to use a single sensor-single indicator display paradigm. Each sensor uniquely represents its information (indicator) and doesn’t depend on information gathered by other sensors. The benefit of such an approach lies in the separation of the various sensors. The user can thus optimize each sensor’s indicator for the data it produces, and then can choose which sensors to use in an analysis. Furthermore, the failure of one sensor doesn’t impact the rest of the system’s capability.

Consequently, the separation between sensors is also the weakness of this representation technique. Because each indicator is isolated, the user must observe, condense, and integrate information generated by the independent sensors across the entire enterprise. This process of sequential, piecewise data gathering makes it difficult to develop a coherent, real-time understanding of the interrelationship between the information being displayed—particularly the identification of malicious attacks.

References

1. S. Teoh et al., “Case Study: Interactive Visualization for Internet Security,” Proc. IEEE Conf. Visualization, IEEE CS Press, 2002, pp. 505-508.
2. S. Teoh, K. Ma, and S. Wu, “Visual Exploration Process for the Analysis of Internet Routing Data,” Proc. IEEE Conf. Visualization, IEEE CS Press, 2003, pp. 523-530.
3. J. McPherson et al., “Portvis: A Tool for Port-Based Detection of Security Events,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 73-81.
4. X. Yin et al., “Vis-Flowconnect: Netflow Visualizations of Link Relationships for Security Situational Awareness,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 26-34.
5. K. Lakkaraju, W. Yurcik, and A. Lee, “NVisionIP: Netflow Visualizations of System State for Security Situational Awareness,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 65-72.
6. A. Wood, Intrusion Detection: Visualizing Attacks in IDS Data, Global Information Assurance Certification (GIAC) Practical, SANS Inst., 2003.
cybersecurity tools and visualizations that focus on narrow problems, work on small data sets, or don’t effectively map to the human visual and decision-making processes. To this end, our premises include:

■ Analyst involvement. We worked with security analysts with experience in large government networks. Their continual interactive involvement has ensured our work’s value and validity and thus a good fit between problem and solution, based on user needs.
■ Realistic data. We developed a realistic scenario to validate the design and used simulated data for testing.
■ Data size and completeness. The visualization handles an organization’s subnets and hosts, numerous data sets, and disparate relationships across multiple logs. Our scalability solution has widespread applicability in visualization research.
■ Holistic view. Providing a visual holistic view of the network’s status—the least fulfilled need in state-of-the-art technology—helps analysts quickly decide how pervasive and severe problems are, and how to direct further attention.
■ Environment extensibility. We gave users the ability to add new data sources, alert types, attack signatures, and data views, as well as to enrich the visualization with user suggestions.

Our goal is to aid analysts’ decision making by providing a visual correlation mechanism. We don’t try to solve the entire intrusion-detection problem, nor do we aim to make decisions for the user.
Interdisciplinary design process
We employed a user-centered interdisciplinary
methodology6 for developing information displays that
promotes design as a function of human behavior and
interaction between subject and object. We drew our
research techniques from several disciplines, including
architecture, cognitive psychology, and computer science (see Figure 1). We loosely based our design approach on Snodgrass and Coyne’s hermeneutical circle concept,7 which is an iterative process of implementing a design, learning and understanding from discussion and feedback, and subsequently refining the design.

[Figure 1. Interdisciplinary design methodology using techniques from cognitive psychology, architecture and design, and computer science.]

Domain analysis
Our domain analysis study aims to identify the most important objects and operations in the chosen domain, these objects’ attributes, the relationships among objects, and how people in the domain interact with them.8 The result is a conceptual model representing the system scenarios and the functional relationships and criticality among variables, whether objective or subjective (in the user’s mental model). This is necessary to design the software and the visual displays that fulfill a group of people’s needs for a particular purpose.9 Systems that have been designed or modified with a solid task analysis at the onset tend to be consistently more useable, lead to better human performance, and require less training.10

In the study, we used the knowledge of intrusion-detection analysts, network administrators, and security-assessment professionals. The goal of the analysis is to ensure that the intended users will find the visualizations meaningful and intuitive, identify the components from a list of alternatives, and extract useful information from the domain-specific design.

To achieve an understanding of the user’s mental model, we

■ performed background analysis, including a literature review and informal consultations with researchers;
■ conducted semistructured interviews with administrators, security analysts, and decision makers;
■ made unstructured naturalistic observations of problem solving; and
■ organized and reported the data into workflow diagrams.

During the domain analysis, we attempted to gain understanding in these key areas:

■ rules of thumb or tricks of the trade that guide reasoning;
■ empirical knowledge gained by experience, drawing on laws and relationships;
■ expert’s overall model of the problem; and
■ tasks, including control, prediction, diagnosis, planning, monitoring, instruction, and interpretation.
We’ll submit the specifics of the procedure used for
domain analysis and the details of the cognitive analysis studies to a cognitive and human-factor studies publication.
Decision-making process
The domain analysis work identified six discrete steps
in the decision-making process. These steps identify critical areas where analysts need additional support, and
where visualization can provide the greatest benefit.
1. Identify an incident related to the computer network
that the individual is responsible for (that is, detect
that an incident occurred).
2. Evaluate the incident to see if it’s a benign alarm or
an indication that further investigation is needed
(that is, is the detected incident suspicious?).
3. Determine how prevalent the problem is and what
else is being affected. The analyst determines the
problem’s boundaries by analyzing other information to gain knowledge about the problem’s criticality. Analysts also explore what other machines are
experiencing these problems.
4. Drill down data to identify patterns and test hypotheses. The analyst tests multiple hypotheses with
detailed information about the questionable matter.
5. Report and mark results to communicate information to others. After identifying a problem, the analyst records and describes it within the larger context.
6. Direct a response. The analyst directs the responsible individuals to respond appropriately to the problem.

Figure 2 shows the workflow diagram section resulting from the domain analysis. These workflow diagrams help designers determine the most relevant information to visualize at different stages of the decision-making process.

[Figure 2. A portion of the network analysts’ workflow diagram resulting from the domain analysis.]
Relevant factors in data analysis and user requirements
The domain analysis work also let us identify the data analysis priorities and process. Our relevant findings include:

■ A false-positive alert shouldn’t appear correlated to other alerts, but a sustained attack will likely raise several alerts. Furthermore, real attack activities will likely generate multiple alerts of different types.
■ Users need a primary view of the destination IPs in their network of responsibility. The source IP might become an object of interest and investigation after they detect a problem.
■ Detecting potentially dangerous attacks requires the query and correlation of enterprise-wide large data sets. Users want access to all sorts of data, but need the capability to filter and remove clutter.

Our findings provided guidelines and priorities for designing the visualization.

Visualization design
The first step in the design phase is to develop a set of visual metaphors and descriptors along with rules defining why, how, and where to use each descriptor. The objective is to represent information by exploiting perceptual abilities innate to human beings and embedding them into a set of objects’ graphic properties, behaviors, and relationships. We use basic 2D and 3D design principles such as

■ mapping data values to 1D, 2D, and 3D geometrical primitives;
■ assigning graphic attributes such as color and texture;
■ using graphic associations such as proximity, location, similarity, and contrast; and
■ assigning transformations such as changes in the design geometry or organization.

For instance, the application of perceptual grouping (using color, similarity, connectedness, motion, sound, and so on) can facilitate the understanding of the relationships between individual pieces of data. Proper presentation of information also affects the speed and accuracy of higher-level cognitive operations.

Modern human factors theory suggests that for effective data representation we must present information in a manner consistent with the user’s perceptual, cognitive, and response-based mental representations. When the information is consistent with cognitive representation, performance is often more rapid, accurate, and consistent. Conversely, failure to use perceptual principles appropriately can lead to erroneous information analyses. It’s therefore imperative that we present information in a manner that facilitates the user’s ability to process it and minimizes any mental transformations that must be applied to the data. This qualitative filtering and depiction of information toward achieving a clear end essentially constitutes representation design.11,12

W3 concept
The main problem in correlating alerts from disparate logs is the seeming lack of mutual grounds on which to base any kind of comparison between alerts. We’ve determined that alerts must possess what we term the W3 premise: the when, where, and what attributes. This concept lets us visually correlate multiple alerts.

■ When refers to the point in time at which the alert occurred.
■ Where refers to the local network node—for example, an IP address that the alert pertains to.
■ What refers to an indication of the alert type—for example, (log = snort, gid = 1, sid = 103).

[Figure 3. The VisAlert W3 visualization concept: a line connecting an alert type (what) at time (when) to a resource (where) represents an alert instance.]

We typically correlate alerts based on their when or what attributes. If we group the alerts based on their what attributes, we correlate them within their groups based on additional attributes associated with that attribute. However, the alert’s real value relates to the local resources it pertains to. Preserving the resources’ status and integrity is in fact an IDS’s main focus. The alerts’ what and when attributes have little if any inherent value by themselves. Consequently, visually correlating alerts with respect to resources is the key factor of this work. A discussion of prior work and issues of correlating alerts is available elsewhere.13

The need to correlate the who attribute is secondary in the decision-making process. Using the W3 concept lets us simplify the representation, considering the visual clutter that would arise from such a huge domain as remote IPs. We can thus concentrate on the local resources, which are what analysts try to protect. However, we incorporate the who to obtain a full representation of who, when, where, and what (W4) using the virtual log, which we describe later.

Visualization concept
Figure 3 shows our design layout, which maps an alert’s where attribute into the center of the circle. We represent this using a topology map of the network under scrutiny.

The layout maps an alert instance’s what attribute to the different sections of the outside circular element. This arrangement allows for flexibility with regard to the number of alert types as well as easy integration of new alert types.

The layout maps the when attribute of an alert instance to the circle’s radial sections, moving from most recent (closest to the topology map) to the least recent as it radiates outward.

We can now visualize alert instances as lines from ρ(what, when) → (angle, radius) on the outer ring, to Ψ(where) → (x, y) in the inner circle, where ρ and Ψ are general projections of the alerts into our two domains. Our system lets the user dynamically control and configure these two projections as necessary.

To reduce the possible visual clutter when showing all alerts simultaneously, we divide the when space into varying intervals and show only the alert instances for the most recent history period. The remaining history periods show only the number of alert instances that occurred during that period.
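The two projections are easiest to see in code. The following is a minimal sketch of the layout mapping under stated assumptions: the alert fields, the ring geometry, and the five-minute history bucket are illustrative choices, not VisAlert's actual implementation.

    import math
    from dataclasses import dataclass

    @dataclass
    class Alert:
        when: float   # timestamp in seconds
        where: str    # local resource, e.g., an IP address
        what: str     # alert type, e.g., "snort/1/103"

    # Assumed layout parameters (illustrative only).
    ALERT_TYPES = ["snort/1/103", "windows/event/529", "http/400", "ftp/530"]
    NODE_POS = {"10.1.1.7": (0.10, -0.25), "10.1.1.9": (-0.30, 0.15)}  # inside the unit circle
    INNER_RADIUS = 1.0    # edge of the topology map
    RING_WIDTH = 0.15     # radial width of one history period
    BUCKET_SECONDS = 300  # one history period = 5 minutes

    def rho(alert, now):
        """rho(what, when) -> (angle, radius): the beam's endpoint on the outer rings."""
        slot = ALERT_TYPES.index(alert.what)
        angle = 2 * math.pi * slot / len(ALERT_TYPES)             # what -> angular sector
        age_buckets = int((now - alert.when) // BUCKET_SECONDS)
        radius = INNER_RADIUS + RING_WIDTH * (age_buckets + 0.5)  # when -> ring, recent = innermost
        return angle, radius

    def psi(alert):
        """psi(where) -> (x, y): the beam's endpoint on the topology map."""
        return NODE_POS[alert.where]

    def beam_endpoints(alert, now):
        """An alert instance is drawn as a line between these two points."""
        angle, radius = rho(alert, now)
        outer = (radius * math.cos(angle), radius * math.sin(angle))
        return outer, psi(alert)

    print(beam_endpoints(Alert(when=990.0, where="10.1.1.7", what="snort/1/103"), now=1000.0))

In such a scheme, only alerts falling in the most recent bucket would be drawn as full beams; older buckets would contribute only to the per-sector counts described above.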
Additional visual indicators
We incorporated additional visual indicators that
encode information to increase the user’s situational
awareness. In the design’s first iterations, we used color
to identify alert classifications. In current display implementations, color indicates that user-determined thresholds have been exceeded. For instance, red indicates
high priority, while green indicates low priority. We’ve
also adopted a method of increasing the icon size for
nodes experiencing several alerts. The assumption is
that a resource or node on the topology that’s experiencing multiple unique alerts from both host- and network-based sources has a higher probability of malicious
activity than one experiencing only one alert. A scan of
a particular machine is an example. Although the scan
might generate a Snort alert, the activity might be
benign; however, a standard IDS will catch this simple
probe and reject the traffic. If, on the other hand, a
machine is receiving a Snort alert in addition to a Windows log alert, that machine might be experiencing an
intrusion attempt or even a successful attack. The node’s
size is a clear indicator and easily distinguishes the node
from other machines, thus attracting the attention of
the user, who can correct the problem on the suspect
machine.
The alert beams encode a problem’s persistence. If
many of the same alerts are triggered on a particular
machine over a given time interval, the line thickens to
show the number of alerts (see Figure 4). In this manner,
continual or recurring problems quickly become evident, letting the user take swift action. A beam’s color
encodes the alert’s severity when available—for example, Snort associates a severity level with each alert. Thus, more severe problems become immediately distinguishable from other alerts.

[Figure 4. VisAlert exhibiting multiple alerts and additional relevant visual indicators, including alert type using color coding, larger node size showing more alert types, and larger beam size for persistence of a particular problem.]
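As a concrete reading of these encodings, the sketch below derives a node's icon size, a beam's width, and a beam's color from alert data. The scaling constants and the severity-to-color palette are assumptions chosen for illustration, not VisAlert's actual values.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Alert:
        what: str                       # alert type, e.g., "snort/1/103"
        severity: Optional[int] = None  # Snort uses 1 for the highest priority

    # Assumed palette: red for exceeded high-priority thresholds, green for low.
    SEVERITY_COLORS = {1: "red", 2: "orange", 3: "green"}

    def node_icon_size(alerts_on_node, base=6.0, step=3.0):
        """Grow the node icon with the number of distinct alert types it has raised."""
        return base + step * len({a.what for a in alerts_on_node})

    def beam_width(alerts_for_beam, base=1.0, step=0.5, cap=8.0):
        """Thicken a beam with the number of identical alerts in the current interval."""
        return min(base + step * len(alerts_for_beam), cap)

    def beam_color(alerts_for_beam, default="gray"):
        """Color a beam by the most severe alert it carries, when severity is available."""
        severities = [a.severity for a in alerts_for_beam if a.severity is not None]
        return SEVERITY_COLORS.get(min(severities), default) if severities else default

    alerts = [Alert("snort/1/103", severity=2), Alert("snort/1/103", severity=1),
              Alert("windows/event/529")]
    print(node_icon_size(alerts), beam_width(alerts[:2]), beam_color(alerts[:2]))
    # -> 12.0 2.0 red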
Visual filters
VisAlert provides many ways to filter the data to
reduce visual clutter or help network analysts focus
on particular events of interest. Users can turn the
alert beams on and off globally, resulting in small lines
indicating which alert has been triggered on the particular nodes using color and orientation. Users can
selectively turn particular alert beams on or off by
clicking the desired beam. Users can turn alert groupings and individual alerts on or off through a dialog
box. This can help users fine-tune the display to show
only alerts that are relevant and of high priority to
their organization, eliminating many instances they
would otherwise observe. In addition, users can filter the data to show machines that are experiencing a certain number of alert types, that fall within specific IP ranges, that are experiencing the same alerts, or that have the same outside IP associated with them or with a particular alert.
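One plausible way to express such filters is as composable predicates over a node and its alerts, as in the sketch below; the predicate names and data layout are illustrative assumptions rather than VisAlert's API.

    import ipaddress

    # Each filter is a predicate over (node, alerts_on_node); a node is shown
    # only if every active filter accepts it.
    def min_alert_types(n):
        return lambda node, alerts: len({a["what"] for a in alerts}) >= n

    def in_ip_range(cidr):
        net = ipaddress.ip_network(cidr)
        return lambda node, alerts: ipaddress.ip_address(node) in net

    def has_outside_ip(outside_ip):
        return lambda node, alerts: any(a.get("src") == outside_ip for a in alerts)

    def visible_nodes(topology, active_filters):
        """topology maps node IP -> list of alert dicts with 'what' and optional 'src'."""
        return [node for node, alerts in topology.items()
                if all(f(node, alerts) for f in active_filters)]

    topology = {
        "10.1.1.7": [{"what": "snort/1/103", "src": "192.0.2.44"},
                     {"what": "windows/event/529"}],
        "10.1.2.9": [{"what": "http/400"}],
    }
    print(visible_nodes(topology, [min_alert_types(2), in_ip_range("10.1.1.0/24")]))
    # -> ['10.1.1.7']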
Simulated attack scenario
We used a simulated attack scenario to validate the
display’s efficacy prior to implementation. The sequence
of the images in this scenario shows how a malicious
attack emerges out of the background noise in our visualization design, helping users to rapidly detect and identify the attack. The attacks consisted of exploiting a
vulnerable host to gain access to more secure machines.
A security assessment expert developed the scenario. He
generated an attack using different methods and broke
the attack into different stages. To add sufficient noise, we fed this information into a data set polluted with other network traffic. We characterize this scenario as an external attacker with five distinct stages. During the five stages, as the attack moves from normal network activity to data exfiltration, the visualization will show how the node under attack slowly emerges out of the background because of the number of types of alerts it receives.
Stage 1: reconnaissance
Reconnaissance is the identification of hosts and services on a targeted network. This form of reconnaissance
often involves simple Web queries, social engineering,
and dumpster diving.
Figure 5 shows the network’s status during the reconnaissance stage. Given the attacker’s lack
of presence on the network, this can also be considered
normal network activity with multiple instances of Snort
alerts tripped at a particular time. In this initial attack
stage, the attacker is generally passive with respect to
the network. At this time, identifying an attack in the
noisy normal network activity is unlikely.

[Figure 5. In stage 1, the attacker is doing reconnaissance—that is, looking for hosts and services on the network. VisAlert exhibits normal activity.]
Stage 2: probe
In this context, a probe is an attacker’s attempt to
gather information about services on a targeted host or
hosts discovered during the reconnaissance phase. Analysts could see the Internet Protocol Communication
(IPC) violations during this phase because of a particular Snort alert that was tripped on a machine on their
network topology. An IPC violation occurs when a connection attempts to violate defined TCP or IP interface requirements. This often indicates a forged packet—that is, an attacker created a packet not conforming to a proper connection. This could indicate an attempt to hijack a session, scan a system, or attack a vulnerability.

The line’s thickness indicates the persistence of the same Snort alert over time. A persistent Snort alert indicates its recurrence. This is typical of naive scans in which an attacker begins scanning a sequence of ports on a single machine or on multiple machines. In this case, the attacker has targeted a single host with a long-running scan. Such a scan can not only identify what services are running but can also potentially identify which versions of the services are in use, as well as the version of the operating system. An attacker can use this type of detailed information to identify specific vulnerabilities for known attacks—that is, it can identify a version of a service with a known buffer-overflow vulnerability.

The environment’s extensibility lets the visualization represent any alert, no matter what instrument generated it. In other words, if a new instrument generates other types of alerts, VisAlert can directly incorporate its results through a plug-in architecture.

Figure 6 shows a probe and a connection (correlation) between the IPC interface (shown with a higher-priority Snort alert) and a Windows VMTools alert. Such a correlation between events indicates a progressing attack.

[Figure 6. In stage 2, the attacker probes the network. VisAlert exhibits persistence of an alert on a host. Simultaneously, the attacker triggers a second alert type.]

Stage 3: attack
In this context, an attack on a vulnerable system is an
attempt to gain unauthorized access to a network host,
usually by exploiting a vulnerable network service. We
captured several attacks during this simulation.
The first attack was an attempt to access the vulnerable system by guessing the administrator password—
a common brute-force attempt to break into a system.
Computer logs indicate repeated failed passwords as
attempted logins.
The second attack exploited a vulnerability in the
Windows Local Security Authentication Subsystem Service. LSASS has a known buffer-overflow vulnerability
in several of its versions. Snort uses pattern recognition
to identify packets containing the compromised code
for this vulnerability and generates an alert on identifying such a packet. MS Windows uses LSASS for all
authentication, thus it appears in this attack multiple
times.
Figure 7 shows another attack, which involves generating heavy scanning activity on another host on the network as a diversion. Sophisticated attackers often create
noise to cover their tracks. Generating many alerts
through port scanning makes it far more difficult for an
analyst to pick out and identify the more noteworthy
alerts.
The heavy lines that emerge out of the background
represent two machines experiencing persistent indications of a scan.
[Figure 7. In stage 3, the attacker attempts to access a vulnerable system and trigger multiple alerts on the host while diverting attention by heavily probing another host.]

Stage 4: dig-in
Dig-in is a catch-all term for describing actions taken by an attacker that leverage newly gained privileges on the compromised system. This could include downloading toolkits or modifying files on the compromised system to hide malicious activity. The end goal is installing a rootkit, which will let the attacker gain easy access in the future, cover his or her tracks, provide complete access to all system resources, and let the attacker identify and attack additional systems using the just-compromised system as a jumping-off point.

In this simulation, the attacker generated a Trivial File Transfer Protocol (TFTP) GET command, commonly generated by compromised systems using automated attack tools and worms. (This TFTP command is part of the first LSASS attack described earlier.) The attack’s goal here is to download the appropriate rootkit and the attacker’s toolkit for use against other systems in the network. The attack then redirected a Windows command prompt, followed by multiple TFTP GET commands. This redirection let the attacker execute commands from a file and subsequently download an entire set of files in rapid succession.

Figure 8 illustrates this attack. In this stage, the attacked node begins to expand, which might indicate to a network analyst the need for action. The node’s size indicates the number of alerts associated with that host. A large number of distinct alerts suggests a progressing attack.

[Figure 8. In stage 4, the attacker attempts to access other systems, triggering multiple alerts on the already compromised system.]

Stage 5: migration
Migration is a human attacker’s attempt to use a compromised system to attack other systems within the targeted network. Migration relies on the fact that the attacker has gained access to a host on the secure side of the firewall, and will be able to see hosts and services not visible from an external host.

In this simulation, the attacker generated a successful attack on the victim, followed by a TFTP session to download a toolkit, followed immediately by rapid scans for other vulnerable hosts.

Figure 9 shows the correlation of these almost simultaneous alert triggers of different kinds on the same host, while other hosts have triggered alerts, but of one kind. The node’s increasing size lets analysts focus their attention on the host that’s actually being attacked, while the divertive or normal activity remains in the background, cause for lesser concern.

[Figure 9. In stage 5, (a) the attacker attempts to access a vulnerable system, triggering multiple alerts on that host, while diverting attention by heavily probing another host; and (b) the analyst has filtered out activities of hosts that aren’t of interest.]

Testing VisAlert
To test our system’s capabilities with larger and more complex data, we used a data set generated by Skaion Corporation for use by the Intelligence Community Advanced Research and Development Activity (IC-ARDA) research projects. This data set, which contained numerous disparate logs and alerts from various sensors and hardware, simulated attack scenarios in large notional unclassified intelligence community environments. Because of the research’s sensitive nature, we can’t provide additional details on the specifics of the data or attack scenarios.

Figure 10 shows alerts for Snort, dragon, and firewall logs. The firewall generated numerous alerts (blocked traffic), but of only two types. On the other hand, the Snort log had thousands of alert types but few were actually triggered in the tests we present here. In contrast, the dragon log provides a rich set of alerts, many of which were triggered.
The images in Figure 10 show different examples of the visualization in different scenarios. Figure 10a shows normal traffic. A few machines are experiencing alerts; however, the alerts are uncorrelated, as expected. Figure 10b shows an attack on several local machines. Note the correlation between alerts from the dragon (blue), Snort (green), and firewall logs (orange). Figure 10c shows a large attack on many nodes. This view includes the virtual log (top talker), which shows the attack’s who attribute. These outside IPs show in one view what alerts they’ve generated at what time and on what local machine. Using this view, a user could easily see a distributed attack on one node on their system.

[Figure 10. Visualization of alerts. (a) Normal activity. (b) Attack on specific machines. A purple color log represents the attack’s who attribute. (c) Multiple attacks on many machines and a firewall blocking a scan activity.]

We deployed the VisAlert prototype at the Air Force Research Lab (AFRL) in Rome, New York. We worked with system analysts with a decade of experience and network-wide responsibility for specific AFRL sites. Such key analysts have been a focal point in our new technology’s development and the network data’s analysis.

In this installation, VisAlert generated a positive response. Users specifically noted its effectiveness, simplicity, and flexibility. They stated that it might increase situational awareness by letting them see a holistic view of their network security status. AFRL staff want to integrate VisAlert with their tools because it lets them see information that their systems might not currently identify. Specifically, they used VisAlert as a visualization front end to demonstrate their Air Force Enterprise Defense system to the US Department of Defense.

To a great extent, we’ve incorporated the analysts’ suggestions, resulting in a more usable and useful tool. AFRL continues to evaluate the tool, and we incorporate analysts’ suggestions as we receive them. Evaluation and testing is scheduled at the Army Research Lab and at the US National Security Agency.

We presented VisAlert at the Information Assurance Workshop (Philadelphia, February 2005) and other meetings where it was exposed to analysts and higher-level officials within the intelligence community and other organizations in the Department of Defense. They expressed interest in performing formal testing in operational environments, including VisAlert in a software bundle for their customers, and further developing the tool, including its incident reporting functionality.

VisAlert features and limitations
The VisAlert software already has several interactive features allowing it to filter out or expand details, including
the implementation of virtual logs (see the “The Virtual World” sidebar) and the level of detail of the when
and what attributes. In the when axis, VisAlert lets users
configure different time increments to explore potential
patterns at different time scales. In the what axis, VisAlert
software lets users collapse and expand alert groupings,
allowing varying detail levels in the log hierarchies.
In its current implementation, VisAlert’s ability to
interact with the where attribute space is limited. We’re
currently implementing automatic topology generation,
which is a priority for testing in different environments.
Future research includes detail level in the topology display and the representation of dynamic networks.
We distilled the domain analysis underpinning
VisAlert’s visualization concept into a decision-making
process that’s common among many of the analysts we
observed. However, VisAlert might be limited in its ability to, or inappropriate for, enhancing some problem
types experienced by certain analysts and organizations.
Future work
Ongoing and future work is in several areas. First, we
plan to design additional visualization structures to let
analysts perform analysis and hypothesis testing of alert
details, and to let decision makers view incident reports
(the VisAlert system will evolve in a visual continuum
to allow seamless transition from a holistic view of the
system to detail drill-down).
We’ll also develop feature enhancements to let users
encode and correlate their own alert algorithms, and
enhanced capabilities for selecting and displaying detail
level. In addition, we’ll deploy VisAlert in an operational
environment.
Finally, we’ll perform formal testing—that is, measure
performance with respect to existing tools on equivalent
scenarios—in a simulated environment. Formal testing
of VisAlert will show whether VisAlert improves recognition and identification of a compromised computer network or workstation. We’ll use various simulated network
states, both threatened and nonthreatened, to assess the
visualization tool’s applicability. We’ll test users individually in two experimental sessions, counterbalancing network conditions to control for order effects.
We also hypothesize that the visualization tool will
reduce analysts’ workload, as workload assessments measured by NASA’s task load index tool should indicate. We
believe the anticipated difference in workload will derive
from the integrated and intuitive presentation of information afforded by the visualization tool.
■
Acknowledgments
We thank the network security experts and managers
from Battelle, the AFRL, and the University of Utah (Information Security Office, NetCom, Center for High Performance Computing, and Scientific Computing and Imaging
Institute), who significantly contributed to the domain
analysis work. We also thank AFRL and NSA for hosting
tests of the VisAlert system, and the Skaion Corporation and IC-ARDA for providing us with their attack simulation data set. Special thanks to Jeff Thomas for creating
the simulated attack described in this article, Kirsten Whitley for providing valuable feedback and access, and Marty
Sheppard for providing continuous feedback and suggestions on the technology development.
A grant from the IC-ARDA (with contracting and technical management by AFRL Information Directorate)
and the Utah State Center of Excellence Program partially supported this work.
References

1. R. Bejtlich, The Tao of Network Security Monitoring: Beyond Intrusion Detection, Addison-Wesley Professional, 2004.
2. E. Tufte, The Visual Display of Quantitative Information, Graphics Press, 1983.
3. K. Lakkaraju, W. Yurcik, and A. Lee, “NVisionIP: Netflow Visualizations of System State for Security Situational Awareness,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 65-72.
4. K. Vicente, K. Christoffersen, and A. Pereklita, “Supporting Operator Problem Solving through Ecological Interface Design,” IEEE Trans. Systems, Man, and Cybernetics, vol. 25, 1995, pp. 529-545.
5. J. Agutter et al., “Evaluation of a Graphic Cardiovascular Display in a High Fidelity Simulator,” Anesthesia and Analgesia, vol. 97, 2003, pp. 1403-1413.
6. J. Bermudez et al., “Interdisciplinary Methodology Supporting the Design Research & Practice of New Data Representation Architectures,” Proc. European Assoc. for Architectural Education/Architectural Research Centers Consortium (EAAE/ARCC) Research Conf., Dublin Inst. of Technology, 2004, pp. 223-230.
7. A. Snodgrass and R. Coyne, “Models, Metaphors, and the Hermeneutics of Designing,” Design Issues, vol. 9, no. 1, 1992, pp. 56-74.
8. D. Monarchi and G. Puhr, “A Research Typology for Object-Oriented Analysis and Design,” Comm. ACM, vol. 35, no. 9, 1992, pp. 35-47.
9. R. Priéto-Díaz, “Domain Analysis: An Introduction,” ACM SIGSOFT Software Eng. Notes, vol. 15, no. 2, 1990, pp. 47-54.
10. W. Zachary, J. Ryder, and J. Hicinbothom, “Building Cognitive Task Analyses and Models of a Decision-Making Team in a Complex Real-Time Environment,” Cognitive Task Analysis, Lawrence Erlbaum Assoc., 2000, pp. 365-384.
11. C. Ware, Information Visualization: Perception for Design, Morgan Kaufmann, 2000.
12. A. Treisman, “Preattentive Processing in Vision,” Computer Vision, Graphics, and Image Processing, vol. 31, 1985, pp. 156-177.
13. Y. Livnat et al., “A Visualization Paradigm for Network Intrusion Detection,” Proc. IEEE Workshop Information Assurance and Security, IEEE CS Press, 2005, pp. 92-99.
The Virtual World
To expand the domain over which VisAlert operates, we
introduce the notion of a virtual world—that is, a domain of
information or metadata about the logs and alerts stored in
the database. In accordance with our general approach, we
don’t generate new alerts based on alerts in the database.
Other intrusion detection systems (IDSs) perform data
mining and create new types of logs and alerts. The key
difference is that these IDSs generate persistent data that
are stored in a database. Our virtual world extension is
temporary. The information is gathered on the fly, depends
on the current user setup, and isn’t archived.
Virtual alerts
A virtual alert represents any kind of information that
occurs during a particular time period and can be gathered
from the alerts. We call this information an alert because we
provide it to VisAlert via the regular alert mechanism. For
example, a key issue raised by the analysts we collaborated
with is the notion of top talkers. In the context of our
discussion, top talkers are nodes outside the installation that
generate the most alerts during a specific time period (for
example, the most recent history period or the innermost
ring). Obviously, such information can be computed and
gathered in the database, but it isn’t explicitly stored or
computed ahead of time.
To facilitate this talkalot example, we define new alerts
whose type indicates a remote machine. The alert contains
the number of alerts that the remote node generated in the
specified time period with respect to our local nodes. Given
a specific time period, we aggregate the alerts in the
database based on the remote machine, sort them based
on the number of alerts per machine, and then select the
top 10 talkers.
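In code, the on-the-fly aggregation described here might look like the following sketch; the record fields and the virtual alert format are assumptions, not the actual VisAlert schema.

    from collections import Counter

    def top_talkers(alerts, period_start, period_end, n=10):
        """Build virtual 'top talker' alerts for one history period.

        `alerts` is an iterable of dicts with 'when' (timestamp) and 'src'
        (remote IP) fields; nothing is written back to the database.
        """
        counts = Counter(a["src"] for a in alerts
                         if period_start <= a["when"] < period_end and a.get("src"))
        return [{"what": "top-talker/" + ip, "count": c, "when": period_end}
                for ip, c in counts.most_common(n)]

    alerts = [{"when": 10, "src": "192.0.2.44"}, {"when": 12, "src": "192.0.2.44"},
              {"when": 15, "src": "198.51.100.9"}, {"when": 99, "src": "192.0.2.44"}]
    print(top_talkers(alerts, period_start=0, period_end=60))
    # two virtual alerts: 192.0.2.44 with count 2, then 198.51.100.9 with count 1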
Virtual views
The top talkers in particular, and the virtual alerts in
general, extend the model domain and increase the
number of alert types. As such, we can use the same
presentation methods we applied to the regular persistent
alerts, such as hierarchical grouping and multiple views.
For example, we can group the top talkers based on their
IP addresses, or, if we list the top 100 talkers, we can
organize them in groups of 10. We can also use a view in
which we place the top talkers in order along the circle
based on the number of alerts. The problem with this
approach is that in the likely event that a top talker in a
particular time period is also one of the top talkers in the
next period, the relative position might differ. In this case,
the user might lose track of the top talker and not notice
the problem’s persistence.
An alternative view might consider the top talkers in the
previous time period. Once a top talker is assigned a
position around the circle, it stays in that position for as
long as it’s part of the top-talker group. This approach
provides consistency, but requires the user to notice when
the top talker drops out of the top group and is replaced by
a new top talker. To help the user notice such changes, we
add a dark red background to the top talker’s name (its IP
address). If the top talker remains in place after the next
clock cycle, the background becomes brighter, signaling
this top talker’s persistence.
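The position-stability rule can be sketched as a small bookkeeping step run once per clock cycle; the slot table and the two-level highlight are illustrative assumptions.

    def update_slots(slots, current_top):
        """Keep each top talker in its slot while it stays in the top group.

        `slots` maps slot index -> IP (or None); `current_top` is this period's
        top-talker list. Returns the updated slots plus a highlight level per IP:
        'new' (dark red background) on first appearance, 'persistent' (brighter)
        once the talker holds its place.
        """
        highlight = {}
        for i, ip in slots.items():
            if ip is not None and ip in current_top:
                highlight[ip] = "persistent"   # stays in place, brighten
            else:
                slots[i] = None                # dropped out: free the slot
        free = (i for i, ip in slots.items() if ip is None)
        for ip in current_top:
            if ip not in highlight:
                slots[next(free)] = ip         # newcomer takes a freed slot
                highlight[ip] = "new"
        return slots, highlight

    slots = {0: "192.0.2.44", 1: "203.0.113.5", 2: None}
    print(update_slots(slots, ["192.0.2.44", "198.51.100.9"]))
    # 192.0.2.44 keeps slot 0 and is marked persistent; 198.51.100.9 takes slot 1.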
We can also ask for the top talkers with respect to the
number of types of alerts (signatures) these remote
machines triggered rather than the number of alerts they
generated. In this case, the top talker definition differs (total
number of alerts versus number of unique signatures) and
thus these two views are essentially two different (virtual)
logs. However, because these virtual logs represent two
views of the same concept (top talker), we can regard them
as two views of a single log.
W4 and top talkers
Top talkers are an example of how to correlate relevant
who attribute information, thus filtering the immense
source IP data set. The who information might also be of
interest when requesting event details: the source IP can be
included in a pop-up display.
Stefano Foresti is cofounder and
director of the Center for the Representation of Multi-Dimensional Information (CROMDI), senior scientist at the
Center for High-Performance Computing at the University of Utah, and president of Intellivis. His research interests
include visualization, user-interaction
design, security, distributed computing, intellectual property, and technology commercialization. Foresti has a doctorate in mathematics from the University of Pavia, Italy.
Contact him at [email protected].
James Agutter is an assistant
research professor in the College of
Architecture + Planning, University of
Utah, and assistant director of CROMDI. His research interests include
information visualization, human–
computer interaction, user interface
design, and technology transfer. Agutter has an MS in architecture from the University of Utah.
Contact him at [email protected].
Yarden Livnat is a research scientist at the Scientific Computing and
Imaging Institute at the University of
Utah. His research interests include
visual analytics with emphasis on situational awareness, scientific visualization, and software common components architecture. Livnat has a
PhD in computer science from the University of Utah. Contact him at [email protected].
Robert Erbacher is an assistant
professor in the Computer Science
Department at Utah State University.
His research interests include computer security, intrusion detection, computer forensics, data visualization,
and computer graphics. Erbacher has
an ScD in computer science from
the University of Massachusetts-Lowell. Contact him at
[email protected].
Shaun Moon is a research assistant
at CROMDI and is pursuing an MS in
computational design at Carnegie
Mellon University. His research interests include communication design
and information visualization. Moon
has a BS in architectural studies from
the University of Utah. He is a student
member of the IEEE and the Information Architecture Institute. Contact him at [email protected].