BitBlaze: Binary Analysis for Computer Security
Binary analysis is imperative for protecting COTS (commercial
off-the-shelf) programs and for analyzing and defending against the many
forms of malicious code, where source code is unavailable and the binary
may even be obfuscated. Binary analysis also provides the ground
truth about program behavior, since computers execute binaries
(executables), not source code. However, binary analysis is
challenging due to the lack of higher-level semantics.
Many higher-level
techniques are inadequate for analyzing even benign binaries,
let alone potentially malicious ones.
Thus, we need to develop tools and techniques that work at the
binary level and can be used to analyze both COTS software and malicious binaries.
The BitBlaze project aims to design and develop a powerful
binary analysis platform and to employ the platform in order to (1) analyze and develop novel
COTS protection and diagnostic mechanisms and (2) analyze,
understand, and develop defenses against malicious code. The
BitBlaze project also strives to open new application areas for
binary analysis, providing sound and effective solutions to
applications beyond software security and malicious code defense,
such as protocol reverse engineering and fingerprint generation.
The BitBlaze project consists of two central research directions: (1)
the design and development of the underlying BitBlaze Binary Analysis
Platform, and (2) applying the BitBlaze Binary Analysis Platform to
real security problems. The two research foci drive each other: as
new security problems arise, we develop new analysis
techniques. Similarly, we develop new analysis techniques in order to
better or more efficiently solve known problems. Below, we give an
overview of the two research directions.
Here is an overview paper of the BitBlaze project.
The BitBlaze Binary Analysis Platform
The underlying BitBlaze Binary Analysis
Platform features a novel fusion of static and dynamic analysis
techniques, dynamic symbolic execution, and whole-system
emulation and binary instrumentation. The BitBlaze platform has
different components for each task: Vine, TEMU, and
Rudder. The three components in tandem provide the power for
effective analysis of real-world binary programs for various
applications.
- Vine, the static analysis
component.
Open source release available now.
Vine provides an intermediate language for
assembly (ILA), and an infrastructure for analyzing programs
written in this language. ILA is a full language in
which programs can be written, type-checked, then compiled
down to assembly. We also provide analyses of ILA
programs, such as abstract interpretation, dependency analysis, and
logical analysis via interfaces with theorem provers.
- TEMU, the dynamic analysis
component.
Open source release coming soon.
TEMU provides a dynamic analysis environment
through whole-system emulation and dynamic binary
instrumentation. TEMU is OS-aware (i.e., it understands
OS-level semantics) and provides a foundation on which various
fine-grained dynamic analyses can be built, such as dynamic taint
analysis and fine-grained behavioral analysis.
- Rudder, the component for online
dynamic symbolic execution. Rudder is an engine for
online dynamic symbolic execution on binaries. At a
high level, given a specified set of input sources of
interest, Rudder can automatically explore the different
execution paths in a program that are determined by those input
sources. It automatically builds logical formulas
representing the constraints that the chosen input must satisfy to take the
followed paths.
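To make the path-exploration idea concrete, here is a minimal sketch in Python. It is not Rudder's implementation or API. The program under test (`toy_program`), the `branch` recording hook, and the brute-force `solve` function (a stand-in for a real decision procedure, restricted to a single input byte) are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Predicate = Callable[[int], bool]
# One path entry: (human-readable label, predicate on the input, outcome taken)
Path = List[Tuple[str, Predicate, bool]]


def toy_program(x: int, branch) -> str:
    """Hypothetical program under test; `branch` logs every condition."""
    if branch("x > 100", lambda v: v > 100, x):
        if branch("x % 7 == 0", lambda v: v % 7 == 0, x):
            return "rare"
        return "big"
    return "small"


def run(x: int) -> Tuple[str, Path]:
    """Execute concretely while recording the path constraint."""
    path: Path = []

    def branch(label: str, pred: Predicate, value: int) -> bool:
        taken = pred(value)
        path.append((label, pred, taken))
        return taken

    return toy_program(x, branch), path


def solve(constraints: Path) -> Optional[int]:
    """Stand-in for a decision procedure: brute force over one byte."""
    for candidate in range(256):
        if all(pred(candidate) == want for _, pred, want in constraints):
            return candidate
    return None


def explore(seed: int) -> Dict[Tuple[bool, ...], Tuple[int, str]]:
    """Map each discovered path to (an input that follows it, the result)."""
    seen = set()
    worklist = [seed]
    results: Dict[Tuple[bool, ...], Tuple[int, str]] = {}
    while worklist:
        x = worklist.pop()
        result, path = run(x)
        key = tuple(taken for _, _, taken in path)
        if key in seen:
            continue
        seen.add(key)
        results[key] = (x, result)
        # Negate each branch in turn, keeping the prefix that precedes it,
        # and ask the solver for an input that follows the new path.
        for i, (label, pred, taken) in enumerate(path):
            flipped = path[:i] + [(label, pred, not taken)]
            new_input = solve(flipped)
            if new_input is not None:
                worklist.append(new_input)
    return results
```

Starting from the seed input 0, `explore(0)` reaches all three outcomes of the toy program, including the "rare" branch that a random byte would hit only occasionally.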
Release Information:
We are now making some key parts of the BitBlaze Binary
Analysis Platform available under open-source licenses.
See a separate page for more
information.
BitBlaze in Action: Security Applications
Using the BitBlaze Binary Analysis Platform, we have enabled new approaches and solutions to a suite of different security problems. These results demonstrate the utility and effectiveness of the BitBlaze approach and vision: binary analysis enables fundamentally new approaches to a broad spectrum of security problems, often solving them at their root cause, and the underlying BitBlaze Binary Analysis Platform is extensible and powerful enough to support a broad spectrum of security applications.
In particular, we show below three classes of security applications: (1) vulnerability detection, diagnosis, and defense; (2) automatic in-depth malware analysis and defense; (3) automatic model extraction and analysis.
- Vulnerability Detection, Diagnosis, and Defense
-
Automatic Defense System against Zero-day Exploits and Worms
Worms such as CodeRed and SQL Slammer can compromise
millions of hosts within hours or even minutes and have
caused billions of dollars in estimated damage. How can
we design and develop effective defense mechanisms
against such fast, large-scale worm attacks?
Sting is an automatic
worm defense system which proposes a suite of novel
techniques to automatically detect new exploits, perform
in-depth diagnosis, and generate effective antibodies
(vulnerability signatures and hardened binaries) to
protect vulnerable hosts and networks from further
attacks.
-
Automatic Patch-based Exploit Generation
Security patches are supposed to fix vulnerabilities in
programs. But what are the security implications of a
security patch?
In this work, we propose new
techniques and demonstrate that one can automatically
generate exploits from the patched binary and the original
vulnerable program binary, sometimes within minutes.
-
Loop-extended Symbolic Execution: Buffer Overflow Diagnosis and Discovery
Loop-extended symbolic
execution (or LESE) is a new technique that
generalizes previous dynamic symbolic
execution techniques by broadening their results to account for the
effects of loops. LESE is a key enabler for powerful
automated discovery of security vulnerabilities, especially
buffer overflows, whose discovery is highly inefficient with pure
symbolic/concrete execution. It also enables deeper
diagnosis of known vulnerabilities, allowing automated
signature generation tools to reason about variable-length
input or repeated elements in the input.
-
Measuring Quantitative Influence
Dynamic taint analysis is a fundamental tool for detecting
overwrite attacks, but it is limited to an all-or-nothing
distinction as to whether values are under the control of an
attacker, and suffers from both false-positive and
false-negative errors.
We propose quantitative
influence to more precisely characterize the degree of
control an attacker has over a value. Quantitative influence is a
specialization of the concept of channel capacity from information
theory, and we show that it can be computed precisely
using a decision procedure. Quantitative influence
accurately distinguishes real attacks from false positives
among warnings generated by a dynamic taint analysis tool on
vulnerable binary servers.
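As a rough illustration of quantitative influence, the following Python sketch measures, in bits, how much control a one-byte attacker input has over a computed value. It assumes a deterministic computation, for which channel capacity reduces to the base-2 logarithm of the number of distinct reachable values. The exhaustive enumeration here stands in for the decision procedure mentioned above, and the function name `influence_bits` and the four example computations are hypothetical.

```python
import math
from typing import Callable, Iterable


def influence_bits(compute: Callable[[int], int], inputs: Iterable[int]) -> float:
    """Return log2 of the number of distinct values the attacker can force.

    Assumes `compute` is deterministic, so the capacity of the channel from
    attacker input to the computed value is log2(|reachable values|).
    """
    reachable = {compute(i) for i in inputs}
    return math.log2(len(reachable))


ALL_BYTES = range(256)

# Hypothetical computations over one attacker-controlled byte.
def copied(b: int) -> int:
    return b                         # attacker chooses the value freely

def masked(b: int) -> int:
    return b & 0x0F                  # only the low nibble survives

def compared(b: int) -> int:
    return 1 if b == 0x41 else 0     # attacker picks one of two values

def cancelled(b: int) -> int:
    return (b ^ b) + 7               # depends on b syntactically, not semantically


for fn in (copied, masked, compared, cancelled):
    print(fn.__name__, influence_bits(fn, ALL_BYTES))
```

An all-or-nothing taint analysis would typically mark all four results as tainted, while the measured influence ranges from 8 bits for `copied` down to 0 bits for `cancelled`.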
- Automatic In-depth Malware Analysis and Defense
-
Detection and Analysis of Privacy-Breaching Malware
A myriad of malware, such as keyloggers, Browser Helper
Object (BHO) based spyware, rootkits, and backdoors, accesses
and leaks users' sensitive information and breaches
users' privacy. Can we have a unified approach to
identify such privacy-breaching malware despite its
widely varied appearance?
Panorama proposes a
unified approach to detect privacy-breaching malware
using whole-system dynamic taint analysis.
Try it online!
-
Hidden Code Extraction from Packed Executables
Code packing is one technique commonly used to hinder malware
code analysis through reverse engineering. Even though this problem
has been previously researched, the existing solutions are
either unable to handle novel samples, or vulnerable to various
evasion techniques.
Renovo
proposes a fully dynamic approach for hidden code extraction,
capturing the intrinsic nature of hidden code execution.
Try it online!
-
Detection and Analysis of Malware Hooking Behaviors
One important malware attack vector is the hooking mechanism. Malicious programs implant hooks for many different purposes. Spyware may implant hooks to be notified of the arrival of new sensitive data. Rootkits may implant
hooks to intercept and tamper with critical system information to
conceal their presence in the system. A stealthy backdoor may also place hooks on
the network stack to establish a covert communication channel with remote attackers.
HookFinder
proposes fine-grained impact analysis to automatically detect and analyze malware's hooking behaviors. Since this technique captures the intrinsic nature of hooking behaviors, it is well suited for identifying new hooking mechanisms.
Try it online!
-
Automatic Malware Dissection and Trigger-based Behavior Analysis
Malware often has embedded behavior which is only
exhibited when certain conditions are met. Such
trigger-based behavior includes time bombs, logic bombs,
and botnet programs which react to commands. Static
analysis of malware often provides little utility due to
code packing and obfuscation. Vanilla dynamic analysis
can provide only a limited view since the trigger
conditions are usually not met. How can we design
automatic analysis methods to uncover the trigger
conditions and trigger-based behavior hidden in malware?
BitScope enables
automatic exploration of program execution paths in
malware to uncover trigger conditions (such as the time
used in time bombs and commands in botnet programs) and
trigger-based behavior, using dynamic symbolic
execution. BitScope also provides in-depth analysis of
the input/output behavior of the malware.
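Several of the tools in this group, Panorama in particular, build on dynamic taint analysis. The Python sketch below illustrates only the general idea at the level of individual values: data from a sensitive source carries a label, the label propagates through arithmetic, and a sink reports any labeled data that reaches it. This is not how TEMU or Panorama are implemented (they work on whole-system emulation of binaries). The names `Tainted`, `source`, `lift`, and `network_send` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Union


@dataclass(frozen=True)
class Tainted:
    """An integer value together with the labels of the sources it came from."""
    value: int
    labels: FrozenSet[str] = frozenset()

    def _combine(self, other: Union["Tainted", int],
                 op: Callable[[int, int], int]) -> "Tainted":
        other = lift(other)
        # The result depends on both operands, so it inherits both label sets.
        return Tainted(op(self.value, other.value), self.labels | other.labels)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __xor__(self, other):
        return self._combine(other, lambda a, b: a ^ b)


def lift(x: Union[Tainted, int]) -> Tainted:
    """Wrap a plain integer as an untainted value."""
    return x if isinstance(x, Tainted) else Tainted(x)


def source(value: int, label: str) -> Tainted:
    """Introduce sensitive data, e.g. a keystroke."""
    return Tainted(value, frozenset({label}))


def network_send(data: Union[Tainted, int], leaks: List[List[str]]) -> None:
    """Hypothetical sink: record the labels of any tainted data sent out."""
    data = lift(data)
    if data.labels:
        leaks.append(sorted(data.labels))


leaks: List[List[str]] = []
key = source(0x41, "keystroke")
network_send((key ^ 0x5A) + 1, leaks)   # obfuscated, but still derived from the key
network_send(7, leaks)                  # unrelated constant, not reported
print(leaks)
```

The obfuscated keystroke is still reported at the sink because its label survived the XOR and the addition, while the unrelated constant is not.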
- Automatic Model Extraction and Analysis
- Extracting security-related models from browsers for analysis and vulnerability discovery
In this work, we show how to use string-enhanced white-box exploration techniques to automatically extract security-related models from browsers and to automatically discover cross-site scripting (XSS) vulnerabilities by comparing the extracted models with websites' filters.
-
Deviation Detection in Binaries
Many network protocols and services have several
different implementations. Automatically identifying
deviations in different implementations of the same
protocol/service can enable the detection of potential
implementation errors without protocol specification, and
can enable automatic generation of fingerprints to
identify an implementation remotely. How can we
automatically identify such deviations in binaries
implementing the same specification?
Deviation Detection
automatically identifies deviations in different
binaries to detect implementation errors and generate
fingerprints. It is achieved by building symbolic formulas
that characterize how each binary processes an input.
-
Protocol Reverse Engineering and Application Dialogue Replay
Many network protocols are proprietary or have no
well-documented specification. However, many security
applications require protocol reverse engineering and
application dialogue (network trace) replay.
Dispatcher, Polyglot, and
Replayer automatically extract information about
network protocols and enable application dialogue replay
using binary analysis.
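The deviation-detection idea described above can be sketched in a few lines of Python. The two request checkers, `server_a` and `server_b`, are hypothetical stand-ins for two binaries that implement the same protocol. The sketch enumerates a small space of candidate inputs instead of building and solving symbolic formulas, so it shows the goal (inputs that one implementation accepts and the other rejects) and not the technique used to reach it at scale.

```python
from itertools import product
from typing import Callable, Iterable, List


def server_a(request: bytes) -> bool:
    """Hypothetical strict implementation: method must be exactly 'GET'."""
    return request[:4] == b"GET " and request[4:5] == b"/"


def server_b(request: bytes) -> bool:
    """Hypothetical lenient implementation: method is case-insensitive."""
    return request[:4].upper() == b"GET " and request[4:5] == b"/"


def find_deviations(impl_a: Callable[[bytes], bool],
                    impl_b: Callable[[bytes], bool],
                    candidates: Iterable[bytes]) -> List[bytes]:
    """Return every candidate on which the two implementations disagree."""
    return [c for c in candidates if impl_a(c) != impl_b(c)]


# All 5-byte requests over a small alphabet (9**5 = 59049 candidates).
ALPHABET = b"GgEeTt /x"
candidates = (bytes(combo) for combo in product(ALPHABET, repeat=5))

deviations = find_deviations(server_a, server_b, candidates)
print(len(deviations), deviations[:3])
```

Each deviation is both a candidate implementation error and a potential fingerprint: sending `b"get /"` and observing whether it is accepted distinguishes the two toy servers remotely.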
BitBlaze in the News: Vulnerabilities and Coverage
Vulnerabilities Discovered
News Coverage
- Breaking the Botnet Code
Software that deciphers botnet communications could help infiltrate criminals' networks. (Technology Review, November 2009)
For general questions regarding the BitBlaze project, please send email to bitblaze at gmail.com.
To receive announcements about code releases and other BitBlaze-related updates, please subscribe to the BitBlaze Announcement List.