This document is a quick start guide for setting up and running TEMU, the dynamic tracing component of the BitBlaze Binary Analysis Framework. It assumes that you have some familiarity with Linux. The instructions are based on the release of TEMU shown in the header, running on a vanilla Ubuntu 9.04 distribution of Linux. We intermix instructions with explanations about utilities to give an overview of how things work. The goal in this exercise is to take a simple program, trace it on some input and treat its keyboard input as symbolic. You can then use the generated trace file in the separate Vine tutorial.
The following script shows the steps for building and installing TEMU and the other software it depends on (it is also found as docs/install-temu-release.sh in the TEMU source):
#!/bin/bash
# Instructions for installing TEMU 1.0 on Ubuntu 9.04 Linux 32-bit

# Things that require root access are preceded with "sudo".

# Last tested 2009-10-05

# This script will build TEMU in a "$HOME/bitblaze" directory,
# assuming that temu-1.0.tar.gz is in /tmp.
cd ~
mkdir bitblaze
cd bitblaze

# TEMU is based on QEMU. It's useful to have a vanilla QEMU for testing
# and image development:
sudo apt-get install qemu
# Stuff needed to compile QEMU/TEMU:
sudo apt-get build-dep qemu

# The KQEMU accelerator is not required for TEMU to work, but it can
# be useful to run VMs faster when you aren't taking traces.
#
# The following commands would build a kqemu module compatible with
# your system QEMU, but in Ubuntu 9.04 that would be too new to work
# with TEMU.
# sudo apt-get install kqemu-common kqemu-source
# sudo apt-get install module-assistant
# sudo module-assistant -t auto-install kqemu

# For the BFD library:
sudo apt-get install binutils-dev

# TEMU needs GCC version 3.4 (neither 3.3 nor 4.x will work)
sudo apt-get install gcc-3.4

# Unpack source
tar xvzf /tmp/temu-1.0.tar.gz

# Build TEMU
# You can select one of several plugins; "tracecap" provides
# tracing functionality.
(cd temu-1.0 && ./configure --target-list=i386-softmmu --proj-name=tracecap \
                 --cc=gcc-3.4 --prefix=$(pwd)/install)
(cd temu-1.0 && make)
(cd temu-1.0 && make install)
While QEMU itself is compatible with almost any guest OS that runs on x86 hardware, TEMU requires more knowledge about the OS to bridge the semantic gap and provide information about OS abstractions like processes. For Linux, we embed knowledge about kernel data structures directly into TEMU; the same approach could potentially be used for Windows, but TEMU’s current Windows support uses an extra driver that runs within the guest. This release of TEMU works out-of-the-box with VMs running Ubuntu Linux 9.04 32-bit. A few extra steps are required to support Windows XP or other versions of Linux.
%SYSTEM32%\drivers directory (i.e., typically C:\Windows\system32\drivers). Then, double-click the testdrv.reg file to copy its contents into the registry to configure the driver; it will then be loaded on the next reboot. To confirm that the driver is working correctly, look for a guest.log file created in the directory where you are running TEMU; it shows some of the data collected by TEMU.
Most of this information (all, for some 2.4 kernels) can be collected automatically using a kernel module whose source is found in the shared/kernelinfo directory. There are several sample variants for different distribution versions; procinfo-ubuntu-hardy, which was originally created for Ubuntu 8.04 and also works for 9.04, would be a good starting point for modern 2.6-based systems. Copy the module source to your guest VM, and compile it there (you should have the kernel header files matching the running kernel installed). Then, load the module using the insmod command, and look for its output in the kernel’s logs (e.g., /var/log/kern.log) or the kernel log ring buffer (displayed by the dmesg command). Then copy these entries to shared/read_linux.c and recompile TEMU.
For 2.6 kernels, we haven’t been able to find an appropriate hooking function that is exported to modules, so you’ll need to find the address of a function that is called after a new process is created using the kernel’s symbol table (usually kept in a file like /boot/System.map-2.6.28-15-generic), and add it as the second value in the information structure by hand. For recent kernels, we’ve found the function flush_signal_handlers works well.
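The symbol-table lookup described above can be sketched as a small shell helper. This is a minimal sketch and not part of TEMU: the function name lookup_symbol is hypothetical, and it assumes the usual three-column System.map layout (address, type, symbol name). The printed address still has to be added by hand to the information structure in shared/read_linux.c.

```shell
#!/bin/bash
# Hypothetical helper (not part of TEMU): print the address of a kernel
# symbol from a System.map-style file with "address type name" columns.
lookup_symbol() {
    local symbol=$1 mapfile=$2
    # Match the symbol name exactly in the third column; stop at first hit.
    # The exit status is non-zero when the symbol is not found.
    awk -v sym="$symbol" '$3 == sym { print "0x" $1; found = 1; exit }
                          END { exit !found }' "$mapfile"
}

# Example usage inside the guest (the path depends on the running kernel):
#   lookup_symbol flush_signal_handlers "/boot/System.map-$(uname -r)"
```

Matching on the whole third column, rather than using a plain grep, avoids picking up other symbols whose names merely contain the one you are looking for.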
After performing the above steps, you can check that things are OK by running the guest_ps (Windows) or linux_ps (Linux) command, and verifying that the current processes are correctly displayed; an error in the configuration will likely cause this command to output garbage, or cause TEMU to crash/hang.
Running QEMU by itself should be the first step, before you try to run TEMU. There are many platform-specific tweaks that you may need in order to get QEMU usable for your project. Though it is not needed for this exercise, you will often need to set up a network inside the QEMU image that you use. You may skip this network setup section if you do not need it.
This document does not intend to go into great depth on setting up QEMU itself, but we describe some mechanisms that have worked for us. You may need a bit of Googling to set this up on your specific platform and network configuration.
The simplest kind of network emulation, which QEMU performs by default, uses just user-level network primitives on the host side, and simulates a private network for the virtual machine. This is sufficient for many utility purposes, such as transferring files to and from the virtual machine, but it may not be accurate enough for some kinds of malicious network use. The QEMU options for enabling this mode explicitly are -net nic -net user,hostname=mybox, where mybox is the hostname for the virtual DHCP server to provide to the VM.
If you want to connect to well-known services on the VM, you’ll need to redirect them to alternate ports on the host with the -redir option. For instance, to make it possible to SSH to a server on the VM, give QEMU the option -redir tcp:2022::22, then tell your SSH client to connect to port 2022 on the local machine.
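As a rough illustration, the options above can be collected in a shell array and appended to whatever QEMU command line you already use. This is a sketch under assumptions: the variable name NET_OPTS is hypothetical, and the image path and guest user name in the commented usage lines are placeholders, not values from this guide.

```shell
#!/bin/bash
# User-mode networking with the guest's SSH port (22) redirected to
# port 2022 on the host. The option strings are the ones described above.
NET_OPTS=(-net nic -net user,hostname=mybox -redir tcp:2022::22)

# Print the options, one per line, so they can be inspected before use.
printf '%s\n' "${NET_OPTS[@]}"

# Hypothetical usage (image path and user name are placeholders):
#   qemu "${NET_OPTS[@]}" /path/to/image.img
#   ssh -p 2022 user@localhost
```

Keeping the options in an array preserves each argument as a separate word, so the comma-separated user,hostname=mybox value is passed to QEMU intact.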
|
Once QEMU is set up and running, TEMU should run in the same way. You can run TEMU’s qemu, installed in the PREFIX directory, as root, just as you would run the stock QEMU.
Assuming that you have compiled TEMU and you have identified the command line to launch your QEMU session, we can now go ahead and try out a simple example trace. Here we demonstrate the procedure for an Ubuntu 9.04 Linux image; the commands are mostly the same for a Windows image.
The command-line options for TEMU are mostly the same as for QEMU. Besides whatever options are needed for your virtual machine to run correctly, the example below adds two more. -snapshot tells QEMU not to write changes to the virtual hard disk back to the disk image file unless explicitly requested, so you don’t have to worry about messing up your VM with experiments gone awry. -monitor stdio tells QEMU to put up a command-line prompt on your terminal, which we will use to give TEMU commands.
A command line to launch TEMU looks like:
|
The output on the console is:
|
You may also see a warning indicating that kqemu is disabled for one reason or another; this may mean that your VM will run more slowly, but can otherwise be ignored.
|
At the (qemu) prompt, say:
|
The warning about Cannot determine file system type applies to functionality we won’t be using, and can be ignored. enable_emulation is required to activate any of TEMU’s per-instruction tracing hooks; without it, later steps won’t see any of the instructions executed.
At the (qemu) prompt, run the linux_ps command to find the process id of the ./foo application running in the guest Linux image.
|
The PID, here 958, is shown on the header line for the named process. The other information isn’t relevant for what we’re doing, but if you’re curious, the CR3 value is a pointer to the kernel-space page table for each process, and the remaining lines show the virtual address ranges for the process’s various segments (mappings), which are either text or data segments from executables or shared libraries, or anonymous heap or stack areas.
For a Windows image, you need to run the guest_ps command instead of linux_ps.
The trace command takes the process id and the name of a trace file to write information into, as shown below.
|
As an alternative to steps 3-4 above, you can also tell TEMU to begin tracing a program before you’ve loaded it, with the tracebyname command. This command will monitor new processes and automatically trace the next instance of the target program. Example usage of this command is shown below.
|
With the taint_sendkey command we can send input to the traced process, and also mark this input as tainted. The taint tracking engine will perform dynamic taint tracking, i.e. mark all data derived from tainted input as tainted. If any of the operands of an executed instruction are tainted, the result is also marked tainted. This command takes 2 arguments – the character (really, keyboard key) to give as input (5 in the example below) and an identifier to identify this input in the trace (given by “1001” in the trace; it should not be zero). The trace of this process will log all data read and written at each instruction, the instruction itself, and the associated data taint in the trace file.
|
Note that TEMU is tracking taint throughout the whole simulated machine, but only tracing in the requested process. The first tainted data message refers to the traced program, and doesn’t show up until a complete line has been typed, because the operating system is buffering the input line before that.
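The propagation rule (a result is tainted if any operand is tainted) can be illustrated with a toy model in shell. This is only a sketch of the idea, not TEMU’s taint engine: the names taint_source, propagate, and is_tainted are hypothetical, locations are plain strings, and real taint tracking works on machine registers and memory. It assumes bash 4 or later for associative arrays.

```shell
#!/bin/bash
# Toy model of dynamic taint propagation (hypothetical; not TEMU code).
# Maps a location name to its taint origin id; absent means untainted.
declare -A TAINT

# Mark a location as tainted with a non-zero origin id.
taint_source() { TAINT[$1]=$2; }

# Model "dst = op(src...)": dst inherits the origin of the first tainted
# source; if no source is tainted, dst becomes untainted.
propagate() {
    local dst=$1 src origin=""
    shift
    for src in "$@"; do
        if [[ -n ${TAINT[$src]:-} ]]; then
            origin=${TAINT[$src]}
            break
        fi
    done
    if [[ -n $origin ]]; then
        TAINT[$dst]=$origin
    else
        unset "TAINT[$dst]"
    fi
}

# Succeed if the location currently carries taint.
is_tainted() { [[ -n ${TAINT[$1]:-} ]]; }

taint_source input 1001        # tainted input with origin id 1001
propagate eax input            # eax = load(input): tainted
propagate ebx eax ecx          # ebx = eax + ecx: tainted via eax
propagate eax ecx              # eax = ecx: overwritten with clean data
```

After these four steps ebx carries origin 1001 while eax is clean again, which mirrors how overwriting a location with untainted data removes its taint.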
We are done with tainting and tracing, so we use the following commands to turn off the components.
|
At the end, you should have a trace file generated at the file name you specified (/tmp/foo.trace in the example). The trace has a specific binary format which is not human-readable, but you can check that it contains some data (it should be between about 100k and 800k for this example). It contains instructions, concrete values of the operands seen in the execution of the program, and the associated taint value.
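A quick sanity check of the trace size can be scripted. The sketch below is not part of TEMU; the function name check_trace_size is hypothetical, and the default bounds simply encode the rough 100k to 800k range mentioned above, interpreted here as kilobytes of 1024 bytes.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the file exists and its size falls
# within [min_kb, max_kb] kilobytes (defaults: roughly 100k to 800k).
check_trace_size() {
    local file=$1 min_kb=${2:-100} max_kb=${3:-800}
    [[ -f $file ]] || { echo "missing: $file" >&2; return 1; }
    local bytes
    bytes=$(wc -c < "$file")
    echo "$file: $bytes bytes"
    (( bytes >= min_kb * 1024 && bytes <= max_kb * 1024 ))
}

# Example usage with the file name from this guide:
#   check_trace_size /tmp/foo.trace
```

A size outside the range is not necessarily an error for other programs or inputs; the bounds can be overridden with the second and third arguments.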
As an aside, if you want to generate traces with network input rather than keystrokes, you can follow the same steps but with two changes. First, after the plugin is loaded, issue the command taint_nic 1 to tell TEMU to mark all input received from the network card as tainted. Second, instead of giving taint_sendkey, simply direct the input to the IP address/port of the virtual machine. If the input causes the EIP to become tainted, TEMU will immediately write all trace data and quit. You can use this to launch network attacks on programs in the guest OS image.
This section describes some problems users have experienced when using TEMU, along with the most common causes of these problems.
TEMU’s Tracecap plugin links with OpenSSL (copyright 1998-2004 the OpenSSL Project), Sleuthkit (portions copyright 1997-1999 IBM and other authors), XED (copyright 2004-2009 Intel), and llconf (copyright 2004-2007 Oliver Kurth). However, like TEMU itself, our redistribution of that code is WITHOUT ANY WARRANTY.
Though we cannot give any guarantee of support for TEMU, we are interested in hearing what you are using it for, and if you encounter any bugs or unclear points. Please send your questions, feature suggestions, bugs (and, if you have them, patches) to the bitblaze-users mailing list. Its web page is: http://groups.google.com/group/bitblaze-users.