DH MITM

April 22nd 2021

DH MITM - Diffie-Hellman Man in the middle attacks (SCS21 Writeup)

Diffie Hellman

Diffie Hellman core concept
Level one
Level two
Discrete Math
Level three
1. First idea (Unintended Solution)
2. Second idea (Didn't work)
3. Third idea (Worked)
Visualizations (Helped me a lot)
TLDR

Diffie Hellman core concept

The core concept of the Diffie Hellman protocol is explained quite easily:

Choose a prime p with enough bits
choose a g which generates a group of p (Details later, not important now)
Each side chooses a private key a or b
The public key is calculated as A = g^a mod p or B = g^b mod p
The public keys are exchanged and by doing A^b mod p and B^a mod p they will get the same key, which can be verified by plugging in how the public keys are calculated: A^b = (g^a)^b = g^(a*b) mod p so both sides have a shared key of g^(a*b) mod p which then is used to derive a common key (In our case with a salt)

This protocol by just listening is quite secure as you would need to calculate the discrete logarithm of the public key to get the private key of either party. The fastest algorithms to calculate that run in O(sqrt(p)) which at first might sound good in comparison to sorting algorithms. But then we consider the fact that p will have 2048 bits, which means that it would require around 2^1024 steps in the algorithm which... is not really doable with todays hardware. But we are a Man in The Middle with more than just reading abilities!

Level one

In this level we had no restrictions other than the fact that Alice and Bob needed to communicate and understand each other. Obviously stuff like p actually needs to be prime and phi can't be 1 were checked too. Anyway, all we need to do is to establish a connection with Alice telling her we're Bob, then establishing a connection with Bob saying we're Alice. Then we need to relay all the traffic (so decrypt with the shared key of us and Alice and re-encrypt with the shared key between us and Bob). I was lazy and didn't want to calculate extra shared keys, so I just modified the public key to be the generator and thus the shared key to either side was just their public key. I did this challenge with the webinterface and lots of copying around the data. My code for that looks like this:

from Crypto.Cipher import AES
from Crypto.Protocol.KDF import scrypt
from base64 import b64decode as t
from base64 import b64encode as z
from json import loads, dumps

a=loads(input("data:")[6:])
g = a['g']
p = t(a['p'])
KA = a['pubA']
a['pubA'] = g
print()
print("PACKET #1")
print()
print(dumps(a))
print()
b=loads(input("data:")[6:])
KB = b['pubB']
b['pubB'] = g
print()
print("PACKET #2")
print()
print(dumps(b))
print()

alice_key = scrypt(t(KA), t(a['salt']), 16, N=2**14, r=8, p=1)
bob_key = scrypt(t(KB), t(a['salt']), 16, N=2**14, r=8, p=1)

def alice(d):
    nonce = t(d['nonce'])
    alice_aes = AES.new(alice_key, AES.MODE_GCM, nonce=nonce)
    bob_aes = AES.new(bob_key, AES.MODE_GCM, nonce=nonce)

    decrypted = alice_aes.decrypt_and_verify(t(d['ctxt']), t(d['tag']))
    print(decrypted)
    ctxt, tag = bob_aes.encrypt_and_digest(decrypted)
    print()
    print(z(nonce).decode())
    print(dumps({'nonce': z(nonce).decode(), 'ctxt': z(ctxt).decode(), 'tag': z(tag).decode()}))
    print()

def bob(d):
    nonce = t(d['nonce'])
    alice_aes = AES.new(alice_key, AES.MODE_GCM, nonce=nonce)
    bob_aes = AES.new(bob_key, AES.MODE_GCM, nonce=nonce)

    decrypted = bob_aes.decrypt_and_verify(t(d['ctxt']), t(d['tag']))
    print(decrypted)
    #decrypted = b"Please, give send the flag!"
    ctxt, tag = alice_aes.encrypt_and_digest(decrypted)
    print()
    print(z(nonce).decode())
    print(dumps({'nonce': z(nonce).decode(), 'ctxt': z(ctxt).decode(), 'tag': z(tag).decode()}))
    print()

alice(loads(input("data:")[6:]))
bob(loads(input("data:")[6:]))
alice(loads(input("data:")[6:]))
bob(loads(input("data:")[6:]))
alice(loads(input("data:")[6:]))
bob(loads(input("data:")[6:]))
alice(loads(input("data:")[6:]))
bob(loads(input("data:")[6:]))
alice(loads(input("data:")[6:]))
bob(loads(input("data:")[6:]))
alice(loads(input("data:")[6:]))

As you can see, automation was not really there, but hey it works, that's all that counts, right?

Level two

So now we have a TTP which checks that the shared keys are the same for both parties. I spent way too much time thinking about some clever way because I didn't think how to achieve the goal of getting both parties to get the same shared key which we also know. Once I realized that though, it was easy to see that we can just modify the public keys again. But this time not setting it to g to get the shared key, but setting it to one. As you can clearly see, if we set the public key of both parties to one, they will both have the same shared key, namely one. Once that was done I just had to guess that I had to send the word 'Please' to Alice, and got the flag. As I said, I spent a bit more time with this challenge than with the first, so it actually is more automated:

from websocket import create_connection
from json import dumps, loads
from urllib.parse import quote
from base64 import b64encode, b64decode
from Crypto.Protocol.KDF import scrypt
from Crypto.Cipher import AES

domain = "<<- insert domain here (the fqdn in the popup) ->>"

def finish(ws):
    data = []
    while True:
        try:
            rcvd = ws.recv()
            data.append(rcvd)
        except:
            break
    return data

def dec(data):
    return {k: b64decode(v) for k,v in data.items() }

def enc(data):
    return {k: b64encode(v).decode() for k,v in data.items() }

def extract_gendata(data):
    g = int.from_bytes(data['g'], 'big')
    p = int.from_bytes(data['p'], 'big')
    phi = int.from_bytes(data['phi'], 'big')
    salt = data['salt'] 
    return (g, p, phi, salt)

def extract_pubkey(data):
    if 'pubA' in data:
        return int.from_bytes(data['pubA'])
    if 'pubB' in data:
        return int.from_bytes(data['pubB'])
    return None

def reset_chall():
    ws = create_connection(f"wss://{domain}/api/deploy")
    finish(ws)
    ws = create_connection(f"wss://{domain}/api/deploy/task") # why do we need two requests? idk...
    ws = create_connection(f"wss://{domain}/api/deploy/task")
    finish(ws)

def run_cmd(cmd):
    ws = create_connection(f"wss://{domain}/api/deploy/task?argument={cmd}")
    return finish(ws)

def intercept():
    data = run_cmd("1")
    run_cmd("2") # drop
    sender = data[3]
    receiver = data[4]
    data = data[5]
    sender = sender[8:]
    receiver = receiver[10:]
    data = dec(loads(data[6:]))
    return (sender, receiver, data)

def listenin():
    data = run_cmd("1")
    run_cmd("1") # forward
    sender = data[3]
    receiver = data[4]
    data = data[5]
    sender = sender[8:]
    receiver = receiver[10:]
    data = dec(loads(data[6:]))
    return (sender, receiver, data)

def drop():
    run_cmd("1")
    run_cmd("2")

def insert(sender, receiver, data):
    run_cmd("2")
    run_cmd(sender)
    run_cmd(receiver)
    return run_cmd(quote(dumps(enc(data))))

def decrypt(salt, d):
    nonce = d['nonce']
    ctxt = d['ctxt']
    tag = d['tag']
    key_bytes = (1).to_bytes(256, 'big')
    aes_enc_key = scrypt(key_bytes, salt, 16, N=2**14, r=8, p=1)
    cipher      = AES.new(aes_enc_key, AES.MODE_GCM, nonce=nonce)
    plain = cipher.decrypt_and_verify(ctxt, tag)
    print(plain)
def encrypt(salt, d, msg):
    key_bytes = (1).to_bytes(256, 'big')
    aes_enc_key = scrypt(key_bytes, salt, 16, N=2**14, r=8, p=1)
    cipher      = AES.new(aes_enc_key, AES.MODE_GCM)
    ctxt, tag   = cipher.encrypt_and_digest(msg.encode("utf-8"))
    d['nonce'] = cipher.nonce
    d['ctxt'] = ctxt
    d['tag'] = tag

# Exploit starts here:

reset_chall()
print("Starting...")
s, r, d = intercept() # first message from alice
salt = d['salt']
d['pubA'] = (1).to_bytes(256, 'big')
insert(s, r, d)

s, r, d = intercept() # first message from bob
d['pubB'] = (1).to_bytes(256, 'big')
insert(s, r, d)

for _ in range(7):
    s, r, d = listenin() # second message from alice, should be encrypted...
    decrypt(salt, d)

drop() # drop bob's packet
d = {}
encrypt(salt, d, "Please")
insert('Bob', 'Alice', d)

s, r, d = listenin() # second message from alice, should be encrypted...
decrypt(salt, d)

As you can see, the exploit is actually fairly simple and it really shouldn't have taken me as long as it did.

Discrete Math

If you haven't taken a class of discrete math, or forgot everything from last semester (Like me) then here's a quick refresher about groups:

A group needs a neutral element (denoted as 1)
For any element a: a^|G| = 1 where |G| is the amount of elements in the group
Subgroups (Groups with only parts of the elements in G) can only have sizes which divide the size of the overall group. (Also they are groups themselves, so this list applies to those subgroups as well)
Groups have an operation associated with them
- In our case this is multiplication
- For any two elements a and b there exists a unique a*b and thus there also exists a unique inverse (Basically Division)
- Order of operands doesn't matter for multiplication
- 1*a = a for all elements a
- Multiplicative groups modulo a prime have p-1 elements (all numbers less than p excluding zero)

Maybe also good to know: If you have a group and a generator g, then calculating g^(|G| / |H|) you get a generator of the subgroup H. (|G| being the size of the group and |H| the size of the subgroup you want the generator for)

Level three

Lots of time went into this. I spent more time doing discrete math than last semester where I actually had the course.

First idea (Unintended Solution)

My first idea was to bypass the filter from the second level. So instead of using one as the public key to use some other value which will give me somewhat reliable shared keys. So I tried setting the public key to -1 mod p which just is p-1. If I set both public keys to that I'd get around 25% success rate that the shared key is one as well (When both private keys are even that is). But it seems that the filter was not only broken in the sense that p-1 was accepted, but when I tried p+1 (Which is 1 as well) it worked as well, which was unexpected I only tried to set p+1 for the public key of Bob, so in the end my script actually had a 50% of succeeding (The script is 1:1 the same as for level 2, just replace the first 1 with p-1 and the second 1 with p+1). Once I got that I also quickly got the flag, but let's talk about what was required for getting the flag later

Second idea (Didn't work :/)

The second idea revolved around the fact that phi was actually smaller than p-1, so the generator from Alice didn't actually generate the full group but only a subgroup. I tried for way too long to get a generator of that group until I got the fact from above (how to get a generator of the subgroup). So in the end first I had to get a generator of the full group, so I made a list of the first 10000 primes and tried if one of those raised to the phi didn't give me 1 (If it did, it is a generator of a subgroup). Once that was done I could set the generator in the packet to that generator. Bob wouldn't accept that however like that. Because he checks that g^phi = 1 I had to set phi to be equal to the size of the subgroup. Then Bob would give me a public key which was an element of a subgroup with only 10000 (In that range at least) elements. Thus I could find Bob's private key modulo the size of the subgroup. Now of course if I just sent this public key to alice they would get different shared keys and thus trigger the TTP. So I had two options.

First one was to modify the public key of Alice. I tried to set that value to the generator as well. That way I'd get a shared key which is the same as the public key of Bob. Now I only need to calculate the discrete logarithm of the public key of Bob to the Base g (Original generator) modulo p. Which is exactly the same computation as finding the private key of Alice or Bob.

The second option was to modify Bob's public key. This would require to know the private key of alice modulo the size of the subgroup. Which could work (At least I think I calculated it correctly, but maybe I failed doing that. If that's the case ignore the following part) but Alice also checked that the public key was part of the group generated by g (And I have no way of modifying g nor phi for Alice). So she checks that the public key of Bob raised to phi is one as well, which is only the case for the single shared element in the subgroup generated by the original g and the subgroup I generate: 1 (Which is blocked since it's the solution of the second level)

So ultimately none of these ideas worked and I spent way too much time to try with this small subgroup. But that didn't work. So I tried to go with the next best subgroup:

Third idea (The one which did work)

We can factor phi (Which takes a minute, but works). All of those factors (And combinations thereof) are possible sizes of subgroups of the subgroup generated by g (So instead of a separate subgroup I make a nested subgroup). The smallest factor was around 20 million usually. I can bruteforce that, just barely, was my first thought. So let's get going: The idea is to raise both the public key of Alice and the public key of Bob to the same power, resulting in the same shared key for both, because ((g^a)^x)^b = g^(x*a*b) = ((g^b)^x)^a = (g^x)^(a*b). So what should the x be? Well as we can see, as long as g^x is an element of our subgroup, we can get the value of the shared key to be any of the 20 million elements of the subgroup. So I chose prime^((p-1)//pf) where pf is the smallest prime factor of phi (SPOILER: Don't do this). I raise both pubA and pubB to that and everything works fine. So next is to bruteforce the shared key. So I try bruteforcing the first sent message by Alice. But soon I realize that AES is slow as hell and that I'd need to wait for like an hour before it would work (When I parallelize it on 16 cores that is). So next idea: calculate the discrete log (Or in my case bruteforce the discrete log) of the new public key of Alice then exponentiate the new public key of Bob by that. So I implement that and try it, get a key, try the key and... it fails to decrypt? Well as I said before: don't choose an extra prime, just use the old generator as base and phi//pf as the exponent and it will work just fine. So anyway, after like a day spent thinking about discrete math and groups and failing to write code at 2am, I finally get the shared key correctly and am able to encrypt and decrypt again. Luckily the game behind the the encryption didn't change: Alice tells Bob she doesn't have a flag, Bob asks Alice if she wants to play hangman, she then asks "Does the flag contain 'z'" over and over again, so we just have to guess every character which could be in a UUID once, so all hex chars and a hyphen. (If there are three chars not in the flag, we lose, but usually there's only one or two chars missing from the full set). And here is my automated code for this (Including parallelized discrete log calculation which can be replaced by any discrete log algorithm which has a better theoretical runtime than my improvised bruteforcer. But when I tried it, I got around 1:25 minutes with my bruteforcer and 2:00 with sympy's discrete log implementation which uses different algorithms. Maybe I also used it wrong, idk)

from websocket import create_connection
from json import loads, dumps
from urllib.parse import quote
from base64 import b64encode, b64decode
from Crypto.Protocol.KDF import scrypt
from Crypto.Cipher import AES
from sage import *
from sympy import primefactors, mod_inverse, isprime
from sympy.ntheory.residue_ntheory import discrete_log
from multiprocessing import Pool, Value

domain = "<<- Insert fqdn here again ->>"

#
# Game automation
#

def finish(ws):
    """
    Reads all the data from a websocket (And prints it if enabled)
    """
    data = []
    while True:
        try:
            rcvd = ws.recv()
            #print(rcvd)
            data.append(rcvd)
        except:
            break
    return data
def dec(data):
    """
    Decode the data sent from the websocket, for each key base64decode the value
    """
    return {k: b64decode(v) for k,v in data.items() }
def enc(data):
    """
    Encode by base64encoding all the values
    """
    return {k: b64encode(v).decode() for k,v in data.items() }
def extract_gendata(data):
    """
    Read the generated data, g, p, phi and the salt
    """
    g = int.from_bytes(data['g'], 'big')
    p = int.from_bytes(data['p'], 'big')
    phi = int.from_bytes(data['phi'], 'big')
    salt = data['salt'] 
    return (g, p, phi, salt)
def extract_pubkey(data):
    """
    Get the public key of the packet
    """
    if 'pubA' in data:
        return int.from_bytes(data['pubA'], 'big')
    if 'pubB' in data:
        return int.from_bytes(data['pubB'], 'big')
    return None
def reset_chall():
    """
    Reset the challenge and wait for it to generate it's key material
    """
    ws = create_connection(f"wss://{domain}/api/deploy")
    finish(ws)
    ws = create_connection(f"wss://{domain}/api/deploy/task") # why do we need two requests? idk...
    ws = create_connection(f"wss://{domain}/api/deploy/task")
    finish(ws)
def run_cmd(cmd):
    """
    Run a command (1/2 or the data)
    """
    ws = create_connection(f"wss://{domain}/api/deploy/task?argument={cmd}")
    return finish(ws)
def intercept():
    """
    Intercepts a message and drops it, returns the sender, receiver and the data received
    """
    data = run_cmd("1") # intercept
    run_cmd("2") # drop
    sender = data[3]
    receiver = data[4]
    data = data[5]
    sender = sender[8:]
    receiver = receiver[10:]
    data = dec(loads(data[6:]))
    return (sender, receiver, data)
def listenin():
    """
    Intercepts a message and forwards it
    """
    data = run_cmd("1") # intercept
    run_cmd("1") # forward
    sender = data[3]
    receiver = data[4]
    data = data[5]
    sender = sender[8:]
    receiver = receiver[10:]
    data = dec(loads(data[6:]))
    return (sender, receiver, data)
def drop():
    """
    Drop the message without reading it
    """
    run_cmd("1") # intercept
    run_cmd("2") # drop
def insert(sender, receiver, data):
    """
    Insert a packet into the network
    """
    run_cmd("2") # insert
    run_cmd(sender)
    run_cmd(receiver)
    return run_cmd(quote(dumps(enc(data)))) # insert the actual data

#
# Encryption / Decryption
#

GLOBAL_KEY = None
def decrypt(salt, d):
    """"
    Uses the GLOBAL_KEY variable to decrypt the data in packet d
    throws an exception if decryption failed (Or rather it was the wrong key)
    """
    nonce = d['nonce']
    ctxt = d['ctxt']
    tag = d['tag']
    key_bytes = (GLOBAL_KEY).to_bytes((GLOBAL_KEY.bit_length() + 7) // 8, 'big')
    aes_enc_key = scrypt(key_bytes, salt, 16, N=2**14, r=8, p=1)
    cipher      = AES.new(aes_enc_key, AES.MODE_GCM, nonce=nonce)
    plain = cipher.decrypt_and_verify(ctxt, tag)
    print(plain.decode())
def encrypt(salt, d, msg):
    """
    Encrypts data and stores the output in d
    """
    print(f">>> {msg}")
    key_bytes = (GLOBAL_KEY).to_bytes((GLOBAL_KEY.bit_length() + 7) // 8, 'big')
    aes_enc_key = scrypt(key_bytes, salt, 16, N=2**14, r=8, p=1)
    cipher      = AES.new(aes_enc_key, AES.MODE_GCM)
    ctxt, tag   = cipher.encrypt_and_digest(msg.encode("utf-8"))
    d['nonce'] = cipher.nonce
    d['ctxt'] = ctxt
    d['tag'] = tag

#
# Discrete log bruteforcer
# Not necessary with numpy's
# discrete_log function
#

def log_base_gen(data):
    p, brutedata, gen, start, length, tid = data # unpack
    cur = pow(gen, start, p)
    for i in range(length+1):
        if i % 100000 == 0:
            print(f"Status report on thread {tid}: {100*i/length}")
        if cur == brutedata:
            print("FOUND ONE")
            return start+i
        cur = (cur*gen)%p
    return None 
def singlethreaded(func, p, g1, pf, brutedata): # call the worker with one thread
    data = (p, brutedata, g1, 0, pf, 0)
    return func(data)
def multithreaded(func, p, g1, pf, brutedata, numthreads): # call the worker with numthreads threads
    with Pool(16) as pool:
        data = [(p, brutedata, g1, i*pf//numthreads, pf//numthreads+1, i) for i in range(numthreads)] # will miss a few if no +1
        out = pool.map(func, data)
        print(out)
        return [x for x in out if x != None][0]

#
# The actual interaction with the server
# starts here
#

print("---- Resetting ----")
reset_chall()
print("---- Done ----")
s, r, d = intercept() # first message from alice
g, p, phi, salt = extract_gendata(d)
puba = extract_pubkey(d)
print("Factoring phi... this is going to take a while")
pf = primefactors(phi)[0]
print(f"Factored phi, smallest factor: {pf}")
exponent = phi // pf
gg = pow(g, exponent, p) # subgroup of order pf
if pow(gg,pf,p) != 1:
    print("Not generator of subgroup....")
    exit(3)
print(f"Subgroup size (modulus): {pf}, gg={gg}")
print("---- Doing it... ----")

# send puba^exponent mod p --> part of subgroup
newa = pow(puba, exponent, p)
d['pubA'] = (newa).to_bytes(258, 'big')
insert(s, r, d)

# first message from bob, also raise
# the public key here to the exponent
# The shared key will thus be
# g^(exponent*a*b) for both (ok for TTP)
s, r, d = intercept() # first message from bob
pubb = extract_pubkey(d)
newb = pow(pubb, exponent, p)
d['pubB'] = (newb).to_bytes(258, 'big')
insert(s, r, d)

# calculate discrete log of B to get private key of Bob
b = multithreaded(log_base_gen, p, gg, pf, newb, 16) # TODO: use more effective algo to get log ==> sqrt would only be in the thousands not millions
#b = discrete_log(p, newb, gg)

# public key of a to the b is the shared key
GLOBAL_KEY = pow(newa, b, p)

s, r, d = listenin() # second message from alice, should be encrypted...
decrypt(salt, d) # got no flag for you
s, r, d = listenin()
decrypt(salt, d) # wanna play hangman?
drop() # drop alice's packet: Does the flag contain 'z'

chars = "-0123456789abcdef" # hex chars and '-' (uuid only contains these)
for c in chars:
    encrypt(salt, d, f"Does the flag contain a '{c}'")
    insert('Alice', 'Bob', d) # insert packet

    s, r, d = listenin() # answer from bob 
    decrypt(salt, d) # decrypt & print the stuff from bob
    s, r, d = intercept() # question from Alice

Visualizations

Some visualizations I did (On paper with bad drawing skills, you don't want to see those, believe me) I mean they're still handdrawn and ugly, but good enough to see what's going on.

So what's going on in that picture? It's the visualization for the second idea I had: To create another subgroup apart from the one generated by the given generator. How the tuples (a,b) work is just like an id (Each one is unique). But we can also calculate with those. First I need to note, that we have each index individually modulo the amount of elements in the respective row/value (So we just wrap around). Addition of two elements is done like (a,b)+(c,d)=(a+c,b+d) which corresponds to multiplying in the context of diffie hellman. We can also do scalar multiplication: a(c,d) = (ac, ad). This would be exponentiation by a in the Diffie Hellman world. So this is basically just another representation of the same group structure we have in diffie hellman, but instead of a string of numbers we have two indices. And the green group is the one generated by g, which could be any of the elements which have a zero as the first element and a number which doesn't divide phi as the second (Which is actually the solution but I missed when I visualized it the first time)

Now with the explanation out of the way, how it works, my idea was to set the public keys of Alice and Bob to one of the elements in the red group, then it would only ever give me a shared key which is also in that group. Unfortunately, The public key is checked to be in the group with size phi (or any group with size which divides phi for that matter). Which we can circumvent for Alice's public key by modifying the phi sent to B, but we will fail with Bob's key. To get this to work we'd need to get the common element, which only is neutral element (So 1 in the original group) which obviously was banned since Level 2.

This is the idea behind the working solution. This whole table is basically the green group from before, but structured like before to contain a subgroup of a "small" size of 20 million.

From this illustration it's not hard to see that if we have any element (a,b) in the original group, we just have to multiply by the scalar x to get (0,bx). This because the first index is modulo x and anything times x modulo x is just 0. The second index is modulo pf, so we don't know what it is, but we know it is a generator of a subgroup with only pf elements. So by exponentiating (in the diffie hellman world) by x, we get a generator of the subgroup we want (or the neutral element).

Summary (TLDR)

Quick solutions for the levels:

Impersonate both Alice and Bob and relay the messages
Set public key to one for both and use that to read and insert new packets. Send "Please" to Alice to get flag
Raise both public keys by phi//pf where pf is the smallest primefactor of phi. Calculate discrete log to the base g^(phi//pf) of the new public key of either person and obtain the private key of that person that way. Then use the private key like the other person would to construct the shared key. Then win hangman (Send all hex chars and hyphen).

< Bufferoverflows | How2BufferFlow >