Tags: cryptography
Rating:
# Writeup for Google CTF 2018 Qualification Round
- Challenge: **DOGESTORE**
- By: _Nguyễn Duy Hiếu_, an undergraduate student from _ACTVN (Academy of Cryptography Techniques, Vietnam)_.
- Contact: [email protected]
## Challenge summary
There's an `encrypted_secret` and a server that works as follow:
1. Receives 110-byte input.
2. Decrypts the input by xor'ing it with a fixed keystream.
3. Decodes the decrypted string by interpreting it as `"char, repeat times - 1, char, repeat times - 1, ..."`. For example, `decode("a\x00b\x01c\x02") == "abbccc"`.
4. Outputs the SHA3-256 hash of the decoded string.
## The key idea
There is more than one way to encode a string if it contains some identical consecutive characters. For example, `"aaa"` can be encoded as `"a\x02"` or `"a\x01a\x00"` or `"a\x00a\x01"` or `"a\x00a\x00a\x00"`. Knowing this fact, we could learn something about the secret keystream by examining the cases in which different inputs are fed to the server but same hash returned.
## Solution
Since the challenge's code is written in Rust which I am not used to, the basic operations of the server will be reimplemented in Python 3:
```python
KEY_SIZE = 110
KEYSTREAM = bytearray(KEY_SIZE)
def decrypt(encrypted): return bytes(a ^ b for a, b in zip(encrypted, KEYSTREAM))
def decode(encoded):
s = b''
for i in range(0, len(encoded), 2):
s += bytes([encoded[i]]) * (encoded[i+1] + 1)
return s
from hashlib import sha3_256
def hash(data): return sha3_256(data).digest()
```
Next, let's define some functions which might be useful for us:
```python
def make_data(*ivvs):
'''This function returns 110-byte data, default to all zeros.
User can specify some (index, value_1, value_2)'s to modify the data as needed.
'''
data = bytearray(110)
for ivv in ivvs:
data[ivv[0]] = ivv[1]
data[ivv[0]+1] = ivv[2]
return bytes(data)
from socket import create_connection
ADDR = ('dogestore.ctfcompetition.com', 1337)
def post(data):
'This function posts data to the server and returns the hash received'
while True:
try:
sock = create_connection(ADDR)
sock.send(data)
return sock.recv(4096)
except: pass
def post_data(*ivvs): return post(make_data(*ivvs))
```
### Key leakage
#### Even-indexed bytes
Let the first 4 bytes of an input data be `0, 0, x, 0` (Figure 1). If `x == k[0] XOR k[2]` then the corresponding decypted byte `x XOR k[2]` will be `k[0]`, same as the first one, and the decrypted string will be decoded into something which has `k[1] + k[3] + 2` letter `k[0]`'s at the beginning. Now, we have a chance to modify the first and third byte of the input data (`d[0]`, `d[2]`) without changing the decoded string. If `(d[1] XOR k[1]) + (d[3] XOR k[3]) == k[1] + k[3]`, things will be as expected.
![input_process](imgs/data_processing.png)
_Figure 1: The processing of an input with the first 4 bytes: `0, 0, x, 0`._
Recall that `a XOR 1` is equal to `a + 1` or `a - 1`, depends on the least significant bit (LSB) of `a`, so `(k[1] XOR 1) + (k[3] XOR 1) == k[1] + k[3]` if the LSB's of `k[1]` and `k[3]` are different. In case the LSB's are equal, `(k[1] XOR 1) + k[3]` must be the same as `k[1] + (k[3] XOR 1)`. Therefore, if we could find an `x` such that:
`post_data((0, 0, 0), (2, x, 0)) == post_data((0, 0, 1), (2, x, 1))` or `post_data((0, 0, 1), (2, x, 0)) == post_data((0, 0, 0), (2, x, 1))` then `x` must be equal to `k[0] XOR k[2]` (Figure 2) or, we would otherwise detect a SHA3-256 collision which seems impossible.
![k0 xor k2 leak](imgs/k0_xor_k2_leak.png)
_Figure 2: `k[0] XOR k[2]` leak._
In a similar way, we could find out `k[2] XOR k[4]`, `k[4] XOR k[6]`, `k[6] XOR k[8]`,... if we want.
#### Odd-indexed bytes
Now, since `a XOR 2^i` is equal to `a + 2^i` or `a - 2^i`, depends on the i-th bit of `a`, it follows that `(a XOR 2^i) + (b XOR 2^i) == a + b` if and only if the two i-th bits of `a` and `b` are different. Using this fact, we could find out if the two i-th bits of `k[1]` and `k[3]` are equal or not for `i = 0,1,2,...,7` (Figure 3), then derive `k[1] XOR k[3]`.
![k1 xor k3 leak](imgs/k1_xor_k3_leak.png)
_Figure 3: `k[1] XOR k[3]` leak._
Again, we could obtain `k[3] XOR k[5]`, `k[5] XOR k[7]`,... if we want.
#### Putting it all together
Now, connect to the server, get all `k[i] XOR k[i+2]`'s and store them in a bytearray named `DELTA`:
```python
DELTA = bytearray(KEY_SIZE - 2)
for i in range(0, len(DELTA), 2):
print('Currnet index: {}/{}'.format(i, KEY_SIZE-2))
# brute force x to find k[i] XOR k[i+2]
for x in range(256):
print('Trying x = {}... '.format(x), end='')
_a = post_data((i, 0, 0), (i+2, x, 0))
_b = post_data((i, 0, 1), (i+2, x, 1))
if _a == _b:
DELTA[i] = x
DELTA[i + 1] = 1 # the two LSB's are different
break
_c = post_data((i, 0, 1), (i+2, x, 0))
_d = post_data((i, 0, 0), (i+2, x, 1))
if _c == _d:
DELTA[i] = x
DELTA[i + 1] = 0 # the two LSB's are equal
break
print('Not matched!')
print('Matched!')
print('k[{}] XOR k[{}] = {}'.format(i, i+2, x))
# find k[i+1] XOR k[i+3]
for j in range(1,8): # the 0-th bit (LSB) case has been done.
_e = post_data((i, 0, 1<