CSAW CTF 23 Write-Up: Linear Aggressor

After competing in the capture-the-flag event hosted by CSAW, I wanted to do a write-up of the "Linear Aggressor" challenge. The challenge looks a bit cryptic at first, but it's fairly simple once you know what the trick is.

Here is the description of the challenge:

Okay, no idea what this is about so far. Let's connect to that server and see what it says.

Okay, it's a "linear regression prediction". What's that?

A linear regression is a model that predicts new values based on data you already have. Essentially, a linear regression works by drawing the line that best fits your existing data.

Here is some random data (the blue dots), and a linear regression which approximates our data (the red line). The line corresponds to y(x) = 0.381*x + 0.183.

We can say that this model has two parameters, 0.381 and 0.183. If we change either of these two parameters, the line will change as well.
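To make this concrete, here's a minimal Julia sketch of fitting a line like that by least squares. The data is made up purely for illustration; the backslash operator solves the least-squares problem and hands back the slope and intercept.


# Made-up data scattered around y = 0.381*x + 0.183.
x = 10 .* rand(100)
y = 0.381 .* x .+ 0.183 .+ 0.05 .* randn(100)

# Least squares: the design matrix has a column for x and a column of ones,
# so the solution vector is [slope, intercept].
slope, intercept = [x ones(100)] \ y
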

Linear regressions aren't the only kind of regression used for modeling data. You could imagine modeling the same data above using a curve instead of a line. However, linear regressions are used quite a bit in data science. They tend to give good results for a comparatively small amount of complexity.

Another interesting property of linear functions like these is that combining any two linear functions, whether by adding them or composing one with the other, gives another linear function. This is why in cryptography it's important to have non-linear steps in an encryption algorithm. If every step were a linear operation, then the algorithm as a whole could be expressed as a single linear operation, and would be vulnerable to linear cryptanalysis.
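As a quick illustration (my own toy example, nothing to do with the challenge itself), composing two linear functions in Julia gives back another linear function whose parameters you can read off directly:


f(x) = 2x + 3        # linear
g(x) = 5x - 1        # linear

# f(g(x)) = 2*(5x - 1) + 3 = 10x + 1, which is linear again.
h(x) = f(g(x))

h(4) == 10*4 + 1     # => true
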

Anyways, let's give the model some numbers:

Okay, so it takes in 30 numbers and outputs 1 number. If we assume that the model really is a linear regression like they say, then it should look something like this:

y(x) = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + ... + a30*x30 + a31

As you can see, there are 31 parameters: one for each of the 30 inputs, plus a constant parameter at the end. This model is harder to picture, because with 30 inputs its graph is a hyperplane rather than a line, but hopefully you get the idea.
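Written in Julia, the model we're assuming would look something like this (the parameter vector a is exactly what we don't know yet):


# Assumed shape of the server's model: 30 weights plus a constant term.
predict(x, a) = sum(a[i] * x[i] for i in 1:30) + a[31]
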

Let's write some code. First of all, to make this easier, I'll write a function that takes a vector of 30 numbers and gives us the number returned by the model.


# julia

using Sockets

function result(N)
    sock = connect("misc.csaw.io", 3000)

    # Send each of the 30 numbers on its own line.
    for n in N
        write(sock, string(n) * '\n')
    end

    # Read the server's output until it announces the result.
    while true
        line = readline(sock)

        if isempty(line)
            error("failed to read output")
        elseif line == "Your result is:"
            value = parse(Int64, readline(sock))
            close(sock)
            return value
        end
    end
end
            

We can test that it works:


result(rand(1:100, 30))

# => 153712
            

Now, the idea is to steal the parameters from the model. One parameter is easy to get: if we give the model all zeros, the only thing left will be the last parameter, a31.


a31 = result(fill(0, 30))

# => 125
            

The other parameters are slightly more complicated.

Let's consider the first parameter, a1. We can give the model 1 as the first number, and 0 for all other numbers. This gives us a1 + a31, because the zeroes eliminate all of the other parameters.

Since we know the value of a31, we can subtract this from the result that we receive, and we are left with our parameter.
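For example, the first parameter could be recovered with a single extra query, reusing the result function and the a31 value from above:


# 1 in the first position and 0 everywhere else isolates a1 + a31.
e1 = [1; fill(0, 29)]

a1 = result(e1) - a31
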

The same logic holds for each of the other parameters. To query them all at once, let's create the 30×30 identity matrix.


I = [Int(x==y) for x in 1:30, y in 1:30]
            

The matrix looks like this:


 1  0  0  0  0  0  0  0  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  1  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  1  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  1  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  1  0  0  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  1  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  1  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  1  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  1  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  1  0  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  1     0  0  0  0  0  0  0  0  0  0  0  0
 ⋮              ⋮              ⋮     ⋱        ⋮              ⋮
 0  0  0  0  0  0  0  0  0  0  0  0     1  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  1  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  …  0  0  1  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  1  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  1  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  …  0  0  0  0  0  0  0  1  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  1  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  1  0  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  1  0
 0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  1
            

Each value is zero except the values along the main diagonal, which are one.

Now, each row has a single 1 in a different position, so feeding the rows to the model one at a time gives us each parameter. Remember, like above, we have to subtract the constant parameter a31.


A = [result(row) - a31 for row in eachrow(I)]

# => 99, 115, 97, 119, 99, 116, 102, 123, 109, 48, 100, 51, 49, 95, 53,
#    116, 51, 52, 49, 105, 110, 103, 95, 105, 53, 95, 98, 52, 100, 125
            

These are the remaining parameters of the model! Also, they look suspiciously like ASCII values. Let's try to print them.


println(join(Char.(A)))

# => csawctf{m0d31_5t341ing_i5_b4d}
            

Like many other CTF puzzles, it's easy once you know the trick! I hope you found this write-up useful.