PBCTF: NP - Not Password

October 12th 2021

Not password

Aka C++ rev, aka Pain, aka what?

Task

Not Password is a program where you, well... type not a password and totally does not check your password for correctness.

Solve

We get a 224Kb C++ binary and a ~1Gb "lockfile".

After I opened the binary in Ghidra and started reversing the main function, I quickly realized that it just reads two lines, once a "not-password" and once a nonce. It then checks for the correct flag format and converts the rest (36 characters) of the flag into 32-bit integers (Yielding 9 of them). We then have a for loop which just calls round with the all of those ints and an iterator and a base64 iterator of your nonce as well as a serial_graph read from the lockfile, a serialized graph consists of a u32 for the number of vertices, a u32 for the number of edges and then number of edges times pairs of u32s for the source and destination of each edge in the graph.

There are nine of those graphs inside the lockfile (What a surprise), each of which has around 5000 vertices and around 100 million edges (Basically every vertex to every other vertex)

Inside the round function it sets up a SAT problem with an AST which has our input encoded, the nonce gets base64-decoded (Actually this is a guess, I never used the nonce, and even with the final flag I was unable to pass the check with the nonce).

It's a very basic AST, there are only five types of nodes, variables, not, and, or, and xor. The AST is generated with very describtive function names It reserves a number with 32 bits, which is the variable it wants to solve for, then adds the 32 bits of our input encoded as nodes by encoding a set bit as an AND(x) node and an AND(NOT(x)) node if the bit was not set. Then it calls add_num with a parameter set to true - which makes the function actually subtract instead of add (The first carry is set to the flag and if the flag is true every bit of the second operand is inverted, so a subtraction). Next it calls any_bits which just checks if any bits have been set in the output. (I skipped some nodes in the AST, but they're not that important).

Then it seems to set the first variable to the nonce-bits (Again, didn't figure out what that means).

The AST then gets sent through the four stages (Like this challenge sent me through the five stages of grief for having done this challenge).

Our AST is put into a function (as a vector inside a struct) stage1_pre, which sets up other stuff in the struct for later use.

Then we go through the actual stage 1 - oh boy, was this fun. After a lot of time, I figured out that this just converts the AST into triples of variables, which are in 3CNF form to encode the different gates (not the variables). So a y=NOT(x) got turned into (y or y or x) and ((not y) or (not y) or (not x)) (So two triples of three variables, the struct for those variables being two shorts, one for storing the symbolic variable (index into an array) and the other if it's negated or not. So this turned every gate into some amount of 3CNF groups (NOT was 2 of them, AND and OR three, and XOR four).

The next stage, stage 2, then converted the SAT problem into a vertex problem by going through all pairs of variables in different 3CNF groups and creating an edge if the variable is not the same or they have the same sign (both negated or both not negated) between the indices of those specific literals.

Stage 3 flipped every edge by going through all possible generated edges and inserting the ones not in the previous graph into the output for stage three.

Stage 4 then takes that graph (Which has only a few edges because before it had lots of them) and first adds all edges from an index i to all indices greater than i (up to the amount of vertices there were in the old graph). Then it created a new node for every edge and connected the two connected vertices to the new vertex.

The output of that was then compared against the serialized graph from the lockfile and then verified in the verify function which just checks if the problem has been solved correctly (Basically if my nonce is correct, I guess this was either just a red herring or used in an earlier stage of development that we also have to solve the problems to get a flag on a remote or something like that). Anyway, we just need to reverse all the steps done in the stages to get the input AST, which should encode the expected inputs which of course are the 32-bit ints from the flag.

However instead of reversing all four stages I can skip stage3 by realizing that if we do that we just have to flip the condition to generate an edge in stage2, so instead of all edges if the variables are different or the sign is the same we get all the edges if the variable are the same and the sign is different, which is of course a lot less. I did the whole parsing in C and compiled with O3 and still it took around 8 seconds to reverse all the graphs from the lockfile into this stage... But hey, now we have a graph which has a lot less edges (Only around 18k instead of 100million). My code for this is at the end of the writeup, together with the second decoding stage.

Now since we know the order inside each of the encoded gates we can detect certain patterns, for example the NOT gate was encoded as (y or y or x) and ((not y) or (not y) or (not x)) so we see that index 0 and index 3 are connected because they are the same variable but different sign, same for all the other variations of y and not y and for the x variables. And with that I could detect the different gates and where they were.

Next I could identify the inputs and outputs of the gates by taking two gates and checking if the connections from the variables match up, so if if one gate is z = AND(x, y) and another gate is a = NOT(b) and I see that all the connections between a and x match up, I can say that a is the input for the AND gate. There were a lot of empty spaces (Which were just the variables, which were not encoded into the graph problem in stage 1). But I could still extract the embedded number because I knew it was added as the second node (Due to how the transformations were done, it actually was encoded from the back of the gates in the graph). Each bit which should be set is a connection to the variable (The solver tries to make all the nodes true). And not set variables have a NOT gate in between, so by going through the last 32 AND gates inside the decoded graph and looking if the second input is another gate or not, I could extract the bits (NOT gate present ==> bit was not set, no second input ==> bit was set). Then I just need to print those four bytes and after a day of reversing get the flag:

pbctf{W0w_NP-C0mpl3t3n3ss_1s_pr33ty_c0ol!!}

Code (The ugly details)

To generate the intermediate representation (Just stage 4 reversed - or an inverted tree from after stage 2) I wrote the following code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

typedef uint8_t u8;
typedef uint32_t u32;

char buf[4];
u32 r32(FILE *f) {
  fread(buf, 1, 4, f);
  return *(u32*)buf;
}
void w32(FILE *f, u32 d) {
  *(u32*)buf = d;
  fwrite(buf, 1, 4, f);
}

int main(int argc, char **args) {
  FILE * f = fopen("lockfile", "rb");
  FILE * of = fopen("one", "wb");

  for(int graph = 0; graph < 9; graph++) {
    u32 num_nodes = r32(f);
    u32 num_edges = r32(f);
    short * out = (short*) calloc(2, num_nodes);
    size_t off = ftell(f);
    printf("Reading graph @ %d\n", off);

    for(u32 i = 0; i < num_edges; i++) {
      u32 src = r32(f);
      u32 dst = r32(f);
      out[src] = true; // outgoing edge => not a edge-node
    }

    u32 np = 0;
    u32 ne = 0;
    for(u32 i = 0; i < num_nodes; i++) {
      if(out[i]) np++;
      else ne++;
    }
    printf("Number of nodes & edges in the graph: %d, %d\n", np, ne);
    w32(of, np); // write num of nodes
    w32(of, ne); // write num of edges 
    fseek(f, off, SEEK_SET); // rewind file
    memset(out, 0, 2*num_nodes);
    u32 act_ne = 0;

    for(u32 i = 0; i < num_edges; i++) {
      u32 src = r32(f);
      u32 dst = r32(f);
      if(dst >= np)
        if(out[dst] == -1) {
          printf("OOOOF");
          exit(1);
        } else if(out[dst]) {
          // printf("Edge in original graph from %x to %x\n", out[dst], src, ne++);
          u32 a = out[dst] < src ? out[dst] : src;
          u32 b = out[dst] >= src ? out[dst] : src;
          w32(of, a);
          w32(of, b);
          act_ne++;

          out[dst] = -1; // max two connections to a node ==> represent the edge
        } else {
          out[dst] = src; // First endpoint
        }
    }
    if(act_ne != ne) {
      printf("Wrong number of edges %d, %d\n", act_ne, ne);
      exit(1);
    }
    printf("Num edges: %d\n", ne);
    free(out);
  }

  fclose(f);
  fclose(of);

    return 0;
}

This will parse all the edges and reconstruct the edges in the graph before stage 4 and also reduce the number of vertices.

Then to get the gates (And the flag) I used the following code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

typedef enum {
  NOT = 0,
  AND,
  OR,
  XOR
} Type;

typedef uint8_t u8;
typedef uint32_t u32;
typedef struct {
  short pos;
  u8 len;
  Type type;
} gate_t;

char buf[4];
u32 r32(FILE *f) {
  fread(buf, 1, 4, f);
  return *(u32*)buf;
}
void w32(FILE *f, u32 d) {
  *(u32*)buf = d;
  fwrite(buf, 1, 4, f);
}
bool c(char *a, u32 nn, u32 s, u32 d) {
  return a[s*nn+d] != 0;
}

int main(int argc, char **args) {
  FILE * f = fopen("one", "rb");
  FILE * of = fopen("two", "wb");

  for(int graph = 0; graph < 9; graph++) {
    u32 num_nodes = r32(f);
    u32 num_edges = r32(f);
    //printf("==> %d %d\n", num_nodes, num_edges);

    char* connections = (char*) calloc(1,num_nodes * num_nodes);

    for(u32 i = 0; i < num_edges; i++) {
      u32 src = r32(f);
      u32 dst = r32(f);
      connections[src*num_nodes + dst] = 1;
    }

    char * io = (char*) malloc(num_nodes);
    gate_t * gates = (gate_t*) malloc(num_nodes);
    u32 gate = 0;

    for(u32 i = 0; i < num_nodes; i++) {
      if(
          c(connections, num_nodes, i, i + 3) &&
          c(connections, num_nodes, i, i + 4) &&
          c(connections, num_nodes, i + 1, i + 3) &&
          c(connections, num_nodes, i + 1, i + 4) &&
          c(connections, num_nodes, i + 2, i + 5)) {
        //printf("Not @ %d (+6)\n", i);

        gate_t a = {i, 6, NOT};
        gates[gate++] = a;
      }

      if(
          c(connections, num_nodes, i, i + 6) &&
          c(connections, num_nodes, i + 1, i + 5) &&
          c(connections, num_nodes, i + 2, i + 5) &&
          c(connections, num_nodes, i + 3, i + 6) &&
          c(connections, num_nodes, i + 4, i + 7)) {
        //printf("And @ %d (+9)\n", i);
        gate_t a = {i, 9, AND};
        gates[gate++] = a;
      }

      if(
          c(connections, num_nodes, i, i + 6) &&
          c(connections, num_nodes, i + 1, i + 6) &&
          c(connections, num_nodes, i + 3, i + 6) &&
          c(connections, num_nodes, i + 2, i + 5) &&
          c(connections, num_nodes, i + 2, i + 8) &&
          c(connections, num_nodes, i + 4, i + 7)) { 
        //printf("Or  @ %d (+9)\n", i);
        gate_t a = {i, 9, OR};
        gates[gate++] = a;
      }

      if(
          c(connections, num_nodes, i, i + 6) &&
          c(connections, num_nodes, i, i + 9) &&
          c(connections, num_nodes, i + 1, i + 4) &&
          c(connections, num_nodes, i + 1, i + 10) &&
          c(connections, num_nodes, i + 2, i + 5) &&
          c(connections, num_nodes, i + 2, i + 8) &&
          c(connections, num_nodes, i + 3, i + 6) &&
          c(connections, num_nodes, i + 4, i + 7) &&
          c(connections, num_nodes, i + 5, i + 11) &&
          c(connections, num_nodes, i + 7, i + 10) &&
          c(connections, num_nodes, i + 8, i + 11)) {
        //printf("Xor @ %d (+12)\n", i);
        gate_t a = {i, 12, XOR};
        gates[gate++] = a;
      }
    }

    u32 out = 0; // hope for null after this one here lol

    for(u32 i = 0; i < gate - 1; i++) {
      gate_t g = gates[i];
      //if(g.type == NOT) printf("Not @ %d\n", g.pos);
      //else printf("Gate @ %d\n", g.pos);
      u32 has_a = 0, has_b = 0;
      for(u32 j = i+1; j < gate; j++) {
        gate_t h = gates[j];

        if(g.type == AND && h.type == AND) {
          if(
              c(connections, num_nodes, g.pos + 4, h.pos + 0) &&
              c(connections, num_nodes, g.pos + 4, h.pos + 3) &&
              c(connections, num_nodes, g.pos + 7, h.pos + 6)
            ) {
            has_a = h.pos;
          }
        } else if (g.type == AND && h.type == NOT) {
          if(
              c(connections, num_nodes, g.pos + 1, h.pos + 3) &&
              c(connections, num_nodes, g.pos + 2, h.pos + 3) &&
              c(connections, num_nodes, g.pos + 1, h.pos + 4) &&
              c(connections, num_nodes, g.pos + 2, h.pos + 4) &&
              c(connections, num_nodes, g.pos + 5, h.pos + 0) &&
              c(connections, num_nodes, g.pos + 8, h.pos + 0) &&
              c(connections, num_nodes, g.pos + 5, h.pos + 1) &&
              c(connections, num_nodes, g.pos + 8, h.pos + 1)
            ) {
            has_b = h.pos;
          }
        }
      }
      if(has_a) {
        //printf("%d has inputs a: %d, b: %d\n", g.pos, has_a, has_b);
        out = (out<<1)|(has_b == 0);
      }
    }

    printf("%s", &out);
  }
  printf("\n");

  fclose(f);
  fclose(of);

  return 0;
}

Note that this only works if there's a null byte after the u32 out because of the way I print it (Always worked for me, so yeah, probably works out with the padding or sth). I removed a lot of code to detect edges between different gates (I left the code to detect the gates, just removed code the detect inputs for them because in the end only and to not or and mattered).


< ImaginaryCTF Round 12 Writeup | PBCTF: Switching it Up >