NHacker Next
  • new
  • past
  • show
  • ask
  • show
  • jobs
  • submit
eBPF Summit Day 1 Recap (cilium.io)
ArtWomb 1274 days ago [-]
This caught my eye: Lou Xun (CCP Games) - Traffic Control the Rabbit with Rust using RedBPF. An xdp context transparently available to rust ;)

https://github.com/aquarhead/protect-the-rabbit

https://github.com/redsift/redbpf

tptacek 1274 days ago [-]
Redbpf is a very cool project. But note that the Rust you get to work with is extremely limited: you don't get loops or native function calls (redbpf inlines them in the LLVM IR), and so essentially no library support. There are some BPF programs you can express in C (anything that relies on a bounded loop, for instance to walk a packet a byte at a time) that I don't think you can in redbpf right now.

(Bounded loops are really fussy anyways and you probably won't use them in most programs, so writing in a single language rather than Rust and C might easily be enough of a win).

mehrdadn 1273 days ago [-]
Generic BPF question: do they actually impose a limit based on the loop bound? (If so, what is the limit on the loop bound?)
kleendr12 1273 days ago [-]
Loops must be bounded, that means, the verifier must be able to see that the loop will eventually terminate based on the condition. The verifier will simulate all iterations of the loop and as such it is limited by the verifier complexity, that is, it'll do analysis of up to 1 million walked insns for the entire program until the verifier rejects it.
mehrdadn 1273 days ago [-]
Okay thanks! So what I don't get is, what's the point of bounding loops then? If it's already simulating the program up to 1M instructions and rejecting it if simulation doesn't prove termination (bounded model checking?), then can't it still do that when there's no loop bound?

Because I kind of expected the algorithm would say "hey this loop is bounded up to 4K, this nested one is up to 3K, therefore I can't prove the total is below 1M, therefore I reject", but from your description it sounds like it actually does some kind of bounded model checking up to 1M instructions instead of taking shortcuts like this? Or did you mean it actually does take shortcuts like this?

tptacek 1273 days ago [-]
A shot in the dark but maybe the disconnect here is: BPF statically verifies your program at load time, but a loaded BPF program is executed many, many times after that, with varied inputs. It can't be verified every time it is run, only the first time it's loaded.
kleendr12 1273 days ago [-]
When there's no loop bound it cannot prove termination, see halting problem. Goal is to avoid getting an infinite loop and then freezing the kernel of course.
mehrdadn 1273 days ago [-]
Right but I guess the point I'm getting at is that termination seems neither necessary nor sufficient to me. A loop (or nested loops...) that goes up to 2^63 may as well be infinite, so on the face of it it's not obvious why boundedness gets you anywhere by itself—you'd need to prove something stronger anyway. Conversely, it's not impossible to have loops that nevertheless (provably) terminate within N instructions. But if you're already bounding the number of instructions, why do you need bounded loops to begin with? Is that a necessary lemma of some sort before the algorithm is capable of proving something stronger? At first glance I don't see why the simulation (BMC?) approach you suggested would require bounded loops, since it would seem it might be capable of proving the termination of some unbounded loops within 1M instructions too.
tptacek 1273 days ago [-]
Forget 2^63, and forget nested loops. Think instead a loop with maybe 2000 iterations --- that seems like the frontier of what the verifier will let you do right now. The verifier's budget is 1MM cycles, for the whole program, including each tick of every loop in it. That is to say that the verifier is literally going to execute your program symbolically, and give up after that many cycles, or at any loop where it can't easily see that the induction variable is bounded by a program constant.

The idea here is that without any kind of loop, it's quite tricky to do some basic packet processing. For instance, if you want to clamp the MSS of a TCP connection, you have to grovel through TCP options, which do not occur at fixed offsets or in a particular order; you want to write a "for" loop over the (inherently limited) range of bytes at the computed offset of the TCP options. You can trivially bound that loop by the MTU of the link your program is loaded on (and also the limited possible size of TCP options), the loop won't iterate that many times, and it won't do much inside the loop.

But it's very much not the case that bounded loops give you general-purpose programming, like to implement your own data structures. BPF programs rely on kernel helpers and userland programs that maintain maps and read perf to do that stuff.

kleendr12 1273 days ago [-]
I'm not quite sure I follow your comment. If the program never terminates then it will loop forever and potentially freeze the machine depending from where it is invoked in the kernel. Example of loops that can be detected to terminate:

  int nested_loops(volatile struct pt_regs* ctx)
  {
    int i, j, sum = 0, m;

    for (j = 0; j < 300; j++)
      for (i = 0; i < j; i++) {
        if (j & 1)
          m = ctx->rax;
        else
          m = j;
        sum += i * m;
      }

    return sum;
  }
Or for example another one that is also part of selftests with induction variable i:

  int while_true(volatile struct pt_regs* ctx)
  {
    int i = 0;

    while (true) {
      if (ctx->rax & 1)
        i += 3;
      else
        i += 7;
      if (i > 40)
        break;
    }

    return i;
  }
Overall this is very useful to avoid unrolling loops & keeping the code dense and icache friendly, and to parse (e.g.) IPv6 extension headers and such.
mehrdadn 1273 days ago [-]
Oh! I thought bounded loops meant every loop has to have a bound (hence while (true) wouldn't work). If it can handle more complicated situations then that answers my question. The second example is identical to a do-while loop though, so it's not clear to me if it can actually handle more complicated situations that don't directly map to for/while/do-while loops.

For example, can it handle something like the following, where there's no bound, but the loop necessarily always terminates? (I assumed this loop would be called "unbounded", but maybe I'm confused by the terminology?)

  int test(unsigned i, unsigned j) {
    while (true) {
      i ^= j; j ^= i; i ^= j;
      if (i <= j) { return 0; }
      i ^= j; j ^= i; i ^= j;
      if (j <= i) { return 1; }
      i ^= j; j ^= i; i ^= j;
    }
    return 2;
  }
kleendr13 1273 days ago [-]
Very interesting question, I just gave that a run with i and j being unknown and seems it's getting rejected by the verifier as it still probes the else path. Note that LLVM will convert the xor patterns to moves:

  ; __u64 i = PT_REGS_FP(ctx), j = PT_REGS_RC(ctx);
  0: (79) r2 = *(u64 *)(r1 +32)
  ; __u64 i = PT_REGS_FP(ctx), j = PT_REGS_RC(ctx);
  1: (79) r1 = *(u64 *)(r1 +80)
  ; 
  2: (bf) r3 = r1
  3: (bf) r1 = r2
  4: (bf) r2 = r3
  ; if (i <= j) { return 0; }
  5: (2d) if r3 > r1 goto pc-4
  from 5 to 2: R1=inv(id=2) R2=inv(id=1) R3=inv(id=1) R10=fp0
  ; 
  2: (bf) r3 = r1
  3: (bf) r1 = r2
  4: (bf) r2 = r3
  ; if (i <= j) { return 0; }
  5: (2d) if r3 > r1 goto pc-4
  from 5 to 2: R1_w=inv(id=1) R2_w=inv(id=2) R3_w=inv(id=2) R10=fp0
  ;
  2: (bf) r3 = r1
  3: (bf) r1 = r2
  4: (bf) r2 = r3
  ; if (i <= j) { return 0; }
  5: (2d) if r3 > r1 goto pc-4
  ;
  infinite loop detected at insn 2
mehrdadn 1273 days ago [-]
Ah I see, thanks for running it! Yeah so it's not this particular loop that's interesting (there's probably always going to be some simple-looking loop a solver can't prove—and I'm sure we could come up with simpler examples), but rather, the interesting question is whether it can figure out anything that doesn't map directly to bounded for/while/do-while loops. It's interesting because:

1. If the answer is no, then what is the precise reason? Is there a legitimate reason for it? After all, a bounded loop that loops for too long is just as bad as one that never terminates, so clearly they need a way to upper-bound the instruction count for any loop—at which point, why is the bound even relevant? The only reason I can think of is that they do simplistic analysis (e.g. multiplying the bounds on nested loops to naively approximate an overall bound), but your examples suggest they have more sophisticated (SMT/BMC?) solvers, and it's not obvious to me why a modern solver would fail on all unbounded loops.

2. If the answer is yes, then it would seem they actually do allow unbounded loops after all?

The other possibility is they're using the word "bounded" differently (e.g. maybe as a synonym for "terminating"), in which case it would be true that they would need bounded loops by definition.

tptacek 1273 days ago [-]
The verifier is extremely fussy. Different forms of the exact same loop (as far as the programmer is concerned) will get different results, and my experience is that I spend time rewriting the same loop in different ways just to get programs to pass. A bona-fide non-unrolled loop in a BPF program right now is a special thing that takes extra time to implement, and you're probably not going to use them casually.

The answer to whether BPF effectively allows unbounded loops is "no". The verifier essentially emulates the instructions in your loop, iteration by iteration, and gives itself a fixed budget to do so. If it can't prove the loop invariably exits in that budget, it rejects the program. Allowing an unbounded loop would be an important security vulnerability, and is kind of the whole original point of the verifier.

Probably, this is just a terminology issue; what the verifier in fact cares about is indeed whether the program terminates.

mehrdadn 1273 days ago [-]
Thanks! Yeah that makes sense; it sounds like it's just the terminology then.
danobi 1274 days ago [-]
A possibly lighter weight alternative is libbpf-rs [0]. libbpf-rs is designed to take advantage of BPF's Compile-Once-Run-Everywhere functionality where you can ship a pre-compiled object file to production instead of an entire compiler toolchain.

Disclaimer: I wrote libbpf-rs.

[0]: https://github.com/libbpf/libbpf-rs

tptacek 1274 days ago [-]
Do people generally do that? Is this an artifact of people using iovisor/bcc and Python or whatever? We just build .o files with clang and ship 'em.
brycekahle 1273 days ago [-]
Yes. If you ship any eBPF programs that utilize kprobes and/or read kernel structures that change between kernel versions, you have portability problems. See https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf... for a good overview of the problem that CO-RE intends to solve.
tptacek 1273 days ago [-]
Ah! That makes so much sense. It hadn't occurred; when we rolled out BPF at Fly, the first thing we did was commit to standardize our kernels. After all, there's not much CO-RE can do about your BPF compiler not having tail calls, or not allowing bounded loops. But we're almost entirely XDP; I can see kernel structs being a much bigger problem for people doing observability work.
rtkaratekid 1274 days ago [-]
I use libbpf-sys but that’s user space
ShaniceWilliams 1274 days ago [-]
interesting!
fdfddd 1273 days ago [-]
day 2 stream is here: https://youtu.be/jw8tEPP6jwQ
fdfddd 1273 days ago [-]
love the wide variety of lightning talks as well
ithkuil 1274 days ago [-]
I really love this concise yet informative summary!
genbit 1273 days ago [-]
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact
Rendered at 15:10:35 GMT+0000 (Coordinated Universal Time) with Vercel.