p̸h̴o̸⁴c̵e̴ₓa̵

If you need a compiled compiler to compile compilers, how do you know that there's no spyware secretly propagating itself?

hex0 is "a hex assembler written in hex" -- basically several bytes of raw machine code "with a shitload of comments" which translates text to bytes
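To make "translates text to bytes" concrete, here is a rough Python sketch of the kind of translation hex0 does. This is not the real hex0 (which is a few hundred bytes of machine code); the function name `hex0_translate` is made up for illustration, and the sketch assumes that `#` and `;` start a comment running to the end of the line and that any other non-hex character is skipped.

```python
def hex0_translate(text: str) -> bytes:
    """Turn commented hex text into raw bytes (illustrative sketch only)."""
    out = bytearray()
    high = None          # holds the first nibble of a pair, if we have one
    in_comment = False

    for ch in text:
        if in_comment:
            # Assumed rule: a comment runs until the end of the line.
            if ch == "\n":
                in_comment = False
            continue
        if ch in "#;":
            in_comment = True
            continue
        if ch in "0123456789abcdefABCDEF":
            nibble = int(ch, 16)
            if high is None:
                high = nibble
            else:
                # Two hex digits make one output byte.
                out.append((high << 4) | nibble)
                high = None
        # Whitespace and anything else is ignored.
    return bytes(out)


if __name__ == "__main__":
    source = "7F 45 4C 46  # ELF magic\n01 ; a one-byte field\n"
    print(hex0_translate(source))
```

The point is that the logic is small enough to audit by eye, which is what lets the rest of the chain rest on it.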

hex0 translates hex1
which translates M0
which assembles M1/hex2
which assembles M2-Planet
which assembles mes
🆕 which compiles tinycc
which compiles gcc
which compiles the OS!

github.com/oriansj/mescc-tools

gitlab.com/janneke/mes/blob/ma

@pho4cexa Wait. What? I was aware of mes and the further stages. But not these earlier ones. I need to sit down and look at this. [scratches head dumbfoundedly]

@pho4cexa >tcc compiles gcc

hot, i can barely get gcc to compile itself due to my musl system

@pho4cexa Back in the GOOD OL DAYS people would sit down and assemble a Pascal compiler by hand, or so I hear tell.

Back when beards had men attached, women kept nanoseconds in their purses, and we had rooms filled with computer instead of a room filled with computers.

@pho4cexa Though that's frankly for machines and this is a pretty awesome bootstrap setup.

@pho4cexa What if they're ahead of you and put a backdoor somewhere else (the kernel? the CPU?) that detects when GCC or something else is running and inserts the backdoor there?

@onf 🤔 yeah i can think of a few ways to defeat it on one system, but now that we have everything needed to bootstrap a compiler from "nothing," we can get a load of people all doing it on different machines and comparing notes (and cryptographic hashes). Check out bootstrappable.org/benefits.ht !
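A minimal sketch of the "comparing notes" step, assuming each person publishes the SHA-256 of the binary they bootstrapped. The builder names, the `find_dissenters` helper, and the majority rule are all made up for illustration; agreement among builders is evidence, not proof, that nobody's toolchain was tampered with.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hex digest of a build artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_dissenters(reports: dict[str, str]) -> list[str]:
    """Given builder -> digest, list builders who disagree with the most common digest."""
    if not reports:
        return []
    majority_digest, _ = Counter(reports.values()).most_common(1)[0]
    return sorted(name for name, digest in reports.items() if digest != majority_digest)


if __name__ == "__main__":
    # Hypothetical artifacts: two builders got identical bytes, one did not.
    good = b"\x7fELF...compiler bytes..."
    odd = b"\x7fELF...compiler bytes plus something extra..."
    reports = {
        "alice": sha256_hex(good),
        "bob": sha256_hex(good),
        "carol": sha256_hex(odd),
    }
    print(find_dissenters(reports))
```

A mismatch does not say who is compromised, only that someone's build is worth a closer look. This also only works if the builds are reproducible bit for bit.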

@pho4cexa Well, that's certainly a good idea, but this "nothing," if I'm not mistaken, looks a lot like an ELF binary containing x64 code, which would limit it at least to running on 64-bit (somewhat recent) Intel and AMD and similar CPUs, and on modern 64-bit operating systems like Linux distros. And how much can these systems be trusted not to all have a backdoor somewhere deep?

@onf @pho4cexa That might happen, but you should not consider these experiments as a finished product. The idea (or, at least, my idea) is to go gradually deeper. My current effort is at gitlab.com/giomasce/asmc, and it aims at not depending on a binary OS. You can still mistrust the CPU of course. At some point I hope to port the whole thing to open CPUs like RISC-V, but it is better to do things gradually.

@pho4cexa This was vaguely mentioned by @rrix, but something like this actually happened in the early days of UNIX, as described in Ken Thompson's Turing Award lecture. ece.cmu.edu/~ganger/712.fall02

@pho4cexa @er1n Fascinating, last time I tried compiling gcc w/ tinycc I got a segfault in the resulting stage 2 xgcc. I wonder if things changed...

@pho4cexa wow, that's mind-bending. 8 compilers, 3 of which are complex and POSIX-only, just to bootstrap a free software system, is bad imho

I mean, even bootstrappable.org says there must be as few layers as possible

@pho4cexa huh... I just remembered a few pages of hex printout in some old journal I got from my parents. It was called either a monitor or a hypervisor.

The presence of an ELF header makes me think there are a lot of places where malicious software could taint the code, from storing it on the filesystem to loading it under supervised execution.

A hex-to-binary converter, when you start from a monster like the Linux kernel?.. And rely on tons of source code from other systems...
