The Two Doofuses Rhys Weatherley rweather@zip.com.au Fri, 14 Jun 2002 22:11:28 +1000 ---------------------------------------------------------------------- This is something I put together about a year ago on a lark. Enjoy! ---------------------------------------------------------------------------- The Two Doofuses or Why Type Safety and the Garbage Collector Really Exist Dramatis Personae ----------------- Doofus 1: A programmer, implementing their first virtual machine. Doofus 2: Another programmer, helping Doofus 1 implement the virtual machine. Narrator: The person that is telling the story. Audience: The people that are listening to the story. Act 1, Scene 1 -------------- Narrator: We find our two doofuses in a cubicle farm, designing the Next Big Thing. It's called a virtual machine. Audience: Next Big Thing? Haven't virtual machines's been around since the dawn of computing? What's so new about that? Narrator: Why yes they have! How very astute. But our two doofuses aren't very bright, which is why they think it is the Next Big Thing. Audience (muttering behind their hands): Stupid doofuses, reinventing wheels. Narrator: Anyway, we find our two doofuses in a cubicle farm ... Doofus 1: Look at this really cool system that I thought up! You compile programs for this here bytecode format and the programs run on any platform that has an engine! Doofus 2: Wow! What languages can it run? Doofus 1: Anything you want! C, C++, Pascal, whatever. The instruction set is pretty general purpose. Doofus 2: That's really cool! (thinks for a bit) But if we download the program over the Internet, won't security be a problem? Doofus 1: No it won't. It's not machine code. It's bytecode. Doofus 2: Yes it will. It doesn't matter what kind of code it is. Doofus 1: NO IT WON'T! Doofus 2: YES IT WILL! Narrator: This continues for a while, bringing us to ... Act 1, Scene 2 -------------- Doofus 1: OK Mr Smarty Pants, what is the security problem? Doofus 2: See these two opcodes here: "push integer" and "dereference pointer"? Doofus 1: Yes. Doofus 2: Well, what happens when someone uses them one after the other? Let's say they push the integer 3, and then they attempt to dereference it. Hackers could use this ... Narrator: Ahem! The correct terminology is "cracker". Doofus 2: Sorry. Where was I? Oh yes! *Crackers* could use this to walk around areas of memory where they don't belong. Thus creating a security problem. Doofus 1: Damn! You're right. But we should be able to prevent that. Doofus 2: How? Doofus 1: We'll add a checking procedure that looks over the code for such code sequences, and then refuses the run the program if such a sequence is found. Doofus 2: Brilliant! Let's go code it up. Act 2, Scene 1 -------------- Narrator: Several months have gone by and the two doofuses have finally stumbled onto a very old truth ... Doofus 1: Damn, this is hard! What have we done wrong? Doofus 2: (reading an old computer science textbook that they should have been reading from day 1) Ah! Here it is! We have been trying to write a program that proves that every path through the code is heap-safe. Doofus 1: Yes, I know that! We've got 50,000 lines of code written and it's still full of security holes! Doofus 2: This is a variant of the halting problem, which is undecidable. In other words, there is no such algorithm. Doofus 1: Damn! Now what do we do? Doofus 2: Maybe we can avoid the halting problem if we add some extra information to the bytecode stream? Doofus 1: Yes! If we add types and stuff, we can use that to make the process decidable. Act 2, Scene 2 -------------- Narrator: Many more months go by. The two doofuses have discovered that once you start adding types to the bytecode stream, it never stops. Every little instruction must be fully checked, with a large amount of type information backing it up. Pretty soon, we get to ... Doofus 1 & 2 Together: Look at this cool object-oriented programming language that we have invented! Audience: Yawn! Seen one OO language; seen them all. Your stupid engine can now only run one such language and so you've lost generality. Doofus 1 & 2 Together: Yeah, but look at this cool object-oriented programming language that we have invented! Narrator: Sorry to interrupt, but what if some *cracker* were to free a block of memory and then dereference the same pointer afterwards? Wouldn't they be accessing invalid memory once more? Doofus 1 & 2 Together: Damn! We'll go modify our checker to prevent that as well. Act 3, Scene 1 -------------- Doofus 1: This just isn't working. The code paths are simply too long to analyse. There's no way to check for a dereference at some future point after a free. We're back to the halting problem again. Doofus 2: (reading a different old computer science textbook) What about getting rid of the free instruction and adding a garbage collector? Doofus 1: A what? Doofus 2: A garbage collector. (shows Doofus 1 the textbook) The problem is one of pointer liveness: while there is still a live pointer in the system, we can't really free the block. So why don't we have the system keep track of that instead of the programmer? Then the programmer can never mess things up. Doofus 1: Brilliant! Let's go code it up. Act 3, Scene 2 -------------- Doofus 1 & 2 Together: Look at this cool garbage-collected, object-oriented language that we have invented! Epilog ------ Narrator: We find two other doofuses in a different cubicle farm several years later ... Doofus 3: Stupid Doofus 1 & 2! They've designed a system that can only run one language. I've come up with an even better design that has more of the instructions that are needed to run C, C++, Pascal, whatever. Doofus 4: That's really cool! (thinks for a bit) But if we download the program over the Internet, won't security be a problem? Doofus 3: NO IT WON'T! Doofus 4: YES IT WILL! Narrator: And so it continues ... The End.