Saturday, April 22, 2006

What is managed code?

Recently I have been pulling together some background information to improve my knowledge a bit further, and I thought I'd share it here.

What is managed code?

Managed code is code whose execution is managed by the .NET Framework Common Language Runtime (CLR). The term refers to a contract of cooperation between natively executing code and the runtime. This contract specifies that at any point of execution, the runtime may stop an executing CPU and retrieve information specific to the current CPU instruction address. The information that must be queryable generally pertains to runtime state, such as register or stack memory contents.
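To make the "contract" a little more concrete, here is a minimal Python sketch of the idea: a table that tells the runtime, for a given instruction address, which stack slots hold object references. Everything in it (FrameInfo, STACK_MAP, find_roots, the addresses and the slot layout) is made up for illustration; it is not how the CLR actually encodes this information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInfo:
    """Hypothetical per-address record a compiler could emit for the runtime."""
    method: str
    ref_slots: tuple  # indices of stack slots that hold object references


# Hypothetical table: instruction address -> description of the frame there.
STACK_MAP = {
    0x10: FrameInfo("Main", ref_slots=(0,)),
    0x14: FrameInfo("Main", ref_slots=(0, 2)),
}


def find_roots(pc, stack):
    """Return the object references that are live at instruction address pc."""
    info = STACK_MAP[pc]
    return [stack[i] for i in info.ref_slots]


# Slot 1 holds a plain integer, so it is never reported as a reference.
stack = ["order#1", 42, "customer#7"]
print(find_roots(0x14, stack))
```

Because the table is keyed by instruction address, the runtime can stop the code at one of the described addresses and still know which values are references and which are plain data.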

The necessary information is encoded in an Intermediate Language (IL) and associated metadata: symbolic information that describes all of the entry points and the constructs exposed in the IL (e.g., methods, properties) and their characteristics. The Common Language Infrastructure (CLI) Standard (of which the CLR is the primary commercial implementation) describes how the information is to be encoded, and programming languages that target the runtime emit the correct encoding. All a developer has to know is that any of the languages that target the runtime produce managed code, emitted as PE files that contain IL and metadata. There are many such languages to choose from: nearly 20 different languages are provided by third parties (everything from COBOL to Caml), in addition to C#, J#, VB .NET, JScript .NET, and C++ from Microsoft.

Before the code is run, the IL is compiled into native executable code. Since this compilation is performed by the managed execution environment (or, more correctly, by a runtime-aware compiler that knows how to target the managed execution environment), the managed execution environment can make guarantees about what the code is going to do. It can insert traps and appropriate garbage collection hooks, exception handling, type safety checks, array bounds and index checking, and so forth. For example, such a compiler makes sure to lay out stack frames and everything else just right so that the garbage collector can run in the background on a separate thread, constantly walking the active call stack, finding all the roots, and chasing down all the live objects. In addition, because the IL has a notion of type safety, the execution engine maintains the guarantee of type safety, eliminating a whole class of programming mistakes that often lead to security holes.
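As a rough illustration of a runtime adding checks on the program's behalf, here is a toy stack-machine interpreter in Python. The opcodes (push, load_elem, add, ret) and the IndexOutOfRange exception are invented for this sketch; they are only loosely inspired by IL and are not real CLR instructions or APIs.

```python
class IndexOutOfRange(Exception):
    """Raised by the toy runtime instead of reading outside the array."""


def run(program, arrays):
    """Execute a list of (opcode, operand) pairs on an evaluation stack."""
    stack = []
    for op, arg in program:
        if op == "push":          # push a constant
            stack.append(arg)
        elif op == "load_elem":   # pop an index, push arrays[arg][index]
            index = stack.pop()
            array = arrays[arg]
            # The check the runtime performs; the program never asked for it.
            if not 0 <= index < len(array):
                raise IndexOutOfRange(f"index {index}, length {len(array)}")
            stack.append(array[index])
        elif op == "add":         # pop two values, push their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "ret":         # return the value on top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise ValueError("program ended without 'ret'")


arrays = {"prices": [10, 20, 30]}
ok = [("push", 0), ("load_elem", "prices"),
      ("push", 2), ("load_elem", "prices"),
      ("add", None), ("ret", None)]
bad = [("push", 3), ("load_elem", "prices"), ("ret", None)]

print(run(ok, arrays))  # prices[0] + prices[2]
try:
    run(bad, arrays)
except IndexOutOfRange as err:
    print("rejected:", err)
```

The program itself contains no bounds check. Because every array access goes through the runtime's own load_elem handling, the out-of-range read in bad is rejected instead of silently touching memory it should not.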

Contrast this with the unmanaged world: an unmanaged executable file is basically a binary image of x86 code loaded into memory. The program counter gets put there, and that’s the last the OS knows. There are protections in place around memory management, port I/O, and so forth, but the system doesn’t actually know what the application is doing. Therefore, it can’t make any guarantees about what happens when the application runs.


Managed code is code executed by a .NET virtual machine, such as Microsoft's .NET Framework Common Language Runtime, Mono, or DotGNU.

In a Microsoft Windows environment, all other code has come to be known as unmanaged code. In non-Windows and mixed environments, managed code is sometimes used more generally to refer to any interpreted programming language.

"Managed" refers to a method of exchanging information between the program and the runtime environment. It is specified that at any point of execution, the runtime may stop an executing CPU and retrieve information specific to the current CPU instruction address. The information that must be accessible generally pertains to runtime state, such as processor register or stack memory contents.

The necessary information is then encoded in Common Intermediate Language (formerly known as Microsoft Intermediate Language) and associated metadata.

Before the code is run, the Intermediate Language is compiled into native machine code. Since this compilation is performed by the managed execution environment's own runtime-aware compiler, the managed execution environment can guarantee what the code is going to do. It can insert garbage collection hooks, exception handling, type safety checks, array bounds and index checking, etc.

This is traditionally referred to as just-in-time compilation. However, unlike most traditional just-in-time compilers, the file that holds the pseudo machine code that the virtual machine compiles into native machine code can also contain pre-compiled binaries for different native machines (e.g., x86 and PowerPC). This is similar in concept to the Apple Universal binary format.