Alexander Morou
Sponsor
Well, I'm not sure there's a forum for this, so I figured it would go under 'Application Development'. What I've been doing lately would be categorized as 'programming research', so I figured this would be fine. If it's in the wrong area, let me know.
Lately I've been further familiarizing myself with the Common Intermediate Language (CIL for short), the core language that drives the Common Language Infrastructure.
Today's research topic is arrays, specifically fixed-length literal arrays that you define in code. For example, in C♯:
Constant Arrays - Quick Initialization
Code:
byte[,,] o0 = new byte[,,] {
    {
        { 0x01, 0x02, 0x03, 0x1C },
        { 0x04, 0x05, 0x06, 0x1D },
        { 0x07, 0x08, 0x09, 0x1E }
    },
    {
        { 0x0A, 0x0B, 0x0C, 0x1F },
        { 0x0D, 0x0E, 0x0F, 0x20 },
        { 0x10, 0x11, 0x12, 0x21 }
    },
    {
        { 0x13, 0x14, 0x15, 0x22 },
        { 0x16, 0x17, 0x18, 0x23 },
        { 0x19, 0x1A, 0x1B, 0x24 }
    }
};
The code above creates a three-dimensional array of 3x3x4 elements, 36 bytes in total. You might think that C♯ would simply re-create your array with an array-create call and then assign each value individually, but it does one better.
It actually creates a class named <PrivateImplementationDetails>{Version-GUID} (let's call it 'PID' for short). That name is obviously invalid in C♯, but it is valid in CIL provided you escape it with single quotes. The 'Version-GUID' is a random GUID generated at compile time; an example of a generated class name is "<PrivateImplementationDetails>{17D2CA44-BFFB-4117-B3DF-49EC5806703D}". As for why it does this, I can only surmise that it deters someone from reverse-engineering the project and then depending upon the PrivateImplementationDetails of a given build.
If you use a fixed-length blob of data, like the one above, the compiler generates a structure named '__StaticArrayInitTypeSize=x', where 'x' is the number of bytes your data covers. The structure itself is just an empty shell, but here's the interesting part: it is packed per byte and given a fixed size equal to the length of your data. The compiler then creates a static field of that structure type and maps it to the data at a given .data location. For example:
Code:
.field assembly static valuetype '<PrivateImplementationDetails>{17D2CA44-BFFB-4117-B3DF-49EC5806703D}'/'__StaticArrayInitTypeSize=36' '$$method0x6000001-1' at I_000020D0
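To make the 'empty shell' idea concrete, here is a rough C♯ approximation of such a structure. This is only a sketch: the struct name StaticArrayInitTypeSize36 and the class ShellDemo are hypothetical stand-ins I made up (the real name is not legal C♯), and the attribute values simply mirror the per-byte packing and fixed size described above.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for '__StaticArrayInitTypeSize=36': no fields,
// byte packing, and a declared size of 36 bytes.
[StructLayout(LayoutKind.Explicit, Pack = 1, Size = 36)]
struct StaticArrayInitTypeSize36
{
}

static class ShellDemo
{
    public static void Main()
    {
        // Prints the declared size of the field-less struct.
        Console.WriteLine(Marshal.SizeOf(typeof(StaticArrayInitTypeSize36)));
    }
}
```

The point of the shell is only to reserve the right number of bytes for the static field; it never needs members of its own.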
The data is defined separately in the assembly. The IL version of it is:
Code:
.data cil I_000020D0 = bytearray (01 02 03 1C 04 05 06 1D 07 08 09 1E 0A 0B 0C 1F 0D 0E 0F 20 10 11 12 21 13 14 15 22 16 17 18 23 19 1A 1B 24)
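A quick way to convince yourself that the blob is nothing more than the array's elements in row-major order (last index varying fastest) is to flatten the array in C♯ and print it in the same notation. This is a minimal sketch: Buffer.ByteLength, Buffer.BlockCopy and BitConverter.ToString are standard framework calls, while the BlobCheck class and its method names are my own.

```csharp
using System;

static class BlobCheck
{
    // Copies the raw element bytes of a primitive-typed array in memory order.
    public static byte[] Flatten(Array source)
    {
        byte[] raw = new byte[Buffer.ByteLength(source)];
        Buffer.BlockCopy(source, 0, raw, 0, raw.Length);
        return raw;
    }

    // Formats bytes the way the listing shows them: "01 02 03 ...".
    public static string ToHex(byte[] raw)
    {
        return BitConverter.ToString(raw).Replace('-', ' ');
    }

    public static void Main()
    {
        byte[,,] o0 = new byte[,,] {
            { { 0x01, 0x02, 0x03, 0x1C }, { 0x04, 0x05, 0x06, 0x1D }, { 0x07, 0x08, 0x09, 0x1E } },
            { { 0x0A, 0x0B, 0x0C, 0x1F }, { 0x0D, 0x0E, 0x0F, 0x20 }, { 0x10, 0x11, 0x12, 0x21 } },
            { { 0x13, 0x14, 0x15, 0x22 }, { 0x16, 0x17, 0x18, 0x23 }, { 0x19, 0x1A, 0x1B, 0x24 } }
        };
        Console.WriteLine(ToHex(Flatten(o0)));
    }
}
```

The printed bytes should line up one-for-one with the bytearray in the .data declaration.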
So, if you wanted to hand-write CIL that .NET Reflector would show as the equivalent C♯, here's how you would do it, using the array shown above:
Code:
.method public static hidebysig void main() cil managed
{
    .entrypoint
    .maxstack 3
    .locals init (
        [0] uint8[0...,0...,0...] ja)
    // Load the three dimension lengths (3 x 3 x 4 = 36 elements) onto the stack.
    ldc.i4.3
    ldc.i4.3
    ldc.i4.4
    // Create the three-dimensional array and push its reference onto the stack.
    newobj instance void uint8[0...,0...,0...]::.ctor(int32, int32, int32)
    // Duplicate the reference on the stack, since we'll be passing
    // it to a method that doesn't return a value.
    dup
    // Load the RuntimeFieldHandle of the data blob onto the stack.
    ldtoken field valuetype '<PrivateImplementationDetails>{17D2CA44-BFFB-4117-B3DF-49EC5806703D}'/'__StaticArrayInitTypeSize=36' '<PrivateImplementationDetails>{17D2CA44-BFFB-4117-B3DF-49EC5806703D}'::'$$method0x6000001-1'
    // Initialize the array from the blob.
    call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
    // Store the array.
    stloc.0
    // Return; a method body must not fall off its end.
    ret
}
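Naturally, doing this kind of work by hand would be silly, but it can be done. I found out the hard way that you can't just use .NET Reflector to see what's what. The main reason is that .NET Reflector does all the work itself, without using reflection, which means it's prone to 'miss' things. Thankfully, ildasm does not. It took a while to find the specifics of how these things are managed, but it's fairly straightforward once you see how it's done.
I think it should be fairly easy to mimic this behavior when I make the CIL Translator. All I would need is a simple type-check of the array's element type; the rest is pretty easy. Arrays let you step through all dimensions with an enumerator, so you just determine the element size, iterate and gather the bytes that way, translate them to hexadecimal byte format, and you're done with the encoding.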
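The encoding step described above could look something like the following. It is a sketch under assumptions, not the translator itself: DataBlobEncoder, ElementBytes and Encode are hypothetical names, only a handful of integer element types are handled, and BitConverter.GetBytes produces bytes in the byte order of the machine running the code.

```csharp
using System;
using System.Collections.Generic;

static class DataBlobEncoder
{
    // Converts one element to its raw bytes; only a few primitive
    // types are handled in this sketch.
    static byte[] ElementBytes(object value)
    {
        if (value is byte) return new byte[] { (byte)value };
        if (value is short) return BitConverter.GetBytes((short)value);
        if (value is int) return BitConverter.GetBytes((int)value);
        if (value is long) return BitConverter.GetBytes((long)value);
        throw new NotSupportedException("Unsupported element type: " + value.GetType());
    }

    // Walks the array with its enumerator (row-major for multi-dimensional
    // arrays) and renders the bytes as a bytearray literal.
    public static string Encode(Array source)
    {
        Type elementType = source.GetType().GetElementType();
        if (!elementType.IsPrimitive)
            throw new NotSupportedException("Only primitive element types can be encoded.");

        List<string> hex = new List<string>();
        foreach (object element in source)
            foreach (byte b in ElementBytes(element))
                hex.Add(b.ToString("X2"));

        return "bytearray (" + string.Join(" ", hex) + ")";
    }

    public static void Main()
    {
        short[,] sample = { { 1, 2 }, { 3, 4 } };
        Console.WriteLine(Encode(sample));
    }
}
```

Going through the enumerator keeps the code independent of the array's rank, at the cost of boxing every element; for large arrays a block copy of the raw bytes would be the faster route.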
I'll be posting later on the specifics of short-circuiting. I have to program the transitory code necessary to handle operator overloads, implicit conversions and so on. It should be 'fun'; however, I plan on streamlining the process so I can use it in more than one area. Namely, there's the CIL Code translator, and there will also be the CILTranslator that builds types, methods et cetera using the Dynamic Type Building system introduced recently into .NET (I think in version 2.0, but you have to know CIL to use it, so it's not common knowledge).
I'm posting this here, in case anyone is interested.