Sunday, August 8, 2010

Just In Time Compiler for Managed Platform: Problem with exception handling

So far we have generated code that is 100% equivalent to the method's byte code. But when an exception is thrown this becomes a little problematic. Let me explain.

For each generated method we have a Prolog and an Epilog that must be executed. If we skip the Epilog, the native stack that is being used by the VM itself becomes unstable.

Also, since we have already compiled the entire method, we should not do a separate compilation for the exception handler.

So we want to execute the Epilog and also want to use the already compiled handler code.

This part does not seem hard, but we don't know in advance which method will catch the exception.

To solve the problem we can add an exception block at the end of each method (a few bytes of overhead per method) and return a value from the native code to signal an exception. The return value we have used so far is always zero. So for an exception we can return any non-zero value and still do the cleanup.

When we find an exception block we execute that block and continue. The way the compiler lays out a method's byte code already manages the code flow properly, so our compiler has already converted that code to native code and we don't need anything extra.
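
The idea can be sketched in plain C++ without any generated code. Everything named here (`Frame`, `CompiledMethod`, `RunMethod`, the two status constants) is hypothetical and not part of the VM from these posts. The only things taken from the text are that a method returns zero normally and non-zero when an exception is pending, and that the Epilog always runs.

```cpp
#include <cassert>
#include <functional>

// Hypothetical status codes: the post only says "zero" and "any non-zero value".
const int STATUS_OK = 0;
const int STATUS_EXCEPTION = 1;

struct Frame {
    bool epilogRan = false;   // set once the Epilog (cleanup) has executed
    bool handlerRan = false;  // set once this frame's exception block has run
};

// Stand-in for a compiled method: a body that returns a status,
// plus a flag saying whether the method has an exception block.
struct CompiledMethod {
    std::function<int()> body;
    bool hasHandler;
};

// Runs the body, then the exception block if one is needed and present,
// and always the Epilog. A non-zero result propagates to the caller.
int RunMethod(const CompiledMethod& m, Frame& frame) {
    int status = m.body();
    if (status != STATUS_OK && m.hasHandler) {
        frame.handlerRan = true;  // exception block at the end of the method
        status = STATUS_OK;       // handled here, so execution continues normally
    }
    frame.epilogRan = true;       // the Epilog is never skipped
    return status;
}
```

A method without a handler simply passes the non-zero status up, so the first caller that has an exception block ends up handling it, and every frame in between has still run its Epilog.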

Just In Time Compiler for Managed Platform- Part 7: Native Methods

Executing native methods is the easiest of all :D. We don't need to generate anything for them, since they are already native.

We cannot accept just any native method for execution: it has to understand our stack. For native methods we fix the stack top so that it points to the actual data. Since there is nothing to generate, we don't need the clever stack pointer.

The signature of each native method must match the following type signature:

typedef Variable (*NativeMethod)(Context* pContext);


Let's define our helper for native methods. Here too we do not deal with machine instructions, since there is nothing to parse (well, except the return type of the method).
u4 ExecuteNativeMethod(Context* pContext, CString strClassName, CString strMethod, CString strDesc)
{    
    JavaClass *pClass = pContext->pClass;    
    LOG(_T("Execute NativeMethod %s.%s%s  \n"),strClassName , strMethod, strDesc);
    CString strSignature= GetNativeMethodSignature(strClassName, strMethod, strDesc);
    NativeMethod nativeMethod=pContext->pVMEnv->pNativeMethodProvider->GetNativeMethod(strSignature);

    if(nativeMethod == NULL)
    {        
        ASSERT(FALSE);
        return -1;
    }
    else
    {
        Variable retVal = nativeMethod(pContext);

        // A non-void method leaves its return value on the stack.
        if(strDesc.Find(_T(")V")) < 0)
        {
            if(strDesc.Find(_T(")J")) < 0 && (strDesc.Find(_T(")D")) < 0))
            {
                // Single-slot return value. todo: validate
                pContext->stack[0]=retVal;
            }
            else
            {
                // long (J) and double (D) occupy two slots.
                pContext->stack[0].intValue=0;
                pContext->stack[1]=retVal;
            }
        }
        }
    }
    return 0;
}
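
The return-type check above looks only at the character after the closing parenthesis of the method descriptor. As a small sketch of that rule in standard C++ (using `std::string` instead of `CString`; `ReturnSlotCount` is a hypothetical name, not a function from the VM):

```cpp
#include <cassert>
#include <string>

// Number of stack slots the return value occupies, judged from the method
// descriptor: 0 for void (V), 2 for long (J) and double (D), 1 otherwise.
int ReturnSlotCount(const std::string& desc) {
    std::string::size_type close = desc.rfind(')');
    // A valid method descriptor has a ')' followed by the return type.
    assert(close != std::string::npos && close + 1 < desc.size());
    char kind = desc[close + 1];
    if (kind == 'V') return 0;
    if (kind == 'J' || kind == 'D') return 2;
    return 1;  // int, float, references, arrays (including "[J"), ...
}
```

For example, `ReturnSlotCount("(II)I")` gives 1 and `ReturnSlotCount("(J)J")` gives 2, which matches the one-slot and two-slot branches in `ExecuteNativeMethod`.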



OK, let's now define a simple native method that adds two numbers:

Variable Add(Context* pContext)
{    
    Variable returnVal;
    // In native methods stackTop already points at the actual top value.
    returnVal.intValue 
          = pContext->stack[pContext->stackTop].intValue + pContext->stack[pContext->stackTop-1].intValue;
    return returnVal;
}
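
To try `Add` outside the VM, something like the following is enough. This is only a sketch: the `Variable` and `Context` here are cut-down stand-ins for the real types (which have more members), and `CallWithTwoInts` is a hypothetical test helper.

```cpp
#include <cassert>

// Simplified stand-ins for the VM types; the real ones have more members.
union Variable {
    int intValue;
    void* ptrValue;
};

struct Context {
    Variable* stack;
    int stackTop;  // for native methods this indexes the actual top value
};

typedef Variable (*NativeMethod)(Context* pContext);

// Same body as the Add method above.
Variable Add(Context* pContext) {
    Variable returnVal;
    returnVal.intValue = pContext->stack[pContext->stackTop].intValue
                       + pContext->stack[pContext->stackTop - 1].intValue;
    return returnVal;
}

// Hypothetical helper: puts two ints on a tiny stack and calls a native method.
int CallWithTwoInts(NativeMethod method, int a, int b) {
    Variable slots[2];
    slots[0].intValue = a;
    slots[1].intValue = b;
    Context ctx = { slots, 1 };  // stackTop = 1 points at b, the top value
    return method(&ctx).intValue;
}
```

Calling `CallWithTwoInts(Add, 2, 3)` returns 5. Any function with the `NativeMethod` signature can be passed the same way.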


That's all for native methods for now. We'll add some native methods that dynamically load native methods later.

Friday, August 6, 2010

Just In Time Compiler for Managed Platform- Part 6: Put Field and Get Field

Now we have our object on the heap. We need a mechanism for putfield and getfield. The operation is very simple indeed.

First we need a helper to get the index of a field (variable) in the object's memory on the heap:

//And also we put the method names in HelperMethods structure


int GetFieldIndex(JavaClass *pTargetClass, Context *pContext, u2 index)
{
    CONSTANT_Fieldref_info *pFieldInfo 
                 = (CONSTANT_Fieldref_info *)pContext->pClass->constant_pool[index];
    
    ASSERT(pFieldInfo->tag == CONSTANT_Fieldref); 

    u2 classIndex = getu2(&((u1 *)pFieldInfo)[1]);
    u2 nameAndTypeIndex = getu2(&((u1 *)pFieldInfo)[3]);

    CONSTANT_NameAndType_info *pNameTypeInfo 
                  = (CONSTANT_NameAndType_info *)  pContext->pClass->constant_pool[nameAndTypeIndex];

    ASSERT(pNameTypeInfo->tag == CONSTANT_NameAndType);

    u2 nameIndex = getu2(&((u1 *)pNameTypeInfo)[1]);
    u2 descIndex = getu2(&((u1 *)pNameTypeInfo)[3]);
    CString strFieldName, strFieldDesc;
    
    if(!pContext->pClass->GetStringFromConstPool(nameIndex, strFieldName))  
    {
        ASSERT(FALSE); 
        return -1;
    }
    
    if(!pContext->pClass->GetStringFromConstPool(descIndex, strFieldDesc))  
    {
        ASSERT(FALSE);
        return -2;
    }

    int superClassSize = 0;
   
    JavaClass *pCurClass = pTargetClass;
    
    int fieldIndex = -1;
    while(true)
    {        
        fieldIndex = pCurClass->GetFieldIndex(strFieldName, strFieldDesc);
        // Step to the parent of the class we just searched (not always the target's parent).
        pCurClass = pCurClass->GetSuperClass();
        if(fieldIndex >= 0)
        {
            fieldIndex += pCurClass ? pCurClass->GetObjectFieldCount() : 0;
            break;
        }
        else
        {            
            if(!pCurClass)
            {
                break;
            }
        }
    }

    ASSERT(fieldIndex>=0);
    return fieldIndex;
}
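
The superclass walk is easier to see on a stripped-down model. The sketch below is not the VM's `JavaClass`: `ClassSketch` and `ResolveFieldIndex` are hypothetical, and it assumes that the object field count of a class includes the fields it inherits, so that a class's own fields are laid out after all of its parent's fields.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ClassSketch {
    const ClassSketch* super;         // nullptr for the root class
    std::vector<std::string> fields;  // fields declared by this class only

    // Assumption: the count covers inherited fields as well.
    int ObjectFieldCount() const {
        return static_cast<int>(fields.size())
             + (super ? super->ObjectFieldCount() : 0);
    }
};

// Searches cls and then its ancestors. The result is the index local to the
// declaring class plus the number of fields inherited by that class.
int ResolveFieldIndex(const ClassSketch* cls, const std::string& name) {
    for (const ClassSketch* cur = cls; cur != nullptr; cur = cur->super) {
        for (std::size_t i = 0; i < cur->fields.size(); ++i) {
            if (cur->fields[i] == name) {
                int inherited = cur->super ? cur->super->ObjectFieldCount() : 0;
                return inherited + static_cast<int>(i);
            }
        }
    }
    return -1;  // not found anywhere in the chain
}
```

With a base class declaring `x` and `y` and a derived class declaring `z`, the indices come out as 0, 1 and 2, whether the lookup starts from the derived class or the base.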


Now we define the methods that execute the two instructions. We do not generate code for these operations. Instead we call back from the generated code, since the operation is static and no parsing or instruction execution is required.

#define PUT_FIELD_HELPER_INDEX 4
#define GET_FIELD_HELPER_INDEX 5 

void PutField(Context *pContext, u2 index)
{
    Variable obj=pContext->stack[pContext->stackTop-2];
    Variable value=pContext->stack[pContext->stackTop-1];
    Variable *pVarList = pContext->pVMEnv->pObjectHeap->GetObjectPointer(obj.object);

    JavaClass *pTargetClass = (JavaClass *)pVarList[0].ptrValue;
    ASSERT(pTargetClass && pTargetClass->magic == 0xCAFEBABE);

    int fieldIndex = GetFieldIndex(pTargetClass, pContext, index);
    
    pVarList[fieldIndex+1]=value;

    pContext->stackTop-=2;
}

void GetField(Context *pContext, u2 index)
{
    Variable obj=pContext->stack[pContext->stackTop-1]; 
    Variable *pVarList=pContext->pVMEnv->pObjectHeap->GetObjectPointer(obj.object);

    JavaClass *pTargetClass = (JavaClass *)pVarList[0].ptrValue;
    ASSERT(pTargetClass && pTargetClass->magic == 0xCAFEBABE);

    int fieldIndex = GetFieldIndex(pTargetClass, pContext, index);

    pContext->stack[pContext->stackTop-1]=pVarList[fieldIndex+1];
}
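
Both helpers rely on the same object layout: slot 0 of the object's variable list holds the class pointer, and field n lives in slot n + 1. A minimal sketch of that layout, with a `std::vector` standing in for the object heap (`ClassInfo`, `ObjectSlots`, `NewObject`, `StoreField` and `LoadField` are hypothetical names):

```cpp
#include <cassert>
#include <vector>

union Variable { int intValue; void* ptrValue; };

struct ClassInfo { int fieldCount; };  // hypothetical stand-in for JavaClass

typedef std::vector<Variable> ObjectSlots;

// Allocates one slot for the class pointer plus one per field.
ObjectSlots NewObject(ClassInfo* cls) {
    ObjectSlots slots(cls->fieldCount + 1);
    slots[0].ptrValue = cls;  // slot 0: owning class
    return slots;
}

// putfield: field n is stored in slot n + 1.
void StoreField(ObjectSlots& obj, int fieldIndex, Variable value) {
    assert(fieldIndex >= 0 && fieldIndex + 1 < static_cast<int>(obj.size()));
    obj[fieldIndex + 1] = value;
}

// getfield: field n is read from slot n + 1.
Variable LoadField(const ObjectSlots& obj, int fieldIndex) {
    assert(fieldIndex >= 0 && fieldIndex + 1 < static_cast<int>(obj.size()));
    return obj[fieldIndex + 1];
}
```

The `+1` in `pVarList[fieldIndex+1]` above is exactly this skip over the class pointer.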


Now we generate instruction for them. Here is how we do it for putfield:

void EmitCallPutField(u1* code, int &ip, u4 index)
{
    u1 c[]={
        //((void (*)(Context *pContext, u2 index))pContext->pVMEnv->ppHelperMethods[PUT_FIELD_HELPER_INDEX])(pContext, 0x1234);
        0x8B, 0xF4, //            mov         esi,esp 
        0x68, 0x00, 0x00, 0x00, 0x00, //   push        index 
        0x8B, 0x45, 0x08, //         mov         eax,dword ptr [pContext] 
        0x50, //               push        eax  
        0x8B, 0x4D, 0x08, //         mov         ecx,dword ptr [pContext] 
        0x8B, 0x51, 0x10, //         mov         edx,dword ptr [ecx+10h] 
        0x8B, 0x42, 0x08, //         mov         eax,dword ptr [edx+8] 
        0x8B, 0x48, 0x10, //         mov         ecx,dword ptr [eax+10h] 
        0xFF, 0xD1, //            call        ecx  
        0x83, 0xC4, 0x08, //         add         esp,8 
    };

    memcpy(c+3, &index, sizeof(index));
    memcpy(&code[ip], c, sizeof(c));
    ip+=sizeof(c);
}

void ExecutePutField(u1* code, int& ip, u2 index)
{
    EmitCallPutField(code, ip, index);
}

For getfield we just need to change the helper method pointer offset (add 4, so 10h becomes 14h):

mov ecx,dword ptr [eax+14h] 
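
The two things that vary between these emitters are the 32-bit immediate that gets patched into the `push` and the displacement that selects the helper slot. Here is a sketch of both steps on a byte vector. The function names are made up for illustration, and the displacement rule (helper index times 4) assumes 4-byte pointers as on IA-32, which agrees with index 4 giving 10h and index 5 giving 14h.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Appends "push imm32" (opcode 0x68) with the operand patched in via memcpy,
// the same way the emitters above patch the index into their templates.
void EmitPushImm32(std::vector<uint8_t>& code, uint32_t value) {
    uint8_t tmpl[] = { 0x68, 0x00, 0x00, 0x00, 0x00 };
    std::memcpy(tmpl + 1, &value, sizeof(value));
    code.insert(code.end(), tmpl, tmpl + sizeof(tmpl));
}

// Appends "mov ecx, dword ptr [eax+disp8]" (8B 48 disp8), where disp8 selects
// an entry in a table of 4-byte helper pointers.
void EmitLoadHelper(std::vector<uint8_t>& code, int helperIndex) {
    int disp = helperIndex * 4;        // 4-byte pointers on IA-32
    assert(disp >= 0 && disp <= 127);  // must fit in a signed 8-bit displacement
    code.push_back(0x8B);
    code.push_back(0x48);
    code.push_back(static_cast<uint8_t>(disp));
}
```

Generating the displacement from the helper index like this would let putfield, getfield and new share one emitter instead of three nearly identical byte arrays.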

So we can assign a value to a class field and get that value back when we need it. Next we need array handling, which comes in the next post.

Thursday, August 5, 2010

Just In Time Compiler for Managed Platform- Part 5: Creating new object on heap

Let's create an object on the heap today.

Since we have the object creation code in JavaClass it is very easy to create an object on the heap. We just call the CreateObject method of the current class that is already pushed on the stack by previous instructions:

int CreateNewObject(Context *pContext, u2 index)
{
    if(!pContext->pClass->CreateObject(index, pContext->pVMEnv->pObjectHeap, pContext->stack[pContext->stackTop].object))
        return -1;
    pContext->stackTop++;
    return 0;
}


Now let us create the helper methods for the new instruction:

void EmitExecuteNew(u1* code, int &ip, u4 index)
{
    u1 c[] = {    
        //((int (*)(Context *pContext, u2 index))pContext->pVMEnv->ppHelperMethods[EXECUTE_NEW_HELPER_INDEX])(pContext, index);
        0x8B, 0xF4, //            mov         esi,esp 
        0x68, 0x00, 0x00, 0x00, 0x00, //   push        index 
        0x8B, 0x45, 0x08, //         mov         eax,dword ptr [pContext] 
        0x50, //               push        eax  
        0x8B, 0x4D, 0x08, //         mov         ecx,dword ptr [pContext] 
        0x8B, 0x51, 0x10, //         mov         edx,dword ptr [ecx+10h] 
        0x8B, 0x42, 0x08, //         mov         eax,dword ptr [edx+8] 
        0x8B, 0x48, 0x04, //         mov         ecx,dword ptr [eax+4] 
        0xFF, 0xD1, //            call        ecx  
        0x83, 0xC4, 0x08, //         add         esp,8 
    };

    memcpy(c+3, &index, sizeof(index));
    memcpy(&code[ip], c, sizeof(c));
    ip+=sizeof(c);
}

void ExecuteNew(u1* code, int& ip, u2 index)
{
    EmitExecuteNew(code, ip, index);
}


The call mechanism that we built for calling methods is used here to call back into the object creation method.

And from the Compile method we just add a call for the new instruction:

case _new:// 187(0xbb) -'new' is a keyword in C++ so we use '_new' :)
        ExecuteNew(codes, ip, getu2(&bc[pc+1]));
        pc+=3;
        break;
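
The operand of `new` is a two-byte constant pool index stored high byte first, as everything in a class file is. The body of `getu2` is not shown in these posts, so the sketch below uses a hypothetical `ReadU2` that is only assumed to behave the same way, plus a hypothetical `DecodeNewOperand` that mirrors the `case _new` step.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t u1;
typedef uint16_t u2;

// Assumed behaviour of getu2: read two bytes, high byte first (big-endian).
u2 ReadU2(const u1* p) {
    return static_cast<u2>((p[0] << 8) | p[1]);
}

// Reads the constant pool index of a 'new' instruction (opcode 0xBB) at pc
// and advances pc past the 3-byte instruction.
u2 DecodeNewOperand(const u1* bc, int& pc) {
    assert(bc[pc] == 0xBB);
    u2 index = ReadU2(&bc[pc + 1]);
    pc += 3;  // 1 opcode byte + 2 operand bytes
    return index;
}
```

For the byte sequence `BB 00 02`, the decoded index is 2 and `pc` moves forward by 3, which is what the `pc+=3` in the compile loop does.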

We actually have all the basic mechanisms built. We just need to add things like native method support, a garbage collector, and helpers for the remaining instructions. Those will be written mostly in C++: since there is nothing to parse for them, the optimizing C++ compiler does all the good work for us.

Tuesday, August 3, 2010

Just In Time Compiler for Managed Platform- : Why stack is implemented wrong?

Well, it's not really wrong, it's just efficient.

The stack top is always top value + 1. This is not the usual behaviour for a stack, but in our operations we decrement the value first. That makes the offset zero, which is efficient in terms of machine cycles.

But why do that in the first place?

This is because if we do not decrement the stack top before executing the instruction, things become complex for jmp (branch) instructions. Since we must decrement the stack top no matter what, we do this first.

If we used a negative offset instead, it would take some extra cycles to decode the instruction with the offset compared to one without it.

This post is just to make sure we remember that the stack top value points to invalid data. We must decrement it by one to get the top value.
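
In C++ terms the convention looks like this. `IntStack` is a hypothetical toy type that only illustrates the indexing; it is not the VM's stack.

```cpp
#include <cassert>

struct IntStack {
    int slots[16];
    int top = 0;  // index of the next free slot, i.e. top value + 1

    void Push(int v) {
        assert(top < 16);
        slots[top++] = v;
    }

    // Decrement first, then read at offset zero from the new top.
    int Pop() {
        assert(top > 0);
        --top;
        return slots[top];
    }

    // Reading without popping needs the -1 offset.
    int Peek() const {
        assert(top > 0);
        return slots[top - 1];
    }
};
```

After pushing 10 and 20, `top` is 2 and `slots[top]` is not a valid value. `Peek()` reads `slots[1]`, and `Pop()` decrements `top` to 1 before reading the same slot.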

Monday, August 2, 2010

Just In Time Compiler for Managed Platform- Part 4A: Conditional Branch Correction

The Jxx instructions I used for branching were not working right: they treated the values as unsigned. The "IA-32 Intel Architecture Software Developer's Manual, Vol 2" describes this (page 3-355): The terms "less" and "greater" are used for comparisons of signed integers and the terms "above" and "below" are used for unsigned integers. So, for signed comparison, we use JL, JG, JLE and JGE instead of JB, JA, JBE and JAE.

With that change all branching instructions seem to be working now. We need two helper methods for all of them.
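
The difference shows up as soon as a negative number is involved. The sketch below compares the two interpretations in C++ and lists the second opcode byte of the two-byte near Jcc encoding (`0F 8x rel32`). The enum is only an illustration, and it is an assumption that the JL, JLE and similar constants used by the emitters in this post hold these same byte values.

```cpp
#include <cassert>
#include <cstdint>

// Second opcode byte of the near conditional jump encoding 0F 8x rel32.
enum JccOpcode : uint8_t {
    // unsigned conditions ("below" / "above")
    JB = 0x82, JAE = 0x83, JBE = 0x86, JA = 0x87,
    // equality
    JE = 0x84, JNE = 0x85,
    // signed conditions ("less" / "greater")
    JL = 0x8C, JGE = 0x8D, JLE = 0x8E, JG = 0x8F
};

// The condition JL tests after "cmp a, b": signed less-than.
bool SignedLess(int32_t a, int32_t b) {
    return a < b;
}

// The condition JB tests after "cmp a, b": the same bits compared as unsigned.
bool UnsignedBelow(int32_t a, int32_t b) {
    return static_cast<uint32_t>(a) < static_cast<uint32_t>(b);
}
```

For -1 and 0, `SignedLess` is true but `UnsignedBelow` is false, because -1 reinterpreted as unsigned is 0xFFFFFFFF. That is why `iflt` on a negative value never branched when JB was used.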

First one is comparison with zero:

void IfXX(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap, u1 XX)
{
     u1 c[] = {
        //pContext->stackTop--;
        0x8B,0x45,0x08,         // mov         eax,dword ptr [pContext] 
        0x8B,0x48,0x04,         // mov         ecx,dword ptr [eax+4] 
        0x83,0xE9,0x01,         // sub         ecx,1 
        0x8B,0x55,0x08,         // mov         edx,dword ptr [pContext] 
        0x89,0x4A,0x04,         // mov         dword ptr [edx+4],ecx 

        //if(pContext->stack[pContext->stackTop].intValue [XXoperator] 0)                
        0x8B, 0x45, 0x08, //         mov         eax,dword ptr [pContext] 
        0x8B, 0x48, 0x04, //         mov         ecx,dword ptr [eax+4] 
        0x8B, 0x55, 0x08, //         mov         edx,dword ptr [pContext] 
        0x8B, 0x02, //            mov         eax,dword ptr [edx]
        0x83, 0x3C, 0xC8, 0x00, //      cmp         dword ptr [eax+ecx*8],0         
        0x0F, 0x00, 0x00, 0x00, 0x00, 0x00, // JXX         
     };

    memcpy(&code[ip], c, sizeof(c));
    ip+=sizeof(c);

    code[ip-5] = XX;
    CreateJmpLink(&code[ip-5], targetpc, pJmpTargetMap);
}

void Ifle(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JLE);
}

void Ifeq(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JE);
}

void Ifne(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JNE);
}

void Iflt(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JL);
}

void Ifge(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JGE);
}

void Ifgt(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfXX(code, ip, targetpc, pJmpTargetMap, JG);
}
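
`CreateJmpLink` is not shown in this post. The part that every such fix-up has to get right is the operand of the jump: a near Jcc takes a 32-bit displacement relative to the address of the next instruction. A sketch of just that calculation, with offsets into a byte buffer standing in for native addresses (`PatchJccTarget` is a hypothetical name and not the VM's implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes the rel32 operand of a 6-byte near Jcc (0F 8x rel32) that starts at
// jccOffset so that it jumps to targetOffset within the same buffer.
void PatchJccTarget(std::vector<uint8_t>& code, int jccOffset, int targetOffset) {
    assert(jccOffset >= 0 && jccOffset + 6 <= static_cast<int>(code.size()));
    assert(code[jccOffset] == 0x0F);  // first byte of the two-byte opcode
    // The displacement is measured from the end of the jump instruction.
    int32_t rel = targetOffset - (jccOffset + 6);
    std::memcpy(&code[jccOffset + 2], &rel, sizeof(rel));
}
```

A jump at offset 2 to offset 12 gets a displacement of 4, and the same jump retargeted to offset 0 gets -8. Forward branches to byte code that has not been compiled yet can only be patched once the target's native offset is known, which is presumably what the jump target map is for.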



The second one is comparison of any two numbers:

void IfICmpXX(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap, u1 XX)
{
    u1 c[]={
         //pContext->stackTop -= 2;
         0x8B, 0x45, 0x08, //         mov         eax,dword ptr [pContext] 
         0x8B, 0x48, 0x04, //         mov         ecx,dword ptr [eax+4] 
         0x83, 0xE9, 0x02, //         sub         ecx,2 
         0x8B, 0x55, 0x08, //         mov         edx,dword ptr [pContext] 
         0x89, 0x4A, 0x04, //         mov         dword ptr [edx+4],ecx 

         //if(!(pContext->stack[pContext->stackTop -2+2].intValue [XXOperator] pContext->stack[pContext->stackTop-1+2].intValue))
         0x8B, 0x45, 0x08, //         mov         eax,dword ptr [pContext] 
         0x8B, 0x48, 0x04, //         mov         ecx,dword ptr [eax+4] 
         0x8B, 0x55, 0x08, //         mov         edx,dword ptr [pContext] 
         0x8B, 0x02, //            mov         eax,dword ptr [edx] 
         0x8B, 0x55, 0x08, //         mov         edx,dword ptr [pContext] 
         0x8B, 0x52, 0x04, //         mov         edx,dword ptr [edx+4] 
         0x8B, 0x75, 0x08, //         mov         esi,dword ptr [pContext] 
         0x8B, 0x36, //            mov         esi,dword ptr [esi] 
         0x8B, 0x04, 0xC8, //         mov         eax,dword ptr [eax+ecx*8] 
         0x3B, 0x44, 0xD6, 0x08, //      cmp         eax,dword ptr [esi+edx*8+8] 
         0x0F, 0x00, 0x00, 0x00, 0x00, 0x00, // JXX  
    };

    memcpy(&code[ip], c, sizeof(c));
    ip+=sizeof(c);

    code[ip-5] = XX;

    CreateJmpLink(&code[ip-5], targetpc, pJmpTargetMap);
}

void IfIcmple(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JLE);
}

void IfIcmpne(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JNE);  
}

void IfIcmpge(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JGE);
}

void IfIcmplt(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JL);
}

void IfIcmpgt(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JG);
}

void IfIcmpeq(u1* code, int& ip, int targetpc, CMapPtrToPtr *pJmpTargetMap)
{
    IfICmpXX(code, ip, targetpc, pJmpTargetMap, JE);
}


That's it. We have all the branching instructions working correctly now.