Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instruction RWInfo incomplete/incorrect in some cases #321

Open
ZehMatt opened this issue Jan 19, 2021 · 1 comment
Open

Instruction RWInfo incomplete/incorrect in some cases #321

ZehMatt opened this issue Jan 19, 2021 · 1 comment
Labels

Comments

@ZehMatt
Copy link
Contributor

ZehMatt commented Jan 19, 2021

There are some instructions that have hidden operands such as call/push/pop that would read/write the stack pointer. There are also some instructions where I remember the information being not entirely correct.

Heres what I have discovered so far, there might be more.

        switch (ins->id())
        {
            case x86::Inst::kIdSahf:
                regsRead.add(x86::rax);
                break;
            case x86::Inst::kIdImul:
                if (ins->opCount() == 1)
                {
                    regsWrite.add(x86::rax);
                    regsRead.add(x86::rdx);
                }
                break;
            case x86::Inst::kIdRdtscp:
                regsWrite.add(x86::rax);
                regsWrite.add(x86::rdx);
                regsWrite.add(x86::rcx);
                break;
            case x86::Inst::kIdRdtsc:
                regsWrite.add(x86::rax);
                regsWrite.add(x86::rdx);
                break;
            case x86::Inst::kIdDiv:
            case x86::Inst::kIdMul:
                regsWrite.add(x86::rax);
                regsWrite.add(x86::rdx);
                regsRead.add(x86::rax);
                regsRead.add(x86::rdx);
                break;
            case x86::Inst::kIdLahf:
                regsWrite.add(x86::rax);
                regsRead.add(x86::rax);
                break;
            case x86::Inst::kIdCmpxchg8b:
            case x86::Inst::kIdCmpxchg16b:
                regsRead.add(x86::rdx);
                regsRead.add(x86::rax);
                regsRead.add(x86::rcx);
                regsRead.add(x86::rbx);
                regsWrite.add(x86::rdx);
                regsWrite.add(x86::rax);
                break;
            case x86::Inst::kIdCmpxchg:
                regsRead.add(x86::rax);
                regsWrite.add(x86::rax);
                break;
            case x86::Inst::kIdCbw:
            case x86::Inst::kIdCwde:
            case x86::Inst::kIdCdqe:
                regsRead.add(x86::rax);
                regsWrite.add(x86::rax);
                break;
            case x86::Inst::kIdCwd:
            case x86::Inst::kIdCdq:
            case x86::Inst::kIdCqo:
                regsRead.add(x86::rax);
                regsWrite.add(x86::rdx);
                break;
            case x86::Inst::kIdPush:
            case x86::Inst::kIdPop:
            case x86::Inst::kIdRet:
            case x86::Inst::kIdPushfq:
            case x86::Inst::kIdPushfd:
            case x86::Inst::kIdPushf:
            case x86::Inst::kIdPopfq:
            case x86::Inst::kIdPopfd:
            case x86::Inst::kIdPopf:
                regsRead.add(x86::rsp);
                regsWrite.add(x86::rsp);
                break;
            case x86::Inst::kIdLods:
                regsWrite.add(x86::rsi);
                regsRead.add(x86::rsi);
                regsWrite.add(x86::rax);
                break;
            case x86::Inst::kIdStos:
                regsWrite.add(x86::rsi);
                regsRead.add(x86::rsi);
                regsRead.add(x86::rax);
                break;
            case x86::Inst::kIdMovs:
                regsWrite.add(x86::rsi);
                regsWrite.add(x86::rdi);
                regsRead.add(x86::rsi);
                regsRead.add(x86::rdi);
                break;
            case x86::Inst::kIdInt:
            case x86::Inst::kIdInt3:
            case x86::Inst::kIdCpuid:
                regsRead.add(x86::rax);
                regsRead.add(x86::rcx);
                regsWrite.add(x86::rax);
                regsWrite.add(x86::rcx);
                regsWrite.add(x86::rdx);
                regsWrite.add(x86::rbx);
                break;
        }

        // Repeat types
        if ((ins->instOptions() & x86::Inst::kOptionRep) || (ins->instOptions() & x86::Inst::kOptionRepne))
        {
            switch (ins->id())
            {
                case x86::Inst::kIdStos:
                case x86::Inst::kIdLods:
                case x86::Inst::kIdCmps:
                case x86::Inst::kIdScas:
                case x86::Inst::kIdOuts:
                case x86::Inst::kIdIns:
                case x86::Inst::kIdMovs:
                    regsWrite.add(x86::rcx);
                    regsRead.add(x86::rcx);
                    break;
            }
        }

Its possible that some things got fixed but most of them still lack the correct information.

I also have this code for flags

     if (ins->id() == asmjit::x86::Inst::kIdRcl || ins->id() == asmjit::x86::Inst::kIdRcr)
            flagsR &= ~(1 << 11);

So thats probably wrong too, most of this is was tested against Zydis Disassembler.

@kobalicek kobalicek added the bug label Jan 19, 2021
@kobalicek
Copy link
Member

kobalicek commented Jan 20, 2021

I think we will really need an API to convert an instruction into some canonical form, and then we will need to add hidden registers to the RW query, because now the query only considers only the operands it sees.

For the Compiler infrastructure the current design works well, because Compiler by itself cannot manage anything that would be hidden, so only explicit forms of instructions work. However, Assembler/Builder interfaces don't have such limitation and anyone can use implicit forms of instructions, which is actually what is the biggest problem in this case.

I'm kinda undecided whether supporting implicit forms was a mistake or whether asmjit should continue supporting such forms, or whether it should automatically make all inputs canonical (explicit) when RW query is used. That doesn't solve hidden registers though, so the API has to be extended in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants