    Introduce support for dynamic binaries · 61cfbdfc
    Alessandro Di Federico authored
    This commit introduces support for dynamic programs. The current
    implementation translates the main binary and uses native libraries. This
    works only if the target architecture is the same as the source
    one. Currently we only handle x86-64.
    
    * The `ExternalJumpsHandler` class has been introduced. It extends the
      dispatcher to handle the case in which the program counter is an
      address outside the range of executable addresses of the input
      program. In this case, a `setjmp` is performed, the CPU state is
      serialized to physical registers and a jump to the value of the
      program counter is performed.
    
      When the target code tries to return to the translated program, a
      segmentation fault is triggered, a `longjmp` is performed and the
      CPU state is deserialized so that execution can resume (from the
      dispatcher).
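
The dispatcher check described above can be pictured with a small Python
model. This is only a sketch under stated assumptions: the names
`is_external` and `dispatch`, the half-open range representation and the
example addresses are all hypothetical and do not come from this commit.
The real logic lives in the `ExternalJumpsHandler` class and in the
generated code, and the `setjmp`/`longjmp` steps are not modelled here.

```python
from typing import List, Set, Tuple

# Hypothetical representation: half-open [start, end) address ranges that
# were executable in the input program.
Range = Tuple[int, int]


def is_external(pc: int, executable_ranges: List[Range]) -> bool:
    """Return True if pc lies outside every executable range."""
    return not any(start <= pc < end for start, end in executable_ranges)


def dispatch(pc: int, executable_ranges: List[Range],
             translated_blocks: Set[int]) -> str:
    """Decide, in this toy model, how a program counter value is handled."""
    if is_external(pc, executable_ranges):
        # In the real system this is where the CPU state would be
        # serialized to physical registers before jumping to native code.
        return "external"
    if pc in translated_blocks:
        return "translated"
    # Inside the input program, but not a known jump target.
    return "unknown"
```

For instance, with a single executable range `(0x400000, 0x401000)`, an
address such as `0x7f0000001000` is classified as external, while
`0x400010` is handled by the translated code if it is a known block.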
    
    * `early-linked.c` has been introduced. Its purpose is to provide
      declarations of variables and functions defined in `support.c`. In the
      past, we had to create them manually, a cumbersome and error-prone
      process that we now avoid by letting `clang` compile `early-linked.c`
      and then linking it in.
    
    * The old `support.h` is now known as `commonconstants.h`. `support.h`
      now contains declarations that have to be consumed by
      `early-linked.c`.
    
    * Each architecture now provides additional information:
    
      1. Which registers are part of the ABI and have to be preserved. If
         necessary, the QEMU name can be provided. For each register it's
         also possible to provide its position within the `mcontext_t`
         structure that the signal handler receives.
      2. Three assembly snippets: one to write a register, one to read it
         and one to perform an indirect jump.
    
      Some of this information is also exposed in the output module as
      metadata.
    
    * `support.c` now installs a SIGSEGV signal handler. Since pages that
      were originally executable are no longer executable, jumping there
      (typically, from a library) will trigger a SIGSEGV that we will
      handle. This allows us to properly deserialize the CPU state and
      resume execution of the translated code.
    
    * A dynamic version of each test program is now also translated and
      tested.
    
    * The `merge-dynamic.py` script has been introduced: it takes care of
      rewriting the translated binary so that the dynamic linker performs
      both the relocations of the translated program and the relocations
      of the original program. It does so by rewriting a large portion of
      the sections employed by the dynamic linker, such as `.dynamic`,
      `.dynsym` and so on.
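
To give an idea of what rewriting `.dynamic` involves, here is a minimal
sketch of reading and re-serializing its entries. It is not taken from
`merge-dynamic.py`: the function names `parse_dynamic` and `build_dynamic`
are hypothetical, and the sketch assumes a little-endian ELF64 file (as on
x86-64), where each entry is a signed 64-bit tag followed by a 64-bit
value and the table ends with a `DT_NULL` entry.

```python
import struct
from typing import List, Tuple

DT_NULL = 0    # marks the end of the .dynamic table
DT_NEEDED = 1  # value is an offset into the dynamic string table

# Elf64_Dyn on a little-endian target: d_tag (signed), then d_val/d_ptr.
ENTRY = struct.Struct("<qQ")


def parse_dynamic(data: bytes) -> List[Tuple[int, int]]:
    """Decode (tag, value) pairs, stopping at the DT_NULL terminator."""
    entries = []
    for offset in range(0, len(data) - ENTRY.size + 1, ENTRY.size):
        tag, value = ENTRY.unpack_from(data, offset)
        if tag == DT_NULL:
            break
        entries.append((tag, value))
    return entries


def build_dynamic(entries: List[Tuple[int, int]]) -> bytes:
    """Encode (tag, value) pairs and append the DT_NULL terminator."""
    body = b"".join(ENTRY.pack(tag, value) for tag, value in entries)
    return body + ENTRY.pack(DT_NULL, 0)
```

Merging two tables is more than concatenating their entries: values such
as the `DT_NEEDED` string offsets refer to other sections, so they have to
be adjusted when those sections are rewritten as well.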
    
    * The `compile-time-constants.py` script has been introduced: it runs a
      user-specified compiler on a source file, producing an object
      file. This object file is inspected and the values of its global
      read-only variables are emitted as CSV.
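
The overall flow of such a script could look like the sketch below. It is
an illustration only, not the code of `compile-time-constants.py`: the
function names, the `-c`/`-o` invocation (accepted by common C compilers
such as `clang` and `gcc`) and the `name,value` CSV layout are assumptions,
and the step that inspects the object file is left out because it depends
on an ELF parser.

```python
import csv
import io
import subprocess
from typing import Dict, List


def compile_command(compiler: str, source: str, output: str) -> List[str]:
    """Build the command line that compiles source into an object file."""
    return [compiler, "-c", source, "-o", output]


def compile_to_object(compiler: str, source: str, output: str) -> None:
    """Run the user-specified compiler, raising if it fails."""
    subprocess.run(compile_command(compiler, source, output), check=True)


def constants_to_csv(constants: Dict[str, int]) -> str:
    """Render a mapping from variable name to value as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "value"])  # hypothetical header
    for name in sorted(constants):
        writer.writerow([name, constants[name]])
    return buffer.getvalue()
```

Here `constants` stands for whatever the inspection step extracts from the
object file; sorting by name keeps the output stable across runs.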