- Aug 12, 2017
-
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
Most of the time, when we need to get the next instruction, we actually want to skip over "marker" function calls (e.g., calls to `newpc` and `function_call`). `nextNonMarker` does exactly this. `FunctionCallIdentification::isCall` and `JumpTargetManager::setCFGForm` have also been extended to handle such situations correctly.
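The skipping logic can be illustrated with a minimal, self-contained sketch. It does not use the LLVM API: the `Instruction` struct, the `isMarker` helper and the index-based signature are assumptions made for illustration, and only the marker names `newpc` and `function_call` come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an IR instruction.
struct Instruction {
  std::string Opcode; // e.g. "call", "store", "br"
  std::string Callee; // only meaningful when Opcode == "call"
};

// A marker is a call to one of the special marker functions.
bool isMarker(const Instruction &I) {
  return I.Opcode == "call"
         && (I.Callee == "newpc" || I.Callee == "function_call");
}

// Return the index of the first non-marker instruction after Index,
// or Block.size() if only markers (or nothing) follow.
std::size_t nextNonMarker(const std::vector<Instruction> &Block,
                          std::size_t Index) {
  assert(Index < Block.size());
  std::size_t Next = Index + 1;
  while (Next < Block.size() && isMarker(Block[Next]))
    ++Next;
  return Next;
}
```

Returning `Block.size()` as the "nothing found" value mirrors the end-iterator convention; the real function presumably works on LLVM instruction iterators instead.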
-
Alessandro Di Federico authored
`JumpTargetManager::translateIndirectJumps` has been pushed into `JumpTargetManager::finalizeJumpTargets`. Moreover, a safety check on the removal of `exitTB` has been introduced.
-
Alessandro Di Federico authored
The basic block handling the default case of the dispatcher was previously not tagged with `revamb.block.type`; now it is.
-
- Jul 07, 2017
-
-
Alessandro Di Federico authored
This commit fixes an assertion triggered when a segment contains exclusively zero-initialized data (i.e., its size on file is 0 but its memory size is not). In this case LLVM detects that the global variable associated with the segment is composed exclusively of 0s and uses a `ConstantAggregateZero` as an initializer instead of a `ConstantDataArray`. Currently the solution is to ignore that data; however, in the future it might be beneficial to be able to read data from `.bss`, even if we just have zeros there. Thanks to Thorbjoern Schulz for reporting this bug.
-
- Mar 31, 2017
-
-
Alessandro Di Federico authored
Landing pads are basically the `catch` blocks in C++ `try`/`catch` statements. So far we were missing them since they are encoded, in a way similar to DWARF debugging information, in the `.eh_frame` and, more specifically, in the `.gcc_except_table` sections of ELF programs. This commit parses these sections so that the basic blocks associated with landing pads are correctly identified. Personality functions are detected too. A test is also introduced to assess the effectiveness of our code.
-
- Mar 23, 2017
-
-
Alessandro Di Federico authored
-
- Mar 06, 2017
-
-
Alessandro Di Federico authored
This commit changes the way instructions and basic blocks are purged when re-translation is necessary. Specifically, the purge is now performed through a post-order visit, which should prevent the removal of any instruction that still has users. This commit also introduces the `SubGraph` class, which makes it easy to navigate portions of a graph (e.g., a `Function`) in post-order.
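As a rough illustration of why post-order helps, the sketch below computes a post-order over a small graph stored as an adjacency map. The `Graph` alias and the function names are hypothetical and unrelated to the actual `SubGraph` class; the point is only that every node is emitted after all the nodes reachable from it, so anything deleted in this order has already had its successors deleted.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical graph representation: node -> list of successors.
using Graph = std::map<int, std::vector<int>>;

// Depth-first helper: emit Node only after all its successors.
void postOrderImpl(const Graph &G, int Node, std::set<int> &Visited,
                   std::vector<int> &Order) {
  if (!Visited.insert(Node).second)
    return; // already visited
  auto It = G.find(Node);
  if (It != G.end())
    for (int Successor : It->second)
      postOrderImpl(G, Successor, Visited, Order);
  Order.push_back(Node);
}

// Return the nodes reachable from Entry in post-order.
std::vector<int> postOrder(const Graph &G, int Entry) {
  std::set<int> Visited;
  std::vector<int> Order;
  postOrderImpl(G, Entry, Visited, Order);
  return Order;
}
```

For the diamond-shaped graph `0 -> {1, 2}`, `1 -> {3}`, `2 -> {3}`, this yields `3, 1, 2, 0`: the join node comes first and the entry node last.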
-
- Mar 02, 2017
-
-
Alessandro Di Federico authored
This commit should fix some bugs caused by the fact that, when splitting a basic block, we don't retranslate the basic block at the split point but preserve the existing code. This led to problems, in particular on x86-64, where certain QEMU local variables were not available. Basically, every time we split a basic block in `JumpTargetManager::registerJT` we note down that the new basic block must be purged, and in `JumpTargetManager::harvest` we perform the purge. `harvest` has been chosen since it's a particularly quiet moment, i.e., there should be no pending references/iterators to code we have to delete.
-
Alessandro Di Federico authored
If we need this again, we can do it in revamb-dump.
-
- Dec 08, 2016
-
-
Alessandro Di Federico authored
Currently we identify basic blocks that are jump targets by adding metadata to the terminator instruction. This is a problem in many cases, therefore we now use the third parameter of `newpc` calls to understand whether a basic block is a jump target. The third argument used to be set only at the very end of all our analyses, before producing the output. We now set it earlier, before each jump target harvesting, so that this information is available through `GeneratedCodeBasicInfo`.
-
Alessandro Di Federico authored
This commit introduces two new passes:
* `GeneratedCodeBasicInfo`: recovers from the IR some basic information like the size of delay slots in the input architecture, the name of the program counter and so on. It can also identify the type of a basic block (e.g., dispatcher, jump target...).
* `FunctionCallIdentification`: identifies function calls and injects a marker before the associated terminator instruction.

The idea of these two passes is to progressively move information we used to keep in `JumpTargetManager` into the IR, so that it is more easily accessible and passes do not need a reference to `JTM`. In particular, by having markers for function calls available during jump target discovery, we don't need a duplicated and suboptimal implementation of `isCall`. This commit also introduces some additional helper functions and a helper class to quickly.
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
Let functions such as `JumpTargetManager::readRawValue` take a parameter specifying whether the value should be read from the segment using the endianness of the original architecture or of the target architecture. This commit fixes a bug with big-endian architectures (e.g., MIPS): when materializing a value on the operation stack of SET, the endianness was changed twice, once in `readRawValue` and a second time when applying the `bswap` instruction registered on the stack.
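A minimal sketch of an endianness-parameterized read follows. The free-function signature, the `Endianness` enum and the byte-vector segment are assumptions made for illustration; the actual `JumpTargetManager::readRawValue` works on the project's own data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical enum selecting how the bytes must be interpreted.
enum class Endianness { Little, Big };

// Read Size bytes (at most 8) at Offset and assemble them into an
// integer according to the requested endianness.
std::uint64_t readRawValue(const std::vector<std::uint8_t> &Segment,
                           std::size_t Offset, unsigned Size,
                           Endianness E) {
  assert(Size <= 8 && Offset + Size <= Segment.size());
  std::uint64_t Value = 0;
  for (unsigned I = 0; I < Size; ++I) {
    // Always consume the most significant byte first: it is the
    // first byte in big endian and the last one in little endian.
    unsigned Index = (E == Endianness::Big) ? I : Size - 1 - I;
    Value = (Value << 8) | Segment[Offset + Index];
  }
  return Value;
}
```

With the bytes `12 34 56 78`, a 4-byte big-endian read gives `0x12345678` and a little-endian read gives `0x78563412`. Making the caller state the endianness explicitly is what avoids swapping the same value twice.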
-
- Dec 03, 2016
-
-
Alessandro Di Federico authored
`NoFunctionCallsCFG` is a form of the CFG where all the function call edges are replaced with jumps to the return address. This is beneficial in certain analyses, which can then pretend to work at function level. To implement such a form of CFG, we now emit, right before the terminator of each caller basic block, a call to the `function_call` function, passing the callee basic block as the first parameter and the return basic block as the second one. Using these function calls, switching to `NoFunctionCallsCFG` and back becomes straightforward.
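The effect of the marker can be sketched on a toy model. All the types below (`BasicBlock`, `FunctionCallMarker`, the `successors` function) are hypothetical; the real implementation rewrites LLVM terminators rather than computing a view. Only the two form names and the callee/return pair carried by the marker come from the commit messages.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

enum class CFGForm { SemanticPreservingCFG, NoFunctionCallsCFG };

// What the `function_call` marker records for a caller block.
struct FunctionCallMarker {
  std::string Callee;      // first parameter: callee basic block
  std::string ReturnBlock; // second parameter: return basic block
};

// Hypothetical, simplified basic block.
struct BasicBlock {
  std::string Name;
  std::vector<std::string> Successors;    // edges in the original CFG
  std::optional<FunctionCallMarker> Call; // set only for caller blocks
};

// Successors of BB in the requested form: in NoFunctionCallsCFG a
// caller block jumps straight to its return block.
std::vector<std::string> successors(const BasicBlock &BB, CFGForm Form) {
  if (Form == CFGForm::NoFunctionCallsCFG && BB.Call)
    return {BB.Call->ReturnBlock};
  return BB.Successors;
}
```

Because the marker keeps both the callee and the return block, nothing is lost when switching forms, which is why going back to the original edges is cheap.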
-
Alessandro Di Federico authored
This commit introduces `JumpTargetManager::setCFGForm`, which lets the user choose which form of CFG is currently in use. The default and final form should be `SemanticPreservingCFG`, which is the most conservative one. However, for certain analyses (in particular OSRA and SET) it might be beneficial to have a reduced CFG with almost no dispatcher. This new function handles the switching between the two currently available forms of CFG by changing the behavior of the `anyPC` and `unexpectedPC` basic blocks and rebuilding the dispatcher as appropriate.
-
Alessandro Di Federico authored
Every time we don't know where an indirect jump can go, we used to emit a jump to the dispatcher. However, this complicates our analyses; in particular, the computed dominator tree provides less useful information than it could. This commit transforms all the jumps to the dispatcher into jumps to an "anypc" basic block, which during analysis contains just an unreachable instruction; during finalization this instruction is replaced with a jump to the dispatcher. A similar (temporary) arrangement applies to the "unexpectedpc" case. This commit also makes the `visit(Successors|Predecessors)` functions more idiomatic by employing a trait for blacklists.
-
Alessandro Di Federico authored
`JumpTargetManager::readRawValue` used to take into account the endianness information from `DataLayout`, i.e., the output endianness, while the input endianness should be taken into account. The commit also checks that, during the final basic block finalization, we have no empty basic blocks.
-
Alessandro Di Federico authored
This commit removes all the ELF-specific code from the `CodeGenerator` class by creating a new class, `BinaryFile`, which contains all the information about the program that might be needed, in an image-format-independent way. However, `BinaryFile` still has some fields which are specific to ELF; we might want to address this when additional file formats are supported. A key benefit of isolating this code is that we can parse the input file earlier, so that its architecture is available before `CodeGenerator` is instantiated, and we can therefore drop the `--architecture` parameter.
-
- Sep 27, 2016
-
-
Alessandro Di Federico authored
This commit introduces the usage of symbols, if they are available. We employ them to produce meaningful names for basic blocks.
* Collect the symbols from `.symtab`/`.dynsym`.
* Box the `Segments` into a new data structure (`BinaryInfo`) which also handles symbols.
* `JumpTargetManager::nameForAddress`: produce a meaningful name using symbols, if possible.
* Spread some `const`-ness.
-
- Sep 20, 2016
-
-
Alessandro Di Federico authored
-
- Sep 17, 2016
-
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
This commit introduces the `noreturn` analysis, whose aim is to detect all the basic blocks that are doomed to lead to a `noreturn` syscall such as `execve` or `exit`.
* Implement `NoreturnAnalysis`.
* Include and initialize in the `Architecture` data structure all the information necessary to detect `noreturn` syscalls. Specifically, the name of the QEMU helper for syscalls, the name of the register holding the syscall number and the syscall numbers representing `noreturn` syscalls.
* `ReachingDefinitionsPass`: make reaching definitions available both in reaching definitions mode and in reached loads mode. This part needs further cleanup. We might also want to implement this with a `Boost.Bimap`.
* Use `SET` to collect information useful for the `NoreturnAnalysis`. Also restructure how the `OperationsStack` works, so that it is more streamlined and keeps track of multiple pieces of information about the instruction currently being tracked.
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
* Clear all the data that's not part of the analysis results at the end of the `runOnFunction` method.
* Clear all the data that's part of the analysis results when the `PassManager` tells us so (`Pass::releaseMemory`).
* Do not use the `clear()` method, since it doesn't release memory.
* Add some debugging information.
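The commit does not say which containers are involved, so the following is only a sketch assuming a `std::vector`: `clear()` destroys the elements but leaves the capacity unchanged, whereas swapping with an empty temporary hands the storage over to the temporary, which frees it on destruction. The `releaseStorage` name is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Release the storage owned by V by swapping it with an empty
// temporary, which is destroyed at the end of the statement.
template <typename T> void releaseStorage(std::vector<T> &V) {
  std::vector<T>().swap(V);
}
```

After `releaseStorage(V)`, mainstream standard library implementations report `V.capacity() == 0`, while after `V.clear()` the previous capacity is still held.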
-
Alessandro Di Federico authored
This commit records, for each jump target, how we met it, as a flag. It also keeps track of which pointers in global data have been involved in a materialization performed by SET: those that have not are of special interest to us, since they are likely function pointers, and are therefore marked with a specific flag.
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
* Add an "s" in the name.
* Transform the pass into an analysis and let OSRA use it.
-
- Aug 20, 2016
-
-
Alessandro Di Federico authored
* When generating the code for setting a label or jumping to it, give sensible names to the new basic blocks.
* Keep track of the last seen PC during translation so it can be used to obtain a sensible name for the basic block.
* Let `JumpTargetManager::getBlockAt` set a proper name to the basic block before returning, if it doesn't already have one.
-
Alessandro Di Federico authored
`forceFallthroughAfterHelper` handles the situation where there is no PC store between a call to a helper and a call to `exitTB`: in this case, we force a branch to the fallthrough PC. This commit also simplifies `InstructionTranslator::translateCall` by removing the jump to the dispatcher after a call to a helper in case the PC was saved and has changed. We don't really need to do this: QEMU will generate a call to `exitTB` as necessary, or `forceFallthroughAfterHelper` will take care of it.
-
Alessandro Di Federico authored
* The function can now take a `std::set` of basic blocks to ignore.
* The visitor function now has several options on how to proceed, and can express them through its return value.
* A serious bug in the implementation was also fixed.
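The interface described in the list above can be sketched as a breadth-first visit over a toy graph. The commit message names neither the function nor the possible return values, so `visitSuccessors`, `VisitAction` and its three enumerators are assumptions made for illustration.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <set>
#include <vector>

// Hypothetical graph representation: node -> list of successors.
using Graph = std::map<int, std::vector<int>>;

// What the visitor asks the traversal to do next (hypothetical).
enum class VisitAction { Continue, SkipSuccessors, Stop };

// Visit in breadth-first order the nodes reachable from Start
// (Start excluded), never entering the nodes listed in Ignore.
void visitSuccessors(const Graph &G, int Start, const std::set<int> &Ignore,
                     const std::function<VisitAction(int)> &Visitor) {
  std::set<int> Visited(Ignore); // ignored nodes count as already seen
  Visited.insert(Start);
  std::deque<int> Queue;

  auto EnqueueSuccessors = [&](int Node) {
    auto It = G.find(Node);
    if (It == G.end())
      return;
    for (int Successor : It->second)
      if (Visited.insert(Successor).second)
        Queue.push_back(Successor);
  };

  EnqueueSuccessors(Start);
  while (!Queue.empty()) {
    int Node = Queue.front();
    Queue.pop_front();
    switch (Visitor(Node)) {
    case VisitAction::Continue:
      EnqueueSuccessors(Node);
      break;
    case VisitAction::SkipSuccessors:
      break; // do not go past this node
    case VisitAction::Stop:
      return; // abort the whole visit
    }
  }
}
```

Seeding the visited set with the ignore set keeps the main loop free of special cases, at the cost of copying the set once per visit.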
-
Alessandro Di Federico authored
This pass helps us handle instructions like ARM's `blt`, which compute the result of the comparison by bit-fiddling with the sign bit of the operands of a subtraction. The idea is to have a series of known boolean expressions using `a`, `b` and `c` as variables (e.g., the boolean expression corresponding to "signed greater than") and compare their truth tables against the one being analyzed. In case of a match, the comparison can be simplified.
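The truth-table comparison can be sketched as follows. Everything here is an assumption made for illustration: the commit does not say what `a`, `b` and `c` stand for, so the sketch takes `a` and `b` to be the sign bits of the two operands and `c` the sign bit of the subtraction result, and it uses "signed less than" (negative flag differs from overflow flag) and its negation as the known predicates. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

using BoolExpr = std::function<bool(bool, bool, bool)>;

// Evaluate E on all 8 assignments of (a, b, c). Bit I of the result
// is the output for a = bit 0 of I, b = bit 1 of I, c = bit 2 of I.
std::uint8_t truthTable(const BoolExpr &E) {
  std::uint8_t Table = 0;
  for (unsigned I = 0; I < 8; ++I) {
    bool A = I & 1, B = I & 2, C = I & 4;
    if (E(A, B, C))
      Table |= static_cast<std::uint8_t>(1u << I);
  }
  return Table;
}

struct KnownPredicate {
  std::string Name;
  std::uint8_t Table;
};

// Return the name of the known predicate whose truth table matches
// the one of E, if any.
std::optional<std::string> matchPredicate(const BoolExpr &E) {
  // Overflow of a subtraction: operand signs differ and the result
  // sign differs from the sign of the first operand.
  static const std::array<KnownPredicate, 2> Known = {{
    {"slt", truthTable([](bool A, bool B, bool C) {
       return C != ((A != B) && (A != C)); // N != V
     })},
    {"sge", truthTable([](bool A, bool B, bool C) {
       return C == ((A != B) && (A != C)); // N == V
     })},
  }};
  std::uint8_t Table = truthTable(E);
  for (const KnownPredicate &K : Known)
    if (K.Table == Table)
      return K.Name;
  return std::nullopt;
}
```

Since there are only three variables, a truth table fits in a single byte, so two syntactically different expressions that compute the same comparison compare equal with one integer comparison.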
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-
Alessandro Di Federico authored
-