Commits · 5b8fb1af1478de0be61dfc853c56cfebf2363524 · revng-bar-2019 / revamb

Mar 23, 2017

`BoundedValue`: support for multiple ranges · 5b8fb1af

Alessandro Di Federico authored 7 years ago

This commit introduces radically changes the implementation of
`BoundedValue`: it no longer represents a single, contiguous range, but
an arbitrary number of ranges.

The bounds are now represented through a
`llvm::SmallVector<std::pair<uint64_t, uint64_t>, 3>`.

* Introduce the `BoundedValue::bounds()` method, which allows to iterate
  over all the ranges that a `BoundedValue` represents. The `bounds`
  method returns a `Bounds` object, which can be used as a range
  composed by `BoundsIterator`.
* All the methods dealing with the `BoundedValue`'s bounds have been
  rewritten.
* New debugging information: "bv-merge". Print all the computations
  performed by `BoundedValue::mergeImpl`.
* Drop dead code: `BoundedValue::setBound` and `isPositive`
* Introduce `BoundedValue::isRightOpen` and drop
  `BoundedValue::isSingleRange`

5b8fb1af

Improve `OSRA::pathSensitiveMerge` · 4089c203

Alessandro Di Federico authored 7 years ago

Some subtle bugs have been fixed in `OSRA::pathSensitiveMerge`:

* Do not alter the current `BoundedValue` if merging a component would
  lead to bottom.
* Do not deactivate a reacher in case an incoherent condition is met.

4089c203

`ReachingDefinitionsPass`: free loads are definers · 1d967349

Alessandro Di Federico authored 7 years ago

In our reaching definition analysis we used to consider all the loads
not reached by any store as definitions. However we forgot to actually
register them as such, with the result that two consecutive loads from
the same CSV would end up being two free loads.

1d967349

`ConditionNumberingPass`: last user, still user · 244f0dc9

Alessandro Di Federico authored 7 years ago

In `ConditionNumberingPass` we used to consider as resetting the last
basic block possibly interested in a certain numbered
condition. However, what we really meant, was that its successors were
resetting basic blocks. This commit fixes this issue.

244f0dc9

`OSR::constant()`: consider `Base` and `Factor` · f409fd65

Alessandro Di Federico authored 7 years ago

`OSR::constant()` used to forward the result of
`BoundedValue::constant()`, but this is wrong, since the factor and the
base value have to be considered too.

f409fd65

OSRA: x | bottom = x, not bottom · bddc0d03

Alessandro Di Federico authored 7 years ago

or-merging bottom with anything used to produce a bottom value, which is
wrong. The non-bottom value should be produced instead.

bddc0d03

Propagated constraints should be and-merged · 0f70a6ef

Alessandro Di Federico authored 7 years ago

Constraints associated to a memory instruction are propagated to
reached loads. However, if a constraint on the same `Value` is already
present, the new constraint should be and-merged, not or-merged.

0f70a6ef

Do not remove predecessors while iterating on them · ab160b03
Alessandro Di Federico authored 7 years ago

ab160b03

Minor improvements · 8bb3a382

Alessandro Di Federico authored 7 years ago

* Introduce some additional helpers
* Spread some `const`ness
* Improve documentation
* New debugging information: "osr-bv". Prints every update operation
  performed in `BVMap::update`.
* Remove dead code
* Whitespace fixes
* Some new TODOs
* Fix some typos in comments

8bb3a382

Mar 10, 2017

Switch `BoundedValue::merge` to `boost:icl` · 850fc09a

Alessandro Di Federico authored 8 years ago

This commit drops the original handcrafted implementation of
`BoundedValue` merging, in favor of an implementation based on Boost
intervals. The old implementation was the source of intermittend bugs,
using Boost should be a more reliable solution. Moreover, this commit
enables moves us towards supporting multiple ranges in `BoundedValues`.

850fc09a

Mar 09, 2017
- BoundedValue: information hiding · af209172
  Alessandro Di Federico authored 8 years ago
  
  af209172
- Minor fixes to `BoundedValue::merge` · 51019ddf
  Alessandro Di Federico authored 8 years ago
  
  51019ddf
- Introduce OSRA::dump() · fd10c8d8
  Alessandro Di Federico authored 8 years ago
  
  fd10c8d8
Mar 08, 2017

OSRA: track register+constant pointers too · b21f3b18

Alessandro Di Federico authored 8 years ago

In OSRA we used to track `GlobalVariable`s and `AllocaInst` only,
despite the `MemoryAccess` infrastructure supported memory accesses of
the type register + constant. Enabling this, we're able to handle the
following x86-64 snippet found in the `omnetpp` SPEC benchmark:

    cmp    DWORD PTR [rbx+0x8],0x5
    ja     elsewhere
    mov    eax,DWORD PTR [rbx+0x8]

b21f3b18

Mar 06, 2017

Improve handling of load insturctions in SET · 14f4cd74

Alessandro Di Federico authored 8 years ago

SET, when getting data from OSRA, used to check that the first and last
materialized address were within a certain range, under the assumption
that the last value would be greater that the first one. Turns out that
this is not always the case when in the operation stack we have a load
instruction. This commit improve the way such a situation is handled.

14f4cd74

Purge translated code in post-order · d398213a

Alessandro Di Federico authored 8 years ago

This commit changes the way instruction and basic block are purged when
re-translation is necessary. Specifically, the purge is now performed
through a post-order visit, which should prevent the removal of any
instructions still holding users.

This commit also introduces the `SubGraph` class, which is useful to be
able to navigate portions of a graph (e.g., a `Function`) in post-order
easily.

d398213a

Reorganize OSRA · 87dc88c2

Alessandro Di Federico authored 8 years ago

The main goal of this patch is to reduce the size of
`OSRAPass::runOnFunction()`. To do this we created the `OSRA` class
which handles everything `runOnFunction` was taking care of but without
the ugly lambdas nor being an endless function. Each class of
instruction is now handled by a dedicated function.

This also has the side effect of heavily reducing the amount of clutter
exposed by `OSRAPass` to its users.

87dc88c2

Mar 02, 2017

Anticipate `cpu_loop_exit` removal · 1e9163c7

Alessandro Di Federico authored 8 years ago

Fix of another bug showing up only with LLVM in debug mode: splitting a
malformed basic block is not allowed, and we had a function call after a
`ret` instruction.

1e9163c7

When splitting a basic block, retranslate · 04a4591f

Alessandro Di Federico authored 8 years ago

This commit should fix some bugs due to the fact that when we're
splitting a basic block we don't retranslate the basic block at the
split point but preserve the existing code. This lead to problems, in
particular in x86-64 where certain QEMU local variables were not
available. This change should fix it.

Basically, every time we split a basic block in
`JumpTargetManager::registerJT` we note down that the new basic block
must be purged, and in `JumpTargetManager::harvest` we perform the
purge. `harvest` has been chosen since it's a particularly quiet moment,
i.e., there should be no pending references/iterator to code we have to
delete.

04a4591f

Fix deletion order of temporary parts in RDA · 7babafac

Alessandro Di Federico authored 8 years ago

This commit fixes a bug that appears only with debug builds of LLVM: in
RDA we were erasing a temporary common predecessor basic block before
removing the references to it in a `switch` statement.

7babafac

Dismiss basic block statistics collection · d6b257dd
Alessandro Di Federico authored 8 years ago
```
If we need this again, we can do it in revamb-dump.
```
d6b257dd

OSRA: ignore unknown signedness in comparisons · f8c7ebab

Alessandro Di Federico authored 8 years ago

When performing a comparison, we try to attach its signedness to the
base OSR it's working on. However, this is not always
effective. Typically, even after this, the OSR remains with an unknown
signedness due to the fact that we don't have information about its
bounded value from all the predecessors, and therefore it goes to top.

f8c7ebab

Reorganize the tracing system · 4789524a

Alessandro Di Federico authored 8 years ago

`support.c` used to have a basic tracing system which would print on `stderr`
all the program counters. This commit reorganizes this system so that it has a
buffer, whose size can be specified through the `REVAMB_TRACE_BUFFER_SIZE`
environment variable, and it's possible to specify the output file
`REVAMB_TRACE_PATH`. The new tracing system also supports flushing the buffers
in case of clean exit of crash.

We also cleaned up `support.c`.

4789524a

Compile `support.c` to LLVM IR · dae2f7e6

Alessandro Di Federico authored 8 years ago

`support.c` used to be compiled using the system compiler and then
linked to the module generated by `revamb` as a separate translation
unit. This commit introduces a change that lets `clang` compile
`support.c`. This will allow us to make the CSV static, which should
enable more aggressive optimizations.

* Change the signature of the `root` function so that it accepts an
  argument: the initial value of the stack pointer, which the main is
  supposed to set up. QEMU now provides us with the offset of the stack
  pointer.
* Let the build system compile `support.c` for each supported
  architecture, both in normal and "tracing" mode.
* Remove the `--tracing` option, this is now handled by `support.c`, in
  particular depending on which version of `support.c` you link, you can
  have tracing enabled or not.
* In `support.c` drop global variables representing the stack pointer,
  we no longer need them.
* In `support.c` fix some warnings while handling the stack on 32-bit
  architectures.
* Extende the `translate` script to handle the new way we link the final
  binary and the tracing mechanism.

dae2f7e6

Feb 20, 2017

Let the entry point go through the dispatcher · 451a321b

Alessandro Di Federico authored 8 years ago

We used to jump directly to the program entry point (typically
`_start`), without initializing the program counter. We now initialize
the program counter and the let the program start with the dispatcher,
since it seems a safer solution.

451a321b

Force root function's name · 1a3950a3
Alessandro Di Federico authored 8 years ago

1a3950a3

Jan 23, 2017

Handle calls with multiple successors · 3b89c5f3

Alessandro Di Federico authored 8 years ago

This commit handles the situation where we have a function call with
more than one successor. This might be the case if there's an indirect
call but we're able to enumerate the possible targets statically.

This commit simply avoids an assertion, in the future we will register
all the possible destinations in the `function_call` marker.

3b89c5f3

Function detection: drop normalized address space · c3fbcf31

Alessandro Di Federico authored 8 years ago

We used to have a normalized address space formed only functions,
skipping all the holes. This was employed to identify "skipping
jumps". However, this method was ineffective, and we changed the
definition of skipping jump as a jump skipping over CFEPs that are
highly likely to be actual functions.

This commit removes the leftovers of the normalized address space
computation, which was also the cause of a bug in function detection.

c3fbcf31

Alessandro Di Federico authored 8 years ago

This commit fixes a bug triggered by specifying the `--entry` switch:
the entry point would not be registered as a jump target, instead the
code would try to get the basic block associated to that address,
resulting in an assertion.

The manually specified entry point is now registered as a jump target
coming from global data.

6b91aeb9

Jan 11, 2017
- Initial import of documentation · 5154cb57
  Alessandro Di Federico authored 8 years ago
  
  5154cb57
- SET: ignore 128-bit integers · 72337caf
  Alessandro Di Federico authored 8 years ago
  
  72337caf
- Cleanup final IR from temporary functions · 75d38a71
  Alessandro Di Federico authored 8 years ago
  
  75d38a71
- Introduce metadata `func.entry` and rename `func` · 9fa311bd
  Alessandro Di Federico authored 8 years ago
  
  Attach a `func.entry` metadata node to the entry basic block of a function and rename `func` to `func.member.of`.
  9fa311bd
- Introduce the `--no-link` option · 4501dad2
  Alessandro Di Federico authored 8 years ago
  
  Introduce an option to prevent `revamb` from linking in all the QEMU helpers. This is useful if the output doesn't need to be compiled, but just analyzed.
  4501dad2
Dec 08, 2016

Merge branch 'feature/limit-osra-constraint-propagation' · 3f760876
Alessandro Di Federico authored 8 years ago

3f760876

Change the way we denote JT basic blocks · 3e8e9c23

Alessandro Di Federico authored 8 years ago

Currently we're identifying basic blocks that are a jump target by
adding metadata on the terminator instruction. This is a problem in many
cases, therefore we now use the third parameter of `newpc` calls to
understand if a basic block is a jump target.

The third argument was set only at the very end of all our analysis,
before producing the output. We anticipate this so that is done before
each jump target harvesting, so that this information is available
through `GeneratedCodeBasicInfo`.

3e8e9c23

Fix function call identification · 00e39af8

Alessandro Di Federico authored 8 years ago

This commit fixes a couple of bugs in function call
identification.

* Adopt a new version of `visitPredecessors` more similar to
  `visitSuccessors`.
* If we meet a call to `newpc` which has an unexpected address (i.e.,
  it's not the previous with respect to the last we saw) give up.
* Ensure we went through the required amount of instructions before
  finding the return address (in case the architecture has delay slots).
* When counting the number of successors of a function call to check if
  there's a single one, ignore the dispatcher-related basic blocks.

00e39af8

OSRA: heavily reduce constraint propagation · 1bbe758e

Alessandro Di Federico authored 8 years ago

This commit reduces the amount of constraint we propagate. We do this in
two ways. First, by computing the set of all the instruction that will
ever be affected by the current instruction (recursively). Second, by
preventing propagation on constraints across function calls.

In quick test on `ls` compiled for MIPS we reduce the execution time by
55% of the peak memory usage by 68%. This makes me quite happy.

1bbe758e

OSRA: always create an OSR for Trunc/ZExt · 57fa395c
Alessandro Di Federico authored 8 years ago

57fa395c
Correctly initialize `Architecture::DelaySlotSize` · c798e3ea
Alessandro Di Federico authored 8 years ago

c798e3ea