Roadmap

Where we are, where we want to go

Building a decompiler is a big endeavour. Follow our latest advancements and the plan for the months to come towards the 1.0 release.

Tier 2: Closed beta (part 1)

Adopt new Hub frontend

Dismiss the old Web 1.0 Hub frontend in favor of the new Next.js-based Hub frontend.

Tier 2 Public Relations

Publish the new website, release a blog post, send a newsletter, write a tutorial for Tier 2 participants.

No timeouts/crashes on selected binaries

Make sure we can run the whole pipeline on a predetermined set of reasonably sized x86-64 Linux binaries without crashes and within a reasonable time frame.

Perform in-depth QA on `hostname`

Make sure the hostname binary is decompiled in a sensible way.

Tier 2.5: Closed beta (part 2)

Scope graph canonicalization

Preliminary step and elementary building block of the new control-flow restructuring algorithm.

Implement performance assessment

Implement logic to assess our performance compared to other tools.

Clift backend

Implement the new C backend, using our custom Clift MLIR dialect to generate decompiled code.

Adapt variable producers

Migrate all the passes of the old decompilation pipeline to use LLVM alloca/load/store to represent C local variables. This is preliminary work for the new Clift-based C backend.

Clifter

Implement LLVM-to-Clift conversion for the new C decompilation backend.

Handle all the forms of memcpy

Ensure we emit memcpy gracefully for all the various architectures we support.

Comments in function's body

Enable users to input comments associated with a specific instruction of the program and show them in both the decompiled code and the disassembly.

Mass testing

Test on a massive amount of binaries and promote binaries that are decompiled without crashes and within a reasonable time frame to the regression suite.

Implicit conversions

Detect and remove casts that in C would be implicit. This will significantly reduce the number of casts the user sees in the decompiled code.

EFA QA

Perform Quality Assurance on EFA results on a vast number of functions with a diverse set of arguments and return values.

Push variable declarations ALAP

In decompiled C code, make sure we declare local variables as late as possible.

Primitives inlining

Change the model layout so that primitive types (e.g., uint32_t) are defined inline, instead of having an entry in model::Binary::Types.

EFA4

Implement the 4th version of Early Function Analysis, which will significantly improve detection of register-based arguments and return values.

Invalidation logic

Implement the logic to detect which artifacts need to be recomputed, instead of recomputing everything at every change.

Documentation

Provide public documentation of the model, the CLI and our Python/TypeScript wrappers.

Declutter the UI

Make the necessary changes to VSCode to remove everything that's not strictly necessary for our use case.

Tier 3: Open beta (part 1)

Scope-inducing transformations

Implement a set of transformations of the CFG to detect loops and nested scopes.

Find references to global variables

Add support for the UI to enumerate all the uses of a global variable, specifically a field of a struct describing a segment.

Clift: pre-backend passes

Various optimizations on Clift, aimed at generating better looking C code: integer literals, implicit casts, parentheses based on operator precedence.

EmitFieldAccesses

Transform integer arithmetic into field access expressions in the new Clift-based C backend.

Clift canonicalizations

Clift canonicalization: fold &*, fold *&, two's complement arithmetic normalization, remove empty branches of if-statements, match advanced loops (while, do-while), handle noreturn.

CRUD all model parts in UI

In the UI we need to provide a way to create, edit and remove types, functions, segments and so on.

Initial auto-analyses twice

We need to be able to run the analysis pipeline twice without crashing.

Rebase QEMU

Rebase QEMU to the latest version. This will enable us to support additional architectures and start working on proper floating point support.

Preserve debug info

Review the decompilation pipeline to ensure that debug information, which we use to trace decompiled code back to assembly instructions, is preserved as much as possible. This ensures we don't lose the link between decompiled code and assembly in most situations.

Drop kinds

Get rid of kinds from revng-pipeline.

Model upgrade

Implement infrastructure to automatically upgrade among model versions.

Collaboration QA

Ensure collaboration works smoothly.

HexView

Implement a basic hexadecimal view.

Support multiple binaries

Make sure a single project can handle multiple binaries. Also, switch to recording hashes of binaries in the model, instead of asking the user to provide them.

Outlining/inlining/tail calls

The Inline attribute of model::Function has known limitations. Make sure we can inline any function.

Hub: expose snippets

Implement in Hub a feature to embed decompiled code snippets.

Reorganize repositories

Merge the revng and revng-c repositories.

revng-pypeline

Implement a more git-like CLI for revng and move most of the revng-pipeline logic to Python.

Python client

Implement a dev-friendly Python library to interact with revng-daemon's GraphQL API.

Tier 3.5: Open beta (part 2)

DLA2

Design and implement the second version of Data Layout Analysis (DLA).

All analyses should import model

Some analyses are currently designed to be run only once. We need to upgrade them to be able to incrementally improve the model given its current state.

`goto` optimization

Implement an algorithm for reducing the number of emitted gotos, with heuristics for preventing excessive code duplication.

VMA2

Implement type propagation within the body of a function.

Adopt alias analysis in SwitchToStatements

Inform the Clift-based C decompilation pipeline that the stack frame does not alias other memory, to avoid redundant accesses to it in decompiled C code.

Tackle stack slot reuse

Devise a way to handle stack slots being used in different ways across the body of a function. Core idea: promote to SSA values.

Model verify on the client

Enable the VSCode client to verify the model without making a remote request. This ensures that the user can make interactive changes and immediately get feedback on whether the changes are valid or not.

Find references to `struct` field

Make sure the UI can perform backward navigation even between references that are not available in the call graph. This might require materializing all artifacts in the background.

Perform QA on various architectures

We need to fix platform-specific issues, bugs and limitations that pop up on architectures that have not gone through QA yet.

Support variadic arguments

Implement support for variadic arguments for the various ABIs we support.

Floating point support

Improve support for floating point instructions and data types.

Segment with designated initializers

In the C view, show segments as global variables using C's designated initializers.

Implement undo/redo

Implement the undo/redo feature.

Decompilation headers QA

Clean up the C headers we emit.

Strings view

Implement a simple view to show all the strings we detected in the binary.

DLA: import model + subgraph

Ensure DLA can import existing model information and can correctly run on a portion of the call graph. This will enable us to re-run DLA after the initial analysis.

Detect strings

Add analysis to detect string literals in segments.

Tier 4: 1.0 release

Artifacts doc

Document in detail what users can expect to find in each artifact.

Import C headers

Make sure we can import a C header into the model. Core idea: compile the header with debug info, and then import via DWARF importer.

ConvertToCABI QA

Perform Quality Assurance on the result of converting RawFunctionTypes to CABIFunctionTypes.

Background artifacts production

Implement logic to asynchronously produce artifacts in background.

Patch raw bytes in model

Add support for patching the binary from the model.

Metaview

Implement a view that shows, as a single document, the whole binary.

Shop

Implement an online shop to buy subscriptions/licenses.

Import stack frames from DWARF

Exploit information in DWARF debug info to automatically populate the stack frame of functions.

Register for the UI closed beta!

Want to try the UI? We're now inviting people on a FIFO basis.