Ever wanted to dump all the executable pages of a process? Do you crave something capable of dealing with packed processes?

We've got you covered! May I introduce PageBuster, our tool for dumping all the executable pages of packed Linux processes. Keep reading to find out the details and what happens under the hood!

First things first, the code on GitHub and the demo:

In one of our many research projects here at rev.ng, we are dealing with Big Data (is a 1 to 10 TB compressed database dump big? Well, probably not, but it is for us). Our first approach was to extract the data and store it in an SQL database, then run a bunch of queries and finally export the processed tables for other purposes. See the problem there? We were using the database as a, err... data processing tool? Unfortunately, this wasn't working very well: we kept hitting all kinds of performance bottlenecks, since we were doing bulk inserts and bulk selects.

We then thought of using Spark, or some other fancy tool like that, to stream-process everything and just use text files. But, you know, we are a binary analysis company, so most of the people here don't like garbage collectors (except me, the author of this blog post, who likes them very much). Anyway, we went from using MySQL to MongoDB+MySQL to MongoDB+PostgreSQL to, you guessed it, text files + good ol' Bash.

In this article I will persuade you, CEO of a brand-new Spark'ing startup, that sometimes Bash'ing is all you need.

Note: If you haven't checked out our Big Match post, go read it.
