Rewriting stackcollapse-xdebug in Rust
A week or so ago, I saw the inferno project mentioned on the Rust subreddit. It is a rewrite of the great FlameGraph library in Rust.
All of the work was being livestreamed by Jon Gjengset. I ended up watching some of the livestreams and had the idea of porting the stackcollapse-xdebug.php file to Rust, potentially so it could be included in the project in the future.
I have been meaning to do this for a while. I use the stackcollapse-xdebug.php script in my work on PHP Flame Graphs and made a small frontend to list the Xdebug traces and generate the Flame Graphs. It tends to work well, but parsing large trace files can take some time and sometimes hits the PHP maximum execution time.
Rewriting it in Rust took me only a couple of hours, most of which I spent learning the format of the input file. The existing PHP file was based on an older version but still worked on the latest versions.
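To give a rough idea of what collapsing a trace involves, here is a minimal sketch. It is not the code from my fork or the logic of the PHP script. It assumes Xdebug's tab-separated "computerized" trace layout, where each record starts with level, function number, a 0 (entry) or 1 (exit) flag, and a time index, and where entry records carry the function name in the sixth field. The `collapse` function and its microsecond rounding are illustrative choices.

```rust
use std::collections::HashMap;

/// Folds an Xdebug-style trace into "outer;inner" -> microseconds of self time.
/// Assumed field layout: level, function nr, 0|1 flag, time, memory, name, ...
fn collapse(trace: &str) -> HashMap<String, u64> {
    let mut stack: Vec<&str> = Vec::new();
    let mut folded: HashMap<String, u64> = HashMap::new();
    let mut prev_time: Option<f64> = None;

    for line in trace.lines() {
        let mut fields = line.split('\t');

        // Header, footer and summary lines do not start with a numeric level.
        if !fields.next().map_or(false, |f| f.parse::<u32>().is_ok()) {
            continue;
        }
        let _function_nr = fields.next();
        let kind = match fields.next() {
            Some(k) if k == "0" || k == "1" => k,
            _ => continue, // skips other record types, e.g. return values
        };
        let time: f64 = match fields.next().and_then(|t| t.parse().ok()) {
            Some(t) => t,
            None => continue,
        };

        // Charge the time since the previous event to the stack that was running.
        if let Some(prev) = prev_time {
            if !stack.is_empty() {
                let micros = ((time - prev) * 1_000_000.0).round() as u64;
                *folded.entry(stack.join(";")).or_insert(0) += micros;
            }
        }
        prev_time = Some(time);

        if kind == "0" {
            // nth(1) skips the memory field and yields the function name.
            stack.push(fields.nth(1).unwrap_or("unknown"));
        } else {
            stack.pop();
        }
    }
    folded
}
```

Printing each entry as `stack value` on its own line gives the folded format that flame graph tools consume.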
My initial tests (using hyperfine) saw a 3x speedup:
"@Jonhoo I know it's a little premature but I rewrote stackcollapse-xdebug.php into Rust (which I will merge when Inferno fleshes out a little) and got a 3x speedup!" pic.twitter.com/yn4gbnfESI (Daniel Lockyer, @DanielLockyer, February 7, 2019)
I swapped the PHP version for my Rust version in my frontend and saw load time drop from 7.0 seconds to 2.3 seconds, a similar 3x improvement.
The code is currently sitting in my fork of the inferno repository but I hope to get it merged when the project matures.
Update:
I posted about this on Twitter and got a bunch of tips for getting even more performance. It also got posted on /r/rust.
I profiled it using perf and found .collect() was taking a lot of time. I replaced it with functions that work directly on the iterator and took off another 60%. I'm also in the process of merging a PR someone sent that gets it up to 4.7x faster than the PHP version.
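The kind of change involved looks roughly like the sketch below. It is an illustration rather than the actual diff, and the function names and field positions (time in the fourth tab-separated field, function name in the sixth) are assumptions made for the example.

```rust
/// Allocating version: builds a Vec of every field just to read one of them.
fn name_with_collect(line: &str) -> Option<&str> {
    let fields: Vec<&str> = line.split('\t').collect();
    fields.get(5).copied()
}

/// Lazy version: advances the iterator and stops at the field it needs,
/// so no Vec is allocated and the rest of the line is never split.
fn name_with_iter(line: &str) -> Option<&str> {
    line.split('\t').nth(5)
}

/// Pulling two fields out of one pass over the same iterator.
fn time_and_name(line: &str) -> Option<(f64, &str)> {
    let mut fields = line.split('\t');
    // nth(3) consumes fields 0..=3 and returns the fourth one (the time).
    let time: f64 = fields.nth(3)?.parse().ok()?;
    // nth(1) skips the next field (memory) and returns the one after it.
    let name = fields.nth(1)?;
    Some((time, name))
}
```

Both name functions return the same result for any line. The difference is that the iterator version does no heap allocation per line, which adds up over a trace with millions of records.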
"I did some other modifications and then I got a PR to remove a lot of the allocation and duplicate calculations. We're now at 5x the PHP speed! #rustlang #php" pic.twitter.com/Do25gCvbZQ (Daniel Lockyer, @DanielLockyer, February 12, 2019)
Update 2:
I received another pull request which brought it down to 91ms, or about 10x faster than the PHP version! The PR focussed on more efficient string splitting.