Wednesday, April 15, 2020

HawkTracer - low-overhead instrumentation-based profiler


Disclaimer: this post introduces the HawkTracer profiler but doesn't go deep into its details. Follow the attached links to find all the features (and limitations) of the tool.

A while ago, at Amazon we open-sourced an instrumentation-based profiler, HawkTracer. It introduces very low overhead, so it can be used on low-end platforms where the development environment is somewhat limited (e.g. no ssh access, very limited disk storage, etc.). We used it to fix performance issues in the Prime Video app on living room devices (smart TVs, streaming sticks, game consoles, etc.).

The profiler can be used to measure resource usage (mainly time spent in a scope, but also other metrics such as CPU usage) in your C/C++ code, and then visualize it using the Trace Event Profiling Tool or as FlameGraphs. It's also easy to write bindings for other languages; there are already some for Python and Rust. Example instrumentation can be found below:

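#include <hawktracer.h>

void foo()
{
  // Trace the whole function scope.
  HT_G_TRACE_FUNCTION();

  very_expensive_call();
}

void bar()
{
  HT_G_TRACE_FUNCTION();
  for (int i = 0; i < 100; i++)
  {
    foo();
    {
      // Trace only this inner scope, under a custom label.
      HT_G_TRACE_OPT_STATIC("InternalOp");
      recursive(10);
    }
  }
}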

That example generates the following results:

How does it work?

The idea behind HawkTracer is simple and well known: the design is a server-client architecture in which the profiled application (usually running on an embedded device) acts as a server that emits tracing events. The serialized events are transmitted to a client through a user-selected protocol (TCP/IP and file are supported by default, and users can add their own). The client can then convert the received events to a human-readable format. Again, only a few formats are supported by default (ChromeTracing and FlameGraph), but users can extend the client with more conversion methods.
HawkTracer itself is a library that's linked into your executable, so there's no need to run a separate profiling process.
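
To make the flow above a bit more concrete, here is a minimal C++ sketch of the two halves: a scope guard that emits an event when a scope ends, and a converter that turns events into Trace Event Format JSON. This is not HawkTracer's code or API. The names TraceEvent, EventSink, ScopedTrace and to_chrome_trace_json are all made up for illustration, and an in-memory vector stands in for the TCP/file transport.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record: one entry per traced scope.
struct TraceEvent {
    std::string label;
    std::uint64_t start_ns;
    std::uint64_t duration_ns;
};

// Stand-in for the transport (TCP or file): events just pile up in memory.
struct EventSink {
    std::vector<TraceEvent> events;
};

// Monotonic timestamp in nanoseconds.
inline std::uint64_t now_ns() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// "Server" side: remembers the start time on construction and emits an
// event when the enclosing scope ends.
class ScopedTrace {
public:
    ScopedTrace(EventSink& sink, std::string label)
        : sink_(sink), label_(std::move(label)), start_ns_(now_ns()) {}
    ~ScopedTrace() {
        sink_.events.push_back({label_, start_ns_, now_ns() - start_ns_});
    }
    ScopedTrace(const ScopedTrace&) = delete;
    ScopedTrace& operator=(const ScopedTrace&) = delete;

private:
    EventSink& sink_;
    std::string label_;
    std::uint64_t start_ns_;
};

// "Client" side: converts events into a JSON array of Trace Event Format
// "complete" events (ph = "X"), where ts and dur are in microseconds.
// Labels are assumed not to need JSON escaping.
std::string to_chrome_trace_json(const std::vector<TraceEvent>& events) {
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < events.size(); ++i) {
        const TraceEvent& e = events[i];
        if (i > 0) out << ",";
        out << "{\"name\":\"" << e.label << "\",\"ph\":\"X\",\"pid\":1,\"tid\":1"
            << ",\"ts\":" << e.start_ns / 1000
            << ",\"dur\":" << e.duration_ns / 1000 << "}";
    }
    out << "]";
    return out.str();
}
```

Declaring a ScopedTrace at the top of a block records one event for that block. Nested scopes finish (and are emitted) before the scopes that contain them, so the inner event shows up in the sink first. A real profiler would serialize events into a compact binary form and ship them off the device instead of keeping strings in memory.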



But it's C... and we all need Rust now!

Well, if you're a Rust lover, I have good news for you. There are already bindings for HawkTracer in Rust! The snippet below shows how you can instrument your codebase:

#[macro_use]
extern crate rust_hawktracer;
use rust_hawktracer::*;
use std::{thread, time};

#[hawktracer(trace_this)]
fn method_to_trace() {
    thread::sleep(time::Duration::from_millis(1));
}

fn main() {
    let instance = HawktracerInstance::new();
    let _listener = instance.create_listener(HawktracerListenerType::TCP {
        port: 12345,
        buffer_size: 4096,
    });

    println!("Hello, world!");
    {
        scoped_tracepoint!(_test);
        thread::sleep(time::Duration::from_millis(10));

        {
            for _ in 0..10 {
                scoped_tracepoint!(_second_tracepoint);
                thread::sleep(time::Duration::from_millis(10));
            }
        }
    }
}

The tracepoints can be disabled at build time (as they're implemented as macros) - see the repository for details.
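
If you haven't seen the "macro that compiles to nothing" trick before, here is the general idea sketched in C++ (to match the C/C++ side of the library). It is only an illustration of the technique, not the mechanism the Rust bindings use. SKETCH_TRACEPOINT, SKETCH_DISABLE_TRACING and trace_log are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory log standing in for a real tracing backend.
inline std::vector<std::string>& trace_log() {
    static std::vector<std::string> log;
    return log;
}

// Build with -DSKETCH_DISABLE_TRACING to compile every tracepoint out.
#ifndef SKETCH_DISABLE_TRACING
#define SKETCH_TRACEPOINT(label) trace_log().push_back(label)
#else
#define SKETCH_TRACEPOINT(label) ((void)0)
#endif

int compute_answer() {
    SKETCH_TRACEPOINT("compute_answer");  // expands to a no-op when disabled
    return 6 * 7;
}
```

Because the decision is made by the preprocessor, a disabled tracepoint leaves no call, no branch and no string in the binary, which is what you want on a low-end device.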

Want to know more?

This blog post is just a very short introduction to HawkTracer. I'll be posting more about the profiler here, but in the meantime, check out the following links for more details: