
Dynamic Logging

Note: This is a quick overview of the dynamic tracing capability released in Pixie's v0.3.15 to allow developers to add tracepoints without any instrumentation. We will be adding detailed docs and tutorials as part of our v0.4 release.

Why is this needed?

Dynamic logging lets developers add tracepoints to production code on the fly, without any prior instrumentation. This can save hours or even days when debugging code-level performance issues.

Supported Use-cases

Function Argument Tracing

What are the arguments being passed to Foo(x, y, z)?

Function Input-Output Tracing

What are the arguments and return value of calls to Foo(x, y, z)?

Function Latency Profiling

What is the latency (call to return) of calls to Login(username)?
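To make the three use-cases concrete, here is a plain-Python analogy of the data each one records. It is only an illustration: the `trace` decorator and the `records` list are hypothetical names, and Pixie attaches its tracepoints to a running binary rather than wrapping functions in source code as this sketch does.

```python
import functools
import time


def trace(records, capture_args=True, capture_return=False, capture_latency=False):
    """Plain-Python stand-in for a tracepoint (illustrative; not how Pixie attaches probes)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            result = func(*args, **kwargs)
            elapsed_ns = time.perf_counter_ns() - start

            row = {"func": func.__name__}
            if capture_args:        # function argument tracing
                row["args"] = args
            if capture_return:      # function input-output tracing
                row["ret"] = result
            if capture_latency:     # function latency profiling (call to return)
                row["latency_ns"] = elapsed_ns
            records.append(row)
            return result
        return wrapper
    return decorator


records = []


@trace(records, capture_return=True, capture_latency=True)
def foo(x, y, z):
    return x + y + z


foo(1, 2, 3)
print(records[0])
```

Each call appends one row, much as each hit of a tracepoint produces one row in its output table.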

Capability Overview

Here's a quick overview video in which we dynamically inject tracepoints into a function (link) within the "Online-Boutique" demo application's checkout service:

For reference, here's the PxL script used in the video:

import pxtrace
import px

checkout_upid = "0000021f-0037-e4aa-0000-00007463d375"

# func Sum(l, r pb.Money) (pb.Money, error)
@pxtrace.goprobe('github.com/GoogleCloudPlatform/microservices-demo/src/checkoutservice/money.Sum')
def probe_func():
    return [{'lUnits': pxtrace.ArgExpr('l.Units')},
            {'lNanos': pxtrace.ArgExpr('l.Nanos')},
            {'rUnits': pxtrace.ArgExpr('r.Units')},
            {'rNanos': pxtrace.ArgExpr('r.Nanos')},
            {'retUnits': pxtrace.RetExpr('$2.Units')},
            {'retNanos': pxtrace.RetExpr('$2.Nanos')}]

pxtrace.UpsertTracepoint('money_sum_trac10',
                         'money_sum_table',
                         probe_func,
                         px.uint128(checkout_upid),
                         '10m')

px.display(px.DataFrame('money_sum_table'))
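
The script passes the UPID string through px.uint128, which suggests the identifier is a 128-bit value written in UUID form. As a rough sketch in plain Python (outside Pixie), the standard uuid module can turn the same string into an integer. The split into ASID, PID, and start time in the second half of the code is an assumption about Pixie's UPID layout, not something this page states.

```python
import uuid

checkout_upid = "0000021f-0037-e4aa-0000-00007463d375"

# Interpret the UUID-formatted string as a single 128-bit integer.
upid_int = uuid.UUID(checkout_upid).int

# Split the value into its upper and lower 64-bit halves.
high64 = upid_int >> 64
low64 = upid_int & ((1 << 64) - 1)

# ASSUMPTION: the upper half packs an agent ID (ASID) and a process ID (PID),
# and the lower half is a process start timestamp. Treat these field names
# as illustrative only.
asid = high64 >> 32
pid = high64 & 0xFFFFFFFF
start_ts = low64

print(f"asid={asid} pid={pid} start_ts={start_ts}")
```

Whatever the exact layout, the practical point is that the tracepoint targets one specific process, so the UPID has to be updated whenever the checkout service restarts.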

FAQs

What compiled languages does it work for?

Currently it has been tested on Go, with limited support for C++. Our approach should also extend well to other compiled languages such as Rust and Haskell.

Does it work for Java? Or interpreted languages?

Our system does not currently work with interpreted or VM-based languages. These languages usually have fairly sophisticated debug environments, which we plan to integrate with in the future.

Does it work without Debug symbols?

We currently require DWARF information to be present in the binary. Optimized binaries are supported as long as they contain debug symbols, although some cases, such as inlined functions, are not yet fully supported by Stirling. Future versions of Pixie will add support for remotely hosted symbol files. We are actively seeking feedback on how remote symbol files are used in practice so that we can design the right features.

Can we stream with sampling?

Dynamic tracepoints connect to the Pixie platform. Native streaming support is core to Pixie and will be available in the near future.

Is this extendable to general BPF probes?

We currently support only tracepoints that are generated by Pixie. Our approach could be extended to general BPF probes in the future if there is significant demand.

How can we visualize the results?

Since Dynamic Tracepoints slot natively into Pixie, they can leverage the platform's visualization environment. We will add support for views such as flame graphs in the future.

What are the different kinds of probes?

We currently support capturing function arguments, return values and latencies.

Can we mutate?

We don’t currently support any operators that will mutate the state of the application.

Can we call functions such as String()?

Not currently supported, but since this is such a useful feature we will explore adding it.

Can we deploy this outside of K8s?

Dynamic tracepoints don’t rely on any K8s specific features. They will be supported outside of K8s when Pixie can be installed there.

Can we have this managed using CRDs on K8s?

Tracepoints work on a declarative specification. Since Pixie is designed to work both inside and outside of K8s we don’t leverage CRDs to transmit the specification. In the future we might add support for providing specs from CRDs that are read into Pixie.

What is the performance overhead?

Minimal. A few tracepoints should have little to no visible impact on non-trivial applications. Our studies on BPF probes have shown <1% overhead to capture full messages from a simple HTTP server. How often a tracepoint is triggered and how much data it collects will affect this number.
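
As a back-of-the-envelope illustration of why trigger frequency matters, the sketch below multiplies a hit rate by a per-hit cost. The function name and both input values are hypothetical and are not measurements from Pixie; the per-hit cost would itself grow with the amount of data captured.

```python
def probe_overhead_fraction(hits_per_second, cost_per_hit_ns):
    """Estimate the fraction of one CPU core spent handling tracepoint hits."""
    return hits_per_second * cost_per_hit_ns / 1e9


# Hypothetical inputs: 10,000 hits per second at 1,000 ns per hit.
fraction = probe_overhead_fraction(10_000, 1_000)
print(f"{fraction:.2%} of one core")
```

A tracepoint on a rarely called function costs almost nothing under this model, while one on a hot inner loop scales linearly with its call rate.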

What are security/privacy implications?

Since Dynamic Tracepoints can observe essentially any function and its arguments, there are significant privacy and security concerns. We will address this by adding RBAC support, including the ability to restrict deployment to specific templates that have been reviewed and approved. This feature can also leverage PII masking and other future enhancements to Pixie.

Difference between this and GDB, FTRACE, etc.?

Unlike most existing approaches, we don’t actually stop execution of the program or mutate its state. This allows us to capture data in production environments with limited overhead.

How do you turn off tracepoints?

Tracepoints are registered with a TTL (time to live), which allows old tracepoints to be garbage collected automatically. They can also be deleted manually.
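
The TTL lifecycle can be sketched in plain Python as follows. This is a conceptual model only, not Pixie's implementation: `TracepointRegistry`, `parse_ttl`, and the `short_lived` tracepoint name are hypothetical, and refreshing the TTL on upsert is an assumption.

```python
import time

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_ttl(ttl):
    """Convert a duration string such as '10m' into seconds (hypothetical helper)."""
    return int(ttl[:-1]) * _UNIT_SECONDS[ttl[-1]]


class TracepointRegistry:
    """Toy registry that tracks when each tracepoint expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # tracepoint name -> expiry time in seconds

    def upsert(self, name, ttl):
        # Insert a new tracepoint, or refresh the TTL of an existing one (assumed).
        self._expiry[name] = self._clock() + parse_ttl(ttl)

    def delete(self, name):
        # Manual deletion; removing an unknown name is a no-op.
        self._expiry.pop(name, None)

    def collect_garbage(self):
        # Remove and return every tracepoint whose TTL has elapsed.
        now = self._clock()
        expired = sorted(n for n, t in self._expiry.items() if t <= now)
        for name in expired:
            del self._expiry[name]
        return expired

    def active(self):
        return sorted(self._expiry)


# Drive the registry with a fake clock so the example is deterministic.
now = [0.0]
registry = TracepointRegistry(clock=lambda: now[0])
registry.upsert("money_sum_trac10", "10m")
registry.upsert("short_lived", "30s")

now[0] = 60.0  # one minute later
expired = registry.collect_garbage()
active_after_gc = registry.active()

registry.delete("money_sum_trac10")  # manual removal before the TTL elapses
active_after_delete = registry.active()
```

After one simulated minute only the 30-second tracepoint has been collected; the 10-minute one stays active until it is deleted by hand.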

Copyright © 2020 Pixie Labs Inc.