Poster in Workshop: Temporal Graph Learning Workshop @ NeurIPS 2023
Gen-T: Reduce Distributed Tracing Operational Costs Using Generative Models
Saar Tochner · Giulia Fanti · Vyas Sekar
Abstract:
Distributed tracing (DT) is an important aspect of modern microservice operations. It allows operators to troubleshoot problems by modeling the sequence of services that a specific request traverses in the system. However, transmitting traces incurs significant costs. This forces operators to use coarse-grained prefiltering or sampling techniques, creating undesirable tradeoffs between cost and fidelity. We propose to circumvent these issues by using generative modeling to capture the semantic structure of collected traces in a lossy-yet-succinct way. Realizing this potential in practice, however, is challenging. Naively extending ideas from the literature on deep generative models for time-series generation or graph generation can result in poor cost-fidelity tradeoffs. In designing and implementing Gen-T, we tackle key algorithmic and systems challenges to make deep generative models practical for DT. We design a hybrid generative model that separately models different components of DT data and conditionally stitches them together. Our system, Gen-T, which has been integrated with the widely used OpenTelemetry framework, achieves a level of fidelity comparable to that of 1:15 sampling, which is more fine-grained than the default 1:20 sampling setting in the OpenTelemetry documentation, while maintaining a cost profile equivalent to that of 1:100 lossless-compressed sampling (i.e., a 7$\times$ volume reduction).
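To make the 1:N sampling ratios in the abstract concrete, the following is a minimal sketch of deterministic ratio-based trace sampling and the volume arithmetic behind the comparison. It is not Gen-T and not OpenTelemetry's sampler API: the function names `keep_trace` and `relative_volume`, the SHA-256 bucketing scheme, and the synthetic trace IDs are all hypothetical choices made for illustration. Reading the 7$\times$ figure as the ratio between 1:15 and 1:100 sampling volumes is also an assumption, not something the abstract states explicitly.

```python
import hashlib


def keep_trace(trace_id: str, n: int) -> bool:
    """Hypothetical 1:n sampler: keep a trace iff its hashed ID lands in bucket 0 of n.

    Hashing the trace ID (rather than drawing a random number) makes the
    decision deterministic, so every caller agrees on whether a trace is kept.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n == 0


def relative_volume(fine_n: int, coarse_n: int) -> float:
    """How many times more traces 1:fine_n sampling keeps than 1:coarse_n sampling."""
    return coarse_n / fine_n


if __name__ == "__main__":
    # Synthetic trace IDs, used only for illustration.
    ids = [f"trace-{i}" for i in range(100_000)]
    for n in (15, 20, 100):
        kept = sum(keep_trace(t, n) for t in ids)
        print(f"1:{n} sampling kept {kept} of {len(ids)} traces")

    # Assumed reading of the abstract: 1:15 fidelity at 1:100 cost.
    print(f"1:15 vs 1:100 volume ratio: {relative_volume(15, 100):.2f}")
```

Under this reading, 1:15 sampling transmits 100/15 (about 6.7) times as many traces as 1:100 sampling, which is in line with the roughly 7$\times$ volume reduction quoted in the abstract.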