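Serilog + OpenTelemetry: Correlating Request Logs with Distributed Traces in Practice
This article goes straight to the implementation and skips the background on structured logging. It has one goal: in an ASP.NET Core service, connect Serilog logs with OpenTelemetry tracing so that during troubleshooting you can jump from a single error log entry directly to the full trace.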
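1. Background: what this article delivers
Work through the steps below and you end up with a troubleshooting path you can actually follow:
- Logs consistently carry TraceId, SpanId, RequestPath, and RequestMethod
- Business logs consistently emit key fields such as OrderId and TenantId
- When a downstream HTTP call fails, the log entry leads straight back to the same TraceId
- On-call troubleshooting is a fixed 3-step routine: find the log -> take the TraceId -> open the trace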
The examples assume this runtime environment:
- .NET 8
- Serilog
- OpenTelemetry + OTLP
2. How it works: conventions and the field contract
Settle the conventions first so the configuration does not have to be reworked later.
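2.1 Platform-level fields (must be uniform)
- TraceId: unique key for the entire request trace
- SpanId: unique key for the current span
- RequestPath: request path
- RequestMethod: HTTP request method
- StatusCode: HTTP response status code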
2.2 Business-level fields (add per scenario)
- Order domain: OrderId, CustomerId
- Multi-tenancy: TenantId
- External dependencies: DependencyName, DownstreamStatusCode
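Business fields that apply to a whole request, such as TenantId, can be pushed once in middleware instead of being repeated in every log call. The following is a minimal sketch under stated assumptions: the extension method name UseTenantLogContext is hypothetical, and so is the convention that the tenant id arrives in an "X-Tenant-Id" request header. Adapt both to however your service actually resolves the tenant.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Serilog.Context;

public static class TenantLoggingExtensions
{
    // Hypothetical helper: pushes TenantId into the Serilog LogContext
    // for the lifetime of the request.
    public static IApplicationBuilder UseTenantLogContext(this IApplicationBuilder app) =>
        app.Use(async (HttpContext context, RequestDelegate next) =>
        {
            // Assumption: the tenant id is sent in an "X-Tenant-Id" header.
            // StringValues.ToString() yields an empty string when the header is absent.
            var tenantId = context.Request.Headers["X-Tenant-Id"].ToString();

            using (LogContext.PushProperty("TenantId", tenantId))
            {
                await next(context);
            }
        });
}
```

The property only reaches the output if the logger is configured with Enrich.FromLogContext().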
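2.3 Field naming rules (decide once)
- Use PascalCase throughout
- Keep a single name for synonymous fields; for example, use only TraceId
- Keep business field names stable; do not rename them when the message wording changes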
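One way to keep these names from drifting is to define them once and reference the constants wherever a property is pushed by name. The sketch below is only an illustration; the class name LogFields is hypothetical and not part of Serilog or OpenTelemetry.

```csharp
// Hypothetical helper: a single source of truth for log field names.
public static class LogFields
{
    // Platform-level fields
    public const string TraceId = "TraceId";
    public const string SpanId = "SpanId";
    public const string RequestPath = "RequestPath";
    public const string RequestMethod = "RequestMethod";
    public const string StatusCode = "StatusCode";

    // Business-level fields
    public const string OrderId = "OrderId";
    public const string CustomerId = "CustomerId";
    public const string TenantId = "TenantId";
    public const string DependencyName = "DependencyName";
    public const string DownstreamStatusCode = "DownstreamStatusCode";
}
```

A call such as LogContext.PushProperty(LogFields.TenantId, tenantId) then cannot introduce a second spelling of the same field. Placeholders inside message templates still have to be written out by hand, so those are worth covering in code review.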
3. Example code: step-by-step implementation
3.1 Install dependencies
- Serilog.AspNetCore
- Serilog.Sinks.Console
- Serilog.Enrichers.Environment
- OpenTelemetry.Extensions.Hosting
- OpenTelemetry.Instrumentation.AspNetCore
- OpenTelemetry.Instrumentation.Http
- OpenTelemetry.Exporter.OpenTelemetryProtocol
3.2 Step 1: configure Serilog and OpenTelemetry
Put this in Program.cs to get the basic pipeline working first:
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using Serilog;
using Serilog.Events;

var builder = WebApplication.CreateBuilder(args);

// Serilog: Information by default, framework logs raised to Warning.
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .MinimumLevel.Override("Microsoft", LogEventLevel.Warning)
    .Enrich.FromLogContext() // needed for properties pushed via LogContext
    .Enrich.WithEnvironmentName()
    .WriteTo.Console(outputTemplate:
        "[{Timestamp:HH:mm:ss} {Level:u3}] TraceId={TraceId} SpanId={SpanId} {Message:lj}{NewLine}{Exception}")
    .CreateLogger();

builder.Host.UseSerilog();

const string serviceName = "order-api";

// OpenTelemetry tracing: incoming ASP.NET Core requests and outgoing HttpClient calls.
builder.Services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService(serviceName))
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddOtlpExporter(options =>
        {
            // OTLP over HTTP/protobuf; 4318 is the conventional OTLP/HTTP port.
            options.Endpoint = new Uri("http://localhost:4318/v1/traces");
            options.Protocol = OpenTelemetry.Exporter.OtlpExportProtocol.HttpProtobuf;
        }));

builder.Services.AddHttpClient("payment");

var app = builder.Build();
3.3 Step 2: inject trace fields in an entry middleware
// These using directives belong at the top of Program.cs with the others.
using System.Diagnostics;
using Serilog.Context;

// Register this before mapping endpoints so that every log written
// during the request falls inside the LogContext scope.
app.Use(async (context, next) =>
{
    var activity = Activity.Current;

    using (LogContext.PushProperty("TraceId", activity?.TraceId.ToString() ?? string.Empty))
    using (LogContext.PushProperty("SpanId", activity?.SpanId.ToString() ?? string.Empty))
    using (LogContext.PushProperty("RequestPath", context.Request.Path.Value ?? string.Empty))
    using (LogContext.PushProperty("RequestMethod", context.Request.Method))
    {
        await next();
    }
});
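After this step, every log event written while the request is being handled carries TraceId and SpanId, provided the logger is configured with Enrich.FromLogContext(). The values are captured once, when the middleware runs, so SpanId is the id of the incoming request's span rather than of any child span started later.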
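The field contract also lists StatusCode, which is only known once the response has been produced. Serilog.AspNetCore provides UseSerilogRequestLogging(), which writes one summary event per request containing RequestMethod, RequestPath, StatusCode, and Elapsed. The sketch below wraps it in an extension method; the wrapper name UseRequestSummaryLogging is hypothetical, and attaching TraceId through EnrichDiagnosticContext is a choice made here for illustration so the summary event is correlated even when it is written outside the LogContext scope above.

```csharp
using System.Diagnostics;
using Microsoft.AspNetCore.Builder;
using Serilog;

public static class RequestSummaryLoggingExtensions
{
    // Hypothetical wrapper; UseSerilogRequestLogging itself comes from Serilog.AspNetCore
    // and requires builder.Host.UseSerilog() to have been called.
    public static IApplicationBuilder UseRequestSummaryLogging(this IApplicationBuilder app) =>
        app.UseSerilogRequestLogging(options =>
        {
            // Invoked once per request, before the summary event is written.
            options.EnrichDiagnosticContext = (diagnosticContext, httpContext) =>
            {
                diagnosticContext.Set(
                    "TraceId",
                    Activity.Current?.TraceId.ToString() ?? string.Empty);
            };
        });
}
```

In Program.cs this would be called as app.UseRequestSummaryLogging() before the endpoints are mapped.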
3.4 Step 3: emit structured logs from the business endpoint and chain the downstream call
app.MapPost("/api/orders/{orderId:long}/confirm", async (
    long orderId,
    IHttpClientFactory httpClientFactory,
    ILogger<Program> logger,
    CancellationToken ct) =>
{
    logger.LogInformation("Start confirm order. OrderId={OrderId}", orderId);

    // The instrumented HttpClient creates a child span and propagates the trace context downstream.
    var client = httpClientFactory.CreateClient("payment");
    using var response = await client.PostAsync(
        $"https://payment.internal/api/payments/{orderId}/capture",
        content: null,
        ct);

    if (!response.IsSuccessStatusCode)
    {
        // DownstreamStatusCode follows the field contract in 2.2;
        // StatusCode stays reserved for this service's own response.
        logger.LogError(
            "Confirm order failed by downstream. OrderId={OrderId}, DownstreamStatusCode={DownstreamStatusCode}",
            orderId,
            (int)response.StatusCode);
        return Results.Problem(title: "confirm order failed", statusCode: 502);
    }

    logger.LogInformation("Confirm order succeeded. OrderId={OrderId}", orderId);
    return Results.Ok(new { orderId, status = "Confirmed" });
});

await app.RunAsync();
3.5 Step 4: a uniform error log template
Error logs should follow this template:
// Requires: using System.Diagnostics;
// The downstream status is logged as DownstreamStatusCode, per the field contract in 2.2.
logger.LogError(
    "Order confirm failed. TraceId={TraceId}, SpanId={SpanId}, OrderId={OrderId}, DownstreamStatusCode={DownstreamStatusCode}, DependencyName={DependencyName}",
    Activity.Current?.TraceId.ToString() ?? string.Empty,
    Activity.Current?.SpanId.ToString() ?? string.Empty,
    orderId,
    (int)response.StatusCode,
    "payment-service");
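To keep every call site on the same template, the message can be defined once with LoggerMessage.Define from Microsoft.Extensions.Logging. This is a sketch, not the article's required approach: the class name OrderLog and the event id 1001 are hypothetical. It names the downstream status DownstreamStatusCode, as in the field contract in 2.2, and leaves out TraceId and SpanId on the assumption that the middleware from step 2 already attaches them through LogContext.

```csharp
using System;
using Microsoft.Extensions.Logging;

// Hypothetical helper: one shared definition of the error template.
public static class OrderLog
{
    private static readonly Action<ILogger, long, int, string, Exception?> ConfirmFailedMessage =
        LoggerMessage.Define<long, int, string>(
            LogLevel.Error,
            new EventId(1001, "OrderConfirmFailed"), // illustrative event id
            "Order confirm failed. OrderId={OrderId}, DownstreamStatusCode={DownstreamStatusCode}, DependencyName={DependencyName}");

    // Strongly typed entry point, so call sites cannot reorder or rename the fields.
    public static void ConfirmFailed(
        this ILogger logger,
        long orderId,
        int downstreamStatusCode,
        string dependencyName) =>
        ConfirmFailedMessage(logger, orderId, downstreamStatusCode, dependencyName, null);
}
```

A call site then reads logger.ConfirmFailed(orderId, (int)response.StatusCode, "payment-service").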
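3.6 Step 5: fixed on-call troubleshooting procedure
- In the log platform, find the failed event by OrderId or error code
- Take the TraceId from that log entry
- In the APM tool, open the full trace by TraceId
- Look at each span's share of the total duration to determine whether the bottleneck is at the entry point, in middleware, in the downstream HTTP call, or in the database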
3.7 Common misconfigurations and fixes
// Mistake 1: text only, no fields. String interpolation flattens the values into the message.
logger.LogError($"confirm order failed, order={orderId}, trace={Activity.Current?.TraceId}");

// Fix: emit named fields so the log platform can search on them.
logger.LogError(
    "Confirm order failed. OrderId={OrderId}, TraceId={TraceId}",
    orderId,
    Activity.Current?.TraceId.ToString() ?? string.Empty);
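4. Summary
Once this setup is in place, on-call troubleshooting becomes a fixed, repeatable routine: find the failing log entry, use its TraceId to jump to the trace, then use span durations to locate the slow segment.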