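**Contents**

1. Overview
2. Supported policies
3. Supported configuration options
4. Policy evaluation order and priority
5. Examples and explanations
   - Example 1: drop explicit do-not-sample traces first, keep all errors, keep slow requests, sample health checks at a low rate
   - Example 2: use different tail_sampling policies to route trace data into different Jaeger stores

## 1. Overview

In the OpenTelemetry Collector, a Processor sits between the Receiver and the Exporter and is responsible for enriching, filtering, aggregating, and sampling telemetry data. Sampling commonly falls into two categories:

- Head sampling: the decision is made when the trace starts. It is fast and cheap, but little context is available at decision time.
- Tail sampling: the decision is made after the trace ends, or after waiting for a period, so it can draw on fuller context (final status code, total latency, combinations of attributes) and sample more precisely.

The TailSamplingProcessor groups the spans that share a `trace_id` and then applies a set of policies (`policies`) to decide whether the whole trace is kept or dropped. It fits the following scenarios:

- You need fine-grained rules such as "always keep errors, prefer slow requests, sample health checks at a low rate".
- You want several policies (attributes, status codes, latency, probability, rate limiting, and so on) to contribute to one decision.
- You need to balance cost against observability dynamically.

Note: all spans of a trace must reach the same Collector instance. Otherwise each instance decides on partial data and the tail-sampling decision becomes unreliable.

## 2. Supported policies

| Policy type | Description |
| --- | --- |
| `always_sample` | Samples every trace. |
| `latency` | Samples by total trace duration (`threshold_ms` / `upper_threshold_ms`). |
| `numeric_attribute` | Samples by numeric attribute range (minimum / maximum value). |
| `probabilistic` | Samples a percentage of traces. |
| `status_code` | Samples by span status code (OK / ERROR / UNSET). |
| `string_attribute` | Samples by string attribute match; supports exact match and regular expressions. |
| `trace_state` | Samples by TraceState key/value match. |
| `trace_flags` | Samples if the sampled flag is set on any span in the trace. |
| `rate_limiting` | Token-bucket rate limiting by `spans_per_second`. |
| `bytes_limiting` | Token-bucket throughput limiting by `bytes_per_second`. |
| `span_count` | Samples by the number of spans in the trace (range). |
| `boolean_attribute` | Samples by boolean attribute. |
| `ottl_condition` | Samples by OTTL boolean expressions (span / spanevent / resource / scope). |
| `and` | Samples only when all sub-policies match. |
| `not` | Negates the result of a single sub-policy. |
| `drop` | Explicitly drops the trace on a match; commonly used for high-priority exclusion. |
| `composite` | Composite policy; supports sub-policy ordering and quota allocation (rate allocation). |

The following configuration shows an example of each policy:

```yaml
processors:
  tail_sampling:
    policies:
      [
        {name: test-policy-1, type: always_sample},
        {
          name: test-policy-2,
          type: latency,
          latency: {threshold_ms: 5000, upper_threshold_ms: 10000}
        },
        {
          name: test-policy-3,
          type: numeric_attribute,
          numeric_attribute: {key: key1, min_value: 50, max_value: 100}
        },
        {
          name: test-policy-4,
          type: probabilistic,
          probabilistic: {sampling_percentage: 10}
        },
        {
          name: test-policy-5,
          type: status_code,
          status_code: {status_codes: [ERROR, UNSET]}
        },
        {
          name: test-policy-6,
          type: string_attribute,
          string_attribute: {key: key2, values: [value1, value2]}
        },
        {
          name: test-policy-7,
          type: string_attribute,
          string_attribute: {key: key2, values: [value1, val*], enabled_regex_matching: true, cache_max_size: 10}
        },
        {
          name: test-policy-8,
          type: rate_limiting,
          rate_limiting: {spans_per_second: 35, burst_capacity: 70}
        },
        {
          name: test-policy-9,
          type: bytes_limiting,
          bytes_limiting: {bytes_per_second: 1024000, burst_capacity: 2048000}
        },
        {
          name: test-policy-10,
          type: span_count,
          span_count: {min_spans: 2, max_spans: 20}
        },
        {
          name: test-policy-11,
          type: trace_state,
          trace_state: {key: key3, values: [value1, value2]}
        },
        {
          name: test-policy-12,
          type: boolean_attribute,
          boolean_attribute: {key: key4, value: true}
        },
        {
          name: test-policy-13,
          type: ottl_condition,
          ottl_condition: {
            error_mode: ignore,
            span: [
              "attributes[\"test_attr_key_1\"] == \"test_attr_val_1\"",
              "attributes[\"test_attr_key_2\"] != \"test_attr_val_1\""
            ],
            spanevent: [
              "name != \"test_span_event_name\"",
              "attributes[\"test_event_attr_key_2\"] != \"test_event_attr_val_1\""
            ]
          }
        },
        {
          name: test-policy-14,
          type: ottl_condition,
          ottl_condition: {
            error_mode: ignore,
            span: [
              "resource.attributes[\"service.name\"] == \"checkout\"",
              "span.attributes[\"http.status_code\"] == 500"
            ],
            spanevent: [
              "spanevent.name == \"exception\""
            ]
          }
        },
        {
          name: and-policy-1,
          type: and,
          and: {
            and_sub_policy: [
              {
                name: test-and-policy-1,
                type: numeric_attribute,
                numeric_attribute: {key: key1, min_value: 50, max_value: 100}
              },
              {
                name: test-and-policy-2,
                type: string_attribute,
                string_attribute: {key: key2, values: [value1, value2]}
              }
            ]
          }
        },
        {
          name: not-policy-1,
          type: not,
          not: {
            not_sub_policy: {
              name: test-not-policy-1,
              type: latency,
              latency: {threshold_ms: 1000}
            }
          }
        },
        {
          name: drop-policy-1,
          type: drop,
          drop: {
            drop_sub_policy: [
              {
                name: test-drop-policy-1,
                type: string_attribute,
                string_attribute: {key: url.path, values: [\/health, \/metrics], enabled_regex_matching: true}
              }
            ]
          }
        },
        {
          name: composite-policy-1,
          type: composite,
          composite: {
            max_total_spans_per_second: 1000,
            policy_order: [test-composite-policy-1, test-composite-policy-2, test-composite-policy-3],
            composite_sub_policy: [
              {
                name: test-composite-policy-1,
                type: numeric_attribute,
                numeric_attribute: {key: key1, min_value: 50}
              },
              {
                name: test-composite-policy-2,
                type: string_attribute,
                string_attribute: {key: key2, values: [value1, value2]}
              },
              {name: test-composite-policy-3, type: always_sample}
            ],
            rate_allocation: [
              {policy: test-composite-policy-1, percent: 50},
              {policy: test-composite-policy-2, percent: 25}
            ]
          }
        }
      ]
```

## 3. Supported configuration options

| Option | Default | Description |
| --- | --- | --- |
| `policies` | none | Required. The list of sampling policies. |
| `sampling_strategy` | `trace-complete` | Decision mode: `trace-complete` (evaluate against the accumulated trace data) or `span-ingest` (evaluate each batch as it arrives). |
| `decision_wait` | `30s` | How long to wait before deciding. Depending on the strategy, it affects when the decision is made or when the trace is cleaned up and finalized. |
| `decision_wait_after_root_received` | `0s` | Shortened wait applied once the root span has been received; `0s` disables it. |
| `num_traces` | `50000` | Maximum number of traces kept in memory. |
| `expected_new_traces_per_sec` | `0` | Expected number of new traces per second; used to size internal data structures. |
| `decision_cache.sampled_cache_size` | `0` | Size of the LRU cache of sampled decisions; used to handle late-arriving spans. |
| `decision_cache.non_sampled_cache_size` | `0` | Size of the LRU cache of not-sampled decisions. |
| `sample_on_first_match` | `false` (implied) | Decide as soon as one policy matches. |
| `drop_pending_traces_on_shutdown` | `false` (implied) | On shutdown, drop traces that are still pending instead of deciding on partial data. |
| `maximum_trace_size_bytes` | unlimited (implied) | Maximum size of a single trace in bytes; a trace that exceeds it is dropped immediately to protect the system. |

Example of these options:

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 100
    expected_new_traces_per_sec: 10
    decision_cache:
      sampled_cache_size: 100_000
      non_sampled_cache_size: 100_000
    policies: [...]
```

## 4. Policy evaluation order and priority

The TailSamplingProcessor first has every policy produce a decision and then merges those decisions into the final result. The core priority can be understood as follows:

1. If any `drop` decision matches, the trace is not sampled (highest priority).
2. Otherwise, if there is any `sample` decision, the trace is sampled.
3. Otherwise, the trace is not sampled.

Additional points:

- Inside a `composite` policy, `policy_order` controls the order in which sub-policies run, and `rate_allocation` allocates quota among them.
- With `sample_on_first_match` enabled, the first match decides. This reduces evaluation cost but changes the "evaluate all policies, then decide" behavior.
- The inverted decisions mentioned in the documentation (`InvertSampled` / `InvertNotSampled`) are marked as deprecated; `drop` and `not` are the recommended way to express the same intent.

## 5. Examples and explanations

### Example 1: drop explicit do-not-sample traces first, keep all errors, keep slow requests, sample health checks at a low rate

The following `tail_sampling` configuration implements "drop explicit do-not-sample traces first, keep all errors, keep slow requests, sample health checks at a low rate".

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 10000
    expected_new_traces_per_sec: 200
    decision_cache:
      sampled_cache_size: 200000
      non_sampled_cache_size: 200000
    policies:
      [
        # 1) Explicit do-not-sample: highest-priority exclusion
        {
          name: drop-do-not-sample,
          type: drop,
          drop: {
            drop_sub_policy: [
              {
                name: do-not-sample-flag,
                type: boolean_attribute,
                boolean_attribute: {key: app.do_not_sample, value: true}
              }
            ]
          }
        },
        # 2) Keep every trace that contains an error
        {
          name: keep-errors,
          type: status_code,
          status_code: {status_codes: [ERROR]}
        },
        # 3) Keep slow requests (5 s to 30 s)
        {
          name: keep-latency,
          type: latency,
          latency: {threshold_ms: 5000, upper_threshold_ms: 30000}
        },
        # 4) Sample health checks at a low rate (0.1%)
        {
          name: health-probe-low-rate,
          type: and,
          and: {
            and_sub_policy: [
              {
                name: probe-route,
                type: string_attribute,
                string_attribute: {key: http.route, values: [/live, /ready], enabled_regex_matching: true}
              },
              {
                name: probe-prob,
                type: probabilistic,
                probabilistic: {sampling_percentage: 0.1}
              }
            ]
          }
        },
        # 5) Fallback for all remaining traffic: 10%
        {
          name: fallback-prob,
          type: probabilistic,
          probabilistic: {sampling_percentage: 10}
        }
      ]
```

Explanation:

- The configuration first uses `drop` as a hard exclusion, so traces without value do not consume the sampling budget.
- Errors and slow requests are kept through the `status_code` and `latency` policies, which preserves the ability to diagnose problems.
- Health checks and other high-frequency, low-value traffic are sampled at a low probability.
- A fallback probabilistic policy caps overall cost while keeping a representative global sample.

One caveat follows from the merge rule in section 4: any `sample` decision keeps the trace, so a health-check trace that `health-probe-low-rate` does not select can still be kept by `fallback-prob` at 10%. To actually hold probes near 0.1%, the fallback would need to exclude the probe routes, for example through an `and` policy that also rules them out.

### Example 2: use different tail_sampling policies to route trace data into different Jaeger stores

Here is a scenario closer to real-world use: a single OTLP receiver feeds the same traces into two tail-sampling pipelines, and each pipeline exports to a different Jaeger store. One pipeline samples at 80% probability; the other keeps a trace only when the explicit sampling key `app.force_sample=true` is present.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}

processors:
  tail_sampling/prob80:
    decision_wait: 10s
    num_traces: 50000
    expected_new_traces_per_sec: 1000
    policies:
      [
        {
          name: probabilistic-80,
          type: probabilistic,
          probabilistic: {sampling_percentage: 80}
        }
      ]
  tail_sampling/force-true:
    decision_wait: 10s
    num_traces: 50000
    expected_new_traces_per_sec: 1000
    policies:
      [
        {
          name: force-sample-true,
          type: boolean_attribute,
          boolean_attribute: {key: app.force_sample, value: true}
        }
      ]

exporters:
  jaeger/prob80:
    endpoint: jaeger-a:14250
    tls:
      insecure: true
  jaeger/force-true:
    endpoint: jaeger-b:14250
    tls:
      insecure: true

service:
  pipelines:
    traces/prob80:
      receivers: [otlp]
      processors: [tail_sampling/prob80]
      exporters: [jaeger/prob80]
    traces/force-true:
      receivers: [otlp]
      processors: [tail_sampling/force-true]
      exporters: [jaeger/force-true]
```

Explanation:

- Both pipelines share the same `otlp` receiver, so applications only need to report their data once.
- `traces/prob80` handles regular traffic and sends 80% of traces to the first Jaeger store.
- `traces/force-true` only lets through traces that carry `app.force_sample=true`, which suits manual forced retention or issue tracking.

## References

- https://opentelemetry.io/docs/concepts/sampling/
- https://opentelemetry.io/docs/collector/configuration/#processors
- https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/tailsamplingprocessor/README.md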
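Supplement to Example 2: newer Collector releases may no longer include the `jaeger` exporter, while Jaeger itself can ingest OTLP directly. Under that assumption, the exporter and pipeline sections of Example 2 could be sketched as below. The `otlp/...` exporter names and port 4317 (the conventional OTLP gRPC port) are assumptions and not part of the original example; `jaeger-a` and `jaeger-b` are the hosts used in Example 2.

```yaml
exporters:
  # OTLP over gRPC to the first Jaeger store (port 4317 is an assumption)
  otlp/prob80:
    endpoint: jaeger-a:4317
    tls:
      insecure: true
  # OTLP over gRPC to the second Jaeger store (port 4317 is an assumption)
  otlp/force-true:
    endpoint: jaeger-b:4317
    tls:
      insecure: true

service:
  pipelines:
    traces/prob80:
      receivers: [otlp]
      processors: [tail_sampling/prob80]
      exporters: [otlp/prob80]
    traces/force-true:
      receivers: [otlp]
      processors: [tail_sampling/force-true]
      exporters: [otlp/force-true]
```

The receivers and the two `tail_sampling` processors stay exactly as in Example 2; only the exporter type and the exporter names referenced by the pipelines change.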