Appendix: Public Issue Sample for LLM Gateway Product Research

Date: 2026-05-25
Corpus size: 75 public GitHub issue observations
Scope: LLM gateway, AI gateway, API gateway with AI plugins, and adjacent LLM serving projects

A. Research Purpose

The issue sample supports a product-research claim in the blog post: public LLM gateway friction is not limited to “calling more models”; recurring problems also appear around provider schema fidelity, streaming parity, deployment, logging, reliability, agent/tool state, security, and multimodal resource handling.

The sample does not prove that every product must implement a separate L3 layer (the business asset and compliance layer described in Section C). It supports a narrower claim: when a product involves media generation, long-running tasks, or agent/tool workflows, resource persistence and compliance evidence become a distinct concern that may sit above generic L1/L2 gateway functions.

B. Sampling Frame

Issues were collected from public GitHub issue search pages on 2026-05-25. The project frame includes:

  • APIPark, LiteLLM, Higress, New API, Portkey, Bifrost
  • APISIX, Envoy AI Gateway, Helicone
  • vLLM, SGLang, llama.cpp

Queries combined recent issue listings with high-relevance keywords such as stream, tool, deploy, log, openai, retry, and image. Issues already cited in the article were also retained when they were strongly relevant to serving or compatibility behavior.

C. Coding Protocol

Coding unit: one issue observation.

Primary category is mutually exclusive:

Code                        Definition
provider_adapter_schema     Provider compatibility, model/channel configuration, OpenAI-compatible schema, parameter preservation.
streaming_protocol_parity   Streaming vs non-streaming behavior, SSE/chunk handling, Responses API deltas, reasoning/thinking fields.
agent_context_state         MCP, tool/function calling, agent context, parser/renderer, state offload, context compression.
deployment_ops              Offline install, Docker/Compose/Kubernetes/Helm, Gateway API compatibility, private-network deployment.
logging_audit_cost          Request log archive, audit traces, metrics, usage, cost calculation, billing and token accounting.
reliability_routing_key     Retry, fallback, timeout, latency, rate limits, key rotation/disablement, health checking.
multimodal_resource         Images, audio, video, files, URLs, base64, object storage and persistent resource handling.
security_guardrails         Auth, permission, tenant isolation, secrets, guardrails, prompt/tool safety and abuse prevention.

Secondary tags are non-exclusive and preserve overlapping interpretations. build_locus marks whether a capability is better suited to community build-in, platform self-operation, or a mixed approach. relevance_to_l3 does not refer to OSI Layer 3; it estimates how strongly the issue touches the L3 business asset and compliance layer, which covers durable resources, object storage/CDN persistence, permissions, audit evidence, long-term logs, cost ledger, and policy enforcement.
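
The coding protocol above can be sketched as a small validated record type. This is a minimal illustration in Python, not the tooling used for the sample. The eight codes, build_locus, and relevance_to_l3 come from this section. The class name IssueObservation, the fields issue_ref, primary_category, and secondary_tags, the lowercase value spellings, and the choice to draw secondary tags from the same eight codes are all assumptions made for the example.

```python
from dataclasses import dataclass

# The eight primary codes listed in Section C.
PRIMARY_CODES = frozenset({
    "provider_adapter_schema",
    "streaming_protocol_parity",
    "agent_context_state",
    "deployment_ops",
    "logging_audit_cost",
    "reliability_routing_key",
    "multimodal_resource",
    "security_guardrails",
})

# Value spellings below are hypothetical identifiers for the three loci
# and three relevance levels reported in Section D.
BUILD_LOCI = frozenset({"community_build_in", "self_operated", "mixed"})
L3_LEVELS = frozenset({"low", "medium", "high"})


@dataclass(frozen=True)
class IssueObservation:
    """One coded issue observation (field names are assumptions)."""
    issue_ref: str              # hypothetical identifier, e.g. "owner/repo#123"
    primary_category: str       # exactly one code: mutually exclusive
    secondary_tags: frozenset   # zero or more codes: non-exclusive
    build_locus: str
    relevance_to_l3: str

    def __post_init__(self):
        # Reject any value outside the protocol's closed vocabularies.
        if self.primary_category not in PRIMARY_CODES:
            raise ValueError(f"unknown primary category: {self.primary_category!r}")
        unknown = set(self.secondary_tags) - PRIMARY_CODES
        if unknown:
            raise ValueError(f"unknown secondary tags: {sorted(unknown)}")
        if self.build_locus not in BUILD_LOCI:
            raise ValueError(f"unknown build locus: {self.build_locus!r}")
        if self.relevance_to_l3 not in L3_LEVELS:
            raise ValueError(f"unknown L3 relevance: {self.relevance_to_l3!r}")
```

Holding the primary category as a single string while secondary tags form a set mirrors the exclusive versus non-exclusive distinction in the protocol.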

D. Aggregate Results

Primary categories:

Category                       Count   Share
Provider adapter / schema         18   24.0%
Deployment / ops                  12   16.0%
Agent / tool state                11   14.7%
Logging / audit / cost             9   12.0%
Streaming / protocol parity        9   12.0%
Reliability / routing / key        7    9.3%
Security / guardrails              5    6.7%
Multimodal / resource              4    5.3%

Build locus:

Locus                 Count   Share
Community build-in       39   52.0%
Self-operated            21   28.0%
Mixed                    15   20.0%

L3 relevance:

Relevance   Count   Share
Medium         34   45.3%
High           29   38.7%
Low            12   16.0%
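
Each share in this section is the count divided by the 75-observation corpus size, rounded to one decimal place. The following Python sketch recomputes the shares and checks that every breakdown sums to the full corpus. The counts and labels are copied from the tables in this section; the function name shares and the dictionary names are hypothetical.

```python
CORPUS_SIZE = 75  # corpus size stated in the appendix header

PRIMARY_COUNTS = {
    "Provider adapter / schema": 18,
    "Deployment / ops": 12,
    "Agent / tool state": 11,
    "Logging / audit / cost": 9,
    "Streaming / protocol parity": 9,
    "Reliability / routing / key": 7,
    "Security / guardrails": 5,
    "Multimodal / resource": 4,
}
BUILD_LOCUS_COUNTS = {"Community build-in": 39, "Self-operated": 21, "Mixed": 15}
L3_RELEVANCE_COUNTS = {"Medium": 34, "High": 29, "Low": 12}


def shares(counts, total=CORPUS_SIZE):
    """Return percentage shares rounded to one decimal place.

    Raises ValueError if the counts do not cover the whole corpus, which
    would indicate a missing or double-counted observation.
    """
    if sum(counts.values()) != total:
        raise ValueError(f"counts sum to {sum(counts.values())}, expected {total}")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    for name, table in [
        ("Primary categories", PRIMARY_COUNTS),
        ("Build locus", BUILD_LOCUS_COUNTS),
        ("L3 relevance", L3_RELEVANCE_COUNTS),
    ]:
        print(name)
        for label, pct in shares(table).items():
            print(f"  {label:<30}{pct:>5.1f}%")
```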

E. Interpretation

The sample supports three cautious observations:

  1. Provider adapter and schema fidelity is the largest class in this sample. This supports treating adapter tests, schema preservation, OpenAI-compatible edge cases, and smoke cases as community-buildable infrastructure.
  2. Deployment, streaming parity, and agent/tool state are large enough to be considered first-class gateway concerns, not occasional edge cases.
  3. Multimodal resource and security samples are numerically smaller but have higher L3 relevance. They often imply persistence, permission, audit, or policy behavior outside a pure HTTP proxy.

F. Limits

This is a purposeful public issue sample, not a probability sample. Counts are influenced by project popularity, maintainer workflow, user reporting habits, and GitHub search ranking. The sample is suitable for appendix evidence and taxonomy-building, but not for estimating true defect incidence across the whole AI gateway market.

Full row-level data is stored in llm-gateway-issue-sample.csv.
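
Readers who want to re-derive the aggregate counts from the row-level file could tally one column at a time. The sketch below uses only the Python standard library. The column layout of llm-gateway-issue-sample.csv is not documented in this appendix, so the presence of a header row and any column name passed to the function (for example primary_category) are assumptions to be checked against the actual file. The function name tally_column is hypothetical.

```python
import csv
from collections import Counter


def tally_column(csv_file, column):
    """Count the values in one column of an open CSV file.

    Assumes the file has a header row. The column name is supplied by the
    caller because the real schema is not documented in this appendix.
    """
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in {reader.fieldnames}")
    # Treat missing cells as empty strings so short rows do not crash the tally.
    return Counter((row[column] or "").strip() for row in reader)
```

A call would look like tally_column(open("llm-gateway-issue-sample.csv", newline="", encoding="utf-8"), "primary_category"), with the column name adjusted to whatever header the file actually uses.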