
Agentgateway Implementation Analysis Part 3 - Http Proxy



This article analyzes the implementation details of the HTTP Proxy main flow in Agentgateway, so that readers can understand how Agentgateway works, and how it is implemented, as an L7 HTTP proxy. Agentgateway is essentially an HTTP proxy, but it adds support for stateful AI protocols (LLM/MCP/A2A) on top of HTTP. Analyzing the main flow of the HTTP proxy layer is therefore analyzing the main flow of Agentgateway itself.

This article is excerpted from the Http Proxy Main Flow section of the open-source book “Agentgateway Insider” that I am writing, and has been organized and supplemented for publication. For more details, please refer to the book.

Agentgateway Introduction

Agentgateway is an open-source, cross-platform data plane designed for AI agent systems. It establishes secure, scalable, and maintainable bidirectional connections between agents, MCP tool servers, and LLM providers. It makes up for the shortcomings of traditional gateways in handling the MCP/A2A protocols with respect to state management, long-lived sessions, asynchronous messaging, security, observability, multi-tenancy, and more, providing enterprise-grade capabilities such as unified access, protocol upgrades, tool virtualization, authentication and permission control, traffic governance, and metrics and tracing. It also supports the Kubernetes Gateway API, dynamic configuration updates, and an embedded developer self-service portal, helping teams quickly build and scale agent-based AI environments. In my view, at its current stage Agentgateway acts more like an outbound bus (an external-dependency bus) for AI agent applications.

Http Proxy Analysis

Agentgateway Configuration File

The HTTP Proxy main flow analyzed in this section is based on the following Agentgateway configuration file:

https://github.com/labilezhu/pub-diy/blob/main/ai/agentgateway/ag-dev/devcontainer-config.yaml

config:  
  logging:
    level: debug
    fields:
      add:
        ... 
  adminAddr: "0.0.0.0:15000"  # Try specifying the full socket address

  tracing:
    otlpEndpoint: http://tracing:4317
    # otlpProtocol: http
    randomSampling: true
    clientSampling: true
    fields:
      add:
        span.name: '"openai.chat"'
        # openinference.span.kind: '"LLM"'
        llm.system: 'llm.provider'
        llm.params.temperature: 'llm.params.temperature'
        # By default, prompt and completions are not sent; enable them.
        request.headers: 'request.headers'
        request.body: 'request.body'
        request.response.body: 'response.body'

        llm.completion: 'llm.completion'
        llm.input_messages: 'flattenRecursive(llm.prompt.map(c, {"message": c}))'
        gen_ai.prompt: 'flattenRecursive(llm.prompt)'
        llm.output_messages: 'flattenRecursive(llm.completion.map(c, {"role":"assistant", "content": c}))'
binds:
- port: 3100
  listeners:
  - routes:
    - policies:
        urlRewrite:
          authority: #also known as “hostname”
            full: dashscope.aliyuncs.com
          # path:
          #   full: "/compatible-mode/v1"
        requestHeaderModifier:
          set:
            Host: "dashscope.aliyuncs.com" #force set header because "/compatible-mode/v1/models: passthrough" auto set header to 'api.openai.com' by default
        backendTLS: {}
        backendAuth:
          key: "sk-abc"

      backends:
      - ai:
          name: qwen-plus
          hostOverride: dashscope.aliyuncs.com:443
          provider:
            openAI: 
              model: qwen-plus
          policies:
            ai:
              routes:
                /compatible-mode/v1/chat/completions: completions
                /compatible-mode/v1/models: passthrough
                "*": passthrough

- port: 3101
  listeners:
  - routes:
    - policies:
        cors:
          allowOrigins:
            - "*"
          allowHeaders:
            - mcp-protocol-version
            - content-type
            - cache-control
        requestHeaderModifier:
          add:
            Authorization: "Bearer abc"            
      backends:
      - mcp:
          targets:
          - name: home-assistant
            mcp:
              host: http://192.168.1.68:8123/api/mcp           

Trigger LLM Request

curl -v http://localhost:3100/compatible-mode/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "any-model-name",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'  

Response:

* Host localhost:3100 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:3100...
* Connected to localhost (::1) port 3100
* using HTTP/1.x
> POST /compatible-mode/v1/chat/completions HTTP/1.1
> Host: localhost:3100
> User-Agent: curl/8.14.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 104
> 
* upload completely sent off: 104 bytes
< HTTP/1.1 200 OK
< vary: Origin,Access-Control-Request-Method,Access-Control-Request-Headers, Accept-Encoding
< x-request-id: 120e5847-3394-923c-a494-8eb9f81cb36e
< x-dashscope-call-gateway: true
< content-type: application/json
< server: istio-envoy
< req-cost-time: 873
< req-arrive-time: 1766635349727
< resp-start-time: 1766635350601
< x-envoy-upstream-service-time: 873
< date: Thu, 25 Dec 2025 04:02:30 GMT
< transfer-encoding: chunked
< 
* Connection #0 to host localhost left intact
{"model":"qwen-plus","usage":{"prompt_tokens":10,"completion_tokens":20,"total_tokens":30,"prompt_tokens_details":{"cached_tokens":0}},"choices":[{"message":{"content":"Hello! ٩(◕‿◕。)۶ How can I assist you today?","role":"assistant"},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","created":1766635351,"system_fingerprint":null,"id":"chatcmpl-120e5847-3394-923c-a494-8eb9f81cb36e"}

Http Proxy Main Flow Chart

1. L4 Connection Accept Flow Chart

By debugging in VS Code, we can trace the HTTP Proxy main flow shown in the figure below:

Figure: Http Proxy Main Flow

Open with Draw.io

Double-clicking a ⚓ icon in the figure jumps to the corresponding source location in a local VS Code. See the Source Code Navigation Diagram Links to VSCode Source Code section of the book.

As the figure shows, the main HTTP proxy logic lives in the Gateway struct. There are two key spawn points (a simplified sketch follows this list):

  1. In Gateway::run(), a Gateway::run_bind() async future is spawned for each listening port. This task is responsible for listening on the port and accepting new connections.
  2. When Gateway::run_bind() accepts a new connection, it spawns a Gateway::handle_tunnel() async future for that connection. This task handles all events on the connection.
  • If the connection’s tunnel protocol is Direct (i.e., a plain, un-tunneled connection), it calls Gateway::proxy_bind() to hand the connection over to the HTTPProxy module for processing.
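
Conceptually, the two spawn points can be pictured with the following minimal tokio sketch. This is not agentgateway's actual code: run_bind and handle_connection are illustrative stand-ins for Gateway::run_bind() and Gateway::handle_tunnel(), and error handling is reduced to a log line.

use std::net::SocketAddr;

use tokio::net::{TcpListener, TcpStream};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // In agentgateway, Gateway::run() spawns one run_bind-like task per configured bind port.
    run_bind("0.0.0.0:3100").await
}

async fn run_bind(addr: &str) -> std::io::Result<()> {
    // One task per listening port: owns the listening socket and accepts new connections.
    let listener = TcpListener::bind(addr).await?;
    loop {
        let (stream, peer) = listener.accept().await?;
        // One task per accepted connection: owns all events for that connection,
        // playing the role of Gateway::handle_tunnel().
        tokio::spawn(async move {
            if let Err(e) = handle_connection(stream, peer).await {
                eprintln!("connection from {peer} failed: {e}");
            }
        });
    }
}

async fn handle_connection(stream: TcpStream, _peer: SocketAddr) -> std::io::Result<()> {
    // For a Direct tunnel, agentgateway hands the stream to the HTTP proxy layer
    // (Gateway::proxy_bind() -> HTTPProxy); this sketch simply closes the connection.
    drop(stream);
    Ok(())
}

Spawning one task per connection keeps a slow or stalled connection from blocking the accept loop or other connections, which is the usual reason for this two-level spawn structure.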

2. L7 HTTP Layer Flow

  1. Gateway::proxy() calls the HTTP Server module of hyper-util to read and parse the HTTP request head. Once parsing is complete, hyper calls back into HTTPProxy::proxy() (a simplified sketch follows).
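
A minimal sketch of this hand-off, assuming hyper 1.x with hyper-util (it is not agentgateway's actual code): the stream accepted above is given to hyper-util's auto HTTP/1 + HTTP/2 server connection builder, and the service_fn closure stands in for the callback into HTTPProxy::proxy() that runs once the request head has been parsed.

use std::convert::Infallible;

use http_body_util::Full;
use hyper::body::{Bytes, Incoming};
use hyper::service::service_fn;
use hyper::{Request, Response};
use hyper_util::rt::{TokioExecutor, TokioIo};
use hyper_util::server::conn::auto;
use tokio::net::TcpStream;

async fn serve_http(stream: TcpStream) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Auto-detects HTTP/1.1 vs HTTP/2 on the accepted stream.
    let builder = auto::Builder::new(TokioExecutor::new());
    builder
        .serve_connection(
            TokioIo::new(stream),
            // hyper invokes this once per request, after reading the request head;
            // in agentgateway this is where HTTPProxy::proxy() takes over.
            service_fn(|req: Request<Incoming>| async move {
                let body = format!("{} {}", req.method(), req.uri().path());
                Ok::<_, Infallible>(Response::new(Full::new(Bytes::from(body))))
            }),
        )
        .await
}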

3. L8 AI Proxy Route Layer

  1. HTTPProxy::proxy_internal() applies the configured Policies and matches Routes, until HTTPProxy::attempt_upstream() initiates the call to the upstream (with the current configuration, the LLM AI Provider backend). A conceptual sketch of policy application follows.
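
To make "applies Policies and matches Routes" concrete, here is a purely conceptual sketch of applying the urlRewrite and requestHeaderModifier policies from the configuration above. The RoutePolicies type and apply_policies function are illustrative inventions, not agentgateway's real types, and the authority rewrite is simplified to setting the Host header.

use hyper::header::{HeaderName, HeaderValue, HOST};
use hyper::Request;

struct RoutePolicies {
    rewrite_authority: Option<String>,  // urlRewrite.authority.full, e.g. dashscope.aliyuncs.com
    set_headers: Vec<(String, String)>, // requestHeaderModifier.set, e.g. ("Host", "dashscope.aliyuncs.com")
}

fn apply_policies<B>(req: &mut Request<B>, policies: &RoutePolicies) {
    // urlRewrite: rewrite the target authority (a.k.a. hostname); simplified here
    // to overwriting the Host header.
    if let Some(authority) = &policies.rewrite_authority {
        if let Ok(value) = HeaderValue::from_str(authority) {
            req.headers_mut().insert(HOST, value);
        }
    }
    // requestHeaderModifier.set: overwrite headers with the configured values.
    for (name, value) in &policies.set_headers {
        if let (Ok(name), Ok(value)) =
            (name.parse::<HeaderName>(), HeaderValue::from_str(value))
        {
            req.headers_mut().insert(name, value);
        }
    }
    // After all policies run, route matching selects a backend (here the AI backend
    // for qwen-plus) and attempt_upstream() sends the rewritten request to it.
}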

4. Upstream (Backend) Call

  1. HTTPProxy::make_backend_call() uses the HTTP Client module of hyper-util to build and send HTTP requests to the upstream, including connection-pool management logic (an illustrative client example follows).
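
As an illustration of what a pooled hyper-util client call looks like (the legacy Client keeps a pool of idle connections keyed by scheme and authority), the sketch below sends the same chat-completions request as the earlier curl example through the gateway on port 3100. It is not the make_backend_call() implementation, just a standalone example of the client module that function builds on.

use http_body_util::Full;
use hyper::body::Bytes;
use hyper::Request;
use hyper_util::client::legacy::{connect::HttpConnector, Client};
use hyper_util::rt::TokioExecutor;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Pooled HTTP/1 client; idle connections are reused across requests.
    let client: Client<HttpConnector, Full<Bytes>> =
        Client::builder(TokioExecutor::new()).build_http();

    // Same chat-completions request as the curl example, sent via the gateway.
    let req = Request::builder()
        .method("POST")
        .uri("http://localhost:3100/compatible-mode/v1/chat/completions")
        .header("content-type", "application/json")
        .body(Full::new(Bytes::from(
            r#"{"model":"any-model-name","messages":[{"role":"user","content":"Hello!"}]}"#,
        )))?;

    let resp = client.request(req).await?;
    println!("status: {}", resp.status());
    Ok(())
}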

WRITTEN BY
Mark Zhu
I'm open to work