Blog: Controlling AI Spend w/ AppNet+agentgateway #5698
therealmitchconnors wants to merge 1 commit into Azure:master
Note to self: need to update parameters to point to AppNet control plane, not OSS Istio...
Pull request overview
This PR adds a new Docusaurus blog post describing a platform-layer pattern to control shared AI quota/spend by combining Azure Kubernetes Application Network (AppNet) identity (mTLS) with agentgateway token-based rate limiting.
Changes:
- Adds a new blog post under website/blog/2026-04-09-appnet-agentgateway/.
- Documents an architecture and example manifests for per-application token rate limiting.
- Includes an example validation flow showing success (200) and throttling (429).
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
author: Mitch Connors, John Howard
ms.author: mconnors
ms.topic: conceptual
ms.service: azure-kubernetes-service
ms.subservice: application-network
ms.date: 04/03/2026
Front matter doesn’t match the blog post conventions used elsewhere in website/blog/ (expects date, authors as keys from website/blog/authors.yml, and tags as keys from website/blog/tags.yml). The current author: + ms.* metadata likely won’t be picked up by Docusaurus and may break listing/attribution/tag pages.
- description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
- author: Mitch Connors, John Howard
- ms.author: mconnors
- ms.topic: conceptual
- ms.service: azure-kubernetes-service
- ms.subservice: application-network
- ms.date: 04/03/2026
+ date: 2026-04-09
+ description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
+ authors: [mitch-connors, john-howard]
+ tags: [aks, application-network, ai]
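A check like the following could catch this class of front matter problem before review. It is only a sketch in Python: the required keys (`date`, `authors`, `tags`) come from the comment above, while the function names and the naive line-based parsing are assumptions for illustration and not part of the repo's tooling.

```python
# Keys the review comment says blog posts are expected to carry.
REQUIRED_KEYS = ("date", "authors", "tags")


def front_matter_keys(markdown_text):
    """Return the top-level keys found between the leading '---' markers."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter block at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        # Only treat unindented "key: value" lines as top-level keys.
        if ":" in line and not line.startswith((" ", "\t", "-")):
            keys.append(line.split(":", 1)[0].strip())
    return keys


def missing_keys(markdown_text):
    """Return the required keys that are absent, in a stable order."""
    present = set(front_matter_keys(markdown_text))
    return [key for key in REQUIRED_KEYS if key not in present]
```

With the original front matter, `author` and `ms.date` do not count as `authors` and `date`, so all three keys would be reported as missing.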
# Control AI spend with per-application token rate limiting using Application Network and agentgateway
This post includes an explicit H1 (# ...) even though the blog layout already renders the title from front matter. Other posts in website/blog/ don’t include a redundant H1, and keeping both typically results in duplicate top-level headings.
- # Control AI spend with per-application token rate limiting using Application Network and agentgateway
This article describes a **platform-oriented approach** to controlling AI spend using **Azure Kubernetes Application Network** and **agentgateway**. By leveraging **workload identity already present in the network**, you can enforce **per-application, token-based rate limiting** without issuing API keys to every application.

**Azure Kubernetes Application Network** (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.
The intro is missing the <!-- truncate --> marker that most posts use to control the blog listing excerpt. Add <!-- truncate --> after the opening 1–3 paragraphs so the homepage/blog index doesn’t render the full article.
+ <!-- truncate -->
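To make the effect of the marker concrete, here is a minimal Python sketch of how a listing excerpt can be derived from a post body. It only mimics the behavior described in the comment; it is not Docusaurus code, and the function name is hypothetical.

```python
TRUNCATE_MARKER = "<!-- truncate -->"


def split_excerpt(post_body):
    """Split a post into (excerpt, rest) at the first truncate marker.

    Returns (None, post_body) when the marker is absent, which corresponds
    to the situation flagged above: nothing bounds the excerpt.
    """
    if TRUNCATE_MARKER not in post_body:
        return None, post_body
    excerpt, rest = post_body.split(TRUNCATE_MARKER, 1)
    return excerpt.strip(), rest.strip()
```

Running this over the posts in a blog directory and flagging every `None` excerpt would be one way to lint for the missing marker.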
| App2 --> Z1 -->|id:b| AG --> AI | ||
| ``` | ||
In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.
Spelling: “acessible” should be “accessible”.
- In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.
+ In contrast with our initial scenario, the Azure Foundry API Key is only accessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.
## Deep Dive

While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:
Spelling: “configration” should be “configuration”.
- While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:
+ While the complete configuration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:
| date: Fri, 03 Apr 2026 22:59:18 GMT | ||
| ``` | ||
Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.
Grammar: “each application has it's own rate limit bucket” should use the possessive “its”.
- Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.
+ Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has its own rate limit bucket, keyed on AppNet identity.
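The behavior this paragraph describes (one bucket per identity, blocked with a 429 once the budget is spent, reset after a minute) can be illustrated with a small Python sketch. This is not agentgateway's implementation: the class name, the fixed-window strategy, the up-front token count, and the identity strings are all assumptions made for illustration.

```python
import time


class PerIdentityLimiter:
    """Fixed-window token budget, with one independent bucket per identity."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit                    # tokens allowed per window (hypothetical)
        self.window_seconds = window_seconds  # the post describes a 1 minute reset
        self.clock = clock                    # injectable so the sketch is testable
        self.buckets = {}                     # identity -> [tokens_used, window_start]

    def check(self, identity, tokens):
        """Return 200 if the request fits the identity's budget, else 429."""
        now = self.clock()
        bucket = self.buckets.get(identity)
        # Start a fresh window for a new identity or after the window has elapsed.
        if bucket is None or now - bucket[1] >= self.window_seconds:
            bucket = [0, now]
            self.buckets[identity] = bucket
        if bucket[0] + tokens > self.limit:
            return 429  # budget exhausted for this identity only
        bucket[0] += tokens
        return 200
```

Because the dictionary is keyed on the caller's identity, exhausting the budget for a hypothetical `"httpbin"` identity leaves every other identity's bucket untouched, which is the property the paragraph relies on.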
## Conclusion

By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
No newline at end of file
Spelling: “checkout” should be “check out”.
- By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
+ By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more check out (AppNet Docs) and (agentgateway docs).
## Conclusion

By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
No newline at end of file
The conclusion ends with placeholder doc references (“(AppNet Docs) and (agentgateway docs)”). Replace these with real Markdown links (and descriptive link text) before merging, otherwise the published post will contain broken references.
- By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
+ By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open-source tools like agentgateway. To learn more, see the [Azure Kubernetes Service application routing documentation](https://learn.microsoft.com/azure/aks/app-routing) and the [agentgateway documentation on GitHub](https://github.com/kubernetes-sigs/agentgateway).
@@ -0,0 +1,207 @@
---
title: Control AI spend with per-application token rate limiting using Application Network and agentgateway
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
The description in front matter appears longer than the repo’s SEO guideline target (150–160 characters). Consider shortening it so it fits typical meta description display limits.
- description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
+ description: Use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services.
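The length guideline mentioned in the comment is easy to check mechanically. The sketch below, in Python, assumes the 150 to 160 character target quoted above; the function name and the returned labels are hypothetical and not part of any repo tooling.

```python
def description_length_status(description, low=150, high=160):
    """Classify a front matter description against a character-count target."""
    length = len(description.strip())
    if length > high:
        return "above target"
    if length < low:
        return "below target"
    return "within target"
```

Calling `description_length_status(...)` on both the original and the suggested description would show whether the shortened text actually lands inside the target range rather than just under the upper bound.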
This article describes a **platform-oriented approach** to controlling AI spend using **Azure Kubernetes Application Network** and **agentgateway**. By leveraging **workload identity already present in the network**, you can enforce **per-application, token-based rate limiting** without issuing API keys to every application.

**Azure Kubernetes Application Network** (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.
Consider adding an image near the top of the post with descriptive alt text (many posts include a header/hero image to improve scanability and social sharing previews).
|  |
This is the blog equivalent of the Azure booth demo at KubeCon EU 26. It highlights capabilities of the newly launched AppNet and a "better together" story with agentgateway. Ideally timed around the release of agentgateway 1.1 around April 8.