Blog: Controlling AI Spend w/ AppNet+agentgateway #5698

Open

therealmitchconnors wants to merge 1 commit into Azure:master from therealmitchconnors:agw-blog

therealmitchconnors commented Apr 3, 2026

This is the blog equivalent of the Azure booth demo at KubeCon EU 26. It highlights capabilities of the newly launched AppNet and tells a "better together" story with agentgateway. Ideally timed to coincide with the release of agentgateway 1.1 around April 8.


          add first draft of agentgateway appnet blog

1e3f415

therealmitchconnors requested review from a team, colinmixonn and Copilot

April 3, 2026 23:26

Copilot started reviewing on behalf of therealmitchconnors

April 3, 2026 23:27

Author

therealmitchconnors commented Apr 3, 2026

Note to self: need to update parameters to point to the AppNet control plane, not OSS Istio...

Copilot AI reviewed


Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new Docusaurus blog post describing a platform-layer pattern to control shared AI quota/spend by combining Azure Kubernetes Application Network (AppNet) identity (mTLS) with agentgateway token-based rate limiting.

Changes:

Adds a new blog post under website/blog/2026-04-09-appnet-agentgateway/.
Documents an architecture and example manifests for per-application token rate limiting.
Includes an example validation flow showing success (200) and throttling (429).

website/blog/2026-04-09-appnet-agentgateway/index.md

Comment on lines +3 to +9

+              description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
+              author: Mitch Connors, John Howard
+              ms.author: mconnors
+              ms.topic: conceptual
+              ms.service: azure-kubernetes-service
+              ms.subservice: application-network
+              ms.date: 04/03/2026

Copilot AI Apr 3, 2026

Front matter doesn’t match the blog post conventions used elsewhere in website/blog/ (expects date, authors as keys from website/blog/authors.yml, and tags as keys from website/blog/tags.yml). The current author: + ms.* metadata likely won’t be picked up by Docusaurus and may break listing/attribution/tag pages.

Suggested change

      
- description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
- author: Mitch Connors, John Howard
- ms.author: mconnors
- ms.topic: conceptual
- ms.service: azure-kubernetes-service
- ms.subservice: application-network
- ms.date: 04/03/2026
+ date: 2026-04-09
+ description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
+ authors: [mitch-connors, john-howard]
+ tags: [aks, application-network, ai]

Copilot uses AI. Check for mistakes.
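
The front matter check described in this comment can be approximated with a small script. This is only a minimal sketch: the required keys (`date`, `authors`, `tags`) and the flagged keys (`author`, `ms.*`) are taken from the comment above, while the function names and the line-based parsing are assumptions for illustration and are not the repository's actual tooling or a full YAML parser.

```python
# Hypothetical helper, not part of the repo: flags front matter keys that
# differ from the conventions named in the review comment.
REQUIRED_KEYS = ("date", "authors", "tags")

# Front matter of the draft post as quoted in this PR (lines 1-10 of index.md).
DRAFT_FRONT_MATTER = """\
---
title: Control AI spend with per-application token rate limiting using Application Network and agentgateway
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
author: Mitch Connors, John Howard
ms.author: mconnors
ms.topic: conceptual
ms.service: azure-kubernetes-service
ms.subservice: application-network
ms.date: 04/03/2026
---
"""


def front_matter_keys(text: str) -> list[str]:
    """Return top-level keys from a simple 'key: value' front matter block."""
    keys = []
    for line in text.splitlines():
        # Skip blank lines, '---' delimiters, and indented continuation lines.
        if not line.strip() or line.strip() == "---" or line[0].isspace():
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


def check_front_matter(text: str) -> tuple[list[str], list[str]]:
    """Return (missing required keys, keys the blog conventions do not use)."""
    keys = front_matter_keys(text)
    missing = [k for k in REQUIRED_KEYS if k not in keys]
    unexpected = [k for k in keys if k == "author" or k.startswith("ms.")]
    return missing, unexpected


if __name__ == "__main__":
    missing, unexpected = check_front_matter(DRAFT_FRONT_MATTER)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

A check like this could run in CI before the Docusaurus build, so convention drift is caught without relying on a reviewer to spot it.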

website/blog/2026-04-09-appnet-agentgateway/index.md

Comment on lines +12 to +13

		# Control AI spend with per-application token rate limiting using Application Network and agentgateway

Copilot AI Apr 3, 2026

This post includes an explicit H1 (# ...) even though the blog layout already renders the title from front matter. Other posts in website/blog/ don’t include a redundant H1, and keeping both typically results in duplicate top-level headings.

Suggested change

      
- # Control AI spend with per-application token rate limiting using Application Network and agentgateway


website/blog/2026-04-09-appnet-agentgateway/index.md

		This article describes a platform-oriented approach to controlling AI spend using Azure Kubernetes Application Network and agentgateway. By leveraging workload identity already present in the network, you can enforce per-application, token-based rate limiting without issuing API keys to every application.

		Azure Kubernetes Application Network (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.

Copilot AI Apr 3, 2026

The intro is missing the <!-- truncate --> marker that most posts use to control the blog listing excerpt. Add <!-- truncate --> after the opening 1–3 paragraphs so the homepage/blog index doesn’t render the full article.

Suggested change



+ <!-- truncate -->


website/blog/2026-04-09-appnet-agentgateway/index.md

+                  App2 --> Z1 -->|id:b| AG --> AI
+              ```
+              In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.

Copilot AI Apr 3, 2026

Spelling: “acessible” should be “accessible”.

Suggested change

      
- In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.
+ In contrast with our initial scenario, the Azure Foundry API Key is only accessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.


website/blog/2026-04-09-appnet-agentgateway/index.md


		## Deep Dive

		While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:

Copilot AI Apr 3, 2026

Spelling: “configration” should be “configuration”.

Suggested change

      
- While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:
+ While the complete configuration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:


website/blog/2026-04-09-appnet-agentgateway/index.md

+              date: Fri, 03 Apr 2026 22:59:18 GMT
+              ```
+              Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute.  Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.

Copilot AI Apr 3, 2026

Grammar: “each application has it's own rate limit bucket” should use the possessive “its”.

Suggested change

      
- Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute.  Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.
+ Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute.  Requests from other applications can proceed without being blocked, because each application has its own rate limit bucket, keyed on AppNet identity.

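
For readers of this thread, the behavior the quoted paragraph describes (a per-identity token budget that returns 200 until exhausted, then 429 until the window resets) can be illustrated with a short sketch. This is not agentgateway's implementation or configuration: the class name, the fixed-window strategy, and the `other-app` identity are assumptions made purely for illustration, and only `httpbin`, the 1-minute reset, and the 200/429 outcomes come from the post.

```python
import time


class TokenBudget:
    """Fixed-window token budget with one independent bucket per identity.

    Illustrative only: a real gateway may use a different algorithm
    (for example a sliding window or a refilling token bucket).
    """

    def __init__(self, tokens_per_window, window_seconds=60.0, clock=time.monotonic):
        self.limit = tokens_per_window
        self.window = window_seconds
        self.clock = clock  # injectable so the reset can be exercised in tests
        self._buckets = {}  # identity -> (window_start, tokens_used)

    def allow(self, identity, tokens):
        """Return 200 if the request fits the caller's budget, else 429."""
        now = self.clock()
        start, used = self._buckets.get(identity, (now, 0))
        if now - start >= self.window:
            # The window has elapsed, so this identity's budget resets.
            start, used = now, 0
        if used + tokens > self.limit:
            self._buckets[identity] = (start, used)
            return 429
        self._buckets[identity] = (start, used + tokens)
        return 200


if __name__ == "__main__":
    budget = TokenBudget(tokens_per_window=100)
    print(budget.allow("httpbin", 80))    # within budget
    print(budget.allow("httpbin", 40))    # would exceed httpbin's budget
    print(budget.allow("other-app", 40))  # separate bucket, unaffected
```

Keying the dictionary on the caller's identity is what keeps one application's exhausted budget from blocking another, which is the property the paragraph attributes to AppNet identity.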

website/blog/2026-04-09-appnet-agentgateway/index.md


		## Conclusion

		By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
		\ No newline at end of file

Copilot AI Apr 3, 2026

Spelling: “checkout” should be “check out”.

Suggested change

      
- By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
+ By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more check out (AppNet Docs) and (agentgateway docs).


website/blog/2026-04-09-appnet-agentgateway/index.md


		## Conclusion

		By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
		\ No newline at end of file

Copilot AI Apr 3, 2026

The conclusion ends with placeholder doc references (“(AppNet Docs) and (agentgateway docs)”). Replace these with real Markdown links (and descriptive link text) before merging, otherwise the published post will contain broken references.

Suggested change

      
- By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
+ By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open-source tools like agentgateway. To learn more, see the [Azure Kubernetes Service application routing documentation](https://learn.microsoft.com/azure/aks/app-routing) and the [agentgateway documentation on GitHub](https://github.com/kubernetes-sigs/agentgateway).


website/blog/2026-04-09-appnet-agentgateway/index.md

@@ -0,0 +1,207 @@
+              ---
+              title: Control AI spend with per-application token rate limiting using Application Network and agentgateway
+              description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.

Copilot AI Apr 3, 2026

The description in front matter appears longer than the repo’s SEO guideline target (150–160 characters). Consider shortening it so it fits typical meta description display limits.

Suggested change

      
- description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
+ description: Use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services.

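
The length guideline in this comment is easy to verify mechanically. The sketch below assumes the 150–160 character target quoted above and counts plain characters with `len`; the function name and the three-way result are hypothetical, not an existing repo check.

```python
# Target range quoted in the review comment; adjust if the repo guideline differs.
SEO_MIN, SEO_MAX = 150, 160

ORIGINAL = (
    "Learn how to use Application Network workload identity and agentgateway "
    "to enforce per-application, token-based rate limiting for shared AI "
    "services without distributing API keys."
)
SUGGESTED = (
    "Use Application Network workload identity and agentgateway to enforce "
    "per-application, token-based rate limiting for shared AI services."
)


def check_description(description: str) -> str:
    """Classify a meta description against the assumed 150-160 character target."""
    n = len(description)
    if n > SEO_MAX:
        return "too long"
    if n < SEO_MIN:
        return "too short"
    return "ok"


if __name__ == "__main__":
    for label, text in (("original", ORIGINAL), ("suggested", SUGGESTED)):
        print(label, len(text), check_description(text))
```

It is worth running this on both strings before accepting the suggestion, since a shortened description can end up under the lower bound rather than inside the target window.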

website/blog/2026-04-09-appnet-agentgateway/index.md

		This article describes a platform-oriented approach to controlling AI spend using Azure Kubernetes Application Network and agentgateway. By leveraging workload identity already present in the network, you can enforce per-application, token-based rate limiting without issuing API keys to every application.

		Azure Kubernetes Application Network (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.

Copilot AI Apr 3, 2026

Consider adding an image near the top of the post with descriptive alt text (many posts include a header/hero image to improve scanability and social sharing previews).

Suggested change

      
+ ![Architecture diagram showing Azure Kubernetes Application Network and agentgateway enforcing per-application token rate limiting for shared AI services](./hero-image.png)


