Blog: Controlling AI Spend w/ AppNet+agentgateway #5698

Open
therealmitchconnors wants to merge 1 commit into Azure:master from therealmitchconnors:agw-blog

Conversation

@therealmitchconnors

This is the blog equivalent of the Azure booth demo at KubeCon EU 26. It highlights capabilities of the newly launched AppNet and tells a "better together" story with agentgateway. Ideally, publication is timed with the release of agentgateway 1.1, around April 8.

@therealmitchconnors
Author

Note to self: need to update the parameters to point to the AppNet control plane, not OSS Istio.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new Docusaurus blog post describing a platform-layer pattern to control shared AI quota/spend by combining Azure Kubernetes Application Network (AppNet) identity (mTLS) with agentgateway token-based rate limiting.

Changes:

  • Adds a new blog post under website/blog/2026-04-09-appnet-agentgateway/.
  • Documents an architecture and example manifests for per-application token rate limiting.
  • Includes an example validation flow showing success (200) and throttling (429).

Comment on lines +3 to +9
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
author: Mitch Connors, John Howard
ms.author: mconnors
ms.topic: conceptual
ms.service: azure-kubernetes-service
ms.subservice: application-network
ms.date: 04/03/2026

Copilot AI Apr 3, 2026

Front matter doesn’t match the blog post conventions used elsewhere in website/blog/ (expects date, authors as keys from website/blog/authors.yml, and tags as keys from website/blog/tags.yml). The current author: + ms.* metadata likely won’t be picked up by Docusaurus and may break listing/attribution/tag pages.

Suggested change
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
author: Mitch Connors, John Howard
ms.author: mconnors
ms.topic: conceptual
ms.service: azure-kubernetes-service
ms.subservice: application-network
ms.date: 04/03/2026
date: 2026-04-09
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
authors: [mitch-connors, john-howard]
tags: [aks, application-network, ai]

Comment on lines +12 to +13
# Control AI spend with per-application token rate limiting using Application Network and agentgateway

Copilot AI Apr 3, 2026

This post includes an explicit H1 (# ...) even though the blog layout already renders the title from front matter. Other posts in website/blog/ don’t include a redundant H1, and keeping both typically results in duplicate top-level headings.

Suggested change
# Control AI spend with per-application token rate limiting using Application Network and agentgateway

This article describes a **platform-oriented approach** to controlling AI spend using **Azure Kubernetes Application Network** and **agentgateway**. By leveraging **workload identity already present in the network**, you can enforce **per-application, token-based rate limiting** without issuing API keys to every application.

**Azure Kubernetes Application Network** (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.

Copilot AI Apr 3, 2026

The intro is missing the <!-- truncate --> marker that most posts use to control the blog listing excerpt. Add <!-- truncate --> after the opening 1–3 paragraphs so the homepage/blog index doesn’t render the full article.

Suggested change
<!-- truncate -->

App2 --> Z1 -->|id:b| AG --> AI
```

In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.

Copilot AI Apr 3, 2026

Spelling: “acessible” should be “accessible”.

Suggested change
In contrast with our initial scenario, the Azure Foundry API Key is only acessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.
In contrast with our initial scenario, the Azure Foundry API Key is only accessible to the agentgateway, so application teams don't touch any secrets, while AppNet provides per-application identity information on the wire.


## Deep Dive

While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:

Copilot AI Apr 3, 2026

Spelling: “configration” should be “configuration”.

Suggested change
While the complete configration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:
While the complete configuration for this demo can be found here, let's have a look at the key components that make up our rate limit. First, let's configure agentgateway to interoperate with AppNet, which exposes and Istio-compliant control plane:

date: Fri, 03 Apr 2026 22:59:18 GMT
```

Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.

Copilot AI Apr 3, 2026

Grammar: “each application has it's own rate limit bucket” should use the possessive “its”.

Suggested change
Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has it's own rate limit bucket, keyed on AppNet identity.
Once we've exhausted our token budget, all requests from httpbin to Azure Foundry will be blocked by agentgateway until our budget resets in 1 minute. Requests from other applications can proceed without being blocked, because each application has its own rate limit bucket, keyed on AppNet identity.
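
For readers of this thread, the per-identity bucket behavior described in the paragraph above can be illustrated with a minimal sketch. This is not agentgateway's implementation or configuration. It is a plain-Python fixed-window limiter under stated assumptions: the class name `TokenBudgetLimiter`, the 100-token budget, and the SPIFFE-style identity strings are all hypothetical, and only the one-minute window comes from the post's text. The sketch also assumes a request is admitted whenever any budget remains and is charged its token count afterward.

```python
import time
from typing import Dict, Optional, Tuple


class TokenBudgetLimiter:
    """Fixed-window token budget per caller identity (illustrative sketch only)."""

    def __init__(self, budget: int, window_seconds: float = 60.0) -> None:
        self.budget = budget                  # tokens allowed per window (hypothetical value)
        self.window_seconds = window_seconds  # the post describes a 1-minute reset
        # identity -> (window start time, tokens consumed in that window)
        self._usage: Dict[str, Tuple[float, int]] = {}

    def check(self, identity: str, tokens: int, now: Optional[float] = None) -> int:
        """Return an HTTP-style status: 200 if admitted, 429 if the budget is spent."""
        now = time.monotonic() if now is None else now
        start, used = self._usage.get(identity, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0              # window elapsed: reset this identity's bucket
        if used >= self.budget:
            self._usage[identity] = (start, used)
            return 429                        # only this identity is throttled
        self._usage[identity] = (start, used + tokens)
        return 200


# Hypothetical workload identities, as they might be derived from mTLS peer certificates.
HTTPBIN = "spiffe://cluster.local/ns/default/sa/httpbin"
OTHER = "spiffe://cluster.local/ns/default/sa/other-app"

limiter = TokenBudgetLimiter(budget=100)
print(limiter.check(HTTPBIN, 120, now=0.0))   # 200: nothing spent yet, request is charged 120
print(limiter.check(HTTPBIN, 10, now=5.0))    # 429: httpbin's budget is exhausted
print(limiter.check(OTHER, 10, now=5.0))      # 200: separate bucket for the other identity
print(limiter.check(HTTPBIN, 10, now=61.0))   # 200: httpbin's window has reset
```

Keying the dictionary on the caller's identity is what keeps one application's exhaustion from affecting another; a shared counter would throttle every caller at once.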


## Conclusion

By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).

Copilot AI Apr 3, 2026

Spelling: “checkout” should be “check out”.

Suggested change
By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more check out (AppNet Docs) and (agentgateway docs).


## Conclusion

By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).

Copilot AI Apr 3, 2026

The conclusion ends with placeholder doc references (“(AppNet Docs) and (agentgateway docs)”). Replace these with real Markdown links (and descriptive link text) before merging, otherwise the published post will contain broken references.

Suggested change
By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open source tools like agentgateway. To learn more checkout (AppNet Docs) and (agentgateway docs).
By adopting this platform-oriented approach, we gain centralized control over AI spending, eliminate secrets distribution, and improve operational efficiency. Applications gain transparent rate limiting without code changes, while platform teams reduce overhead and enforce fair resource allocation across the organization. This is just one of the many ways you can benefit from Application Network, built on Istio's Ambient Mode, with readily available open-source tools like agentgateway. To learn more, see the [Azure Kubernetes Service application routing documentation](https://learn.microsoft.com/azure/aks/app-routing) and the [agentgateway documentation on GitHub](https://github.com/kubernetes-sigs/agentgateway).

@@ -0,0 +1,207 @@
---
title: Control AI spend with per-application token rate limiting using Application Network and agentgateway
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.

Copilot AI Apr 3, 2026

The description in front matter appears longer than the repo’s SEO guideline target (150–160 characters). Consider shortening it so it fits typical meta description display limits.

Suggested change
description: Learn how to use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services without distributing API keys.
description: Use Application Network workload identity and agentgateway to enforce per-application, token-based rate limiting for shared AI services.

This article describes a **platform-oriented approach** to controlling AI spend using **Azure Kubernetes Application Network** and **agentgateway**. By leveraging **workload identity already present in the network**, you can enforce **per-application, token-based rate limiting** without issuing API keys to every application.

**Azure Kubernetes Application Network** (AppNet, currently in Public Preview) is Azure's fully-managed L7 network for AKS, providing Security, Observability, and Control for your L7 network out-of-the-box. You can learn more about AppNet here, but in this article, we're focusing on AppNet's secure, automatic mTLS Authentication.

Copilot AI Apr 3, 2026

Consider adding an image near the top of the post with descriptive alt text (many posts include a header/hero image to improve scanability and social sharing previews).

Suggested change
![Architecture diagram showing Azure Kubernetes Application Network and agentgateway enforcing per-application token rate limiting for shared AI services](./hero-image.png)
