diff --git a/skills/cortex-rest-api-pricing-calculator/LICENSE b/skills/cortex-rest-api-pricing-calculator/LICENSE new file mode 100644 index 00000000..bb8d9770 --- /dev/null +++ b/skills/cortex-rest-api-pricing-calculator/LICENSE @@ -0,0 +1,184 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship made available under + the License, as indicated by a copyright notice that is included in + or attached to the work (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other transformations + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean, as submitted to the Licensor for inclusion + in the Work by the copyright owner or by an individual or Legal Entity + authorized to submit on behalf of the copyright owner. For the purposes + of this definition, "submitted" means any form of electronic, verbal, + or written communication sent to the Licensor or its representatives, + including but not limited to communication on electronic mailing lists, + source code control systems, and issue tracking systems that are managed + by, or on behalf of, the Licensor for the purpose of discussing and + improving the Work, but excluding communication that is conspicuously + marked or otherwise designated in writing by the copyright owner as + "Not a Contribution." + + "Contributor" shall mean Licensor and any Legal Entity on behalf of + whom a Contribution has been received by the Licensor and subsequently + incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a cross-claim + or counterclaim in a lawsuit) alleging that the Work or any + Contribution embodied within the Work constitutes direct or contributory + patent infringement, then any patent licenses granted to You under + this License for that Work shall terminate as of the date such + litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or Derivative + Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, You must include a readable copy of the + attribution notices contained within such NOTICE file, in + at least one of the following places: within a NOTICE text + file distributed as part of the Derivative Works; within + the Source form or 
documentation, if provided along with the + Derivative Works; or, within a display generated by the + Derivative Works, if and wherever such third-party notices + normally appear. The contents of the NOTICE file are for + informational purposes only and do not modify the License. + You may add Your own attribution notices within Derivative + Works that You distribute, alongside or as an addendum to + the NOTICE text from the Work, provided that such additional + attribution notices cannot be construed as modifying the License. + + You may add Your own license statement for Your modifications and + may provide additional grant of rights to use, copy, modify, merge, + publish, distribute, sublicense, and/or sell copies of the + Contribution, either on an unmodified basis, with modifications, + or as part of a larger work. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. 
You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or exemplary damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or all other + commercial damages or losses), even if such Contributor has been + advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + Copyright 2025 Navnit Shukla + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. 
+ You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/skills/cortex-rest-api-pricing-calculator/SKILL.md b/skills/cortex-rest-api-pricing-calculator/SKILL.md new file mode 100644 index 00000000..6dd971b8 --- /dev/null +++ b/skills/cortex-rest-api-pricing-calculator/SKILL.md @@ -0,0 +1,149 @@ +--- +id: cortex-rest-api-pricing-calculator +name: cortex-rest-api-pricing-calculator +title: Cortex REST API Pricing +summary: Calculate monthly/annual Cortex REST API costs for Claude, GPT, DeepSeek, Mistral, Llama models. +description: >- + Calculate Snowflake Cortex REST API costs for customers. This is specifically for the Cortex REST API + (token-based LLM inference endpoint), NOT Cortex AI SQL functions. Estimate monthly/annual token-based + pricing for Claude, GPT, DeepSeek, Mistral, Llama models. Supports prompt caching (Table 6b) and + non-caching (Table 6c) rates. Use when: pricing estimate, cost calculator, REST API cost, token pricing, + cortex REST pricing, how much will cortex REST API cost, annual commit, monthly cost estimate, credit + consumption table. Do NOT use for Cortex AI SQL functions (COMPLETE, EXTRACT, SENTIMENT) which are + credit-based, not token-based. 
+tools: + - Read +prompt: "$cortex-rest-api-pricing-calculator estimate monthly cost for claude-sonnet-4-5 at 300M input and 130M output tokens" +language: en +status: stable +authors: + - Navnit Shukla +categories: + - pricing + - cortex +type: snowflake +--- + +# Cortex REST API Pricing Calculator + +## Scope + +This skill is ONLY for **Cortex REST API** pricing — the token-based LLM inference endpoint (POST /api/v2/cortex/inference:complete). It is NOT for: +- Cortex AI SQL functions (COMPLETE, EXTRACT, SENTIMENT, etc.) — those are credit-based, not token-based +- Cortex Search, Cortex Analyst, or other Cortex services + +# When to Use + +- User asks for Cortex REST API cost estimates +- User wants to calculate token-based pricing for a customer using the REST endpoint +- User mentions models like Claude, GPT, DeepSeek, Mistral, Llama in a REST API pricing context +- User wants monthly/annual commit projections for REST API usage +- User asks about prompt caching cost savings on the REST API + +# Instructions + +## Step 1: Determine Mode + +**Ask** user what they need: + +1. **Quick Estimate** — Calculate costs conversationally right here +2. **Interactive App** — Open the full Streamlit calculator with editable rates, PDF viewer, and Excel export + +**If Quick Estimate** → Continue to Step 2 +**If Interactive App** → Jump to Step 5 + +## Step 2: Gather Usage Parameters + +**Ask** user for: +- Model name(s) (reference `references/pricing-rates.md` for available models) +- Monthly token volumes (in millions): + - Input tokens (M) + - Output tokens (M) + - Cache Read tokens (M) — only for Table 6b models + - Cache Write tokens (M) — only for Table 6b models +- Discount percentage (if any, contract-dependent) + +**⚠️ STOPPING POINT:** Confirm parameters with user before calculating. + +## Step 3: Calculate Costs + +**Load** `references/pricing-rates.md` for current rates. 
+ +**Formula** (per model, per million tokens): +``` +input_cost = input_M × input_rate +cache_write_cost = cache_write_M × cache_write_rate +cache_read_cost = cache_read_M × cache_read_rate +output_cost = output_M × output_rate +subtotal = input_cost + cache_write_cost + cache_read_cost + output_cost +``` + +**Summary:** +``` +baseline_monthly = sum of all model subtotals +discount_amount = baseline_monthly × (discount_pct / 100) +monthly_after_discount = baseline_monthly - discount_amount +annual_commit = monthly_after_discount × 12 +``` + +## Step 4: Present Results + +Present a clear table with: +- Per-model breakdown (tokens × rate = cost for each token type) +- Subtotal per model +- Baseline monthly total +- Discount applied +- Monthly after discount +- Annual commit (12 months) + +**Add caveat:** "Discounts are contract-dependent. Final calculations visible only upon invoicing." + +**Include source link:** [Snowflake Credit Consumption Table (PDF)](https://www.snowflake.com/legal-files/CreditConsumptionTable.pdf) — refer to Table 6(b) and 6(c) for REST API rates. + +**Done.** Ask if user wants to adjust parameters or open the interactive app. 
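The Step 3 formulas can be sketched in Python as follows. This is a minimal illustration, not the code behind the Streamlit app: the `Rates` and `Usage` classes and the `model_subtotal` and `summarize` helpers are hypothetical names introduced here. The rates are the claude-sonnet-4-5 Global values from `references/pricing-rates.md`. `Decimal` is used only to keep currency arithmetic exact.

```python
from dataclasses import dataclass
from decimal import Decimal

ZERO = Decimal("0")


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens. Cache rates stay at 0 for Table 6(c) models."""
    input_rate: Decimal
    output_rate: Decimal
    cache_write_rate: Decimal = ZERO
    cache_read_rate: Decimal = ZERO


@dataclass(frozen=True)
class Usage:
    """Monthly token volumes, in millions of tokens."""
    input_m: Decimal
    output_m: Decimal
    cache_write_m: Decimal = ZERO
    cache_read_m: Decimal = ZERO


def model_subtotal(usage: Usage, rates: Rates) -> Decimal:
    """Per-model monthly cost: sum of volume x rate for each token type."""
    return (
        usage.input_m * rates.input_rate
        + usage.cache_write_m * rates.cache_write_rate
        + usage.cache_read_m * rates.cache_read_rate
        + usage.output_m * rates.output_rate
    )


def summarize(subtotals, discount_pct: Decimal = ZERO) -> dict:
    """Apply the summary formulas: baseline, discount, monthly, annual."""
    baseline = sum(subtotals, ZERO)
    discount = baseline * discount_pct / Decimal("100")
    monthly = baseline - discount
    return {
        "baseline_monthly": baseline,
        "discount_amount": discount,
        "monthly_after_discount": monthly,
        "annual_commit": monthly * 12,
    }


# claude-sonnet-4-5, Global rates (Table 6b in references/pricing-rates.md)
sonnet_global = Rates(
    input_rate=Decimal("3.00"),
    output_rate=Decimal("15.00"),
    cache_write_rate=Decimal("3.75"),
    cache_read_rate=Decimal("0.30"),
)
usage = Usage(
    input_m=Decimal("300"),
    output_m=Decimal("130"),
    cache_write_m=Decimal("4000"),
    cache_read_m=Decimal("20000"),
)
summary = summarize([model_subtotal(usage, sonnet_global)])
print(f"monthly: ${summary['monthly_after_discount']:,.2f}")
print(f"annual:  ${summary['annual_commit']:,.2f}")
```

Passing a non-zero `discount_pct` (for example `Decimal("10")`) applies the contract discount before the annual figure is computed, which matches the order of the summary formulas above.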
+ +## Step 5: Interactive App + +**Direct the user to the published Streamlit app on Snowhouse:** + +https://app.snowflake.com/SFCOGSOPS/snowhouse_aws_us_west_2/#/streamlit-apps/TEMP.NASHUKLA.CORTEX_REST_API_PRICE_CALCULATOR + +Features of the interactive app: +- Editable pricing table (add/modify rates for new models) +- Embedded PDF viewer for Snowflake Credit Consumption Table +- Monthly usage entry with model selector +- Cost breakdown per model with calculation details +- Discount input with summary metrics +- Excel export (Cost Breakdown + Summary + Pricing Rates sheets) + +## Best Practices + +- Regional rates are 1.1× Global rates (10% premium) +- Cache Write rate is typically 1.25× the Input rate +- Cache Read rate is typically 0.1× the Input rate (90% savings vs input) +- Opus-tier models are ~5× Sonnet-tier pricing +- Haiku-tier models are ~0.33× Sonnet-tier pricing +- All rates are per 1M tokens in USD +- Table 6(b): REST API with Prompt Caching — supports input, cache_write, cache_read, output +- Table 6(c): REST API without Prompt Caching — input and output only + +# Stopping Points + +- ✋ After Step 2 — confirm usage parameters before calculating +- ✋ After Step 4 — offer adjustments or app link + +**Resume rule:** Upon user approval, proceed directly to next step without re-asking. + +# Output + +A formatted cost breakdown table with monthly and annual totals, plus a link to the source PDF and interactive app. + +# Examples + +## Example 1: Quick estimate without caching +User: $cortex-rest-api-pricing-calculator how much for claude-sonnet-4-6 at 300M input and 130M output monthly? 
+Assistant: Calculates using Global rates: (300 × $3.00) + (130 × $15.00) = $900 + $1,950 = $2,850/month, $34,200/year + +## Example 2: With prompt caching +User: $cortex-rest-api-pricing-calculator estimate claude-sonnet-4-5 with 300M input, 130M output, 20000M cache read, 4000M cache write +Assistant: Calculates: input $900 + cache write $15,000 + cache read $6,000 + output $1,950 = $23,850/month, $286,200/year diff --git a/skills/cortex-rest-api-pricing-calculator/references/pricing-rates.md b/skills/cortex-rest-api-pricing-calculator/references/pricing-rates.md new file mode 100644 index 00000000..c52aecc5 --- /dev/null +++ b/skills/cortex-rest-api-pricing-calculator/references/pricing-rates.md @@ -0,0 +1,75 @@ +# Cortex REST API Pricing Rates + +All rates in USD per 1 million tokens. Source: Snowflake Credit Consumption Table. + +## Table 6(b) — REST API with Prompt Caching + +### Anthropic Models (AWS) + +| Model | Region | Input | Cache Write | Cache Read | Output | +|-------|--------|-------|-------------|------------|--------| +| claude-3-7-sonnet | Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-3-7-sonnet | Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-4-opus | Regional | 16.50 | 20.63 | 1.65 | 82.50 | +| claude-4-opus | Global | 15.00 | 18.75 | 1.50 | 75.00 | +| claude-4-sonnet | Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-4-sonnet | Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-sonnet-4-5 | Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-sonnet-4-5 | Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-sonnet-4-5-long-context | Regional | 6.60 | 8.25 | 0.66 | 24.75 | +| claude-sonnet-4-5-long-context | Global | 6.00 | 7.50 | 0.60 | 22.50 | +| claude-sonnet-4-6 | Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-sonnet-4-6 | Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-haiku-4-5 | Regional | 1.10 | 1.38 | 0.11 | 5.50 | +| claude-haiku-4-5 | Global | 1.00 | 1.25 | 0.10 | 5.00 | +| claude-opus-4-5 | Regional | 16.50 | 20.63 | 1.65 | 82.50 | +| 
claude-opus-4-5 | Global | 15.00 | 18.75 | 1.50 | 75.00 | +| claude-opus-4-6 | Regional | 16.50 | 20.63 | 1.65 | 82.50 | +| claude-opus-4-6 | Global | 15.00 | 18.75 | 1.50 | 75.00 | + +### OpenAI Models (AWS) + +| Model | Region | Input | Cache Write | Cache Read | Output | +|-------|--------|-------|-------------|------------|--------| +| openai-gpt-5 | AWS Global | 1.25 | 1.25 | 0.13 | 10.00 | +| openai-gpt-5.2 | AWS Global | 1.75 | 1.75 | 0.18 | 14.00 | +| openai-gpt-5.4 | AWS Global | 2.50 | 2.50 | 0.25 | 15.00 | +| openai-gpt-4.1 | AWS Global | 2.00 | 2.00 | 0.50 | 8.00 | + +### OpenAI Models (Azure) + +| Model | Region | Input | Cache Write | Cache Read | Output | +|-------|--------|-------|-------------|------------|--------| +| openai-gpt-5 | Azure Global | 1.25 | 1.25 | 0.13 | 10.00 | +| openai-gpt-5.2 | Azure Global | 1.75 | 1.75 | 0.18 | 14.00 | +| openai-gpt-5.4 | Azure Global | 2.50 | 2.50 | 0.25 | 15.00 | +| openai-gpt-4.1 | Azure Global | 2.00 | 2.00 | 0.50 | 8.00 | + +### Anthropic Models (Azure) + +| Model | Region | Input | Cache Write | Cache Read | Output | +|-------|--------|-------|-------------|------------|--------| +| claude-sonnet-4-5 | Azure Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-sonnet-4-5 | Azure Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-sonnet-4-6 | Azure Regional | 3.30 | 4.13 | 0.33 | 16.50 | +| claude-sonnet-4-6 | Azure Global | 3.00 | 3.75 | 0.30 | 15.00 | +| claude-haiku-4-5 | Azure Regional | 1.10 | 1.38 | 0.11 | 5.50 | +| claude-haiku-4-5 | Azure Global | 1.00 | 1.25 | 0.10 | 5.00 | + +## Table 6(c) — REST API without Prompt Caching + +| Model | Input | Output | +|-------|-------|--------| +| deepseek-r1 | 1.35 | 5.40 | +| mistral-large2 | 2.00 | 6.00 | +| llama3.3-70b | 0.72 | 0.72 | +| llama4-maverick | 0.24 | 0.97 | +| snowflake-llama-3.3-70b | 0.72 | 0.72 | + +## Pricing Patterns + +- Regional = 1.1 × Global (10% premium) +- Cache Write = ~1.25 × Input rate +- Cache Read = ~0.1 × Input rate (90% savings) +- 
Opus-tier models: ~5× Sonnet-tier pricing +- Haiku-tier models: ~0.33× Sonnet-tier pricing
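The patterns above can be checked against the Anthropic rows with a short Python sketch. The helper names are hypothetical, and rounding half-up to the cent is an assumption that happens to reproduce the listed Regional and cache values; the source does not state the rounding rule. The OpenAI rows list Cache Write equal to Input, so the 1.25× pattern should not be applied to them.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(value: Decimal) -> Decimal:
    """Round to two decimals, half-up (assumed rounding rule)."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def regional_from_global(global_rate: Decimal) -> Decimal:
    """Regional = 1.1 x Global (10% premium)."""
    return to_cents(global_rate * Decimal("1.1"))


def cache_rates_from_input(input_rate: Decimal):
    """Return (cache_write, cache_read) as 1.25x and 0.1x the Input rate."""
    return (
        to_cents(input_rate * Decimal("1.25")),
        to_cents(input_rate * Decimal("0.1")),
    )


# claude-opus-4-6 Global row: Input 15.00, Cache Write 18.75,
# Cache Read 1.50, Output 75.00
opus_global = {
    "input": Decimal("15.00"),
    "cache_write": Decimal("18.75"),
    "cache_read": Decimal("1.50"),
    "output": Decimal("75.00"),
}
opus_regional = {k: regional_from_global(v) for k, v in opus_global.items()}
print(opus_regional)
print(cache_rates_from_input(opus_global["input"]))
```

A check like this is useful when adding a new model row: if a derived value differs from the published table, the table value should win, since the patterns are only approximations.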