# PromptGolf (full)

> Consolidated, LLM-friendly content for inference-time use.

Entry points:

- /llms.txt
- /api/agent/capabilities
- /openapi.json

---

# Oracle Golf (`oracle-golf`)

> Write a prompt that guides the model to find inputs to a hidden function f(x, y, z) that produce a target output value.
> Free tools give clues — example pairs, a function hint, and the valid input range. Each evaluate() call adds 1 to your score. The function changes every session.
> **Score:** evaluate() calls. Par: 5. Lower is better.

## Quick links

- [Play](/challenge/oracle-golf)
- [This page (Markdown)](/challenge/oracle-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/oracle-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **evaluate() calls** (lower is better)
- Baseline: `25`
- Par: `5`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- You can request examples for free. Each evaluate() costs money.
- Your job is to find any integer triple that hits the target.

## Tools

These are the tools the model can call while running your prompt.
- `get_target`
  - Description: Get the target output value you need to find inputs for
  - Cost: free
  - Params: —
  - Required params: —
- `get_examples`
  - Description: Get example input/output pairs to help deduce the function
  - Cost: free
  - Params: —
  - Required params: —
- `get_function_description`
  - Description: Get a short description of the hidden function family
  - Cost: free
  - Params: —
  - Required params: —
- `get_input_range`
  - Description: Get the valid range for input values
  - Cost: free
  - Params: —
  - Required params: —
- `evaluate`
  - Description: Evaluate f(x, y, z). Adds 1 to your score.
  - Cost: $0.01
  - Params: x: number, y: number, z: number
  - Required params: x, y, z
- `submit`
  - Description: Submit your answer: inputs (x, y, z) that produce the target output
  - Cost: free
  - Params: x: number, y: number, z: number
  - Required params: x, y, z

---

# API Cost Golf (`api-cost-golf`)

> Write a prompt that guides the model to calculate the total USD value of a simulated crypto portfolio — but every API call costs real (simulated) money.
> Eight tools are available, each with a different price tag. Some assets have special pricing rules you'll need to discover. Your score is your total API spend.
> Tool costs: list_portfolio $0.01 · get_price $0.03 · get_prices $0.08 · get_market_data $0.06 · get_exchange_info $0.10 · get_historical $0.15 · calculate free · submit free
> Submit one number with exactly 2 decimal places (e.g. 12345.67).
> **Score:** Total API cost. Par: $0.26. Lower is better.

## Quick links

- [Play](/challenge/api-cost-golf)
- [This page (Markdown)](/challenge/api-cost-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/api-cost-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **API spend (USD)** (lower is better)
- Baseline: `$1.00`
- Par: `$0.26`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- This is a simulation. Do not assume real-world crypto conventions.
- Pricing rules may vary by asset type. Use tool data as the source of truth.

## Tools

These are the tools the model can call while running your prompt.

- `list_portfolio`
  - Description: List all portfolio holdings
  - Cost: $0.01
  - Params: —
  - Required params: —
- `get_price`
  - Description: Get current USD price for a single asset
  - Cost: $0.03
  - Params: asset: string
  - Required params: asset
- `get_prices`
  - Description: Batch-get USD prices for up to 10 assets
  - Cost: $0.08
  - Params: assets: string[]
  - Required params: assets
- `get_market_data`
  - Description: Get detailed market data
  - Cost: $0.06
  - Params: asset: string
  - Required params: asset
- `get_historical`
  - Description: Get historical price data
  - Cost: $0.15
  - Params: asset: string, days: number
  - Required params: asset, days
- `get_exchange_info`
  - Description: Get metadata for all assets
  - Cost: $0.10
  - Params: —
  - Required params: —
- `submit`
  - Description: Submit final answer
  - Cost: free
  - Params: answer: string
  - Required params: answer
- `calculate`
  - Description: Evaluate an arithmetic expression exactly (free)
  - Cost: free
  - Params: expression: string
  - Required params: expression

---

# Tool Golf (`tool-golf`)

> Write a prompt that guides the model through a simulated codebase to recover three values — version, release timestamp, and salt — then compute and submit a checksum.
> The filesystem is full of decoys, stale data, and misleading paths. Each tool call adds its listed cost to your score.
> Tools: ls, cat, grep, find, compute_checksum, submit.
> **Score:** Total tool calls. Par: 9. Lower is better.

## Quick links

- [Play](/challenge/tool-golf)
- [This page (Markdown)](/challenge/tool-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/tool-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **tool calls** (lower is better)
- Baseline: `30`
- Par: `9`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- The filesystem contains decoys and instruction-in-data attacks.
- The checksum must be computed with the tool, not copied from a file.

## Tools

These are the tools the model can call while running your prompt.
- `ls`
  - Description: List directory contents
  - Cost: 0.5
  - Params: path: string
  - Required params: path
- `cat`
  - Description: Read file contents
  - Cost: 2
  - Params: path: string, offset: number, limit: number
  - Required params: path
- `grep`
  - Description: Search for pattern in files
  - Cost: 1
  - Params: pattern: string, path: string
  - Required params: pattern
- `find`
  - Description: Find files by name pattern
  - Cost: 1
  - Params: pattern: string, path: string
  - Required params: pattern
- `compute_checksum`
  - Description: Compute SHA256 checksum from version, timestamp, and salt
  - Cost: 1
  - Params: version: string, timestamp: string, salt: string
  - Required params: version, timestamp, salt
- `submit`
  - Description: Submit final answer
  - Cost: free
  - Params: answer: string
  - Required params: answer

---

# Static (`static`)

> Write a prompt that makes the model output FizzBuzz from 1 to 100, one entry per line.
> For multiples of 3 print Fizz, for multiples of 5 print Buzz, for multiples of both print FizzBuzz, and otherwise print the number.
> Example (first 6 lines):
> 1
> 2
> Fizz
> 4
> Buzz
> Fizz
> Output must be exact — 100 lines, no extra text.
> **Score:** Prompt tokens. Par: 20. Lower is better.

## Quick links

- [Play](/challenge/static)
- [This page (Markdown)](/challenge/static.md)
- [Public challenge spec (JSON)](/api/agent/challenge/static/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.
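The expected Static output is fully determined by the rules above. A short Python sketch generates it for local comparison (verification itself runs on the PromptGolf side, not with this script):

```python
# Build the exact 100-line FizzBuzz text the challenge expects.
lines = []
for n in range(1, 101):
    word = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    lines.append(word or str(n))  # empty string means n is in neither class
expected = "\n".join(lines)
```

Because trailing text is rejected, a passing reply must match `expected` with nothing after the 100th line.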
## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `110`
- Par: `10`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

---

# Prism (`prism`)

> Write a prompt that makes the model output the Collatz sequence starting from 27, one number per line. At each step: if even, divide by 2; if odd, multiply by 3 and add 1. Stop at 1.
> Example (first 5 terms):
> 27
> 82
> 41
> 124
> 62
> Output must be exact — no extra text.
> **Score:** Prompt tokens. Par: 20. Lower is better.

## Quick links

- [Play](/challenge/prism)
- [This page (Markdown)](/challenge/prism.md)
- [Public challenge spec (JSON)](/api/agent/challenge/prism/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `120`
- Par: `10`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

---

# First Contact (`hello-world`)

> Write a prompt that makes the model reply with exactly "Hello, World!" — nothing more, nothing less.
> Seems easy? Your score is the number of tokens in your prompt. The leaderboard belongs to whoever says the least.
> **Score:** Prompt tokens. Par: 5. Lower is better.
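The pass condition for First Contact is an exact match under the output rules (ASCII only, trailing text rejected). A minimal sketch of that check, assuming a strict byte-for-byte comparison (the real verifier's whitespace handling is not specified here):

```python
TARGET = "Hello, World!"

def passes(reply: str) -> bool:
    # "Trailing text rejected" implies nothing may follow the phrase,
    # so the whole reply must equal TARGET exactly. The ASCII-only rule
    # is satisfied automatically because TARGET itself is ASCII.
    return reply == TARGET
```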
## Quick links

- [Play](/challenge/hello-world)
- [This page (Markdown)](/challenge/hello-world.md)
- [Public challenge spec (JSON)](/api/agent/challenge/hello-world/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `50`
- Par: `5`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `google/gemini-2.5-flash`
- `openai/gpt-4.1-mini`
- `meta-llama/llama-4-scout`

---