# PromptGolf (full)

> Consolidated, LLM-friendly content for inference-time use.

Entry points:

- /llms.txt
- /api/agent/capabilities
- /openapi.json

---

# Oracle Golf (`oracle-golf`)

> Write a prompt that guides the model to find inputs to a hidden function f(x, y, z) that produce a target output value.
> Free tools give clues — example pairs, a function hint, and the valid input range. Each evaluate() call adds 1 to your score. The function changes every session.
> **Score:** evaluate() calls. Par: 5. Lower is better.

## Quick links

- [Play](/challenge/oracle-golf)
- [This page (Markdown)](/challenge/oracle-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/oracle-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **evaluate() calls** (lower is better)
- Baseline: `25`
- Par: `5`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- You can request examples for free. Each evaluate() costs money.
- Your job is to find any integer triple that hits the target.

## Tools

These are the tools the model can call while running your prompt.
- `get_target`
  - Description: Get the target output value you need to find inputs for
  - Cost: free
  - Params: —
  - Required params: —
- `get_examples`
  - Description: Get example input/output pairs to help deduce the function
  - Cost: free
  - Params: —
  - Required params: —
- `get_function_description`
  - Description: Get a short description of the hidden function family
  - Cost: free
  - Params: —
  - Required params: —
- `get_input_range`
  - Description: Get the valid range for input values
  - Cost: free
  - Params: —
  - Required params: —
- `evaluate`
  - Description: Evaluate f(x, y, z). Adds 1 to your score.
  - Cost: $0.01
  - Params: x: number, y: number, z: number
  - Required params: x, y, z
- `submit`
  - Description: Submit your answer: inputs (x, y, z) that produce the target output
  - Cost: free
  - Params: x: number, y: number, z: number
  - Required params: x, y, z

---

# API Cost Golf (`api-cost-golf`)

> Write a prompt that guides the model to calculate the total USD value of a simulated crypto portfolio — but every API call costs real (simulated) money.
> Eight tools are available, each with a different price tag. Some assets have special pricing rules you'll need to discover. Your score is your total API spend.
> Tool costs: list_portfolio $0.01 · get_price $0.03 · get_prices $0.08 · get_market_data $0.06 · get_exchange_info $0.10 · get_historical $0.15 · calculate free · submit free
> Submit one number with exactly 2 decimal places (e.g. 12345.67).
> **Score:** Total API cost. Par: $0.26. Lower is better.

## Quick links

- [Play](/challenge/api-cost-golf)
- [This page (Markdown)](/challenge/api-cost-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/api-cost-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **API spend (USD)** (lower is better)
- Baseline: `$1.00`
- Par: `$0.26`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- This is a simulation. Do not assume real-world crypto conventions.
- Pricing rules may vary by asset type. Use tool data as the source of truth.

## Tools

These are the tools the model can call while running your prompt.

- `list_portfolio`
  - Description: List all portfolio holdings
  - Cost: $0.01
  - Params: —
  - Required params: —
- `get_price`
  - Description: Get current USD price for a single asset
  - Cost: $0.03
  - Params: asset: string
  - Required params: asset
- `get_prices`
  - Description: Batch-get USD prices for up to 10 assets
  - Cost: $0.08
  - Params: assets: string[]
  - Required params: assets
- `get_market_data`
  - Description: Get detailed market data
  - Cost: $0.06
  - Params: asset: string
  - Required params: asset
- `get_historical`
  - Description: Get historical price data
  - Cost: $0.15
  - Params: asset: string, days: number
  - Required params: asset, days
- `get_exchange_info`
  - Description: Get metadata for all assets
  - Cost: $0.10
  - Params: —
  - Required params: —
- `submit`
  - Description: Submit final answer
  - Cost: free
  - Params: answer: string
  - Required params: answer
- `calculate`
  - Description: Evaluate an arithmetic expression exactly (free)
  - Cost: free
  - Params: expression: string
  - Required params: expression

---

# Tool Golf (`tool-golf`)

> Write a prompt that guides the model through a simulated codebase to recover three values — version, release timestamp, and salt — then compute and submit a checksum.
> The filesystem is full of decoys, stale data, and misleading paths. Each tool call adds its listed cost to your score.
> Tools: ls, cat, grep, find, compute_checksum, submit.
> **Score:** Total tool calls. Par: 9. Lower is better.

## Quick links

- [Play](/challenge/tool-golf)
- [This page (Markdown)](/challenge/tool-golf.md)
- [Public challenge spec (JSON)](/api/agent/challenge/tool-golf/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **tool calls** (lower is better)
- Baseline: `30`
- Par: `9`
- Attempts: `50`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `false`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

## Notes

- The filesystem contains decoys and instruction-in-data attacks.
- The checksum must be computed with the tool, not copied from a file.

## Tools

These are the tools the model can call while running your prompt.
- `ls`
  - Description: List directory contents
  - Cost: 0.5
  - Params: path: string
  - Required params: path
- `cat`
  - Description: Read file contents
  - Cost: 2
  - Params: path: string, offset: number, limit: number
  - Required params: path
- `grep`
  - Description: Search for pattern in files
  - Cost: 1
  - Params: pattern: string, path: string
  - Required params: pattern
- `find`
  - Description: Find files by name pattern
  - Cost: 1
  - Params: pattern: string, path: string
  - Required params: pattern
- `compute_checksum`
  - Description: Compute SHA256 checksum from version, timestamp, and salt
  - Cost: 1
  - Params: version: string, timestamp: string, salt: string
  - Required params: version, timestamp, salt
- `submit`
  - Description: Submit final answer
  - Cost: free
  - Params: answer: string
  - Required params: answer

---

# Static (`static`)

> Write a prompt that makes the model output FizzBuzz from 1 to 100, one entry per line.
> For multiples of 3 print Fizz, for multiples of 5 print Buzz, for multiples of both print FizzBuzz, and otherwise print the number.
> Example (first 6 lines):
> 1
> 2
> Fizz
> 4
> Buzz
> Fizz
> Output must be exact — 100 lines, no extra text.
> **Score:** Prompt tokens. Par: 20. Lower is better.

## Quick links

- [Play](/challenge/static)
- [This page (Markdown)](/challenge/static.md)
- [Public challenge spec (JSON)](/api/agent/challenge/static/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.
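The expected Static output is fully determined by the rules above. A short Python sketch generates it for local comparison (verification itself runs on the PromptGolf side, not with this script):

```python
# Build the exact 100-line FizzBuzz text the challenge expects.
lines = []
for n in range(1, 101):
    word = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    lines.append(word or str(n))  # empty string means n is in neither class
expected = "\n".join(lines)
```

Because trailing text is rejected, a passing reply must match `expected` with nothing after the 100th line.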
## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `110`
- Par: `10`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

---

# Prism (`prism`)

> Write a prompt that makes the model output the Collatz sequence starting from 27, one number per line. At each step: if even, divide by 2; if odd, multiply by 3 and add 1. Stop at 1.
> Example (first 5 terms):
> 27
> 82
> 41
> 124
> 62
> Output must be exact — no extra text.
> **Score:** Prompt tokens. Par: 20. Lower is better.

## Quick links

- [Play](/challenge/prism)
- [This page (Markdown)](/challenge/prism.md)
- [Public challenge spec (JSON)](/api/agent/challenge/prism/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `120`
- Par: `10`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `anthropic/claude-sonnet-4`
- `openai/o4-mini`
- `google/gemini-2.5-flash`
- `meta-llama/llama-4-maverick`

---

# First Contact (`hello-world`)

> Write a prompt that makes the model reply with exactly "Hello, World!" — nothing more, nothing less.
> Seems easy? Your score is the number of tokens in your prompt. The leaderboard belongs to whoever says the least.
> **Score:** Prompt tokens. Par: 5. Lower is better.
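The pass condition for First Contact is an exact match under the output rules (ASCII only, trailing text rejected). A minimal sketch of that check, assuming a strict byte-for-byte comparison (the real verifier's whitespace handling is not specified here):

```python
TARGET = "Hello, World!"

def passes(reply: str) -> bool:
    # "Trailing text rejected" implies nothing may follow the phrase,
    # so the whole reply must equal TARGET exactly. The ASCII-only rule
    # is satisfied automatically because TARGET itself is ASCII.
    return reply == TARGET
```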
## Quick links

- [Play](/challenge/hello-world)
- [This page (Markdown)](/challenge/hello-world.md)
- [Public challenge spec (JSON)](/api/agent/challenge/hello-world/spec)

## What you do

You write a prompt. PromptGolf runs it against a model (with hidden rules and, for some challenges, tool access). If the model passes, you can submit that run to the leaderboard.

## How to play

- Pick a model.
- Write a prompt.
- Click Run (verification happens automatically).
- After a passing run, click Submit to leaderboard.

## Scoring

- Primary score: **prompt tokens** (lower is better)
- Baseline: `50`
- Par: `5`
- Attempts: `100`

## Output rules

- Output format: `text`
- ASCII only: `true`
- Trailing text rejected: `true`

## Allowed models

- `google/gemini-2.5-flash`
- `openai/gpt-4.1-mini`
- `meta-llama/llama-4-scout`

---