multiple tokenizers
supports the exact algorithms used by GPT-4, Claude, and Gemini. no generic word-count guessing.
count tokens and estimate llm api costs offline.
tokcount tells you exactly how many tokens your prompt consumes across different models. it calculates the cost before you send the request.
tokcount is a local token counter for llm prompts. you paste your text, and it runs the exact tokenization algorithms used by OpenAI, Anthropic, and Google. it shows you the token count and the estimated cost for both input and output.
guessing token counts based on word count is a good way to hit api limits or overspend. different models tokenize code, whitespace, and non-english text differently. tokcount eliminates the guesswork by running the actual tokenizer.
this is built for developers wrapping llms in their applications. whether you are building a rag pipeline or trying to squeeze a prompt into a large context window, knowing your token budget upfront prevents runtime errors.
calculates the exact price of your prompt based on current api rates. prevents billing surprises.
runs locally without sending your proprietary prompts to a third party. completely private.
upload text files or paste directly into the interface. handles large context windows easily.
shows separate counts for input and estimated output tokens. helps you manage your total budget.
check the token count for a text file using the Claude tokenizer.
// input
tokcount --model claude-3-opus < prompt.txt
// output
tokens: 4,205
estimated input cost: $0.06
estimated output cost: $0.31
it ran the Anthropic tokenizer locally and estimated the api cost based on current pricing.
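the arithmetic behind that estimate is simple: token count times the per-million-token rate. a minimal sketch, assuming Claude 3 Opus list pricing of $15 per million input tokens (rates change, which is why tokcount pulls current ones):

```python
def estimate_cost(tokens: int, rate_per_million: float) -> float:
    """dollar cost for a token count at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000

# the 4,205-token prompt from the example above, at $15 / 1M input tokens
input_cost = estimate_cost(4_205, 15.00)
print(f"estimated input cost: ${input_cost:.2f}")  # estimated input cost: $0.06
```

output cost works the same way, just with the model's (usually higher) output rate and an estimate of how many tokens the response will use.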
we built this because we received a massive bill from OpenAI after a rogue script got stuck in a retry loop with a huge context window. guessing token counts based on character length is a fool's errand.
different models tokenize code and whitespace differently. a prompt that costs pennies on one model might cost dollars on another due to how it handles indentation. we needed a reliable way to check this locally.
yes. we use the official open-source tokenizers provided by the model creators. the count matches what the api will bill you for.
the cli pulls the latest pricing JSON on boot if you have an internet connection. otherwise, it falls back to the last known rates.
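that fetch-with-fallback behavior can be sketched in a few lines. the url, rates, and function name here are placeholders for illustration, not tokcount's actual endpoint or internals:

```python
import json
import urllib.request

# last known rates bundled with the cli (values illustrative, $ per 1M tokens)
FALLBACK_RATES = {"claude-3-opus": {"input": 15.0, "output": 75.0}}

PRICING_URL = "https://example.invalid/pricing.json"  # hypothetical endpoint

def load_rates(url: str = PRICING_URL) -> dict:
    """pull the latest pricing json; fall back to the last known rates offline."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return json.load(resp)
    except (OSError, ValueError):  # no network, bad response, or malformed json
        return FALLBACK_RATES
```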
yes. you can pipe text into tokcount and use the --fail-over flag to exit with an error if a prompt exceeds a certain token limit.
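in a ci pipeline that might look like the following. the exact flag syntax is an assumption, so check `tokcount --help` for the real interface:

```shell
# hypothetical ci gate: fail the build if the assembled prompt blows the budget.
# --fail-over is assumed to take a token limit; verify against tokcount --help.
cat system.txt context.txt | tokcount --model gpt-4 --fail-over 8000 \
  || { echo "prompt exceeds 8,000-token budget"; exit 1; }
```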
because sending a 100k token prompt just to see how big it is wastes bandwidth and risks exposing sensitive data unnecessarily.
// stop guessing how much your prompts cost.