multiple tokenizers
supports the exact algorithms used by GPT-4, Claude, and Gemini. no generic word-count guessing.
count tokens and estimate llm api costs offline.
tokcount tells you exactly how many tokens your prompt consumes across different models. it calculates the cost before you send the request.
tokcount is a local token counter for llm prompts. you paste your text, and it runs the exact tokenization algorithms used by OpenAI, Anthropic, and Google. it shows you the token count and the estimated cost for both input and output.
guessing token counts based on word count is a good way to hit api limits or overspend. different models tokenize code, whitespace, and non-english text differently. tokcount eliminates the guesswork by running the actual tokenizer.
this is built for developers wrapping llms in their applications. whether you are building a rag pipeline or trying to squeeze a prompt into a large context window, knowing your token budget upfront prevents runtime errors.
calculates the exact price of your prompt based on current api rates. prevents billing surprises.
runs locally without sending your proprietary prompts to a third party. completely private.
upload text files or paste directly into the interface. handles large context windows easily.
shows separate counts for input and estimated output tokens. helps you manage your total budget.
check the token count for a text file using the Claude tokenizer.
// input
tokcount --model claude-3-opus < prompt.txt
// output
tokens: 4,205
estimated input cost: $0.06
estimated output cost: $0.31
it ran the Anthropic tokenizer locally and estimated the api cost based on current pricing.
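the arithmetic behind that estimate is simple: token count times the per-million-token rate. a minimal sketch, assuming Claude 3 Opus list pricing of $15 per million input tokens (rates change, which is why tokcount pulls current ones):

```python
def estimate_cost(tokens: int, rate_per_million: float) -> float:
    """dollar cost for a token count at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000

# the 4,205-token prompt from the example above, at $15 / 1M input tokens
input_cost = estimate_cost(4_205, 15.00)
print(f"estimated input cost: ${input_cost:.2f}")  # estimated input cost: $0.06
```

output cost works the same way, just with the model's (usually higher) output rate and an estimate of how many tokens the response will use.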
we built this because we received a massive bill from OpenAI after a rogue script got stuck in a retry loop with a huge context window. guessing token counts based on character length is a fool's errand.
different models tokenize code and whitespace differently. a prompt that costs pennies on one model might cost dollars on another due to how it handles indentation. we needed a reliable way to check this locally.
yes. we use the official open-source tokenizers provided by the model creators. the count matches what the api will bill you for.
the cli pulls the latest pricing JSON on boot if you have an internet connection. otherwise, it falls back to the last known rates.
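that fetch-with-fallback behavior can be sketched in a few lines. the url, rates, and function name here are placeholders for illustration, not tokcount's actual endpoint or internals:

```python
import json
import urllib.request

# last known rates bundled with the cli (values illustrative, $ per 1M tokens)
FALLBACK_RATES = {"claude-3-opus": {"input": 15.0, "output": 75.0}}

PRICING_URL = "https://example.invalid/pricing.json"  # hypothetical endpoint

def load_rates(url: str = PRICING_URL) -> dict:
    """pull the latest pricing json; fall back to the last known rates offline."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return json.load(resp)
    except (OSError, ValueError):  # no network, bad response, or malformed json
        return FALLBACK_RATES
```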
yes. you can pipe text into tokcount and use the --fail-over flag to exit with an error if a prompt exceeds a certain token limit.
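in a ci pipeline that might look like the following. the exact flag syntax is an assumption, so check `tokcount --help` for the real interface:

```shell
# hypothetical ci gate: fail the build if the assembled prompt blows the budget.
# --fail-over is assumed to take a token limit; verify against tokcount --help.
cat system.txt context.txt | tokcount --model gpt-4 --fail-over 8000 \
  || { echo "prompt exceeds 8,000-token budget"; exit 1; }
```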
because sending a 100k token prompt just to see how big it is wastes bandwidth and risks exposing sensitive data unnecessarily.
// stop guessing how much your prompts cost.