Claude is powerful out of the box, but most people leave a lot on the table. They let one chat run forever, re-feed the same giant PDF over and over, fight with the model in follow-up messages, and burn their heaviest model on tasks that don’t need it.
None of that is necessary. A handful of small habits will make Claude’s answers sharper, keep your conversations clean, and stretch your usage limits much further. Here are the four that matter most, plus a bonus trick for anyone using Claude Code.
1. Don’t live in a single chat window
Long threads quietly work against you. The more messages pile up in one conversation, the more context Claude has to wade through on every reply, and the easier it is for answers to start drifting away from what you actually asked.
A good rule of thumb: after roughly 20–25 messages, or the moment responses start wandering off-brief, don’t push through it. Pack what matters into a clean summary and start fresh.
The trick is not to summarize by hand. Ask Claude to do it, in a format built for pasting into a new chat:
You are a context compactor.
Compress everything in this chat into a clean brief I can paste into a
fresh Claude chat without losing anything important.
Return it in this format:
TASK: What I'm trying to do.
CURRENT STATE: What's already been decided, created, or discussed.
KEY CONTEXT: Names, numbers, examples, constraints, audience, tone,
preferences, and any details you must keep.
WHAT TO IGNORE: Repeated points, rejected ideas, bad drafts, and
anything no longer useful.
Run that, copy the brief, open a new chat, and paste it in. You get all the signal with none of the accumulated clutter, and Claude starts the next leg of the work with a clear head.
2. Turn long PDFs into a compact, reusable digest
Here’s something people rarely think about: a long document sits in the conversation’s context, and Claude works through it again on each turn. Ask ten separate questions about a 100-page report and you’re effectively making it wrestle with all 100 pages ten times over — slower answers and a much faster trip to your usage limit.
Two ways to avoid that:
First, batch your questions. If you already know you have several things to ask about a document, put them all in one message instead of dribbling them out one at a time.
Second, distill the document once. Turn the heavy PDF into a tight digest you can reuse, so future questions run against a few hundred words instead of a few hundred pages. This prompt does exactly that:
Read this document like a research assistant preparing it for future
chats. Give me a more digestible version:
1. A one-paragraph overview of what this is and why it exists.
2. The core argument or thesis.
3. The 10 most important ideas, claims, or facts.
4. Key numbers, names, dates, definitions, and frameworks. Keep them exact.
5. Any tables, charts, or visuals worth noting.
6. Any conclusions, recommendations, or action items.
7. A short context brief I can paste into a new chat later.
8. 10 questions I should now ask about it.
Rules:
- Use page references where you can.
- Keep the document's exact terms.
- Don't over-summarize or skip useful details just to be short.
- Flag any logical gaps, contradictions, or errors you notice.
Save the result. From then on, you can paste that digest into any new chat and ask away — no more re-loading the full document every time.
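To put rough numbers on the difference, here is a back-of-the-envelope sketch in shell arithmetic. Every figure in it is an assumption chosen for illustration (the token size of the PDF, the size of the digest, the number of questions), not a measurement, and it ignores the questions and answers themselves and any caching that happens behind the scenes.

```shell
# Assumed numbers for illustration only; real token counts vary by document.
doc_tokens=60000      # assumed size of a ~100-page PDF
digest_tokens=1500    # assumed size of the distilled digest
questions=10          # questions asked one per message

# Full PDF stays in context, so it is processed again on every question.
full_cost=$(( doc_tokens * questions ))

# Distill once (one full read), then ask each question against the digest.
digest_cost=$(( doc_tokens + digest_tokens * questions ))

echo "Full PDF on every turn: ${full_cost} tokens"
echo "Digest after one read:  ${digest_cost} tokens"
```

With these assumed figures the digest route processes 75,000 tokens against 600,000, and the gap widens with every extra question.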
3. Edit your prompts instead of correcting them
When Claude misses something, the instinct is to fire off a follow-up message: “No, I meant X,” or “You forgot to mention Y.” That works, but it bloats the thread and spends an extra message on a problem you could have prevented.
The cleaner move is to edit the original prompt. Hit the pencil icon on your message, add the detail that was missing, and regenerate.
This does two useful things. It keeps the conversation short and focused, since the back-and-forth correction never enters the history. And it often produces a better answer, because Claude is responding to a complete instruction rather than patching a half-formed one. Reserve new messages for genuinely new directions, not for fixing the last prompt.
4. Match the model to the task
Running the most powerful model for everything feels safe, but it drains your usage allowance far faster than necessary — and for routine work, you won’t notice the difference anyway. Claude comes in a few tiers, each tuned for a different kind of job:
- Haiku 4.5 — fast and light. Ideal for everyday stuff: rewrites, reformatting, quick summaries, simple lookups, and short back-and-forth questions.
- Sonnet 4.6 — the balanced workhorse. Reach for it when you’re doing real analysis, writing, strategy, or anything that needs solid reasoning without being a research project.
- Opus 4.8 — the heavyweight. Worth it for tasks that genuinely demand deep reasoning: complex research, gnarly problem-solving, multi-step logic, and high-stakes work where getting it exactly right matters most.
The simple discipline: default to a lighter model, and step up to Opus only when the task earns it. You’ll keep more of your limit for the work that actually needs the firepower.
Bonus: Make Claude Code answer in fewer words with the Caveman skill
If you use Claude Code and find the responses too chatty, there’s a community skill called Caveman that trims the fluff. It rewrites Claude’s output into short, fragment-style answers — cutting a large share of output tokens while keeping the technical substance intact. “Brain still big. Mouth small,” as the project puts it.
The cleanest way to install it is through the skills CLI:
npx skills add JuliusBrussee/caveman
Then restart Claude Code (it picks up new skills on launch). You can also toggle it within a session by typing:
/caveman
There’s a one-line installer script too, if you prefer it:
# macOS, Linux, WSL, or Git Bash
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash
# Windows PowerShell
irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.ps1 | iex
You’ll need Node.js 18 or newer for either method.
One sensible caution: Caveman is a third-party tool that installs hooks into your Claude Code configuration, and the one-line installers pipe a script straight from the internet into your shell. That’s convenient, but it’s worth a moment of scrutiny before you run it. If you’d rather see what you’re installing first, use the npx skills add route instead of a piped installer and read the project’s source on GitHub at github.com/JuliusBrussee/caveman.
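If you still want the installer script, one way to act on that caution is to save it to a file, read it, and only then run it. The sketch below assumes bash and curl. The function name fetch_for_review and the local filename caveman-install.sh are made up for illustration and are not part of Caveman.

```shell
# Hypothetical helper: download a script to a local file WITHOUT running it.
fetch_for_review() {
  local url="$1" dest="$2"
  # -f fails on HTTP errors, -sS is quiet but still shows errors, -L follows redirects.
  curl -fsSL "$url" -o "$dest" || return 1
  echo "Saved to $dest. Read it before running it."
}

# Usage, with the installer URL quoted earlier in this section:
#   fetch_for_review https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh caveman-install.sh
#   less caveman-install.sh     # read what it does
#   bash caveman-install.sh     # run it only once you're comfortable
```

Nothing executes until the final bash line, so you can stop at any point if something in the script looks off.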
The takeaway
None of these are complicated. Start a fresh chat when a thread gets heavy, distill long documents once instead of re-reading them, edit prompts rather than correcting them, and pick the model that fits the job. Do those four things consistently and you’ll get noticeably better answers while making your usage go a lot further — and if you live in Claude Code, Caveman is a fun way to cut the chatter.