AI Capabilities and Limitations

Know what AI does well — and where it will fail you.

Core mechanism: Claude doesn't look up facts. It predicts the statistically most likely next word, like a supercharged autocomplete. That single mechanism is why it writes fluently and fabricates fluently.
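A toy bigram model makes the mechanism concrete. This sketch (illustrative only, not how Claude actually works internally) counts which word follows which in a tiny corpus, then always emits the most frequent continuation. Note that it will "confidently" generate text whether or not the result is true, which is exactly the fluent-fabrication problem:

```python
from collections import Counter, defaultdict

# Count which word follows which in a tiny corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict_next(word):
    # Return the statistically most likely continuation,
    # like autocomplete: no lookup, no truth-checking.
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" most often here
```

Chaining `predict_next` produces fluent-looking loops ("the cat sat on the mat the cat..."), a miniature of how a language model can sound right without being right.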


What Claude Does Well

| Capability | What it means in practice |
| --- | --- |
| Versatility | Draft emails, translate, write code, explain concepts: same model, no switching |
| Pattern recognition at scale | Read 200 reviews or 500 support tickets → find themes in minutes, not days |
| Few-shot learning | Give 2–3 examples → Claude matches your exact format and tone |
| Tool use (agentic) | Execute code, search the web, connect to Slack/Gmail/Drive via MCP |
| Extended Thinking | Step-by-step reasoning for math, logic, complex debugging (+10–30% accuracy) |

Best for: Tasks in the "capability zone" — mainstream, stable, well-documented topics (e.g., "write a Python script", "summarize this report").
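Few-shot learning from the table above can be sketched as a prompt builder. Everything here (the helper name, the example pairs, the announcement-tone task) is illustrative, not a real API; the point is the shape: a short instruction, 2–3 input/output pairs, then the new input:

```python
# Hypothetical helper: a few-shot prompt so the model copies
# your exact format and tone from a handful of examples.
EXAMPLES = [
    ("Meeting moved to 3pm.", "Heads up: the meeting now starts at 3pm."),
    ("Server down last night.", "Heads up: the server was down last night."),
]

def build_few_shot_prompt(examples, new_input):
    parts = ["Rewrite each note in our standard announcement tone.\n"]
    for raw, polished in examples:
        parts.append(f"Input: {raw}\nOutput: {polished}\n")
    # End on an open "Output:" so the model completes the pattern.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n".join(parts)

prompt = build_few_shot_prompt(EXAMPLES, "Office closed Friday.")
print(prompt)
```

Ending the prompt on a bare `Output:` is the trick: the model's next-word prediction naturally continues the pattern your examples established.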


Where Claude Fails

| Limitation | What happens | Fix |
| --- | --- | --- |
| Knowledge cutoff | Doesn't know anything after its training date; claims stale facts with confidence | Enable the web search tool |
| Hallucination | Fabricates citations, case law, statistics, URLs, all with full confidence | Cross-check every specific number, name, and citation |
| Context window cliff | When the chat fills up, the oldest content is silently dropped, with no warning | Start fresh with a summary; use /compact in Code/Cowork |
| Lost in the middle | Accuracy drops >30% for info buried in the middle of long docs | Put critical instructions at the TOP and BOTTOM of your prompt |
| Non-determinism | Same prompt → different answer each time | Set temperature = 0 for consistency-critical tasks |
| Inherited bias | Defaults to Western, English-speaking, tech-industry norms | Flag local context explicitly: "in the context of Vietnam..." |
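The "lost in the middle" fix in the table is mechanical enough to script. This is a minimal sketch (the function name and tags are my own, not a standard API): sandwich the long document between two copies of the critical instructions so neither end of the context buries them:

```python
# Sandwich fix for lost-in-the-middle: state the task at the TOP,
# put the long document in the middle, then repeat the task at the
# BOTTOM. Tag names and wording here are illustrative.
def sandwich_prompt(instructions, document):
    return (
        f"{instructions}\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Reminder of the task: {instructions}"
    )

p = sandwich_prompt(
    "List every deadline mentioned.",
    "(imagine 200 pages of contract text here)",
)
print(p)
```

The duplication costs a few tokens but guards against the accuracy drop for material buried mid-context.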

The Hallucination Rule

Confidence does not equal accuracy. Claude says "I'm certain" the same way for both correct and fabricated facts.

Hallucinations concentrate in specifics: names, dates, numbers, DOIs, URLs, legal citations.

3 fixes:

  1. Add "If unsure, flag it as [NEEDS VERIFY] — do not make things up" to your prompt
  2. Use RAG (upload the source document) so Claude answers from your file, not memory
  3. Run a separate verifier pass: "For each claim, tag it HIGH / MEDIUM / LOW confidence"
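A crude pre-filter can complement fix 3. This heuristic scanner (rough and illustrative, not a real verifier) flags sentences containing the specifics where hallucinations concentrate, such as numbers, years, and URLs, so you know which claims to send through the confidence-tagging pass:

```python
import re

# Rough heuristic: flag sentences containing the specifics where
# hallucinations concentrate (numbers, years, URLs, DOIs) as
# [NEEDS VERIFY] before trusting them. Pattern is illustrative.
RISKY = re.compile(r"https?://|doi\.org|\b\d+(\.\d+)?%?")

def flag_claims(text):
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tag = "[NEEDS VERIFY] " if RISKY.search(sentence) else ""
        flagged.append(tag + sentence)
    return flagged

out = flag_claims("Revenue grew 40% in 2019. The team was happy.")
print(out)
```

Only the sentence with concrete figures gets flagged; the scanner tells you *where* to verify, while the human (or a second model pass) still does the verifying.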

AI vs. Human — Who Wins Where

| AI wins | Human wins |
| --- | --- |
| Speed, scale, pattern detection | Critical judgment, ethical decisions |
| Broad knowledge recall | Deep domain expertise |
| Parallel processing | Real-time awareness |
| Draft and summarize | Final call and accountability |

The right model: AI does the "broad and fast." You do the "deep and judgment." Combine both for output neither could produce alone.


Source: Claude Course 2026 — NotebookLM
