AI Capabilities and Limitations

Know what AI does well — and where it will fail you.

Core mechanism: Claude doesn't look up facts. It predicts the statistically most likely next word, like a supercharged autocomplete. That single mechanism is why it writes fluently and fabricates fluently.
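A toy bigram model makes the mechanism concrete. This sketch (illustrative only, not how Claude actually works internally) counts which word follows which in a tiny corpus, then always emits the most frequent continuation. Note that it will "confidently" generate text whether or not the result is true, which is exactly the fluent-fabrication problem:

```python
from collections import Counter, defaultdict

# Count which word follows which in a tiny corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict_next(word):
    # Return the statistically most likely continuation,
    # like autocomplete: no lookup, no truth-checking.
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" most often here
```

Chaining `predict_next` produces fluent-looking loops ("the cat sat on the mat the cat..."), a miniature of how a language model can sound right without being right.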


What Claude Does Well

| Capability | What it means in practice |
| --- | --- |
| Versatility | Draft emails, translate, write code, explain concepts: same model, no switching |
| Pattern recognition at scale | Read 200 reviews or 500 support tickets → find themes in minutes, not days |
| Few-shot learning | Give 2–3 examples → Claude matches your exact format and tone |
| Tool use (agentic) | Execute code, search the web, connect to Slack/Gmail/Drive via MCP |
| Extended Thinking | Step-by-step reasoning for math, logic, complex debugging (+10–30% accuracy) |

Best for: Tasks in the "capability zone" — mainstream, stable, well-documented topics (e.g., "write a Python script", "summarize this report").
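Few-shot learning from the table above can be sketched as a prompt builder. Everything here (the helper name, the example pairs, the announcement-tone task) is illustrative, not a real API; the point is the shape: a short instruction, 2–3 input/output pairs, then the new input:

```python
# Hypothetical helper: a few-shot prompt so the model copies
# your exact format and tone from a handful of examples.
EXAMPLES = [
    ("Meeting moved to 3pm.", "Heads up: the meeting now starts at 3pm."),
    ("Server down last night.", "Heads up: the server was down last night."),
]

def build_few_shot_prompt(examples, new_input):
    parts = ["Rewrite each note in our standard announcement tone.\n"]
    for raw, polished in examples:
        parts.append(f"Input: {raw}\nOutput: {polished}\n")
    # End on an open "Output:" so the model completes the pattern.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n".join(parts)

prompt = build_few_shot_prompt(EXAMPLES, "Office closed Friday.")
print(prompt)
```

Ending the prompt on a bare `Output:` is the trick: the model's next-word prediction naturally continues the pattern your examples established.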


Where Claude Fails

| Limitation | What happens | Fix |
| --- | --- | --- |
| Knowledge cutoff | Doesn't know anything after its training date; claims stale facts with confidence | Enable the web search tool |
| Hallucination | Fabricates citations, case law, statistics, URLs, all with full confidence | Cross-check every specific number, name, and citation |
| Context window cliff | When the chat fills up, the oldest content is silently dropped, with no warning | Start fresh with a summary; use /compact in Code/Cowork |
| Lost in the middle | Accuracy drops >30% for info buried in the middle of long docs | Put critical instructions at the TOP and BOTTOM of your prompt |
| Non-determinism | Same prompt → different answer each time | Set temperature = 0 for consistency-critical tasks |
| Inherited bias | Defaults to Western, English-speaking, tech-industry norms | Flag local context explicitly: "in the context of Vietnam..." |
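The "lost in the middle" fix in the table is mechanical enough to script. This is a minimal sketch (the function name and tags are my own, not a standard API): sandwich the long document between two copies of the critical instructions so neither end of the context buries them:

```python
# Sandwich fix for lost-in-the-middle: state the task at the TOP,
# put the long document in the middle, then repeat the task at the
# BOTTOM. Tag names and wording here are illustrative.
def sandwich_prompt(instructions, document):
    return (
        f"{instructions}\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Reminder of the task: {instructions}"
    )

p = sandwich_prompt(
    "List every deadline mentioned.",
    "(imagine 200 pages of contract text here)",
)
print(p)
```

The duplication costs a few tokens but guards against the accuracy drop for material buried mid-context.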

The Hallucination Rule

Confidence does not equal accuracy. Claude says "I'm certain" the same way for both correct and fabricated facts.

Hallucinations concentrate in specifics: names, dates, numbers, DOIs, URLs, legal citations.

3 fixes:

  1. Add "If unsure, flag it as [NEEDS VERIFY] — do not make things up" to your prompt
  2. Use RAG (upload the source document) so Claude answers from your file, not memory
  3. Run a separate verifier pass: "For each claim, tag it HIGH / MEDIUM / LOW confidence"
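A crude pre-filter can complement fix 3. This heuristic scanner (rough and illustrative, not a real verifier) flags sentences containing the specifics where hallucinations concentrate, such as numbers, years, and URLs, so you know which claims to send through the confidence-tagging pass:

```python
import re

# Rough heuristic: flag sentences containing the specifics where
# hallucinations concentrate (numbers, years, URLs, DOIs) as
# [NEEDS VERIFY] before trusting them. Pattern is illustrative.
RISKY = re.compile(r"https?://|doi\.org|\b\d+(\.\d+)?%?")

def flag_claims(text):
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tag = "[NEEDS VERIFY] " if RISKY.search(sentence) else ""
        flagged.append(tag + sentence)
    return flagged

out = flag_claims("Revenue grew 40% in 2019. The team was happy.")
print(out)
```

Only the sentence with concrete figures gets flagged; the scanner tells you *where* to verify, while the human (or a second model pass) still does the verifying.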

AI vs. Human — Who Wins Where

| AI wins | Human wins |
| --- | --- |
| Speed, scale, pattern detection | Critical judgment, ethical decisions |
| Broad knowledge recall | Deep domain expertise |
| Parallel processing | Real-time awareness |
| Draft and summarize | Final call and accountability |

The right model: AI does the "broad and fast." You do the "deep and judgment." Combine both for output neither could produce alone.


Source: Claude Course 2026 — NotebookLM
