Context Compaction

Carry Code’s intelligent context compression feature automatically manages conversation length, keeping you efficient during long conversations.


AI conversations have one important limitation: the token count. Each AI model has a maximum token limit (e.g., 128K or 200K).

When a conversation gets long:

  • Exceeding the limit prevents the conversation from continuing
  • Sending long contexts increases API costs
  • Longer contexts slow down AI responses
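
To make the cost point concrete, here is a small arithmetic sketch. It assumes the whole conversation history is resent with every request, which is typical of stateless chat APIs but is not something this page states about Carry Code. The function name is hypothetical.

```python
def cumulative_input_tokens(turn_tokens):
    """Total input tokens sent when every request resends all earlier turns."""
    total = 0
    history = 0
    for tokens in turn_tokens:
        history += tokens   # the new turn joins the history
        total += history    # the whole history is sent with this request
    return total


# Ten turns of 1,000 tokens each: 1,000 + 2,000 + ... + 10,000
print(cumulative_input_tokens([1000] * 10))  # 55000
```

Under that assumption, the total grows much faster than the conversation itself, which is why shortening the history pays off in long sessions.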

Carry Code’s intelligent compression solves these problems.


Carry Code automatically tracks conversation length. When the conversation approaches the token limit, it:

  1. Analyzes the conversation - identifies key and secondary information
  2. Compresses secondary content - simplifies early, detailed exchanges
  3. Preserves key information - keeps important code and decisions
  4. Maintains coherence - so the AI still understands the context

Content Type          Handling
--------------------  ----------------------
Key code              Fully preserved
Important decisions   Fully preserved
Error messages        Condensed
Casual chat           Heavily simplified
Historical details    Selectively compressed
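
The handling rules above can be pictured with a minimal sketch. This is not Carry Code's implementation: the `Message` type, the `kind` labels, the character limits, and the truncation used in place of real summarization are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Handling per content type, mirroring the table above.
POLICY = {
    "key_code": "preserve",
    "decision": "preserve",
    "error": "condense",
    "chat": "simplify",
    "history": "compress",
}

# Illustrative character budgets per action (hypothetical values).
LIMITS = {"condense": 80, "compress": 60, "simplify": 20}


@dataclass
class Message:
    kind: str  # hypothetical label; how content is classified is not documented here
    text: str


def shorten(text, limit):
    """Stand-in for summarization: keep the first `limit` characters."""
    return text if len(text) <= limit else text[:limit].rstrip() + "..."


def compact(messages):
    """Return a new list, in the original order, with secondary content shortened."""
    result = []
    for msg in messages:
        action = POLICY.get(msg.kind, "compress")
        if action == "preserve":
            result.append(msg)
        else:
            result.append(Message(msg.kind, shorten(msg.text, LIMITS[action])))
    return result


conversation = [
    Message("key_code", "def add(a, b):\n    return a + b"),
    Message("chat", "Thanks, that looks great! By the way, how was your weekend?"),
]
compacted = compact(conversation)
```

In this sketch the code message passes through untouched while the casual chat is cut down, and message order is kept so the remaining context still reads coherently.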

/compact

Manually trigger context compression.

Use it when:

  • The conversation has grown too long
  • You want to clear unnecessary history
  • You want to save on API costs

After compression:

  • Token count significantly reduced
  • Key information still preserved
  • Conversation can continue

Configure in ~/.carry/carrycode.json:

{
  "compaction": {
    "enabled": true,
    "threshold": 80000,
    "preserveKeyInfo": true
  }
}

Parameter        Description                             Default
---------------  --------------------------------------  -------
enabled          Enable automatic compression            true
threshold        Token threshold to trigger compression  80000
preserveKeyInfo  Preserve key information                true
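
As a rough sketch of how these settings might be read, the snippet below merges a config over the documented defaults and checks a token count against the threshold. The function names are hypothetical, and the exact comparison Carry Code uses (here, "at or above the threshold") is an assumption.

```python
import json

# Defaults as listed in the parameter table.
DEFAULTS = {"enabled": True, "threshold": 80000, "preserveKeyInfo": True}


def load_compaction_settings(config_text):
    """Merge the "compaction" section of the config over the defaults."""
    user = json.loads(config_text).get("compaction", {})
    return {**DEFAULTS, **user}


def should_compact(token_count, settings):
    """Assumed rule: compact once the token count reaches the threshold."""
    return settings["enabled"] and token_count >= settings["threshold"]


settings = load_compaction_settings('{"compaction": {"threshold": 50000}}')
print(should_compact(60000, settings))  # True
print(should_compact(40000, settings))  # False
```

Any key omitted from the config falls back to its default, so a file that only sets "threshold" still leaves automatic compression enabled.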

What compression does:

  1. Simplifies information - early, detailed exchanges are condensed
  2. Preserves key points - important code and decisions are kept
  3. Lets you keep going - you can continue the previous task

What compression will not do:

  • Lose key information for the current task
  • Delete important code
  • Affect the current conversation context

When a conversation exceeds a certain length, proactively use /compact to:

  • Keep conversations efficient
  • Save costs
  • Avoid hitting token limits

For completely different tasks, creating a new session is better:

  • Avoid context confusion
  • Keep each task clear

When you don’t need to write code, use Plan mode:

  • It won’t generate new code content
  • The context stays simpler

Operation        Effect
---------------  -------------------------
/session new     Brand new blank context
/compact         Compress current context
/session switch  Switch to another context

Q: Will compression lose important information?

No. The compression algorithm prioritizes preserving:

  • Key code snippets
  • Important decisions
  • Current task-related information

Q: Can I continue previous work after compression?

Yes. The AI still understands the compressed context, so you can continue previous tasks.

To disable automatic compression, set "enabled": false in the config file. Keeping it enabled is recommended to avoid hitting token limits.