Message Array Design Considerations
Version: 1.0.0
Date: 2025-01-29
Status: ACTIVE ISSUE
The Problem
Messages in sessions are stored as arrays, which creates a fundamental React re-rendering issue:
// Current structure
session: {
  id: 'session-123',
  messages: [
    { type: 'user', content: '...', timestamp: '...' },
    { type: 'assistant', content: '...', timestamp: '...' },
    // ... potentially hundreds of messages
  ]
}
Why This Is Problematic
- Array Immutability: Adding one message requires creating a new array
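- Reference Changes: A new array is a new reference, which React treats as changed state and re-renders
- Cascading Updates: Every component that depends on messages re-renders, including those showing unchanged messages
- Performance Impact: An array of 100+ messages is replaced to add 1 new message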
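To make the reference problem concrete, the sketch below compares an immutable append with a full replacement of the kind a whole-session resend produces. The data and variable names are illustrative only and are not taken from the codebase; the JSON round trip merely simulates receiving the array over the wire.

```javascript
// Illustrative data; not taken from the real session store.
const before = [
  { id: 'msg-1', type: 'user', content: 'hi' },
  { id: 'msg-2', type: 'assistant', content: 'hello' },
];

// Appending immutably creates a new array, but the existing
// message objects are reused by reference.
const appended = [...before, { id: 'msg-3', type: 'user', content: 'thanks' }];

// A full resend (simulated here with a JSON round trip) creates a new
// array AND a new object for every message, even the unchanged ones.
const resent = JSON.parse(JSON.stringify(appended));

const arrayChanged = appended !== before;
const itemsReusedOnAppend = appended[0] === before[0];
const itemsReusedOnResend = resent[0] === appended[0];
```

Here `arrayChanged` and `itemsReusedOnAppend` are both true, while `itemsReusedOnResend` is false. A memoized component that receives the message object itself as a prop can skip work after an append, but sees a new reference for every message after a resend.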
Current Behavior Analysis
What Happens When Claude Responds
1. Claude writes to JSONL file
2. Chokidar detects file change
3. Server reads ENTIRE session (all messages)
4. Server sends ENTIRE messages array via WebSocket
5. Client replaces ENTIRE messages array
6. React re-renders EVERYTHING that depends on messages
Performance Impact
- Data Transfer: Sending 100 messages when only 1 is new
- Memory: Creating new 100-element array for each update
- CPU: React diffing 100 messages unnecessarily
- UX: Potential scroll position loss, input focus issues
Solution Options
Option 1: Message Streaming (RECOMMENDED)
Concept: Send individual messages as they arrive, not arrays
Implementation:
// WebSocket message
{
  type: 'message_streamed',
  sessionId: 'session-123',
  message: {
    id: 'msg-48',
    type: 'assistant',
    content: '...',
    timestamp: '...',
    isStreaming: true // Still being written
  }
}
// Component handling
// Assumes `websocket` is an app-level event emitter with on/off methods.
import { useEffect, useState } from 'react';

const useMessageStream = (sessionId) => {
  const [messages, setMessages] = useState([]);
  useEffect(() => {
    // Start from an empty list whenever the session changes
    setMessages([]);
    const handler = (event) => {
      if (event.sessionId !== sessionId) return;
      setMessages(prev => {
        // Check if updating existing (streaming) or appending new
        const existingIndex = prev.findIndex(m => m.id === event.message.id);
        if (existingIndex >= 0) {
          // Update existing message (streaming content)
          const updated = [...prev];
          updated[existingIndex] = event.message;
          return updated;
        }
        // Append new message
        return [...prev, event.message];
      });
    };
    websocket.on('message_streamed', handler);
    return () => websocket.off('message_streamed', handler);
  }, [sessionId]);
  return messages;
};
Pros:
- Minimal data transfer (one message at a time)
- Incremental updates
- Supports streaming responses
- Natural for real-time chat
Cons:
- Need to handle message ordering
- Complex for message edits/deletions
Option 2: Delta Updates
Concept: Send only what changed
Implementation:
// WebSocket message for new messages
{
  type: 'messages_appended',
  sessionId: 'session-123',
  newMessages: [
    { id: 'msg-48', type: 'assistant', content: '...' }
  ],
  afterMessageId: 'msg-47' // For ordering
}
// Reducer handling
case 'MESSAGES_APPENDED': {
  // Braces scope the const declaration to this case
  const session = findSession(state, action.sessionId);
  return updateSession(state, action.sessionId, {
    messages: [...session.messages, ...action.newMessages]
  });
}
Pros:
- Clear semantics
- Supports batch updates
- Preserves message order
Cons:
- Still creates new array
- Complex for mid-array insertions
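The `afterMessageId` field can also serve as a consistency check on the client. The sketch below is one possible way to apply a delta, assuming a hypothetical helper named `applyAppend` and assuming that an empty session is represented by an `afterMessageId` of `null`; neither is defined in the current codebase.

```javascript
// Hypothetical helper: applies a 'messages_appended' delta to a message array.
// Returns { messages, needsResync }. needsResync is true when the client's
// last message does not match afterMessageId, meaning a delta was missed.
function applyAppend(messages, newMessages, afterMessageId) {
  const lastId = messages.length > 0 ? messages[messages.length - 1].id : null;
  if (lastId !== afterMessageId) {
    // Leave the array untouched; the caller should request a full reload.
    return { messages, needsResync: true };
  }
  return { messages: [...messages, ...newMessages], needsResync: false };
}

// Example usage with illustrative data.
const current = [{ id: 'msg-46' }, { id: 'msg-47' }];
const inSync = applyAppend(current, [{ id: 'msg-48' }], 'msg-47');
const outOfSync = applyAppend(current, [{ id: 'msg-50' }], 'msg-49');
```

When `needsResync` is true the original array is returned unchanged, so no re-render is triggered until the reload completes.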
Option 3: Virtual Scrolling with Pagination
Concept: Only load visible messages
Implementation:
// State structure
{
  sessions: {
    'session-123': {
      totalMessages: 148,
      loadedRanges: [[140, 148]], // Half-open range [140, 148): only the last 8 messages loaded
      messages: new Map() // Keyed by message index; holds only the loaded messages
    }
  }
}
// Load on demand
const loadMessageRange = async (sessionId, start, end) => {
  const messages = await fetchFromJSONL(sessionId, start, end);
  dispatch({ type: 'MESSAGES_LOADED', sessionId, messages, start, end });
};
Pros:
- Scales to thousands of messages
- Minimal memory usage
- Fast initial load
Cons:
- Complex implementation
- Need virtual scroll component
- Search becomes harder
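Part of the complexity is working out which messages still need fetching when the user scrolls. The sketch below shows one way to compute that from `loadedRanges`. The function name `missingRanges` is hypothetical, and it assumes ranges are half-open `[from, to)` pairs of message indices.

```javascript
// Returns the sub-ranges of [start, end) not covered by any loaded range.
// Ranges are half-open [from, to) pairs; this convention is an assumption.
function missingRanges(loadedRanges, start, end) {
  const sorted = [...loadedRanges].sort((a, b) => a[0] - b[0]);
  const missing = [];
  let cursor = start; // first index not yet known to be loaded
  for (const [from, to] of sorted) {
    if (to <= cursor) continue; // range lies entirely before the cursor
    if (from >= end) break;     // range lies entirely after the window
    if (from > cursor) missing.push([cursor, from]);
    cursor = Math.max(cursor, to);
    if (cursor >= end) break;
  }
  if (cursor < end) missing.push([cursor, end]);
  return missing;
}

// The user scrolls up so that indices 130..147 become visible,
// while only [140, 148) is loaded.
const toFetch = missingRanges([[140, 148]], 130, 148);
```

Each returned pair could be passed to `loadMessageRange` as its `start` and `end` arguments.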
Option 4: Immutable Data Structure
Concept: Use Immutable.js or Immer for efficient updates
Implementation:
import { produce } from 'immer';
// Reducer with Immer
case 'APPEND_MESSAGE':
  return produce(state, draft => {
    const session = draft.sessions[action.sessionId];
    session.messages.push(action.message);
    // Immer records the mutation on the draft and returns a new state,
    // reusing every part of the old state that was not touched
  });
Pros:
- Structural sharing (reuses unchanged parts)
- Simpler update code
- Better performance for large structures
Cons:
- Additional dependency
- Learning curve
- Integration complexity
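To show what structural sharing means in practice, the sketch below writes out by hand, with no library, the kind of result Immer produces from push-style code. The state shape (sessions keyed by ID) and the function name `appendMessage` are assumptions made for illustration.

```javascript
// Hand-written append with structural sharing: only the objects on the
// path to the changed session are copied; everything else is reused.
function appendMessage(state, sessionId, message) {
  const session = state.sessions[sessionId];
  return {
    ...state,
    sessions: {
      ...state.sessions,
      [sessionId]: { ...session, messages: [...session.messages, message] },
    },
  };
}

// Illustrative state with two sessions.
const state = {
  sessions: {
    a: { id: 'a', messages: [{ id: 'msg-1' }] },
    b: { id: 'b', messages: [] },
  },
};
const next = appendMessage(state, 'a', { id: 'msg-2' });

const untouchedSessionShared = next.sessions.b === state.sessions.b;
const existingMessageShared =
  next.sessions.a.messages[0] === state.sessions.a.messages[0];
```

Both checks are true: session `b` and the existing message in session `a` keep their references, so components that depend only on them have nothing new to render.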
Recommended Approach
Phase 1: Message Streaming (Quick Win)
Implement message streaming for new messages while keeping array structure:
- WebSocket sends individual messages as they arrive
- Client appends to existing array
- Use React.memo to prevent unnecessary re-renders
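For the React.memo step, one option is a per-message comparator that compares fields rather than object references, so a message that arrives as a new object with identical content is not re-rendered. The sketch assumes a `Message` component that receives the message as a single `message` prop; the comparator name and the choice of compared fields are hypothetical.

```javascript
// Hypothetical comparator for a memoized per-message component.
// Returning true tells React.memo to skip the re-render.
function areMessagePropsEqual(prevProps, nextProps) {
  const a = prevProps.message;
  const b = nextProps.message;
  return (
    a.id === b.id &&
    a.content === b.content &&
    a.isStreaming === b.isStreaming
  );
}

// Intended use (not executed here):
//   const MemoMessage = React.memo(Message, areMessagePropsEqual);

// Illustrative props: a copy with the same fields, and a streamed update.
const original = { message: { id: 'msg-1', content: 'hi', isStreaming: true } };
const sameFields = { message: { ...original.message } };
const streamed = { message: { ...original.message, content: 'hi there' } };

const skipsCopy = areMessagePropsEqual(original, sameFields);
const skipsStreamed = areMessagePropsEqual(original, streamed);
```

`skipsCopy` is true and `skipsStreamed` is false, so a streaming update to one message re-renders only that message.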
Phase 2: Smart Diffing
Implement intelligent message comparison:
// Memoize message list component
const MessageList = React.memo(({ messages }) => {
  return messages.map(msg => <Message key={msg.id} {...msg} />);
}, (prevProps, nextProps) => {
  // Return true to skip the re-render: same count, same last message ID,
  // and same last message content (so streaming updates still render)
  const prevLast = prevProps.messages[prevProps.messages.length - 1];
  const nextLast = nextProps.messages[nextProps.messages.length - 1];
  return prevProps.messages.length === nextProps.messages.length &&
    prevLast?.id === nextLast?.id &&
    prevLast?.content === nextLast?.content;
});
Phase 3: Virtual Scrolling (Future)
For sessions with 100+ messages, implement virtual scrolling to only render visible messages.
Implementation Checklist
Immediate (Week 1)
- Modify server to detect new vs existing messages
- Send only new messages via WebSocket
- Update client to append rather than replace
- Add React.memo to MessageList component
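The first two items can be sketched as a pure function on the server side. This is a minimal sketch, assuming the JSONL file is append-only with one message per line and that the server remembers how many messages it has already sent for each session; `extractNewMessages` is a hypothetical name.

```javascript
// Hypothetical helper: parse JSONL text and return only the messages
// beyond the count already sent to the client.
function extractNewMessages(jsonlText, alreadySentCount) {
  const lines = jsonlText.split('\n').filter(line => line.trim() !== '');
  const fresh = [];
  for (const line of lines.slice(alreadySentCount)) {
    try {
      fresh.push(JSON.parse(line));
    } catch {
      // A partially written last line; stop and retry on the next file change.
      break;
    }
  }
  return fresh;
}

// Illustrative file contents: two messages already sent, one new.
const fileText = '{"id":"msg-1"}\n{"id":"msg-2"}\n{"id":"msg-3"}\n';
const newMessages = extractNewMessages(fileText, 2);
```

The caller would advance its sent count by `newMessages.length` after a successful send. This version still reads the whole file; tracking a byte offset instead would avoid that, at the cost of more bookkeeping.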
Short-term (Week 2)
- Implement message streaming for real-time updates
- Add message ID tracking to prevent duplicates
- Optimize message comparison logic
- Test with large message arrays (500+ messages)
Long-term (Month 2)
- Evaluate virtual scrolling libraries
- Implement pagination for message history
- Add message search with efficient indexing
- Consider IndexedDB for client-side message cache
Performance Targets
| Metric | Current | Target | Streaming Target |
|---|---|---|---|
| Data transfer per new message | ~50KB | <2KB | <500 bytes |
| Re-renders per new message | 10-20 | 3-5 | 1-2 |
| Time to display new message | 800ms | 200ms | <100ms |
| Memory for 100 messages | ~500KB | ~200KB | ~100KB |
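The data transfer row can be checked with a small measurement helper. The sketch below uses synthetic messages, so the sizes it produces say nothing about the real system; `payloadBytes` and the `session_updated` event name are hypothetical.

```javascript
// Size in bytes of a value once serialized as JSON and encoded as UTF-8.
function payloadBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

// Synthetic session: 100 messages with 200 characters of content each.
const messages = Array.from({ length: 100 }, (_, i) => ({
  id: `msg-${i + 1}`,
  type: i % 2 === 0 ? 'user' : 'assistant',
  content: 'x'.repeat(200),
  timestamp: '2025-01-29T00:00:00Z',
}));

// Full-array payload ('session_updated' is a made-up event name).
const fullPayload = payloadBytes({
  type: 'session_updated',
  sessionId: 'session-123',
  messages,
});

// Single-message payload, shaped like the message_streamed event.
const streamedPayload = payloadBytes({
  type: 'message_streamed',
  sessionId: 'session-123',
  message: messages[messages.length - 1],
});
```

Running the same helper against captured WebSocket payloads would give the real "Current" figure to compare against the targets.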
Testing Strategy
- Unit Tests: Message append logic
- Performance Tests: Measure re-renders with React DevTools
- Load Tests: 1000+ messages in a session
- Real-world Tests: Actual chat conversations
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Message ordering issues | High | Use timestamp + sequence number |
| Duplicate messages | Medium | Track message IDs |
| Lost messages | High | Implement acknowledgments |
| Memory leaks | Medium | Limit cached messages |
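The ordering and duplicate mitigations can be combined in a single merge step on the client. The sketch assumes each message carries a server-assigned `seq` number, a field that does not exist in the current message shape, and uses a hypothetical function name `mergeMessage`.

```javascript
// Hypothetical merge: dedupe by id, keep the array ordered by `seq`.
function mergeMessage(messages, incoming) {
  const existing = messages.findIndex(m => m.id === incoming.id);
  if (existing >= 0) {
    // Duplicate or streaming update: replace in place, keep position.
    const copy = [...messages];
    copy[existing] = incoming;
    return copy;
  }
  // New message: insert before the first message with a greater seq.
  const at = messages.findIndex(m => m.seq > incoming.seq);
  if (at === -1) return [...messages, incoming];
  return [...messages.slice(0, at), incoming, ...messages.slice(at)];
}

// Messages arriving out of order, followed by a repeat of msg-2.
let list = [];
list = mergeMessage(list, { id: 'msg-1', seq: 1, content: 'a' });
list = mergeMessage(list, { id: 'msg-3', seq: 3, content: 'c' });
list = mergeMessage(list, { id: 'msg-2', seq: 2, content: 'b' });
list = mergeMessage(list, { id: 'msg-2', seq: 2, content: 'b (final)' });
const order = list.map(m => m.id).join(',');
```

The common case, where a message arrives in order, still reduces to a plain append.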
Conclusion
The message array structure is a core architectural challenge: every update currently replaces the full array and re-renders everything that depends on it. Moving to a streaming model with incremental updates targets 1-2 re-renders and under 500 bytes of transfer per new message (see Performance Targets). This should be prioritized in the migration plan.