AI chat interfaces are one of the hardest layout problems on the web. Messages arrive as a stream of tokens. Bubble heights change mid-render. Users scroll up while new content pushes down. Every token that arrives can trigger a reflow, causing scroll jank that makes the experience feel broken.
Pretext solves this by computing text layout in pure JavaScript — before the DOM ever sees it. This post shows how to use Pretext to build AI chat interfaces that are fast, smooth, and pixel-perfect.
## The AI Chat Layout Problem
Traditional chat UIs have a simple layout model: messages arrive one at a time, you append them to the DOM, and the browser handles the rest. But AI chat changed everything:
- Streaming tokens: GPT, Claude, and other LLMs send responses token by token. Each token changes the message length, which changes the bubble height, which shifts every message below it.
- Variable-width bubbles: A good chat UI wraps bubbles tightly around the text instead of stretching them to a fixed max-width. But computing the tightest width that preserves the line count requires measuring the text.
- Long conversations: AI conversations can contain hundreds of messages. Virtual scrolling is essential for performance, but it requires knowing every message height, even for offscreen messages.
- Code blocks and mixed content: AI responses frequently contain code, lists, and formatted text with different font sizes and line heights.
Each of these problems involves the same bottleneck: you need to know how much space text will occupy before you render it.
## How Traditional Chat UIs Handle This
Most chat UIs use one of these approaches:
### Approach 1: Let the Browser Handle It
```js
// Append message, let CSS do the layout
chatContainer.appendChild(messageBubble);
chatContainer.scrollTop = chatContainer.scrollHeight;
```

This works for simple cases but causes visible scroll jumping during streaming. Every token triggers a reflow, and if the user has scrolled up, the scroll position shifts unpredictably.
### Approach 2: Fixed-Width Bubbles
```css
.bubble {
  max-width: 70%;
  /* Browser wraps text within this fixed width */
}
```

This avoids measurement entirely but wastes horizontal space. A short message like "Sure!" gets a bubble 70% of the container width. It looks sloppy, especially on mobile where screen space is precious.
### Approach 3: DOM Pre-Measurement
```js
function measureBubble(text, maxWidth) {
  const hidden = document.createElement('div');
  hidden.style.cssText = `
    position: absolute; visibility: hidden;
    max-width: ${maxWidth}px; font: 14px/20px Inter;
  `;
  hidden.textContent = text;
  document.body.appendChild(hidden);
  const { width, height } = hidden.getBoundingClientRect();
  document.body.removeChild(hidden);
  return { width, height };
}
```

This gives accurate dimensions but triggers a reflow per measurement. During streaming, you might measure 30+ times per second as tokens arrive, each time forcing the browser to recalculate layout.
## The Pretext Approach
Pretext eliminates all three problems at once. Here is the core pattern:
```js
import { prepare, layout } from 'pretext';

// One-time setup: cache font measurements
const engine = prepare({
  fontFamily: 'Inter',
  fontSize: 14,
  lineHeight: 20,
});

function measureMessage(text, maxWidth) {
  const result = layout(engine, text, { maxWidth });
  return {
    width: result.width,
    height: result.height,
    lineCount: result.lines.length,
  };
}
```

After prepare() runs once, every call to layout() is pure math: no DOM, no reflow, no jank. You can call it hundreds of times per frame without affecting rendering performance.
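Because layout() is deterministic for a given engine, identical (text, maxWidth) pairs always produce the same result, so a small memoization wrapper can skip even the arithmetic. This is a hypothetical helper, not part of Pretext's API, and it works over any measure function:

```js
// Hypothetical helper, not part of Pretext's API: memoize any
// measure function on its (text, maxWidth) arguments.
// Note: during streaming the text changes on every token, so in
// practice you would bound the cache or key it per message.
function memoizeMeasure(measure) {
  const cache = new Map();
  return (text, maxWidth) => {
    const key = maxWidth + '\u0000' + text; // NUL never appears in chat text
    let result = cache.get(key);
    if (result === undefined) {
      result = measure(text, maxWidth);
      cache.set(key, result);
    }
    return result;
  };
}

// Demo with a stand-in measure function (8px per character):
let calls = 0;
const measureCached = memoizeMeasure((text, maxWidth) => {
  calls += 1;
  return { width: Math.min(text.length * 8, maxWidth), height: 20 };
});
measureCached('Sure!', 400);
measureCached('Sure!', 400); // second call is served from the cache
```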
## Pattern 1: Smooth Streaming Without Scroll Jank
The biggest win for AI chat is during streaming. As tokens arrive, you need to know whether the new token creates a new line (changing the bubble height) or fits on the current line (no height change).
```js
function handleStreamToken(messageId, fullText) {
  const { height: newHeight } = measureMessage(fullText, maxBubbleWidth);
  const prevHeight = heightCache.get(messageId) || 0;

  if (newHeight !== prevHeight) {
    // Height changed: update the bubble and adjust scroll
    heightCache.set(messageId, newHeight);
    updateBubbleHeight(messageId, newHeight);

    // Only adjust scroll if the user is near the bottom
    if (isNearBottom()) {
      scrollToBottom({ behavior: 'smooth' });
    }
  }

  // Update text content without triggering layout measurement
  updateBubbleText(messageId, fullText);
}
```

The key insight: you only adjust the scroll position when the height actually changes. Most tokens do not create a new line, so most of the time no scroll adjustment is needed. Without Pretext, you would need a DOM measurement for every single token to know this.
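The streaming handler assumes an isNearBottom() helper. Here is a sketch of one way to write it, reading the scroll container's geometry; the 100px threshold is an arbitrary choice:

```js
// Hypothetical helper assumed by handleStreamToken above: treat the
// user as "following" the conversation when they are within
// `threshold` px of the bottom of the scroll container.
function isNearBottom(container, threshold = 100) {
  const { scrollTop, scrollHeight, clientHeight } = container;
  return scrollHeight - (scrollTop + clientHeight) <= threshold;
}
```

Pass the chat container element (or anything exposing the same three properties). A small threshold means even a slight upward scroll pauses auto-follow; a larger one tolerates minor jitter.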
## Pattern 2: Tight-Wrap Bubbles
Tight-wrapping means finding the minimum bubble width that keeps the same number of lines. This eliminates the wasted space of fixed max-width bubbles.
```js
function tightWrap(text, maxWidth) {
  // First, get the natural layout at max width
  const natural = layout(engine, text, { maxWidth });
  const lineCount = natural.lines.length;

  // Binary search for the minimum width that preserves the line count
  let lo = 0;
  let hi = maxWidth;
  while (hi - lo > 1) {
    const mid = (lo + hi) / 2;
    const test = layout(engine, text, { maxWidth: mid });
    if (test.lines.length === lineCount) {
      hi = mid; // Can go narrower
    } else {
      lo = mid; // Too narrow, line count increased
    }
  }

  return {
    width: Math.ceil(hi), // round up so sub-pixel widths never add a line
    height: natural.height,
  };
}
```

This binary search calls layout() about 10–12 times per message. With DOM measurement, that would mean 10–12 reflows per bubble, unusable during streaming. With Pretext, it completes in microseconds.
The result: every bubble fits its content perfectly, just like native messaging apps.
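To make the payoff concrete without the real library, here is a toy stand-in for layout(): a greedy word wrapper that assumes every character is 8px wide. The function and all its numbers are illustrative, not Pretext's API.

```js
// Toy layout() stand-in: greedy word wrap at 8px per character.
function toyLayout(engine, text, { maxWidth }) {
  const lines = [];
  let current = '';
  for (const word of text.split(' ')) {
    const candidate = current ? current + ' ' + word : word;
    if (!current || candidate.length * 8 <= maxWidth) {
      current = candidate;
    } else {
      lines.push(current);
      current = word;
    }
  }
  lines.push(current);
  return {
    lines,
    width: Math.max(...lines.map(l => l.length * 8)),
    height: lines.length * 20,
  };
}

// "Sure!" measures 40px wide, versus the 280px a fixed 70% bubble
// would reserve in a 400px container.
const short = toyLayout(null, 'Sure!', { maxWidth: 400 });
```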
## Pattern 3: Virtual Scroll for Long Conversations
AI conversations can easily reach hundreds of messages. Rendering all of them tanks performance. Virtual scroll renders only the visible messages, but it needs to know the height of every message to calculate scroll positions.
```js
const engine = prepare({
  fontFamily: 'Inter',
  fontSize: 14,
  lineHeight: 20,
});

function buildHeightMap(messages) {
  return messages.map(msg => {
    const result = layout(engine, msg.text, {
      maxWidth: maxBubbleWidth,
    });

    // Add padding, avatar space, timestamp height
    const bubbleHeight = result.height;
    const padding = 24;  // top + bottom padding
    const metadata = 20; // timestamp row
    return bubbleHeight + padding + metadata;
  });
}

// Compute all heights instantly, no DOM needed
const heights = buildHeightMap(allMessages);
const totalHeight = heights.reduce((sum, h) => sum + h, 0);
```

Computing heights for 1,000 messages takes under 10ms with Pretext. The same operation with DOM measurement would take 500ms or more and freeze the UI.
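With the height map built, the virtual scroller still has to turn a scroll offset into a range of visible messages. A prefix-sum array plus binary search makes that an O(log n) lookup. The helper names here are my own sketch, not a Pretext API:

```js
// offsets[i] = y-position of message i; the last entry is the total height.
function buildOffsets(heights) {
  const offsets = [0];
  for (const h of heights) offsets.push(offsets[offsets.length - 1] + h);
  return offsets;
}

// Find the inclusive range of messages intersecting the viewport.
function visibleRange(offsets, scrollTop, viewportHeight) {
  // Binary search for the first message whose bottom edge is past scrollTop
  let lo = 0;
  let hi = offsets.length - 2;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (offsets[mid + 1] <= scrollTop) lo = mid + 1;
    else hi = mid;
  }
  const first = lo;

  // Walk forward until a message starts below the viewport
  let last = first;
  while (last + 1 < offsets.length - 1 && offsets[last + 1] < scrollTop + viewportHeight) {
    last += 1;
  }
  return { first, last }; // render only messages[first..last]
}
```

On each scroll event, render only that slice inside a spacer whose height is the final offsets entry, positioning the slice with translateY(offsets[first]).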
## Pattern 4: Pre-Computing Heights During Fetch
One of Pretext's unique advantages is that you can compute layout before the component mounts. This means you can calculate heights while chat history is loading:
```js
async function loadChatHistory(chatId) {
  const messages = await fetchMessages(chatId);

  // Compute heights immediately, no need to wait for mount
  const heights = messages.map(msg => {
    const result = layout(engine, msg.text, { maxWidth: 400 });
    return result.height + 24; // + padding
  });

  return { messages, heights };
}
```

When the component mounts, it already knows every message height. There is no measurement pass, no layout shift, and no flash of incorrectly sized content.
## Handling Code Blocks
AI responses often contain code blocks with a different font. Pretext handles this by creating separate engines for each font configuration:
```js
const textEngine = prepare({
  fontFamily: 'Inter',
  fontSize: 14,
  lineHeight: 20,
});

const codeEngine = prepare({
  fontFamily: 'JetBrains Mono',
  fontSize: 13,
  lineHeight: 20,
});

function measureAIResponse(blocks) {
  let totalHeight = 0;

  for (const block of blocks) {
    const engine = block.type === 'code' ? codeEngine : textEngine;
    const maxWidth = block.type === 'code' ? codeBlockWidth : textWidth;
    const result = layout(engine, block.content, { maxWidth });
    totalHeight += result.height;

    // Add block spacing; code blocks have more padding
    totalHeight += block.type === 'code' ? 32 : 8;
  }

  return totalHeight;
}
```

## Performance Numbers
Here are rough benchmarks for a typical AI chat scenario (1,000 messages, average 50 words each):
| Operation | DOM Measurement | Pretext |
|---|---|---|
| Initial height map | ~500ms | ~8ms |
| Per-token streaming measurement | ~0.5ms (with reflow) | ~0.01ms |
| Tight-wrap (12 iterations) | ~6ms | ~0.1ms |
| Memory overhead | Creates/destroys DOM nodes | ~50KB cached engine |
The streaming measurement is the most critical number. At 30 tokens per second (typical for LLM streaming), DOM measurement consumes about 15ms of layout work per second, roughly one full 16.7ms frame budget lost every second at 60fps, and because each measurement forces a synchronous reflow, that cost lands exactly when the UI is trying to paint. Pretext uses about 0.3ms per second, essentially free.
## Integration with React
Here is a minimal React hook for using Pretext in a chat component:
```jsx
import { prepare, layout } from 'pretext';
import { useRef, useMemo } from 'react';

function usePretextEngine(fontConfig) {
  const engineRef = useRef(null);
  if (!engineRef.current) {
    engineRef.current = prepare(fontConfig);
  }
  // Note: the engine is created from the first fontConfig and reused;
  // later changes to fontConfig are ignored.
  return engineRef.current;
}

function useBubbleSize(text, maxWidth, fontConfig) {
  const engine = usePretextEngine(fontConfig);
  return useMemo(() => {
    if (!text) return { width: 0, height: 0 };
    const result = layout(engine, text, { maxWidth });
    return { width: result.width, height: result.height };
  }, [engine, text, maxWidth]);
}

// Usage in a message component
function ChatBubble({ message, maxWidth }) {
  const { width, height } = useBubbleSize(
    message.text,
    maxWidth,
    { fontFamily: 'Inter', fontSize: 14, lineHeight: 20 }
  );

  return (
    <div style={{ width, minHeight: height }} className="chat-bubble">
      {message.text}
    </div>
  );
}
```

## When Not to Use Pretext for Chat
Pretext works with plain text strings. If your AI chat renders rich HTML (bold, italic, links, inline images), Pretext cannot measure the mixed layout. In those cases, you have two options:
- Use Pretext for height estimation: Measure the plain text version for scroll calculations, and let the browser handle the final rich render. The estimate will be close enough for smooth scrolling.
- Use Pretext for the text segments: Parse the markdown into blocks, measure each text block with Pretext, and add fixed heights for non-text elements (images, embeds).
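The height-estimation option needs a plain-text approximation of the rendered markdown. A crude stripper like the following is usually close enough for scroll math; it is a sketch that handles only the common inline cases:

```js
// Hypothetical markdown stripper for height estimation only.
function toPlainText(markdown) {
  return markdown
    .replace(/^`{3}.*$/gm, '')                  // drop code-fence lines, keep the code
    .replace(/`([^`]*)`/g, '$1')                // inline code
    .replace(/\*\*([^*]*)\*\*/g, '$1')          // bold
    .replace(/\*([^*]*)\*/g, '$1')              // italic
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, '$1'); // links and images -> label/alt text
}

// Then measure the stripped text instead of the rich render, e.g.
// layout(engine, toPlainText(md), { maxWidth })
```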
## Conclusion
AI chat interfaces push web layout to its limits. Streaming tokens, tight-wrap bubbles, and long conversation histories all demand text measurement at a scale that DOM-based approaches cannot handle without jank.
Pretext gives you exact text dimensions through pure JavaScript computation. No reflows, no hidden elements, no main thread blocking. For AI chat, the benefits are immediate and dramatic: smooth streaming, pixel-perfect bubbles, and instant virtual scroll — all without touching the DOM.
Try it yourself in the Pretext Playground, or see the tight chat bubbles demo to see the difference between CSS max-width and Pretext-wrapped bubbles side by side.