chenglou/pretext

[!tip] Title Pretext: a pure JavaScript/TypeScript library for multiline text measurement and layout

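[!abstract] Summary Pretext is a pure JavaScript/TypeScript library for multiline text measurement and layout. It implements its own text measurement logic, using the browser's font engine as ground truth, which avoids expensive DOM measurement and layout reflow while staying fast and accurate. The library supports all languages, including emoji and mixed bidirectional text, and can render to DOM, Canvas, SVG, and other targets. Its API falls into two groups: prepare and layout compute paragraph height efficiently, for cases such as virtualization; prepareWithSegments and its related functions give finer-grained, line-level manual layout control for advanced uses such as Canvas rendering. The library also includes helpers for cache management and locale settings, and follows common CSS text-wrapping rules.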

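Key points

Pretext is a pure JavaScript/TypeScript library that measures and lays out multiline text quickly and accurately by avoiding DOM reflow.
It supports all languages (including emoji and mixed bidirectional text) and can render text to DOM, Canvas, SVG, and other targets.
Its core API falls into two groups: one computes paragraph height efficiently, the other gives fine-grained, line-level manual layout control.
The library includes helper functions for cache management and locale settings, and follows common CSS text-wrapping rules such as overflow-wrap: break-word.
Development was inspired by Sebastian Markbåge's earlier work; the library currently targets common text settings such as white-space: normal or pre-wrap.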

Pretext

Pure JavaScript/TypeScript library for multiline text measurement & layout. Fast, accurate & supports all the languages you didn't even know about. Allows rendering to DOM, Canvas, SVG and soon, server-side.

Pretext side-steps the need for DOM measurements (e.g. getBoundingClientRect, offsetHeight), which trigger layout reflow, one of the most expensive operations in the browser. It implements its own text measurement logic, using the browsers' own font engine as ground truth (very AI-friendly iteration method).

Installation

npm install @chenglou/pretext

Demos

Clone the repo, run bun install, then bun start, and open /demos in your browser (no trailing slash; the Bun dev server has a bug with those). Alternatively, see them live at chenglou.me/pretext. There are some more at somnai-dreams.github.io/pretext-demos

API

Pretext serves 2 use cases:

1. Measure a paragraph's height without ever touching DOM

import { prepare, layout } from '@chenglou/pretext'

const prepared = prepare('AGI 春天到了. بدأت الرحلة 🚀', '16px Inter')
const { height, lineCount } = layout(prepared, textWidth, 20) // pure arithmetic. No DOM layout & reflow!

prepare() does the one-time work: normalize whitespace, segment the text, apply glue rules, measure the segments with canvas, and return an opaque handle. layout() is the cheap hot path after that: pure arithmetic over cached widths.
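To make the split between the one-time pass and the hot path concrete, here is a toy sketch of the same two-phase idea. It is not Pretext's implementation: `toyPrepare`, `toyLayout`, and the injected `measure` function are hypothetical names, the segmentation is a plain whitespace split, and there are no glue rules or within-word breaks.

```typescript
// Toy two-phase sketch (hypothetical, NOT Pretext's code).
type Measure = (segment: string) => number

interface ToySegment {
  width: number   // cached width, measured once
  space: boolean  // true for a collapsible whitespace run
}

interface ToyPrepared {
  segments: ToySegment[]
}

// One-time pass: split into words and whitespace runs, measure each once.
function toyPrepare(text: string, measure: Measure): ToyPrepared {
  const pieces = text.trim().split(/(\s+)/).filter(s => s.length > 0)
  const segments = pieces.map(piece => {
    const space = /^\s+$/.test(piece)
    // A whitespace run collapses to one space, as with white-space: normal
    return { width: measure(space ? ' ' : piece), space }
  })
  return { segments }
}

// Hot path: greedy line breaking using only the cached widths.
function toyLayout(prepared: ToyPrepared, maxWidth: number, lineHeight: number) {
  let lineCount = prepared.segments.length === 0 ? 0 : 1
  let lineWidth = 0
  let pendingSpace = 0
  for (const seg of prepared.segments) {
    if (seg.space) {
      pendingSpace = seg.width
      continue
    }
    if (lineWidth > 0 && lineWidth + pendingSpace + seg.width > maxWidth) {
      lineCount++            // word does not fit: start a new line
      lineWidth = seg.width
    } else {
      lineWidth += (lineWidth > 0 ? pendingSpace : 0) + seg.width
    }
    pendingSpace = 0
  }
  return { lineCount, height: lineCount * lineHeight }
}

// Stand-in for canvas measurement: every character is 10px wide.
const fixedWidthMeasure: Measure = s => s.length * 10
const toyPrepared = toyPrepare('the quick brown fox', fixedWidthMeasure)
console.log(toyLayout(toyPrepared, 100, 20)) // { lineCount: 2, height: 40 }
```

Calling `toyLayout` again with a different width reuses the cached widths and does no further measuring, which is the property the paragraph above describes.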

If you want textarea-like text where ordinary spaces, \t tabs, and \n hard breaks stay visible, pass { whiteSpace: 'pre-wrap' } to prepare() / prepareWithSegments().

const prepared = prepare(textareaValue, '16px Inter', { whiteSpace: 'pre-wrap' })
const { height } = layout(prepared, textareaWidth, 20)

On the current checked-in benchmark snapshot:

prepare() is about 19ms for the shared 500-text batch
layout() is about 0.09ms for that same batch

We support all the languages you can imagine, including emojis and mixed-bidi, and cater to specific browser quirks.

The returned height is the crucial last piece for unlocking web UI's:

proper virtualization/occlusion without guesstimates & caching
fancy userland layouts: masonry, JS-driven flexbox-like implementations, nudging a few layout values without CSS hacks (imagine that), etc.
development time verification (especially now with AI) that labels on e.g. buttons don't overflow to the next line, browser-free
prevent layout shift when new text loads and you wanna re-anchor the scroll position
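As a rough illustration of the virtualization case, the sketch below turns a list of row heights (numbers such as those `layout()` returns) into scroll offsets and finds the first visible row. `buildOffsets` and `firstVisibleRow` are hypothetical helpers, not part of Pretext.

```typescript
// Hypothetical helpers for a virtualized list; heights are plain numbers,
// e.g. the `height` values obtained from layout() for each row.

// offsets[i] is the top of row i; the last entry is the total content height.
function buildOffsets(heights: number[]): number[] {
  const offsets: number[] = [0]
  let total = 0
  for (const h of heights) {
    total += h
    offsets.push(total)
  }
  return offsets
}

// Binary search for the first row whose bottom edge lies below scrollTop.
function firstVisibleRow(offsets: number[], scrollTop: number): number {
  let lo = 0
  let hi = offsets.length - 2 // index of the last row
  while (lo < hi) {
    const mid = (lo + hi) >> 1
    const bottom = offsets[mid + 1] ?? Infinity
    if (bottom <= scrollTop) lo = mid + 1
    else hi = mid
  }
  return lo
}

const rowOffsets = buildOffsets([40, 20, 60])
console.log(rowOffsets)                      // [0, 40, 60, 120]
console.log(firstVisibleRow(rowOffsets, 40)) // 1
```

Because the heights are known before anything is rendered, the offsets need no estimation or later correction.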

2. Lay out the paragraph lines manually yourself

Switch out prepare with prepareWithSegments, then:

layoutWithLines() gives you all the lines at a fixed width:

import { prepareWithSegments, layoutWithLines } from '@chenglou/pretext'

const prepared = prepareWithSegments('AGI 春天到了. بدأت الرحلة 🚀', '18px "Helvetica Neue"')
const { lines } = layoutWithLines(prepared, 320, 26) // 320px max width, 26px line height
for (let i = 0; i < lines.length; i++) ctx.fillText(lines[i].text, 0, i * 26)

walkLineRanges() gives you line widths and cursors without building the text strings:

import { walkLineRanges } from '@chenglou/pretext'

let maxW = 0
walkLineRanges(prepared, 320, line => { if (line.width > maxW) maxW = line.width })
// maxW is now the widest line: the tightest container width that still fits the text! This multiline "shrink wrap" has been missing from the web

layoutNextLine() lets you route text one row at a time when width changes as you go:

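import { layoutNextLine } from '@chenglou/pretext'

// Assumes `prepared`, `ctx`, `columnWidth` and `image` already exist in scope
let cursor = { segmentIndex: 0, graphemeIndex: 0 }
let y = 0

// Flow text around a floated image: lines beside the image are narrower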
while (true) {
  const width = y < image.bottom ? columnWidth - image.width : columnWidth
  const line = layoutNextLine(prepared, cursor, width)
  if (line === null) break
  ctx.fillText(line.text, 0, y)
  cursor = line.end
  y += 26
}

This usage allows rendering to canvas, SVG, WebGL and (eventually) server-side.
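For the SVG target, one possible approach is to emit a `<tspan>` per line. The sketch below assumes only that each line object has a `text` field, as `LayoutLine` does; `linesToSvgText` and `escapeXml` are hypothetical helpers, and using `(i + 1) * lineHeight` as the baseline is a simplification.

```typescript
// Hypothetical helper: turn laid-out lines into an SVG <text> element string.
interface LineLike {
  text: string
}

// Escape characters that are special in XML text and attribute values.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

function linesToSvgText(lines: LineLike[], lineHeight: number, font: string): string {
  const tspans = lines
    .map((line, i) => {
      const y = (i + 1) * lineHeight // approximate baseline of row i
      return `<tspan x="0" y="${y}">${escapeXml(line.text)}</tspan>`
    })
    .join('')
  return `<text style="font: ${escapeXml(font)}">${tspans}</text>`
}

console.log(linesToSvgText([{ text: 'hello' }, { text: 'world' }], 26, '18px Inter'))
```

The same line array could feed a canvas loop instead; only the output step differs.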

API Glossary

Use-case 1 APIs:

prepare(text: string, font: string, options?: { whiteSpace?: 'normal' | 'pre-wrap' }): PreparedText // one-time text analysis + measurement pass, returns an opaque value to pass to `layout()`. Make sure `font` is synced with your CSS `font` declaration shorthand (e.g. size, weight, style, family) for the text you're measuring. `font` is the same format as what you'd use for `myCanvasContext.font = ...`, e.g. `16px Inter`.
layout(prepared: PreparedText, maxWidth: number, lineHeight: number): { height: number, lineCount: number } // calculates text height given a max width and lineHeight. Make sure `lineHeight` is synced with your CSS `line-height` declaration for the text you're measuring.

Use-case 2 APIs:

prepareWithSegments(text: string, font: string, options?: { whiteSpace?: 'normal' | 'pre-wrap' }): PreparedTextWithSegments // same as `prepare()`, but returns a richer structure for manual line layout needs
layoutWithLines(prepared: PreparedTextWithSegments, maxWidth: number, lineHeight: number): { height: number, lineCount: number, lines: LayoutLine[] } // high-level API for manual layout needs. Accepts a fixed max width for all lines. Similar to `layout()`'s return, but additionally returns the lines info
walkLineRanges(prepared: PreparedTextWithSegments, maxWidth: number, onLine: (line: LayoutLineRange) => void): number // low-level API for manual layout needs. Accepts a fixed max width for all lines. Calls `onLine` once per line with its actual calculated line width and start/end cursors, without building line text strings. Very useful for cases where you wanna speculatively test a few width and height boundaries (e.g. binary search a nice width value by repeatedly calling walkLineRanges and checking that the line count, and therefore the height, is "nice" too. You can get shrinkwrapped text messages and balanced text layout this way). After the walkLineRanges calls, you'd call layoutWithLines once, with the max width you settled on, to get the actual lines info.
layoutNextLine(prepared: PreparedTextWithSegments, start: LayoutCursor, maxWidth: number): LayoutLine | null // iterator-like API for laying out each line with a different width! Returns the LayoutLine starting from `start`, or `null` when the paragraph's exhausted. Pass the previous line's `end` cursor as the next `start`.
type LayoutLine = {
  text: string // Full text content of this line, e.g. 'hello world'
  width: number // Measured width of this line, e.g. 87.5
  start: LayoutCursor // Inclusive start cursor in prepared segments/graphemes
  end: LayoutCursor // Exclusive end cursor in prepared segments/graphemes
}
type LayoutLineRange = {
  width: number // Measured width of this line, e.g. 87.5
  start: LayoutCursor // Inclusive start cursor in prepared segments/graphemes
  end: LayoutCursor // Exclusive end cursor in prepared segments/graphemes
}
type LayoutCursor = {
  segmentIndex: number // Segment index in prepareWithSegments' prepared rich segment stream
  graphemeIndex: number // Grapheme index within that segment; `0` at segment boundaries
}
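The binary search mentioned for walkLineRanges above might look like the following sketch. `tightestWidth` is a hypothetical helper, and `countLines` stands in for a call such as `walkLineRanges(prepared, width, () => {})`, on the assumption that its numeric return value is the line count. The search also assumes a narrower width never produces fewer lines.

```typescript
// Hypothetical shrink-wrap search; countLines is a stand-in for a
// walkLineRanges-style call whose result is assumed to be the line count.
type CountLines = (maxWidth: number) => number

// Smallest integer width in [minWidth, maxWidth] that needs no more lines
// than maxWidth itself does.
function tightestWidth(countLines: CountLines, minWidth: number, maxWidth: number): number {
  const targetLines = countLines(maxWidth)
  let lo = minWidth
  let hi = maxWidth
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2)
    if (countLines(mid) <= targetLines) hi = mid // still fits: try narrower
    else lo = mid + 1                            // too many lines: go wider
  }
  return lo
}

// Toy line counter: 300px of text that can break anywhere.
const toyCountLines: CountLines = width => Math.ceil(300 / width)
console.log(tightestWidth(toyCountLines, 1, 320)) // 300
console.log(tightestWidth(toyCountLines, 1, 200)) // 150
```

Each probe only counts lines, so no line strings are built until the final width is chosen.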

Other helpers:

clearCache(): void // clears Pretext's shared internal caches used by prepare() and prepareWithSegments(). Useful if your app cycles through many different fonts or text variants and you want to release the accumulated cache
setLocale(locale?: string): void // optional (by default we use the current locale). Sets locale for future prepare() and prepareWithSegments(). Internally, it also calls clearCache(). Setting a new locale doesn't affect existing prepare() and prepareWithSegments() states (no mutations to them)

Caveats

Pretext doesn't try to be a full font rendering engine (yet?). It currently targets the common text setup:

white-space: normal
word-break: normal
overflow-wrap: break-word
line-break: auto
If you pass { whiteSpace: 'pre-wrap' }, ordinary spaces, \t tabs, and \n hard breaks are preserved instead of collapsed. Tabs follow the default browser-style tab-size: 8. The other wrapping defaults stay the same: word-break: normal, overflow-wrap: break-word, and line-break: auto.
system-ui is unsafe for layout() accuracy on macOS. Use a named font.
Because the default target includes overflow-wrap: break-word, very narrow widths can still break inside words, but only at grapheme boundaries.
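As a small worked example of the tab-size: 8 behaviour mentioned above, the sketch below computes how far a tab advances from a given x position, with tab stops at multiples of `tabSize * spaceWidth`. `tabAdvance` is a hypothetical helper; it illustrates the arithmetic only and ignores further edge cases that CSS defines.

```typescript
// Hypothetical helper: width consumed by one tab character starting at x.
// Tab stops sit at multiples of tabSize * spaceWidth from the line start.
function tabAdvance(x: number, spaceWidth: number, tabSize = 8): number {
  const interval = tabSize * spaceWidth
  const nextStop = (Math.floor(x / interval) + 1) * interval
  return nextStop - x
}

console.log(tabAdvance(0, 5))  // 40
console.log(tabAdvance(12, 5)) // 28
```

A tab that starts exactly on a stop moves to the next one, so its advance there is a full interval.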

Develop

See DEVELOPMENT.md for the dev setup and commands.

Credits

Sebastian Markbåge first planted the seed with text-layout last decade. His design (canvas measureText for shaping, bidi from pdf.js, streaming line breaking) informed the architecture we kept pushing forward here.