htmlq

CLI for querying HTML with CSS selectors and extracting matching fragments, text, or attributes in shell pipelines.

$ brew install htmlq

Language

Rust

Stars

7,504

Category

Data Processing

AI Analysis

htmlq is a small CLI for selecting parts of an HTML document with CSS selectors and sending the result to stdout. It is useful when a script or agent needs lightweight HTML extraction or cleanup without a browser session.

What It Enables

Extract matching elements, text content, or attribute values from HTML read from stdin or a file.
Strip unwanted nodes before output so downstream tools see only the fragment you care about.
Rewrite relative links against a supplied or detected base URL before passing results into later shell steps.
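A minimal sketch of these three capabilities, using the long-form flags listed in htmlq's `--help` output. The sample markup, the temp file, the variable names, and the `https://example.com` base URL are made up for illustration.

```shell
# Write a small sample page to a temp file (hypothetical markup).
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<h1>Release notes</h1>
<nav><a href="/home">Home</a></nav>
<div id="content"><p>See the <a href="/docs/install">install guide</a>.</p></div>
</body></html>
EOF

# Extract the text content of the matching element.
title=$(htmlq --filename "$page" --text h1)

# Extract one attribute value from each matching element.
link=$(htmlq --filename "$page" --attribute href '#content a')

# Strip unwanted nodes before output. The selector goes first because
# --remove-nodes accepts several values and would otherwise swallow it.
cleaned=$(htmlq --filename "$page" body --remove-nodes nav)

# Resolve relative links against a supplied base URL.
absolute=$(htmlq --filename "$page" --base https://example.com --attribute href '#content a')

printf '%s\n' "$title" "$link" "$absolute"
rm -f "$page"
```

Omitting `--filename` makes htmlq read from stdin, so the same commands work at the end of a pipe.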

Agent Fit

CSS selectors plus stdin/stdout make it easy to drop into pipelines that fetch a page, inspect it, and act on the result.
Output is deterministic plain text or HTML, but there is no JSON mode, so downstream parsing stays string-based.
Best for lightweight scraping and preprocessing of static HTML, not for broader browser automation or stateful web workflows.
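Because there is no JSON mode, a follow-up step typically treats htmlq's output as plain lines. The sketch below assumes one attribute value per line and uses an inline HTML string in place of a fetched page; the markup and variable names are hypothetical.

```shell
# Stand-in for HTML that an earlier pipeline step would have fetched.
html='<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'

# htmlq reads stdin when no --filename is given.
links=$(printf '%s' "$html" | htmlq --attribute href a)

# Count the non-empty lines, then process each value as a plain string.
link_count=$(printf '%s\n' "$links" | grep -c .)
printf '%s\n' "$links" | while IFS= read -r href; do
  printf 'would follow: %s\n' "$href"
done
```

Line-based handling like this is adequate for attribute values and short text, but multi-line HTML fragments need a more careful delimiter strategy.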

Caveats

The feature surface is intentionally narrow: selection, text or attribute extraction, node removal, pretty printing, and link rewriting.
If a page needs JavaScript execution, login state, or form interaction, another tool must fetch or render it first; htmlq only processes the HTML it is given.