Structural Code Search with ast-grep Creator Herrington Darkholme
Including a glimpse of Herrington's new book
ast-grep is an open-source structural code search library created by Rust and TypeScript developer Herrington Darkholme.
Traditional grep searches code as plain text using regular expressions, with options like case sensitivity. ast-grep goes further by searching the code’s abstract syntax tree (AST) — its parsed structure — so you can match patterns by syntax, not by formatting or exact spelling. It’s useful for tasks like:
Refactoring large codebases
Cleaning up and modernising codebases
Linting
API migrations
Safe pattern search
This month, Herrington published Mastering ast-grep, a comprehensive guide to the tool, which you can find on Leanpub. He explains why he wrote the book:
“I frequently see users asking questions about specific API design choices, misunderstanding the nuances of the interface, or having a hard time wiring complex rules together effectively.
“These common hurdles highlighted the need for a place to articulate the tool’s design principles and provide a thorough, end-to-end explanation of how the system functions as a whole.”
I liked the final note of the book's introduction, which explains why developers should bring a structural mindset to their code [emphasis mine]:
Structural code search is not a novelty; it is the correct abstraction for code analysis and transformation tasks. Text is too low-level. Compiler APIs are too complex. Pattern-based structural matching operates at the right level: semantic structures expressed in the language’s own syntax.
Mastering ast-grep means mastering this abstraction. The investment yields returns across your career — every codebase you encounter, every refactoring task you face, every linting rule you author. The tool becomes an extension of your capability to understand and manipulate code at scale.
Below is an extract on using rule-based patterns in ast-grep, shared with permission!
Rule-Based Pattern Specification
Rules transform ast-grep from a structural search tool into a comprehensive code analysis and linting platform. While pattern strings provide immediate utility for straightforward searches, production code analysis demands additional capabilities: diagnostic metadata, constraint composition, persistent configuration, and systematic testing. Rules provide the structured specification framework that addresses these requirements.
Motivation: The Limits of Pattern Strings
Pattern strings excel at expressing single structural queries, but real-world code analysis scenarios expose their limitations.
Consider the task of detecting potential SQL injection vulnerabilities — specifically, database query invocations that use string concatenation rather than parameterized queries.
The structural pattern alone identifies query invocations:
ast-grep --pattern 'connection.query($QUERY)' --lang js
This pattern matches all query invocations but provides no mechanism to:
Filter matches by additional constraints: Only flag queries containing string concatenation
Attach diagnostic information: Explain why the match represents a vulnerability
Specify severity: Distinguish critical security issues from style preferences
Persist the specification: Reuse the query across multiple invocations without reconstructing command arguments
Test systematically: Validate that the specification correctly identifies vulnerable patterns while avoiding false positives
A pattern string represents a single predicate. Rules provide the composition operators, metadata schema, and persistence layer required for production code analysis.
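To make the gap concrete, the SQL injection check could be sketched as a rule along the following lines. The rule id, the message and note wording, and the use of the top-level constraints field with the binary_expression node kind are assumptions that go beyond this extract, so read it as an illustrative sketch rather than the book's own solution.

```yaml
# Hypothetical rule: the id, message, and note text are illustrative only.
id: no-concatenated-sql
language: JavaScript
severity: error
message: Query is built with string concatenation
note: Pass user input through a parameterized query instead
rule:
  # Same structural pattern as the command-line search above.
  pattern: connection.query($QUERY)
constraints:
  # Assumption: keep only matches whose $QUERY argument is a binary
  # expression such as `a + b` (node kind binary_expression in the
  # JavaScript grammar).
  QUERY:
    kind: binary_expression
```

A sketch like this would still miss queries built with template literals, and it would also flag binary expressions that are not string concatenation, so a production rule would likely need further constraints.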
Conceptual Model
A rule specification consists of three logical components:
Matching predicate: The rule field contains atomic or composite matching specifications that determine which AST nodes satisfy the rule. This predicate operates on the structural representation of code, evaluating to true or false for each candidate node.
Diagnostic metadata: Fields like message, severity, and note attach human-readable information to matches, transforming raw structural patterns into actionable diagnostic reports.
Identification: The id and language fields uniquely identify the rule within a rule set and specify which parser to invoke.
Rules compose atomic matching primitives — pattern, kind, and regex — through logical operators to construct arbitrarily complex predicates. This compositional approach separates matching logic (what to find) from diagnostic metadata (how to report findings), enabling systematic testing and reusable specifications.
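As a minimal sketch of that composition, the rule below combines pattern and kind through the all, any, and not operators. The rule id and the choice of console methods are hypothetical, picked only to show how the operators nest.

```yaml
# Hypothetical rule: flags console.log and console.debug calls,
# but leaves console.error alone.
id: no-debug-console
language: JavaScript
severity: warning
message: Debug logging call found
rule:
  all:
    # Atomic rule: the node must be a call expression.
    - kind: call_expression
    # Composite rule: either of these patterns may match.
    - any:
        - pattern: console.log($$$ARGS)
        - pattern: console.debug($$$ARGS)
    # Negation: redundant here, shown only to illustrate the operator.
    - not:
        pattern: console.error($$$ARGS)
```

The matching logic lives entirely under rule, while message and severity sit beside it, which mirrors the separation between what to find and how to report it.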
Design Inspiration: CSS Selectors
The rule composition model draws inspiration from CSS selectors. Where CSS combines element types, classes, and spatial relationships to select DOM nodes, ast-grep combines kind (node type), pattern (structural shape), and regex (text content) with relational operators (has, inside, follows) to select AST nodes. This declarative composition of simple primitives yields the same benefits — rules remain readable, testable, and composable without imperative traversal code.
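The selector analogy can be sketched with a relational operator. The rule below reads roughly like a descendant selector: it selects await expressions that sit anywhere inside a loop. The rule id and message are hypothetical, and the loop node kinds are taken from the JavaScript grammar as an assumption about what a reader would want to cover.

```yaml
# Hypothetical rule: roughly "loop await" in CSS descendant-selector terms.
id: no-await-in-loop
language: JavaScript
severity: warning
message: await inside a loop runs the iterations sequentially
rule:
  # Structural shape of the node to select.
  pattern: await $EXPR
  # Relational rule: the node must have a loop as an ancestor.
  inside:
    any:
      - kind: for_statement
      - kind: for_in_statement
      - kind: while_statement
    # Search all ancestors, not only the direct parent.
    stopBy: end
```

Without stopBy: end, inside only inspects the immediate parent, much as the CSS child combinator is stricter than the descendant combinator.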
Rule Schema
The minimal rule specification requires three fields:
id: rule-name
language: JavaScript
rule:
  pattern: console.log($MSG)
A more complete rule specification includes diagnostic metadata:
id: no-console-log
language: JavaScript
rule:
  pattern: console.log($MSG)
message: Console.log statement found in production code
severity: warning
note: Remove debug logging before deployment
Mastering ast-grep is available to order on Leanpub.

