Tree-sitter Syntax Highlighting Setup (2026)
Claude Code for Tree-Sitter Syntax Highlighting Guide
Tree-sitter has changed how editors visualize and navigate code. As an incremental parsing library, it generates concrete syntax trees that power everything from editor highlighting to code intelligence tools. This guide shows you how to use Claude Code with Tree-sitter to build syntax highlighting for your projects.
Understanding Tree-Sitter Fundamentals
Tree-sitter is a parser generator tool and an incremental parsing library. Unlike traditional regex-based approaches, Tree-sitter builds accurate parse trees by analyzing the grammatical structure of your code. This precision translates directly to superior syntax highlighting that understands code context.
The core concepts you need to grasp are:
- Grammars: Define the lexical and syntactic rules for a language
- Parse Trees: Hierarchical representations of code structure
- Nodes: Individual elements within the parse tree (functions, variables, keywords)
- Captures: Pattern matching rules that associate tree nodes with semantic names
Claude Code can help you generate grammars, write capture rules, and debug parsing issues efficiently.
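The four concepts above can be modeled in a few lines. The sketch below is only an illustration: the tree is hand-built rather than produced by a real parser, and the object shape (type, field, text, children) and the collectCaptures helper are assumptions made for this example, not Tree-sitter's API.

```javascript
// Hypothetical, hand-built parse tree for the source text: function add() {}
// Real trees come from a generated parser; this object shape is an assumption.
const tree = {
  type: "program",
  children: [
    {
      type: "function_declaration",
      children: [
        { type: "identifier", field: "name", text: "add", children: [] },
        { type: "statement_block", field: "body", text: "{}", children: [] },
      ],
    },
  ],
};

// Walk the tree and collect captures for a pattern shaped like:
//   (function_declaration name: (identifier) @function)
function collectCaptures(node, parentType, fieldName, childType, captureName, out = []) {
  for (const child of node.children) {
    const matches =
      node.type === parentType && child.field === fieldName && child.type === childType;
    if (matches) out.push({ capture: captureName, text: child.text });
    collectCaptures(child, parentType, fieldName, childType, captureName, out);
  }
  return out;
}

const captures = collectCaptures(tree, "function_declaration", "name", "identifier", "@function");
console.log(captures); // one capture: @function on the identifier "add"
```

The grammar decides which node types and fields exist, the parse tree is the nested structure, and a capture is simply a name attached to a node that matched a pattern.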
Tree-Sitter vs. Traditional Highlighting Approaches
Before investing time in Tree-sitter, it helps to understand exactly what you gain versus simpler methods:
| Approach | How It Works | Context-Aware | Incremental Parsing | DSL Support |
|---|---|---|---|---|
| Regex / TextMate grammars | Pattern-match on raw text | No | No | Limited |
| LSP semantic tokens | Language server annotates ranges | Yes | Server-dependent | Good |
| Tree-sitter | Full parse tree per file | Yes | Yes (O(change)) | Excellent |
| Manual DOM annotation | Hand-code span tags | No | No | Full control |
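Tree-sitter’s incremental parsing is the key differentiator. When you edit a file, Tree-sitter reuses the unchanged parts of the previous tree and re-parses only the region affected by the edit, not the entire file. On a 10,000-line file this can be the difference between a highlight refresh of a few milliseconds and one of hundreds of milliseconds, although the exact numbers depend on the grammar and the machine.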
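Incremental parsing only works if the parser is told what changed. As a rough sketch, the helper below derives the byte-offset part of an edit description (startIndex, oldEndIndex, newEndIndex, the field names used by Tree-sitter's JavaScript bindings) by trimming the common prefix and suffix of the old and new text. The describeEdit function itself is hypothetical, it assumes ASCII text so that string indices equal byte offsets, and it omits the row/column position fields a real edit also carries.

```javascript
// Derive the changed range between two versions of a document.
// Assumption: ASCII-only text, so string indices equal byte offsets.
function describeEdit(oldText, newText) {
  // Length of the common prefix.
  let start = 0;
  const maxPrefix = Math.min(oldText.length, newText.length);
  while (start < maxPrefix && oldText[start] === newText[start]) start++;

  // Trim the common suffix without crossing the prefix boundary.
  let oldEnd = oldText.length;
  let newEnd = newText.length;
  while (oldEnd > start && newEnd > start && oldText[oldEnd - 1] === newText[newEnd - 1]) {
    oldEnd--;
    newEnd--;
  }
  return { startIndex: start, oldEndIndex: oldEnd, newEndIndex: newEnd };
}

const before = "const x = 1;\nconst y = 2;\n";
const after = "const x = 1;\nconst y = 42;\n";
console.log(describeEdit(before, after));
// → { startIndex: 23, oldEndIndex: 23, newEndIndex: 24 }
```

Only one character is reported as changed, which is why the parser can keep almost all of the previous tree. In practice the editor already knows the edit range and does not need to diff the text.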
Setting Up Tree-Sitter with Claude Code
Before diving into syntax highlighting, ensure you have the necessary tools installed. Tree-sitter requires a few dependencies:
# Install Tree-sitter CLI
npm install -g tree-sitter-cli
# Verify installation
tree-sitter --version
You will also need a C compiler on your PATH because tree-sitter generate produces a C parser. On macOS, xcode-select --install provides this. On Linux, sudo apt-get install build-essential covers it.
Once installed, you can use Claude Code to bootstrap new language grammars. Tell Claude about the language you want to support, and it can help generate the initial grammar structure:
// Example: A minimal JavaScript grammar snippet
module.exports = grammar({
  name: 'javascript',
  rules: {
    program: $ => repeat($._statement),
    _statement: $ => choice(
      $.expression_statement,
      $.variable_declaration,
      $.function_declaration
    ),
    expression_statement: $ => $.expression,
    variable_declaration: $ => seq(
      'const',
      $.identifier,
      '=',
      $.expression
    ),
    // Additional rules...
  }
});
A recommended Claude Code prompt for bootstrapping a grammar is:
“Generate a Tree-sitter grammar.js for [language name]. The language has the following constructs: [list keywords, delimiters, expression forms]. Include rules for comments, string literals, and numeric literals. Use extras to skip whitespace.”
Claude can produce a working scaffold in one pass. Your job is then to test it against real source files using tree-sitter parse and refine the edge cases it misses.
Project Structure for a Custom Grammar
tree-sitter-mylang/
  grammar.js            # Language grammar definition
  package.json          # npm package config
  src/
    parser.c            # Generated by tree-sitter generate
    tree_sitter/
      parser.h
  queries/
    highlights.scm      # Highlighting capture rules
    locals.scm          # Scope/variable tracking
    injections.scm      # Embedded language rules
  test/
    corpus/             # .txt test case files
Keep your grammar.js and queries/highlights.scm in version control and regenerate src/parser.c as a build artifact. The generated C file is large and not hand-editable.
Creating Effective Highlighting Rules
The real power of Tree-sitter lies in its capture system. By defining patterns that match specific node types, you can create sophisticated highlighting that responds to code semantics rather than just patterns.
Understanding Capture Groups
Captures associate node patterns with names that your highlighting theme can style:
; Tree-sitter query syntax (comments start with a semicolon)
(function_declaration
  name: (identifier) @function.name)
(method_declaration
  name: (property_identifier) @method)
(call_expression
  function: (identifier) @function.call)
Each @capture name maps to a highlight group in your editor. This separation means you can update highlighting themes without touching the query logic.
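To make the mapping concrete, here is a minimal sketch of how a dotted capture name can be resolved against a theme, falling back from the most specific name to its parent. The theme table, its colors, and the resolveStyle helper are hypothetical; real editors implement their own lookup, so treat this as an illustration of the idea rather than any editor's actual algorithm.

```javascript
// Hypothetical theme: capture name -> style. Names and colors are illustrative.
const theme = {
  "function": { color: "blue" },
  "keyword": { color: "purple" },
  "string.special": { color: "orange" },
};

// Resolve "@function.call" by trying "function.call" first, then "function".
function resolveStyle(captureName, themeTable) {
  const parts = captureName.replace(/^@/, "").split(".");
  while (parts.length > 0) {
    const key = parts.join(".");
    if (Object.prototype.hasOwnProperty.call(themeTable, key)) {
      return { matched: key, style: themeTable[key] };
    }
    parts.pop(); // drop the most specific segment and retry
  }
  return null; // no style: the text keeps the default color
}

console.log(resolveStyle("@function.call", theme));  // falls back to "function"
console.log(resolveStyle("@string.special", theme)); // exact match
console.log(resolveStyle("@variable", theme));       // null
```

Because the query only emits names, swapping the theme table changes every color without touching a single pattern.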
Standard Capture Name Conventions
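Neovim’s nvim-treesitter project has popularized a widely adopted set of capture names, and other editors use similar dotted names. Using these conventions means your grammar integrates with existing themes without custom theme work. Exact names differ between editors and versions (for example, newer nvim-treesitter releases renamed @method to @function.method and @namespace to @module), so check the list your target editor documents: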
| Capture Name | What It Highlights |
|---|---|
| @keyword | Language keywords (if, for, return) |
| @keyword.control | Control flow keywords specifically |
| @function | Function names at definition sites |
| @function.call | Function names at call sites |
| @method | Method definitions |
| @method.call | Method calls |
| @type | Type names |
| @type.builtin | Built-in types (int, string, bool) |
| @variable | Variable references |
| @variable.builtin | Built-in variables (self, this) |
| @string | String literals |
| @string.special | Template strings, regex literals |
| @number | Numeric literals |
| @comment | Line and block comments |
| @operator | Operators (+, -, ==, etc.) |
| @punctuation.bracket | Brackets and parentheses |
| @property | Object property access |
| @constant | Constants (SCREAMING_SNAKE) |
| @namespace | Module or namespace references |
Following this convention from the start prevents theme incompatibility issues when you publish your grammar.
Highlighting Different Code Elements
Here’s a practical query file for comprehensive JavaScript highlighting:
; Keywords
["const" "let" "var" "function" "return" "if" "else" "for" "while"] @keyword
; Variables (general fallback, listed first so more specific rules can override it)
(identifier) @variable
; Functions
(function_declaration name: (identifier) @function)
(variable_declarator name: (identifier) @function value: (arrow_function))
; Types (TypeScript grammar only; remove for plain JavaScript)
(type_identifier) @type
(predefined_type) @type.builtin
; Strings
(string) @string
(template_string) @string
; Numbers
(number) @number
; Comments
(comment) @comment
; Properties
(property_identifier) @property
; Operators
["+" "-" "*" "/" "=" "==" "===" "!=" "!==" "<" ">" "&&" "||" "!"] @operator
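One subtlety: when several patterns capture the same node, the order of the patterns decides which capture wins. In Neovim, later patterns override earlier ones for the same node, so put more specific captures after general ones. For example, putting (identifier) @variable before (call_expression function: (identifier) @function.call) means function call identifiers correctly get @function.call, not @variable. Other consumers may resolve overlaps differently, so confirm the precedence rule for your editor before relying on ordering.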
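The ordering rule is easy to get backwards, so here is a small simulation of it. The capture records and the resolveOverlaps helper are hypothetical stand-ins for what a highlighter does internally, and the laterWins flag exists only to show how the result flips under the opposite precedence rule.

```javascript
// Each record is one capture produced by a query: the node's range, the capture
// name, and the index of the pattern in highlights.scm that produced it.
const rawCaptures = [
  { start: 0, end: 5, name: "variable", patternIndex: 0 },      // (identifier) @variable
  { start: 0, end: 5, name: "function.call", patternIndex: 1 }, // call_expression pattern
  { start: 6, end: 9, name: "variable", patternIndex: 0 },
];

// laterWins = true models "the last matching pattern wins";
// laterWins = false models the opposite rule.
function resolveOverlaps(captures, laterWins) {
  const byRange = new Map();
  for (const cap of captures) {
    const key = `${cap.start}:${cap.end}`;
    const current = byRange.get(key);
    const replaces =
      current === undefined ||
      (laterWins
        ? cap.patternIndex > current.patternIndex
        : cap.patternIndex < current.patternIndex);
    if (replaces) byRange.set(key, cap);
  }
  return [...byRange.values()].map((c) => `${c.start}-${c.end} ${c.name}`);
}

console.log(resolveOverlaps(rawCaptures, true));  // [ '0-5 function.call', '6-9 variable' ]
console.log(resolveOverlaps(rawCaptures, false)); // [ '0-5 variable', '6-9 variable' ]
```

Under the later-wins rule the general @variable pattern has to come first, otherwise it would repaint every call site.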
Practical Examples with Claude Code
Claude Code excels at helping you write and debug Tree-sitter queries. Here’s how to approach common scenarios:
Example 1: Highlighting Decorators
Modern frameworks use decorators extensively. Here’s how to create queries that catch them:
(decorator (identifier) @decorator)
(decorator
  (call_expression
    function: (identifier) @decorator
    arguments: (arguments) @decorator.args))
This captures both simple decorators like @decorator and those with arguments like @decorator(arg).
A complete decorator handling pattern for TypeScript, which also covers the decorator factory pattern:
; @Component
(decorator
  "@" @punctuation.special
  (identifier) @attribute)
; @Component({ ... })
(decorator
  "@" @punctuation.special
  (call_expression
    function: (identifier) @attribute
    arguments: (arguments) @constructor.arguments))
; class method decorators: @Log() before method
(method_definition
  (decorator (identifier) @attribute)
  name: (property_identifier) @method)
Example 2: Context-Aware String Highlighting
Different string types often warrant different visual treatment:
(template_string) @string.special
(string
  (string_fragment) @string)
Extending this to cover common edge cases in JavaScript:
; Regular strings
(string) @string
; Template literals (different color to signal interpolation risk)
(template_string) @string.special
; String escape sequences get their own highlight
(escape_sequence) @string.escape
; Regex literals
(regex) @string.regex
(regex_pattern) @string.regex
(regex_flags) @keyword.operator
Example 3: Function Call vs. Definition
Distinguishing between function calls and definitions helps readers understand code flow:
(function_declaration name: (identifier) @function.definition)
(call_expression function: (identifier) @function.call)
(call_expression function: (member_expression property: (property_identifier) @method.call))
Extending this pattern to cover arrow functions and class methods:
; Named function declaration
(function_declaration
  name: (identifier) @function)
; Variable assigned arrow function: const greet = () => ...
(variable_declarator
  name: (identifier) @function
  value: (arrow_function))
; Object method shorthand: { greet() { ... } }
(method_definition
  name: (property_identifier) @method)
; Call sites
(call_expression
  function: (identifier) @function.call)
; Chained method call: obj.greet()
(call_expression
  function: (member_expression
    property: (property_identifier) @method.call))
; new MyClass()
(new_expression
  constructor: (identifier) @constructor)
Example 4: Injection Queries for Embedded Languages
One of Tree-sitter’s most powerful features is language injection: parsing one language embedded inside another. CSS-in-JS, SQL in Python, and HTML templates all benefit from this:
; queries/injections.scm
; Inject SQL highlighting inside tagged template literals: sql`SELECT ...`
; (tree-sitter-javascript parses a tagged template as a call_expression)
(call_expression
  function: (identifier) @_tag
  arguments: (template_string) @injection.content
  (#match? @_tag "^sql$")
  (#set! injection.language "sql"))
; Inject CSS inside css`...` and createGlobalStyle`...` template literals
; (styled.div`...` has a member_expression tag and needs its own pattern)
(call_expression
  function: (identifier) @_tag
  arguments: (template_string) @injection.content
  (#match? @_tag "^(css|createGlobalStyle)$")
  (#set! injection.language "css"))
This is something regex-based grammars cannot do reliably. With Tree-sitter injections, the embedded SQL is fully parsed as SQL, giving you proper keyword and table name highlighting inside the template literal.
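The #match? predicates in the injection queries are ordinary regular expressions tested against the tag's text, which is why the anchors matter. The sketch below mimics that decision in plain JavaScript; the injectionRules table and injectionLanguageFor helper are hypothetical and only cover simple identifier tags.

```javascript
// Hypothetical rule table mirroring the idea of the injection patterns above:
// a regex tested against the tag name, and the language to inject on a match.
const injectionRules = [
  { pattern: /^sql$/, language: "sql" },
  { pattern: /^(css|createGlobalStyle)$/, language: "css" },
];

// Return the language to inject for a tag, or null when no rule applies.
function injectionLanguageFor(tagName) {
  const rule = injectionRules.find((r) => r.pattern.test(tagName));
  return rule ? rule.language : null;
}

console.log(injectionLanguageFor("sql"));         // "sql"
console.log(injectionLanguageFor("css"));         // "css"
console.log(injectionLanguageFor("mysqlHelper")); // null, thanks to the ^ and $ anchors
console.log(/sql/.test("mysqlHelper"));           // true: an unanchored pattern would over-match
```

Dropping the anchors would inject SQL highlighting into any template tagged with a name that merely contains "sql".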
Debugging Your Queries
When queries don’t match as expected, Tree-sitter provides debugging tools. Use the tree-sitter parse command to see the actual parse tree structure:
tree-sitter parse your-file.js
This output shows node types and their hierarchy. Compare this against your queries to identify mismatches. Claude Code can help interpret parse output and suggest corrections to your queries.
The parse output looks like this for a simple function declaration:
(program [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 1]
    name: (identifier [0, 9] - [0, 14])
    parameters: (formal_parameters [0, 14] - [0, 16])
    body: (statement_block [0, 17] - [2, 1]
      (return_statement [1, 2] - [1, 11]
        (number [1, 9] - [1, 10])))))
Each node shows its type, start position [row, col], and end position. When a query fails to match, check that you are using the exact node type string shown in this output. A common mistake is writing function_declaration when the grammar uses function_definition (Python uses the latter).
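Checking node type names by eye gets tedious on larger trees. As a sketch, the script below extracts the node types that appear in a parse dump and reports any type a query mentions that the dump never produced. The nodeTypesIn and unknownTypes helpers are hypothetical, the regular expressions assume the output layout shown above, and a type missing from one sample file is only a hint, since the grammar may still define it.

```javascript
// Sample dump in the layout shown above (node type followed by a position range).
const parseOutput = `(program [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 1]
    name: (identifier [0, 9] - [0, 14])
    parameters: (formal_parameters [0, 14] - [0, 16])
    body: (statement_block [0, 17] - [2, 1]
      (return_statement [1, 2] - [1, 11]
        (number [1, 9] - [1, 10])))))`;

// Collect every named node type that appears in the dump.
function nodeTypesIn(output) {
  const types = new Set();
  for (const m of output.matchAll(/\(([A-Za-z_][A-Za-z0-9_]*) \[/g)) types.add(m[1]);
  return types;
}

// List node types used in a query that never occur in the dump.
// Predicates such as (#match? ...) are skipped because "#" is not a name character.
function unknownTypes(query, output) {
  const known = nodeTypesIn(output);
  const used = [...query.matchAll(/\(([A-Za-z_][A-Za-z0-9_]*)/g)].map((m) => m[1]);
  return [...new Set(used)].filter((t) => t !== "_" && !known.has(t));
}

const query = "(function_definition name: (identifier) @function)";
console.log(unknownTypes(query, parseOutput)); // [ 'function_definition' ]
```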
Interactive Query Testing
Tree-sitter CLI also provides a playground for testing queries interactively:
# Test highlights against a file
tree-sitter highlight your-file.js
# Open the interactive web playground
tree-sitter playground
The tree-sitter highlight command prints the source with ANSI color codes applied according to your highlights.scm. This lets you see exactly which nodes are being captured without loading an editor.
For a rapid debugging loop with Claude Code, paste the raw tree-sitter parse output into Claude and describe which nodes you want to capture. Claude can read the node type hierarchy and generate corrected query patterns in one pass, faster than reading grammar documentation.
Writing Grammar Tests
Tree-sitter has a built-in test format in test/corpus/. These are plain text files that assert what the parse tree should look like for given inputs:
==================
Variable Declaration
==================

const x = 42;

---

(program
  (variable_declaration
    name: (identifier)
    value: (number)))
Run all tests with:
tree-sitter test
Ask Claude Code to generate test corpus entries from code samples. Providing 10-15 test cases across edge cases (empty bodies, nested structures, error recovery) catches regressions when you modify grammar rules.
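If you collect code samples programmatically, a tiny formatter keeps corpus entries consistent. The sketch below assumes the standard corpus layout (a test name between two lines of equals signs, the source, a --- separator, then the expected tree); the corpusEntry helper and its arguments are hypothetical.

```javascript
// Build one corpus entry: header, source code, separator, expected S-expression.
function corpusEntry(name, source, expectedTree) {
  const bar = "=".repeat(Math.max(name.length, 3)); // header lines of equals signs
  return [
    bar,
    name,
    bar,
    "",
    source.trim(),
    "",
    "---",
    "",
    expectedTree.trim(),
    "",
  ].join("\n");
}

const entry = corpusEntry(
  "Variable Declaration",
  "const x = 42;",
  "(program\n  (variable_declaration\n    name: (identifier)\n    value: (number)))"
);
console.log(entry);
```

Concatenating several entries into one file under test/corpus/ gives tree-sitter test a suite to run, and the expected trees can be pasted from tree-sitter parse output once you have confirmed they are correct (with the position ranges removed).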
Integrating with Popular Editors
Most modern editors support Tree-sitter highlighting:
Neovim
Neovim has built-in Tree-sitter support:
-- Configuration in init.lua
require('nvim-treesitter.configs').setup({
  highlight = {
    enable = true,
    additional_vim_regex_highlighting = false,
  },
})
To add a custom grammar that is not in the official nvim-treesitter repository:
local parser_config = require('nvim-treesitter.parsers').get_parser_configs()
parser_config.mylang = {
  install_info = {
    url = 'https://github.com/yourname/tree-sitter-mylang',
    files = { 'src/parser.c' },
    branch = 'main',
  },
  filetype = 'mylang',
}
-- Assign file extension to parser
vim.filetype.add({ extension = { ml = 'mylang' } })
Place your highlights.scm, locals.scm, and injections.scm files under ~/.config/nvim/queries/mylang/. Neovim picks them up automatically without restarting.
VS Code
VS Code support comes through extensions. The vscode-anycode extension provides Tree-sitter-based symbol search and basic highlighting for unsupported languages. You can also bundle a Tree-sitter grammar directly into a language extension by including the WASM parser:
# Generate WASM parser for browser/VS Code use (older CLI releases used: tree-sitter build-wasm)
tree-sitter build --wasm
Then load it in your extension’s activate function using the web-tree-sitter npm package.
Helix
Helix uses Tree-sitter natively and requires no plugins. Add your grammar to ~/.config/helix/languages.toml:
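# Language server definition (Helix 23.10+ syntax; older releases used
# language-server = { command = "mylang-lsp" } inside [[language]])
[language-server.mylang-lsp]
command = "mylang-lsp"

[[language]]
name = "mylang"
scope = "source.mylang"
file-types = ["ml"]
roots = []
comment-token = "//"
language-servers = ["mylang-lsp"]

[language.auto-pairs]
'(' = ')'
'{' = '}'

[[grammar]]
name = "mylang"
source = { git = "https://github.com/yourname/tree-sitter-mylang", rev = "main" }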
Run hx --grammar fetch && hx --grammar build to download and compile the grammar.
Editor Support Comparison
| Editor | Tree-sitter Support | Custom Grammar Method | Query Hot-Reload |
|---|---|---|---|
| Neovim | Built-in (nvim-treesitter) | parser_config + query files | Yes (:TSUpdate) |
| Helix | Built-in | languages.toml + grammar fetch | Yes |
| VS Code | Via anycode / extension | Bundle WASM parser | No |
| Emacs | Via treesit.el (29+) | treesit-language-source-alist | Partial |
| Zed | Built-in | Extension manifest | Yes |
Actionable Tips for Better Highlighting
- Start Simple: Begin with keywords and basic types before adding complexity
- Use Semantic Names: Choose capture names that convey meaning (@function, @variable, @type)
- Test Incrementally: Add queries one category at a time and verify they work
- Use Claude Code: Describe what you want to highlight, and let Claude suggest queries
- Consider Performance: Complex queries across large files can slow parsing; optimize patterns
Additional tips that come from production grammar maintenance:
- Avoid over-capturing with (identifier) @variable: This puts every identifier in the variable group, including type names, function names, and module names. Use more specific patterns and let unmatched identifiers fall through to the default color.
- Use predicates sparingly: Tree-sitter supports query predicates like (#match? @node "pattern") and (#eq? @a @b), but each predicate adds overhead. Reserve them for cases like injection language detection where they are genuinely needed.
- Pin grammar versions: When shipping a grammar as a dependency, pin the exact git revision in your lockfile. Grammar node types change between versions and can silently break highlight queries.
- Read existing grammars: The official tree-sitter-javascript, tree-sitter-python, and tree-sitter-rust grammars are thoroughly battle-tested. Reading their grammar.js and queries/highlights.scm is the fastest way to learn idiomatic patterns.
Advanced: Custom Grammars for Domain-Specific Languages
If you’re working with a DSL or custom configuration format, building a dedicated grammar provides the best highlighting experience. Define your language rules in a grammar.js file:
module.exports = grammar({
  name: 'my_config',
  extras: $ => [/\s/, $.comment],
  rules: {
    config: $ => seq(
      '{',
      repeat($.property),
      '}'
    ),
    property: $ => seq(
      $.key,
      ':',
      $.value
    ),
    key: $ => /[a-z_]+/,
    value: $ => choice(
      $.string,
      $.number,
      $.boolean,
      $.array
    ),
    // Define other node types...
  }
})
Generate the parser with tree-sitter generate, then write queries following the patterns shown earlier.
Handling Error Recovery in DSL Grammars
Real-world source files contain syntax errors. Tree-sitter’s error recovery inserts ERROR nodes where parsing fails rather than aborting. You should highlight these explicitly so users can see where their syntax broke:
; highlights.scm: mark parse errors visibly
(ERROR) @error
In Neovim this maps to the @error highlight group, which most themes render in red. Users get immediate visual feedback on broken syntax without needing a language server.
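While developing the grammar, it also helps to list ERROR nodes from the command line before opening an editor. The sketch below scans tree-sitter parse output for ERROR nodes and reports their positions. The sample dump is hypothetical, the findErrors helper is an assumption for this example, and it only looks for ERROR nodes in the position layout shown earlier in this guide.

```javascript
// Hypothetical parse dump for a config file with a stray token on line 2.
const sampleDump = `(config [0, 0] - [2, 1]
  (property [1, 2] - [1, 12]
    (key [1, 2] - [1, 6])
    (ERROR [1, 6] - [1, 8])
    (value [1, 9] - [1, 12]
      (number [1, 9] - [1, 12]))))`;

// Extract the zero-based start and end positions of every ERROR node.
function findErrors(output) {
  const errors = [];
  const pattern = /\(ERROR \[(\d+), (\d+)\] - \[(\d+), (\d+)\]/g;
  for (const m of output.matchAll(pattern)) {
    errors.push({
      startRow: Number(m[1]),
      startCol: Number(m[2]),
      endRow: Number(m[3]),
      endCol: Number(m[4]),
    });
  }
  return errors;
}

for (const err of findErrors(sampleDump)) {
  // Tree-sitter positions are zero-based; add 1 for editor-style line numbers.
  console.log(`parse error at line ${err.startRow + 1}, column ${err.startCol + 1}`);
}
// prints: parse error at line 2, column 7
```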
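You can also set the word grammar option, which enables keyword extraction. The parser first matches a whole word with the named token and only then checks whether that word is a keyword, so an identifier that merely starts with a keyword is not split in two:
module.exports = grammar({
  name: 'my_config',
  word: $ => $.key, // token used for keyword extraction
  // ...
});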
Using Claude Code for a Full DSL Workflow
The complete workflow for a new DSL grammar looks like this:
- Describe the DSL to Claude: its purpose, keywords, expression forms, and any embedded languages.
- Claude generates an initial grammar.js.
- Run tree-sitter generate && tree-sitter parse sample.mydsl and paste the output back to Claude with a description of what is wrong.
- Claude suggests grammar corrections. Repeat until the parse tree matches expectations.
- Ask Claude to generate highlights.scm based on the node types visible in the parse output.
- Ask Claude to generate test/corpus/basics.txt with 10 test cases covering normal and edge-case inputs.
- Run tree-sitter test and iterate until all tests pass.
For a simple DSL, this loop can take an hour or two, compared with much of a day spent writing the grammar from scratch with documentation alone. Actual time depends on how irregular the language is.
Conclusion
Tree-sitter syntax highlighting transforms code visualization from simple pattern matching to semantic understanding. By using Claude Code’s assistance, you can efficiently create and maintain highlighting rules that make your codebase more navigable and readable. Start with basic queries, iterate based on what you see in your editor, and gradually build comprehensive coverage for all the languages you work with.
The investment in well-crafted Tree-sitter queries pays dividends every time you open your editor and instantly recognize code structure at a glance.
Built by theluckystrike. More at zovo.one