Tree-sitter Syntax Highlighting Setup (2026)

Claude Code for Tree-Sitter Syntax Highlighting Guide

Tree-sitter has changed how editors highlight and navigate code. It is a parsing library that generates concrete syntax trees, which power everything from editor highlighting to code intelligence tools. This guide shows you how to use Claude Code to write Tree-sitter grammars and highlight queries for your projects.

Understanding Tree-Sitter Fundamentals

Tree-sitter is a parser generator tool and an incremental parsing library. Unlike traditional regex-based approaches, Tree-sitter builds accurate parse trees by analyzing the grammatical structure of your code. This precision translates directly to superior syntax highlighting that understands code context.

The core concepts you need to grasp are:

  1. Grammars: Define the lexical and syntactic rules for a language
  2. Parse Trees: Hierarchical representations of code structure
  3. Nodes: Individual elements within the parse tree (functions, variables, keywords)
  4. Captures: Pattern matching rules that associate tree nodes with semantic names

Claude Code can help you generate grammars, write capture rules, and debug parsing issues efficiently.
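To make the four concepts concrete, here is a minimal sketch in plain JavaScript. The tree is hand-built and hypothetical (real trees come from a generated parser), and captureFunctionNames is an illustrative stand-in for a query, not part of any Tree-sitter API.

```javascript
// Hypothetical, hand-built tree for: function greet() {}
// Real trees come from a Tree-sitter parser; this only mirrors their shape.
const tree = {
  type: 'program',
  children: [
    {
      type: 'function_declaration',
      children: [
        { type: 'identifier', field: 'name', text: 'greet', children: [] },
        { type: 'formal_parameters', field: 'parameters', text: '()', children: [] },
        { type: 'statement_block', field: 'body', text: '{}', children: [] },
      ],
    },
  ],
};

// Simplified stand-in for the query:
//   (function_declaration name: (identifier) @function)
function captureFunctionNames(node, captures = []) {
  if (node.type === 'function_declaration') {
    for (const child of node.children) {
      if (child.field === 'name' && child.type === 'identifier') {
        captures.push({ name: 'function', text: child.text });
      }
    }
  }
  // Recurse so nested declarations are found as well.
  for (const child of node.children) captureFunctionNames(child, captures);
  return captures;
}

console.log(captureFunctionNames(tree));
```

The grammar decides which node types and field names exist; the capture only attaches a name to nodes the pattern matches.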

Tree-Sitter vs. Traditional Highlighting Approaches

Before investing time in Tree-sitter, it helps to understand exactly what you gain versus simpler methods:

| Approach | How It Works | Context-Aware | Incremental Parsing | DSL Support |
| --- | --- | --- | --- | --- |
| Regex / TextMate grammars | Pattern-match on raw text | No | No | Limited |
| LSP semantic tokens | Language server annotates ranges | Yes | Server-dependent | Good |
| Tree-sitter | Full parse tree per file | Yes | Yes (O(change)) | Excellent |
| Manual DOM annotation | Hand-code span tags | No | No | Full control |

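Tree-sitter’s incremental parsing is the key differentiator. When you edit a file, Tree-sitter reuses the unchanged parts of the previous tree and re-parses only the region affected by the edit, not the entire file. On a 10,000-line file this can be the difference between a highlight refresh of about 2 ms and one of about 200 ms; treat those figures as illustrative, since actual timings depend on the grammar and the machine.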
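Incremental parsing depends on telling the parser exactly what changed. The sketch below computes an edit description in plain JavaScript. The field names follow the shape that the JavaScript bindings accept in tree.edit(...), but describeEdit and positionAt are hypothetical helpers, and offsets are treated as JavaScript string indices, which is an assumption that holds for ASCII text.

```javascript
// Convert a character offset into a { row, column } position.
function positionAt(text, index) {
  const lines = text.slice(0, index).split('\n');
  return { row: lines.length - 1, column: lines[lines.length - 1].length };
}

// Describe replacing oldText[startIndex, oldEndIndex) with `inserted`.
function describeEdit(oldText, startIndex, oldEndIndex, inserted) {
  const newText =
    oldText.slice(0, startIndex) + inserted + oldText.slice(oldEndIndex);
  const newEndIndex = startIndex + inserted.length;
  return {
    newText,
    edit: {
      startIndex,
      oldEndIndex,
      newEndIndex,
      startPosition: positionAt(oldText, startIndex),
      oldEndPosition: positionAt(oldText, oldEndIndex),
      newEndPosition: positionAt(newText, newEndIndex),
    },
  };
}

// Replace the literal 1 with 42 in a one-line file.
const { newText, edit } = describeEdit('const x = 1;\n', 10, 11, '42');
console.log(newText);
console.log(edit);
```

With a real parser, the usual pattern is to pass this object to the old tree's edit method and then re-parse the new text while supplying the old tree, so unchanged subtrees can be reused.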

Setting Up Tree-Sitter with Claude Code

Before diving into syntax highlighting, ensure you have the necessary tools installed. Tree-sitter requires a few dependencies:

# Install Tree-sitter CLI
npm install -g tree-sitter-cli
# Verify installation
tree-sitter --version

You will also need a C compiler on your PATH because tree-sitter generate produces a C parser. On macOS, xcode-select --install provides this. On Linux, sudo apt-get install build-essential covers it.

Once installed, you can use Claude Code to bootstrap new language grammars. Tell Claude about the language you want to support, and it can help generate the initial grammar structure:

// Example: A minimal JavaScript grammar snippet
module.exports = grammar({
  name: 'javascript',
  rules: {
    program: $ => repeat($._statement),
    _statement: $ => choice(
      $.expression_statement,
      $.variable_declaration,
      $.function_declaration
    ),
    expression_statement: $ => $.expression,
    variable_declaration: $ => seq(
      'const',
      $.identifier,
      '=',
      $.expression
    ),
    // Additional rules...
  }
});

A recommended Claude Code prompt for bootstrapping a grammar is:

“Generate a Tree-sitter grammar.js for [language name]. The language has the following constructs: [list keywords, delimiters, expression forms]. Include rules for comments, string literals, and numeric literals. Use extras to skip whitespace.”

Claude can often produce a usable scaffold in one pass. Your job is then to test it against real source files using tree-sitter parse and refine the edge cases it misses.

Project Structure for a Custom Grammar

tree-sitter-mylang/
  grammar.js          # Language grammar definition
  package.json        # npm package config
  src/
    parser.c          # Generated by tree-sitter generate
    tree_sitter/
      parser.h
  queries/
    highlights.scm    # Highlighting capture rules
    locals.scm        # Scope/variable tracking
    injections.scm    # Embedded language rules
  test/
    corpus/           # .txt test case files

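Keep your grammar.js and queries/highlights.scm in version control. src/parser.c is generated output: it is large and not meant to be edited by hand, so regenerate it with tree-sitter generate whenever the grammar changes. If consumers fetch the parser source directly from your repository (as the Neovim install_info example later in this guide does with files = { 'src/parser.c' }), commit the generated file as well.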

Creating Effective Highlighting Rules

The real power of Tree-sitter lies in its capture system. By defining patterns that match specific node types, you can create sophisticated highlighting that responds to code semantics rather than just patterns.

Understanding Capture Groups

Captures associate node patterns with names that your highlighting theme can style:

; Tree-sitter query syntax
(function_declaration
  name: (identifier) @function.name)
(method_definition
  name: (property_identifier) @method)
(call_expression
  function: (identifier) @function.call)

Each @capture name maps to a highlight group in your editor. This separation means you can update highlighting themes without touching the query logic.

Standard Capture Name Conventions

Neovim’s nvim-treesitter project has established a widely adopted set of capture names. Using these conventions means your grammar integrates with existing themes without custom theme work:

| Capture Name | What It Highlights |
| --- | --- |
| @keyword | Language keywords (if, for, return) |
| @keyword.control | Control flow keywords specifically |
| @function | Function names at definition sites |
| @function.call | Function names at call sites |
| @method | Method definitions |
| @method.call | Method calls |
| @type | Type names |
| @type.builtin | Built-in types (int, string, bool) |
| @variable | Variable references |
| @variable.builtin | Built-in variables (self, this) |
| @string | String literals |
| @string.special | Template strings, regex literals |
| @number | Numeric literals |
| @comment | Line and block comments |
| @operator | Operators (+, -, ==, etc.) |
| @punctuation.bracket | Brackets and parentheses |
| @property | Object property access |
| @constant | Constants (SCREAMING_SNAKE) |
| @namespace | Module or namespace references |

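Following this convention from the start prevents theme incompatibility issues when you publish your grammar. Capture names vary somewhat between editors and plugin versions (recent nvim-treesitter releases, for example, use @function.method and @module where older ones used @method and @namespace), so check the names your target editor documents before publishing.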
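Dotted capture names form a hierarchy, and editors such as Neovim fall back from a specific name to its parent when a theme does not style the specific one. The following is a minimal sketch of that lookup; the theme object and resolveStyle are hypothetical and not taken from any editor's API.

```javascript
// Hypothetical theme: capture name -> style. Not taken from any real theme.
const theme = {
  function: { color: 'blue' },
  'function.call': { color: 'cyan' },
  keyword: { color: 'purple', bold: true },
};

// Resolve '@keyword.control' by trying 'keyword.control', then 'keyword'.
function resolveStyle(theme, capture) {
  const parts = capture.replace(/^@/, '').split('.');
  while (parts.length > 0) {
    const key = parts.join('.');
    if (Object.prototype.hasOwnProperty.call(theme, key)) return theme[key];
    parts.pop(); // drop the most specific segment and retry
  }
  return null; // no style: the editor's default text color applies
}

console.log(resolveStyle(theme, '@function.call'));
console.log(resolveStyle(theme, '@keyword.control'));
console.log(resolveStyle(theme, '@type'));
```

This is why a specific capture such as @keyword.control is safe to emit: a theme that only defines the parent name still colors it.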

Highlighting Different Code Elements

Here’s a practical query file for comprehensive JavaScript highlighting:

; Variables and properties (general fallbacks first; later patterns refine them)
(identifier) @variable
(property_identifier) @property
; Keywords
["const" "let" "var" "function" "return" "if" "else" "for" "while"] @keyword
; Functions
(function_declaration name: (identifier) @function)
(variable_declarator
  name: (identifier) @function
  value: (arrow_function))
; Types (TypeScript grammar only; remove these two lines for plain JavaScript)
(type_identifier) @type
(predefined_type) @type.builtin
; Strings
(string) @string
(template_string) @string
; Numbers
(number) @number
; Comments
(comment) @comment
; Operators
(binary_expression operator: _ @operator)
(unary_expression operator: _ @operator)

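One subtlety: when several patterns capture the same node, which one wins depends on the tool. In Neovim, later patterns override earlier ones, so put more specific captures after general ones. For example, putting (identifier) @variable before (call_expression function: (identifier) @function.call) means function call identifiers correctly get @function.call, not @variable. Other tools, including some Helix releases, give precedence to the first matching pattern instead, so check your editor’s documentation before relying on file order.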
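The ordering rule can be sketched in a few lines of JavaScript. This is an illustration of "later pattern wins" resolution under the assumption that each capture records the index of the pattern that produced it; the data and resolveLastWins are hypothetical, not an editor API.

```javascript
// Each capture: which node it hit, the capture name, and the index of the
// pattern in highlights.scm that produced it (0 = first pattern in the file).
const captures = [
  { nodeId: 1, name: 'variable', patternIndex: 0 },      // (identifier) @variable
  { nodeId: 1, name: 'function.call', patternIndex: 1 }, // call-site pattern
  { nodeId: 2, name: 'variable', patternIndex: 0 },      // a plain identifier
];

// "Later pattern wins": for each node keep the capture with the highest
// patternIndex, so specific patterns placed later override general ones.
function resolveLastWins(captures) {
  const winners = new Map();
  for (const capture of captures) {
    const current = winners.get(capture.nodeId);
    if (!current || capture.patternIndex >= current.patternIndex) {
      winners.set(capture.nodeId, capture);
    }
  }
  return new Map([...winners].map(([id, capture]) => [id, capture.name]));
}

console.log(resolveLastWins(captures));
```

A tool that uses first-match precedence would flip the comparison, which is why the same highlights.scm can render differently across editors.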

Practical Examples with Claude Code

Claude Code excels at helping you write and debug Tree-sitter queries. Here’s how to approach common scenarios:

Example 1: Highlighting Decorators

Modern frameworks use decorators extensively. Here’s how to create queries that catch them:

(decorator
  [(identifier) @decorator
   (call_expression
     function: (identifier) @decorator
     arguments: (arguments) @decorator.args)])

This captures both simple decorators like @decorator and those with arguments like @decorator(arg).

A complete decorator handling pattern for TypeScript, which also covers the decorator factory pattern:

; @Component
(decorator
  "@" @punctuation.special
  (identifier) @attribute)
; @Component({ ... })
(decorator
  "@" @punctuation.special
  (call_expression
    function: (identifier) @attribute
    arguments: (arguments) @constructor.arguments))
; class method decorators: @Log before a method
(method_definition
  (decorator (identifier) @attribute)
  name: (property_identifier) @method)

Example 2: Context-Aware String Highlighting

Different string types often warrant different visual treatment:

(template_string) @string.special
(string
  (string_fragment) @string)

Extending this to cover common edge cases in JavaScript:

; Regular strings
(string) @string
; Template literals (different color to signal interpolation risk)
(template_string) @string.special
; String escape sequences get their own highlight
(escape_sequence) @string.escape
; Regex literals
(regex) @string.regex
(regex_pattern) @string.regex
(regex_flags) @keyword.operator

Example 3: Function Call vs. Definition

Distinguishing between function calls and definitions helps readers understand code flow:

(function_declaration name: (identifier) @function.definition)
(call_expression function: (identifier) @function.call)
(call_expression function: (member_expression property: (property_identifier) @method.call))

Extending this pattern to cover arrow functions and class methods:

; Named function declaration
(function_declaration
  name: (identifier) @function)
; Variable assigned arrow function: const greet = () => ...
(variable_declarator
  name: (identifier) @function
  value: (arrow_function))
; Object method shorthand: { greet() { ... } }
(method_definition
  name: (property_identifier) @method)
; Call sites
(call_expression
  function: (identifier) @function.call)
; Chained method call: obj.greet()
(call_expression
  function: (member_expression
    property: (property_identifier) @method.call))
; new MyClass()
(new_expression
  constructor: (identifier) @constructor)

Example 4: Injection Queries for Embedded Languages

One of Tree-sitter’s most powerful features is language injection: parsing one language embedded inside another. CSS-in-JS, SQL in Python, and HTML templates all benefit from this:

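; queries/injections.scm
; Inject SQL highlighting inside tagged template literals: sql`SELECT ...`
; (tree-sitter-javascript parses a tagged template as a call_expression)
(call_expression
  function: (identifier) @_tag
  arguments: (template_string) @injection.content
  (#match? @_tag "^sql$")
  (#set! injection.language "sql"))
; Inject CSS inside css`...` and createGlobalStyle`...`
(call_expression
  function: (identifier) @_tag
  arguments: (template_string) @injection.content
  (#match? @_tag "^(css|createGlobalStyle)$")
  (#set! injection.language "css"))
; Inject CSS inside styled.div`...` (the tag is a member_expression)
(call_expression
  function: (member_expression
    object: (identifier) @_tag)
  arguments: (template_string) @injection.content
  (#eq? @_tag "styled")
  (#set! injection.language "css"))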

This is something regex-based grammars cannot do reliably. With Tree-sitter injections, the embedded SQL is fully parsed as SQL, giving you proper keyword and table name highlighting inside the template literal.
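The predicates in an injection query amount to a lookup from the template tag to a language name. A rough JavaScript sketch of that decision, applied to the tag as written in the source, might look like this; injectionRules and injectionLanguageFor are hypothetical names used only for illustration.

```javascript
// Hypothetical tag-to-language rules, checked in order like query patterns.
const injectionRules = [
  { tagPattern: /^sql$/, language: 'sql' },
  { tagPattern: /^(css|createGlobalStyle|styled\..+)$/, language: 'css' },
];

// Return the language to inject for a template tag, or null for none.
function injectionLanguageFor(tagText) {
  const rule = injectionRules.find(r => r.tagPattern.test(tagText));
  return rule ? rule.language : null;
}

console.log(injectionLanguageFor('sql'));        // tag from sql`SELECT ...`
console.log(injectionLanguageFor('styled.div')); // tag from styled.div`...`
console.log(injectionLanguageFor('html'));       // no rule: left as a plain template string
```

Anchoring the patterns with ^ and $ matters: without them a tag such as mysqlHelper would also trigger the SQL injection.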

Debugging Your Queries

When queries don’t match as expected, Tree-sitter provides debugging tools. Use the tree-sitter parse command to see the actual parse tree structure:

tree-sitter parse your-file.js

This output shows node types and their hierarchy. Compare this against your queries to identify mismatches. Claude Code can help interpret parse output and suggest corrections to your queries.

The parse output looks like this for a simple function declaration:

(program [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 1]
    name: (identifier [0, 9] - [0, 14])
    parameters: (formal_parameters [0, 14] - [0, 16])
    body: (statement_block [0, 17] - [2, 1]
      (return_statement [1, 2] - [1, 11]
        (number [1, 9] - [1, 10])))))

Each node shows its type, start position [row, col], and end position. When a query fails to match, check that you are using the exact node type string shown in this output. A common mistake is writing function_declaration when the grammar uses function_definition (Python uses the latter).
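Checking node type names by eye gets tedious, so here is a small sketch that automates the comparison. It assumes the parse output format shown above; nodeTypesIn and unseenQueryTypes are hypothetical helpers, and the string and comment stripping is deliberately simplistic.

```javascript
// Sample `tree-sitter parse` output for a simple function declaration.
const parseOutput = `
(program [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 1]
    name: (identifier [0, 9] - [0, 14])
    parameters: (formal_parameters [0, 14] - [0, 16])
    body: (statement_block [0, 17] - [2, 1]
      (return_statement [1, 2] - [1, 11]
        (number [1, 9] - [1, 10])))))`;

// A node type is the identifier right after an opening parenthesis.
const NODE_NAME = /\(([A-Za-z_][A-Za-z0-9_]*)/g;

// Collect every node type that appears in the parse output.
function nodeTypesIn(output) {
  return new Set([...output.matchAll(NODE_NAME)].map(m => m[1]));
}

// List node types a query mentions that never appear in the sample output.
function unseenQueryTypes(query, seenTypes) {
  const stripped = query
    .replace(/"(?:[^"\\]|\\.)*"/g, '') // drop string literals
    .replace(/;.*$/gm, '');            // drop comments
  const used = new Set([...stripped.matchAll(NODE_NAME)].map(m => m[1]));
  used.delete('_'); // wildcard, not a node type
  return [...used].filter(type => !seenTypes.has(type));
}

const query = '(function_definition name: (identifier) @function)';
console.log(unseenQueryTypes(query, nodeTypesIn(parseOutput)));
```

A type reported here is only absent from this sample file, not necessarily from the grammar, so treat the result as a hint about where to look.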

Interactive Query Testing

Tree-sitter CLI also provides a playground for testing queries interactively:

# Test highlights against a file
tree-sitter highlight your-file.js
# Open the interactive web playground
tree-sitter playground

The tree-sitter highlight command prints the source with ANSI color codes applied according to your highlights.scm. This lets you see exactly which nodes are being captured without loading an editor.

For a rapid debugging loop with Claude Code, paste the raw tree-sitter parse output into Claude and describe which nodes you want to capture. Claude can read the node type hierarchy and propose corrected query patterns, which is often faster than working them out from the grammar documentation.

Writing Grammar Tests

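Tree-sitter has a built-in test format in test/corpus/. These are plain text files that assert what the parse tree should look like for given inputs. Each test has a name between two lines of = characters, followed by the source, a --- separator, and the expected tree:

====================
Variable Declaration
====================

const x = 42;

---

(program
  (variable_declaration
    name: (identifier)
    value: (number)))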

Run all tests with:

tree-sitter test

Ask Claude Code to generate test corpus entries from code samples. Providing 10-15 test cases across edge cases (empty bodies, nested structures, error recovery) catches regressions when you modify grammar rules.
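If you generate many cases, a small helper keeps the corpus layout consistent. The sketch below assumes the documented corpus format (a title between lines of = characters, the source, a --- separator, then the expected tree); corpusEntry is a hypothetical helper, not part of the Tree-sitter CLI.

```javascript
// Build one corpus test entry as text ready to append to test/corpus/*.txt.
function corpusEntry(title, source, expectedTree) {
  const bar = '='.repeat(Math.max(title.length, 3));
  return [
    bar,
    title,
    bar,
    '',
    source.trim(),
    '',
    '---',
    '',
    expectedTree.trim(),
    '',
  ].join('\n');
}

const entry = corpusEntry(
  'Variable Declaration',
  'const x = 42;',
  '(program\n  (variable_declaration\n    name: (identifier)\n    value: (number)))'
);
console.log(entry);
```

The expected tree still has to come from a parse you have reviewed; generating it blindly from current output only freezes existing bugs into the tests.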

Editor Integration

Most modern editors support Tree-sitter highlighting, either natively or through extensions:

Neovim

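Neovim ships a built-in Tree-sitter runtime, and the nvim-treesitter plugin is commonly used to install parsers and enable highlighting. The setup call below uses the plugin’s long-standing configs API, so check the plugin’s README if your installed version differs:

-- Configuration in init.lua
require('nvim-treesitter.configs').setup({
  highlight = {
    enable = true,
    additional_vim_regex_highlighting = false,
  },
})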

To add a custom grammar that is not in the official nvim-treesitter repository:

local parser_config = require('nvim-treesitter.parsers').get_parser_configs()
parser_config.mylang = {
  install_info = {
    url = 'https://github.com/yourname/tree-sitter-mylang',
    files = { 'src/parser.c' },
    branch = 'main',
  },
  filetype = 'mylang',
}
-- Assign file extension to parser
vim.filetype.add({ extension = { ml = 'mylang' } })

Place your highlights.scm, locals.scm, and injections.scm files under ~/.config/nvim/queries/mylang/. Neovim finds them through its runtime path, so no extra configuration is needed; reopen the buffer if an edited query does not take effect.

VS Code

VS Code does not use Tree-sitter for its built-in syntax highlighting, which relies on TextMate grammars and semantic tokens, so Tree-sitter support comes from extensions.

For example, the vscode-anycode extension uses Tree-sitter to provide lightweight features such as symbol search and outline for languages without a dedicated extension. You can also bundle a Tree-sitter grammar directly into a language extension by including the WASM parser:

# Generate WASM parser for browser/VS Code use
# (older CLI releases use: tree-sitter build-wasm)
tree-sitter build --wasm

Then load it in your extension’s activate function using the web-tree-sitter npm package.

Helix

Helix uses Tree-sitter natively and requires no plugins. Add your grammar to ~/.config/helix/languages.toml:

# Language server definition (syntax used by recent Helix releases)
[language-server.mylang-lsp]
command = "mylang-lsp"

[[language]]
name = "mylang"
scope = "source.mylang"
file-types = ["ml"]
roots = []
comment-token = "//"
language-servers = ["mylang-lsp"]

[language.auto-pairs]
'(' = ')'
'{' = '}'

[[grammar]]
name = "mylang"
source = { git = "https://github.com/yourname/tree-sitter-mylang", rev = "main" }

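Run hx --grammar fetch && hx --grammar build to download and compile the grammar. Helix reads query files from a runtime/queries/mylang/ directory, for example ~/.config/helix/runtime/queries/mylang/highlights.scm.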

Editor Support Comparison

| Editor | Tree-sitter Support | Custom Grammar Method | Query Hot-Reload |
| --- | --- | --- | --- |
| Neovim | Built-in (nvim-treesitter) | parser_config + query files | Yes (:TSUpdate) |
| Helix | Built-in | languages.toml + grammar fetch | Yes |
| VS Code | Via anycode / extension | Bundle WASM parser | No |
| Emacs | Via treesit.el (29+) | treesit-language-source-alist | Partial |
| Zed | Built-in | Extension manifest | Yes |

Actionable Tips for Better Highlighting

  1. Start Simple: Begin with keywords and basic types before adding complexity
  2. Use Semantic Names: Choose capture names that convey meaning (@function, @variable, @type)
  3. Test Incrementally: Add queries one category at a time and verify they work
  4. Use Claude Code: Describe what you want to highlight, and let Claude suggest queries
  5. Consider Performance: Complex queries across large files can slow highlighting; simplify patterns that match too broadly

Additional tips that come from production grammar maintenance:

  1. Avoid over-capturing with (identifier) @variable: This puts every identifier in the variable group, including type names, function names, and module names. Use more specific patterns and let unmatched identifiers fall through to the default color.

  2. Use predicates sparingly: Tree-sitter supports query predicates like (#match? @node "pattern") and (#eq? @a @b), but each predicate adds overhead. Reserve them for cases like injection language detection where they are genuinely needed.

  3. Pin grammar versions: When shipping a grammar as a dependency, pin the exact git revision in your lockfile. Grammar node types change between versions and can silently break highlight queries.

  4. Read existing grammars: The official tree-sitter-javascript, tree-sitter-python, and tree-sitter-rust grammars are thoroughly battle-tested. Reading their grammar.js and queries/highlights.scm is the fastest way to learn idiomatic patterns.

Advanced: Custom Grammars for Domain-Specific Languages

If you’re working with a DSL or custom configuration format, building a dedicated grammar provides the best highlighting experience. Define your language rules in a grammar.js file:

module.exports = grammar({
  name: 'my_config',
  extras: $ => [/\s/, $.comment],
  rules: {
    config: $ => seq(
      '{',
      repeat($.property),
      '}'
    ),
    property: $ => seq(
      $.key,
      ':',
      $.value
    ),
    key: $ => /[a-z_]+/,
    value: $ => choice(
      $.string,
      $.number,
      $.boolean,
      $.array
    ),
    // Define other node types...
  }
})

Generate the parser with tree-sitter generate, then write queries following the patterns shown earlier.

Handling Error Recovery in DSL Grammars

Real-world source files contain syntax errors. Tree-sitter’s error recovery inserts ERROR nodes where parsing fails rather than aborting. You should highlight these explicitly so users can see where their syntax broke:

; highlights.scm: mark parse errors visibly
(ERROR) @error

In Neovim this maps to the @error highlight group, which most themes render in red. Users get immediate visual feedback on broken syntax without needing a language server.
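Outside an editor, the same ERROR nodes show up in tree-sitter parse output, which makes them easy to locate from a script. The sketch below assumes the output format shown earlier in this guide; the sample output and errorRanges are hypothetical and only illustrate the idea.

```javascript
// Hypothetical parse output for a config file with one broken property.
const sampleOutput = `
(config [0, 0] - [2, 1]
  (property [1, 2] - [1, 10]
    (key [1, 2] - [1, 6])
    (value [1, 8] - [1, 10]))
  (ERROR [1, 11] - [1, 14]))`;

// Extract the start and end position of every ERROR node.
function errorRanges(parseOutput) {
  const pattern = /\(ERROR \[(\d+), (\d+)\] - \[(\d+), (\d+)\]/g;
  return [...parseOutput.matchAll(pattern)].map(m => ({
    start: { row: Number(m[1]), column: Number(m[2]) },
    end: { row: Number(m[3]), column: Number(m[4]) },
  }));
}

console.log(errorRanges(sampleOutput));
```

Rows and columns are zero-based in this output, so add one to each before showing them to users as line and column numbers.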

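You can also set the word grammar option. It names the token used for keyword extraction, so a keyword is recognized only as a whole word rather than as the prefix of a longer identifier:

module.exports = grammar({
  name: 'my_config',
  word: $ => $.key, // keyword extraction: keywords match only as whole words
  // ...
});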

Using Claude Code for a Full DSL Workflow

The complete workflow for a new DSL grammar looks like this:

  1. Describe the DSL to Claude: its purpose, keywords, expression forms, and any embedded languages.
  2. Claude generates an initial grammar.js.
  3. Run tree-sitter generate && tree-sitter parse sample.mydsl and paste the output back to Claude with a description of what is wrong.
  4. Claude suggests grammar corrections. Repeat until the parse tree matches expectations.
  5. Ask Claude to generate highlights.scm based on the node types visible in the parse output.
  6. Ask Claude to generate test/corpus/basics.txt with 10 test cases covering normal and edge-case inputs.
  7. Run tree-sitter test and iterate until all tests pass.

For a simple DSL, this loop can take an hour or two, compared with roughly a day of writing the grammar from scratch with documentation alone.

Conclusion

Tree-sitter moves syntax highlighting from text pattern matching to highlighting driven by the actual structure of the code. With Claude Code’s assistance, you can efficiently create and maintain highlighting rules that make your codebase more navigable and readable. Start with basic queries, iterate based on what you see in your editor, and gradually build comprehensive coverage for all the languages you work with.

The investment in well-crafted Tree-sitter queries pays dividends every time you open your editor and instantly recognize code structure at a glance.


Built by theluckystrike. More at zovo.one
