A tree sitting in your editor
Table of contents
- The use-case
- What’s tree-sitter?
- A syntax tree
- Neovim tree-sitter API
- tree-sitter queries
- Using tree-sitter-queries
- Wrap up
This is a short introduction to the tree-sitter integration in Neovim, based on a use-case I had: finding content in a TOML file close to the cursor position and then launching an application using this information.
The use-case ¶
I had a use-case where I wanted to be able to trigger an application from within Neovim and pass along information based on file contents close to the cursor's position.
I wanted to do this from within TOML files. An example looks like this:
[setup]
statement_files = ["sql/uservisits.sql"]
[[queries]]
name = "global avg"
statement = '''select avg("adRevenue") from uservisits'''
iterations = 500
[[queries]]
name = "global max-long"
statement = "select max(duration) from uservisits"
iterations = 500
[teardown]
statements = ["drop table if exists uservisits"]
If I hit a key combination it should notice whether the cursor is within a [setup], [teardown] or [[queries]] block. If it’s within [[queries]] it should extract the name value within it and launch an application with the value as argument.
Most TOML parser libraries don’t preserve whitespace information, but instead provide the result as a data dictionary. Without whitespace information it would be difficult to figure out how the cursor position relates to the data.
Therefore a regular parser library is out of the question. This is where tree-sitter comes in.
What’s tree-sitter? ¶
From the Tree-sitter website:
Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited
The tree-sitter runtime library is written in C. That made it possible for Neovim to embed it and provide an API for users and plugin authors to retrieve and query the syntax tree of a document.
To use tree-sitter you need to install language-specific parsers. You can learn more about that in the nvim-treesitter documentation.
A syntax tree ¶
Parsing the above TOML example yields a syntax tree like this:
table [0, 0] - [3, 0]
  bare_key [0, 1] - [0, 6]
  pair [1, 0] - [1, 40]
    bare_key [1, 0] - [1, 15]
    array [1, 18] - [1, 40]
      string [1, 19] - [1, 39]
table_array_element [3, 0] - [8, 0]
  bare_key [3, 2] - [3, 9]
  pair [4, 0] - [4, 19]
    bare_key [4, 0] - [4, 4]
    string [4, 7] - [4, 19]
  pair [5, 0] - [5, 57]
    bare_key [5, 0] - [5, 9]
    string [5, 12] - [5, 57]
  pair [6, 0] - [6, 16]
    bare_key [6, 0] - [6, 10]
    integer [6, 13] - [6, 16]
table_array_element [8, 0] - [13, 0]
  bare_key [8, 2] - [8, 9]
  pair [9, 0] - [9, 24]
    bare_key [9, 0] - [9, 4]
    string [9, 7] - [9, 24]
  pair [10, 0] - [10, 50]
    bare_key [10, 0] - [10, 9]
    string [10, 12] - [10, 50]
  pair [11, 0] - [11, 16]
    bare_key [11, 0] - [11, 10]
    integer [11, 13] - [11, 16]
table [13, 0] - [15, 0]
  bare_key [13, 1] - [13, 9]
  pair [14, 0] - [14, 48]
    bare_key [14, 0] - [14, 10]
    array [14, 13] - [14, 48]
      string [14, 14] - [14, 47]
At first glance this may look like gibberish. What you see here are the names of the syntax nodes and their positions within the document. In the square brackets you see [start_row, start_col] and [end_row, end_col]. The exact syntax nodes always depend on the concrete parser implementation and the language grammar. The tree for a Python program will look different.
In this example the top-level nodes are:
- table for [setup] and [teardown]
- table_array_element for [[queries]]
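To make the [start_row, start_col] - [end_row, end_col] ranges a bit more tangible, here is a minimal sketch in plain Lua that checks whether a 0-based position falls inside such a range. The helper name range_contains is made up for this illustration and is not part of any Neovim API; it assumes the end position is exclusive, which is how tree-sitter reports node ranges.

```lua
-- Hypothetical helper: does the 0-based position (row, col) fall inside
-- a range given as { start_row, start_col, end_row, end_col }?
-- The end position is treated as exclusive.
local function range_contains(range, row, col)
  local start_row, start_col, end_row, end_col =
    range[1], range[2], range[3], range[4]
  if row < start_row or (row == start_row and col < start_col) then
    return false
  end
  if row > end_row or (row == end_row and col >= end_col) then
    return false
  end
  return true
end

-- Ranges copied from the tree above
local setup = { 0, 0, 3, 0 }
local first_query = { 3, 0, 8, 0 }

print(range_contains(setup, 1, 5))       -- true
print(range_contains(first_query, 1, 5)) -- false
print(range_contains(first_query, 4, 0)) -- true
```

You rarely need to write this yourself, because the tree-sitter API can do the lookup for you, but it shows what the numbers mean.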
To make it easier to learn how text relates to syntax nodes you can use the playground plugin for Neovim. It helps you inspect a syntax tree by highlighting the related text as you navigate through the tree.

(Neovim 0.9 adds an :InspectTree command that can do roughly the same.)
Now, how can you use this syntax tree to find the needed information?
Neovim tree-sitter API ¶
Usually the best way to learn about APIs in Neovim is to use the built-in help system. For tree-sitter that would be :help lua-treesitter. I’ll go over the main components now, so you don’t need to consult the help page immediately, but if you intend to play around with tree-sitter yourself, make sure to read it eventually.
First we need a parser. We can get that by using the get_parser function within the vim.treesitter module:
local parser = vim.treesitter.get_parser()
The parser will be bound to the current buffer/document and we can immediately parse the document using the parse method:
local trees = parser:parse()
The method returns a list of trees (in Lua, a table), not a single tree. This is because in languages like Markdown you can have nested languages, and you’d get a tree for each.
My TOML files don’t contain nested languages so I ignore that and use the first tree, and then retrieve the root node:
local root = trees[1]:root()
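As a quick sanity check, the root node can be used to list the top-level nodes in the same format as the tree shown earlier. This is only a sketch: describe and top_level_nodes are names I made up, and the functions accept anything that offers :type(), :range() and :iter_children(), which is what tree-sitter nodes in Neovim provide.

```lua
-- Format one node the way the tree listing above does.
-- `node` is anything with :type() and :range(), such as a TSNode.
local function describe(node)
  local start_row, start_col, end_row, end_col = node:range()
  return string.format("%s [%d, %d] - [%d, %d]",
    node:type(), start_row, start_col, end_row, end_col)
end

-- Collect a description of every direct child of `root`.
local function top_level_nodes(root)
  local lines = {}
  for child in root:iter_children() do
    lines[#lines + 1] = describe(child)
  end
  return lines
end

-- Inside Neovim, with a TOML buffer focused and the parser installed:
--   local parser = vim.treesitter.get_parser(0, "toml")
--   local root = parser:parse()[1]:root()
--   print(table.concat(top_level_nodes(root), "\n"))
```

Keeping the Neovim calls in the commented usage lets the two helpers run in any Lua interpreter, which makes them easy to try out with stand-in nodes.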
Neovim provides a neat function that lets us retrieve the smallest node spanning a given range. We can use this function to get the node containing the cursor. But first we need the cursor location. We can use the nvim_win_get_cursor function for that. It takes the window number as argument and supports passing 0 for the current window.
local lnum, col = unpack(vim.api.nvim_win_get_cursor(0))
It returns a tuple of (row, col) with the rows starting at 1 and columns at 0. tree-sitter uses 0-based indexes for both, so we need to subtract one from the row:
lnum = lnum - 1
To retrieve the node containing the cursor we use the descendant_for_range method. It takes a start row, start column, end row and end column as parameters. Given that the cursor is in a single point, not a range, we use lnum and col for both start and end:
local cursor_node = root:descendant_for_range(lnum, col, lnum, col)
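The steps above (read the cursor, shift the row to 0-based, ask the root for the smallest node at that point) can be bundled into two small functions. This is a sketch; to_ts_position and node_at_cursor are hypothetical names, not Neovim functions.

```lua
-- Convert the (1, 0)-indexed cursor tuple from nvim_win_get_cursor
-- into the 0-based (row, col) pair that tree-sitter expects.
local function to_ts_position(cursor)
  return cursor[1] - 1, cursor[2]
end

-- Return the smallest node under `root` that contains the cursor.
-- `root` is anything with :descendant_for_range(), such as a TSNode.
local function node_at_cursor(root, cursor)
  local row, col = to_ts_position(cursor)
  return root:descendant_for_range(row, col, row, col)
end

-- Inside Neovim:
--   local root = vim.treesitter.get_parser():parse()[1]:root()
--   local node = node_at_cursor(root, vim.api.nvim_win_get_cursor(0))
--   print(node:type())
```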
With this node we can traverse upward until we find the table node, then back down using :child(index) or :iter_children(). You can get the type of a node using :type() and retrieve the content of a node with vim.treesitter.query.get_node_text(node, bufnr). A complete example:
local bufnr = vim.api.nvim_get_current_buf()
local parent = cursor_node:parent()
while parent ~= nil do
  -- child(1) below needs at least two children to exist
  if parent:type() == "table" and parent:child_count() > 1 then
    -- child(0) is the opening "[", child(1) is the table name
    local child = parent:child(1)
    if child:type() == "bare_key" then
      local name = vim.treesitter.query.get_node_text(child, bufnr)
      if name == "setup" or name == "teardown" then
        print('Cursor was within a setup or teardown block')
        return
      end
    end
  end
  parent = parent:parent()
end
This traverses upward to find the table, then back down to the bare_key to get the value of the TOML table node name. In the example file that would be either [setup] or [teardown].
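The same upward-then-downward walk can be written a little more generally, so that it also covers table_array_element and does not depend on the key being at child index 1. The following is a sketch under those assumptions; find_ancestor and first_child_of_type are hypothetical helper names, and unlike the loop above the search starts at the node itself rather than at its parent.

```lua
-- Walk from `node` upward and return the first node (including `node`
-- itself) whose type is a key in `wanted`, or nil if none is found.
local function find_ancestor(node, wanted)
  while node do
    if wanted[node:type()] then
      return node
    end
    node = node:parent()
  end
  return nil
end

-- Return the first direct child of `node` with the given type, or nil.
local function first_child_of_type(node, child_type)
  for child in node:iter_children() do
    if child:type() == child_type then
      return child
    end
  end
  return nil
end

-- Inside Neovim, with `cursor_node` and `bufnr` as in the text:
--   local block = find_ancestor(cursor_node,
--     { table = true, table_array_element = true })
--   local key = block and first_child_of_type(block, "bare_key")
--   if key then
--     print(vim.treesitter.query.get_node_text(key, bufnr))
--   end
```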
tree-sitter queries ¶
An alternative approach to manually traversing the syntax tree is to use a lisp-like query language. Tree-sitter uses S-expressions to query for nodes within a syntax tree.
From Pattern matching with queries:
A query consists of one or more patterns, where each pattern is an S-expression that matches a certain set of nodes in a syntax tree. The expression to match a given node consists of a pair of parentheses containing two things: the node’s type, and optionally, a series of other S-expressions that match the node’s children
This query language may take some getting used to, but the Playground can help again. It highlights the parts in the document which the query matches. Read its documentation for more information. (Neovim 0.10 provides an :EditQuery command with similar capabilities.)
I won’t repeat the documentation referenced above and instead show you what I ended up with:
((table_array_element
  (bare_key) @element_name
  (#eq? @element_name "queries")
  (pair
    (bare_key) @property
    (string) @value
    (#eq? @property "name")
  )
 )
)
To translate this into English: Find all table_array_element nodes where the bare_key value matches queries and where there is a property = value pair where the property value equals name.
Using tree-sitter-queries ¶
The vim.treesitter.query module contains a parse_query function which requires the parser name and a query string. It returns a query object:
local query = vim.treesitter.query.parse_query(vim.bo.filetype, [[
((table_array_element
  (bare_key) @element_name
  (#eq? @element_name "queries")
  (pair
    (bare_key) @property
    (string) @value
    (#eq? @property "name")
  )
 )
)
]])
Using this query object we can iterate over any matching captures using its iter_captures method. The captures are the @<name> parts of the query. The iter_captures method has four parameters: a syntax node as starting point, the buffer number, the row at which to start the search, and the end row.
We use the root node as starting point because it should query the full document. We use 0 as starting row to start from the top of the document and the cursor position (lnum) as end row. This ensures nodes below the cursor are excluded.
Now to get the [[queries]] block closest to the cursor we can loop through all matches and keep a reference to the last one:
local bufnr = vim.api.nvim_get_current_buf()
local last = nil
for id, node in query:iter_captures(root, bufnr, 0, lnum) do
  local capture = query.captures[id]
  if capture == "value" then
    last = node
  end
end
iter_captures yields the capture ID and the node for every capture of every match. The query above has a few captures (@element_name, @property, and @value), but we only need value.
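One thing to be aware of: as far as I know, Neovim 0.9 deprecated vim.treesitter.query.parse_query and vim.treesitter.query.get_node_text in favour of vim.treesitter.query.parse and vim.treesitter.get_node_text, and Neovim 0.10 removed the old names. The sketch below picks whichever name is available and wraps the capture loop in a function. resolve_api and last_capture_text are names made up for this example; check :help treesitter for your version before relying on it.

```lua
-- Pick whichever function names the running Neovim provides.
-- `ts` is the vim.treesitter module; passing it in as an argument
-- keeps the function usable outside Neovim with a stand-in table.
local function resolve_api(ts)
  return {
    parse_query = ts.query.parse or ts.query.parse_query,
    get_node_text = ts.get_node_text or ts.query.get_node_text,
  }
end

-- Return the text of the last node captured as `capture_name` between
-- row 0 and `stop_row`, or nil if nothing matched.
local function last_capture_text(api, query, root, bufnr, capture_name, stop_row)
  local last = nil
  for id, node in query:iter_captures(root, bufnr, 0, stop_row) do
    if query.captures[id] == capture_name then
      last = node
    end
  end
  return last and api.get_node_text(last, bufnr) or nil
end

-- Inside Neovim, with `root`, `bufnr` and `lnum` as in the text and
-- `query_string` holding the query shown earlier:
--   local api = resolve_api(vim.treesitter)
--   local query = api.parse_query("toml", query_string)
--   print(last_capture_text(api, query, root, bufnr, "value", lnum))
```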
If there was a match, get the contents of the node and run the application:
if last then
  local name = vim.treesitter.query.get_node_text(last, bufnr)
  local cmd = {
    'cr8',
    'run-spec',
    vim.api.nvim_buf_get_name(bufnr),
    'localhost:4200',
    '--action', 'queries',
    -- The node text includes the quotes, so strip the first and last character
    '--re-name', string.sub(name, 2, #name - 1)
  }
  -- Not included: These spawn the cmd in a terminal:
  close_term()
  launch_term(cmd)
end
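The string.sub(name, 2, #name - 1) call works here because the name values in my files use single-character quotes. If a value were written with triple quotes, like the statement in the first [[queries]] block, stripping one character per side would not be enough. A slightly more defensive variant could look like the sketch below; strip_toml_quotes is a hypothetical helper and it does not process escape sequences.

```lua
-- Remove the surrounding TOML quotes from the raw text of a string node.
-- Triple-quoted forms are checked first so that '''x''' yields x
-- and not ''x''. Text without matching quotes is returned unchanged.
local function strip_toml_quotes(text)
  for _, quote in ipairs({ '"""', "'''", '"', "'" }) do
    local n = #quote
    if #text >= 2 * n and text:sub(1, n) == quote and text:sub(-n) == quote then
      return text:sub(n + 1, -n - 1)
    end
  end
  return text
end

print(strip_toml_quotes('"global avg"'))   -- global avg
print(strip_toml_quotes("'''select 1'''")) -- select 1
```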
Wrap up ¶
That’s it.
I hope this gave you some inspiration and ideas how you could use tree-sitter to improve your own editing tasks or workflows.