Auto LSP
A Rust crate for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers powered by Tree-sitter
`auto_lsp` is a generic library for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers.
It leverages crates such as lsp_types, lsp_server, salsa, and texter, and generates the AST of a Tree-sitter language to simplify building LSP servers.
`auto_lsp` provides useful abstractions while remaining flexible. You can override the default database as well as all LSP request and notification handlers.
It is designed to be as language-agnostic as possible, allowing any Tree-sitter grammar to be used.
See ARCHITECTURE.md for more information.
✨ Features
- Generates a thread-safe, immutable and iterable AST with parent-child relations from a Tree-sitter language.
- Supports downcasting of AST nodes to concrete types.
- Integrates with a Salsa database and parallelizes LSP requests and notifications.
📚 Documentation
Examples
Cargo Features
- `lsp_server`: Enables the LSP server (uses `lsp_server`).
- `wasm`: Enables WASM support (compatible only with `wasi-p1-threads`).
Inspirations / Similar Projects
- Volar
- Type-sitter
- Rust Analyzer
- Ruff
- texter by airblast-dev, which saved hours of headaches.
Architecture
Introduction
`auto_lsp` is a generic library for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers.
It leverages crates such as lsp_types, lsp_server, salsa, and texter, and generates the AST of a Tree-sitter language to simplify building LSP servers.
`auto_lsp` provides useful abstractions while remaining flexible. You can override the default database as well as all LSP request and notification handlers.
The library is inspired by language tools such as rust-analyzer and ruff, but with a Tree-sitter touch.
It was originally created as a way to quickly ship LSP servers without reinventing the wheel for each new language.
Crates
auto_lsp
This is the main crate. It re-exports auto_lsp_core, auto_lsp_codegen, auto_lsp_default, and auto_lsp_server.
auto_lsp_core
auto_lsp_core is the most important crate. It exports:
- `ast`: Defines the `AstNode` trait and the templates used by the codegen crate to build the AST.
- `document`: The `Document` struct that stores the text and Tree-sitter tree of a file.
- `parsers`: Contains the `Parsers` struct that stores a Tree-sitter parser and an AST parser function, configured via the `configure_parsers!` macro. This is used by the db to know how to parse a file.

Additional features:
- `document_symbols_builder`: A `DocumentSymbols` builder.
- `semantic_tokens_builder`: A `SemanticTokens` builder.
- `regex`: A method that applies regex captures over the results of a Tree-sitter query and returns the captures.
- `dispatch!` and `dispatch_once!`: Macros that make it more convenient to call a method on one or all nodes that match a given concrete type, without writing redundant downcasting code.
auto_lsp_codegen
auto_lsp_codegen contains the code generation logic. Unlike auto_lsp_core, codegen is not re-exported by the main crate.
It just exposes a `generate` function that takes a `node-types.json`, a `LanguageFn`, and returns a `proc_macro2::TokenStream`.
auto_lsp_default
auto_lsp_default contains the default database and server capabilities.
It is re-exported by the main crate when the `default` feature is enabled.
auto_lsp_server
auto_lsp_server contains the logic for starting an LSP server.
It is re-exported by the main crate when the `lsp_server` feature is enabled.
Examples
This example is the most complete one: it contains the generated AST from `tree_sitter_python`, LSP requests, a database, and a custom parser.
This example is a bit more minimal: it only contains the generated AST from `tree_sitter_html` and a database.
Runs the `ast-python` example in a vscode extension using the WASI SDK.
Runs the `ast-python` example in a vscode extension on either Windows or Linux.
Runs the `ast-python` example in a native binary with a client mock.
Testing
Most tests are located in the examples folder.
Alongside testing the behavior of the AST, database, and LSP server,
we also test whether the generated ASTs are correct in the corpus
folder using insta.
Workflows:
- codegen: Tests the `auto_lsp_codegen` crate.
- test-ast-native: Runs the main crate tests, plus ast-python and ast-html tests, on Windows and Linux targets.
- test-ast-wasi-p1-threads: Same as above but for the wasi-p1-threads target.
- lsp-server-native: Runs a real LSP server with a client mock.
Workflow
This is the current workflow used in internal projects when adding support for a new language.
```mermaid
graph TB
    A[generate the ast]
    B[configure parsers]
    C[create a database]
    D[create an LSP server]
    E[run the server]
    C1[test the ast generation - with expect_test or insta]
    E1[test the LSP server - by mocking requests/notifications]
    A ==> B ==> C ==> D ==> E
    subgraph Db
        C --> C1
    end
    subgraph LSP
        E --> E1
    end
```
License
auto_lsp is licensed under the GPL-3.0 license.
Generating an AST
To generate an AST, simply provide a Tree-sitter `node-types.json` and `LanguageFn` of any language to the `generate` function of the `auto_lsp_codegen` crate.
```shell
cargo add auto_lsp_codegen
```
Although `auto_lsp_codegen` is a standalone crate, the generated code depends on the main `auto_lsp` crate.
Usage
The `auto_lsp_codegen` crate exposes a single `generate` function, which takes:
- A `node-types.json`,
- A `LanguageFn`,
- An optional `HashMap<&str, &str>` to rename tokens (see Super Types),

and returns a `TokenStream`.
How you choose to use the `TokenStream` is up to you.
The most common setup is to call it from a build.rs script and write the generated code to a Rust file.
Note, however, that the output can be quite large: for example, Python's AST results in ~11,000 lines of code.
```rust
use auto_lsp_codegen::generate;
use std::{fs, path::PathBuf};

fn main() {
    if std::env::var("AST_GEN").unwrap_or("0".to_string()) == "0" {
        return;
    }

    let output_path = PathBuf::from("./src/generated.rs");

    fs::write(
        output_path,
        generate(
            tree_sitter_python::NODE_TYPES,
            &tree_sitter_python::LANGUAGE.into(),
            None,
        )
        .to_string(),
    )
    .unwrap();
}
```
You can also invoke it from your own CLI or tool if needed.
How Codegen Works
The generated code structure depends on the Tree-sitter grammar.
Structs for Rules
Each rule in `node-types.json` becomes a dedicated Rust struct. For example, given the rule:
```js
function_definition: $ => seq(
  optional('async'),
  'def',
  field('name', $.identifier),
  field('type_parameters', optional($.type_parameter)),
  field('parameters', $.parameters),
  optional(
    seq(
      '->',
      field('return_type', $.type),
    ),
  ),
  ':',
  field('body', $._suite),
),
```
The generated struct would look like this:
```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionDefinition {
    pub name: std::sync::Arc<Identifier>,
    pub body: std::sync::Arc<Block>,
    pub type_parameters: Option<std::sync::Arc<TypeParameter>>,
    pub parameters: std::sync::Arc<Parameters>,
    pub return_type: Option<std::sync::Arc<Type>>,
    /* ... */
}
```
Field Matching
To match fields, codegen uses the `field_id()` method from the Tree-sitter cursor.
From the above example, the generated builder might look like this:
```rust
builder.builder(db, &node, Some(id), |b| {
    b.on_field_id::<Identifier, 19u16>(&mut name)?
        .on_field_id::<Block, 6u16>(&mut body)?
        .on_field_id::<TypeParameter, 31u16>(&mut type_parameters)?
        .on_field_id::<Parameters, 23u16>(&mut parameters)?
        .on_field_id::<Type, 24u16>(&mut return_type)
});
```
Each u16 represents the unique field ID assigned by the Tree-sitter language parser.
Handling Children
If a node has no named fields, a children enum is generated to represent all possible variants.
- If the children are unnamed, a generic "Operator_" enum is generated
- If the children are named, the enum will be a concatenation of all possible child node types with underscores, using sanitized Rust-friendly names.
For example, given the rule:
```js
_statement: $ => choice(
  $._simple_statement,
  $._compound_statement,
),
```
The generated enum would look like this:
```rust
pub enum SimpleStatement_CompoundStatement {
    SimpleStatement(SimpleStatement),
    CompoundStatement(CompoundStatement),
}
```
If the generated enum name becomes too long, consider using a Tree-sitter supertype to group nodes together.
The `kind_id()` method is used to determine child kinds during traversal.
The `AstNode::contains` method relies on this to check whether a node kind belongs to a specific struct or enum variant.
Vec and Option Fields
`repeat` and `repeat1` in the grammar will generate a `Vec` field.
`optional(...)` will generate an `Option<T>` field.
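As an illustration (with hypothetical node types, not taken from a real grammar), a rule using `repeat($.statement)` and `optional(field("label", $.identifier))` could produce a struct shaped like this:

```rust
use std::sync::Arc;

// Hypothetical node types, for illustration only.
#[derive(Debug)]
pub struct Statement;
#[derive(Debug)]
pub struct Identifier;

// The repeated child becomes a Vec, the optional field becomes an Option.
#[derive(Debug, Default)]
pub struct Block {
    pub statements: Vec<Arc<Statement>>, // from repeat / repeat1
    pub label: Option<Arc<Identifier>>,  // from optional(...)
}
```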
Token Naming
Unnamed tokens are mapped to Rust enums using a built-in token map. For instance:
```json
{ "type": "+", "named": false },
{ "type": "+=", "named": false },
{ "type": ",", "named": false },
{ "type": "-", "named": false },
{ "type": "-=", "named": false },
```
Generates:
```rust
pub enum Token_Plus {}
pub enum Token_PlusEqual {}
pub enum Token_Comma {}
pub enum Token_Minus {}
pub enum Token_MinusEqual {}
```
Tokens with regular identifiers are converted to PascalCase.
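The conversion can be pictured along these lines (a sketch, not the crate's actual implementation): split the rule name on underscores and capitalize each part.

```rust
// Sketch of snake_case -> PascalCase conversion, as used conceptually
// when naming generated structs (not the crate's actual code).
fn to_pascal_case(s: &str) -> String {
    s.split('_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}
```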
Custom Tokens
If your grammar defines additional unnamed tokens not covered by the default map, you can provide a custom token mapping to generate appropriate Rust enum names.
```rust
use auto_lsp_codegen::generate;
use std::collections::HashMap;

let _result = generate(
    &tree_sitter_python::NODE_TYPES,
    &tree_sitter_python::LANGUAGE.into(),
    Some(HashMap::from([
        ("+", "Plus"),
        ("+=", "PlusEqual"),
        (",", "Comma"),
        ("-", "Minus"),
        ("-=", "MinusEqual"),
    ])),
);
```
Tokens that are not in the map will be added, and tokens that already exist in the map will be overwritten.
Super Types
Tree-sitter supports supertypes, which allow grouping related nodes under a common type.
For example, in the Python grammar:
```json
{
  "type": "_compound_statement",
  "named": true,
  "subtypes": [
    {
      "type": "class_definition",
      "named": true
    },
    {
      "type": "decorated_definition",
      "named": true
    },
    /* ... */
    {
      "type": "with_statement",
      "named": true
    }
  ]
},
```
This becomes a Rust enum:
```rust
pub enum CompoundStatement {
    ClassDefinition(ClassDefinition),
    DecoratedDefinition(DecoratedDefinition),
    /* ... */
    WithStatement(WithStatement),
}
```
Some supertypes may contain other supertypes; in that case, the generated enum flattens the hierarchy.
The AstNode Trait
The `AstNode` trait is the core abstraction for all AST nodes in auto-lsp.
Definition
The `AstNode` trait is implemented by all generated AST types. It extends:
- `Debug + Clone + Send + Sync`: for thread safety and logging
- `PartialEq + Eq + PartialOrd + Ord`: nodes can be sorted or compared
- `Downcast`: enables safe casting to concrete node types
Each AST node has a unique identifier, generated during the Tree-sitter traversal. This ID is used to implement comparison traits.
Eq is based on the unique ID and the range of the node. Comparing Arc pointers should be preferred, however, because comparing two nodes from different trees might yield false negatives.
Downcasting to Concrete Types
The `AstNode` trait supports safe downcasting to concrete types through the `Downcast` trait from the `downcast_rs` crate.
With `is::<T>()`

```rust
if node.is::<FunctionDefinition>() {
    println!("It's a function!");
}
```
With `downcast_ref::<T>()`

```rust
// Attempt to downcast to a specific type
if let Some(function) = node.downcast_ref::<FunctionDefinition>() {
    // Work with the concrete FunctionDefinition type
    println!("Function name: {}", function.name);
}
```
Pattern Matching with Downcasting
```rust
match pass_statement.downcast_ref::<CompoundStatement_SimpleStatement>() {
    Some(CompoundStatement_SimpleStatement::SimpleStatement(
        SimpleStatement::PassStatement(PassStatement { .. }),
    )) => {
        // Successfully matched a `pass` statement
    },
    _ => panic!("Expected PassStatement"),
}
```
Building an AST
The generated AST includes:
- Structs representing AST nodes.
- Implementations of the AstNode trait.
- A TryFrom implementation to build nodes from Tree-sitter.
Building an AST requires a `salsa::Database`, which is used to accumulate errors during parsing.
Using TryFrom
Each AST node type provides a `TryFrom` implementation that accepts a `TryFromParams` tuple. This is used to convert a Tree-sitter node into an AST node.
```rust
/// Parameters passed to `TryFrom` implementations for AST nodes.
pub type TryFromParams<'from> = (
    &'from Node<'from>,         // Tree-sitter node
    &'from dyn salsa::Database, // Salsa database
    &'from mut Builder,         // AST builder
    usize,                      // Node ID (auto-incremented by the builder)
    Option<usize>,              // Optional parent node ID
);
```
Example: Building a root node
```rust
// Create the AST builder
let mut builder = auto_lsp::core::ast::Builder::default();

// Build the root node from the Tree-sitter parse tree
let root = ast::generated::SourceFile::try_from((
    &tree.root_node(),
    db, // Your salsa database
    &mut builder,
    0,    // Root node ID
    None, // Root has no parent
))?;

// Retrieve all non-root nodes from the builder
let mut nodes = builder.take_nodes();

// Add the root node manually
nodes.push(std::sync::Arc::new(root));

// Optional: Sort the nodes by ID
nodes.sort_unstable();
```
Retrieving Errors
Errors that occur during AST construction are accumulated using the `ParseErrorAccumulator` struct. This allows partial AST construction even when some nodes fail to parse.
It’s recommended to use `TryFrom` inside a `salsa::tracked` function so you can retrieve errors using `salsa::accumulated`.
The default crate provides a `get_ast` query that builds the AST and collects errors. It is compatible with `BaseDatabase`.
ParseError Structure
The `ParseError` enum represents errors that can be encountered during parsing.
```rust
#[derive(Error, Clone, Debug, PartialEq, Eq)]
pub enum ParseError {
    #[error("{error:?}")]
    LexerError {
        range: lsp_types::Range,
        #[source]
        error: LexerError,
    },
    #[error("{error:?}")]
    AstError {
        range: lsp_types::Range,
        #[source]
        error: AstError,
    },
}
```
- `LexerError`: issues from Tree-sitter's lexer
- `AstError`: issues from a `TryFrom` implementation

You can retrieve lexer errors via `get_tree_sitter_errors()` from the `default` crate.
`LexerError` can either be a missing symbol error or a syntax error.
```rust
#[derive(Error, Clone, Debug, PartialEq, Eq)]
pub enum LexerError {
    #[error("{error:?}")]
    Missing {
        range: lsp_types::Range,
        error: String,
    },
    #[error("{error:?}")]
    Syntax {
        range: lsp_types::Range,
        error: String,
    },
}
```
ParsedAst struct
The result of `get_ast` is a `ParsedAst` struct, which holds the list of AST nodes and implements `Deref` for direct iteration.

```rust
pub struct ParsedAst {
    pub nodes: Arc<Vec<Arc<dyn AstNode>>>,
}
```
You can work with AST nodes in two ways:
- Downcast a node to a concrete type and access its fields.
- Iterate over all nodes and filter or match on their type.
Methods
- `get_root`: Returns the root node.
- `descendant_at`: Returns the first node that contains the given offset.
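Conceptually, `descendant_at` can be pictured as a linear scan over the position-sorted node list. A sketch with illustrative types (not the actual implementation):

```rust
// Illustrative node with a byte range.
#[derive(Debug, PartialEq)]
struct Node {
    start: usize,
    end: usize,
}

// Return the first node whose range contains the given offset,
// assuming `nodes` is sorted by position.
fn descendant_at(nodes: &[Node], offset: usize) -> Option<&Node> {
    nodes.iter().find(|n| n.start <= offset && offset < n.end)
}
```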
Example: Filtering nodes by type
```rust
let functions: Vec<_> = get_ast(db, file)
    .iter()
    .filter(|node| node.is::<FunctionDefinition>())
    .collect();
```
For convenience when calling methods on multiple node types, use the `dispatch!` or `dispatch_once!` macros.
See Dispatch Pattern.
Dispatch
When working with the AST, you can either:
- Manually walk the tree through concrete node types.
- Iterate over node lists.
To make traversal easier, auto_lsp provides two macros, `dispatch_once!` and `dispatch!`, which call methods on nodes matching a given type.
dispatch_once
Calls the method on the first node that matches one of the specified types and returns early.
```rust
use ast::generated::{FunctionDefinition, ClassDefinition};
use auto_lsp::dispatch_once;

dispatch_once!(node.lower(), [
    FunctionDefinition => return_something(db, param),
    ClassDefinition => return_something(db, param)
]);

Ok(None)
```
dispatch
Calls the method on all matching node types.
```rust
use ast::generated::{FunctionDefinition, ClassDefinition};
use auto_lsp::dispatch;

dispatch!(node.lower(), [
    FunctionDefinition => build_something(db, param),
    ClassDefinition => build_something(db, param)
]);

Ok(())
```
Lower Method
The `.lower()` method retrieves the lowest-level (most concrete) AST node for a given input.
This avoids matching on enum variants by directly returning the most specific node type.
It behaves similarly to `enum_dispatch`, but instead of returning a concrete type, it returns a `&dyn AstNode`.
lower() always returns the most specific variant. If an enum wraps another enum, lower() will recursively unwrap to reach the innermost node.
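The recursive unwrapping can be sketched as follows (illustrative types returning `&dyn Any`; the real `lower()` returns a `&dyn AstNode`):

```rust
use std::any::Any;

// Illustrative concrete node.
struct PassStatement;

// Nested enum wrappers, as generated for supertypes.
enum SimpleStatement {
    Pass(PassStatement),
}
enum Statement {
    Simple(SimpleStatement),
}

impl Statement {
    // Recursively unwrap to the innermost node.
    fn lower(&self) -> &dyn Any {
        match self {
            Statement::Simple(s) => s.lower(),
        }
    }
}

impl SimpleStatement {
    fn lower(&self) -> &dyn Any {
        match self {
            SimpleStatement::Pass(p) => p,
        }
    }
}
```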
Example: dispatch_once! in a Hover Request
```rust
// Request for hover
pub fn hover(db: &impl BaseDatabase, params: HoverParams) -> anyhow::Result<Option<Hover>> {
    // Get the file in DB
    let uri = &params.text_document_position_params.text_document.uri;
    let file = db
        .get_file(uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;
    let document = file.document(db);

    // Find the node at the position using the `offset_at` method
    // Note that we could also iterate over the AST to find the node
    let offset = document
        .offset_at(params.text_document_position_params.position)
        .ok_or_else(|| {
            anyhow::format_err!(
                "Invalid position, {:?}",
                params.text_document_position_params.position
            )
        })?;

    // Get the node at the given offset
    if let Some(node) = get_ast(db, file).descendant_at(offset) {
        // Call the `get_hover` method on the node if it matches the type.
        dispatch_once!(node.lower(), [
            PassStatement => get_hover(db, file),
            Identifier => get_hover(db, file)
        ]);
    }
    Ok(None)
}

// Implementation of the `get_hover` method for `PassStatement` and `Identifier`
impl PassStatement {
    fn get_hover(
        &self,
        _db: &impl BaseDatabase,
        _file: File,
    ) -> anyhow::Result<Option<lsp_types::Hover>> {
        Ok(Some(lsp_types::Hover {
            contents: lsp_types::HoverContents::Markup(lsp_types::MarkupContent {
                kind: lsp_types::MarkupKind::Markdown,
                value: r#"This is a pass statement
[See python doc](https://docs.python.org/3/reference/simple_stmts.html#the-pass-statement)"#
                    .into(),
            }),
            range: None,
        }))
    }
}

impl Identifier {
    fn get_hover(
        &self,
        db: &impl BaseDatabase,
        file: File,
    ) -> anyhow::Result<Option<lsp_types::Hover>> {
        let doc = file.document(db);
        Ok(Some(lsp_types::Hover {
            contents: lsp_types::HoverContents::Markup(lsp_types::MarkupContent {
                kind: lsp_types::MarkupKind::PlainText,
                value: format!("hover {}", self.get_text(doc.texter.text.as_bytes())?),
            }),
            range: None,
        }))
    }
}
```
Tree-sitter queries
`Document` gives you access to the `tree_sitter::Tree` via the `tree` field.
From there, you can run any query you want instead of using the AST.
Example: Folding ranges in Python
```rust
// from: https://github.com/nvim-treesitter/nvim-treesitter/blob/master/queries/python/folds.scm
static FOLD: &str = r#"
[
  (function_definition)
  (class_definition)
  (while_statement)
  (for_statement)
  (if_statement)
  (with_statement)
  (try_statement)
  (match_statement)
  (import_from_statement)
  (parameters)
  (argument_list)
  (parenthesized_expression)
  (generator_expression)
  (list_comprehension)
  (set_comprehension)
  (dictionary_comprehension)
  (tuple)
  (list)
  (set)
  (dictionary)
  (string)
] @fold

(comment) @fold.comment

[
  (import_statement)
  (import_from_statement)
]+ @fold"#;

// Precompile the query
pub static FOLD_QUERY: LazyLock<tree_sitter::Query> = LazyLock::new(|| {
    tree_sitter::Query::new(&tree_sitter_python::LANGUAGE.into(), FOLD)
        .expect("Failed to create fold query")
});

/// Request for folding ranges
pub fn folding_ranges(
    db: &impl BaseDatabase,
    params: FoldingRangeParams,
) -> anyhow::Result<Option<Vec<FoldingRange>>> {
    // Get the file in DB
    let uri = params.text_document.uri;
    let file = db
        .get_file(&uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;
    let document = file.document(db);
    let root_node = document.tree.root_node();
    let source = document.texter.text.as_str();

    // Creates a new query cursor
    let mut query_cursor = tree_sitter::QueryCursor::new();
    let mut captures = query_cursor.captures(&FOLD_QUERY, root_node, source.as_bytes());

    let mut ranges = vec![];

    // Iterate over the captures
    while let Some((m, capture_index)) = captures.next() {
        let capture = m.captures[*capture_index];
        let kind = match FOLD_QUERY.capture_names()[capture.index as usize] {
            "fold.comment" => FoldingRangeKind::Comment,
            _ => FoldingRangeKind::Region,
        };
        let range = capture.node.range();
        ranges.push(FoldingRange {
            start_line: range.start_point.row as u32,
            start_character: Some(range.start_point.column as u32),
            end_line: range.end_point.row as u32,
            end_character: Some(range.end_point.column as u32),
            kind: Some(kind),
            collapsed_text: None,
        });
    }
    Ok(Some(ranges))
}
```
Range Requests
Some LSP requests, such as semantic tokens, support ranges, meaning you should request information for a specific range of the document instead of the whole document.
To support this, you can use the `get_ast` method from the `default` crate to get the AST of a file.
Since the nodes are sorted by position, it is possible to iterate over the AST and perform operations only on the portion of the AST that contains the range.
Example: Semantic tokens for a range
```rust
pub fn semantic_tokens_range(
    db: &impl BaseDatabase,
    params: SemanticTokensRangeParams,
) -> anyhow::Result<Option<SemanticTokensResult>> {
    // Get the file in DB
    let uri = params.text_document.uri;
    let file = db
        .get_file(&uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;

    let mut builder = SemanticTokensBuilder::new("".into());

    // Iterate over the AST
    for node in get_ast(db, file).iter() {
        // Skip nodes that are before the range
        if node.get_lsp_range().end <= params.range.start {
            continue;
        }
        // Stop at nodes that are after the range
        if node.get_lsp_range().start >= params.range.end {
            break;
        }
        // Dispatch on the node
        dispatch!(node.lower(),
            [
                FunctionDefinition => build_semantic_tokens(db, file, &mut builder)
            ]
        );
    }
    Ok(Some(SemanticTokensResult::Tokens(builder.build())))
}
```
Default Database
The `default` crate provides default components for managing source files and a database in the db module:
- BaseDb: A basic database implementation using Salsa.
- File: A struct representing a source file.
- BaseDatabase: A trait for file retrieval.
- FileManager: A trait for file updates.
These components are designed to cover common use cases but can be extended or replaced to suit your project’s specific needs.
File struct
The File struct is a `salsa::input` representing a source file, its `Url`, and a reference to its parser configuration.
```rust
#[salsa::input]
pub struct File {
    #[id]
    pub url: Url,
    pub parsers: &'static Parsers,
    #[return_ref]
    pub document: Arc<Document>,
}
```
BaseDb struct
`BaseDb` is the default implementation of a database. It stores `File` inputs.
```rust
#[salsa::db]
#[derive(Default, Clone)]
pub struct BaseDb {
    storage: Storage<Self>,
    pub(crate) files: DashMap<Url, File>,
}
```
To enable logging, use the `with_logger` method to initialize the database with a logger closure.
BaseDatabase trait
The `BaseDatabase` trait defines how to access stored files. It is meant to be implemented by any Salsa-compatible database and is used to retrieve files from the database.
```rust
#[salsa::db]
pub trait BaseDatabase: Database {
    fn get_files(&self) -> &DashMap<Url, File>;

    fn get_file(&self, url: &Url) -> Option<File> {
        self.get_files().get(url).map(|file| *file)
    }
}
```
FileManager trait
The `FileManager` trait provides high-level methods to manage files (add, update, remove). It is implemented for any type that also implements `BaseDatabase`.
```rust
pub trait FileManager: BaseDatabase + salsa::Database {
    fn add_file_from_texter(
        &mut self,
        parsers: &'static Parsers,
        url: &Url,
        texter: Text,
    ) -> Result<(), DataBaseError>;

    fn update(
        &mut self,
        url: &Url,
        changes: &[lsp_types::TextDocumentContentChangeEvent],
    ) -> Result<(), DataBaseError>;

    fn remove_file(&mut self, url: &Url) -> Result<(), DataBaseError>;
}
```
Document
Acknowledgement
Thanks to the `texter` crate, text in any encoding is supported.
`texter` also provides an efficient way to update documents incrementally.
The Document struct has the following fields:
- `texter`: a texter struct that stores the document.
- `tree`: the Tree-sitter syntax tree.
Creating a document
A Document can be created using either the `from_utf8` or `from_texter` method of `FileManager`.
Updating a document
The database supports updating a document using the `update` method of `FileManager`.
`update` takes 2 parameters:
- The `Url` of the document to update.
- A list of `lsp_types::TextDocumentContentChangeEvent` changes.
These changes are sent by the client when the document is modified.
```rust
registry.on_mut::<DidChangeTextDocument, _>(|session, params| {
    Ok(session.db.update(&params.text_document.uri, &params.content_changes)?)
})
```
`update` may return a `DataBaseError` if the update fails.
Configuring Parsers
To inform the server about which file extensions are associated with a parser, you need to use the `configure_parsers!` macro.
`configure_parsers!` takes the name of the list as its first argument; each subsequent entry is a parser configuration.
A parser requires the following information:
- A tree-sitter language fn.
- The AST root node (often Module, Document, or SourceFile nodes).
Example with python
```rust
configure_parsers!(
    PYTHON_PARSERS,
    "python" => {
        language: tree_sitter_python::LANGUAGE,
        ast_root: ast::generated::Module // generated by auto_lsp_codegen
    }
);
```
LSP Server
`auto-lsp` utilizes `lsp_server` from rust-analyzer and `crossbeam` to launch the server.
Global State
The server's global state is managed by a `Session`.
Configuring a server
Pre-requisites
The server is generic over a `salsa::Database`, so you need to implement a database before starting the server.
You can use the default `BaseDb` database provided by `auto_lsp` or create your own.
The `default` module in server contains file storage, file event handlers, and workspace loading logic that are compatible with the `BaseDatabase` trait.
If you create your own database, you will have to create your own file storage and file event handlers.
Configuring
To configure a server, you need to use the `create` method of the `Session` struct, which takes the following arguments:
- `parsers`: A list of parsers (previously defined with the `configure_parsers!` macro).
- `capabilities`: Server capabilities, see ServerCapabilities.
- `server_info`: Optional information about the server, such as its name and version, see ServerInfo.
- `connection`: The connection to the client, see Connection.
- `db`: The database to use; it must implement `salsa::Database`.
`create` will return a tuple containing the `Session` and the `InitializeParams` sent by the client.
The server communicates with an LSP client using one of lsp_server's transport methods: `stdio`, `tcp`, or `memory`.
```rust
use std::error::Error;

use auto_lsp::server::{InitOptions, Session, ServerCapabilities};
use ast_python::db::PYTHON_PARSERS;

fn main() -> Result<(), Box<dyn Error + Sync + Send>> {
    // Enable logging and tracing, this is optional
    stderrlog::new()
        .modules([module_path!(), "auto_lsp"])
        .verbosity(4)
        .init()
        .unwrap();
    fastrace::set_reporter(ConsoleReporter, Config::default());

    // Server options
    let options = InitOptions {
        parsers: &PYTHON_PARSERS,
        capabilities: ServerCapabilities {
            ..Default::default()
        },
        server_info: None,
    };

    // Create the connection
    let (connection, io_threads) = Connection::stdio();

    // Create a database, either BaseDb or your own
    let db = BaseDb::default();

    // Create the session
    let (mut session, params) = Session::create(options, connection, db)?;

    // This is where you register your requests and notifications
    // See the handlers section for more information
    let mut request_registry = RequestRegistry::<BaseDb>::default();
    let mut notification_registry = NotificationRegistry::<BaseDb>::default();

    // This will add all files available in the workspace.
    // init_workspace is only available for databases that implement BaseDatabase or BaseDb
    let init_results = session.init_workspace(params)?;
    if !init_results.is_empty() {
        init_results.into_iter().for_each(|result| {
            if let Err(err) = result {
                eprintln!("{}", err);
            }
        });
    };

    // Run the server and wait for the two threads to end
    // (typically triggered by the LSP Exit event).
    session.main_loop(&mut request_registry, &mut notification_registry)?;
    session.io_threads.join()?;

    // Shut down gracefully.
    eprintln!("Shutting down server");
    Ok(())
}
```
Default Capabilities
`auto_lsp` provides helper methods to configure some default capabilities.
Semantic Tokens
The `semantic_tokens_provider` method will configure the `semantic_tokens_provider` field of the `ServerCapabilities` struct.
Parameters:
- `range`: Whether the client supports sending semantic token requests for a specific range.
- `token_types`: The list of token types that the server supports.
- `token_modifiers`: The list of token modifiers that the server supports.
```rust
use auto_lsp::server::semantic_tokens_provider;

let capabilities = ServerCapabilities {
    semantic_tokens_provider: semantic_tokens_provider(false, Some(SUPPORTED_TYPES), Some(SUPPORTED_MODIFIERS)),
    ..Default::default()
};
```
Except for semantic tokens, these default capabilities are only available if you use the `BaseDb` database.
Text Document Sync
Since the `Document` supports incremental updates, the `text_document_sync` field of the `ServerCapabilities` struct is configured to `INCREMENTAL` by default.
You can use the `TEXT_DOCUMENT_SYNC` constant to configure it.
```rust
use auto_lsp::server::TEXT_DOCUMENT_SYNC;

let capabilities = ServerCapabilities {
    text_document_sync: TEXT_DOCUMENT_SYNC.clone(),
    ..Default::default()
};
```
This is required for the default `open_text_document` handler to work.
Workspace Provider
The `WORKSPACE_PROVIDER` constant will configure the `workspace` field of the `ServerCapabilities` struct.
```rust
use auto_lsp::server::WORKSPACE_PROVIDER;

let capabilities = ServerCapabilities {
    workspace: WORKSPACE_PROVIDER.clone(),
    ..Default::default()
};
```
This is required for the default `changed_watched_files` handler to work.
Workspace initialization
When using `BaseDb` as the database, the `init_workspace` method will load all files in the workspace and associate them with a parser.
It will also send diagnostics for all files.
If you want to customize this behavior, you can implement your own `init_workspace` method and call it instead of the default one.
```rust
use auto_lsp::server::Session;

let (mut session, params) = Session::create(options, connection, db)?;

// This will add all files available in the workspace.
let init_results = my_init_workspace(&mut session, params)?;
if !init_results.is_empty() {
    init_results.into_iter().for_each(|result| {
        if let Err(err) = result {
            eprintln!("{}", err);
        }
    });
};
```
Handlers
All LSP requests and notifications must be registered before calling `main_loop`.
Handlers are registered using the `RequestRegistry` and `NotificationRegistry` structs. Both store handlers in internal HashMaps, using the method name as the key. When a request or notification is received, the corresponding handler is looked up and invoked based on the method name.
Handler callbacks receive two parameters:
- session: The global state of the server.
- parameters: The request or notification parameters.
Both registries implement `Default`, but require a `salsa::Database` type parameter.
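The lookup mechanism described above can be sketched with a plain HashMap keyed by method name (illustrative only; the crate's actual registries are typed over LSP requests and a database):

```rust
use std::collections::HashMap;

// A handler takes the raw params and returns a response string.
type Handler = Box<dyn Fn(&str) -> String>;

// Minimal method-name-keyed registry.
#[derive(Default)]
struct Registry {
    handlers: HashMap<&'static str, Handler>,
}

impl Registry {
    fn on(&mut self, method: &'static str, handler: Handler) -> &mut Self {
        self.handlers.insert(method, handler);
        self
    }

    // Look up the handler by method name and invoke it.
    fn dispatch(&self, method: &str, params: &str) -> Option<String> {
        self.handlers.get(method).map(|h| h(params))
    }
}
```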
Adding Handlers
The `RequestRegistry` and `NotificationRegistry` structs provide two methods to register a handler:
- .on: Executes the handler in a separate thread. This is cancelable.
- .on_mut: Executes the handler synchronously with mutable access to the session.
```rust
use capabilities::handle_document_symbols;
use capabilities::handle_folding_ranges;
use capabilities::handle_watched_files;
use auto_lsp::lsp_types::notification::DidChangeWatchedFiles;
use auto_lsp::lsp_types::request::{DocumentSymbolRequest, FoldingRangeRequest};
use auto_lsp::server::{NotificationRegistry, RequestRegistry};

fn main() -> Result<(), Box<dyn Error + Sync + Send>> {
    /* ... */
    let mut request_registry = RequestRegistry::<BaseDb>::default();
    let mut notification_registry = NotificationRegistry::<BaseDb>::default();

    request_registry
        // read only, will be executed in a separate thread
        .on::<DocumentSymbolRequest, _>(handle_document_symbols)
        .on::<FoldingRangeRequest, _>(handle_folding_ranges);

    notification_registry
        // mutable because we need to update the database
        .on_mut::<DidChangeWatchedFiles, _>(handle_watched_files);
    /* ... */
}
```
Custom Request
You can define your own request types by implementing the `Request` trait from `lsp_types`.
pub struct GetWorkspaceFilesUris {}

impl Request for GetWorkspaceFilesUris {
    type Params = (); // Parameters for the request
    type Result = Vec<String>; // Expected response type
    const METHOD: &'static str = "custom/GetWorkspaceFilesUris"; // Method name used in the request
}
Similarly, to define a custom notification, implement the Notification trait instead of Request.
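As a sketch, a custom notification could look like the following. The trait shown here is a minimal local stand-in for lsp_types::notification::Notification (the real trait additionally bounds Params with serde traits), and the WorkspaceIndexed name is purely illustrative:

```rust
// Minimal stand-in for the `Notification` trait from `lsp_types`;
// the real trait also requires `Params` to be (de)serializable.
pub trait Notification {
    type Params;
    const METHOD: &'static str;
}

// Hypothetical notification sent once the workspace has been indexed.
pub struct WorkspaceIndexed {}

impl Notification for WorkspaceIndexed {
    type Params = (); // No parameters for this notification
    const METHOD: &'static str = "custom/WorkspaceIndexed"; // Method name on the wire
}

fn main() {
    // The method name is what the client and server match on.
    assert_eq!(WorkspaceIndexed::METHOD, "custom/WorkspaceIndexed");
}
```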
Default Handlers
The default crate provides handlers for several LSP requests and notifications.
changed_watched_files
This notification is handled by the changed_watched_files function in the default crate. It updates files in the workspace when external changes are detected.
To enable this handler, use the WORKSPACE_PROVIDER constant when configuring the server capabilities.
open_text_document
This notification is handled by the open_text_document function in the default module. It ensures that the file is added to the workspace if not already present.
To enable this handler, use the TEXT_DOCUMENT_SYNC constant during server capabilities configuration.
Configuring Semantic Tokens
To configure semantic tokens, use the define_semantic_token_types and define_semantic_token_modifiers macros.
Token Types
use auto_lsp::define_semantic_token_types;
define_semantic_token_types![
    standard {
        "namespace" => NAMESPACE,
        "type" => TYPE,
        "function" => FUNCTION,
    }
    custom {
        "custom" => CUSTOM,
    }
];
This macro generates two components to streamline working with semantic token types:
- Constants: Creates a constant for each standard and custom token type.
- Supported Token Types: Generates a slice (SUPPORTED_TYPES) containing all supported token types, which can be reused to inform the LSP client about available tokens.
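As a rough, self-contained sketch of what such an expansion provides, the example below uses plain strings in place of the lsp_types::SemanticTokenType values the real macro emits:

```rust
// Rough sketch of the macro's output, using plain strings instead of
// lsp_types::SemanticTokenType values (names follow the example above).
pub const NAMESPACE: &str = "namespace";
pub const TYPE: &str = "type";
pub const FUNCTION: &str = "function";
pub const CUSTOM: &str = "custom";

// The slice's order determines each token type's index in the legend
// advertised to the client.
pub const SUPPORTED_TYPES: &[&str] = &[NAMESPACE, TYPE, FUNCTION, CUSTOM];

fn main() {
    // Token indices are looked up by position in the slice.
    assert_eq!(SUPPORTED_TYPES.iter().position(|x| *x == FUNCTION), Some(2));
}
```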
Token Modifiers
use auto_lsp::define_semantic_token_modifiers;
define_semantic_token_modifiers![
    standard {
        DOCUMENTATION,
        DECLARATION,
    }
    custom {
        (READONLY, "readonly"),
        (STATIC, "static"),
    }
];
This generates:
- Constants for standard (DOCUMENTATION, DECLARATION) and custom (READONLY, STATIC) modifiers.
- A SUPPORTED_MODIFIERS slice that includes both standard and custom modifiers.
Example in AST
use auto_lsp::core::semantic_tokens_builder::SemanticTokensBuilder;
define_semantic_token_types![
    standard {
        "function" => FUNCTION,
    }
    custom {}
];

define_semantic_token_modifiers![
    standard {
        DECLARATION,
    }
    custom {}
];
impl MyType {
    fn build_semantic_tokens(&self, builder: &mut SemanticTokensBuilder) {
        builder.push(
            self.name.get_lsp_range(),
            SUPPORTED_TYPES.iter().position(|x| *x == FUNCTION).unwrap() as u32,
            SUPPORTED_MODIFIERS.iter().position(|x| *x == DECLARATION).unwrap() as u32,
        );
    }
}
Configuring a client
File extensions
The LSP server must know how each file extension is associated with a parser.
The client is responsible for sending this information to the server.
With the VS Code LSP client, this is done by providing a perFileParser object in the initializationOptions of LanguageClientOptions.
import { workspace } from 'vscode';
import { LanguageClient, LanguageClientOptions, ServerOptions, RequestType } from 'vscode-languageclient/node';

// Tell the server that .py files are associated with the "python" parser
// defined via the configure_parsers! macro.
const initializationOptions = {
    perFileParser: {
        "py": "python"
    }
};

// `channel` and `createUriConverters` are assumed to be defined elsewhere.
const clientOptions: LanguageClientOptions = {
    documentSelector: [{ language: 'python' }],
    synchronize: {
        fileEvents: workspace.createFileSystemWatcher('**/*.py')
    },
    outputChannel: channel,
    uriConverters: createUriConverters(),
    initializationOptions
};
Logging and Tracing
auto-lsp uses fastrace and log for tracing and logging.
Logging
To enable logging, you can use any logger that implements the log crate's facade. For example, you can use stderrlog to log to stderr.
stderrlog::new()
    .modules([module_path!(), "auto_lsp"])
    .verbosity(4)
    .init()
    .unwrap();
Tracing
To enable tracing, follow the instructions in the fastrace documentation.
Snapshots
Since all AST nodes implement the Debug trait, it is easy to take snapshots of them. You can therefore use any snapshot testing library, such as insta or expect_test.
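For illustration, any type that derives Debug can be rendered to a snapshot string with the pretty-print formatter; the FunctionNode struct below is a made-up example, not an actual auto_lsp node:

```rust
// Illustrative struct; auto_lsp's AST nodes implement Debug similarly.
#[derive(Debug)]
struct FunctionNode {
    name: String,
    range: (usize, usize),
}

fn main() {
    let node = FunctionNode {
        name: "main".to_string(),
        range: (0, 12),
    };
    // The pretty Debug output is what a snapshot library would store
    // and diff against on subsequent test runs.
    let snapshot = format!("{:#?}", node);
    assert!(snapshot.contains("FunctionNode"));
    assert!(snapshot.contains("\"main\""));
}
```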