
Auto LSP

A Rust crate for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers powered by Tree-sitter


auto_lsp is a generic library for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers.

It leverages crates such as lsp_types, lsp_server, salsa, and texter, and generates the AST of a Tree-sitter language to simplify building LSP servers.

auto_lsp provides useful abstractions while remaining flexible. You can override the default database as well as all LSP request and notification handlers.

It is designed to be as language-agnostic as possible, allowing any Tree-sitter grammar to be used.

See ARCHITECTURE.md for more information.

✨ Features

  • Generates a thread-safe, immutable and iterable AST with parent-child relations from a Tree-sitter language.
  • Supports downcasting of AST nodes to concrete types.
  • Integrates with a Salsa database and parallelizes LSP requests and notifications.

📚 Documentation

Examples

Cargo Features

  • lsp_server: Enables the LSP server (uses lsp_server).
  • wasm: Enables WASM support (compatible only with wasi-p1-threads).

Inspirations / Similar Projects

Architecture

Introduction

auto_lsp is a generic library for creating Abstract Syntax Trees (AST) and Language Server Protocol (LSP) servers.

It leverages crates such as lsp_types, lsp_server, salsa, and texter, and generates the AST of a Tree-sitter language to simplify building LSP servers.

auto_lsp provides useful abstractions while remaining flexible. You can override the default database as well as all LSP request and notification handlers.

The library is inspired by language tools such as rust-analyzer and ruff, but with a Tree-sitter touch.

It was originally created as a way to quickly ship LSP servers without reinventing the wheel for each new language.

Crates

auto_lsp

This is the main crate. It reexports auto_lsp_core, auto_lsp_codegen, auto_lsp_default and auto_lsp_server.

auto_lsp_core

auto_lsp_core is the most important crate; it exports:

  • ast: Defines the AstNode trait and templates used by the codegen crate to build the AST.
  • document: The Document struct that stores the text and Tree-sitter tree of a file.
  • parsers: Contains the Parsers struct that stores a Tree-sitter parser and an AST parser function, configured via the configure_parsers! macro. This is used by the db to know how to parse a file.

Additional features:

  • document_symbols_builder: A DocumentSymbols builder.
  • semantic_tokens_builder: A SemanticTokens builder.
  • regex: A method that applies regex captures over the results of a Tree-sitter query and returns the captures.
  • dispatch! and dispatch_once!: Macros that make it more convenient to call a method on one or all nodes matching a given concrete type, without writing redundant downcasting code.

auto_lsp_codegen

auto_lsp_codegen contains the code generation logic. Unlike auto_lsp_core, codegen is not reexported by the main crate.

It exposes a single generate function that takes a node-types.json and a LanguageFn, and returns a proc_macro2::TokenStream.

auto_lsp_default

auto_lsp_default contains the default database and server capabilities. It is reexported by the main crate when the default feature is enabled.

auto_lsp_server

auto_lsp_server contains the logic for starting an LSP server. It is reexported by the main crate when the lsp_server feature is enabled.

Examples

This example is the most complete one; it contains the generated AST from tree_sitter_python, LSP requests, a database and a custom parser.

This example is a bit more minimal; it only contains the generated AST from tree_sitter_html and a database.

Runs the ast-python example in a vscode extension using the WASI SDK.

Runs the ast-python example in a vscode extension using either Windows or Linux.

Runs the ast-python example in a native binary with a client mock.

Testing

Most tests are located in the examples folder.

Alongside testing the behavior of the AST, database, and LSP server, we also test whether the generated ASTs are correct in the corpus folder using insta.


Workflow

This is the current workflow used in internal projects when adding support for a new language.

graph TB
    A[generate the ast]
    B[configure parsers]
    C[create a database] 
    D[create an LSP server]
    E[run the server]

    C1[test the ast generation - with expect_test or insta]
    E1[test the LSP server - by mocking requests/notifications]

    A ==> B ==> C ==> D ==> E
    subgraph Db
        C --> C1
    end
    subgraph LSP 
        E --> E1
    end

License

auto_lsp is licensed under the GPL-3.0 license.

Generating an AST

To generate an AST, simply provide a Tree-sitter node-types.json and LanguageFn of any language to the generate function of the auto_lsp_codegen crate.

cargo add auto_lsp_codegen

Note

Although auto_lsp_codegen is a standalone crate, the generated code depends on the main auto_lsp crate.

Usage

The auto_lsp_codegen crate exposes a single generate function, which takes a node-types.json, a LanguageFn, and an optional custom token map, and returns a proc_macro2::TokenStream.

How you choose to use the TokenStream is up to you.

The most common setup is to call it from a build.rs script and write the generated code to a Rust file.

Note, however, that the output can be quite large—for example, Python’s AST results in ~11,000 lines of code.

use auto_lsp_codegen::generate;
use std::{fs, path::PathBuf};

fn main() {
    if std::env::var("AST_GEN").unwrap_or("0".to_string()) == "0" {
        return;
    }

    let output_path = PathBuf::from("./src/generated.rs");

    fs::write(
        output_path,
        generate(
            tree_sitter_python::NODE_TYPES,
            &tree_sitter_python::LANGUAGE.into(),
            None,
        )
        .to_string(),
    )
    .unwrap();
}

You can also invoke it from your own CLI or tool if needed.

How Codegen Works

The generated code structure depends on the Tree-sitter grammar.

Structs for Rules

Each rule in node-types.json becomes a dedicated Rust struct. For example, given the rule:

function_definition: $ => seq(
      optional('async'),
      'def',
      field('name', $.identifier),
      field('type_parameters', optional($.type_parameter)),
      field('parameters', $.parameters),
      optional(
        seq(
          '->',
          field('return_type', $.type),
        ),
      ),
      ':',
      field('body', $._suite),
    ),

The generated struct would look like this:

#[derive(Debug, Clone, PartialEq)]
pub struct FunctionDefinition {
    pub name: std::sync::Arc<Identifier>,
    pub body: std::sync::Arc<Block>,
    pub type_parameters: Option<std::sync::Arc<TypeParameter>>,
    pub parameters: std::sync::Arc<Parameters>,
    pub return_type: Option<std::sync::Arc<Type>>,
    /* ... */
}

Field Matching

To match fields, codegen uses the field_id() method from the Tree-sitter cursor.

From the above example, the generated builder might look like this:

builder.builder(db, &node, Some(id), |b| {
  b.on_field_id::<Identifier, 19u16>(&mut name)?
    .on_field_id::<Block, 6u16>(&mut body)?
    .on_field_id::<TypeParameter, 31u16>(&mut type_parameters)?
    .on_field_id::<Parameters, 23u16>(&mut parameters)?
    .on_field_id::<Type, 24u16>(&mut return_type)
});

Each u16 represents the unique field ID assigned by the Tree-sitter language parser.

Handling Children

If a node has no named fields, a children enum is generated to represent all possible variants.

  • If the children are unnamed, a generic "Operator_" enum is generated
  • If the children are named, the enum will be a concatenation of all possible child node types with underscores, using sanitized Rust-friendly names.

For example, given the rule:

  _statement: $ => choice(
      $._simple_statement,
      $._compound_statement,
    ),

The generated enum would look like this:

pub enum SimpleStatement_CompoundStatement {
    SimpleStatement(SimpleStatement),
    CompoundStatement(CompoundStatement),
}

Note

If the generated enum name becomes too long, consider using a Tree-sitter supertype to group nodes together.

The kind_id() method is used to determine child kinds during traversal.

The AstNode::contains method relies on this to check whether a node kind belongs to a specific struct or enum variant.
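The idea behind such a check can be sketched as follows. This is an illustrative simplification, not the actual generated code, and the kind ID values are made up:

```rust
// Hypothetical sketch: each generated type knows the set of Tree-sitter
// kind IDs it covers, so `contains` reduces to a membership lookup.
// The constant values below are invented for illustration.
const SIMPLE_STATEMENT_KINDS: &[u16] = &[12, 13, 14];

fn simple_statement_contains(kind_id: u16) -> bool {
    SIMPLE_STATEMENT_KINDS.contains(&kind_id)
}

fn main() {
    assert!(simple_statement_contains(13));
    assert!(!simple_statement_contains(99));
}
```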

Vec and Option Fields

repeat and repeat1 in the grammar will generate a Vec field. optional(...) will generate an Option<T> field.
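As an illustration of these two mappings, a hypothetical sketch of the generated field shapes (the node names here are stand-ins, not real codegen output):

```rust
// Hypothetical sketch: `repeat`/`repeat1` in the grammar become `Vec`
// fields, while `optional(...)` becomes an `Option<T>` field.
use std::sync::Arc;

#[derive(Debug)]
pub struct Statement; // stand-in for a generated node type

#[derive(Debug)]
pub struct Block {
    // repeat($.statement)            -> Vec field
    pub statements: Vec<Arc<Statement>>,
    // optional(field('label', ...))  -> Option field
    pub label: Option<Arc<Statement>>,
}

fn main() {
    let block = Block { statements: Vec::new(), label: None };
    assert_eq!(block.statements.len(), 0);
    assert!(block.label.is_none());
}
```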

Token Naming

Unnamed tokens are mapped to Rust enums using a built-in token map. For instance:

  { "type": "+", "named": false },
  { "type": "+=", "named": false },
  { "type": ",", "named": false },
  { "type": "-", "named": false },
  { "type": "-=", "named": false },

Generates:

pub enum Token_Plus {}
pub enum Token_PlusEqual {}
pub enum Token_Comma {}
pub enum Token_Minus {}
pub enum Token_MinusEqual {}

Tokens with regular identifiers are converted to PascalCase.
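The PascalCase conversion can be sketched as a small helper. This is a hand-written illustration of the naming rule, not the converter used by auto_lsp_codegen:

```rust
// Illustrative sketch of the naming rule: rule identifiers such as
// `function_definition` become PascalCase Rust type names.
fn pascal_case(ident: &str) -> String {
    ident
        .split('_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // Uppercase the first character, keep the rest unchanged.
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

fn main() {
    assert_eq!(pascal_case("function_definition"), "FunctionDefinition");
    assert_eq!(pascal_case("_compound_statement"), "CompoundStatement");
}
```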

Custom Tokens

If your grammar defines additional unnamed tokens not covered by the default map, you can provide a custom token mapping to generate appropriate Rust enum names.

use std::collections::HashMap;

use auto_lsp_codegen::generate;

let _result = generate(
    tree_sitter_python::NODE_TYPES,
    &tree_sitter_python::LANGUAGE.into(),
    Some(HashMap::from([
        ("+", "Plus"),
        ("+=", "PlusEqual"),
        (",", "Comma"),
        ("-", "Minus"),
        ("-=", "MinusEqual"),
    ])),
);

Tokens that are not in the map will be added, and tokens that already exist in the map will be overwritten.
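These merge semantics behave like extending a HashMap: same-key entries override the default, new keys are added. A minimal sketch (the maps here are stand-ins for the built-in token map, not the real one):

```rust
// Sketch of the merge semantics: custom entries extend the built-in map,
// and entries with the same key overwrite the default mapping.
use std::collections::HashMap;

fn main() {
    // Stand-in for the built-in token map.
    let mut token_map: HashMap<&str, &str> =
        HashMap::from([("+", "Plus"), ("-", "Minus")]);
    // User-provided custom tokens.
    let custom = HashMap::from([("+", "Add"), ("**", "Power")]);
    token_map.extend(custom);

    assert_eq!(token_map["+"], "Add");    // overwritten
    assert_eq!(token_map["**"], "Power"); // added
    assert_eq!(token_map["-"], "Minus");  // untouched default
}
```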

Super Types

Tree-sitter supports supertypes, which allow grouping related nodes under a common type.

For example, in the Python grammar:

  {
    "type": "_compound_statement",
    "named": true,
    "subtypes": [
      {
        "type": "class_definition",
        "named": true
      },
      {
        "type": "decorated_definition",
        "named": true
      },
      /* ... */
      {
        "type": "with_statement",
        "named": true
      }
    ]
  },

This becomes a Rust enum:

pub enum CompoundStatement {
    ClassDefinition(ClassDefinition),
    DecoratedDefinition(DecoratedDefinition),
    /* ... */
    WithStatement(WithStatement),
}

Note

Some supertypes may contain other supertypes; in that case, the generated enum flattens the hierarchy.

The AstNode Trait

The AstNode trait is the core abstraction for all AST nodes in auto-lsp.

Definition

The AstNode trait is implemented by all generated AST types. It extends:

  • Debug + Clone + Send + Sync — for thread safety and logging
  • PartialEq + Eq + PartialOrd + Ord — nodes can be sorted or compared
  • Downcast — enables safe casting to concrete node types

Each AST node has a unique identifier, generated during the Tree-sitter traversal. This ID is used to implement comparison traits.

Eq is based on the unique ID and the range of the node, although comparing Arc pointers should be preferred because comparing two nodes from different trees might yield false negatives.
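The distinction can be sketched with a stand-in node type (the real AstNode types are generated; this struct is hypothetical):

```rust
// Sketch: two nodes from different parses can compare equal by ID and
// range, so identity checks should use `Arc::ptr_eq` rather than `==`.
use std::sync::Arc;

#[derive(PartialEq)]
struct Node { id: usize, range: (usize, usize) } // hypothetical stand-in

fn main() {
    let a = Arc::new(Node { id: 0, range: (0, 3) });
    let b = Arc::new(Node { id: 0, range: (0, 3) }); // same ID, other tree

    assert!(*a == *b);               // value equality: same ID and range
    assert!(!Arc::ptr_eq(&a, &b));   // but not the same node
    assert!(Arc::ptr_eq(&a, &a.clone())); // same allocation -> identical
}
```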

Downcasting to Concrete Types

The AstNode trait supports safe downcasting to concrete types through the Downcast trait from the downcast_rs crate.

With is::<T>()

if node.is::<FunctionDefinition>() {
    println!("It's a function!");
}

With downcast_ref::<T>()

// Attempt to downcast to a specific type
if let Some(function) = node.downcast_ref::<FunctionDefinition>() {
    // Work with the concrete FunctionDefinition type
    println!("Function name: {}", function.name);
}

Pattern Matching with Downcasting

match pass_statement.downcast_ref::<CompoundStatement_SimpleStatement>() {
    Some(CompoundStatement_SimpleStatement::SimpleStatement(
        SimpleStatement::PassStatement(PassStatement { .. }),
    )) => {
        // Successfully matched a `pass` statement
    },
    _ => panic!("Expected PassStatement"),
}

Building an AST

The generated AST includes:

  • Structs representing AST nodes.
  • Implementations of the AstNode trait.
  • A TryFrom implementation to build nodes from Tree-sitter.

Building an AST requires a salsa::Database, which is used to accumulate errors during parsing.

Using TryFrom

Each AST node type provides a TryFrom implementation that accepts a TryFromParams tuple. This is used to convert a Tree-sitter node into an AST node.

/// Parameters passed to `TryFrom` implementations for AST nodes.
pub type TryFromParams<'from> = (
    &'from Node<'from>,         // Tree-sitter node
    &'from dyn salsa::Database, // Salsa database
    &'from mut Builder,         // AST builder
    usize,                      // Node ID (auto-incremented by the builder)
    Option<usize>,              // Optional parent node ID
);

Example: Building a root node

// Create the AST builder
let mut builder = auto_lsp::core::ast::Builder::default();

// Build the root node from the Tree-sitter parse tree
let root = ast::generated::SourceFile::try_from((
    &tree.root_node(),
    db,            // Your salsa database
    &mut builder,
    0,             // Root node ID
    None,          // Root has no parent
))?;

// Retrieve all non-root nodes from the builder
let mut nodes = builder.take_nodes();

// Add the root node manually
nodes.push(std::sync::Arc::new(root));

// Optional: Sort the nodes by ID
nodes.sort_unstable();

Retrieving Errors

Errors that occur during AST construction are accumulated using the ParseErrorAccumulator struct. This allows partial AST construction even when some nodes fail to parse.

It’s recommended to use TryFrom inside a salsa::tracked function so you can retrieve errors using salsa::accumulated.

The default crate provides a get_ast query that builds the AST and collects errors. It is compatible with BaseDatabase.

ParseError Structure

The ParseError enum represents errors that can be encountered during parsing.

#[derive(Error, Clone, Debug, PartialEq, Eq)]
pub enum ParseError {
    #[error("{error:?}")]
    LexerError {
        range: lsp_types::Range,
        #[source]
        error: LexerError,
    },
    #[error("{error:?}")]
    AstError {
        range: lsp_types::Range,
        #[source]
        error: AstError,
    },
}
  • LexerError — Issues from Tree-sitter's lexer
  • AstError — Issues from a TryFrom implementation

You can retrieve lexer errors via get_tree_sitter_errors() from the default crate.

LexerError can either be a missing symbol error or a syntax error.

#[derive(Error, Clone, Debug, PartialEq, Eq)]
pub enum LexerError {
    #[error("{error:?}")]
    Missing {
        range: lsp_types::Range,
        error: String,
    },
    #[error("{error:?}")]
    Syntax {
        range: lsp_types::Range,
        error: String,
    },
}

ParsedAst struct

The result of get_ast is a ParsedAst struct, which holds the list of AST nodes and implements Deref for direct iteration.

pub struct ParsedAst {
    pub nodes: Arc<Vec<Arc<dyn AstNode>>>,
}

You can work with AST nodes in two ways:

  • Downcast a node to a concrete type and access its fields.
  • Iterate over all nodes and filter or match on their type.

Methods

  • get_root: Returns the root node.
  • descendant_at: Returns the first node that contains the given offset.
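Since the node list is sorted by position, descendant_at can be pictured as a scan for the first node whose range contains the offset. The following is a hypothetical simplification, not the real implementation:

```rust
// Hypothetical simplification of `descendant_at`: return the first node
// whose byte range contains the given offset.
struct Node { start: usize, end: usize } // stand-in for Arc<dyn AstNode>

fn descendant_at(nodes: &[Node], offset: usize) -> Option<&Node> {
    nodes.iter().find(|n| n.start <= offset && offset < n.end)
}

fn main() {
    // Nodes sorted by position; the root spans the whole file.
    let nodes = vec![Node { start: 0, end: 10 }, Node { start: 2, end: 5 }];
    assert!(descendant_at(&nodes, 3).is_some());
    assert!(descendant_at(&nodes, 42).is_none());
}
```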

Example: Filtering nodes by type

let functions: Vec<_> = get_ast(db, file)
    .iter()
    .filter(|node| node.is::<FunctionDefinition>())
    .collect();

For convenience when calling methods on multiple node types, use the dispatch or dispatch_once macros. See Dispatch Pattern.

Dispatch

When working with the AST, you can either:

  • Manually walk the tree through concrete node types.
  • Iterate over node lists.

To make traversal easier, auto_lsp provides two macros: dispatch_once! and dispatch!, which call methods on nodes matching a given type.

dispatch_once

Calls the method on the first node that matches one of the specified types and returns early.

use ast::generated::{FunctionDefinition, ClassDefinition};
use auto_lsp::dispatch_once;

dispatch_once!(node.lower(), [
    FunctionDefinition => return_something(db, param),
    ClassDefinition => return_something(db, param)
]);
Ok(None)

dispatch

Calls the method on all matching node types.

use ast::generated::{FunctionDefinition, ClassDefinition};
use auto_lsp::dispatch;

dispatch!(node.lower(), [
    FunctionDefinition => build_something(db, param),
    ClassDefinition => build_something(db, param)
]);
Ok(())

Lower Method

The .lower() method retrieves the lowest-level (most concrete) AST node for a given input.

This avoids matching on enum variants by directly returning the most specific node type.

It behaves similarly to enum_dispatch, but instead of returning a concrete type, it returns a &dyn AstNode.

Note

lower() always returns the most specific variant. If an enum wraps another enum, lower() will recursively unwrap to reach the innermost node.
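The recursive unwrapping can be sketched with hand-written enums (the types and the hand-rolled lower methods below are illustrative, not generated code):

```rust
// Sketch: if an enum wraps another enum, `lower()` keeps descending
// until it reaches the innermost concrete node.
trait AstNode { fn kind_name(&self) -> &'static str; }

struct PassStatement;
impl AstNode for PassStatement {
    fn kind_name(&self) -> &'static str { "pass_statement" }
}

enum SimpleStatement { Pass(PassStatement) }
enum Statement { Simple(SimpleStatement) }

impl SimpleStatement {
    fn lower(&self) -> &dyn AstNode {
        match self {
            // Innermost concrete node reached: return it.
            SimpleStatement::Pass(p) => p,
        }
    }
}

impl Statement {
    fn lower(&self) -> &dyn AstNode {
        match self {
            // Nested enum: recurse instead of returning the variant.
            Statement::Simple(s) => s.lower(),
        }
    }
}

fn main() {
    let stmt = Statement::Simple(SimpleStatement::Pass(PassStatement));
    assert_eq!(stmt.lower().kind_name(), "pass_statement");
}
```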

Example: dispatch_once! in a Hover Request

// Request for hover
pub fn hover(db: &impl BaseDatabase, params: HoverParams) -> anyhow::Result<Option<Hover>> {
    // Get the file in DB
    let uri = &params.text_document_position_params.text_document.uri;

    let file = db
        .get_file(uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;

    let document = file.document(db);

    // Find the node at the position using `offset_at` method
    // Note that we could also iterate over the AST to find the node
    let offset = document
        .offset_at(params.text_document_position_params.position)
        .ok_or_else(|| {
            anyhow::format_err!(
                "Invalid position, {:?}",
                params.text_document_position_params.position
            )
        })?;

    // Get the node at the given offset
    if let Some(node) = get_ast(db, file).descendant_at(offset) {
        // Call the `get_hover` method on the node if it matches the type.
        dispatch_once!(node.lower(), [
            PassStatement => get_hover(db, file),
            Identifier => get_hover(db, file)
        ]);
    }
    Ok(None)
}

// Implementation of the `get_hover` method for `PassStatement` and `Identifier`

impl PassStatement {
    fn get_hover(
        &self,
        _db: &impl BaseDatabase,
        _file: File,
    ) -> anyhow::Result<Option<lsp_types::Hover>> {
        Ok(Some(lsp_types::Hover {
            contents: lsp_types::HoverContents::Markup(lsp_types::MarkupContent {
                kind: lsp_types::MarkupKind::Markdown,
                value: r#"This is a pass statement

[See python doc](https://docs.python.org/3/reference/simple_stmts.html#the-pass-statement)"#
                    .into(),
            }),
            range: None,
        }))
    }
}

impl Identifier {
    fn get_hover(
        &self,
        db: &impl BaseDatabase,
        file: File,
    ) -> anyhow::Result<Option<lsp_types::Hover>> {
        let doc = file.document(db);
        Ok(Some(lsp_types::Hover {
            contents: lsp_types::HoverContents::Markup(lsp_types::MarkupContent {
                kind: lsp_types::MarkupKind::PlainText,
                value: format!("hover {}", self.get_text(doc.texter.text.as_bytes())?),
            }),
            range: None,
        }))
    }
}

Tree-sitter queries

Document gives you access to the tree_sitter::Tree via the tree field.

From there, you can run any query you want instead of using the AST.

Example: Folding ranges in Python

// from: https://github.com/nvim-treesitter/nvim-treesitter/blob/master/queries/python/folds.scm
static FOLD: &str = r#"
[
  (function_definition)
  (class_definition)
  (while_statement)
  (for_statement)
  (if_statement)
  (with_statement)
  (try_statement)
  (match_statement)
  (import_from_statement)
  (parameters)
  (argument_list)
  (parenthesized_expression)
  (generator_expression)
  (list_comprehension)
  (set_comprehension)
  (dictionary_comprehension)
  (tuple)
  (list)
  (set)
  (dictionary)
  (string)
] @fold

(comment) @fold.comment

[
  (import_statement)
  (import_from_statement)
]+ @fold"#;

// Precompile the query
pub static FOLD_QUERY: LazyLock<tree_sitter::Query> = LazyLock::new(|| {
    tree_sitter::Query::new(&tree_sitter_python::LANGUAGE.into(), FOLD)
        .expect("Failed to create fold query")
});

/// Request for folding ranges
pub fn folding_ranges(
    db: &impl BaseDatabase,
    params: FoldingRangeParams,
) -> anyhow::Result<Option<Vec<FoldingRange>>> {
    // Get the file in DB
    let uri = params.text_document.uri;

    let file = db
        .get_file(&uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;

    let document = file.document(db);

    let root_node = document.tree.root_node();
    let source = document.texter.text.as_str();

    // Creates a new query cursor
    let mut query_cursor = tree_sitter::QueryCursor::new();
    let mut captures = query_cursor.captures(&FOLD_QUERY, root_node, source.as_bytes());

    let mut ranges = vec![];

    // Iterate over the captures
    while let Some((m, capture_index)) = captures.next() {
        let capture = m.captures[*capture_index];
        let kind = match FOLD_QUERY.capture_names()[capture.index as usize] {
            "fold.comment" => FoldingRangeKind::Comment,
            _ => FoldingRangeKind::Region,
        };
        let range = capture.node.range();
        ranges.push(FoldingRange {
            start_line: range.start_point.row as u32,
            start_character: Some(range.start_point.column as u32),
            end_line: range.end_point.row as u32,
            end_character: Some(range.end_point.column as u32),
            kind: Some(kind),
            collapsed_text: None,
        });
    }

    Ok(Some(ranges))
}

Range Requests

Some LSP requests, such as semantic tokens, support ranges, meaning the client can request information for a specific range of the document instead of the whole document.

To support this, you can use the get_ast method from the default crate to get the AST of a file.

Since the nodes are sorted by position, it is possible to iterate over the AST and perform operations only on a portion of the AST that contains the range.

Example: Semantic tokens for a range

pub fn semantic_tokens_range(
    db: &impl BaseDatabase,
    params: SemanticTokensRangeParams,
) -> anyhow::Result<Option<SemanticTokensResult>> {
    // Get the file in DB
    let uri = params.text_document.uri;

    let file = db
        .get_file(&uri)
        .ok_or_else(|| anyhow::format_err!("File not found in workspace"))?;

    let mut builder = SemanticTokensBuilder::new("".into());

    // Iterate over the AST
    for node in get_ast(db, file).iter() {
        // Skip nodes that are before the range
        if node.get_lsp_range().end <= params.range.start {
            continue;
        }
        // Stop at nodes that are after the range
        if node.get_lsp_range().start >= params.range.end {
            break;
        }
        // Dispatch on the node
        dispatch!(node.lower(),
            [
                FunctionDefinition => build_semantic_tokens(db, file, &mut builder)
            ]
        );
    }

    Ok(Some(SemanticTokensResult::Tokens(builder.build())))
}

Default Database

The default crate provides default components for managing source files and a database in the db module:

  • BaseDb: A basic database implementation using Salsa.
  • File: A struct representing a source file.
  • BaseDatabase: A trait for file retrieval.
  • FileManager: A trait for file updates.

These components are designed to cover common use cases but can be extended or replaced to suit your project’s specific needs.

File struct

The File struct is a salsa::input representing a source file, its Url, and a reference to its parser configuration.

#[salsa::input]
pub struct File {
    #[id]
    pub url: Url,
    pub parsers: &'static Parsers,
    #[return_ref]
    pub document: Arc<Document>,
}

BaseDb struct

BaseDb is the default implementation of a database. It stores File inputs.

#[salsa::db]
#[derive(Default, Clone)]
pub struct BaseDb {
    storage: Storage<Self>,
    pub(crate) files: DashMap<Url, File>,
}

To enable logging, use the with_logger method to initialize the database with a logger closure.

BaseDatabase trait

The BaseDatabase trait defines how to retrieve stored files and is meant to be implemented by any Salsa-compatible database.

#[salsa::db]
pub trait BaseDatabase: Database {
    fn get_files(&self) -> &DashMap<Url, File>;

    fn get_file(&self, url: &Url) -> Option<File> {
        self.get_files().get(url).map(|file| *file)
    }
}

FileManager trait

The FileManager trait provides high-level methods to manage files (add, update, remove). It is implemented for any type that also implements BaseDatabase.

pub trait FileManager: BaseDatabase + salsa::Database {
    fn add_file_from_texter(
        &mut self,
        parsers: &'static Parsers,
        url: &Url,
        texter: Text,
    ) -> Result<(), DataBaseError>;

    fn update(
        &mut self,
        url: &Url,
        changes: &[lsp_types::TextDocumentContentChangeEvent],
    ) -> Result<(), DataBaseError>;

    fn remove_file(&mut self, url: &Url) -> Result<(), DataBaseError>;
}

Document

Acknowledgement

Thanks to the texter crate, text in any encoding is supported.

texter also provides an efficient way to update documents incrementally.

The Document struct has the following fields:

  • texter: a texter struct that stores the document.
  • tree: The tree-sitter syntax tree.

Creating a document

Document can be created using either the from_utf8 or from_texter methods of FileManager.

Updating a document

The database supports updating a document using the update method of FileManager.

update takes two parameters: the URL of the file and the list of TextDocumentContentChangeEvents.

These changes are sent by the client when the document is modified.

registry.on_mut::<DidChangeTextDocument, _>(|session, params| {
    Ok(session.db.update(&params.text_document.uri, &params.content_changes)?)
})

update may return a DataBaseError if the update fails.

Configuring Parsers

To inform the server about which file extensions are associated with a parser, you need to use the configure_parsers! macro.

configure_parsers! takes the name of the list as its first argument; each subsequent entry is a parser configuration.

A parser requires the following information:

  • A tree-sitter language fn.
  • The AST root node (often Module, Document, SourceFile nodes ...).

Example with python

configure_parsers!(
    PYTHON_PARSERS,
    "python" => {
        language: tree_sitter_python::LANGUAGE,
        ast_root: ast::generated::Module // generated by auto_lsp_codegen
    }
);

LSP Server

auto_lsp uses lsp_server from rust-analyzer and crossbeam to launch the server.

Note

LSP Server is only available in the lsp_server feature.

Global State

The server's global state is managed by a Session.

Configuring a server

Pre-requisites

The server is generic over a salsa::Database, so you need to implement a database before starting the server.

You can use the default BaseDb database provided by auto_lsp or create your own.

The default module in server contains file storage, file event handlers, and workspace loading logic compatible with the BaseDatabase trait.

If you create your own database, you will have to create your own file storage and file event handlers.

Configuring

To configure a server, use the create method of the Session struct. It takes an InitOptions value (which bundles the parsers, capabilities, and server info), the connection, and the database:

  • parsers: A list of parsers (previously defined with the configure_parsers! macro).
  • capabilities: Server capabilities, see ServerCapabilities.
  • server_info: Optional information about the server, such as its name and version, see ServerInfo.
  • connection: The connection to the client, see Connection.
  • db: The database to use; it must implement salsa::Database.

create will return a tuple containing the Session and the InitializeParams sent by the client.

The server communicates with an LSP client using one of lsp_server's transport methods: stdio, tcp or memory.

use std::error::Error;
use auto_lsp::server::{
    InitOptions, NotificationRegistry, RequestRegistry, ServerCapabilities, Session,
};
use ast_python::db::PYTHON_PARSERS;

fn main() -> Result<(), Box<dyn Error + Sync + Send>> {
    // Enable logging and tracing, this is optional
    stderrlog::new()
        .modules([module_path!(), "auto_lsp"])
        .verbosity(4)
        .init()
        .unwrap();

    fastrace::set_reporter(ConsoleReporter, Config::default());

    // Server options
    let options = InitOptions {
        parsers: &PYTHON_PARSERS,
        capabilities: ServerCapabilities {
            ..Default::default()
        },
        server_info: None,
    };
    // Create the connection
    let (connection, io_threads) = Connection::stdio();
    // Create a database, either BaseDb or your own
    let db = BaseDb::default();

    // Create the session
    let (mut session, params) = Session::create(
        options,
        connection,
        db,
    )?;

    // This is where you register your requests and notifications
    // See the handlers section for more information
    let mut request_registry = RequestRegistry::<BaseDb>::default();
    let mut notification_registry = NotificationRegistry::<BaseDb>::default();

    // This will add all files available in the workspace.
    // The init_workspace is only available for databases that implement BaseDatabase or BaseDb
    let init_results = session.init_workspace(params)?;
    if !init_results.is_empty() {
        init_results.into_iter().for_each(|result| {
            if let Err(err) = result {
                eprintln!("{}", err);
            }
        });
    };

    // Run the server and wait for the two threads to end
    // (typically triggered by the LSP Exit notification).
    session.main_loop(
        &mut request_registry,
        &mut notification_registry,
    )?;
    io_threads.join()?;

    // Shut down gracefully.
    eprintln!("Shutting down server");
    Ok(())
}

Default Capabilities

auto_lsp provides helper methods to configure some default capabilities.

Semantic Tokens

The semantic_tokens_provider method will configure the semantic_tokens_provider field of the ServerCapabilities struct.

Parameters:

  • range: Whether the server supports semantic token requests for a specific range of a document.
  • token_types: The list of token types that the server supports.
  • token_modifiers: The list of token modifiers that the server supports.

use auto_lsp::server::semantic_tokens_provider;

let capabilities = ServerCapabilities {
    semantic_tokens_provider: semantic_tokens_provider(false, Some(SUPPORTED_TYPES), Some(SUPPORTED_MODIFIERS)),
    ..Default::default()
};

Note

Except for semantic tokens, these default capabilities are only available if you use the BaseDb database.

Text Document Sync

Since the Document supports incremental updates, the text_document_sync field of the ServerCapabilities struct is configured to INCREMENTAL by default.

You can use the TEXT_DOCUMENT_SYNC constant to configure it.

use auto_lsp::server::TEXT_DOCUMENT_SYNC;

let capabilities = ServerCapabilities {
    text_document_sync: TEXT_DOCUMENT_SYNC.clone(),
    ..Default::default()
};

This is required for the default open_text_document handler to work.

Workspace Provider

The WORKSPACE_PROVIDER constant will configure the workspace field of the ServerCapabilities struct.

use auto_lsp::server::WORKSPACE_PROVIDER;

let capabilities = ServerCapabilities {
    workspace: WORKSPACE_PROVIDER.clone(),
    ..Default::default()
};

This is required for the default changed_watched_files handler to work.

Workspace initialization

When using BaseDb as the database, the init_workspace method will load all files in the workspace and associate them with a parser.

It will also send diagnostics for all files.

If you want to customize this behavior, you can implement your own init_workspace method and call it instead of the default one.

use auto_lsp::server::Session;

let (mut session, params) = Session::create(
    options,
    connection,
    db,
)?;

// This will add all files available in the workspace.
let init_results = my_init_workspace(&mut session, params)?;
if !init_results.is_empty() {
    init_results.into_iter().for_each(|result| {
        if let Err(err) = result {
            eprintln!("{}", err);
        }
    });
}

Handlers

All LSP requests and notifications must be registered before calling main_loop.

Handlers are registered using the RequestRegistry and NotificationRegistry structs. Both store handlers in internal HashMaps, using the method name as the key. When a request or notification is received, the corresponding handler is looked up and invoked based on the method name.

Handler callbacks receive two parameters:

  • session: The global state of the server.
  • parameters: The request or notification parameters.

Both registries implement Default, but require a salsa::Database type parameter.

Adding Handlers

The RequestRegistry and NotificationRegistry structs provide two methods to register a handler:

  • .on: Executes the handler in a separate thread. This is cancelable.
  • .on_mut: Executes the handler synchronously with mutable access to the session.

use capabilities::handle_document_symbols;
use capabilities::handle_folding_ranges;
use capabilities::handle_watched_files;
use auto_lsp::lsp_types::notification::DidChangeWatchedFiles;
use auto_lsp::lsp_types::request::{DocumentSymbolRequest, FoldingRangeRequest};
use auto_lsp::server::{NotificationRegistry, RequestRegistry};

fn main() -> Result<(), Box<dyn Error + Sync + Send>> {
    /* ... */

    let mut request_registry = RequestRegistry::<BaseDb>::default();
    let mut notification_registry = NotificationRegistry::<BaseDb>::default();

    request_registry
        // read only, will be executed in a separate thread
        .on::<DocumentSymbolRequest, _>(handle_document_symbols)
        .on::<FoldingRangeRequest, _>(handle_folding_ranges);

    notification_registry
        // mutable because we need to update the database
        .on_mut::<DidChangeWatchedFiles, _>(handle_watched_files);

    /* ... */
}

Custom Request

You can define your own request types by implementing the Request trait from lsp_types.

pub struct GetWorkspaceFilesUris {}

impl Request for GetWorkspaceFilesUris {
    type Params = (); // Parameters for the request
    type Result = Vec<String>; // Expected response type
    const METHOD: &'static str = "custom/GetWorkspaceFilesUris"; // Method name used in the request
}

Similarly, to define a custom notification, implement the Notification trait instead of Request.
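To make that concrete, here is a self-contained sketch of a custom notification. The Notification trait body is reproduced locally so the example compiles on its own (the real trait lives in lsp_types and has the same shape); WorkspaceFilesChanged and its parameter type are hypothetical:

```rust
// Trimmed-down local model of lsp_types' Notification trait (illustrative only).
trait Notification {
    type Params;
    const METHOD: &'static str;
}

pub struct WorkspaceFilesChanged {}

impl Notification for WorkspaceFilesChanged {
    type Params = Vec<String>; // URIs of changed files (hypothetical)
    const METHOD: &'static str = "custom/WorkspaceFilesChanged";
}

fn main() {
    assert_eq!(WorkspaceFilesChanged::METHOD, "custom/WorkspaceFilesChanged");
}
```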

Default Handlers

The default crate provides handlers for several LSP requests and notifications.

Note

These default handlers are available only if you use the BaseDb database.

changed_watched_files

This notification is handled by the changed_watched_files function in the default crate. It updates files in the workspace when external changes are detected.

To enable this handler, use the WORKSPACE_PROVIDER constant when configuring the server capabilities.

open_text_document

This notification is handled by the open_text_document function in the default module. It ensures that the file is added to the workspace if not already present.

To enable this handler, use the TEXT_DOCUMENT_SYNC constant during server capabilities configuration.

Configuring Semantic Tokens

To configure semantic tokens, you need to use the define_semantic_token_types and define_semantic_token_modifiers macros.

Token Types

use auto_lsp::define_semantic_token_types;

define_semantic_token_types![
    standard {
         "namespace" => NAMESPACE,
         "type" => TYPE,
         "function" => FUNCTION,
    }
    
    custom {
        "custom" => CUSTOM,
    }
];

This macro generates two components to streamline working with semantic token types:

  1. Constants: Creates a constant for each standard and custom token type.
  2. Supported Token Types: Generates a slice (SUPPORTED_TYPES) containing all supported token types, which can be reused to inform the LSP client about the available tokens.
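The index of a token type within SUPPORTED_TYPES is what is ultimately reported to the client. A stdlib-only sketch of that lookup, using a hypothetical stand-in for the macro-generated slice:

```rust
// Hypothetical stand-in for the slice generated by define_semantic_token_types!.
const SUPPORTED_TYPES: &[&str] = &["namespace", "type", "function", "custom"];

// The position of a token type in SUPPORTED_TYPES is the value sent to the client.
fn token_type_index(name: &str) -> Option<u32> {
    SUPPORTED_TYPES.iter().position(|t| *t == name).map(|i| i as u32)
}

fn main() {
    assert_eq!(token_type_index("function"), Some(2));
    assert_eq!(token_type_index("unknown"), None);
}
```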

Token Modifiers

use auto_lsp::define_semantic_token_modifiers;
define_semantic_token_modifiers![
    standard {
        DOCUMENTATION,
        DECLARATION,
    }

    custom {
        (READONLY, "readonly"),
        (STATIC, "static"),
    }
];

This generates:

  • Constants for standard (DOCUMENTATION, DECLARATION) and custom (READONLY, STATIC) modifiers.
  • A SUPPORTED_MODIFIERS slice that includes both standard and custom modifiers.
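In the LSP wire format, several modifiers can apply to one token at once, encoded as a bit set over the declared modifier list. A stdlib-only sketch of that encoding (the slice here is a hypothetical stand-in for the macro-generated SUPPORTED_MODIFIERS):

```rust
// Hypothetical stand-in for the slice generated by define_semantic_token_modifiers!.
const SUPPORTED_MODIFIERS: &[&str] = &["documentation", "declaration", "readonly", "static"];

// Combine active modifiers into a bit set: bit i corresponds to SUPPORTED_MODIFIERS[i].
fn modifier_bitset(active: &[&str]) -> u32 {
    active.iter().fold(0, |bits, m| {
        match SUPPORTED_MODIFIERS.iter().position(|s| s == m) {
            Some(i) => bits | (1 << i),
            None => bits,
        }
    })
}

fn main() {
    // declaration (bit 1) + readonly (bit 2) = 0b110
    assert_eq!(modifier_bitset(&["declaration", "readonly"]), 0b110);
}
```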

Example in AST

use auto_lsp::core::semantic_tokens_builder::SemanticTokensBuilder;

define_semantic_token_types![
    standard {
        FUNCTION,
    }

    custom {}
];

define_semantic_token_modifiers![
    standard {
        DECLARATION,
    }

    custom {}
];

impl MyType {
    fn build_semantic_tokens(&self, builder: &mut SemanticTokensBuilder) {
        builder.push(
            self.name.get_lsp_range(),
            SUPPORTED_TYPES.iter().position(|x| *x == FUNCTION).unwrap() as u32,
            SUPPORTED_MODIFIERS.iter().position(|x| *x == DECLARATION).unwrap() as u32,
        );
    }
}

Configuring a client

File extensions

The LSP server must know which parser is associated with each file extension.

The client is responsible for sending this information to the server.

With the VS Code LSP client, this is done by providing a perFileParser object in the initializationOptions of LanguageClientOptions.

import { LanguageClient, LanguageClientOptions, ServerOptions, RequestType } from 'vscode-languageclient/node';

// We tell the server that .py files are associated with the python parser defined via the configure_parsers! macro.
const initializationOptions = {
    perFileParser: {
        "py": "python"
    }
};

const clientOptions: LanguageClientOptions = {
    documentSelector: [{ language: 'python' }],
    synchronize: {
        fileEvents: workspace.createFileSystemWatcher('**/*.py')
    },
    outputChannel: channel,
    uriConverters: createUriConverters(),
    initializationOptions
};

Logging and Tracing

auto_lsp uses fastrace and log for tracing and logging.

Logging

To enable logging, you can use any logger that implements the log crate.

For example, you can use stderrlog to log to stderr.

stderrlog::new()
    .modules([module_path!(), "auto_lsp"])
    .verbosity(4)
    .init()
    .unwrap();

Tracing

To enable tracing, follow the instructions in the fastrace documentation.

Snapshots

Since all AST nodes implement the Debug trait, it is very easy to take snapshots of them.

Therefore, you can use any snapshot testing library such as insta or expect_test.

Note

Since some parsing errors might be silent, it is recommended to write a utility function or macro that creates a Database and calls get_ast with an accumulator to check for errors.