Module semantic_tokens


LSP semantic tokens handler.

Implements textDocument/semanticTokens/full by lexing the document with mz_sql_lexer::lexer::lex and mapping each token to a standard LSP token type. Comments (discarded by the lexer) are recovered via a separate pre-scan that is aware of strings and quoted identifiers.
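The comment pre-scan described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual `collect_comments`: the names `find_comments` and `skip_quoted` are hypothetical, block comments are assumed not to nest, and dollar-quoted strings and `E'...'` backslash escapes are deliberately left out.

```rust
/// Hypothetical sketch: returns the byte ranges `(start, end)` of comments in
/// `text`, ignoring comment markers inside '...' and "..." bodies.
fn find_comments(text: &str) -> Vec<(usize, usize)> {
    let b = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < b.len() {
        match b[i] {
            // Quoted body: jump past the closing quote.
            q @ (b'\'' | b'"') => i = skip_quoted(b, i, q),
            // Line comment: runs up to (not including) the next '\n'.
            b'-' if b.get(i + 1) == Some(&b'-') => {
                let start = i;
                while i < b.len() && b[i] != b'\n' {
                    i += 1;
                }
                out.push((start, i));
            }
            // Block comment: runs through the first "*/" (assumed non-nesting),
            // or to the end of the input if unterminated.
            b'/' if b.get(i + 1) == Some(&b'*') => {
                let start = i;
                i += 2;
                while i < b.len() && !(b[i] == b'*' && b.get(i + 1) == Some(&b'/')) {
                    i += 1;
                }
                i = (i + 2).min(b.len());
                out.push((start, i));
            }
            _ => i += 1,
        }
    }
    out
}

/// Skip a quoted body that opens at `open`. A doubled quote is treated as an
/// escaped quote. Returns the index just past the closing quote, or `b.len()`
/// if the body is unterminated.
fn skip_quoted(b: &[u8], open: usize, quote: u8) -> usize {
    let mut i = open + 1;
    while i < b.len() {
        if b[i] == quote {
            if b.get(i + 1) == Some(&quote) {
                i += 2;
                continue;
            }
            return i + 1;
        }
        i += 1;
    }
    b.len()
}
```

Every range starts and ends next to an ASCII byte, so slicing `text` with a returned range should always land on a character boundary.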

The output is delta-encoded per LSP 3.16: tokens are sorted by byte offset, split across line boundaries (LSP tokens are line-local), and serialized as a flat [deltaLine, deltaStartChar, length, tokenType, 0] sequence.
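As an illustration of the flat encoding, the sketch below delta-encodes a slice of line-local tokens. The struct `LineTok`, its field names, and the function `delta_encode` are assumptions made for this example and may not match the module's `LineToken` or `encode_deltas`. The input is assumed to be sorted by line and then by start column.

```rust
/// Hypothetical line-local token; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LineTok {
    line: u32,       // zero-based line number
    start: u32,      // start column, in UTF-16 code units
    len: u32,        // length, in UTF-16 code units
    token_type: u32, // index into the server's legend
}

/// Delta-encode tokens that are already sorted by (line, start).
fn delta_encode(tokens: &[LineTok]) -> Vec<u32> {
    let mut out = Vec::with_capacity(tokens.len() * 5);
    let (mut prev_line, mut prev_start) = (0u32, 0u32);
    for t in tokens {
        let delta_line = t.line - prev_line;
        // The start column is relative to the previous token only when both
        // tokens are on the same line; on a new line it is absolute.
        let delta_start = if delta_line == 0 {
            t.start - prev_start
        } else {
            t.start
        };
        // The trailing 0 is the token-modifier bitset, unused here.
        out.extend_from_slice(&[delta_line, delta_start, t.len, t.token_type, 0]);
        prev_line = t.line;
        prev_start = t.start;
    }
    out
}
```

For example, tokens at (line 0, column 0), (line 0, column 7), and (line 2, column 4) produce the position deltas `0,0`, `0,7`, and `2,4`: the second token is relative to the first, while the third resets its column because it starts a new line.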

Legend indices must match the order declared in the server’s SemanticTokensLegend (see legend_token_types).

Structs

LineToken 🔒
A line-local semantic token, after multi-line splitting.
RawSpan 🔒
Byte-offset span with an associated semantic token type.

Constants

TOKEN_TYPE_COMMENT 🔒
TOKEN_TYPE_KEYWORD 🔒
TOKEN_TYPE_NUMBER 🔒
TOKEN_TYPE_OPERATOR 🔒
TOKEN_TYPE_PARAMETER 🔒
TOKEN_TYPE_STRING 🔒
TOKEN_TYPE_VARIABLE 🔒

Functions

collect_comments 🔒
Pre-scan raw text for -- line comments and /* */ block comments. String bodies and quoted-identifier bodies are skipped so that comment markers inside them are not misidentified.
compute_semantic_tokens 🔒
Computes the semantic tokens for a SQL document.
encode_deltas 🔒
Delta-encode line-local tokens per LSP 3.16.
legend_token_types 🔒
Token types in the order required for legend indices.
lex_token_span 🔒
Map a lexer token to its byte span and semantic type.
line_for_offset 🔒
Binary search for the line containing offset.
line_starts 🔒
Byte offsets of the start of each line (including line 0 at offset 0).
saturating_u32 🔒
Convert a usize (line/column/length in the document) into the u32 width required by the LSP semantic-token wire format. No-op on values below u32::MAX; saturates otherwise. LSP positions are specified to be u32, so any document large enough to saturate is already unrepresentable.
scan_dollar_quoted_len 🔒
Length of a $tag$body$tag$ dollar-quoted string. Matches the outer delimiter using its tag (possibly empty).
scan_hex_string_token_len 🔒
Length of a hex string token: x'...' or X'...'.
scan_ident_len 🔒
scan_parameter_len 🔒
scan_string_token_len 🔒
Length of a string token. May be a normal '...' or extended E'...' / e'...' form (the E prefix is part of the token offset).
skip_double_quoted 🔒
Skip a "..." quoted-identifier body (with doubled-quote escape).
skip_single_quoted 🔒
Skip a '...' string body (with doubled-quote escape). Returns index just past the closing quote, or bytes.len() if unterminated.
split_across_lines 🔒
Split each raw span across line boundaries and compute UTF-16 column offsets. Produces line-local tokens, still in byte order.
trim_trailing_newline 🔒
Trim a trailing \n or \r\n from a segment so it doesn’t include the line terminator.
utf16_len 🔒
Number of UTF-16 code units in s. ASCII-only fast path returns the byte length; non-ASCII walks chars and sums len_utf16.
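
The position helpers listed above could fit together roughly as in the sketch below. It reuses the documented names `line_starts`, `line_for_offset`, `utf16_len`, and `saturating_u32`, but the signatures and bodies are assumptions inferred from the one-line descriptions, and `position` is a hypothetical helper added only to show how they combine.

```rust
/// Byte offsets at which each line starts; line 0 always starts at offset 0.
fn line_starts(text: &str) -> Vec<usize> {
    let mut starts = vec![0];
    starts.extend(
        text.bytes()
            .enumerate()
            .filter(|&(_, b)| b == b'\n')
            .map(|(i, _)| i + 1),
    );
    starts
}

/// Index of the line containing `offset`, found by binary search.
fn line_for_offset(starts: &[usize], offset: usize) -> usize {
    // partition_point counts the line starts that are <= offset; the
    // containing line is the last of those.
    starts.partition_point(|&s| s <= offset).saturating_sub(1)
}

/// Number of UTF-16 code units in `s`, with an ASCII fast path.
fn utf16_len(s: &str) -> usize {
    if s.is_ascii() {
        s.len()
    } else {
        s.chars().map(char::len_utf16).sum()
    }
}

/// Clamp a usize into the u32 range used on the wire.
fn saturating_u32(n: usize) -> u32 {
    n.min(u32::MAX as usize) as u32
}

/// Hypothetical helper: UTF-16 (line, column) for a byte offset that lies on
/// a character boundary.
fn position(text: &str, starts: &[usize], offset: usize) -> (u32, u32) {
    let line = line_for_offset(starts, offset);
    let col = utf16_len(&text[starts[line]..offset]);
    (saturating_u32(line), saturating_u32(col))
}
```

The column is measured in UTF-16 code units rather than bytes or `char`s, so a character outside the Basic Multilingual Plane counts as two columns.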