LSP semantic tokens handler.
Implements textDocument/semanticTokens/full by lexing the document with
mz_sql_lexer::lexer::lex and mapping each token to a standard LSP
token type. Comments (discarded by the lexer) are recovered via a
separate pre-scan that is aware of strings and quoted identifiers.
The output is delta-encoded per LSP 3.16: tokens are sorted by byte
offset, split across line boundaries (LSP tokens are line-local), and
serialized as a flat [deltaLine, deltaStartChar, length, tokenType, 0]
sequence.
Legend indices must match the order declared in the server's
SemanticTokensLegend (see legend_token_types).
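The delta encoding described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the `LineTokenSketch` struct, its field names, and `encode_deltas_sketch` are hypothetical, and the input is assumed to be already sorted by line and then by start column.

```rust
/// Hypothetical line-local token; field names are assumptions for illustration.
#[derive(Debug, Clone, Copy)]
struct LineTokenSketch {
    line: u32,       // zero-based line number
    start_char: u32, // UTF-16 column of the token start
    length: u32,     // token length in UTF-16 code units
    token_type: u32, // index into the server's legend
}

/// Flatten sorted tokens into [deltaLine, deltaStartChar, length, tokenType, 0] groups.
fn encode_deltas_sketch(tokens: &[LineTokenSketch]) -> Vec<u32> {
    let mut data = Vec::with_capacity(tokens.len() * 5);
    let mut prev_line = 0u32;
    let mut prev_start = 0u32;
    for t in tokens {
        let delta_line = t.line - prev_line;
        // The start column is relative to the previous token only when
        // both tokens sit on the same line; otherwise it is absolute.
        let delta_start = if delta_line == 0 {
            t.start_char - prev_start
        } else {
            t.start_char
        };
        // The trailing 0 is the (empty) token-modifier bitset.
        data.extend_from_slice(&[delta_line, delta_start, t.length, t.token_type, 0]);
        prev_line = t.line;
        prev_start = t.start_char;
    }
    data
}
```

Because each entry is relative to the previous one, the input must be sorted first; unsorted input would underflow the subtractions in this sketch.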
Structs
- LineToken 🔒
  A line-local semantic token, after multi-line splitting.
- RawSpan 🔒
  Byte-offset span with an associated semantic token type.
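To illustrate how a byte-offset span might become line-local tokens, here is a rough sketch. The struct layouts, field names, and the `_sketch` function names are assumptions made for this example; the real `RawSpan` and `LineToken` definitions are not shown on this page.

```rust
/// Hypothetical byte-offset span (field names assumed).
struct RawSpanSketch { start: usize, end: usize, token_type: u32 }

/// Hypothetical line-local token (field names assumed).
#[derive(Debug, PartialEq)]
struct LineTokenSketch { line: u32, start_char: u32, length: u32, token_type: u32 }

/// Byte offsets at which each line begins; line 0 starts at offset 0.
fn line_starts_sketch(text: &str) -> Vec<usize> {
    let mut starts = vec![0];
    for (i, b) in text.bytes().enumerate() {
        if b == b'\n' { starts.push(i + 1); }
    }
    starts
}

/// Binary search for the line containing `offset`.
fn line_for_offset_sketch(starts: &[usize], offset: usize) -> usize {
    // Count of line starts <= offset, minus one, is the line index.
    starts.partition_point(|&s| s <= offset) - 1
}

/// UTF-16 code units in `s`, with an ASCII fast path.
fn utf16_len_sketch(s: &str) -> usize {
    if s.is_ascii() { s.len() } else { s.chars().map(char::len_utf16).sum() }
}

/// Clamp a usize into the u32 width used on the wire.
fn saturating_u32_sketch(n: usize) -> u32 {
    u32::try_from(n).unwrap_or(u32::MAX)
}

/// Drop one trailing "\r\n" or "\n" from a segment.
fn trim_trailing_newline_sketch(s: &str) -> &str {
    s.strip_suffix("\r\n").or_else(|| s.strip_suffix('\n')).unwrap_or(s)
}

/// Split one span at line boundaries, measuring columns in UTF-16 units.
fn split_span_sketch(text: &str, starts: &[usize], span: &RawSpanSketch) -> Vec<LineTokenSketch> {
    let mut out = Vec::new();
    let mut pos = span.start;
    while pos < span.end {
        let line = line_for_offset_sketch(starts, pos);
        let line_end = starts.get(line + 1).copied().unwrap_or(text.len());
        let seg_end = span.end.min(line_end);
        let seg = trim_trailing_newline_sketch(&text[pos..seg_end]);
        if !seg.is_empty() {
            out.push(LineTokenSketch {
                line: saturating_u32_sketch(line),
                start_char: saturating_u32_sketch(utf16_len_sketch(&text[starts[line]..pos])),
                length: saturating_u32_sketch(utf16_len_sketch(seg)),
                token_type: span.token_type,
            });
        }
        pos = seg_end;
    }
    out
}
```

The sketch assumes span offsets fall on UTF-8 character boundaries, which holds when they come from a lexer operating on the same string.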
Constants
- TOKEN_TYPE_COMMENT 🔒
- TOKEN_TYPE_KEYWORD 🔒
- TOKEN_TYPE_NUMBER 🔒
- TOKEN_TYPE_OPERATOR 🔒
- TOKEN_TYPE_PARAMETER 🔒
- TOKEN_TYPE_STRING 🔒
- TOKEN_TYPE_VARIABLE 🔒
Functions
- collect_comments 🔒
  Pre-scan raw text for -- line comments and /* */ block comments. String bodies and quoted-identifier bodies are skipped so that comment markers inside them are not misidentified.
- compute_semantic_tokens 🔒
  Computes the semantic tokens for a SQL document.
- encode_deltas 🔒
  Delta-encode line-local tokens per LSP 3.16.
- legend_token_types 🔒
  Token types in the order required for legend indices.
- lex_token_span 🔒
  Map a lexer token to its byte span and semantic type.
- line_for_offset 🔒
  Binary search for the line containing offset.
- line_starts 🔒
  Byte offsets of the start of each line (including line 0 at offset 0).
- saturating_u32 🔒
  Convert a usize (line/column/length in the document) into the u32 width required by the LSP semantic-token wire format. No-op on values below u32::MAX; saturates otherwise. LSP positions are specified to be u32, so any document large enough to saturate is already unrepresentable.
- scan_dollar_quoted_len 🔒
  Length of a $tag$body$tag$ dollar-quoted string. Matches the outer delimiter using its tag (possibly empty).
- scan_hex_string_token_len 🔒
  Length of a hex string token: x'...' or X'...'.
- scan_ident_len 🔒
- scan_parameter_len 🔒
- scan_string_token_len 🔒
  Length of a string token. May be a normal '...' or extended E'...' / e'...' form (the E prefix is part of the token offset).
- skip_double_quoted 🔒
  Skip a "..." quoted-identifier body (with doubled-quote escape).
- skip_single_quoted 🔒
  Skip a '...' string body (with doubled-quote escape). Returns index just past the closing quote, or bytes.len() if unterminated.
- split_across_lines 🔒
  Split each raw span across line boundaries and compute UTF-16 column offsets. Produces line-local tokens, still in byte-order.
- trim_trailing_newline 🔒
  Trim a trailing \n or \r\n from a segment so it doesn't include the line terminator.
- utf16_len 🔒
  Number of UTF-16 code units in s. ASCII-only fast path returns the byte length; non-ASCII walks chars and sums len_utf16.
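The string-aware comment pre-scan can be pictured roughly as below. This is a simplified sketch under stated assumptions, not the module's implementation: the function names and signatures are hypothetical, and it deliberately leaves out dollar-quoted strings, backslash escapes in E'...' strings, and nested block comments.

```rust
/// Skip a quoted body, starting just past the opening quote.
/// A doubled quote is treated as an escaped quote. Returns the index just
/// past the closing quote, or `bytes.len()` if the body is unterminated.
fn skip_quoted_sketch(bytes: &[u8], mut i: usize, quote: u8) -> usize {
    while i < bytes.len() {
        if bytes[i] == quote {
            if bytes.get(i + 1) == Some(&quote) {
                i += 2; // doubled quote: stay inside the body
                continue;
            }
            return i + 1;
        }
        i += 1;
    }
    bytes.len()
}

/// Collect (start, end) byte ranges of `--` line comments and `/* */` block
/// comments, ignoring comment markers inside '...' and "..." bodies.
fn collect_comments_sketch(text: &str) -> Vec<(usize, usize)> {
    let bytes = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\'' => i = skip_quoted_sketch(bytes, i + 1, b'\''),
            b'"' => i = skip_quoted_sketch(bytes, i + 1, b'"'),
            b'-' if bytes.get(i + 1) == Some(&b'-') => {
                let start = i;
                // A line comment runs up to, but not including, the newline.
                while i < bytes.len() && bytes[i] != b'\n' {
                    i += 1;
                }
                out.push((start, i));
            }
            b'/' if bytes.get(i + 1) == Some(&b'*') => {
                let start = i;
                i += 2;
                while i < bytes.len() && !bytes[i..].starts_with(b"*/") {
                    i += 1;
                }
                // Include the closing marker; an unterminated comment
                // extends to the end of the text.
                i = (i + 2).min(bytes.len());
                out.push((start, i));
            }
            _ => i += 1,
        }
    }
    out
}
```

Scanning bytes rather than chars is safe here because every delimiter is ASCII, so the returned offsets always land on UTF-8 character boundaries.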