TL;DR
The edge-resolution heuristic tops out at 25-37% precision: homonyms, re-exports, and dynamic imports stay ambiguous. By wiring in rust-analyzer, pyright, gopls, and typescript-language-server over LSP, cartog climbs to 44-81% depending on the language.
To try it out: documentation · source code.
The article Tree-sitter and code graphs for better code navigation explains how cartog builds a directed graph: nodes are symbols (functions, classes, methods), edges are their relations (calls, imports, inheritance). To create an edge, you have to resolve each reference to the right target symbol.
A 6-level cascade (same file > import path > same directory > parent scope > unique global > discrimination by kind) does this fast across an entire project.
flowchart TD
A["validate(data)"] --> H["6-level heuristic<br>file · import · directory<br>scope · unique · kind"]
H -->|"3 homonyms,<br>no level decides"| U["Unresolved"]
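To make the cascade concrete, here is a minimal sketch, not cartog's actual implementation. The `Symbol` type, the `resolve` function, and the rule "a level decides only when it leaves exactly one candidate" are assumptions made for illustration; the parent-scope level is left out because it needs scope information this toy model does not carry.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Symbol:
    id: str
    name: str
    file: str
    kind: str  # "function", "class", "method"


def resolve(name, ref_file, imported_files, symbols, expected_kind="function"):
    """Return the single symbol a level agrees on, or None if no level decides."""
    candidates = [s for s in symbols if s.name == name]
    ref_dir = PurePosixPath(ref_file).parent
    levels = [
        lambda s: s.file == ref_file,                       # same file
        lambda s: s.file in imported_files,                 # import path
        lambda s: PurePosixPath(s.file).parent == ref_dir,  # same directory
        # parent scope: omitted in this sketch
        lambda s: True,                                     # unique global
        lambda s: s.kind == expected_kind,                  # discrimination by kind
    ]
    for keep in levels:
        matches = [s for s in candidates if keep(s)]
        if len(matches) == 1:
            return matches[0]
    return None


# Hypothetical project: three homonyms in the same directory.
SYMBOLS = [
    Symbol("forms.validate", "validate", "app/forms.py", "function"),
    Symbol("api.validate", "validate", "app/api.py", "function"),
    Symbol("models.validate", "validate", "app/models.py", "function"),
]
```

With an explicit import of `app/forms.py`, the second level settles the question. Without it, every level sees either zero or three candidates and the reference stays unresolved.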
Three common failure cases:
Homonyms: several validate() functions in different modules. The "same directory" heuristic finds 3 candidates, with no way to decide.
Re-exports: Python and TypeScript lean heavily on re-exports (__init__.py, index.ts). The import path doesn't point to the real source file but to a barrel file.
Dynamic imports: getattr(module, func_name)() or require(variable), with no static trace to follow.
For queries like cartog impact validate --depth 3, an unresolved edge means a missing caller in the impact analysis.
The result is incomplete, and the agent makes decisions on a partial view of the code.
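A small sketch shows how one unresolved edge hides callers from an impact query. It assumes the impact analysis is a breadth-first walk over reverse call edges, which the article does not spell out; the `impact` function and the symbol names are hypothetical.

```python
from collections import deque


def impact(callers, target, depth):
    """Symbols that call `target` directly or transitively, up to `depth` hops."""
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        symbol, dist = queue.popleft()
        if dist == depth:
            continue
        for caller in callers.get(symbol, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append((caller, dist + 1))
    return seen - {target}


# Reverse edges: callee -> list of callers.
FULL = {
    "validate": ["save", "import_csv"],
    "save": ["handle_post"],
    "import_csv": ["nightly_job"],
}
# Same graph, but the import_csv -> validate call was never resolved.
PARTIAL = {
    "validate": ["save"],
    "save": ["handle_post"],
    "import_csv": ["nightly_job"],
}
```

On `FULL`, `impact(FULL, "validate", 3)` reaches all four callers. On `PARTIAL`, both `import_csv` and `nightly_job` disappear, even though only one edge is missing.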
The problem: we cap at 25-37% precision depending on the language. In other words, out of 100 function calls extracted from the code, only 25 to 37 are correctly linked to their definition.
The rest point into the void or aren't resolved at all.
25% to 37% sounds low… Why?
And yet when my IDE renames a function across the whole project, nothing breaks… How does it do that?
IDEs query a language server: a partial compiler that maintains a complete semantic model of the project (types, scopes, re-exports, signatures).
When you hit "Go to Definition" in VS Code, it's the matching LSP server (rust-analyzer, pyright, gopls, typescript-language-server…) that answers.
The Language Server Protocol is an open standard defined by Microsoft: a single server can power VS Code, Neovim, Helix, Zed…
Rather than rewriting a compiler per language, cartog leans on these existing servers as a source of truth to resolve its edges.
sequenceDiagram
participant C as Cartog
participant LSP as Language Server
C->>LSP: initialize(rootUri)
LSP-->>C: capabilities
Note over C: For each unresolved edge...
C->>LSP: textDocument/definition(file, line, col)
LSP-->>C: targetFile:targetLine
Note over C: Look up symbol at this position
C->>C: Resolve edge → target_id
C->>LSP: shutdown
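On the wire, each message in the exchange above is a JSON-RPC 2.0 payload preceded by a `Content-Length` header. The sketch below builds a `textDocument/definition` request by hand; the helper names `frame` and `definition_request` and the example URI are illustrative, not taken from cartog.

```python
import json


def frame(message: dict) -> bytes:
    """Wrap a JSON-RPC message in the LSP base-protocol header."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def definition_request(request_id, uri, line, character):
    """Build a textDocument/definition request (positions are zero-based)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }


# Hypothetical call site: line 42, column 9 in an editor is (41, 8) in LSP.
payload = frame(definition_request(1, "file:///project/app/views.py", 41, 8))
```

Note that LSP lines and characters start at 0, while most editors and tree-sitter row displays start at 1, so an off-by-one here silently points the server at the wrong token.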
LSP is an optional feature enabled by default when the binary is compiled. Cartog stays fully functional without it: the heuristic already covers the majority of cases.
LSP kicks in only on the edges the heuristic couldn't resolve. It complements the heuristic rather than replacing it, so there's no degradation if the server is slow or fails to find the definition.
On startup, cartog detects the language servers available on the PATH:
| Language | Server | Detected command |
|---|---|---|
| Rust | rust-analyzer | rust-analyzer |
| Python | pyright | pyright-langserver |
| TypeScript/JS | typescript-language-server | typescript-language-server |
| Go | gopls | gopls |
| Ruby | ruby-lsp / solargraph | ruby-lsp, solargraph |
| Java | jdtls | jdtls |
| PHP | intelephense / phpactor | intelephense, phpactor |
If no server is found for a language, cartog quietly continues with the heuristic alone. No error, no forced dependency.
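Detection of this kind can be sketched with a PATH lookup per candidate command. The command names come from the table above; the `detect_servers` function and the choice to take the first candidate found in table order are assumptions for illustration.

```python
import shutil

# Candidate commands per language, in the order listed in the table.
SERVER_COMMANDS = {
    "rust": ["rust-analyzer"],
    "python": ["pyright-langserver"],
    "typescript": ["typescript-language-server"],
    "go": ["gopls"],
    "ruby": ["ruby-lsp", "solargraph"],
    "java": ["jdtls"],
    "php": ["intelephense", "phpactor"],
}


def detect_servers(commands=SERVER_COMMANDS, which=shutil.which):
    """Map each language to the first server executable found on the PATH."""
    found = {}
    for language, candidates in commands.items():
        for command in candidates:
            path = which(command)
            if path is not None:
                found[language] = path
                break
    return found
```

Languages without a server are simply absent from the result, which matches the "no error, no forced dependency" behavior: the caller only needs to check membership before attempting LSP resolution.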
flowchart TD
IDX["Indexing<br>tree-sitter"] --> HEUR["Heuristic resolution<br>6 levels"]
HEUR --> CHECK{"Unresolved<br>edges?"}
CHECK -->|no| DONE["Complete graph"]
CHECK -->|yes| LSP_CHECK{"LSP servers<br>available?"}
LSP_CHECK -->|no| DONE
LSP_CHECK -->|yes| LSP_RESOLVE["LSP resolution<br>textDocument/definition"]
LSP_RESOLVE --> DONE
style LSP_RESOLVE fill:#e8f5e9,stroke:#4caf50
style HEUR fill:#fff3e0,stroke:#ff9800
The LspManager keeps one client per language, reused for all unresolved edges of that language.
For each unresolved edge, cartog:
- Builds the source file URI + position (line, column) of the call
- Sends a textDocument/definition request
- Receives the file and position of the target definition
- Looks up which symbol occupies that position in the graph
- Resolves the edge: target_id = found symbol
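The last three steps can be sketched as follows. A `textDocument/definition` result may be a single Location, a list of Location or LocationLink objects, or null, so the sketch normalizes these shapes first. The dictionary layout of the symbols and the "innermost enclosing symbol" rule are assumptions, not cartog's actual storage model.

```python
from urllib.parse import unquote, urlparse


def target_position(result):
    """Extract (path, zero-based line) from a textDocument/definition result."""
    if not result:
        return None  # null or empty list: nothing found
    first = result[0] if isinstance(result, list) else result
    # LocationLink uses targetUri/targetSelectionRange; Location uses uri/range.
    uri = first.get("targetUri", first.get("uri"))
    rng = first.get("targetSelectionRange", first.get("range"))
    if uri is None or rng is None:
        return None
    return unquote(urlparse(uri).path), rng["start"]["line"]


def symbol_at(symbols, path, line):
    """Innermost symbol whose line span contains `line`, or None."""
    hits = [s for s in symbols if s["file"] == path and s["start"] <= line <= s["end"]]
    return min(hits, key=lambda s: s["end"] - s["start"], default=None)


# Hypothetical graph content: a class and one of its methods.
GRAPH_SYMBOLS = [
    {"id": "forms.Form", "file": "/project/app/forms.py", "start": 10, "end": 40},
    {"id": "forms.Form.validate", "file": "/project/app/forms.py", "start": 12, "end": 20},
]
```

Picking the smallest enclosing span matters when a definition sits inside a class: the target should be the method, not the class that contains it.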
sequenceDiagram
participant C as Cartog
participant LSP as Language Server
participant DB as Graph
loop For each file (sequential)
C->>LSP: textDocument/didOpen(file)
loop For each unresolved edge in the file
C->>LSP: textDocument/definition(file, line, col)
alt Definition found
LSP-->>C: targetFile : targetLine
C->>DB: symbol at (targetFile, targetLine)?
DB-->>C: symbol_id
            C->>DB: target_id = symbol_id
else No response / outside project
            LSP-->>C: no result
Note over C: Edge left unresolved<br>(no degradation)
end
end
end
If the server doesn't answer or can't find the definition, the edge stays unresolved, with no degradation. Requests are sent sequentially per file to respect the LSP protocol.
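The "no degradation" guarantee boils down to never overwriting an edge unless the server returned a usable answer. Here is a minimal sketch of that loop, where `open_file` and `find_definition` are hypothetical callables standing in for the LSP client, and edges are plain dictionaries.

```python
from itertools import groupby
from operator import itemgetter


def resolve_with_lsp(edges, open_file, find_definition):
    """Fill target_id on unresolved edges; leave an edge untouched on any failure."""
    pending = sorted(
        (e for e in edges if e["target_id"] is None), key=itemgetter("file")
    )
    resolved = 0
    for path, file_edges in groupby(pending, key=itemgetter("file")):
        open_file(path)  # one didOpen per file, before its requests
        for edge in file_edges:
            try:
                target = find_definition(path, edge["line"], edge["col"])
            except (TimeoutError, OSError):
                target = None  # slow or crashed server: keep what we had
            if target is not None:
                edge["target_id"] = target
                resolved += 1
    return resolved
```

Edges already resolved by the heuristic are filtered out up front, so the server is only ever asked about the gaps.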
| Language | Heuristic precision | Precision with LSP |
|---|---|---|
| Python | ~25% | ~65% |
| TypeScript | ~30% | ~72% |
| Rust | ~37% | ~81% |
| Go | ~28% | ~44% |
These measurements were taken on a 69-file / 4k-line Python project, comparing resolved edges against ground truth obtained by manual resolution.
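For reference, the metric as described earlier (out of all extracted references, how many are linked to the right definition) can be computed as below. The mapping format, reference id to target symbol id, is an assumption made for this sketch.

```python
def precision(resolved, ground_truth):
    """Share of extracted references linked to the correct definition."""
    if not ground_truth:
        return 0.0
    correct = sum(
        1 for ref, target in ground_truth.items() if resolved.get(ref) == target
    )
    return correct / len(ground_truth)


# Toy data: four references, one resolved correctly, one wrongly, two not at all.
truth = {"r1": "a.f", "r2": "b.g", "r3": "c.h", "r4": "d.k"}
heuristic = {"r1": "a.f", "r2": "x.g"}
```

Under this definition, unresolved references and wrongly resolved ones both count against the score, which is why the toy data above lands at 0.25.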
The gains vary by language.
Rust gets the best results: rust-analyzer maintains a complete semantic model of the project (types, traits, impls).
Go sees a more modest gain because gopls is less effective at resolving calls made through implicitly satisfied interfaces (Go's structural typing).
Concretely, on a cartog impact validate --depth 3 analysis, the heuristic alone surfaces 12 symbols with 3 missing edges.
With LSP, the same impact returns 18 symbols: the graph is complete, and the agent has a reliable view of the consequences of a change.
If LSP gives much better precision, why doesn't cartog enable it everywhere by default?
Because LSP isn't free:
| Aspect | Heuristic only | With LSP |
|---|---|---|
| Indexing time | ~1s | +10-60s (server startup) |
| Dependencies | None | Language server on PATH |
| Precision | 25-37% | 44-81% |
| Reliability | Deterministic | Depends on the server |
Language server startup is the main cost (workspace initialization, type loading). Cartog enforces a 20-second timeout on the initialization phase, adjustable through the CARTOG_LSP_READY_TIMEOUT_SECS environment variable, and watches the LSP protocol's window/workDoneProgress tokens. If the server reports no progress, it is quietly abandoned and the affected edges stay on the heuristic alone.
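A bounded wait of this kind might look like the sketch below. The environment variable name and the 20-second default come from the text; the fallback to the default on invalid or non-positive values, the polling approach, and both function names are assumptions.

```python
import os
import time

DEFAULT_READY_TIMEOUT_SECS = 20.0


def ready_timeout(env=os.environ):
    """Read the timeout from the environment, falling back to the default."""
    raw = env.get("CARTOG_LSP_READY_TIMEOUT_SECS", "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_READY_TIMEOUT_SECS
    return value if value > 0 else DEFAULT_READY_TIMEOUT_SECS


def wait_until_ready(is_ready, timeout, poll_interval=0.05,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `is_ready` until it returns True or the deadline passes."""
    deadline = clock() + timeout
    while clock() < deadline:
        if is_ready():
            return True
        sleep(poll_interval)
    return False  # caller abandons the server and keeps the heuristic result
```

Using a monotonic clock keeps the deadline immune to wall-clock adjustments during a long indexing run.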
For daily use (active development, frequent re-indexing), the heuristic alone is enough. LSP earns its keep for a deep impact analysis or a first full indexing.
The graph is now precise, but cartog re-parses the entire project on every cartog index call: over 10,000 files that becomes a drag, especially in watch mode where every save triggers a re-index.
The developer edits one file. Why re-parse the other 9,999?
That will be the topic of the next article in the series: incremental indexing via stable IDs and a Merkle tree.