The Lossless Semantic Tree: a compiler-accurate model of your code
Below the syntax tree
Meaning isn’t in the file.
A parser sees one file in isolation. To know what a single token really is, the LST resolves it the way the compiler would, against the dependencies on the classpath, the compiler itself, the language version, and the build that wired them all together.
A foundation, not a parser
Don’t rebuild the foundation.
Build on it.
Getting one type right means modeling a language, its compiler, its build tools, and the way real projects wire dependencies together. Then you do it again for the next language, and keep every one of them correct as they all keep changing.
One model, across the estate
- Java
- JavaScript
- TypeScript
- C#
- Python
- Go
- Kotlin
- Groovy
- Scala
- COBOL
- XML
- YAML
Each with its own compiler, build tools, and dependency resolution, modeled once so you don’t have to.
The foundation layer
Find it once.
Fix it everywhere.
Semantic code search and deterministic, multi-repo refactoring, all built on the LST.
The Lossless Semantic Tree (LST) is a compiler-accurate, format-preserving model of source code, built on the open-source OpenRewrite engine. It adds full type attribution (symbol resolution, generics, inheritance, and transitive dependencies) while preserving formatting and comments, so automated semantic code search and code refactoring stay accurate and reviewable across thousands of repositories and many languages (Java, JavaScript, TypeScript, C#, Python, Go, Kotlin, Groovy, Scala, COBOL, XML, YAML).
Frequently asked questions
A Lossless Semantic Tree (LST) is a compiler-accurate code model that adds rich type and semantic attribution while preserving formatting, comments, and style. This fidelity enables precise code search and safe, automated refactoring across large, multi-repository codebases.
A traditional AST discards formatting and comments and carries no type information. The LST keeps every detail of the source and adds semantic context on top: resolved types, symbols, and dependencies. That fidelity makes both search and transformation accurate, and keeps results idiomatically consistent with how developers actually write code.
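The loss of comments and formatting in an ordinary AST is easy to observe. The following is a minimal sketch that uses Python's standard-library `ast` module (Python 3.9 or later) as a stand-in for a traditional AST. It is not the LST or the OpenRewrite API, and `SOURCE` and `round_trip` are illustrative names.

```python
import ast

# A small function with a comment and deliberately unusual spacing.
SOURCE = '''\
def total(prices):
    # Sum in cents to avoid float drift.
    return sum( prices )   # legacy spacing kept on purpose
'''


def round_trip(source: str) -> str:
    """Parse source into a plain AST, then print the tree back out as code."""
    return ast.unparse(ast.parse(source))


regenerated = round_trip(SOURCE)
print(regenerated)
```

The regenerated code has neither comment and the spacing inside `sum( prices )` is normalized, because the tree never stored them. A tool that rewrites code from such a tree would touch lines the change never intended, which is the problem a format-preserving model is meant to avoid.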
Semantic code search goes beyond text queries. It lets developers search by code meaning (method calls, types, or API usage) rather than just strings. With the LST, semantic searches are precise and reliable across entire codebases.
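To illustrate the difference between searching strings and searching by meaning, here is a small sketch, again using Python's standard-library `ast` module rather than the LST itself. The helper names `text_search` and `call_search` are hypothetical. The sketch resolves names only through the imports of a single file, which is far less than the type attribution described on this page, but it shows the shape of the idea.

```python
import ast

SOURCE = '''\
import json
import pickle
from json import loads as parse

# TODO: stop calling json.loads here
a = json.loads(text)
b = parse(text)
c = pickle.loads(blob)
note = "json.loads is documented elsewhere"
'''


def text_search(source, needle):
    """Return line numbers whose raw text contains the needle."""
    return [i for i, line in enumerate(source.splitlines(), 1) if needle in line]


def call_search(source, module, func):
    """Return line numbers of calls that resolve to module.func via this file's imports."""
    tree = ast.parse(source)
    module_aliases = set()  # local names bound to the module
    func_aliases = set()    # local names bound directly to the function
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == module:
                    module_aliases.add(alias.asname or alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module == module:
            for alias in node.names:
                if alias.name == func:
                    func_aliases.add(alias.asname or alias.name)

    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        if isinstance(target, ast.Name) and target.id in func_aliases:
            hits.append(node.lineno)
        elif (isinstance(target, ast.Attribute) and target.attr == func
              and isinstance(target.value, ast.Name)
              and target.value.id in module_aliases):
            hits.append(node.lineno)
    return sorted(hits)


text_hits = text_search(SOURCE, "json.loads")      # [5, 6, 9]
call_hits = call_search(SOURCE, "json", "loads")   # [6, 7]
print(text_hits, call_hits)
```

The text query matches a comment and a string literal and misses the aliased call `parse(text)`. The tree-based query finds both real calls and nothing else, and it does not confuse `pickle.loads` with `json.loads`.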
By combining semantic understanding with full code fidelity, the LST enables automated refactorings that are accurate, safe, and developer-friendly, producing clean, minimal diffs that teams trust and merge.
The LST is not limited to Java. It originated there, and it also supports JavaScript, TypeScript, C#, Python, Go, Kotlin, Groovy, Scala, and COBOL, along with the XML and YAML formats commonly used for build and infrastructure configuration. It’s built for polyglot enterprise environments where modernization spans many languages.
The LST also supports AI-assisted analysis. It gives AI models a rich, structured view of code with far deeper context than plain text or a traditional AST. That improves semantic search, code summarization, and issue detection, making AI more useful and reliable in large-scale code analysis.
When automated changes are idiomatically consistent with the original source, developers are much more likely to accept them. If comments vanish, developers lose trust in automation; if formatting changes unexpectedly, pull requests become noisy and unreviewable. The LST preserves it all (indentation, spacing, imports, and comments), so diffs stay clean and developers can focus on the change itself.