Auto-remediating code at scale with the Lossless Semantic Tree and generative AI

Moderne | July 20, 2023

Our friends at VMware hosted a series of expert talks called “The Golden Path to SpringOne” covering tools and processes developers need to know about. Moderne was fortunate to present during this series (twice!), with the second of our sessions covering the technical innovations behind automated code remediation, namely the Lossless Semantic Tree (LST) and generative AI for code refactoring.

At Moderne, we have built our platform to tackle the challenges developers face in their attempts to maintain and secure large, assembled codebases. We have created a ‘best of all worlds’ automated code refactoring solution that uses a rules-based approach to automation, now augmented by AI. 

“I’ve edited many a file of source code with regular expression find and replaces. The challenge of programmatically editing not just one line of code but 10 million lines of code correctly without any mistakes becomes increasingly more difficult without running into an error or making an incorrect change.” —Sam Snyder, VP of engineering, Moderne

In this talk, Sam describes the technology that underpins our auto-remediation function: the LST. It is a leap beyond the common Abstract Syntax Tree (AST), capturing the codebase at full fidelity so that code transformations are 100% accurate and style-preserving.

As Sam explains: “If you move from a textual representation to a structural representation, now you can make edits and be pretty confident that you’re not going to mess up the syntax because you’re not fiddling with the text Class A—instead you’re fiddling with a tree representation of that class declaration.” 
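
The contrast Sam draws can be sketched in a few lines. This is only an analogy, not Moderne's or OpenRewrite's implementation: it uses Python's standard ast module (Python 3.9+) in place of the LST, and the sample source and the RenameClass helper are made up for illustration. It renames class A once with a regular expression and once by editing the tree.

```python
import ast
import re

SOURCE = '''class A:
    """Docs mention A in prose."""
    label = "A"

class AB(A):
    pass
'''

# Textual edit: even a word-boundary regex cannot tell a class name
# from the same letter inside a docstring or a string literal.
textual = re.sub(r"\bA\b", "Widget", SOURCE)


class RenameClass(ast.NodeTransformer):
    """Rename a class declaration and the names that refer to it.

    Simplification: scoping is ignored, so every Name node spelled
    `old` is treated as a reference to the class.
    """

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.ClassDef:
        self.generic_visit(node)  # also visits base classes and the body
        if node.name == self.old:
            node.name = self.new
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node


# Structural edit: only the declaration and its references change.
structural = ast.unparse(RenameClass("A", "Widget").visit(ast.parse(SOURCE)))

print(textual)
print(structural)
```

The regex version rewrites the docstring and the string literal along with the class name, while the tree version leaves both alone and still updates the base class of AB.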

Recipes, which are rule-based transformation programs from OpenRewrite, operate on the semantic data in the LST. OpenRewrite is an open-source automated refactoring technology and ecosystem born at Netflix in 2016, and it is also a foundational element of the Moderne platform.
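
To give a feel for what "recipe" means here, the following is a rough sketch and not OpenRewrite's actual API. It assumes a recipe is a small reusable program that visits a tree and reports a change only when it finds something to fix. The RenameMethodCall class, the REPOS dictionary, and the file paths are all hypothetical, and Python's ast module (Python 3.9+) again stands in for the LST.

```python
import ast
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RenameMethodCall:
    """Hypothetical recipe: rename a method wherever it is called."""
    old: str
    new: str

    def run(self, source: str) -> Optional[str]:
        """Return the transformed source, or None if nothing matched."""
        tree = ast.parse(source)
        changed = False
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and node.func.attr == self.old):
                node.func.attr = self.new
                changed = True
        return ast.unparse(tree) if changed else None


# Stand-in for many repositories: path -> file contents (made-up data).
REPOS = {
    "billing/test_invoice.py": "self.assertEquals(total, 42)\n",
    "search/test_rank.py": "self.assertEqual(rank, 1)\n",
    "docs/notes.py": "note = 'assertEquals is deprecated'\n",
}

recipe = RenameMethodCall("assertEquals", "assertEqual")
results = {path: recipe.run(src) for path, src in REPOS.items()}

# Only files that actually changed would become proposed diffs.
patches = {path: new for path, new in results.items() if new is not None}
print(patches)
```

The same recipe runs unchanged over every file, and only billing/test_invoice.py yields a patch: the file that already uses the new name and the file that merely mentions the old name in a string are left untouched.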

A compiler builds an AST to generate bytecode, so it can throw away comments and white space along the way. The LST serves a different purpose: it backs changes that arrive as pull requests for a human to review and accept. As Sam acknowledges, “No human is going to accept an automated code change if you strip out all of their comments and totally mangle the white space.” That’s why the LST retains the white space, padding, and formatting, so that our automated changes produce the smallest, least destructive diff possible.
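
A toy comparison shows why retained formatting matters for the diff. This is a sketch under loud assumptions: the real LST is a typed tree, not the flat token list used here, and Token, parse_lossless, and print_lossless are invented names. The only idea borrowed is that each element remembers the white space that preceded it. Python's ast module (Python 3.9+) plays the part of a conventional AST.

```python
import ast
import difflib
import re
from dataclasses import dataclass

SOURCE = (
    "total = price   *  qty  # aligned with the invoice sheet\n"
    "print( total )\n"
)


@dataclass
class Token:
    prefix: str  # white space that preceded this token in the source
    text: str    # the token itself (a comment counts as one token)


TOKEN = re.compile(r"(\s*)(#[^\n]*|\w+|[^\w\s])")


def parse_lossless(source: str) -> list[Token]:
    tokens, end = [], 0
    for match in TOKEN.finditer(source):
        tokens.append(Token(match.group(1), match.group(2)))
        end = match.end()
    tokens.append(Token(source[end:], ""))  # keeps trailing white space
    return tokens


def print_lossless(tokens: list[Token]) -> str:
    return "".join(t.prefix + t.text for t in tokens)


def changed_lines(before: str, after: str) -> int:
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return sum(1 for line in diff if line.startswith("- "))


# Rename qty through a conventional AST: comment and spacing are lost.
tree = ast.parse(SOURCE)
for node in ast.walk(tree):
    if isinstance(node, ast.Name) and node.id == "qty":
        node.id = "quantity"
ast_edit = ast.unparse(tree) + "\n"

# Rename qty through the toy lossless form: everything else is kept.
tokens = parse_lossless(SOURCE)
assert print_lossless(tokens) == SOURCE  # the round trip loses nothing
for token in tokens:
    if token.text == "qty":
        token.text = "quantity"
lossless_edit = print_lossless(tokens)

print(changed_lines(SOURCE, ast_edit))       # 2
print(changed_lines(SOURCE, lossless_edit))  # 1
```

The AST route touches both lines and drops the comment, so a reviewer sees noise unrelated to the rename. The lossless route changes only the line that contains qty.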

As you watch the webinar, you’ll get to see demos of auto-remediation in action in the Moderne platform, and how the diffs are shared with developers for review.

Next up in the webinar, Moderne’s resident AI researcher, Justine Gehring, talks about how the Moderne system can benefit from integrating the code suggestions of generative AI, ultimately getting more value out of the LSTs.

First she summarizes use cases for generative AI combined with rules-based auto-remediation from Moderne, including:

  • Assisting OpenRewrite recipe authorship 
  • Processing the LST to solve classification problems within code and to provide natural language descriptions of errors

You’ll also get to see a demo of recipe authorship with machine learning (ML) in action at the end.

Justine then goes into the history of ML for code. You’ll learn why graph neural networks are not sufficient for ML for code, and how transformers changed the game. She then addresses pros and cons of today’s generative AI examples. If you’re getting up to speed on AI/ML for code, this session offers a good primer.

The Moderne team did a great job covering juicy technical topics during this session. Be sure to check out the full webinar.