Hi guys,
I’ve spent quite some time in the past using the Roslyn API, in an internship and during the Google Summer of Code 2015. I thought it’d be a great idea to present it here.
The .NET Compiler Platform, better known as Roslyn, was first made available to the public in 2011 as a preview extension for Visual Studio 2010. At Build 2014, Microsoft made it an open-source project. In short, Roslyn provides a set of open-source compiler and code analysis APIs for .NET languages, but only Visual C# and Visual Basic are supported at the moment.
Before, when programmers wrote code to solve a problem, they hoped it would build, ran some tests on it, then built it on TFS and hoped it worked. Compilers were seen as a kind of magical tool that checked whether the code was valid or not. This approach worked in the past, but as software has become more complicated, we as programmers need great tools, such as refactorings, to increase our productivity.
This is where the open-source project comes into play. The .NET Compiler Platform lets users get more information about their source code. Instead of simply writing code and hoping for the best, there are now different APIs that can be used for different tasks, such as code analysis. In short, source code analysis is the act of automatically inspecting source code in the hope of finding bugs so that they can be fixed. Once a tool can analyze code, it can also go one step further and transform it, for example by applying a fix automatically.
The compiler platform was built on top of the C# and Visual Basic compilers. Roslyn can be seen as a compiler as a service. The term “compiler as a service”, or CaaS, should not be confused with cloud offerings such as Platform as a Service (PaaS) or Infrastructure as a Service (IaaS). Roslyn was created to reengineer the C# and VB.NET compilers. It exposes the different phases of compilation through APIs.
The service provided by the CaaS empowers programmers like never before. As mentioned earlier, what the compiler knows about the code is now exposed and can be used by developers. Basically, the black box that was the .NET compiler is now broken apart into many pieces that, when put together, form what is called the syntax tree. In brief, the syntax tree is a representation of the code written in a source file. The syntax tree API is language agnostic; it has no trouble representing the code from a .cs or a .vb source file.
This syntax tree is made of three different components: SyntaxNode, SyntaxToken and SyntaxTrivia. Nodes are the core members of the tree; tokens and trivia depend on them to exist. Nodes can represent a practically unlimited variety of syntactic constructs, from a class declaration to an object initialization. Tokens represent the small fragments of code, such as identifiers (e.g. a variable’s name) or keywords (e.g. int), and trivia represent everything that remains in the file, such as whitespace or comments.
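To make the three components more concrete, here is a minimal sketch of a console program that parses a small snippet and lists its nodes, tokens and trivia. It assumes a project that references the Microsoft.CodeAnalysis.CSharp NuGet package; the class name SyntaxTreeDemo and the sample source string are made up for the illustration.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Illustrative only: SyntaxTreeDemo and the parsed snippet are made up.
class SyntaxTreeDemo
{
    static void Main()
    {
        // Parse a tiny source text into a syntax tree.
        SyntaxTree tree = CSharpSyntaxTree.ParseText("class C { int x; } // done");
        SyntaxNode root = tree.GetRoot();

        // Nodes: declarations, statements, expressions, ...
        foreach (SyntaxNode node in root.DescendantNodes())
            Console.WriteLine($"node:   {node.Kind()}");

        // Tokens: keywords, identifiers, punctuation, ...
        foreach (SyntaxToken token in root.DescendantTokens())
            Console.WriteLine($"token:  {token.Kind()} '{token.Text}'");

        // Trivia: whitespace, comments, ...
        foreach (SyntaxTrivia trivia in root.DescendantTrivia())
            Console.WriteLine($"trivia: {trivia.Kind()}");
    }
}
```

Running something like this on your own files is a quick way to discover which SyntaxKind values a given piece of code produces.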
While nodes, tokens and trivia give a lot of information about the source code, they don’t give quite enough. This is where symbols come into play. Symbols carry the semantic information the compiler knows about the elements within the source file. Through symbols, it is possible to find out, for example, in which assembly a type has been declared, or various details about a class or namespace. With this kind of information, it’s now possible to validate the type behind a node during code analysis. Two interfaces you will meet often are ISymbol and ITypeSymbol. ISymbol is the base interface and gives access to the common information, such as the name, the kind and the containing assembly; ITypeSymbol builds on it to describe types.
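As a rough illustration of how symbols are reached from syntax, the sketch below builds a compilation, asks its semantic model for the type of a field declaration and prints where that type comes from. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced; SymbolDemo, the assembly name "Demo" and the sample source are invented for the example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Illustrative only: SymbolDemo, "Demo" and the parsed snippet are made up.
class SymbolDemo
{
    static void Main()
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText("class C { string name = \"x\"; }");

        // Symbols need a compilation: syntax trees plus references to resolve types against.
        CSharpCompilation compilation = CSharpCompilation.Create("Demo")
            .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location))
            .AddSyntaxTrees(tree);

        SemanticModel model = compilation.GetSemanticModel(tree);

        // Find the "string name = ..." declaration in the tree.
        VariableDeclarationSyntax declaration = tree.GetRoot()
            .DescendantNodes()
            .OfType<VariableDeclarationSyntax>()
            .First();

        // Ask the semantic model which type the syntax "string" refers to.
        ITypeSymbol type = model.GetTypeInfo(declaration.Type).Type;

        Console.WriteLine(type.Name);                     // name of the resolved type
        Console.WriteLine(type.ContainingNamespace.Name); // namespace it lives in
        Console.WriteLine(type.ContainingAssembly.Name);  // assembly that declares it
    }
}
```

The name of the assembly printed on the last line depends on the runtime the program is built against.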
Keeping this in mind, it is now possible to move on to code analyzers and code fix providers. Roslyn analyzers enable developers to enforce certain rules within their code base, such as using only “var” as the local variable type. A Roslyn analyzer must inherit from the base class DiagnosticAnalyzer, which exposes the set of diagnostics the analyzer is responsible for through the property called “SupportedDiagnostics”. Another important fact about diagnostic analyzers is that they are set up via the Initialize method of the DiagnosticAnalyzer subclass. Using the parameter of type “AnalysisContext”, the analyzer registers for the kind of node it is interested in. Look at the example below. It uses the SyntaxKind enumeration, which lists every kind of node, token and trivia. It is extremely helpful to find out if a token is a comma, or to specify that you’re looking for an object creation expression in your analyzer.
    public override void Initialize(AnalysisContext context)
    {
        context.RegisterSyntaxNodeAction(
            (nodeContext) =>
            {
                Diagnostic diagnostic;
                if (TryGetRedundantNullableDiagnostic(nodeContext, out diagnostic))
                {
                    nodeContext.ReportDiagnostic(diagnostic);
                }
            },
            SyntaxKind.ObjectCreationExpression // Called whenever the code creates an object
        );
    }
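For context, here is a minimal sketch of what a complete analyzer class around such an Initialize method could look like. It is not the analyzer the snippet above comes from: the class name, the diagnostic id "DEMO001" and the rule itself (flagging every object creation) are made up purely to show where SupportedDiagnostics and the descriptor fit. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

// Hypothetical analyzer: reports an informational diagnostic on every object creation.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class ObjectCreationAnalyzer : DiagnosticAnalyzer
{
    // Describes the rule: id, title, message, category and severity (all invented here).
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "DEMO001",
        title: "Object creation found",
        messageFormat: "Object creation expression: '{0}'",
        category: "Demo",
        defaultSeverity: DiagnosticSeverity.Info,
        isEnabledByDefault: true);

    // The set of diagnostics this analyzer is responsible for.
    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Ask to be called back for every object creation expression.
        context.RegisterSyntaxNodeAction(AnalyzeNode, SyntaxKind.ObjectCreationExpression);
    }

    private static void AnalyzeNode(SyntaxNodeAnalysisContext nodeContext)
    {
        // A real rule would inspect the node (and its symbols) before reporting.
        Diagnostic diagnostic = Diagnostic.Create(
            Rule,
            nodeContext.Node.GetLocation(),
            nodeContext.Node.ToString());
        nodeContext.ReportDiagnostic(diagnostic);
    }
}
```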
A diagnostic analyzer is only used to find out whether a rule has been broken. Take a particular rule, let’s say that a call to the base constructor is redundant: the diagnostic analyzer will only look at the kind of node it registered for (object creation expressions in the example above) and analyze them with the given algorithm. A code fix provider, on the other hand, fixes the source code that was flagged. The most important element of a code fix provider is the method “RegisterCodeFixesAsync”. One thing developers should keep in mind is that the objects describing the source code are immutable: they cannot be changed, so a fix has to produce new ones.
Using the Document property of the CodeFixContext class, it is possible to access the syntax root of the source code by calling GetSyntaxRootAsync(CancellationToken). The remaining work for the developer is to retrieve the SyntaxNode that triggered the diagnostic. Finally, the developer needs to create a new syntax root in which that node has been either removed or replaced, so that the code respects the diagnostic rule.
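The steps just described could be sketched roughly as follows. This is an outline under assumptions, not a finished provider: the class name, the id "DEMO001" and the BuildReplacement helper are hypothetical, and the helper returns the node unchanged as a placeholder for the rule-specific rewrite. It assumes the Microsoft.CodeAnalysis.CSharp.Workspaces package is referenced.

```csharp
using System.Collections.Immutable;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeFixes;

// Hypothetical code fix provider for a made-up diagnostic id.
[ExportCodeFixProvider(LanguageNames.CSharp)]
public sealed class DemoCodeFixProvider : CodeFixProvider
{
    // The diagnostics this provider knows how to fix.
    public override ImmutableArray<string> FixableDiagnosticIds
        => ImmutableArray.Create("DEMO001");

    public override async Task RegisterCodeFixesAsync(CodeFixContext context)
    {
        // 1. Get the syntax root of the document that contains the diagnostic.
        SyntaxNode root = await context.Document
            .GetSyntaxRootAsync(context.CancellationToken)
            .ConfigureAwait(false);
        if (root == null)
            return;

        // 2. Retrieve the node that triggered the diagnostic.
        Diagnostic diagnostic = context.Diagnostics[0];
        SyntaxNode node = root.FindNode(diagnostic.Location.SourceSpan);

        // 3. Offer a fix that builds a new root; the original tree is never modified.
        context.RegisterCodeFix(
            CodeAction.Create(
                "Apply demo fix",
                cancellationToken => ApplyFixAsync(context.Document, root, node)),
            diagnostic);
    }

    private static Task<Document> ApplyFixAsync(Document document, SyntaxNode root, SyntaxNode node)
    {
        // ReplaceNode returns a new tree; 'root' itself stays untouched.
        SyntaxNode newRoot = root.ReplaceNode(node, BuildReplacement(node));
        return Task.FromResult(document.WithSyntaxRoot(newRoot));
    }

    // Hypothetical placeholder: a real fix would return a rewritten node here.
    private static SyntaxNode BuildReplacement(SyntaxNode node) => node;
}
```

Because everything is immutable, both ReplaceNode and WithSyntaxRoot hand back new objects instead of changing the existing ones, which is why the fix ends by returning a new Document.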
Well, that’s about it, readers. This should be a good introduction for those wanting to dive into the code analysis world.
Kevin out.