NetBeans™ C/C++ Development Pack
Code Assistance Modules Architecture

This document describes the architecture of the modules that provide code assistance functionality. By code assistance we mean such features as code completion, hyperlink navigation, code folding, class view, etc.

Main Components

The figure shows the (simplified) relationships between code assistance components.

[Figure: Main code assistance components]

Let us first introduce some terminology.

Language Model

The Language Model gives a developer a programmatic view of the source code. It is a set of interfaces that represent source code, together with the implementation of those interfaces. The interfaces are meant to describe every detail of the underlying source code. The Language Model does not contain any user-visible features; rather, it is the foundation for them. The Language Model consists of the Language Model API and the Language Model Implementation.

Language Model API

The Language Model API is a set of interfaces that represent the underlying source code. Those who write end-user functionality that works with C/C++ source code use the Language Model API and know nothing about how it is implemented. So we can switch to another parser, or start getting information from compiled code, or make any similar change, without causing any changes in the code related to end-user functionality.

Language Model Implementation

The implementation of the Language Model API. It will be described briefly in the next sections.

Language Model Clients

A Language Model client is any code that uses the Language Model API. All modules that provide particular code assistance features (Code Completion, Class View, etc.) are Language Model clients.

Language Model API

As mentioned above, the Language Model API is a set of interfaces that represent source code.

This API is experimental. There is no guarantee that it will remain unchanged in subsequent releases; moreover, some changes will most likely be introduced in the next release. So the purpose of this article is to describe the direction we are moving in, rather than to encourage you to write plugins that use this API right now.

You can certainly do so at your own risk. The ideology of the API will definitely survive; the experience of the C/C++ Pack development team shows that moving to the next NetBeans release does not require a code rewrite. Most likely you will have to make just a few small changes in your code.

Main Building Blocks

The API consists of a set of interfaces, one interface per language construct. For example, a C++ class is represented by the CsmClass interface, a C or C++ function by the CsmFunction interface, each source file by the CsmFile interface, and so on.

There are also interfaces for statements (CsmStatement) and expressions (CsmExpression). For now (version 6.0) expressions are not implemented well enough: they contain only the top-level expression as a whole and do not return the tree of sub-expressions.

The "Csm" prefix stands for C/C++ Source Model.

The root of the API class hierarchy is CsmObject. It acts as the common ancestor for all interfaces in the Language Model API.

Most interfaces that correspond to top-level elements (classes and their members, functions, variables, etc.) are derived from CsmDeclaration.

Almost all interfaces in the API extend CsmOffsetable. It represents an element that resides in a contiguous segment of a single source file. It lets you find out which file a given element resides in, and where the element starts and ends within that file. The getContainingFile() method returns the file that contains the element; the getStartOffset() and getEndOffset() methods return the start and end offsets of the element within that file.
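As an illustration of how a client might use these three methods, here is a minimal sketch. The interfaces FileSketch and OffsetableSketch are simplified stand-ins invented for this example, not the real CsmFile and CsmOffsetable; the getText() helper, the int return types, and the assumption that the end offset is exclusive are all hypothetical.

```java
// Simplified stand-ins, NOT the real Language Model API: they mirror only the
// method names mentioned in the text, with assumed return types.
interface FileSketch {
    CharSequence getText(); // hypothetical helper, used only by this sketch
}

interface OffsetableSketch {
    FileSketch getContainingFile();
    int getStartOffset();
    int getEndOffset(); // assumed to be exclusive
}

final class OffsetDemo {
    private OffsetDemo() {}

    /** Returns the source text covered by the element, using its offsets. */
    static String textOf(OffsetableSketch element) {
        CharSequence text = element.getContainingFile().getText();
        return text.subSequence(element.getStartOffset(), element.getEndOffset()).toString();
    }

    /** Builds an in-memory element over the given text, for illustration. */
    static OffsetableSketch element(String fileText, int start, int end) {
        FileSketch file = () -> fileText;
        return new OffsetableSketch() {
            public FileSketch getContainingFile() { return file; }
            public int getStartOffset() { return start; }
            public int getEndOffset() { return end; }
        };
    }
}
```

Because every offsetable element carries its file and its offsets, a client such as hyperlink navigation can jump to a declaration without knowing what kind of element it is.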

An IDE project is represented by the CsmProject interface. One may ask why we do not simply use the Project interface from the NetBeans API. The reason is that CsmProject has some specific functionality, such as the getGlobalNamespace() and waitParse() methods. Apart from instances that represent IDE projects, there usually exist several so-called "artificial" projects. These act as containers for header files that are used in some IDE project but do not belong to any IDE project; system headers from /usr/include are an example. Such artificial projects are also represented by instances of CsmProject. Usually one artificial project is created per include directory.

There is a special interface that acts as the central entry point to the Language Model: CsmModel. It allows you to add and remove listeners, get the list of open projects, and so on.

Logical View and Physical View

The IDE graphical user interface has two representations of the program structure. When the user opens a file, the Navigator shows the structure of that file. We call this the physical structure, because it corresponds to the physical order of the declarations within the file. The IDE also has another view, the Class View, which shows the structure of the entire project from the language point of view. We call this the logical structure.

The Language Model API also supports two views: physical and logical.

Each source file is represented by the CsmFile interface. CsmFile has a method, getDeclarations(), that returns a collection of all declarations within the file. This is the physical view.

The logical view is represented by the CsmProject and CsmNamespace interfaces. Each user project has a corresponding instance of CsmProject; each namespace is represented by an instance of CsmNamespace.

For a better understanding of how to walk both hierarchies, you can look at the source code of the CsmTracer class (org.netbeans.modules.cnd.api.model.util.CsmTracer). It was written for tracing and testing purposes, and it dumps the model. The dumpModel(CsmProject project) method dumps the logical structure of the project passed as a parameter. The dumpModel(CsmFile file) method dumps the physical structure of the given file.
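The difference between the two traversals can be sketched roughly as follows. This is not the CsmTracer code: FileViewSketch and NamespaceViewSketch are hypothetical stand-ins in which declarations are reduced to plain names, and the list of nested namespaces is an assumed accessor.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins, NOT the real Language Model API. Declarations are
// reduced to plain names so that the two traversals stay short.
final class FileViewSketch {
    final List<String> declarations = new ArrayList<>(); // kept in source order
}

final class NamespaceViewSketch {
    final String name;
    final List<String> declarations = new ArrayList<>();
    final List<NamespaceViewSketch> nested = new ArrayList<>(); // assumed accessor

    NamespaceViewSketch(String name) { this.name = name; }
}

final class ViewsDemo {
    private ViewsDemo() {}

    /** Physical view: list declarations exactly in file order. */
    static List<String> dumpFile(FileViewSketch file) {
        return new ArrayList<>(file.declarations);
    }

    /** Logical view: walk namespaces recursively, starting from the global one. */
    static List<String> dumpNamespace(NamespaceViewSketch ns, String prefix) {
        List<String> out = new ArrayList<>();
        for (String declaration : ns.declarations) {
            out.add(prefix + declaration);
        }
        for (NamespaceViewSketch child : ns.nested) {
            out.addAll(dumpNamespace(child, prefix + child.name + "::"));
        }
        return out;
    }
}
```

The physical dump follows one file from top to bottom, while the logical dump starts at the global namespace of a project and ignores which file each declaration came from.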

Instance lifecycle, UIDs and persistence

The instances that implement the interfaces mentioned above have different lifecycles. The most general rule is that they cannot be treated as long-lived objects. These objects can be garbage collected and later replaced by new instances.

The Language Model data is voluminous. To keep it from occupying too much memory, persistence was introduced. Persistence is a mechanism that stores instance data on disk and reads it back on request. Each instance is associated with a special unique identifier, a UID. Instead of storing a reference to the object itself, you store its UID, so the instance can be garbage collected. Later on, the instance can be obtained through this UID.

There is an in-memory cache for objects that uses an LRU strategy. So, for recently used objects, getting an object by UID is just a map lookup. In this case you get the same instance that you (or someone else) used recently. If the object has already been garbage collected, a new instance is read from persistence.
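To make the caching behaviour concrete, here is a minimal LRU sketch built on java.util.LinkedHashMap. It is not the actual Language Model cache: the class name LruObjectCache, the fixed capacity, and the loader function that stands in for reading from persistence are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy LRU cache keyed by UID; a sketch, not the real Language Model cache. */
final class LruObjectCache<K, V> {
    private final Map<K, V> map;
    private final Function<K, V> persistence; // stand-in for reading from disk
    int loads = 0;                            // how many times persistence was hit

    LruObjectCache(final int capacity, Function<K, V> persistence) {
        this.persistence = persistence;
        // accessOrder = true makes iteration order "least recently used first"
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once the capacity is exceeded
            }
        };
    }

    /** Returns the cached instance, or loads a new one from persistence. */
    V get(K uid) {
        V cached = map.get(uid);
        if (cached != null) {
            return cached; // same instance that was used recently
        }
        loads++;
        V loaded = persistence.apply(uid);
        map.put(uid, loaded);
        return loaded;
    }
}
```

With a capacity of two, requesting a third object evicts the least recently used one, so the next request for the evicted object goes back to the loader and yields a new instance.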

Two key interfaces related to UIDs are CsmUID and CsmIdentifiable. UIDs are represented by instances of the CsmUID interface. It has a single method, getObject(), which returns the object that corresponds to the UID.

The CsmIdentifiable interface represents an object that can have a UID associated with it and can be restored from this UID later on. It also has a single method, getUID(). Each of the above-mentioned interfaces that describe the program structure is derived from CsmIdentifiable.

As you can see, using UIDs is quite straightforward. Just store the UID instead of a direct reference, and call getObject() later on, when you need the corresponding instance.
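The pattern of storing a UID instead of a reference might look like the sketch below. UidSketch and IdentifiableSketch are hypothetical stand-ins that copy only the getObject() and getUID() method names from the text; the generic parameter, the static STORE map (standing in for the cache plus persistence), and the Bookmark client are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins, NOT the real CsmUID / CsmIdentifiable interfaces.
interface UidSketch<T> { T getObject(); }
interface IdentifiableSketch<T> { UidSketch<T> getUID(); }

/** Toy model element whose UID resolves through a shared store. */
final class ClassSketch implements IdentifiableSketch<ClassSketch> {
    // Stands in for the in-memory cache plus persistence.
    static final Map<String, ClassSketch> STORE = new HashMap<>();
    final String name;

    ClassSketch(String name) {
        this.name = name;
        STORE.put(name, this); // a re-created element replaces the old instance
    }

    public UidSketch<ClassSketch> getUID() {
        final String key = name;      // the UID keeps only the key, not the object
        return () -> STORE.get(key);
    }
}

/** A client that remembers an element over time by UID only. */
final class Bookmark {
    private final UidSketch<ClassSketch> uid;

    Bookmark(ClassSketch target) { this.uid = target.getUID(); }

    /** Resolves the UID; the result may be a different instance than before. */
    ClassSketch resolve() { return uid.getObject(); }
}
```

Since the client holds no direct reference, the original instance is free to be collected, and a later resolve() simply returns whatever instance currently represents the element.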

Do not use instanceof

Many Language Model API methods return a collection of declarations. For example, CsmNamespace.getDeclarations() returns a collection of all the declarations that belong to the namespace. You often need to find out what these declarations are: for each declaration you typically want to know whether it is a class, a function, a typedef, etc. The first thing that comes to a developer's mind is to check this with the instanceof operator.

However, we strongly discourage you from using instanceof. The reason is that an implementation class can sometimes inherit an interface incidentally, for some purely implementation-related reason.

The best way is to use the CsmKindUtilities class. It has plenty of is...() methods: isClass(), isField(), isFunctionDefinition() and so on.

There is also an enum, CsmDeclaration.Kind, and a method, CsmDeclaration.getKind(), that were initially created for distinguishing different kinds of objects. Unfortunately, this approach has several drawbacks. The main one is that the method returns a single kind, while an object can belong to several kinds at once. Consider a method of some class. Is it a function? Yes, it is. Is it a member? Yes, sure. If it has a body, it is a function definition as well. Which of the three kinds should be returned, then? This is a design flaw that has not been fixed yet. It will definitely be corrected by the time this API is made public. For now, just use the CsmKindUtilities class.
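The following sketch illustrates why a set of independent predicates fits better than a single kind value. It does not show how CsmKindUtilities is implemented: RoleSketch, DeclarationSketch and KindUtilitiesSketch are hypothetical names, and only the idea of is...() style checks is taken from the text.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-ins, NOT the real CsmDeclaration / CsmKindUtilities.
enum RoleSketch { FUNCTION, CLASS_MEMBER, FUNCTION_DEFINITION }

final class DeclarationSketch {
    final String name;
    final Set<RoleSketch> roles; // one declaration may play several roles at once

    DeclarationSketch(String name, RoleSketch first, RoleSketch... rest) {
        this.name = name;
        this.roles = EnumSet.of(first, rest);
    }
}

/** Predicate-style checks: each question is answered independently. */
final class KindUtilitiesSketch {
    private KindUtilitiesSketch() {}

    static boolean isFunction(DeclarationSketch d) {
        return d.roles.contains(RoleSketch.FUNCTION);
    }

    static boolean isClassMember(DeclarationSketch d) {
        return d.roles.contains(RoleSketch.CLASS_MEMBER);
    }

    static boolean isFunctionDefinition(DeclarationSketch d) {
        return d.roles.contains(RoleSketch.FUNCTION_DEFINITION);
    }
}
```

A method defined inside a class body answers yes to all three predicates, whereas a single getKind() value would have to pick just one of them.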

Events and Listeners

The Language Model supports three kinds of listeners: CsmModelListener, CsmModelStateListener and CsmProgressListener. The CsmModel class maintains the lists of listeners. For each of the three kinds, it has a pair of add***Listener() and remove***Listener() methods.


CsmModelListener is notified about each change in the Language Model content. When model elements are changed, removed, or added, the modelChanged() method is called.

It is also notified when a project is opened (projectOpened() method) or closed (projectClosed() method). The instance of the CsmProject interface that represents the project is passed as a parameter to these methods.


CsmModelStateListener is notified when the Language Model state changes as a result of switching modules on and off. Clients sometimes register this listener to release unnecessary resources when the model is going down.


CsmProgressListener allows you to track the process of parsing files. For example, Code Assistance uses this listener to show the "Parsing..." progress indicator.
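The add/remove pairing described for model listeners can be sketched as below. ModelListenerSketch and ModelSketch are simplified stand-ins, not the real CsmModelListener and CsmModel: the callback names come from the text, but the String parameters, the fire...() helpers and the RecordingListener class are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified stand-ins, NOT the real CsmModelListener / CsmModel.
interface ModelListenerSketch {
    void projectOpened(String projectName);
    void projectClosed(String projectName);
    void modelChanged(String description);
}

final class ModelSketch {
    // Copy-on-write list: listeners may be added or removed during notification.
    private final List<ModelListenerSketch> listeners = new CopyOnWriteArrayList<>();

    void addModelListener(ModelListenerSketch l) { listeners.add(l); }
    void removeModelListener(ModelListenerSketch l) { listeners.remove(l); }

    // Hypothetical helpers that simulate the model raising events.
    void fireProjectOpened(String name) {
        for (ModelListenerSketch l : listeners) l.projectOpened(name);
    }
    void fireProjectClosed(String name) {
        for (ModelListenerSketch l : listeners) l.projectClosed(name);
    }
    void fireModelChanged(String what) {
        for (ModelListenerSketch l : listeners) l.modelChanged(what);
    }
}

/** A client that simply records the notifications it receives. */
final class RecordingListener implements ModelListenerSketch {
    final List<String> events = new ArrayList<>();

    public void projectOpened(String projectName) { events.add("opened:" + projectName); }
    public void projectClosed(String projectName) { events.add("closed:" + projectName); }
    public void modelChanged(String description) { events.add("changed:" + description); }
}
```

A client registers its listener when it starts needing notifications and removes it when it no longer does, so that the model does not keep notifying a stale client.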

Some code examples

We will not provide examples right here. Instead we give a few links.

There are some examples in the article "How to Write a Simple Code Analyzer Using NetBeans™ C/C++ Development Pack API". They mostly illustrate how to traverse Language Model trees. The article is simple and good for a first acquaintance.

There is a comprehensive example of going through the Language Model hierarchies in the CsmTracer class (org.netbeans.modules.cnd.api.model.util.CsmTracer).

The best examples can be found in the code of the end user features of the NetBeans™ C/C++ Pack itself. The most convenient web interface for browsing the sources is OpenGrok at The code can also be found at
