IntelliJ IDEA internal design

The first version of IntelliJ IDEA was released in January 2001. At that time it was one of the first Java IDEs to integrate advanced code navigation and code refactoring capabilities.

In 2009 JetBrains open sourced the community version. Since then, many IDEs have been built on it, such as Android Studio from Google.

Let’s look inside the community version of IntelliJ IDEA using JArchitect, and discover some of its internal design choices.

1- Modularity

IntelliJ IDEA is split into many projects; the main one is “idea”. The utility classes are implemented in the “util” project, and the “openapi” jar contains the types needed to develop IntelliJ IDEA plugins.

Here’s the list of the Intellij IDEA projects, and some statistics about their types:

intellij4

Each project is divided into many packages to modularize its code base, following the package-by-feature approach.

Package-by-feature uses packages to reflect the feature set. It places all items related to a single feature (and only that feature) into a single directory/package. This results in packages with high cohesion and high modularity, and with minimal coupling between packages. Items that work closely together are placed next to each other.

Here, for example, are some packages from the idea project, which show that the types are grouped by feature.

intellij1

 

2- The IntelliJ IDEA developers make wide use of the GoF design patterns

Design Patterns are a software engineering concept describing recurring solutions to common problems in software design. GoF patterns are the most popular ones.

The IntelliJ IDEA developers use the GoF patterns extensively; here are some of those found in the source code.

2-1 Factory

A factory isolates the instantiation logic in one place and improves cohesion; here is the list of factories defined in the source code:

intellij3

 

Many factories are implemented; here are some of those inheriting from TextEditorHighlightingPassFactory.

intellij2
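As a rough illustration of the idea (not the actual IntelliJ IDEA code), here is a minimal sketch in which the client only knows a factory interface. All the names below (HighlightingPass, HighlightingPassFactory, SyntaxPassFactory, UnusedImportPassFactory) are hypothetical and chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical product type: one highlighting pass over a file.
interface HighlightingPass {
    String describe();
}

// Hypothetical factory contract; clients depend only on this interface.
interface HighlightingPassFactory {
    HighlightingPass createPass(String fileName);
}

class SyntaxPassFactory implements HighlightingPassFactory {
    @Override
    public HighlightingPass createPass(String fileName) {
        return () -> "syntax pass for " + fileName;
    }
}

class UnusedImportPassFactory implements HighlightingPassFactory {
    @Override
    public HighlightingPass createPass(String fileName) {
        return () -> "unused-import pass for " + fileName;
    }
}

class FactoryDemo {
    // The client asks each factory for a pass without naming any concrete pass class.
    static List<String> runAll(List<HighlightingPassFactory> factories, String fileName) {
        List<String> result = new ArrayList<>();
        for (HighlightingPassFactory factory : factories) {
            result.add(factory.createPass(fileName).describe());
        }
        return result;
    }
}
```

Adding a new kind of pass then only requires a new factory implementation; FactoryDemo.runAll stays unchanged.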

2-2 Adapter

The adapter pattern works as a bridge between two incompatible interfaces. It is classified as a structural pattern because it lets a class written against one interface be used through another.

Many adapters are implemented in the Intellij IDEA source code:

idea8
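To make the idea concrete, here is a small sketch under our own assumptions: OffsetReporter, LineLocator and OffsetToLineAdapter are hypothetical names, not types from the IntelliJ IDEA code base. The adapter exposes an offset-based API through the line-based interface the client expects.

```java
// Hypothetical existing API: reports a position as a character offset.
class OffsetReporter {
    int offsetOf(String text, String word) {
        return text.indexOf(word);
    }
}

// Hypothetical interface expected by the client: 0-based line numbers.
interface LineLocator {
    int lineOf(String text, String word);
}

// Adapter: lets the client use OffsetReporter through LineLocator.
class OffsetToLineAdapter implements LineLocator {
    private final OffsetReporter reporter;

    OffsetToLineAdapter(OffsetReporter reporter) {
        this.reporter = reporter;
    }

    @Override
    public int lineOf(String text, String word) {
        int offset = reporter.offsetOf(text, word);
        if (offset < 0) {
            return -1; // word not found
        }
        int line = 0;
        // Count the line breaks that precede the offset.
        for (int i = 0; i < offset; i++) {
            if (text.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }
}
```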

2-3 Decorator

The decorator pattern can be used to extend (decorate) the functionality of a certain object without altering its structure. Many decorators are implemented in Intellij IDEA.

idea9
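A minimal sketch of the pattern, assuming a hypothetical TextRenderer interface (none of these names come from the IntelliJ IDEA sources): each decorator wraps another renderer and adds behaviour around the call it forwards.

```java
// Hypothetical component interface.
interface TextRenderer {
    String render(String text);
}

// Base implementation: returns the text untouched.
class PlainRenderer implements TextRenderer {
    @Override
    public String render(String text) {
        return text;
    }
}

// Decorator: wraps any TextRenderer and adds bold markup around its output.
class BoldDecorator implements TextRenderer {
    private final TextRenderer inner;

    BoldDecorator(TextRenderer inner) {
        this.inner = inner;
    }

    @Override
    public String render(String text) {
        return "<b>" + inner.render(text) + "</b>";
    }
}

// Second decorator: can be stacked on top of the first one.
class ItalicDecorator implements TextRenderer {
    private final TextRenderer inner;

    ItalicDecorator(TextRenderer inner) {
        this.inner = inner;
    }

    @Override
    public String render(String text) {
        return "<i>" + inner.render(text) + "</i>";
    }
}
```

Decorators compose freely: new ItalicDecorator(new BoldDecorator(new PlainRenderer())) renders bold inside italic, and PlainRenderer itself is never modified.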

2-4 Proxy

A proxy, in its most general form, is a class functioning as an interface to something else.

Here, for example, is how the classes FieldBreakpoint and FrameVariablesTree use two proxies, VirtualMachineProxy and StackFrameProxy. VirtualMachineProxy is referenced through its interface rather than its implementation. That is not the case for StackFrameProxy: FrameVariablesTree is coupled directly to the StackFrameProxyImpl implementation. A refactoring to remove this dependency may be worthwhile.

intellij5
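The sketch below shows one common flavour of proxy, a lazy one, where the client talks to an interface and the real object is only created on first use. FrameInfo, RemoteFrameInfo and FrameInfoProxy are hypothetical names; this is not how the debugger proxies in IntelliJ IDEA are written.

```java
import java.util.function.Supplier;

// Hypothetical subject interface; clients depend on this, not on an implementation.
interface FrameInfo {
    String name();
}

// Hypothetical "expensive" real object; the counter records how many were created.
class RemoteFrameInfo implements FrameInfo {
    static int loads = 0;
    private final String name;

    RemoteFrameInfo(String name) {
        loads++;
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

// Proxy: same interface, creates the real object lazily and then delegates to it.
class FrameInfoProxy implements FrameInfo {
    private final Supplier<FrameInfo> loader;
    private FrameInfo target;

    FrameInfoProxy(Supplier<FrameInfo> loader) {
        this.loader = loader;
    }

    @Override
    public String name() {
        if (target == null) {
            target = loader.get();
        }
        return target.name();
    }
}
```

A client holding a FrameInfo reference cannot tell whether it has the proxy or the real object, which is exactly the kind of decoupling that referencing an implementation class directly gives up.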

2-5 Facade

The facade pattern hides the complexity of a system behind a single, simpler interface through which clients access it. Here’s an example of the CodeStyle facade implemented in IntelliJ IDEA.

intellij6
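As a simplified illustration only (this does not reproduce the real CodeStyle facade), the sketch below hides two hypothetical subsystem classes, Indenter and TrailingSpaceTrimmer, behind one entry point.

```java
// Hypothetical subsystem class: adds a four-space indent.
class Indenter {
    String indent(String line) {
        return "    " + line;
    }
}

// Hypothetical subsystem class: strips trailing whitespace.
class TrailingSpaceTrimmer {
    String trim(String line) {
        return line.replaceAll("\\s+$", "");
    }
}

// Facade: one simple method; callers never touch the subsystem classes.
class CodeFormatterFacade {
    private final Indenter indenter = new Indenter();
    private final TrailingSpaceTrimmer trimmer = new TrailingSpaceTrimmer();

    String format(String line) {
        return indenter.indent(trimmer.trim(line));
    }
}
```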

2-6 Visitor

The visitor design pattern is a way of separating an algorithm from an object structure on which it operates.

The highlighting feature is implemented using a visitor pattern.

intellij7
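Here is a small sketch of how a visitor can drive highlighting. It is our own toy model, with hypothetical Node, Keyword, Identifier and HighlightVisitor types, and it is far simpler than the PSI visitors used in IntelliJ IDEA.

```java
import java.util.List;

// Hypothetical element hierarchy: each node accepts a visitor.
interface Node {
    <R> R accept(NodeVisitor<R> visitor);
}

// One visit method per node type.
interface NodeVisitor<R> {
    R visitKeyword(Keyword keyword);
    R visitIdentifier(Identifier identifier);
}

class Keyword implements Node {
    final String text;
    Keyword(String text) { this.text = text; }
    @Override
    public <R> R accept(NodeVisitor<R> visitor) { return visitor.visitKeyword(this); }
}

class Identifier implements Node {
    final String text;
    Identifier(String text) { this.text = text; }
    @Override
    public <R> R accept(NodeVisitor<R> visitor) { return visitor.visitIdentifier(this); }
}

// The highlighting algorithm lives here, outside the node classes.
class HighlightVisitor implements NodeVisitor<String> {
    @Override
    public String visitKeyword(Keyword keyword) { return "[kw:" + keyword.text + "]"; }
    @Override
    public String visitIdentifier(Identifier identifier) { return identifier.text; }
}

class VisitorDemo {
    static String highlight(List<Node> nodes) {
        HighlightVisitor visitor = new HighlightVisitor();
        StringBuilder out = new StringBuilder();
        for (Node node : nodes) {
            if (out.length() > 0) out.append(' ');
            out.append(node.accept(visitor));
        }
        return out.toString();
    }
}
```

A new analysis (say, counting identifiers) becomes a new NodeVisitor implementation, with no change to Keyword or Identifier.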

2-7 Strategy

There are common situations when classes differ only in their behavior. In these cases, it is a good idea to isolate the algorithms in separate classes in order to have the ability to select different algorithms at runtime.

Many classes implement the strategy pattern in the Intellij IDEA source code:

intellij8
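The following sketch shows the principle with two interchangeable matching algorithms selected at runtime. NameMatcher, PrefixMatcher, CamelHumpsMatcher and Completion are hypothetical names invented for this example, not classes taken from the list above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical strategy interface: how a candidate name is matched against a query.
interface NameMatcher {
    boolean matches(String candidate, String query);
}

// Strategy 1: plain prefix matching.
class PrefixMatcher implements NameMatcher {
    @Override
    public boolean matches(String candidate, String query) {
        return candidate.startsWith(query);
    }
}

// Strategy 2: match the query against the upper-case letters of the candidate.
class CamelHumpsMatcher implements NameMatcher {
    @Override
    public boolean matches(String candidate, String query) {
        StringBuilder humps = new StringBuilder();
        for (char c : candidate.toCharArray()) {
            if (Character.isUpperCase(c)) {
                humps.append(c);
            }
        }
        return humps.toString().startsWith(query);
    }
}

// Context: behaves differently depending on the strategy it was given.
class Completion {
    private final NameMatcher matcher;

    Completion(NameMatcher matcher) {
        this.matcher = matcher;
    }

    List<String> filter(List<String> names, String query) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(name, query)) {
                result.add(name);
            }
        }
        return result;
    }
}
```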

2-8 Builder

The builder pattern lets a client construct a complex object step by step. ControlFlowBuilder is one of the builders implemented in the IntelliJ IDEA source code.

Here are the methods called by the ControlFlowBuilder.build method:

intellij9
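To show the shape of the pattern, here is a deliberately tiny sketch. InstructionList and InstructionListBuilder are hypothetical and only loosely inspired by the idea of building a control flow; they do not mirror the real ControlFlowBuilder API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable product.
final class InstructionList {
    private final List<String> instructions;

    InstructionList(List<String> instructions) {
        // Defensive copy so later changes to the builder do not leak in.
        this.instructions = Collections.unmodifiableList(new ArrayList<>(instructions));
    }

    List<String> instructions() {
        return instructions;
    }
}

// Builder: accumulates the pieces step by step, then produces the product.
class InstructionListBuilder {
    private final List<String> pending = new ArrayList<>();

    InstructionListBuilder write(String variable) {
        pending.add("WRITE " + variable);
        return this;
    }

    InstructionListBuilder read(String variable) {
        pending.add("READ " + variable);
        return this;
    }

    InstructionListBuilder jump(int target) {
        pending.add("GOTO " + target);
        return this;
    }

    InstructionList build() {
        return new InstructionList(pending);
    }
}
```

A client would write something like new InstructionListBuilder().write("x").read("x").jump(0).build() and receive a finished, read-only object.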

3- Coupling 

Low coupling is desirable because a change in one area of an application will require fewer changes throughout the rest of it. In the long run, this can save a lot of the time, effort, and cost associated with modifying an application and adding new features to it.

Interfaces are a common way to lower coupling. Here are three key benefits derived from using them:

  • An interface provides a way to define a contract that promotes reuse. If an object implements an interface, that object must conform to a standard. An object that uses another object is called a consumer. An interface is a contract between an object and its consumer.
  • An interface also provides a level of abstraction that makes programs easier to understand. Interfaces allow developers to talk about the general way code behaves without having to get into a lot of detailed specifics.
  • An interface enforces low coupling between components, which protects the interface consumer from implementation changes in the classes implementing the interface.

Many interfaces and abstract classes are defined in IntelliJ IDEA to keep coupling low:

intellij11

 

And here, in blue, is the distribution of these types across the source code, as shown in the Metric View.

intellij10

In the Metric View, the code base is represented through a treemap. Treemapping is a method for displaying tree-structured data by using nested rectangles. The tree structure used is the usual code hierarchy:

  • Project contains packages.
  • Package contains types.
  • Type contains methods and fields.

The treemap view provides a useful way to represent the result of a CQLinq request; the blue rectangles represent this result, so we can visually see the types concerned by the request.

As we can observe, interfaces and abstract classes are defined in almost all packages, which is useful for exposing the features provided by a package as contracts.

4- Cohesion

The single responsibility principle states that a class should not have more than one reason to change. A class that respects this principle is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. There are several LCOM metrics. LCOM takes its values in the range [0-1]. LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. An LCOM HS value higher than 1 should be considered alarming. Here is how to compute the LCOM metrics:

LCOM = 1 - sum(MF) / (M*F)
LCOM HS = (M - sum(MF)/F) / (M - 1)

Where:

  • M is the number of methods in the class (both static and instance methods are counted, as well as constructors, property getters/setters, and event add/remove methods).
  • F is the number of instance fields in the class.
  • MF is the number of methods of the class accessing a particular instance field.
  • Sum(MF) is the sum of MF over all instance fields of the class.

The underlying idea behind these formulas can be stated as follows: a class is utterly cohesive if all its methods use all its instance fields, which means that sum(MF) = M*F, and then LCOM = 0 and LCOM HS = 0.
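As a quick check of the two formulas, here is a small sketch that computes them from M and the per-field MF counts. The LcomMetrics class is ours, and returning 0 when the formula would divide by zero (no fields, or fewer than two methods) is an assumption made for the example, not something the metric definition states.

```java
class LcomMetrics {
    // mf[i] is MF for instance field i: the number of methods accessing that field.
    static double lcom(int methodCount, int[] mf) {
        int fieldCount = mf.length;
        if (methodCount == 0 || fieldCount == 0) {
            return 0.0; // assumption: treat the degenerate case as cohesive
        }
        return 1.0 - (double) sum(mf) / ((double) methodCount * fieldCount);
    }

    static double lcomHs(int methodCount, int[] mf) {
        int fieldCount = mf.length;
        if (methodCount <= 1 || fieldCount == 0) {
            return 0.0; // assumption: avoid dividing by zero
        }
        return (methodCount - (double) sum(mf) / fieldCount) / (methodCount - 1);
    }

    private static int sum(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }
}
```

For a class with M = 4 methods and F = 2 fields, if every method uses both fields (MF = 4 and 4) both metrics are 0. If each field is used by a single method (MF = 1 and 1), LCOM = 1 - 2/8 = 0.75 and LCOM HS = (4 - 1)/3 = 1.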


intellij12

 

Only very few types could be considered as not cohesive.

5- Multithreading and concurrency

To keep IntelliJ IDEA responsive, many threads are created to do work in the background, which improves the user experience.

Let’s search for all methods that start threads, directly or indirectly:

intellij13

The concurrency logic is isolated in the following packages:

intellij14

To make concurrent code easier to write, the JSR 166 concurrency utilities are used.

Here’s the list of all the types used from the jsr166 jar:

intellij32
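JSR 166 is the specification behind the java.util.concurrent package, so the standard JDK classes give a good idea of the style of code involved. The sketch below is a generic example of moving work to a thread pool; BackgroundWordCounter is a hypothetical class and is not taken from IntelliJ IDEA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundWordCounter {
    // Counts words in each document on a small pool, so the caller's thread
    // only has to wait for the results.
    static int countWords(List<String> documents) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String document : documents) {
                Callable<Integer> task = () -> document.trim().split("\\s+").length;
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that task is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```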

 

6- Abstractness vs Instability graph

The idea behind this graph is that the more popular a code element is, the more abstract it should be. In other words, avoid depending directly on implementations; depend on abstractions instead. By popular code element I mean a project (the idea also works for packages and types) that is heavily used by other projects of the program.

It is not a good idea to have very popular concrete types in your code base. This creates zones of pain in your program, where changing an implementation can potentially affect a large portion of the program. And implementations are known to evolve more often than abstractions.

The main sequence line (dotted) in the below diagram shows how abstractness and instability should be balanced. A stable component would be positioned on the left. If you check the main sequence you can see that such a component should be very abstract to be near the desirable line – on the other hand, if its degree of abstraction is low, it is positioned in an area that is called the “zone of pain”.

AbstractnessVSInstability

Only util is in the zone of pain, which is not really problematic. Indeed, a utility library generally provides concrete utility classes rather than features defined by interfaces.
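The article does not spell out the formulas behind the graph. The sketch below uses the standard definitions from Robert C. Martin’s package metrics: instability I = Ce / (Ca + Ce), abstractness A = abstract types / total types, and distance from the main sequence D = |A + I - 1|. The MainSequence class and the numbers in the usage note are made up for illustration; they are not measurements of the util project.

```java
class MainSequence {
    // Ca = afferent couplings (who depends on me), Ce = efferent couplings (who I depend on).
    static double instability(int afferent, int efferent) {
        int total = afferent + efferent;
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    // Share of abstract classes and interfaces among all types of the component.
    static double abstractness(int abstractTypes, int totalTypes) {
        return totalTypes == 0 ? 0.0 : (double) abstractTypes / totalTypes;
    }

    // Normalized distance from the main sequence line A + I = 1.
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }
}
```

A component used by 9 others and depending on 1 (I = 0.1) with 1 abstract type out of 10 (A = 0.1) has D = 0.8: stable and concrete, which is what places a component in the zone of pain.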

7- Open API and plugin system

Plugins allow you to extend IntelliJ IDEA. The “openapi” jar is provided for this purpose.

The openapi jar provides many interfaces that represent all the features we can use and extend from our plugins.

intellij24

An IntelliJ IDEA plugin contains one or more actions; several thousand actions are implemented in the source code, as shown by the following CQLinq query:

intellij23

Exploring an existing action can help developers write their own plugins more easily.

8- Improve performance using cache

Using a cache is a popular way to optimize your application. Intellij IDEA uses two cache managers:

intellij26

The CacheManager interface is used by the FindInProjectTask to search for words.

Here is the list of all methods called by the FindInProjectTask.getFilesForFastWordSearch method:

intellij25
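To illustrate why a cache helps for word search, here is a minimal sketch of a word-to-files cache that only scans on a miss. WordIndexCache and its scanner function are hypothetical; this is not the CacheManager interface from IntelliJ IDEA.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class WordIndexCache {
    private final Map<String, List<String>> cache = new HashMap<>();
    private final Function<String, List<String>> scanner; // the expensive lookup
    int misses = 0; // how many times the scanner was actually invoked

    WordIndexCache(Function<String, List<String>> scanner) {
        this.scanner = scanner;
    }

    // Returns the files containing the word, scanning only the first time.
    List<String> filesContaining(String word) {
        return cache.computeIfAbsent(word, w -> {
            misses++;
            return scanner.apply(w);
        });
    }
}
```

A real cache also needs an invalidation policy when files change, which this sketch leaves out.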

9- External libraries used

IntelliJ IDEA uses many external jars. Here’s the list of all the jars used:
intellij21
intellij22

When external libraries are used, it’s worth checking whether a third-party library can easily be replaced by another one without impacting the whole application. There are many reasons that can encourage us to switch. The other library could:

  • Have more features.
  • Be more powerful.
  • Be more secure.

Let’s discover if some external libs are highly coupled or not.

Swing:

Swing implements a set of components for building graphical user interfaces (GUIs) and adding rich graphics functionality and interactivity to Java applications. The Swing components are implemented entirely in the Java programming language. The pluggable look and feel lets you create GUIs that can either look the same across platforms or can assume the look and feel of the current OS platform (such as Microsoft Windows, Solaris™ or Linux).

Let’s search for all types that use Swing components directly:

intellij18

Many types use Swing components directly, as shown by the following treemap, which shows these types in blue.

intellij16

It would not be easy to replace Swing with another GUI framework. And even if Swing is a subject of controversy, the amazing GUI of IntelliJ IDEA proves that Swing is a good choice for GUI needs.

Netty:

Netty is an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients.

Here’s the list of all types using this library:

intellij20

Only a few types use it directly, which is very useful if we want to replace it with another library.

ASM:

ASM is a very small and very fast Java bytecode manipulation framework. It has become very popular, and many tools use it. We also use it in our tool JArchitect to analyse the bytecode.

Here’s the list of all types using ASM directly:

intellij19

As with Netty, the use of ASM is isolated in a few packages, so we could replace it easily.

Except for Swing, almost none of the external jars are highly coupled to IntelliJ IDEA.

10- Statistics

10-1 Most used type

It’s interesting to know the most used types in a project; indeed, these types must be well designed, implemented and tested, and any change made to them could impact the whole project.

We can find them using the TypesUsingMe metric:

intellij28

However, there’s another interesting metric to search for popular types: TypeRank.

TypeRank values are computed by applying the Google PageRank algorithm on the graph of types’ dependencies. A homothety of center 0.15 is applied to make it so that the average of TypeRank is 1.

Types with high TypeRank should be more carefully tested because bugs in such types will likely be more catastrophic.
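To give a feel for how such a ranking behaves, here is a simplified PageRank-style sketch over a type dependency graph. It is our own approximation, not JArchitect’s implementation: the TypeRankSketch class, the 0.85 damping factor (the usual PageRank value, assumed here to correspond to the 0.15 above) and the even redistribution of rank from types with no dependencies are all assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TypeRankSketch {
    // uses.get(t) lists the types that t depends on; rank flows from a type to the types it uses.
    static Map<String, Double> rank(Map<String, List<String>> uses, int iterations) {
        Set<String> types = new TreeSet<>(uses.keySet());
        for (List<String> targets : uses.values()) {
            types.addAll(targets);
        }
        int n = types.size();
        double damping = 0.85; // assumption: standard PageRank damping

        Map<String, Double> rank = new TreeMap<>();
        for (String type : types) {
            rank.put(type, 1.0); // start with an average of 1
        }
        for (int k = 0; k < iterations; k++) {
            Map<String, Double> next = new TreeMap<>();
            for (String type : types) {
                next.put(type, 1.0 - damping);
            }
            double dangling = 0.0;
            for (String type : types) {
                List<String> targets = uses.getOrDefault(type, List.of());
                if (targets.isEmpty()) {
                    dangling += rank.get(type); // no outgoing dependency
                    continue;
                }
                double share = damping * rank.get(type) / targets.size();
                for (String target : targets) {
                    next.merge(target, share, Double::sum);
                }
            }
            // Spread the rank of types without dependencies evenly, keeping the average at 1.
            for (String type : types) {
                next.merge(type, damping * dangling / n, Double::sum);
            }
            rank = next;
        }
        return rank;
    }
}
```

With a graph where types A and B both depend on C, C ends up with the highest rank while the average over the three types stays at 1. Unlike a raw count of users, the rank of a type also depends on how highly ranked its users are.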

Here’s the result of all popular types according to the TypeRank metric:

intellij27

With this metric, PsiElement becomes the most popular type, ahead of the Project interface.

10-2 Most used methods

intellij30

10-3 Methods calling many other methods

It’s also interesting to know which methods call many other methods; such methods may reveal a design problem, and in some cases a refactoring is needed to make them more readable and maintainable.

intellij31

Summary

IntelliJ IDEA is very well designed and implemented: many patterns are used and many best practices are applied. Exploring its source code is a pragmatic way to learn how to design and implement your own application, and it complements books and web articles as a way to improve your design skills.