Hacking on Clang

This document provides some hints for how to get started hacking on Clang for developers who are new to the Clang and/or LLVM codebases.

Coding Standards

Clang follows the LLVM Coding Standards. When submitting patches, please take care to follow these standards and to match the style of the code to that present in Clang (for example, in terms of indentation, bracing, and statement spacing).

Clang has a few additional coding standards:

Developer Documentation

Both Clang and LLVM use doxygen to provide API documentation. Their respective web pages (generated nightly) are here:

For work on the LLVM IR generation, the LLVM assembly language reference manual is also useful.

Debugging

Inspecting data structures in a debugger:

Debugging using Visual Studio

The files llvm/utils/LLVMVisualizers/llvm.natvis and clang/utils/ClangVisualizers/clang.natvis provide debugger visualizers that make debugging of more complex data types much easier.

Depending on how you configure the project, Visual Studio may automatically use these visualizers when debugging or you may be required to put the files into %USERPROFILE%\Documents\Visual Studio <version>\Visualizers or create a symbolic link so they update automatically. See Microsoft's documentation for more details on use of NATVIS.

Testing

Testing on Unix-like Systems

Clang includes a basic regression suite in the tree which can be run with make test from the top-level clang directory, or just make in the test sub-directory. make VERBOSE=1 can be used to show more detail about what is being run.

If you built LLVM and Clang using CMake, the test suite can be run with make check-clang from the top-level LLVM directory.

The tests primarily consist of a test runner script running the compiler under test on individual test files grouped in the directories under the test directory. The individual test files include comments at the beginning indicating the Clang compile options to use, to be read by the test runner. Embedded comments also can do things like telling the test runner that an error is expected at the current line. Any output files produced by the test will be placed under a created Output directory.
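As a rough illustration of the paragraph above, the sketch below scans a test file for the two kinds of comments it describes. It assumes the conventional `RUN:` and `expected-error {{...}}` comment forms, which the text above does not spell out. The embedded test file, the regular expressions, and the `scan_test` helper are hypothetical and far simpler than the real test runner.

```python
import re

# A hypothetical test file, held in a string so the sketch is self-contained.
TEST_SOURCE = """\
// RUN: %clang_cc1 -fsyntax-only -verify %s

void f(void) {
  int x = undeclared; // expected-error {{use of undeclared identifier 'undeclared'}}
}
"""

# Assumed comment forms: "RUN: <command>" and "expected-<kind> {{<text>}}".
RUN_LINE = re.compile(r"RUN:\s*(.*)")
EXPECTED = re.compile(r"expected-(error|warning|note)\s*\{\{(.*?)\}\}")


def scan_test(source):
    """Collect RUN commands and (line, kind, text) expectation triples."""
    run_commands = []
    expectations = []
    for number, line in enumerate(source.splitlines(), start=1):
        run = RUN_LINE.search(line)
        if run:
            run_commands.append(run.group(1).strip())
        for kind, text in EXPECTED.findall(line):
            expectations.append((number, kind, text))
    return run_commands, expectations


runs, expected = scan_test(TEST_SOURCE)
print(runs)
print(expected)
```

The point of the sketch is only that the compile options and the expected diagnostics both travel inside the test file itself, so a test is a single self-describing source file.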

During the run of make test, the terminal output will display a line similar to the following:

--- Running clang tests for i686-pc-linux-gnu ---

followed by a line continually overwritten with the current test file being compiled, and an overall completion percentage.

After the make test run completes, the absence of any Failing Tests (count): message indicates that no tests failed unexpectedly. If any tests did fail, the Failing Tests (count): message will be followed by a list of the test source file paths that failed. For example:

  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
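If you capture the output in a log, the failing paths can be pulled out mechanically. The following is a minimal sketch that assumes the message format shown in the example above; `failing_tests` is a hypothetical helper, not part of the test runner.

```python
import re

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
--- Running clang tests for i686-pc-linux-gnu ---
  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
"""

HEADER = re.compile(r"^\s*Failing Tests \((\d+)\):\s*$")


def failing_tests(output):
    """Return the paths listed under a 'Failing Tests (count):' header."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        match = HEADER.match(line)
        if match:
            count = int(match.group(1))
            # The next `count` lines hold one test path each.
            return [entry.strip() for entry in lines[index + 1:index + 1 + count]]
    return []  # no header: no tests failed unexpectedly


failed = failing_tests(SAMPLE_OUTPUT)
print(len(failed))
```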

If you used the make VERBOSE=1 option, the terminal output will reflect the error messages from the compiler and test runner.

The regression suite can also be run with Valgrind by running make test VG=1 in the top-level clang directory.

For more intensive changes, running the LLVM Test Suite with clang is recommended. Currently the best way is to override LLVMGCC, as in: make LLVMGCC="clang -std=gnu89" TEST=nightly report (make sure clang is in your PATH or use the full path).

Testing using Visual Studio on Windows

The Clang test suite can be run from either Visual Studio or the command line.

Note that the test runner is based on Python, which must be installed. Find Python at: https://www.python.org/downloads/. Download the latest stable version.

The GnuWin32 tools are also necessary for running the tests. Get them from http://getgnuwin32.sourceforge.net/. If the %PATH% environment variable does not include GnuWin32, or if another grep supersedes the GnuWin32 one on %PATH%, you should specify LLVM_LIT_TOOLS_DIR to CMake explicitly.

The CMake build tool is set up to create Visual Studio project files for running the tests, with "check-clang" being the root. Therefore, to run the tests from Visual Studio, right-click the check-clang project and select "Build".

Please see also Getting Started with the LLVM System using Microsoft Visual Studio and Building LLVM with CMake.

Testing on the Command Line

If you want more control over how the tests are run, it may be convenient to run the test harness directly from the command line. Before doing so, you will need to ensure that lit.site.cfg files have been created for your build. You can do this by running the tests as described in the previous sections. Because the files are generated before any tests run, you can stop the run with Ctrl+C once the tests have started.

Once that is done, to run all the tests from the command line, execute a command like the following:

  python (path to llvm)\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg
  (path to llvm)\llvm\tools\clang\test

For CMake builds, e.g. on Windows with Visual Studio, you will need to specify your build configuration (Debug, Release, etc.) via --param=build_config=(build config). You may also need to specify the build mode (Win32, etc.) via --param=build_mode=(build mode).

Additionally, you will need to specify the lit site configuration which lives in (build dir)\tools\clang\test, via --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg.

To run a single test:

  python (path to llvm)\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg
  (path to llvm)\llvm\tools\clang\test\(dir)\(test)

For example:

  python C:\Tools\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=C:\Tools\build\tools\clang\test\lit.site.cfg
  C:\Tools\llvm\tools\clang\test\Sema\wchar.c

The -sv option above tells the runner to show the test output if any tests failed, to help you determine the cause of failure.

You can also pass in the --no-progress-bar option if you wish to disable progress indications while the tests are running.
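Since the command is long and easy to mistype, it can help to assemble it programmatically. The sketch below builds the argument list from the pieces described above. `lit_command` is a hypothetical helper, and the directory layout it assumes is the one used in the examples in this section.

```python
from pathlib import PureWindowsPath


def lit_command(llvm_src, build_dir, test=None,
                build_mode="Win32", build_config="Debug"):
    """Assemble the lit.py argument list (hypothetical helper).

    llvm_src is the LLVM source directory, e.g. C:\\Tools\\llvm.
    test is a path relative to the clang test directory, or None for all tests.
    """
    llvm_src = PureWindowsPath(llvm_src)
    build_dir = PureWindowsPath(build_dir)
    test_root = llvm_src / "tools" / "clang" / "test"
    site_config = build_dir / "tools" / "clang" / "test" / "lit.site.cfg"
    return [
        "python",
        str(llvm_src / "utils" / "lit" / "lit.py"),
        "-sv",
        f"--param=build_mode={build_mode}",
        f"--param=build_config={build_config}",
        f"--param=clang_site_config={site_config}",
        str(test_root / test) if test else str(test_root),
    ]


command = lit_command(r"C:\Tools\llvm", r"C:\Tools\build", test=r"Sema\wchar.c")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run unchanged. Omitting the test argument yields the command that runs the whole suite.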

Your output might look something like this:

lit.py: lit.cfg:152: note: using clang: 'C:\Tools\llvm\bin\Release\clang.EXE'
-- Testing: 2534 tests, 4 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 81.52s
  Passed           : 2503
  Expectedly Failed:   28
  Unsupported      :    3

The "Failed" statistic (not shown if all tests pass) is the important one.
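A script that wraps the test run can check this statistic directly. The following sketch assumes the summary layout shown in the sample output above; `summary_counts` is a hypothetical helper, and a missing "Failed" entry is treated as zero failures.

```python
import re

# Summary lines copied from the sample output above.
SAMPLE_SUMMARY = """\
Testing Time: 81.52s
  Passed           : 2503
  Expectedly Failed:   28
  Unsupported      :    3
"""

# A statistic line is "<name> : <integer>"; the timing line does not match
# because its value is not a plain integer.
STAT = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)\s*$")


def summary_counts(output):
    """Map each statistic name in the summary to its integer count."""
    counts = {}
    for line in output.splitlines():
        match = STAT.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = summary_counts(SAMPLE_SUMMARY)
unexpected_failures = counts.get("Failed", 0)
print(unexpected_failures)
```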

Testing changes affecting libc++

Some changes in Clang affect libc++, for example:

After adjusting libc++ to work with the changes, the next revision will be tested by libc++'s pre-commit CI.

For most configurations, the pre-commit CI uses a recent nightly build of Clang from LLVM's main branch. These configurations do not use the Clang changes in the patch. They only use the libc++ changes.

The "Bootstrapping build" builds Clang and uses it to build and test libc++. This build does use the Clang changes in the patch.

Libc++ supports multiple versions of Clang. Therefore, when a patch changes the diagnostics, it may be necessary to use a regex in the "expected" tests so that they pass the CI.
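The idea behind such a regex can be tried out in isolation before editing a test. In the sketch below, the two diagnostic strings are illustrative wordings only, standing in for the output of two supported Clang versions. In an actual test the pattern would go into a regex-enabled directive such as `expected-error-re`, whose exact form should be checked against the existing libc++ tests.

```python
import re

# Illustrative wordings only: the same diagnostic as two Clang versions
# might phrase it.
OLDER = "error: static_assert failed"
NEWER = "error: static assertion failed"

# One alternation covers both spellings, so the check passes on either version.
PATTERN = re.compile(r"error: (static_assert|static assertion) failed")

matches = [bool(PATTERN.search(text)) for text in (OLDER, NEWER)]
print(matches)
```

Keeping the alternation as narrow as possible avoids accidentally accepting unrelated diagnostics.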

Libc++ has more documentation about the pre-commit CI. For questions regarding libc++, the best place to ask is the #libcxx channel on LLVM's Discord server.

Creating Patch Files

To contribute changes to Clang, see LLVM's Getting Started page.

LLVM IR Generation

The LLVM IR generation part of clang handles conversion of the AST nodes output by the Sema module to the LLVM Intermediate Representation (IR). Historically, this was referred to as "codegen", and the Clang code for this lives in lib/CodeGen.

The output is most easily inspected using the -emit-llvm option to clang (possibly in conjunction with -o -). You can also use -emit-llvm-bc to write an LLVM bitcode file which can be processed by the suite of LLVM tools like llvm-dis, llvm-nm, etc. See the LLVM Command Guide for more information.
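As a quick way to try this, the sketch below writes a one-function C file and asks the clang driver to print its IR to standard output. It assumes clang is on your PATH and adds -S, which the driver needs in order to emit textual IR rather than attempt to link. The helper names and the sample function are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A tiny translation unit to generate IR for (illustrative only).
C_SOURCE = "int add(int a, int b) { return a + b; }\n"


def emit_llvm_command(source_path, clang="clang"):
    """Build a driver command that prints textual LLVM IR to stdout."""
    # -S requests textual output; combined with -emit-llvm that is LLVM assembly.
    return [clang, "-S", "-emit-llvm", "-o", "-", str(source_path)]


def emit_llvm(source_text):
    """Return the IR for source_text, or None when clang is not on PATH."""
    if shutil.which("clang") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "add.c"
        path.write_text(source_text)
        result = subprocess.run(emit_llvm_command(path),
                                capture_output=True, text=True, check=True)
        return result.stdout


ir = emit_llvm(C_SOURCE)
print(ir if ir is not None else "clang not found on PATH")
```

Reading the IR for a small function like this is often the fastest way to see what a change under lib/CodeGen actually produces.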