Design Notes

AST Traversal

During the AST traversal stage, the complete AST (generated by the clang frontend) is walked, beginning with the root TranslationUnitDecl node. It is during this stage that USRs (Unified Symbol Resolution strings) are generated and hashed with SHA1 to form the 160-bit SymbolID for an entity. With the exception of built-in types, every entity referenced in the corpus is traversed and assigned a SymbolID, including those from the standard library. This is necessary to generate the full interface for user-defined types.
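As a rough illustration of how a USR string can be turned into a 160-bit identifier, the sketch below implements plain SHA-1 and stores the digest in a 20-byte array. The names `makeSymbolID` and `toHex`, the layout of `SymbolID`, and the sample USR-like strings in the usage note are assumptions made for this example; the real tool would more likely rely on an existing SHA-1 implementation than on a hand-written one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed layout for this sketch: a 160-bit identifier stored as 20 raw bytes.
using SymbolID = std::array<std::uint8_t, 20>;

static std::uint32_t rotl(std::uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

// Plain SHA-1 over a byte string.
SymbolID sha1(std::string const& msg)
{
    std::uint32_t h[5] = {
        0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};

    // Padding: 0x80, zeros up to 56 mod 64, then the bit length (big-endian).
    std::vector<std::uint8_t> data(msg.begin(), msg.end());
    std::uint64_t const bitLen = static_cast<std::uint64_t>(msg.size()) * 8;
    data.push_back(0x80);
    while (data.size() % 64 != 56)
        data.push_back(0x00);
    for (int i = 7; i >= 0; --i)
        data.push_back(static_cast<std::uint8_t>(bitLen >> (8 * i)));
    assert(data.size() % 64 == 0);

    for (std::size_t off = 0; off < data.size(); off += 64)
    {
        // Message schedule: 16 big-endian words expanded to 80.
        std::uint32_t w[80];
        for (int t = 0; t < 16; ++t)
            w[t] = (std::uint32_t(data[off + 4 * t]) << 24)
                 | (std::uint32_t(data[off + 4 * t + 1]) << 16)
                 | (std::uint32_t(data[off + 4 * t + 2]) << 8)
                 |  std::uint32_t(data[off + 4 * t + 3]);
        for (int t = 16; t < 80; ++t)
            w[t] = rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);

        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int t = 0; t < 80; ++t)
        {
            std::uint32_t f, k;
            if (t < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (t < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (t < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            std::uint32_t const tmp = rotl(a, 5) + f + e + k + w[t];
            e = d; d = c; c = rotl(b, 30); b = a; a = tmp;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    // Serialize the five state words big-endian into the 20-byte ID.
    SymbolID id{};
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 4; ++j)
            id[4 * i + j] = static_cast<std::uint8_t>(h[i] >> (24 - 8 * j));
    return id;
}

// Hypothetical helper: the SymbolID is simply the SHA-1 digest of the USR.
SymbolID makeSymbolID(std::string const& usr)
{
    return sha1(usr);
}

// Lowercase hexadecimal rendering, handy for logs and file names.
std::string toHex(SymbolID const& id)
{
    static char const digits[] = "0123456789abcdef";
    std::string out;
    for (std::uint8_t byte : id)
    {
        out.push_back(digits[byte >> 4]);
        out.push_back(digits[byte & 0x0F]);
    }
    return out;
}
```

Because the ID depends only on the USR text, two translation units that see the same entity compute the same SymbolID independently. For instance, `makeSymbolID("c:@F@f#")` yields the same 20 bytes on every call, while a different string such as `"c:@F@g#"` yields a different ID.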

Bitcode

AST traversal is performed in parallel on a per-translation-unit basis. To maximize the size of the code base MrDox is capable of processing, Info types generated during traversal are serialized to a compressed bitcode representation. Once AST traversal is complete for all translation units, the bitcode is deserialized back into Info types, and then merged to form the corpus. The merging step is necessary because the same entity may be defined identically in multiple translation units (e.g. class types, templates, inline functions, etc.), and because a function may be declared in one translation unit and defined in another.
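The merging step can be pictured with a small model. In the sketch below, `FunctionInfo`, `Location`, `merge`, and `mergeAll` are hypothetical names, a string stands in for the SymbolID, and the bitcode round trip is omitted; it only shows how records for the same entity coming from different translation units collapse into one.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Location
{
    std::string file;
    unsigned line = 0;

    bool operator==(Location const& other) const
    {
        return file == other.file && line == other.line;
    }
};

// Simplified stand-in for an Info type describing a function.
struct FunctionInfo
{
    std::string id;                      // stand-in for the SymbolID
    std::string name;
    std::optional<Location> definition;  // set only where the body was seen
    std::vector<Location> declarations;
};

// Fold `other` into `into`. Both must describe the same entity.
void merge(FunctionInfo& into, FunctionInfo&& other)
{
    assert(into.id == other.id && "only Infos for the same entity can be merged");

    // A definition seen in any translation unit wins over "declared only".
    if (!into.definition && other.definition)
        into.definition = std::move(other.definition);

    // Identical declarations (e.g. from a shared header) are kept once.
    for (Location& loc : other.declarations)
        if (std::find(into.declarations.begin(), into.declarations.end(), loc)
                == into.declarations.end())
            into.declarations.push_back(std::move(loc));
}

// Combine the per-translation-unit results into one map keyed by ID.
std::map<std::string, FunctionInfo>
mergeAll(std::vector<std::vector<FunctionInfo>> perUnit)
{
    std::map<std::string, FunctionInfo> corpus;
    for (auto& unit : perUnit)
    {
        for (FunctionInfo& info : unit)
        {
            auto it = corpus.find(info.id);
            if (it == corpus.end())
            {
                std::string key = info.id;
                corpus.emplace(std::move(key), std::move(info));
            }
            else
            {
                merge(it->second, std::move(info));
            }
        }
    }
    return corpus;
}
```

With one unit that only sees a declaration in a header and another that also sees the definition, `mergeAll` produces a single entry that carries the definition and one copy of the shared declaration.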

The Corpus

After AST traversal and Info merging, the result is stored as a map of `Info`s indexed by their respective `SymbolID`s. Documentation generators may traverse this structure by calling `Corpus::traverse` with a `Corpus::Visitor`-derived visitor and the `SymbolID` of the entity to visit (e.g. the global namespace).
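A simplified model of this structure is sketched below. It is not the project's actual declaration: the `sketch` namespace, the string-based `SymbolID`, the `members` field, and the exact signatures of `insert`, `traverse`, and `visit` are assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

using SymbolID = std::string; // stand-in; the real ID is a 160-bit hash

struct Info
{
    SymbolID id;
    std::string name;
    std::vector<SymbolID> members; // IDs of directly contained entities
};

class Corpus
{
public:
    struct Visitor
    {
        virtual ~Visitor() = default;
        // Return false to stop the traversal early.
        virtual bool visit(Info const& info) = 0;
    };

    void insert(Info info)
    {
        SymbolID key = info.id;
        bool const added =
            infos_.emplace(std::move(key), std::move(info)).second;
        assert(added && "each SymbolID appears once after merging");
        (void)added;
    }

    // Visit the entity with the given ID, then its members, depth first.
    // Returns false if the visitor asked to stop.
    bool traverse(Visitor& visitor, SymbolID const& id) const
    {
        auto const it = infos_.find(id);
        if (it == infos_.end())
            return true; // unknown ID: nothing to visit
        if (!visitor.visit(it->second))
            return false;
        for (SymbolID const& member : it->second.members)
            if (!traverse(visitor, member))
                return false;
        return true;
    }

private:
    std::map<SymbolID, Info> infos_;
};

} // namespace sketch
```

A generator would derive from `sketch::Corpus::Visitor`, override `visit`, and start the walk at the ID of the global namespace. Because entities refer to each other by ID rather than by pointer, the map can be filled in any order before traversal begins.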

Namespaces

Namespaces do not have a source location. This is because the same namespace can be opened in many places, so no single location describes it. We probably don’t want to store any javadocs for namespaces either.

Paths

The AST visitor and metadata all use forward slashes to represent file pathnames, even on Windows. This is so the generated reference documentation does not vary based on the platform.
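A minimal sketch of such a normalization step follows. The function name `toForwardSlashes` is hypothetical, and the sketch assumes that every backslash in the input is a directory separator, which holds for Windows paths but not necessarily for file names on other systems.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Rewrite every backslash as a forward slash so that the same source tree
// produces the same path strings on every platform.
std::string toForwardSlashes(std::string path)
{
    std::replace(path.begin(), path.end(), '\\', '/');
    assert(path.find('\\') == std::string::npos);
    return path;
}
```

Paths that already use forward slashes pass through unchanged, so the function can be applied unconditionally wherever a path enters the metadata.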

Exceptions

In functions which cannot return an error, such as a constructor or work submitted to a thread pool, the implementation usually throws Error. These exceptions are caught and reported, and the process exits gracefully. If any exception is thrown which is not derived from Error, then it should not be caught. The uncaught exception handler should print a stack trace and exit the process immediately.
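The policy can be illustrated with a small wrapper around a unit of work. Everything here is a stand-in: the `Error` type below merely derives from `std::runtime_error`, `runTask` is a hypothetical name, and the work is run inline rather than on a real thread pool so that only the catch policy is shown.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the project's Error type.
struct Error : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Run one unit of work, as a worker thread might.
// Returns an error report if the task threw Error, or nothing on success.
std::optional<std::string> runTask(std::function<void()> const& task)
{
    assert(task && "task must be callable");
    try
    {
        task();
        return std::nullopt;
    }
    catch (Error const& e)
    {
        // Expected failure: report it so the caller can exit gracefully.
        return std::string("error: ") + e.what();
    }
    // Deliberately no catch (...): any other exception indicates a bug and
    // is left to propagate to the uncaught exception handler.
}
```

Catching only `Error` keeps expected failures, such as bad input, separate from programming mistakes. A task that throws `std::logic_error`, for example, is not converted into a report and escapes `runTask` unchanged.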