We record gflags-specific information by defining a custom node kind.

Tools can record arbitrary library- or project-specific information in the Kythe graph. We recently added support for the gflags command line flags parsing library to demonstrate this. gflags command-line flags are declared and defined in C++ using macros:

DECLARE_bool(secure);
DEFINE_string(address, "127.0.0.1", "Listen on this address.");

(...)

auto my_uri = (FLAGS_secure ? "https://" : "http://") + FLAGS_address + ":80";

The Kythe C++ indexer will record that the flag variable FLAGS_address is defined inside the macro. One can go further and treat the flag being defined as a first-class object. In this view, the DEFINE_string macro defines a new flag object to which FLAGS_address refers. The data about the underlying C++ variables remain in the graph, but now we have added some library-specific semantic information.

In our model of the gflags library, each flag definition or declaration gives rise to a node with kind google/gflag. We add this kind underneath the google/ prefix to avoid polluting the base namespace of Kythe nodes. Like variables, gflags can be definitions (via DEFINE_) or incomplete (via DECLARE_); they are also named with the identifier that the programmer writes down in the macro invocation. This name is distinct from the name of the variable that the macro creates. Later references to that variable are also references to the flag with which it is associated.

With a complete implementation, the following verifier test should pass:

// Checks that we can complete a string flag decl.
#include "flags_string.h"
  // This header contains these lines:
  // #include "gflags.h"
  // DECLARE_string(stringflag);
//- StringFlagDeclAnchor defines StringFlagDecl
//- StringFlagDecl.complete incomplete
//- StringFlagDecl.node/kind google/gflag
//- @stringflag defines StringFlag
//- StringFlag.complete definition
//- StringFlag.node/kind google/gflag
//- @stringflag completes StringFlagDecl
DEFINE_string(stringflag, "gnirts", "rtsgni");
//- @FLAGS_stringflag ref StringFlag
//- @FLAGS_stringflag ref FlagVar
//- FlagVar.node/kind variable
auto s = FLAGS_stringflag;

The actual business of finding and labeling flags requires some work with Clang’s syntax tree. We define an auxiliary function that, given a variable declaration (a clang::VarDecl), walks around in the tree to try to find the location of the flag identifier in the macro that caused that declaration. Of course, not many variables will be associated with flags. In that case, this routine (GetVarDeclFlagDeclLoc in //kythe/cxx/indexer/cxx/IndexerLibrarySupport.cc) returns a result that is marked invalid. Since this function will be called once per variable declaration discovered by the indexer as it traverses the syntax tree, we take extra care to quickly reject variables that could not possibly be flags (such as variables that do not begin with "FLAGS_" or variables that were not declared inside a macro).

We now have our graph representation and a procedure to collect the information necessary to generate it. Now, by using the C++ indexer’s LibrarySupport interface, we can listen for variable declarations and references to check for the ones for which we need to emit flag annotations. This interface differs from Clang’s RecursiveASTVisitor as it provides Kythe-specific information as well as pointers into the AST; for example, it will provide VNames for the variables it encounters. Since every flag is associated with at least one variable definition or declaration (which is, in turn, not associated with any other flag), and because the VName for that variable def/decl is a globally-unique identifier, we can use it as a base for the name of the google/gflag node we will create. In practice, building this derived name requires only that we add a prefix to the signature component of the VName (and by convention, we’ll use google/gflag#).

With this logic in place, we can check our verifier test above (and write some other ones, too). The new data in the graph are available to any Kythe tool. These tools are free to ignore it, to present it as generic graph relationships, or to interpret it with special knowledge about what being a google/gflag implies.