MADlib® uses Doxygen for documentation. Doxygen is a standard tool for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C and Python.
MADlib Doxygen Sites
User doc (latest): http://madlib.incubator.apache.org/docs/latest/
Doc web page: http://madlib.incubator.apache.org/documentation.html
Source: https://github.com/apache/incubator-madlib/tree/master/src/ports/postgres/modules
Documenting SQL
- SQL documentation is supported by a Doxygen filter that translates CREATE FUNCTION / CREATE AGGREGATE statements to (empty) C++ function definitions. The source code for the SQL2C++ filter consists of the flex and bison files sql.ll and sql.yy at https://github.com/apache/incubator-madlib/tree/master/doc/src.
Current features:
- Translate CREATE FUNCTION and CREATE AGGREGATE statements into empty C++ function definitions
- Both inline (C-style) comments of the form /** ... */ and end-of-line comments of the form --! ...\n are recognized as Doxygen comments.
- Since PostgreSQL and Greenplum disallow labeling the arguments of aggregate functions, the filter will automatically uncomment C-style comments that start with /*+ (currently only at spots where it makes sense). The same can be used for default arguments. (This is useful when using function overloading to mimic default arguments, which are not supported by Greenplum or PostgreSQL <= 8.2.) Example:
    CREATE AGGREGATE fancyAggregate(/*+ "identifierA" */ INTEGER) ( ... )
    CREATE FUNCTION amazingFn(val DOUBLE PRECISION /*+ DEFAULT .01 */) RETURNS INTEGER ...
  will be translated into:
    <inferredReturnType> fancyAggregate(integer identifierA) { };
    integer amazingFn(float8 val = .01) { };
- For aggregates, the return type will be automatically inferred from the transition state / final function
- Capitalization of identifiers will be preserved if put in quotes, e.g. "iDeNtiFiEr"
- Line numbers are preserved
- Still to be implemented:
  - Support for all PostgreSQL types
  - Automatically generate documentation for type definitions
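
The uncommenting of /*+ ... */ comments described under "Current features" can be illustrated in isolation. The following is a minimal Python sketch, not the actual filter (which is written in flex and bison); the function name uncomment_hints and the regular expression are assumptions made for illustration. It only strips the comment markers and does not perform the later translation to a C++ signature.

```python
import re

# Matches a C-style comment that starts with "/*+" and captures its body.
# Assumption: the body of such a comment never contains "*/" itself.
HINT_COMMENT = re.compile(r"/\*\+\s*(.*?)\s*\*/", re.DOTALL)


def uncomment_hints(sql):
    """Replace every /*+ ... */ comment with its bare contents.

    Ordinary comments (/* ... */ and /** ... */) are left untouched,
    because the pattern requires a '+' directly after '/*'.
    """
    # A function is used as the replacement so that backslashes in the
    # captured text are inserted literally.
    return HINT_COMMENT.sub(lambda match: match.group(1), sql)


if __name__ == "__main__":
    print(uncomment_hints(
        'CREATE AGGREGATE fancyAggregate(/*+ "identifierA" */ INTEGER) ( ... )'
    ))
    print(uncomment_hints(
        "CREATE FUNCTION amazingFn(val DOUBLE PRECISION /*+ DEFAULT .01 */) "
        "RETURNS INTEGER ..."
    ))
```

After this step the argument label and the default value are visible to whatever parses the statement next, which is what allows them to appear in the generated documentation.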
General
- All module documentation should be moved to .sql_in files. See bayes.sql_in and regression.sql_in as examples.
- All uninstallation SQL files should end in "_drop.sql_in". Otherwise, they show up in Doxygen as visual clutter in the file list.
- All files containing "/sql/" in their path are excluded. These files are assumed to belong to regression tests and should not clutter the file list, either.
- When in doubt, stick to the best practices of the language you are using. E.g., Python gives the following advice for its docstrings: http://www.python.org/dev/peps/pep-0257
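
To check a file name against the two exclusion rules above before committing, a small helper along the following lines may be useful. This is a sketch in Python under stated assumptions: the function name shown_in_file_list and the example paths are hypothetical, and the real exclusion is done by the Doxygen configuration, which this code does not read.

```python
def shown_in_file_list(path):
    """Return True if the file would appear in the Doxygen file list
    under the two naming rules described above (an approximation)."""
    # Normalize Windows-style separators so the "/sql/" test works everywhere.
    normalized = path.replace("\\", "/")

    # Rule 1: uninstallation scripts end in "_drop.sql_in" and are hidden.
    if normalized.endswith("_drop.sql_in"):
        return False

    # Rule 2: anything below a "/sql/" directory is treated as a
    # regression test and is hidden as well.
    if "/sql/" in normalized:
        return False

    return True


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the rules.
    for candidate in (
        "modules/bayes/bayes.sql_in",
        "modules/bayes/bayes_drop.sql_in",
        "modules/bayes/sql/bayes_test.sql_in",
    ):
        print(candidate, shown_in_file_list(candidate))
```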
Section Guide
- Create a new group for your module (in methods/mainpage.dox) and use @addtogroup your_module.
- Write an @about section to describe your algorithm.
- Write a @prereq section, for example: "Requires SVEC MADlib module." Say nothing about the PostgreSQL or Greenplum database here.
- Write a @usage section to describe the API. (In the future we may need to split this into an OS-level side and an in-database side.)
- Write an @examp section. We say 'examp' (instead of 'example') because we don't want this section to appear on a Doxygen example tab.
- Use @literature to list your references.
See bayes.sql_in for an example.
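
The section guide above can be turned into a quick completeness check. The following Python sketch assumes the tags appear literally in the documentation comment; the names REQUIRED_TAGS, SAMPLE_COMMENT and missing_sections are hypothetical, and the placeholder text in the sample is illustrative rather than taken from any real module.

```python
import re
from typing import List

# Tags named in the section guide, in the order they are listed there.
REQUIRED_TAGS = (
    "@addtogroup", "@about", "@prereq", "@usage", "@examp", "@literature",
)

# A skeleton comment with placeholder text for each section.
SAMPLE_COMMENT = """/**
@addtogroup your_module

@about
Describe the algorithm here.

@prereq
Requires SVEC MADlib module.

@usage
Describe the API here.

@examp
Show a short example session here.

@literature
List references here.
*/"""


def missing_sections(doc_comment: str) -> List[str]:
    """Return the required tags that do not occur in doc_comment."""
    missing = []
    for tag in REQUIRED_TAGS:
        # The trailing \b keeps "@examp" from matching "@example".
        if not re.search(re.escape(tag) + r"\b", doc_comment):
            missing.append(tag)
    return missing


if __name__ == "__main__":
    print(missing_sections(SAMPLE_COMMENT))
    print(missing_sections("/** @about only, plus an @example tag */"))
```

The word-boundary test matters because of the deliberate 'examp' spelling: a comment that uses @example instead is reported as missing the @examp section.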