MADlib® uses Doxygen for documentation. Doxygen is a standard tool for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C and Python.

MADlib Doxygen Sites

User doc (latest): http://madlib.incubator.apache.org/docs/latest/
Doc web page: http://madlib.incubator.apache.org/documentation.html
Source: https://github.com/apache/incubator-madlib/tree/master/src/ports/postgres/modules

Documenting SQL

SQL documentation is supported by a Doxygen filter that translates CREATE FUNCTION / CREATE AGGREGATE statements to (empty) C++ function definitions. The source code for the SQL2C++ filter consists of the flex and bison files sql.ll and sql.yy at https://github.com/apache/incubator-madlib/tree/master/doc/src.
Current features:
- Translate CREATE FUNCTION and CREATE AGGREGATE statements into empty C++ function definitions
- Both inline (C-style) comments of the form /** ... */ and end-of-line comments of the form --! ...\n are recognized as Doxygen comments
- Since PostgreSQL and Greenplum disallow labeling the arguments of aggregate functions, the filter automatically uncomments C-style comments that start with /*+ (currently only at spots where it makes sense). The same mechanism can be used for default arguments, which is useful when function overloading is used to mimic default arguments (these are not supported by Greenplum or PostgreSQL <= 8.2). Example:
```
CREATE AGGREGATE fancyAggregate(/*+ "identifierA" */ INTEGER) ( ... )
CREATE FUNCTION amazingFn(val DOUBLE PRECISION /*+ DEFAULT .01 */) RETURNS INTEGER ...
```
  will be translated into:
```
<inferredReturnType> fancyAggregate(integer identifierA) { };
integer amazingFn(float8 val = .01) { };
```
- For aggregates, the return type will be automatically inferred from the transition state / final function
- Capitalization of identifiers will be preserved if put in quotes "iDeNtiFiEr"
- Line numbers are preserved
Still to be implemented:
- Support for all PostgreSQL types
- Automatically generate documentation for type definitions
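
The /*+ ... */ uncommenting step described under "Current features" can be illustrated with a small sketch. This is not the actual filter, which is implemented with flex and bison in sql.ll and sql.yy. It is a toy regular-expression version, and the name uncomment_hints is hypothetical. It assumes hint comments do not nest and it uncomments them everywhere, whereas the real filter only does so at spots where it makes sense.

```python
import re

# Matches C-style comments that start with "/*+" and captures their body.
# Assumption: hint comments do not nest and do not contain "*/".
HINT_COMMENT = re.compile(r"/\*\+\s*(.*?)\s*\*/", re.DOTALL)


def uncomment_hints(sql):
    """Replace every /*+ ... */ comment with its bare contents.

    Hypothetical helper for illustration only; ordinary /* ... */ and
    /** ... */ comments are left untouched.
    """
    return HINT_COMMENT.sub(lambda match: match.group(1), sql)


aggregate = 'CREATE AGGREGATE fancyAggregate(/*+ "identifierA" */ INTEGER) ( ... )'
function = (
    "CREATE FUNCTION amazingFn(val DOUBLE PRECISION /*+ DEFAULT .01 */) "
    "RETURNS INTEGER ..."
)

print(uncomment_hints(aggregate))
print(uncomment_hints(function))
```

After this step the argument label and the default value are visible to the rest of the translation, which is what lets the generated C++ stub carry them.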

General

All module documentation should be moved to .sql_in files. See bayes.sql_in and regression.sql_in as examples.
All uninstallation SQL files should end in "_drop.sql_in". Otherwise, they show up in Doxygen as visual clutter in the file list.
All files containing a "/sql/" in their path are excluded. These files are assumed to belong to regression tests and should not clutter the file list, either.
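
The two file-naming rules above can be checked with a short sketch before committing a new file. The function name is_hidden_from_file_list and the example paths are hypothetical, and the sketch assumes the rules work exactly as worded here (a "_drop.sql_in" suffix or a "/sql/" path component); the actual exclusion is done by the Doxygen configuration, which this code does not read.

```python
def is_hidden_from_file_list(path):
    """Return True if the path matches one of the two rules stated above.

    Assumption: a file is kept out of the Doxygen file list when it ends in
    "_drop.sql_in" or has "/sql/" somewhere in its path.
    """
    normalized = path.replace("\\", "/")  # tolerate Windows-style separators
    return normalized.endswith("_drop.sql_in") or "/sql/" in normalized


# Hypothetical paths, for illustration only.
candidates = [
    "modules/bayes/bayes.sql_in",
    "modules/bayes/bayes_drop.sql_in",
    "modules/bayes/sql/bayes_test.sql_in",
    "modules/bayes/bayes_uninstall.sql_in",
]

for path in candidates:
    status = "hidden" if is_hidden_from_file_list(path) else "listed"
    print(f"{status}: {path}")
```

The last path shows the failure mode the naming rule is meant to prevent: an uninstallation file without the "_drop.sql_in" suffix would still be listed.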
When in doubt, stick to the best practices of the language you are using. E.g., Python gives the following advice for its docstrings: http://www.python.org/dev/peps/pep-0257

Section Guide

Create a new group for your module (in methods/mainpage.dox) and use @addtogroup your_module.
Write an @about section to describe your algorithm.
Write a @prereq section, for example: Requires SVEC MADlib module. Do not list the PostgreSQL or Greenplum database itself as a prerequisite.
Write a @usage section to describe the API. (In the future we may need to split this into an OS-level side and an in-database side.)
Write an @examp section. We use 'examp' instead of 'example' because we don't want this content to appear on the Doxygen Examples tab.
Use @literature to list your references.

See bayes.sql_in for an example.
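
As a quick self-check, the section list above can be turned into a small script that reports which sections a comment block is missing. This is only a sketch: the name missing_sections, the sample comment, and its layout are hypothetical, and bayes.sql_in remains the reference for how a real module is documented.

```python
import re

# Section tags named in the guide above, in the order they are listed.
REQUIRED_TAGS = (
    "@addtogroup",
    "@about",
    "@prereq",
    "@usage",
    "@examp",
    "@literature",
)

# Hypothetical skeleton of a module comment; the placeholder text is made up.
SAMPLE_COMMENT = """
/**
@addtogroup your_module

@about
Describe the algorithm here.

@prereq
Requires SVEC MADlib module.

@usage
Describe the API here.

@examp
Show a short worked example here.

@literature
List references here.
*/
"""


def missing_sections(comment):
    """Return the required tags that do not occur in the comment text.

    The trailing word boundary means "@example" does not count as "@examp".
    """
    return [
        tag
        for tag in REQUIRED_TAGS
        if re.search(re.escape(tag) + r"\b", comment) is None
    ]


print(missing_sections(SAMPLE_COMMENT))
print(missing_sections("/** @about text @example text */"))
```

The second call reports @examp as missing even though @example is present, which mirrors the naming rule in the guide.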
