Contents:

  1. Introduction
  2. Naming Conventions
    • 2.1. Schema
    • 2.2. Functions & Aggregates
  3. Functions and Languages
  4. Function Name Overloading
  5. Guide to Driver UDFs
    • 5.1. Input Definition
    • 5.2. Output Definition
    • 5.3. Logging
    • 5.4. Parameter Validation
    • 5.5. Multi-User and Multi-Session Execution
  6. Support Modules
    • 6.1. DB connectivity module: plpy.py
    • 6.2. Python abstraction layer.

1. Introduction

The purpose of this document is to define the SQL Interface for MADlib algorithms.

2. Naming Conventions

Names should use lower-case characters separated with underscores.

This applies to all database objects (tables, views, functions, function parameters, data types, operators, etc.).
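The rule above can be checked mechanically. The following is a minimal sketch, not part of MADlib: the helper name is_valid_name and the exact pattern are assumptions chosen for illustration. Leading underscores are tolerated because internal routines use a "__" prefix (see section 2.2).

```python
import re

# Hypothetical pattern (an assumption, not a MADlib rule set):
# optional leading underscores, then lower-case words made of letters
# and digits, separated by single underscores.
_NAME_RE = re.compile(r"_*[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_valid_name(name: str) -> bool:
    """Return True if the name follows the lower-case/underscore convention."""
    return _NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("nb_create_view", "mregr_coef", "NbCreateView", "nb-create"):
        print(candidate, is_valid_name(candidate))
```

A check like this could run over the names in an install script before it is committed, although the pattern may need loosening for objects such as operators.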

2.1. Schema

All database objects should be created in the default MADlib schema. In scripts, use MADLIB_SCHEMA as the schema prefix for your tables, views, functions, and other objects. During installation this literal is replaced with the target schema name (configured by the user in Config.yml). The code examples below use the prefix madlib for illustration purposes only.
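To illustrate the substitution described above, here is a minimal sketch of replacing the MADLIB_SCHEMA literal in a script. It is not the actual installer: the function name apply_schema, the table name my_table, and the whole-word matching rule are assumptions made for this example.

```python
import re

# Match the literal only as a whole word, so that a longer identifier
# which merely contains it (e.g. MY_MADLIB_SCHEMA) is left untouched.
_PLACEHOLDER_RE = re.compile(r"\bMADLIB_SCHEMA\b")


def apply_schema(sql_template: str, target_schema: str) -> str:
    """Replace every MADLIB_SCHEMA literal with the target schema name."""
    # A function is used as the replacement so that the schema name is
    # inserted verbatim, without backslash-escape processing.
    return _PLACEHOLDER_RE.sub(lambda _match: target_schema, sql_template)


if __name__ == "__main__":
    # my_table is a hypothetical object name used only for this example.
    print(apply_schema("SELECT * FROM MADLIB_SCHEMA.my_table;", "madlib"))
```

In this sketch the target schema would come from the user's configuration; how Config.yml is read is outside the scope of the example.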

2.2. Functions & Aggregates

All routines that are not user-facing should be named with a "__" (double underscore) prefix to make the catalog easier to read.
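One practical benefit of the prefix is that internal routines can be filtered out of a catalog listing with a simple string test. The sketch below assumes the routine names are already available as a list of strings; the helper names and the sample name __nb_helper are hypothetical.

```python
def is_internal(routine_name: str) -> bool:
    """Return True for routines marked as internal by the "__" prefix."""
    return routine_name.startswith("__")


def user_facing(routine_names):
    """Return the user-facing routine names, sorted alphabetically."""
    return sorted(name for name in routine_names if not is_internal(name))


if __name__ == "__main__":
    # __nb_helper is a made-up internal routine name for illustration.
    catalog = ["nb_create_view", "__nb_helper", "argmax"]
    print(user_facing(catalog))
```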

Module-specific routines should have a SHORT and COMMON prefix based on the module they belong to; for example:

  • Multilinear regression functions could start with mregr_:

madlib.mregr_coef(...)

  • Naive Bayes classification functions could start with nb_:

madlib.nb_create_view(...)

See the current function catalog for more examples.

General-purpose routines should be named without reference to any module and should be created inside a general-purpose MADlib module (e.g. Array Operations). For example:

  • Function returning the key of the row for which the value is maximal:

madlib.argmax (integer key, float8 value)
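
The intended behaviour of such a routine can be sketched in plain Python. This is only an illustration of the semantics described in the bullet, not the MADlib implementation. Skipping missing values and returning the first key on ties are assumptions of this sketch.

```python
def argmax(rows):
    """Return the key of the (key, value) pair whose value is maximal.

    Returns None when no row has a usable value.
    """
    best_key = None
    best_value = None
    for key, value in rows:
        if value is None:
            # Assumption: missing values are skipped, as SQL aggregates
            # usually ignore NULL inputs.
            continue
        # Strict comparison keeps the first key when values tie (assumption).
        if best_value is None or value > best_value:
            best_key, best_value = key, value
    return best_key


if __name__ == "__main__":
    print(argmax([(1, 0.2), (2, 0.9), (3, 0.5)]))
```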

 

 

 

 
