This quick start guide covers installing MADlib® from binaries and compiling it from source.
Please note that a Greenplum database sandbox VM with MADlib pre-installed is also available to get started quickly, as an alternative to following the installation steps described on this page.
Quick Start With Binaries
Prerequisites
Install and configure your database of choice. MADlib currently supports the following platforms:
- PostgreSQL
- Greenplum database
- Apache HAWQ (incubating)
This guide describes the installation steps for Postgres and Greenplum. (HAWQ installation steps will be added at a later date.)
Postgres platform notes:
- Ensure that you install Postgres with the Python extension specified.
- Defining the connection environment variables (see "Defining environment variables" below) saves typing when running madpack.
Installing MADlib
- Download the MADlib binary.
  - Postgres: Get either the OSX or Redhat/CentOS binary from the MADlib download page.
  - Pivotal Greenplum Database: Download the .gppkg binary from Pivotal Network.
- Install the package at the OS level.
  - Postgres:
    - On OSX, double-click the installer package.
    - On Redhat / CentOS, run the following as root:
      yum install <madlib_package> --nogpgcheck
  - Pivotal Greenplum Database:
    - On Redhat / CentOS, run the following as gpadmin:
      gppkg install <madlib_package>
Ensure that the environment is set up for your database deployment and that the database is up and running.
Postgres:
Ensure that psql, postgres, and pg_config are in your path:
which psql
which postgres
which pg_config
Ensure that the database is started and running
psql -c 'select version()'
The above may need user/port/password setting depending on how the database has been configured.
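The path check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; check_tools is a hypothetical helper name and is not part of MADlib or PostgreSQL.

```shell
# Hypothetical helper (not part of MADlib): report which of the given
# tools cannot be found on PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the tool.
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all found"
}

# Check the three tools this guide requires.
check_tools psql postgres pg_config || echo "Fix your PATH before running madpack."
```

If any tool is reported as missing, add the bin directory of your database installation to PATH and run the check again.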
Run the MADlib deployment utility to install MADlib into each database in which you want to use it:
Postgres, if the connection environment variables are defined:
/usr/local/madlib/bin/madpack -s madlib -p postgres install
Otherwise, use a fully defined connection string:
/usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
Pivotal Greenplum Database:
/usr/local/madlib/bin/madpack -p greenplum install
The above may need user/port/password setting depending on how the database has been configured.
For more information on madpack:
/usr/local/madlib/bin/madpack --help
Help output is also attached to this wiki page for your reference.
Test your installation
Postgres:
/usr/local/madlib/bin/madpack -s madlib -p postgres install-check
Pivotal Greenplum Database:
/usr/local/madlib/bin/madpack -p greenplum install-check
The above may need user/port/password setting depending on how the database has been configured.
Compiling From Source
Prerequisites
Requirements for installing MADlib:
- gcc (For OSX, Clang will work for compiling the source, but not for documentation.)
- An installed version of HAWQ, Greenplum Database 4.2+, or PostgreSQL (64-bit) 9.2+ with plpython support enabled. Note: plpython may not be enabled in Postgres by default.
Installing MADlib
In the $MADLIB_ROOT directory (location of the MADlib source), run the following commands:
mkdir build
cd build
cmake ..
make
Above, we built the executables in the build folder. This can, however, be any user-named folder (henceforth called $BUILD_ROOT).
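Because the build folder can be any directory, the same steps can be parameterized. The sketch below is one way to do that, assuming a POSIX shell and absolute paths; run and build_madlib are hypothetical helper names, and the DRY_RUN switch is an illustration device rather than anything MADlib provides.

```shell
# Hypothetical helpers (not shipped with MADlib).
# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: build_madlib <MADLIB_ROOT> <BUILD_ROOT>   (absolute paths)
# The parentheses run the body in a subshell, so the cd does not
# change the caller's working directory.
build_madlib() (
    madlib_root="$1"
    build_root="$2"
    run mkdir -p "$build_root" &&
    run cd "$build_root" &&
    run cmake "$madlib_root" &&
    run make
)
```

Running the function with DRY_RUN=1 first shows the commands it would execute before anything is built.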
Deploying MADlib
Deploy MADlib into the database with the MADlib package manager madpack, located under $BUILD_ROOT/src/bin.
To install:
$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install
To make sure that the installation is successful:
$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install-check
For more information on the usage of madpack:
$BUILD_ROOT/src/bin/madpack --help
Defining environment variables
The variables below will be automatically used by the madpack installer if no connection string is provided:
- User: PGUSER or USER (defaults to OS username)
- Password: PGPASSWORD (defaults to empty)
- Host: PGHOST (defaults to 'localhost')
- Database: PGDATABASE (defaults to OS username)
- Port: PGPORT (defaults to 5432)
An example of deploying MADlib using the environment variables:
export PGPORT=5430
export PGHOST=127.0.0.1
export PGDATABASE=madlibtest
$BUILD_ROOT/src/bin/madpack -p postgres install
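To see how the environment variables and their defaults map onto the connection string format [user[/password]@][host][:port][/database], the sketch below assembles the equivalent string. It only illustrates the defaults listed above and is not madpack's actual resolution code; madpack_conn_string is a hypothetical helper name.

```shell
# Hypothetical helper (not part of madpack): print the connection string
# that corresponds to the current environment, applying the defaults
# listed in this section.
madpack_conn_string() {
    # OS username: USER if set, otherwise ask the system.
    os_user="${USER:-$(id -un)}"
    user="${PGUSER:-$os_user}"
    host="${PGHOST:-localhost}"
    port="${PGPORT:-5432}"
    database="${PGDATABASE:-$os_user}"
    # Include the password part only when PGPASSWORD is non-empty.
    if [ -n "${PGPASSWORD:-}" ]; then
        printf '%s/%s@%s:%s/%s\n' "$user" "$PGPASSWORD" "$host" "$port" "$database"
    else
        printf '%s@%s:%s/%s\n' "$user" "$host" "$port" "$database"
    fi
}
```

The printed string can be passed to madpack with the -c option when you prefer an explicit connection string over relying on the environment.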