To set up PostgreSQL and MADlib with Anaconda Python on OSX, follow the super quick start.

Otherwise, follow the regular guides for installing from binaries or compiling from source.

Please note that a Greenplum database sandbox VM with MADlib pre-installed is also available to get started quickly, as an alternative to following the installation steps described on this page.

 

Super Quick Start

To set up PostgreSQL + MADlib with Anaconda Python on OSX:

 

PYTHON=/Users/janedoe/anaconda/bin/python 

brew install postgresql --with-python
brew services start postgresql
# Set up database and roles
# Install the .dmg of MADlib 1.9.1 downloaded from the MADlib website
/usr/local/madlib/bin/madpack -s madlib -p postgres install

Quick Start With Binaries

Prerequisites

Install and configure your database of choice. MADlib currently supports the following platforms:

  • PostgreSQL
  • Greenplum database
  • Pivotal HDB/Apache HAWQ (incubating)

Postgres platform notes:

  • Ensure that you install Postgres with the Python extension specified. If not, you will see an error message like the one below:
 /usr/local/madlib/bin/madpack -s madlib -p postgres install
madpack.py : INFO : Detected PostgreSQL version 9.5.
madpack.py : INFO : *** Installing MADlib ***
madpack.py : INFO : MADlib tools version = 1.9.1 (//usr/local/madlib/Versions/1.9.1/bin/../madpack/madpack.py)
madpack.py : INFO : MADlib database version = None (host=localhost:5432, db=postgres, schema=madlib)
madpack.py : INFO : Testing PL/Python environment...
madpack.py : INFO : > Creating language PL/Python...
madpack.py : ERROR : SQL command failed:
SQL: CREATE LANGUAGE plpythonu;
ERROR: could not access file "$libdir/plpython2": No such file or directory
madpack.py : ERROR : Cannot create language plpythonu. Stopping installation...
madpack.py : ERROR : MADlib installation failed.

To install PostgreSQL with the PL/Python server-side language, use the --with-python parameter as described in the PostgreSQL documentation.
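
One way to check in advance whether the PL/Python library is present is to look in the directory that PostgreSQL substitutes for $libdir, which pg_config --pkglibdir prints. The following is a minimal sketch under that assumption. has_plpython is a hypothetical helper name, not part of PostgreSQL or MADlib, and it only looks for a file named plpython2.*, matching the error message above.

```shell
# Sketch: succeed if a PL/Python 2 shared library exists in the given directory.
# has_plpython is a hypothetical helper, not a PostgreSQL or MADlib command.
has_plpython() {
    libdir=$1
    # "$libdir/plpython2" in the error message refers to a file such as plpython2.so
    for f in "$libdir"/plpython2.*; do
        [ -e "$f" ] && return 0
    done
    return 1
}
```

A possible invocation is has_plpython "$(pg_config --pkglibdir)" || echo "PL/Python not found", run before calling madpack.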

Installing MADlib

  1. Download the MADlib binary

  2. Install the package at the OS level.

  • Postgres:
    • on OSX double click the installer package
    • on Redhat / CentOS run the following as root:

      yum install <madlib_package> --nogpgcheck
  • Greenplum Database:
    • on Redhat / CentOS run the following as gpadmin:

      gppkg -i <madlib_package>
  • HDB/HAWQ:
    • on Redhat / CentOS run the following as gpadmin:

      gppkg -i <madlib_package>
  3. Ensure that the environment is set up for your database deployment and that the database is up and running.
    • Ensure that psql, postgres, and pg_config are in your path

      which psql
      which postgres 
      which pg_config
    • Ensure that the database is started and running

      psql -c 'select version()'

      The above command may need user, port, and password settings, depending on how the database has been configured.

  4. Run the MADlib deployment utility to install MADlib into each database where you want to use it:

    • Postgres:

      /usr/local/madlib/bin/madpack -s madlib -p postgres install

      The command above works if the connection environment variables are defined. Otherwise, use a fully specified connection string:

      /usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
    • Greenplum Database:

      /usr/local/madlib/bin/madpack -p greenplum install

      The above command may need user, port, and password settings, depending on how the database has been configured.

    • HDB/HAWQ:

      /usr/local/madlib/bin/madpack -p hawq install

      The above command may need user, port, and password settings, depending on how the database has been configured.

    For more information on madpack:

    /usr/local/madlib/bin/madpack --help

    Help output for madpack is also attached to this wiki page for your reference.

    After installation, the database administrator (for example, gpadmin on Greenplum Database) should grant all privileges on schema madlib to users who will be accessing MADlib functions. Otherwise, users will get "ERROR: permission denied for schema MADlib." See the PostgreSQL docs for information on schemas and privileges.


  5. Test your installation

    • Postgres:

      /usr/local/madlib/bin/madpack -s madlib -p postgres install-check
    • Greenplum Database:

      /usr/local/madlib/bin/madpack -p greenplum install-check

      The above command may need user, port, and password settings, depending on how the database has been configured.

    • HDB/HAWQ:

      /usr/local/madlib/bin/madpack -p hawq install-check

      The above command may need user, port, and password settings, depending on how the database has been configured.
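
The privilege grant mentioned in the installation step can be issued through psql. The following is a minimal sketch that only prints the SQL statement, so it can be reviewed before it is run. madlib_grant_sql is a hypothetical helper name and analyst is a hypothetical role. The schema defaults to madlib, matching the -s madlib option used above.

```shell
# Sketch: print a GRANT statement giving one role all privileges on the MADlib schema.
# madlib_grant_sql is a hypothetical helper; the role name is supplied by the caller
# and is inserted as-is, so names that need quoting must be passed already quoted.
madlib_grant_sql() {
    role=$1
    schema=${2:-madlib}   # schema name, defaults to "madlib"
    printf 'GRANT ALL PRIVILEGES ON SCHEMA %s TO %s;\n' "$schema" "$role"
}
```

The output could then be passed to the database by an administrator, for example with psql -c "$(madlib_grant_sql analyst)", adding connection settings as needed.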

Installing from PGXN (PostgreSQL)

Prerequisites

Requirements for installing MADlib:

  • gcc (For OSX, Clang will work for compiling the source, but not for documentation.)
  • pgxn installed
  • PostgreSQL (64-bit) 9.2+ with plpython support enabled. Note: plpython may not be enabled in Postgres by default.

 

Use the commands below to install and load the latest MADlib package uploaded to PGXN.

pgxn install madlib
pgxn load madlib 

 

Compiling From Source

Prerequisites

Requirements for installing MADlib:

  • gcc (For OSX, Clang will work for compiling the source, but not for documentation.)
  • An installed version of HDB/HAWQ, Greenplum Database 4.2+ or PostgreSQL (64-bit) 9.2+ with plpython support enabled. Note: plpython may not be enabled in Postgres by default.
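
Before building, it can help to confirm that the required tools are on the path. The following is a minimal sketch; missing_tools is a hypothetical helper name. Which tools to pass is an assumption based on this page: gcc from the list above, plus cmake, make, and pg_config, which the commands in this guide call.

```shell
# Sketch: print the name of every given command that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of MADlib.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools gcc cmake make pg_config prints nothing when all four are available.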

Installing MADlib

In the $MADLIB_ROOT directory (location of MADlib source) run the following commands:

mkdir build 
cd build 
cmake .. 
make

Above, we built the executables in the build folder. This can, however, be any user-named folder (henceforth called $BUILD_ROOT).

Deploying MADlib

Deploy MADlib into the database with the MADlib package manager, madpack, located under $BUILD_ROOT/src/bin.

  • To install:

    $BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install
  • To make sure that the installation is successful:

    $BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install-check
  • For more information on the usage of madpack:

    $BUILD_ROOT/src/bin/madpack --help
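
The install and install-check steps above can be chained so that the check runs only when the install succeeds. The following is a minimal sketch, assuming madpack exits with a non-zero status on failure. deploy_madlib is a hypothetical wrapper name, and it uses only the -p and -c options shown above.

```shell
# Sketch: run "madpack install", then "madpack install-check" only if install succeeded.
# deploy_madlib is a hypothetical wrapper, not part of MADlib.
deploy_madlib() {
    madpack=$1     # path to madpack, e.g. "$BUILD_ROOT/src/bin/madpack"
    platform=$2    # postgres, greenplum, or hawq
    conn=$3        # [user[/password]@][host][:port][/database]
    "$madpack" -p "$platform" -c "$conn" install &&
        "$madpack" -p "$platform" -c "$conn" install-check
}
```

A possible invocation is deploy_madlib "$BUILD_ROOT/src/bin/madpack" postgres "janedoe@localhost:5432/madlibtest", where the user and database names are placeholders.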

Defining environment variables

The variables below will be automatically used by the madpack installer if no connection string is provided:

  1. User: PGUSER or USER (defaults to OS username)
  2. Password: PGPASSWORD (defaults to empty)
  3. Host: PGHOST (defaults to 'localhost')
  4. Database: PGDATABASE (defaults to OS username)
  5. Port: PGPORT (defaults to 5432)

An example of deploying MADlib using the environment variables:

export PGPORT=5430
export PGHOST=127.0.0.1
export PGDATABASE=madlibtest
$BUILD_ROOT/src/bin/madpack -p postgres install
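
To see which connection the variables and defaults listed above would resolve to, the same rules can be written out in shell. The following is a minimal sketch of those rules only. madpack_conn is a hypothetical helper name, and this is not how madpack itself is implemented.

```shell
# Sketch: build a connection string in the form user[/password]@host:port/database
# from the PG* variables, falling back to the defaults listed above.
# madpack_conn is a hypothetical helper, not part of MADlib.
madpack_conn() {
    os_user=$(id -un)                    # OS username
    user=${PGUSER:-${USER:-$os_user}}    # PGUSER, then USER, then OS username
    host=${PGHOST:-localhost}
    port=${PGPORT:-5432}
    database=${PGDATABASE:-$os_user}
    if [ -n "${PGPASSWORD:-}" ]; then    # password defaults to empty
        user="$user/$PGPASSWORD"
    fi
    printf '%s@%s:%s/%s\n' "$user" "$host" "$port" "$database"
}
```

The result can be passed explicitly, for example with $BUILD_ROOT/src/bin/madpack -p postgres -c "$(madpack_conn)" install, which makes the resolved target visible before anything is installed.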

 

 