...
- Register the module.
- Define the SQL functions.
- Implement the functions in C++.
- Register the C++ header files.
The files for this exercise can be found in the hello world folder of the source code repository.
1. Register the module
Add the following line to the file Modules.yml under ./src/config/:
...
```sql
SELECT madlib.avg_var(second_attack) FROM patients;

-- ************ --
--    Result    --
-- ************ --
+-------------------+
| avg_var           |
|-------------------|
| [0.5, 0.25, 20.0] |
+-------------------+
-- (average, variance, count) --
```
...
The files for the above exercise can be found in the hello world folder of the source code repository.
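To make the roles of the transition, merge, and final functions more concrete, here is a minimal pure-Python sketch of an aggregate that returns (average, variance, count), like the avg_var call above. It is not the C++ code from the repository. The state layout (mean, sum of squared deviations, count), the pairwise merge formula, and the helper name avg_var_sketch are assumptions made for illustration. It also assumes population variance, which is consistent with the sample output above.

```python
def transition(state, x):
    """Fold one value into the running (mean, m2, count) state."""
    mean, m2, n = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)  # running sum of squared deviations
    return (mean, m2, n)


def merge(a, b):
    """Combine two partial states computed on disjoint sets of rows."""
    mean_a, m2_a, n_a = a
    mean_b, m2_b, n_b = b
    n = n_a + n_b
    if n == 0:
        return (0.0, 0.0, 0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (mean, m2, n)


def final(state):
    """Turn the internal state into [average, variance, count]."""
    mean, m2, n = state
    if n == 0:
        return None
    return [mean, m2 / n, float(n)]  # population variance (assumption)


def avg_var_sketch(segments):
    """Hypothetical driver: one transition pass per segment, then merge."""
    total = (0.0, 0.0, 0)
    for rows in segments:
        state = (0.0, 0.0, 0)
        for x in rows:
            state = transition(state, x)
        total = merge(total, state)
    return final(total)
```

Each inner list passed to avg_var_sketch stands in for the rows held by one database segment. The merge function is what lets partial results from different segments be combined without rescanning the data.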
...
Compared to the steps presented in the previous section, here we do not need to modify the Modules.yml file because we are not creating a new module. Another difference is that we create an additional .py_in Python file along with the .sql_in file. The .py_in file is where most of the iterative logic is implemented.
The files for this exercise can be found in the hello world folder of the source code repository.
1. Overview
The overall logic is split into three parts. All the UDFs and UDAs are defined in simple_logistic.sql_in. The transition, merge, and final functions are implemented in C++. Together, these functions constitute the UDA called __logregr_simple_step, which takes one step from the current state to decrease the logistic regression objective. Finally, in simple_logistic.py_in, the plpy package is used to implement in Python a UDF called logregr_simple_train, which invokes __logregr_simple_step iteratively until convergence.
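The division of labor described above can be sketched in plain Python. This is an illustration only, not the code in simple_logistic.py_in: plpy is available only inside PostgreSQL, so the step aggregate is replaced here by an ordinary function. The names simple_step and simple_train, the fixed-step gradient update, and the convergence test on the change in log-likelihood are all assumptions. Increasing the log-likelihood is equivalent to decreasing the logistic regression objective.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def transition(state, x, y, coef):
    """Fold one row (features x, label y in {0, 1}) into (gradient, loglik)."""
    grad, loglik = state
    z = sum(c * xi for c, xi in zip(coef, x))
    p = sigmoid(z)
    grad = [g + (y - p) * xi for g, xi in zip(grad, x)]
    # Row log-likelihood y*z - log(1 + exp(z)), written to avoid overflow.
    loglik += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return (grad, loglik)


def merge(a, b):
    """Combine partial states from two disjoint sets of rows."""
    return ([ga + gb for ga, gb in zip(a[0], b[0])], a[1] + b[1])


def final(state, coef, step_size):
    """Apply one gradient step; loglik refers to the coefficients passed in."""
    grad, loglik = state
    return ([c + step_size * g for c, g in zip(coef, grad)], loglik)


def simple_step(segments, coef, step_size):
    """Stand-in for the step aggregate: transition per segment, merge, final."""
    total = ([0.0] * len(coef), 0.0)
    for rows in segments:
        state = ([0.0] * len(coef), 0.0)
        for x, y in rows:
            state = transition(state, x, y, coef)
        total = merge(total, state)
    return final(total, coef, step_size)


def simple_train(segments, n_features, step_size=0.05, tol=1e-8, max_iter=10000):
    """Stand-in for the driver UDF: call the step until the log-likelihood settles."""
    coef = [0.0] * n_features
    prev = None
    loglik = None
    for _ in range(max_iter):
        coef, loglik = simple_step(segments, coef, step_size)
        if prev is not None and abs(loglik - prev) < tol:
            break
        prev = loglik
    return coef, loglik
```

Here segments is a list of row lists, each row being a (features, label) pair, and the step size must be small enough for the data at hand. In the database setting, the loop in simple_train would issue a query that invokes the step aggregate in each iteration instead of calling a local function.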
...
```sql
SELECT madlib.logregr_simple_train(
    'patients',                            -- source table
    'logreg_mdl',                          -- output table
    'second_attack',                       -- labels
    'ARRAY[1, treatment, trait_anxiety]'); -- features

SELECT * FROM logreg_mdl;

-- ************ --
--    Result    --
-- ************ --
+--------------------------------------------------+------------------+
| coef                                             |   log_likelihood |
|--------------------------------------------------+------------------|
| [-6.27176619714, -0.84168872422, 0.116267554551] |         -9.42379 |
+--------------------------------------------------+------------------+
```
...
...
...