...

A common discrepancy not highlighted in the document above is `col` vs. `column`: some argument names are of the form `*_column_name` while others are `*_col_name`. One of the two forms must be chosen and applied across the whole product, including internal source code.
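As a rough illustration of how such a cleanup could be audited, the sketch below flags argument names that use the abbreviated token. It assumes `column` is the form chosen (the document leaves that decision open), and the sample names are hypothetical, not taken from the product. Plural forms such as `*_cols` are not handled here.

```python
# Assumption: "column" is the canonical form; the document does not decide this.
CANONICAL = "column"
ABBREVIATION = "col"


def to_canonical(name: str) -> str:
    """Replace each whole `col` token in an underscore-separated name."""
    return "_".join(
        CANONICAL if token == ABBREVIATION else token
        for token in name.split("_")
    )


def find_inconsistent(names):
    """Map every name that would need renaming to its canonical spelling."""
    return {n: to_canonical(n) for n in names if to_canonical(n) != n}


if __name__ == "__main__":
    # Hypothetical argument names, used only for illustration.
    sample = ["id_col_name", "response_column_name", "output_table"]
    print(find_inconsistent(sample))
```

Splitting on underscores rather than doing a plain substring replace keeps names that merely contain the letters `col` from being rewritten by accident.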

Named parameters

Change the parameter lists to named parameters, as in scikit-learn, instead of the ordered positional parameters MADlib currently uses, which cannot be supplied out of order.
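To show what a named-parameter style buys the caller, here is a minimal Python sketch using keyword-only arguments. The function `train` and all of its parameter names are hypothetical and do not correspond to an existing MADlib or scikit-learn signature.

```python
def train(*, source_table, output_table, dependent_column,
          independent_columns, grouping_columns=None):
    """Hypothetical training entry point that only accepts named arguments.

    The bare `*` makes every parameter keyword-only, so callers may pass
    them in any order and may omit optional ones.
    """
    return {
        "source_table": source_table,
        "output_table": output_table,
        "dependent_column": dependent_column,
        "independent_columns": list(independent_columns),
        "grouping_columns": list(grouping_columns or []),
    }


# Arguments supplied out of order; the optional grouping argument is omitted.
resolved = train(
    output_table="model_out",
    independent_columns=["x1", "x2"],
    source_table="training_data",
    dependent_column="y",
)
```

Because positional calls are rejected, adding or reordering parameters later does not silently change the meaning of existing calls.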

Interfaces for Cross validation

...

 

| group_col1 | group_col2 | coef | std_err | ... |
|---|---|---|---|---|
| u1 | v1 | <coef for u1, v1> | <std. err for u1, v1> | |
| u1 | v2 | <coef for u1, v2> | | ... |
| u2 | v1 | <coef for u2, v1> | | |

  

 

Summary table could include:

...

- total_rows_processed

- total_rows_skipped

- time_stamp_start

- time_stamp_end

- elapsed_time

- user_string_1 (user label or name)

- user_string_2 (user description)
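
The fields listed above could be modeled as a single record, sketched here in Python. The class name `RunSummary` is hypothetical, and deriving `elapsed_time` from the two time stamps instead of storing it is an assumption of this sketch, not something the proposal specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical container for one row of the proposed summary table."""
    total_rows_processed: int
    total_rows_skipped: int
    time_stamp_start: datetime
    time_stamp_end: datetime
    user_string_1: str = ""  # user label or name
    user_string_2: str = ""  # user description

    @property
    def elapsed_time(self) -> timedelta:
        # Assumption: elapsed time is derived, so it cannot disagree
        # with the stored start and end time stamps.
        return self.time_stamp_end - self.time_stamp_start


summary = RunSummary(
    total_rows_processed=1000,
    total_rows_skipped=12,
    time_stamp_start=datetime(2024, 1, 1, 12, 0, 0),
    time_stamp_end=datetime(2024, 1, 1, 12, 1, 30),
    user_string_1="example run",
)
```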