...

  • `experimental` is defined as:
    1. exploratory data analysis
    2. development in notebooks
    3. essentially ad-hoc choice of tools
    4. generally batch only, "one off", manual execution
    5. small data, manual sampling
    6. models are trained offline
    7. the end result is reports, diagrams, etc.
  • `production` is essentially the opposite:
    1. the end result is enterprise data science applications
    2. run in production
    3. with large, multi-dimensional `data set`s that do not fit in RAM and are logically infinite
    4. hence the algorithms / analysis must be incremental
    5. use of managed `data set`s : `data lake`s, `feature store`s
    6. models are trained online or incrementally (_"training offline periodically and refreshed/deployed every few hours/days"_)
    7. with awareness of `concept drift`, `distribution drift`, and `adversarial attacks`, and the ability to adapt
    8. complex orchestration between the core analysis and decision layer, model monitoring, and other application logic and business processes, some involving human interaction
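
To make the "algorithms must be incremental" point concrete, here is a minimal sketch of a single-pass statistic that never holds the full data set in RAM. It uses Welford's running mean and variance as one illustrative technique; the `RunningStats` class and the `stream()` generator are hypothetical names introduced for this example, not part of any tool referenced on this page.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class RunningStats:
    """Count, mean, and variance maintained one observation at a time
    (Welford's algorithm). Memory use is constant regardless of stream length."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        # Fold one new observation into the running aggregates.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.count - 1) if self.count > 1 else float("nan")


def stream() -> Iterator[float]:
    """Hypothetical stand-in for a logically infinite source.
    It is bounded here only so that the example terminates."""
    for i in range(100_000):
        yield float(i % 10)


stats = RunningStats()
for value in stream():
    stats.update(value)  # each value is seen once and then discarded
```

The same shape (a small state object plus an `update` step per record or per mini-batch) is what allows a statistic or model to keep up with data that is never fully materialized, in contrast to the notebook-style "load a sample, compute once" workflow described under `experimental`.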

...