Object-Oriented Programming for Machine Learning with Python


I will remember 2019 as the year in which I fully embraced Python as my go-to machine learning tool. (I still reach for R when I need to do exploratory data analysis, because the tidyverse is just easier to use for that than pandas.) The scikit-learn universe, with its unified APIs, database connections, and proper programming toolset, is simply too good not to use for complex machine learning projects.

After writing and adapting the same functions for a couple of different projects, I decided to move on and try object-oriented programming in order to keep the code clean and readable. If used well, OOP can help you simplify a machine learning project. The aim of this blog post is to provide an example machine learning class and show what the workflow could look like.

I am not a seasoned Python developer yet, so I assume some parts could be written better than this, but the example is here to show the possibilities.

How can OOP help?

  • It organizes code into smaller parts (methods), which makes it more readable and easier to maintain.
  • Most functions in an ML project are still used only once, but they take a lot of code, so it is nicer to keep them together as part of a class.
  • You can import a single class instead of multiple one-time-use functions.
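
To make the points above concrete, here is a minimal sketch that uses only the standard library. The class name `DataPrep` and its methods are hypothetical and not part of the project described in this post. The point is only that related helpers share state through `self` and can be imported as a single name.

```python
import statistics


class DataPrep:
    """Hypothetical helper that groups one-off preprocessing steps."""

    def __init__(self, values):
        self.values = list(values)
        self.mean = None
        self.std = None

    def drop_missing(self):
        # Remove None entries; return self so calls can be chained.
        self.values = [v for v in self.values if v is not None]
        return self

    def fit_scaler(self):
        # Store the statistics on the instance instead of passing them around.
        self.mean = statistics.mean(self.values)
        self.std = statistics.pstdev(self.values)
        return self

    def transform(self):
        # Standardize using the statistics stored by fit_scaler().
        return [(v - self.mean) / self.std for v in self.values]


prep = DataPrep([1.0, None, 2.0, 3.0])
scaled = prep.drop_missing().fit_scaler().transform()
```

Written as loose functions, the mean and standard deviation would have to be returned and passed back in by hand; here they simply live on the instance.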

Typical machine learning class:

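class MLModel:
    """Holds the data, estimators, and settings of one ML project."""

    def __init__(self,
                 estimators,
                 parameters,
                 ensemble,
                 verbose,
                 sql_con):
        # Configuration passed in by the caller
        self.estimators = estimators
        self.parameters = parameters
        self.ensemble = ensemble
        self.verbose = verbose
        self.sql_con = sql_con

        # Filled in later by the data loading and preprocessing methods
        self.x_train = None
        self.x_test = None
        self.y_train = None
        self.y_test = None
        self.column_transformer = None

    def get_data(self):
        # The body of this method is not shown here; it would load the
        # data through self.sql_con.
        raise NotImplementedError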

Usage of the class:

# Keyword arguments follow the constructor signature shown above
mlmodel = MLModel(
    estimators='rf,lgbm',
    parameters=get_model_params(),
    ensemble='soft_voting',
    verbose=3,
    sql_con=sql_con)

# fit_full() is meant to run the whole workflow; it is not part of the
# class skeleton shown above
mlmodel.fit_full()
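
The call above picks models by short name ('rf,lgbm') and asks for 'soft_voting'. As a rough illustration of what such options could mean (an assumption about the idea, not the class's actual implementation), the sketch below parses a comma-separated model string against a hypothetical registry and averages the predicted class probabilities. The stand-in models return fixed probabilities so that the example runs without any ML library.

```python
class ConstantModel:
    """Stand-in model that always predicts the same class probabilities."""

    def __init__(self, proba):
        self.proba = proba

    def predict_proba(self, rows):
        return [list(self.proba) for _ in rows]


# Hypothetical registry mapping short names to model factories.
MODEL_REGISTRY = {
    "rf": lambda: ConstantModel([0.7, 0.3]),
    "lgbm": lambda: ConstantModel([0.4, 0.6]),
}


def build_models(selected):
    # "rf,lgbm" -> one model instance per comma-separated name
    return [MODEL_REGISTRY[name.strip()]() for name in selected.split(",")]


def soft_vote(models, rows):
    # Average the per-class probabilities across models, then take the argmax.
    all_probas = [m.predict_proba(rows) for m in models]
    predictions = []
    for per_model in zip(*all_probas):
        avg = [sum(col) / len(per_model) for col in zip(*per_model)]
        predictions.append(avg.index(max(avg)))
    return predictions


models = build_models("rf,lgbm")
labels = soft_vote(models, rows=[[0.1], [0.2]])
```

For real estimators, scikit-learn's `VotingClassifier(voting='soft')` provides this kind of averaging, so a hand-written version like this is only useful for understanding the idea.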

Full script:

import os

from sqlalchemy import create_engine

from model_class import MLModel

# Replace the placeholder with a real SQLAlchemy connection URL
sql_con = create_engine("your connection definition")
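
The script stops after creating the connection, and the body of `get_data` is not shown. As one possible shape for that step, the sketch below reads rows from a database and splits them into features and target. It uses the standard-library `sqlite3` module with an in-memory database as a stand-in for the SQLAlchemy engine, and the table and column names are invented for the example.

```python
import sqlite3


def get_data(con, query, target):
    # Run the query and split each row into features (x) and target (y).
    cursor = con.execute(query)
    columns = [d[0] for d in cursor.description]
    target_idx = columns.index(target)
    x, y = [], []
    for row in cursor:
        y.append(row[target_idx])
        x.append([v for i, v in enumerate(row) if i != target_idx])
    return x, y


# In-memory database with a made-up table, for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples (f1 REAL, f2 REAL, label INTEGER)")
con.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1.0, 2.0, 0), (3.0, 4.0, 1)],
)
x, y = get_data(
    con, "SELECT f1, f2, label FROM samples ORDER BY f1", target="label"
)
con.close()
```

With pandas and SQLAlchemy available, `pandas.read_sql(query, sql_con)` would be the more usual way to load the rows into a DataFrame before splitting them.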