Object Oriented Programming for Machine Learning with Python
I will remember 2019 as the year I fully embraced Python as my go-to machine learning tool. (I still reach for R when I need to do exploratory data analysis, because tidyverse is simply easier to use for that than pandas.) The scikit-learn universe, with its unified APIs, database connections and a proper programming toolset, is simply too good not to use for complex machine learning projects.
After writing and adapting the same functions for a couple of different projects, I decided to move on and try object-oriented programming to keep the code clean and readable. Used well, OOP can help you simplify a machine learning project. The aim of this blog post is to provide an example machine learning class and show what the workflow could look like.
I am not a seasoned Python developer yet, so I assume some parts could be written better than this, but the example is meant to show the possibilities.
How can OOP help?
- Organizes code into smaller parts by using methods, which makes it more readable and easier to maintain
- Most functions in an ML project are still used only once, but they require a lot of code, so it’s nicer to keep them as part of a class
- You can import a single class instead of multiple one-time-use functions
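The points above can be illustrated with a minimal sketch. The class name `MajorityClassifier`, its methods, and the toy "model" (it always predicts the most frequent training label) are hypothetical and only stand in for a real estimator. The idea is that the fitted state lives on the object instead of being passed between one-off functions, and that a single import brings every step along.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy model: always predicts the most frequent training label."""

    def __init__(self):
        self.majority_label = None  # set by fit()

    def fit(self, labels):
        # Remember the most common label seen during training.
        self.majority_label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n_rows):
        # Predict the stored label for every row.
        return [self.majority_label] * n_rows

    def score(self, labels):
        # Accuracy of the constant prediction against the true labels.
        predictions = self.predict(len(labels))
        hits = sum(p == y for p, y in zip(predictions, labels))
        return hits / len(labels)


# Made-up labels, used only to exercise the sketch.
model = MajorityClassifier().fit([1, 0, 1, 1])
accuracy = model.score([1, 1, 0, 1])
```

Each method stays short, and the caller never has to carry `majority_label` around by hand.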
Typical machine learning class:
class MLModel():
    def __init__(self,
                 estimators,
                 parameters,
                 ensemble,
                 verbose,
                 sql_con):
        # Configuration passed in by the caller
        self.estimators = estimators
        self.parameters = parameters
        self.ensemble = ensemble
        self.verbose = verbose
        self.sql_con = sql_con
        # Data and preprocessing objects, filled in by later steps
        self.x_train = None
        self.x_test = None
        self.y_train = None
        self.y_test = None
        self.column_transformer = None

    def get_data(self):
        ...  # method body is not shown in this post
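The body of `get_data` is not shown, so the following is only a sketch of how such a method might fill the attributes that `__init__` sets to `None`. Everything here is an assumption made for illustration: the class name `DataHolder`, the `test_fraction` parameter, the table name `samples`, and the use of the standard-library `sqlite3` module as a stand-in for the real SQL connection.

```python
import sqlite3


class DataHolder:
    """Hypothetical stand-in for the data-handling part of MLModel."""

    def __init__(self, sql_con, test_fraction=0.25):
        self.sql_con = sql_con
        self.test_fraction = test_fraction  # assumed parameter, not in the post
        self.x_train = None
        self.x_test = None
        self.y_train = None
        self.y_test = None

    def get_data(self):
        # Read features and target from an assumed table called "samples".
        rows = self.sql_con.execute(
            "SELECT feature, target FROM samples ORDER BY id"
        ).fetchall()
        # Hold out the last rows as a test set (no shuffling in this sketch).
        n_test = int(len(rows) * self.test_fraction)
        split = len(rows) - n_test
        train, test = rows[:split], rows[split:]
        self.x_train = [[r[0]] for r in train]
        self.y_train = [r[1] for r in train]
        self.x_test = [[r[0]] for r in test]
        self.y_test = [r[1] for r in test]
        return self


# In-memory database with made-up rows, used only to exercise the sketch.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, feature REAL, target INTEGER)"
)
con.executemany(
    "INSERT INTO samples (feature, target) VALUES (?, ?)",
    [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)],
)
holder = DataHolder(con).get_data()
```

Once `get_data` has run, every later method can read `self.x_train` and friends directly instead of receiving them as arguments.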
Usage of the class:
mlmodel = MLModel(
    njobs=-1,
    selected_models='rf,lgbm',
    model_grid_list=get_model_params(),
    search_method='random',
    search_iters=1,
    search_scorer='f1',
    ensemble_method='soft_voting',
    verbose=3,
    sql_engine=sql_engine,
    with_mlflow=1,
    mlflow_experiment=1)
mlmodel.fit_full()
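The post does not show what `fit_full` does internally. One plausible shape, sketched here purely as an assumption, is a single public method that runs the individual steps in a fixed order and reports progress depending on `verbose`. The class name `Workflow` and the step names `preprocess`, `fit` and `evaluate` are hypothetical.

```python
class Workflow:
    """Hypothetical skeleton: fit_full() chains the individual steps."""

    def __init__(self, verbose=0):
        self.verbose = verbose
        self.completed_steps = []  # records which steps have run

    def _log(self, message):
        # Print progress only when verbosity is switched on.
        if self.verbose > 0:
            print(message)

    def get_data(self):
        self.completed_steps.append("get_data")

    def preprocess(self):
        self.completed_steps.append("preprocess")

    def fit(self):
        self.completed_steps.append("fit")

    def evaluate(self):
        self.completed_steps.append("evaluate")

    def fit_full(self):
        # Run every step in a fixed order so callers need a single call.
        for step in (self.get_data, self.preprocess, self.fit, self.evaluate):
            self._log(f"running {step.__name__}")
            step()
        return self


workflow = Workflow(verbose=1).fit_full()
```

Keeping the steps as separate methods means each one can still be called and tested on its own, while `fit_full` covers the common case.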
Full script:
import os
from sqlalchemy import create_engine
from model_class import MLModel
sql_con = create_engine("your connection definition")