import pandas as pd
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline, Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

d1 = {'Social_media_followers': [1000000, np.nan, 2000000, 1310000, 1700000, np.nan, 4100000, 1600000, 2200000, 1000000],
      'Sold_out': [1, 0, 0, 1, 0, 0, 0, 1, 0, 1]}
df1 = […]
Train Test Split
Train Test Split is an important concept that future Data Scientists or Machine Learning Engineers need to pick up early on. When building models, you’ll want to split your data into two different sets: one for training a model and one for testing it. This article is based on the popular YouTube video on […]
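As a rough illustration of the idea, here is a minimal sketch using scikit-learn's `train_test_split`. The toy arrays, the 20% test size, and the `random_state` value are assumptions made up for this example, not values taken from the article.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (made up for illustration): 10 rows, 2 features, binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

The model is fit only on the training rows, and the held-out rows give an estimate of how it behaves on data it has not seen.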
K-Nearest Neighbors
Comprehensive Understanding of K-Nearest Neighbors (KNN) in Supervised Machine Learning. K-Nearest Neighbors (KNN) is a simple, widely used supervised learning algorithm in data science and machine learning. It was developed by Evelyn Fix and Joseph Hodges in 1951. Known for its usefulness and versatility, KNN can handle both classification and regression tasks. https://youtu.be/Nz73vXn5afE […]
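To make the classification and regression uses concrete, the sketch below fits both scikit-learn KNN estimators on the bundled iris data. The dataset, `n_neighbors=5`, and the choice to predict petal width from the other three columns are assumptions picked for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Classification: predict the species from the four measurements.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Regression: predict petal width (column 3) from the other three columns.
reg = KNeighborsRegressor(n_neighbors=5)
reg.fit(X_train[:, :3], X_train[:, 3])
r2 = reg.score(X_test[:, :3], X_test[:, 3])

print(f"accuracy={accuracy:.3f}, r2={r2:.3f}")
```

In both cases a prediction comes from the five closest training points: a majority vote for the classifier, an average for the regressor.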
Optuna Hyperparameter Tuning
Optuna is a hyperparameter optimization framework for machine learning models. It can help automate and streamline the process of tuning the hyperparameters. It’s quite popular among Kaggle users and you’ll see it used within competitions. In this article, we will go over an example of using it on a basic dataset. There is also a […]
Ordinal Encoder
When working with real-world data, you’ll often have to deal with categorical information. This can be a problem when working with Machine Learning models, as most cannot use categorical values directly. Instead, Data Scientists and Machine Learning Engineers need to convert this into a numerical format. This is where the Ordinal Encoder in Scikit-Learn can help. […]
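For a sense of how this conversion works, here is a small sketch with scikit-learn's `OrdinalEncoder`. The `sizes` column and its small/medium/large ordering are hypothetical values invented for this example.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical categorical column with a natural order.
sizes = np.array([["small"], ["large"], ["medium"], ["small"]])

# Passing the categories explicitly fixes the order: small=0, medium=1, large=2.
encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])
encoded = encoder.fit_transform(sizes)

print(encoded.ravel())  # [0. 2. 1. 0.]
```

Without the `categories` argument the encoder sorts the values alphabetically, which would put "large" before "medium" here.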
One Hot Encoder
In the realm of machine learning and data science, preparing your data is often as crucial as the modeling itself. One of the essential preprocessing steps when working with categorical data is one-hot encoding. This technique transforms categorical variables into a format that can be provided to machine learning algorithms to improve predictions and insights. […]
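The transformation can be sketched with scikit-learn's `OneHotEncoder` as follows. The `colors` column is a made-up example, and `handle_unknown="ignore"` is one reasonable setting rather than a requirement.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column with no natural order.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# handle_unknown="ignore" encodes unseen categories as all zeros at transform time.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(colors).toarray()  # convert the sparse result to dense

print(encoder.categories_[0])  # ['blue' 'green' 'red']
print(encoded)
```

Each category gets its own 0/1 column, so the first row ("red") becomes `[0, 0, 1]` under the alphabetical column order shown above.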