import pandas as pd
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline, Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
d1 = {'Social_media_followers':[1000000, np.nan, 2000000, 1310000, 1700000, np.nan, 4100000, 1600000, 2200000, 1000000], 'Sold_out':[1,0,0,1,0,0,0,1,0,1]}
df1 = […]
Train Test Split
Train Test Split is an important concept that future Data Scientists or Machine Learning Engineers need to pick up early on. When building models, you’ll want to split your data into two different sets: one for training a model and one for testing it. This article is based on the popular YouTube video on […]
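As a minimal sketch of the idea, the snippet below uses scikit-learn's `train_test_split` on a tiny made-up dataset. The data, the 80/20 ratio, and the `random_state` value are assumptions chosen for illustration and are not taken from the article.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 rows with one feature and a binary label.
X = [[i] for i in range(10)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))  # 8 training rows, 2 test rows
```

The model is fit only on the training rows, so the test rows give an estimate of how it behaves on data it has not seen.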
PACF Partial Autocorrelation Function
In this Data Science article, we are going to take a look at the Partial Autocorrelation Function (PACF). We will go over the background and then look at plotting both non-stationary and stationary data. If you want to watch a video based around this tutorial, it is embedded below. https://youtu.be/XstPVx78yi8 PACF Background The PACF […]
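To make the idea concrete, here is a rough sketch that computes partial autocorrelations with the Durbin-Levinson recursion using only NumPy. This is a swapped-in approach for illustration, not necessarily the one used in the article (a plotting library would normally do this for you). The helper names `acf` and `pacf_durbin_levinson` and the simulated AR(1) series are hypothetical.

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(nlags + 1)]
    )

def pacf_durbin_levinson(x, nlags):
    """Partial autocorrelation for lags 0..nlags via Durbin-Levinson."""
    r = acf(x, nlags)
    out = np.zeros(nlags + 1)
    out[0] = 1.0
    phi_prev = np.zeros(0)  # AR coefficients from the previous order
    for k in range(1, nlags + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        out[k] = phi_kk
    return out

# Hypothetical stationary series: AR(1) with coefficient 0.7.
rng = np.random.default_rng(0)
noise = rng.normal(size=2000)
series = np.zeros(2000)
for t in range(1, 2000):
    series[t] = 0.7 * series[t - 1] + noise[t]

pacf_values = pacf_durbin_levinson(series, nlags=5)
print(pacf_values.round(2))
```

For an AR(1) process like this one, the lag-1 value should land near the coefficient and the higher lags should sit near zero, which is the pattern a PACF plot is typically used to spot.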
ACF Autocorrelation Function
In this Data Science lesson, we are going to take a look at the Autocorrelation Function. Often abbreviated as ACF, it can let us know if our data is stationary or not. We will go over some of the background behind it and plot it with the help of Python. If you want to watch a […]
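As a small sketch of what the ACF measures, the snippet below computes sample autocorrelations by hand with NumPy and compares a trending series with random noise. The helper name `acf` and both example series are assumptions made for illustration.

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(nlags + 1)]
    )

# Hypothetical non-stationary series: a straight upward trend.
trend = np.arange(50)

# Hypothetical stationary series: independent random noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=500)

trend_acf = acf(trend, nlags=5)
noise_acf = acf(noise, nlags=5)
print(trend_acf.round(2))
print(noise_acf.round(2))
```

Autocorrelations that stay high and decay slowly, as with the trend, are a common hint that a series is not stationary, while values near zero after lag 0 are what noise-like stationary data tends to show.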
Pandas Series
What is a Series in Python Pandas In Python’s Pandas library, one of the foundational data structures you’ll encounter is the Series. At first glance, a Series may seem simple—much like a single column or row in a spreadsheet. However, there’s more to it than meets the eye. A Pandas Series is a one-dimensional labeled […]
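A minimal sketch of a labeled one-dimensional Series follows. The values, labels, and the name `score` are made up for illustration.

```python
import pandas as pd

# A Series holds values plus an index of labels, one label per value.
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

by_label = s["b"]         # look up by index label
by_position = s.iloc[0]   # look up by integer position

print(by_label, by_position)
print(s.index.tolist())
```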
Pandas Index
An index within Python Pandas is a way to identify a specific row within a dataframe. In this lesson we will be going over string and integer indexes as well as MultiIndexes. If you want to watch a video based on the tutorial, it is linked down below. https://youtu.be/eEXju_yrxpM Indexes vs Indices Often you’ll hear […]
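The three kinds of index mentioned above can be sketched as follows. The column names and values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Austin", "Boston", "Chicago"], "sales": [5, 7, 9]}
)

# Integer index: pandas assigns 0, 1, 2 by default.
first_city = df.loc[0, "city"]

# String index: promote a column to be the row labels.
by_city = df.set_index("city")
boston_sales = by_city.loc["Boston", "sales"]

# MultiIndex: each row is identified by a (state, city) pair.
multi = pd.DataFrame(
    {"sales": [9, 7, 5, 11]},
    index=pd.MultiIndex.from_tuples(
        [("IL", "Chicago"), ("MA", "Boston"), ("TX", "Austin"), ("TX", "Dallas")],
        names=["state", "city"],
    ),
)
dallas_sales = multi.loc[("TX", "Dallas"), "sales"]

print(first_city, boston_sales, dallas_sales)
```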
Pandas Pivot
In this lesson, we are going over 6 different examples of how you can utilize pivot within Python Pandas. Pivot allows you to reshape a dataframe and grab aggregate values quite fast in just one line of code. We will go through some easy examples and then add on complexity as the lesson progresses. If […]
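As a rough sketch with made-up data, `DataFrame.pivot` reshapes long data into a wide table, while `DataFrame.pivot_table` is the variant that aggregates when several rows share the same index and column pair. The column names here are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["Mon", "Mon", "Tue", "Tue"],
        "store": ["A", "B", "A", "B"],
        "sales": [10, 20, 30, 40],
    }
)

# Reshape: one row per date, one column per store.
wide = df.pivot(index="date", columns="store", values="sales")

# Add a second Mon/A row so there is something to aggregate.
extra = pd.DataFrame({"date": ["Mon"], "store": ["A"], "sales": [5]})
with_dupes = pd.concat([df, extra], ignore_index=True)

# pivot would raise on the duplicate pair; pivot_table sums it instead.
totals = with_dupes.pivot_table(
    index="date", columns="store", values="sales", aggfunc="sum"
)

print(wide)
print(totals)
```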
Pandas loc
In the world of data manipulation with Python’s Pandas library, selecting and accessing data efficiently is a fundamental skill. One of the most commonly used tools for this is the .loc[] indexer, which stands for location. In this article, we’re diving into 15 examples that demonstrate how to use .loc[] effectively. Unlike its counterpart .iloc[], […]
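A few common `.loc[]` patterns, sketched on a small hypothetical dataframe:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [25, 32, 47], "city": ["Austin", "Boston", "Chicago"]},
    index=["ann", "bob", "cat"],
)

# Single cell by row label and column label.
bob_city = df.loc["bob", "city"]

# Label slices include both endpoints.
ann_to_bob = df.loc["ann":"bob"]

# Boolean mask for rows combined with a column label.
cities_over_30 = df.loc[df["age"] > 30, "city"]

print(bob_city)
print(ann_to_bob)
print(cities_over_30.tolist())
```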
Pandas iloc
In Python Pandas, iloc stands for integer location. In this lesson we are going over 12 different examples of how we can utilize this to grab data within our dataframes. If you want to watch a video tutorial of this lesson, it is linked below. Import in Pandas To start we’re going to create a […]
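A few common `.iloc[]` patterns, sketched on a small hypothetical dataframe. Positions are zero-based and ignore the row labels entirely.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [25, 32, 47], "city": ["Austin", "Boston", "Chicago"]},
    index=["ann", "bob", "cat"],
)

# First row by position, returned as a Series.
first_row = df.iloc[0]

# Position slices exclude the end, like Python lists.
first_two = df.iloc[0:2]

# Last row, first column.
last_age = df.iloc[-1, 0]

# Every row of the second column.
cities = df.iloc[:, 1]

print(first_row["city"], len(first_two), last_age, cities.tolist())
```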
Pandas Merge
Merges in Python Pandas are like joins in SQL. In this lesson we are going to go through 7 different examples of using Merge. It will cover frequently used merges like left and inner while still going over infrequently used ones like full outer and cross. This tutorial is based on a YouTube video we […]
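The merge types named above can be sketched with two small made-up tables. The column names are assumptions, and `how="cross"` needs pandas 1.2 or newer.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

# Inner: only ids present in both tables.
inner = left.merge(right, on="id", how="inner")

# Left: every row of the left table; missing scores become NaN.
left_join = left.merge(right, on="id", how="left")

# Full outer: every id from either table.
outer = left.merge(right, on="id", how="outer")

# Cross: every left row paired with every right row (no key).
cross = left.merge(right, how="cross")

print(len(inner), len(left_join), len(outer), len(cross))
```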