
Blog

Python

Augmented Dickey–Fuller test

May 21, 2025 Ryan Nolan No comments yet

# import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

# Generate synthetic stationary and non-stationary data
np.random.seed(17)

# Stationary data: White noise
stationary_data = np.random.normal(size=82)

# Non-stationary data: Random walk
non_stationary_data = np.cumsum(np.random.normal(size=82))

# Plot the data
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.plot(stationary_data)
plt.title('Stationary Data')
plt.subplot(1, 2, 2)
plt.plot(non_stationary_data)
[…]

Python

KPSS-test

May 21, 2025 Ryan Nolan No comments yet

# import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import kpss

# Generate synthetic stationary and non-stationary data
np.random.seed(17)

# Stationary data: White noise
stationary_data = np.random.normal(size=100)

# Create a random walk with larger step size to make it more volatile
n = 100  # assumed: n is not defined in this excerpt; set to match the stationary series
random_walk = np.cumsum(np.random.normal(scale=2, size=n))  # Increase the scale for […]

scikit-learn

Multicollinearity

May 21, 2025 Ryan Nolan No comments yet

dividing the total number of bases a player records by their total number of at-bats
maybe replace this with something else?

Correlation Matrix

VIF

Instead of using raw height, you might normalize or categorize height into bins, which could reduce the numerical interdependence.

Calculate Condition Index (CI)

How to address Multicollinearity

Drop a Feature (At Bats)

look at […]
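The VIF and correlation matrix checks mentioned above can be sketched with NumPy alone. This is a minimal illustration, not the post's own code: the `vif` helper is a hypothetical name, and the `at_bats`, `hits`, and `height` columns are synthetic numbers made up for the demo. Each VIF is computed as 1 / (1 - R²), where R² comes from regressing one feature on all the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a 2-D array (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic data (assumed for illustration only)
rng = np.random.default_rng(0)
at_bats = rng.normal(500, 50, size=200)
hits = 0.27 * at_bats + rng.normal(0, 3, size=200)  # strongly tied to at_bats
height = rng.normal(185, 7, size=200)               # unrelated to the others

X = np.column_stack([at_bats, hits, height])
corr = np.corrcoef(X, rowvar=False)  # pairwise correlation matrix
vifs = vif(X)

print(np.round(corr, 2))
print(np.round(vifs, 2))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of trouble, which is why dropping one of the two tied features (here `at_bats` or `hits`) is a typical fix.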

Python Pandas

Pandas Sample

May 21, 2025 Ryan Nolan No comments yet

To start we're going to create a simple dataframe in Python:
https://youtu.be/REhRhRUcluI

Example 1 – if else state location

# DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)
import pandas as pd
import random
import string
import numpy as np

Prep The Dataframe

# Function to […]
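To make the parameters in the `DataFrame.sample` signature above a bit more concrete, here is a small sketch. The `player` and `hr` columns are placeholder data invented for this example, not the DataFrame the post builds.

```python
import pandas as pd

# Placeholder data (assumed for illustration)
df = pd.DataFrame({"player": list("ABCDEFGHIJ"), "hr": list(range(10))})

# n: an exact number of rows; random_state makes the draw repeatable
three_rows = df.sample(n=3, random_state=42)

# frac: a fraction of the rows instead of a fixed count
half = df.sample(frac=0.5, random_state=42)

# replace=True allows drawing more rows than the frame holds
boot = df.sample(n=15, replace=True, random_state=42)

# weights: a column name used as sampling weights (a weight of 0 is never drawn)
weighted = df.sample(n=3, weights="hr", random_state=42)

# ignore_index=True relabels the result 0, 1, 2, ...
reset = df.sample(n=3, random_state=42, ignore_index=True)

print(three_rows)
print(reset)
```

Use either `n` or `frac`, not both; pandas raises an error if both are passed.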

Time Series

Simple Exponential Smoothing

May 21, 2025 Ryan Nolan No comments yet

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.api import SimpleExpSmoothing

df = pd.read_csv('/content/all_stocks_5yr.csv')
apple_df = df[df["Name"] == "AAPL"].copy()
apple_df["date"] = pd.to_datetime(apple_df["date"])
apple_df.sort_values("date", inplace=True)
apple_df.set_index("date", inplace=True)
apple_df = apple_df.asfreq('B')
apple_df["close"] = apple_df["close"].interpolate()
apple_close = apple_df["close"]

plt.figure(figsize=(10, 4))
plt.plot(apple_close, label="Apple Closing Price", color="black")
plt.title("Apple Stock Closing Prices")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
plt.grid(True)
[…]
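For intuition about what simple exponential smoothing does, here is a hand-written version of the recursion. It is a sketch only and is not how statsmodels' `SimpleExpSmoothing` is implemented: the `simple_exp_smoothing` helper, the `alpha=0.5` value, the toy `prices` list, and the choice to start the level at the first observation are all assumptions made for this example.

```python
import numpy as np

def simple_exp_smoothing(values, alpha):
    """level[t] = alpha * y[t] + (1 - alpha) * level[t-1] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    level = np.empty_like(values)
    level[0] = values[0]  # assumed initialisation: start at the first observation
    for t in range(1, len(values)):
        level[t] = alpha * values[t] + (1 - alpha) * level[t - 1]
    return level

# Toy series (made up for illustration, not the stock data)
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
smoothed = simple_exp_smoothing(prices, alpha=0.5)
print(smoothed)  # [100. 101. 101. 103. 105.]

# The one-step-ahead forecast is just the last smoothed level
forecast = smoothed[-1]
```

A larger `alpha` tracks recent values more closely; a smaller one produces a smoother, slower-moving line.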

Python Pandas

Pandas MultiIndex

May 21, 2025 Ryan Nolan No comments yet

https://www.youtube.com/watch?v=XHOmBV4js_E
maybe other ideas
#https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.html
#https://pandas.pydata.org/pandas-docs/version/1.2.1/user_guide/advanced.html
#https://jessicastringham.net/2019/12/10/multiindex/
#You have now created a multi-index, or hierarchical index (become comfortable with both these terms as you'll find them used interchangeably)
#It may be important to address that despite being able to convert the contents of more than one column into index, we cannot consider that now one row has several indexes. […]
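The point in the notes above, that several columns become one hierarchical index rather than several separate indexes, can be seen in a short sketch. The `team`, `year`, and `wins` columns hold made-up values used only for illustration.

```python
import pandas as pd

# Made-up data (assumed for illustration)
df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "year": [2023, 2024, 2023, 2024],
    "wins": [82, 94, 78, 81],
})

# Two columns move into the index...
indexed = df.set_index(["team", "year"])

# ...but the result is still a single index object with two levels
print(type(indexed.index).__name__)  # MultiIndex
print(indexed.index.nlevels)         # 2

# Each row label is one tuple, not two separate indexes
print(indexed.index[0])              # ('A', 2023)

# Select with the full tuple, or with just the outer level
wins_a_2024 = indexed.loc[("A", 2024), "wins"]
team_b = indexed.loc["B"]
print(wins_a_2024)
print(team_b)
```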

Python Pandas

Pandas Replace

May 21, 2025 Ryan Nolan No comments yet

import pandas as pd
import numpy as np

# Similar to map
# .replace() is not operating on the contents of the DataFrame as strings; it's trying to match the entire value
data = {
    'Player': ['Barry Bonds', 'Hank Aaron', 'Babe Ruth', 'Alex Rodriguez', 'Albert Pujols', 'Willie Mays', 'Ken Griffey Jr.'],
    'HR': [762, 755, 714, 696, 703, […]
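The comment about `.replace()` matching the entire value is easy to check. Below is a small sketch using three of the player names from the excerpt; the replacement text `"B."` is an arbitrary choice for the demo.

```python
import pandas as pd

s = pd.Series(["Barry Bonds", "Hank Aaron", "Babe Ruth"])

# Whole-value match: no element is exactly "Barry", so nothing changes
whole = s.replace("Barry", "B.")

# Whole-value match that does hit
exact = s.replace("Barry Bonds", "B. Bonds")

# regex=True switches .replace() to pattern matching inside each string
sub_regex = s.replace("Barry", "B.", regex=True)

# .str.replace() always works on substrings
sub_str = s.str.replace("Barry", "B.", regex=False)

print(whole.tolist())      # ['Barry Bonds', 'Hank Aaron', 'Babe Ruth']
print(sub_regex.tolist())  # ['B. Bonds', 'Hank Aaron', 'Babe Ruth']
```

So if a `.replace()` call seems to do nothing, the first thing to check is whether the target is a full cell value or only part of one.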

Python Pandas

Pandas Where

May 19, 2025 Ryan Nolan No comments yet

https://www.youtube.com/watch?v=Y7HMkDuR_DA&feature=youtu.be

The where() function in Pandas replaces values in a DataFrame or Series where a condition is not met; values that satisfy the condition are kept as they are. It checks a DataFrame against one or more conditions and returns the result.

To start, we import pandas and NumPy:

import pandas as pd
import numpy as np
[…]
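Here is a minimal sketch of that behaviour. The Series values and the thresholds (20 and 50) are arbitrary numbers picked for illustration.

```python
import pandas as pd
import numpy as np

s = pd.Series([10, 25, 40, 55])

# Values that fail the condition become NaN by default
kept = s.where(s > 20)

# A second argument supplies the replacement value instead of NaN
filled = s.where(s > 20, 0)

# Several conditions can be combined with & and |
both = s.where((s > 20) & (s < 50), -1)

print(kept.tolist())    # [nan, 25.0, 40.0, 55.0]
print(filled.tolist())  # [0, 25, 40, 55]
print(both.tolist())    # [-1, 25, 40, -1]
```

Note the parentheses around each condition when combining them; without them Python's operator precedence gets in the way.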

Python Pandas

Pandas Mask

May 18, 2025 Ryan Nolan No comments yet

https://www.youtube.com/watch?v=XHOmBV4js_E

Prep the Data

To start we're going to create a simple dataframe in Python:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Hourly_Salary': ['500.00', '10000.00', '200.00', '20.00', np.nan]
})
df['Hourly_Salary'] = pd.to_numeric(df['Hourly_Salary'])

Example 1 – if else state location

To […]
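As a quick sketch of what `mask()` does with the salary data from the excerpt: it replaces values where the condition is True, which makes it the mirror image of `where()`. The cap of 1000 is an assumed threshold chosen for this example, not something taken from the post.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "Hourly_Salary": ["500.00", "10000.00", "200.00", "20.00", np.nan]
})
df["Hourly_Salary"] = pd.to_numeric(df["Hourly_Salary"])

# Assumed threshold for illustration: cap anything above 1000
too_high = df["Hourly_Salary"] > 1000

# mask() replaces values where the condition is True
capped = df["Hourly_Salary"].mask(too_high, 1000)

# The same result written with where() and the inverted condition
same_via_where = df["Hourly_Salary"].where(~too_high, 1000)

print(capped.tolist())  # [500.0, 1000.0, 200.0, 20.0, nan]
```

The missing value stays missing because `NaN > 1000` evaluates to False, so `mask()` leaves it alone.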

Python Pandas

Pandas Interpolation

May 12, 2025 Ryan Nolan No comments yet

To start we're going to create a simple dataframe in Python:
https://youtu.be/BJHwPeRvyPE?si=lvsDqXBjb0mcC4ae

import pandas as pd
import numpy as np

data = {
    'day': pd.date_range(start='2025-04-19', periods=7),
    'temperature': [np.nan, 30, np.nan, np.nan, 45, 40, np.nan]
}
df = pd.DataFrame(data)
df2 = df.copy()
df3 = df.copy()
df4 = df.copy()

Example 1

To start we're going to create […]
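Using the same temperature data, here is a short sketch of what `interpolate()` does with its default linear method. The examples in the full post may use different options; `limit_direction="both"` is shown here only as one possible variation.

```python
import pandas as pd
import numpy as np

data = {
    "day": pd.date_range(start="2025-04-19", periods=7),
    "temperature": [np.nan, 30, np.nan, np.nan, 45, 40, np.nan],
}
df = pd.DataFrame(data)

# Default: linear interpolation, filling forward from known values.
# The two-value gap between 30 and 45 becomes 35 and 40;
# the leading NaN has nothing before it, so it stays NaN.
linear = df["temperature"].interpolate()

# limit_direction="both" also fills the leading NaN from the nearest known value
both = df["temperature"].interpolate(limit_direction="both")

print(linear.tolist())
print(both.tolist())
```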


Helping Data Professionals further their careers


© Ryan & Matt Data Science
