scikit-learn - Ryan & Matt Data Science

gradient boosting regressor

June 24, 2025 Pere No comments yet

Boosting in machine learning is a technique that combines multiple simple models, often decision trees, into a single, stronger model. Gradient boosting works with regression trees and improves performance by sequentially learning from the mistakes of previous models. According to the scikit-learn documentation, at each stage, a regression tree is fit on the negative gradient of […]
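To see the stage-by-stage improvement described in this excerpt, here is a minimal sketch using scikit-learn's GradientBoostingRegressor. The synthetic dataset and the hyperparameter values are assumptions chosen for illustration, not settings from the post.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data (assumption: illustrative sizes and noise level).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 200 stages adds one shallow regression tree to the ensemble.
model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# staged_predict yields the ensemble's predictions after each boosting stage,
# so we can track how the test error changes as trees are added.
stage_mse = [
    float(((y_test - pred) ** 2).mean()) for pred in model.staged_predict(X_test)
]
r2 = model.score(X_test, y_test)

print("MSE after first stage:", stage_mse[0])
print("MSE after last stage:", stage_mse[-1])
print("Test R^2:", r2)
```

Comparing the first and last entries of stage_mse shows how much the sequence of trees reduces the error on held-out data.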

scikit-learn

Random Forest Regressor

June 24, 2025 Ryan Nolan No comments yet

Random forest regressor is a variant of the random forest classifier. It is used for regression tasks, where the target is a continuous value. This model is an ensemble of decision trees. It combines the predictions of multiple individual trees to improve performance. By aggregating the results from those trees, typically through averaging (or voting, in the classifier), it produces a final prediction that […]
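The aggregation step can be checked directly. The sketch below, on synthetic data (an assumption for illustration), averages the predictions of the individual trees by hand and compares the result with the forest's own prediction.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (assumption: illustrative only).
X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Predict the first five rows with every individual tree, then average.
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The forest's prediction is the mean of its trees' predictions.
forest_prediction = forest.predict(X[:5])

print(np.allclose(manual_average, forest_prediction))
```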

scikit-learn

machine learning imbalanced classes

June 6, 2025 Ryan Nolan No comments yet

#Read over #data professor #emma Ding #mahesh huddar #ritvik math
Part 1: Load a Dataset
Part 2: Simple EDA
Part 3: Set Up the Data
Part 4: Baseline Model (No Fixing the Imbalance)
Part 5: Oversampling Example 1, RandomOverSampler. To start we're going to create a simple dataframe in Python. (led to overfitting)
Part 6: Oversampling Method Example 2, SMOTE […]

scikit-learn

Column Transformer

June 6, 2025 Ryan Nolan No comments yet

#drop #Example Passthrough some columns, drop others

scikit-learn

extra trees classifier

June 6, 2025 Ryan Nolan No comments yet

The Extra Trees Classifier is an ensemble machine learning method that combines predictions from many individual trees. https://youtu.be/S2e70seVw3k

It aggregates the results from a group of decision trees (like a random forest).

Differences:
1. ETC randomly selects the value used to split a feature, unlike a DTC, which looks for the best split.
2. This makes ETC more random and a faster algorithm, which […]
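As a rough illustration of the comparison with a random forest, the sketch below fits both ensembles on the same synthetic classification data. The dataset and settings are assumptions, and the scores will vary from dataset to dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data (assumption: illustrative only).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for name, cls in [
    ("extra_trees", ExtraTreesClassifier),
    ("random_forest", RandomForestClassifier),
]:
    # Same number of trees and seed for both, so only the split rule differs.
    clf = cls(n_estimators=100, random_state=0).fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)  # accuracy on held-out data

print(scores)
```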

scikit-learn

Lasso Regression

June 6, 2025 Ryan Nolan No comments yet

https://youtu.be/LmpBt0tenJE
# LASSO stands for Least Absolute Shrinkage and Selection Operator
# L1 regularization
# addresses overfitting: a model that is too complex may fit the training data very well
# but perform poorly on new, unseen data
# will get rid of useless features (shrinks the coefficients of those independent variables to, or close to, 0)
# - automatic feature selection
# leads to a simpler model that […]
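The feature-selection effect in these notes can be sketched on synthetic data where only two of six features matter. The data, the scaling step, and alpha=0.3 are assumptions picked for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))

# Only the first two features influence y; the other four are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Scale first so the L1 penalty treats every feature on the same footing.
X_scaled = StandardScaler().fit_transform(X)

lasso = Lasso(alpha=0.3).fit(X_scaled, y)

# Count the coefficients that the L1 penalty set to exactly zero.
n_zero = int(np.sum(lasso.coef_ == 0))

print(lasso.coef_)
print("zeroed coefficients:", n_zero)
```

A larger alpha zeroes more coefficients; in practice the value is usually chosen by cross-validation, for example with LassoCV.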

scikit-learn

Ridge Regressor

June 4, 2025 Ryan Nolan No comments yet

https://youtu.be/GMF4Td7KtB0
# Ridge Regression, which is considered L2 regularization
# helps with overfitting in linear regression models
# keeps the coefficients small
# leads to a model that is less prone to overfitting
# balance between fitting the data and keeping the coefficients small
# more robust and stable models, particularly when dealing with datasets that have highly correlated predictor variables […]
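To illustrate the point about highly correlated predictors, here is a small sketch comparing ordinary least squares with Ridge on two nearly identical features. The synthetic data and alpha=1.0 are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 2.0 * x1 + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# The L2 penalty shrinks the coefficient vector relative to plain least squares.
ols_norm = float(np.linalg.norm(ols.coef_))
ridge_norm = float(np.linalg.norm(ridge.coef_))

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```

With near-duplicate features, the unpenalized coefficients tend to be unstable, while Ridge typically spreads the weight across both features.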

scikit-learn

Stacking Regressor

June 3, 2025 Ryan Nolan No comments yet

SEE ALL NULL VALUES voting classifier hyperparameter tuning

scikit-learn

Multicollinearity

May 21, 2025 Ryan Nolan No comments yet

dividing the total number of bases a player records by their total number of at-bats
maybe replace this with something else?
CORRELATION MATRIX
VIF
Instead of using raw height, you might normalize or categorize height into bins, which could reduce the numerical interdependence.
Calculate Condition Index (CI)
How to address Multicollinearity
Drop a Feature (At Bats)
look at […]
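The VIF and drop-a-feature steps in these notes could be sketched as follows. The data is synthetic, the column names (at_bats, hits, height) are hypothetical stand-ins, and the vif helper is a hand-rolled illustration using the formula VIF = 1 / (1 - R^2) rather than any library routine.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): hits is built to track at_bats closely.
at_bats = rng.integers(300, 650, size=200).astype(float)
hits = at_bats * 0.27 + rng.normal(scale=8.0, size=200)
height = rng.normal(loc=73.0, scale=2.5, size=200)  # unrelated to the others
df = pd.DataFrame({"at_bats": at_bats, "hits": hits, "height": height})


def vif(frame):
    """Hypothetical helper: regress each column on the rest, VIF = 1 / (1 - R^2)."""
    result = {}
    for col in frame.columns:
        others = frame.drop(columns=col)
        r2 = LinearRegression().fit(others, frame[col]).score(others, frame[col])
        result[col] = 1.0 / (1.0 - r2)
    return pd.Series(result)


vif_all = vif(df)                               # at_bats and hits inflate each other
vif_reduced = vif(df.drop(columns="at_bats"))   # after dropping one of the pair

print(df.corr())
print(vif_all)
print(vif_reduced)
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity, though the cutoff is a convention rather than a hard rule.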

scikit-learn

Sklearn Gaussian Mixture Models

April 3, 2025 Ryan Nolan No comments yet

In scikit-learn, Gaussian Mixture Models let you represent clusters of data as a mixture of multiple normal distributions. This tutorial will walk you through two different examples of using GMMs. We will go through one with generated blobs and another with baseball card values. If you want to watch a video based around the tutorial, we have […]
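For the generated-blobs case, a minimal sketch might look like the following. The blob parameters and the choice of three components are assumptions, not the tutorial's actual settings.

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Generated blobs (assumption: three centers, illustrative spread).
X, _ = make_blobs(n_samples=400, centers=3, cluster_std=0.8, random_state=0)

# Fit one Gaussian component per expected cluster.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

labels = gmm.predict(X)       # hard assignment: most likely component per point
probs = gmm.predict_proba(X)  # soft assignment: one probability per component

print(gmm.weights_)  # mixing weight of each component
print(gmm.means_)    # estimated center of each component
print(probs[:3])
```

Unlike k-means, the soft assignments in probs show how confident the model is about each point, which is useful near cluster boundaries.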
