Ryan Nolan - Ryan & Matt Data Science

Python Pandas JSON

July 19, 2025 Ryan Nolan 1 comment

JSON (JavaScript Object Notation) is a lightweight, human-readable data interchange format that is widely used for both data storage and transfer. It is structured using key-value pairs and supports various data types, including strings, numbers, booleans, arrays, and nested objects. JSON is a standard format commonly used in APIs and web data, which makes it […]
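
As a small illustration of the data types listed above, the sketch below round-trips a record through Python's standard `json` module. The record and its field names are made up for this example and do not come from the post.

```python
import json

# Hypothetical record covering the JSON data types named above.
record = {
    "name": "Ada",                    # string
    "age": 36,                        # number
    "active": True,                   # boolean
    "tags": ["math", "code"],         # array
    "address": {"city": "London"},    # nested object
}

text = json.dumps(record)      # Python dict -> JSON string
restored = json.loads(text)    # JSON string -> Python dict

print(text)
print(restored["address"]["city"])
```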

Web Scraping

beautifulsoup pagination

July 19, 2025 Ryan Nolan No comments yet

import requests – Allows us to make HTTP requests to web pages.
from bs4 import BeautifulSoup – Parses and extracts data from HTML content.
import pandas as pd – Organizes and manipulates data in table format.
import re – Enables pattern matching using regular expressions.
from time […]

scikit-learn

adaboost classifier

July 12, 2025 Ryan Nolan No comments yet

Adaptive Boosting, or AdaBoost, is a boosting algorithm that combines multiple low-accuracy (weak) models to form a single high-accuracy (strong) model. It works by sequentially training these weak learners, each one focusing more on the errors made by the previous ones. Any machine learning algorithm that supports weighted training samples—such as Decision Trees, Logistic Regression, […]
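
A minimal sketch of the idea with scikit-learn's `AdaBoostClassifier`, assuming a synthetic dataset from `make_classification` (the dataset, sizes, and `n_estimators=50` are illustrative choices, not taken from the post). By default the weak learner is a depth-1 decision tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption for this example).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Weak learners are trained one after another; each round reweights
# the samples the previous learners misclassified.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```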

Python

Gradient boosting classifier

July 12, 2025 Ryan Nolan No comments yet

Gradient Boosting is an ensemble technique that builds a strong model by combining multiple weak decision trees. While it may seem similar to a Random Forest, there’s a key difference: in Random Forests, each tree is built independently, whereas in Gradient Boosting, trees are built sequentially, with each new tree correcting the errors of the […]
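
The sequential-versus-independent distinction can be sketched with scikit-learn, again on an assumed synthetic dataset. `staged_predict` returns the predictions after each added tree, which makes the step-by-step construction visible; the hyperparameter values are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for this example).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Gradient Boosting: trees are added one at a time, each fit to the
# errors left by the trees before it.
gb = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
).fit(X_train, y_train)

# Random Forest: every tree is built independently of the others.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Test accuracy after 1, 2, ..., 100 boosting stages.
staged_accuracy = [(pred == y_test).mean() for pred in gb.staged_predict(X_test)]

print(f"GB after 1 tree:    {staged_accuracy[0]:.3f}")
print(f"GB after 100 trees: {staged_accuracy[-1]:.3f}")
print(f"Random Forest:      {rf.score(X_test, y_test):.3f}")
```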

scikit-learn

Kaggle House price prediction Regression Analysis

July 12, 2025 Ryan Nolan No comments yet

train_df = train_df.drop(columns=['PoolQC', 'MiscFeature', 'Alley', 'Fence', 'GarageYrBlt', 'GarageCond', 'BsmtFinType2'])
test_df = test_df.drop(columns=['PoolQC', 'MiscFeature', 'Alley', 'Fence', 'GarageYrBlt', 'GarageCond', 'BsmtFinType2'])
# drop GarageArea or GarageCars
# build models

scikit-learn

kaggle titanic tutorial

July 12, 2025 Ryan Nolan No comments yet

# military – Capt, Col, Major
# noble – Jonkheer, the Countess, Don, Lady, Sir
# unmarried female – Mlle, Ms, Mme
# NEW: drop Sibsp, Parch, TicketNumberCounts
# OLD
# X = train_df.drop(['Survived'], axis=1)
# y = train_df['Survived']
# X_test = test_df.drop(['Age_Cut', 'Fare_Cut'], axis=1)

Statistics

python variance and standard deviation

July 6, 2025 Ryan Nolan No comments yet

https://youtu.be/p4H2b2x_nWc

#population and sample variance/std deviation

Variance measures how far each data point in the set is from the mean and thus from every other point in the set. It is the average of the squared differences from the mean. Population variance is calculated when you have data for the entire population. It gives a measure of the dispersion of […]
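
A short worked example of the definition above, using only the standard library. The data values are made up; the manual calculation is shown next to the `statistics` functions so the two can be compared. The sample versions divide by n - 1 instead of n.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values; their mean is 5

mean = statistics.mean(data)
squared_diffs = [(x - mean) ** 2 for x in data]

# Population variance: average of the squared differences from the mean.
population_variance = sum(squared_diffs) / len(data)        # 4.0
# Sample variance: same sum, divided by n - 1.
sample_variance = sum(squared_diffs) / (len(data) - 1)

# Standard deviation is the square root of the variance.
population_std = statistics.pstdev(data)
sample_std = statistics.stdev(data)

print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
print(population_std, sample_std)
```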

scikit-learn

hyperparameter tuning with scikit learn

July 5, 2025 Ryan Nolan No comments yet

We will be looking at tuning hyperparameters with Scikit-Learn. Scikit-Learn is a powerful machine learning library for Python. It provides simple, efficient tools for data analysis and modeling. Hyperparameter tuning is the process of finding the best values for the settings of a machine learning model that are not learned from data, but set […]
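
One common way to do this in scikit-learn is an exhaustive grid search with cross-validation. The sketch below assumes the iris dataset, a decision tree, and a small made-up parameter grid; none of these choices come from the post.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters are set before training; the grid lists the
# candidate values to try (illustrative values only).
param_grid = {
    "max_depth": [2, 3, 4],
    "min_samples_split": [2, 5],
}

# Every combination (3 x 2 = 6) is scored with 5-fold cross-validation.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```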

scikit-learn

principal component analysis scikit learn

July 5, 2025 Ryan Nolan No comments yet

PCA (Principal Component Analysis), available in Python through Scikit-learn, is a technique used to reduce the number of features in a dataset while preserving most of the variance (information). It works by finding new axes (principal components) that capture the most variance, then projecting the data onto these fewer dimensions. It’s useful for visualization, speeding up models, […]
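
A minimal sketch of both steps with scikit-learn, assuming the iris dataset and a reduction from four features to two. Standardizing first is a common choice because PCA is sensitive to feature scale; it is an assumption here, not something stated in the excerpt.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)            # 150 samples, 4 features

# Put all features on the same scale before looking for directions
# of maximum variance.
X_scaled = StandardScaler().fit_transform(X)

# Find the two principal components and project the data onto them.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

# Share of the total variance captured by each component.
explained = pca.explained_variance_ratio_
print(X_reduced.shape)
print(explained, explained.sum())
```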

Python

Reflexion Prompting

July 3, 2025 Ryan Nolan No comments yet

This technique is highly effective for chatbots and problem-solving tasks. It also helps reduce hallucinations by incorporating a form of quality control. The process involves: (1) starting with an initial prompt; (2) getting the AI’s first response; (3) sending a reflexion prompt asking the AI to review and reflect on its first answer; (4) receiving an optimized response, improved […]
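
The steps above can be sketched as a two-call loop. Everything named here is hypothetical: `ask_model` stands for whatever function sends a prompt to a model and returns its text, `fake_model` is a canned stand-in so the example runs offline, and the wording of the reflexion prompt is only one possible phrasing.

```python
from typing import Callable, Tuple

# Hypothetical wording for the reflexion prompt.
REFLEXION_PROMPT = (
    "Review your previous answer below. Point out any errors or gaps, "
    "then give an improved answer.\n\n"
    "Question: {question}\n\nPrevious answer: {answer}"
)


def reflexion(question: str, ask_model: Callable[[str], str]) -> Tuple[str, str]:
    """Return (first response, improved response) for a question."""
    first = ask_model(question)                      # initial prompt and response
    followup = REFLEXION_PROMPT.format(question=question, answer=first)
    improved = ask_model(followup)                   # reflexion prompt and response
    return first, improved


def fake_model(prompt: str) -> str:
    """Canned stand-in for a real model call (not a real API)."""
    if prompt.startswith("Review your previous answer"):
        return "Improved: 4"
    return "Draft: 5"


first_answer, final_answer = reflexion("What is 2 + 2?", fake_model)
print(first_answer)
print(final_answer)
```

In practice `fake_model` would be replaced by a call to an actual chat model; the structure of the two calls stays the same.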
