With query, you can filter a DataFrame using a more readable syntax. In this lesson we will go over how to use it with numbers, strings, variables, and more. The YouTube video this lesson is based on is linked below. Let’s start by importing […]
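As a rough illustration of the three cases the excerpt mentions (numbers, strings, variables), here is a minimal sketch. The DataFrame, its column names, and the `min_age` variable are made up for this example and are not taken from the lesson.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [28, 35, 42, 19],
    "city": ["Austin", "Boston", "Austin", "Denver"],
})

# Numeric filter: rows where age is greater than 30
over_30 = df.query("age > 30")

# String filter: quote the string value inside the expression
in_austin = df.query("city == 'Austin'")

# Variable filter: prefix a Python variable with @ to reference it
min_age = 25
at_least_min = df.query("age >= @min_age")

print(over_30)
print(in_austin)
print(at_least_min)
```

Each call returns a new DataFrame and leaves `df` unchanged.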
Pandas Shift
The pandas shift() method is a powerful and versatile tool in data analysis with Python. It allows you to shift the values of a DataFrame or Series up or down along an axis, making it especially useful for comparing a row or column to its previous or future counterpart. This functionality is commonly applied in time […]
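To make the idea of comparing a row with its previous or next counterpart concrete, here is a small sketch. The `sales` data and column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical daily sales figures, invented for illustration only
sales = pd.DataFrame({
    "day": ["Mon", "Tue", "Wed", "Thu"],
    "units": [10, 12, 9, 15],
})

# shift(1) moves values down one row, so each row sees the previous value
sales["prev_units"] = sales["units"].shift(1)

# shift(-1) moves values up one row, so each row sees the next value
sales["next_units"] = sales["units"].shift(-1)

# Day-over-day change; the first row has no previous value and stays NaN
sales["change"] = sales["units"] - sales["prev_units"]

print(sales)
```

Rows that have nothing to shift in are filled with NaN by default, which is why the shifted columns become floats.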
Python Pandas Explode
Using explode in Python Pandas allows you to transform each element of a list into a new row within a DataFrame. Python Pandas Explode YouTube Video: If you want a video demonstration of how the code works, check out the video below on our YouTube channel. When working with Python Pandas the first step is to […]
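A minimal sketch of what turning each list element into its own row looks like. The `orders` DataFrame and its columns are assumptions made for this example, not data from the post.

```python
import pandas as pd

# Hypothetical orders where one column holds lists, invented for illustration
orders = pd.DataFrame({
    "order_id": [1, 2],
    "items": [["apple", "pear"], ["milk"]],
})

# explode() creates one row per list element and repeats the other columns
exploded = orders.explode("items")

# The original index is repeated for each element, so reset it if needed
exploded = exploded.reset_index(drop=True)

print(exploded)
```

The non-list columns (here `order_id`) are duplicated for every element that came from the same original row.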
K-Nearest Neighbors
A Comprehensive Understanding of K-Nearest Neighbors (KNN) in Supervised Machine Learning. K-Nearest Neighbors (KNN) is a simple, widely used supervised learning algorithm in data science and machine learning. It was developed by Evelyn Fix and Joseph Hodges in 1951. Known for its usefulness and versatility, KNN can handle both classification and regression tasks. https://youtu.be/Nz73vXn5afE […]
Tree of Thought Prompting
In the ever-evolving field of artificial intelligence, reasoning and problem-solving capabilities have seen remarkable advancements. One such innovation that stands out is the use of Tree of Thoughts (TOT) in AI reasoning, particularly in the realms of mathematical reasoning and writing. TOT excels in guiding the progression of thoughts, making the problem-solving process more comprehensive […]
Chain of Thought Prompting
Chain of Thought Prompting, also known as COT, is a way to enhance the reasoning and problem-solving abilities of Large Language Models (LLMs). It helps guide the LLM through a step-by-step process to arrive at a final result. It’s like showing your work on a math problem. This is done by breaking down the prompt […]
Python Pandas GroupBy
Many data analysts begin their journey with SQL, learning how to use GROUP BY to aggregate and summarize data. As they advance, they often transition to Python for more complex data manipulation. One of the key features in Python’s Pandas library is the groupby function, which allows for powerful and flexible data grouping and aggregation. […]
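For readers coming from SQL, the sketch below shows roughly how a `GROUP BY` with aggregates maps onto pandas. The `df` data, the column names, and the output column names (`total_sales`, `avg_sales`) are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration only
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "sales": [100, 200, 150, 50],
})

# Roughly: SELECT region, SUM(sales) FROM df GROUP BY region
totals = df.groupby("region")["sales"].sum()

# Several aggregates at once, using named aggregation for the output columns
summary = df.groupby("region").agg(
    total_sales=("sales", "sum"),
    avg_sales=("sales", "mean"),
)

print(totals)
print(summary)
```

By default the grouping column becomes the index of the result; passing `as_index=False` to `groupby` keeps it as a regular column instead.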
Optuna Hyperparameter Tuning
Optuna is a hyperparameter optimization framework for machine learning models. It can help automate and streamline the process of tuning the hyperparameters. It’s quite popular among Kaggle users and you’ll see it used within competitions. In this article, we will go over an example of using it on a basic dataset. There is also a […]
Ordinal Encoder
When working with real-world data, you’ll often have to deal with categorical information. This can be a problem when working with machine learning models, as most cannot use it directly. Instead, data scientists and machine learning engineers need to convert this into a numerical format. This is where the Ordinal Encoder in Scikit-Learn can help. […]
One Hot Encoder
In the realm of machine learning and data science, preparing your data is often as crucial as the modeling itself. One of the essential preprocessing steps when working with categorical data is one-hot encoding. This technique transforms categorical variables into a format that can be provided to machine learning algorithms to improve predictions and insights. […]
