Pandas Mask
The mask() method in Pandas is used to replace values where certain conditions are met.
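As a quick illustration of that behavior, here is a minimal sketch on a toy Series. The Series `s` and the variable names are made up for this example and are not part of the data used below.

```python
import pandas as pd

# Hypothetical toy Series, not part of the tutorial's data.
s = pd.Series([1, 5, 10])

# Values where the condition is True are replaced; NaN is the default.
default_fill = s.mask(s > 4)

# The `other` argument supplies a replacement value instead of NaN.
custom_fill = s.mask(s > 4, other=0)

print(default_fill.tolist())
print(custom_fill.tolist())
```

Values where the condition is False are left as they were, and the original Series is not modified.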
Prep the Data
To start, we import pandas as pd and NumPy as np.
import pandas as pd
import numpy as np
Next, we create a DataFrame with a single column, ‘Hourly_Salary’, and store it in the variable df. The salaries are stored as strings, and the last value is missing (NaN).
df = pd.DataFrame({
    'Hourly_Salary': ['500.00', '10000.00', '200.00', '20.00', np.nan]
})
Next, we convert the ‘Hourly_Salary’ column from strings to numbers. Because the values contain decimals and a missing value, the result is a float column, not an integer one.
df['Hourly_Salary'] = pd.to_numeric(df['Hourly_Salary'])
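To confirm what the conversion produces, you can inspect the column's dtype afterwards. The sketch below rebuilds the same DataFrame so it runs on its own; the `is_float` variable is only there for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Hourly_Salary': ['500.00', '10000.00', '200.00', '20.00', np.nan]
})

# Convert the string column to a numeric column.
df['Hourly_Salary'] = pd.to_numeric(df['Hourly_Salary'])

# Check that the column now holds floats (NaN cannot live in a plain int column).
is_float = pd.api.types.is_float_dtype(df['Hourly_Salary'])
print(df['Hourly_Salary'].dtype, is_float)
```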

Example 1 - If/else replacement with NaN
Here, we replace all values in df that are greater than or equal to 1000 with NaN, leaving the rest unchanged.
df_mask = df.mask(df >= 1000)
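A self-contained sketch of this step is shown below. It builds the numeric column directly as floats, which should match the data after the conversion above.

```python
import numpy as np
import pandas as pd

# Same values as the tutorial's df after the numeric conversion.
df = pd.DataFrame({'Hourly_Salary': [500.0, 10000.0, 200.0, 20.0, np.nan]})

# Replace values >= 1000 with NaN; everything else is kept.
df_mask = df.mask(df >= 1000)
print(df_mask)
```

Only the 10000.0 entry is replaced. The row that was already NaN stays NaN, because comparing NaN with a number evaluates to False. mask() returns a new DataFrame, so df itself is unchanged.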

Example 2 - Keep values under 1000, replace the rest with another value (does not fix null values)
Here, we replace all values in df that are greater than or equal to 1000 with 999.
df_mask_2 = df.mask(df >= 1000, other=999)
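The following sketch repeats this step on a standalone copy of the data, to show that the existing NaN is not touched by the replacement.

```python
import numpy as np
import pandas as pd

# Same values as the tutorial's df after the numeric conversion.
df = pd.DataFrame({'Hourly_Salary': [500.0, 10000.0, 200.0, 20.0, np.nan]})

# Replace values >= 1000 with 999 instead of NaN.
df_mask_2 = df.mask(df >= 1000, other=999)
print(df_mask_2)
```

The 10000.0 entry becomes 999, while the last row is still NaN because it never satisfied the condition.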

Example 3 - Fill null values
Here, we replace all NaN values in df with 0.
df_mask_3 = df.mask(df.isnull(), 0)
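If you want both effects at once, one possible approach is to chain two mask() calls. This is a sketch rather than part of the original walkthrough, and the name `df_capped_filled` is made up for the example.

```python
import numpy as np
import pandas as pd

# Same values as the tutorial's df after the numeric conversion.
df = pd.DataFrame({'Hourly_Salary': [500.0, 10000.0, 200.0, 20.0, np.nan]})

# Step 1: replace values >= 1000 with 999.
capped = df.mask(df >= 1000, other=999)

# Step 2: replace the remaining NaN values with 0.
df_capped_filled = capped.mask(capped.isnull(), 0)
print(df_capped_filled)
```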

Example 4 - Column example: replace anything of 100000 or more with null
To start, we create a simple DataFrame in Python:
df2 = pd.DataFrame({
    'Running Back': ['Barry Sanders', 'Walter Payton', 'Emmitt Smith', 'Jim Brown'],
    'Career Rushing Yards': [152690, 16726, 18355, 12312],
    'Touchdowns': [99, 110, 164, 106]
})

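Here, we replace every value with NaN in any row where “Career Rushing Yards” is greater than or equal to 100000. Because the condition is a single boolean Series, it applies to the whole row, so all columns in a matching row become NaN.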
df2.mask(df2["Career Rushing Yards"] >= 100000)
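The sketch below reproduces this step with the same data and, for contrast, masks only the one column by calling mask() on the Series instead of the DataFrame. The names `row_masked` and `yards_only` are made up for the example.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Running Back': ['Barry Sanders', 'Walter Payton', 'Emmitt Smith', 'Jim Brown'],
    'Career Rushing Yards': [152690, 16726, 18355, 12312],
    'Touchdowns': [99, 110, 164, 106]
})

condition = df2["Career Rushing Yards"] >= 100000

# DataFrame-level mask: every column in a matching row becomes NaN.
row_masked = df2.mask(condition)

# Series-level mask: only the values of that one column are replaced.
yards_only = df2["Career Rushing Yards"].mask(condition)

print(row_masked)
print(yards_only)
```

Which form to use depends on whether the whole record should be blanked out or only the offending value.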

Example 5 - Multiple conditions (and)
Here, we replace values with NaN in rows where Touchdowns is greater than 99 and Career Rushing Yards equals 18355. Each condition is wrapped in parentheses and the two are combined with &.
df2.mask((df2["Touchdowns"] > 99) & (df2["Career Rushing Yards"] == 18355))
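To see which rows are affected, the sketch below rebuilds the same DataFrame and checks which rows end up fully NaN. The names `both` and `and_masked` are made up for the example.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Running Back': ['Barry Sanders', 'Walter Payton', 'Emmitt Smith', 'Jim Brown'],
    'Career Rushing Yards': [152690, 16726, 18355, 12312],
    'Touchdowns': [99, 110, 164, 106]
})

# Both conditions must hold for a row to be masked.
both = (df2["Touchdowns"] > 99) & (df2["Career Rushing Yards"] == 18355)
and_masked = df2.mask(both)

# Rows where every column was replaced with NaN.
print(and_masked.isna().all(axis=1).tolist())
```

With this data, only the Emmitt Smith row satisfies both conditions, so it is the only row replaced.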

Example 6 - Multiple conditions (or), filters defined outside
This creates a boolean Series that’s True for rows where Touchdowns is greater than 108, and False otherwise.
filter1 = df2["Touchdowns"] > 108
This creates a boolean Series that’s True for rows where Career Rushing Yards is exactly 18355, and False otherwise.
filter2 = df2["Career Rushing Yards"] == 18355
Here, we replace values with NaN in rows where filter1 or filter2 is True.
df2.mask(filter1 | filter2)
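Put together as a runnable sketch with the same data, the step looks like this. The name `or_masked` is made up for the example.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Running Back': ['Barry Sanders', 'Walter Payton', 'Emmitt Smith', 'Jim Brown'],
    'Career Rushing Yards': [152690, 16726, 18355, 12312],
    'Touchdowns': [99, 110, 164, 106]
})

# Define each filter separately so it can be reused or inspected.
filter1 = df2["Touchdowns"] > 108
filter2 = df2["Career Rushing Yards"] == 18355

# A row is masked if either filter is True.
or_masked = df2.mask(filter1 | filter2)
print(or_masked)
```

With this data, the Walter Payton and Emmitt Smith rows are replaced: both have more than 108 touchdowns, and the Emmitt Smith row also matches the second filter.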

Example 7 - Create a new column and flag totals below a threshold
Here, we set “touchdown_totals” to the original Touchdowns value, but replace any value below 100 with the text “Less Than 100”.
df2["touchdown_totals"] = df2["Touchdowns"].mask(df2["Touchdowns"] < 100, other="Less Than 100")
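One thing worth checking after this step is the dtype of the new column: mixing numbers and text means it can no longer be a numeric column. The sketch below rebuilds the relevant column and inspects the result.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Running Back': ['Barry Sanders', 'Walter Payton', 'Emmitt Smith', 'Jim Brown'],
    'Touchdowns': [99, 110, 164, 106]
})

# Keep the touchdown count, but flag values below 100 with a text label.
df2["touchdown_totals"] = df2["Touchdowns"].mask(
    df2["Touchdowns"] < 100, other="Less Than 100"
)

print(df2["touchdown_totals"].tolist())
print(df2["touchdown_totals"].dtype)
```

Because the column now mixes strings and numbers, numeric operations such as sum() on it will no longer behave like they do on Touchdowns, so a flag column like this is best kept for display or reporting.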

Ryan is a Data Scientist at a fintech company, where he focuses on fraud prevention in underwriting and risk. Before that, he worked as a Data Analyst at a tax software company. He holds a degree in Electrical Engineering from UCF.