Pandas loc

Pandas loc stands for location. Today we are going through 15 different examples showcasing how this works.

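Unlike iloc, which selects by integer position, loc selects by label. In this tutorial the labels are strings, but loc also accepts lists of labels, label slices and boolean conditions, all of which are covered below.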
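As a quick illustration of the difference, here is a minimal sketch on a made-up two-row dataframe (the names demo, m-1 and m-2 are placeholders, not part of the tutorial data). Both lookups should return the same cell, one by label and one by position.

```python
import pandas as pd

# Hypothetical frame used only for this comparison.
demo = pd.DataFrame({"volume": [50000, 75000]}, index=["m-1", "m-2"])

by_label = demo.loc["m-2", "volume"]  # row label "m-2", column label "volume"
by_position = demo.iloc[1, 0]         # second row, first column

print(by_label, by_position)
```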


Start by importing pandas as pd.

  import pandas as pd

We are going to create a dataframe from a dictionary. Let’s assume we are taking a look at merchant data. We have the merchant_id, the state/city and the volume a merchant processes.

  data = {
      "merchant_id": ['qtz-837-mnb-294', 'vxl-512-ghj-763', 'prk-924-zxc-105', 'mfs-376-lkd-842', 'jdn-689-qwe-417'],
      "merchant_state": ["CA", "NY", "TX", "FL", "WA"],
      "merchant_city": ["Los Angeles", "New York", "Dallas", "Tampa", "Seattle"],
      "merchant_volume": [50000, 75000, 30000, 90000, 45000]
  }

After passing the dictionary to DataFrame(), we set merchant_id as the index so that loc can look rows up by merchant ID. Once that is done we are good to go for this tutorial.

  df = pd.DataFrame(data)
  df.set_index("merchant_id", inplace=True)
  df.head(10)

Example 1 Selecting a Single Row by Label

Since our index is the merchant ID, we pass in the ID of the row we want to see.

  df.loc['qtz-837-mnb-294']

Example 2 Selecting multiple rows

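To look at multiple rows, pass a list of labels. That is why there are two pairs of brackets: the outer pair belongs to loc and the inner pair is the list.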

  df.loc[['qtz-837-mnb-294', 'prk-924-zxc-105']]

Example 3 Selecting Row and Specific Column

If you use a comma in loc, you can also select the specific column you want to see. A bit later in this tutorial we will go over how to see only a full column, but for now let’s grab the state of the first merchant, which is California.

  df.loc['qtz-837-mnb-294', 'merchant_state']
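The first three examples return different kinds of objects, which is worth checking before chaining more operations. The sketch below uses a three-row subset of the tutorial data and assumes the usual pandas behaviour: a single label gives a Series, a list of labels gives a DataFrame, and a row label plus a column label gives a single value.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "merchant_id": ["qtz-837-mnb-294", "vxl-512-ghj-763", "prk-924-zxc-105"],
        "merchant_state": ["CA", "NY", "TX"],
        "merchant_city": ["Los Angeles", "New York", "Dallas"],
    }
).set_index("merchant_id")

one_row = df.loc["qtz-837-mnb-294"]                      # Series (columns become its index)
one_row_frame = df.loc[["qtz-837-mnb-294"]]              # DataFrame with a single row
one_value = df.loc["qtz-837-mnb-294", "merchant_state"]  # single value

print(type(one_row).__name__, type(one_row_frame).__name__, one_value)
```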

Example 4 Selecting Rows and Multiple Columns

Like we saw earlier with rows, you can also select multiple columns if you utilize another pair of brackets.

  df.loc['qtz-837-mnb-294', ['merchant_state', 'merchant_city']]

Example 5 Selecting Multiple Rows and columns

The code below showcases both multiple rows and columns.

  df.loc[['qtz-837-mnb-294', 'prk-924-zxc-105'], ['merchant_state', 'merchant_city']]

Example 6 filter on one condition

For our first filtering example, let’s only look at rows that have the state as FL.

  df.loc[df['merchant_state'] == 'FL']

Our next example filters on merchant volume being greater than or equal to 50000.

  df.loc[df['merchant_volume'] >= 50000]

Example 7 filter on one condition, select columns

From the code above, we can specify what columns we want to take a look at. Remember the 2nd parameter of loc is the columns.

  df.loc[df['merchant_volume'] >= 50000, ['merchant_city', 'merchant_volume']]

Example 8 filter on multiple conditions

If you want to filter on multiple conditions, ensure that each is wrapped in (). If you want both conditions to be true use &.

  df.loc[(df['merchant_volume'] >= 50000) & (df['merchant_state'] == 'FL')]

If you only need one of the conditions to be true, use |, which takes the place of or. Remember that you still need to use ().

  df.loc[(df['merchant_volume'] >= 50000) | (df['merchant_state'] == 'WA')]
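One way to keep the parentheses from piling up is to give each condition its own name first. This is only a sketch of that style, not something the examples above require; the variable names high_volume, in_florida and in_washington are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "merchant_id": ["qtz-837-mnb-294", "vxl-512-ghj-763", "prk-924-zxc-105",
                        "mfs-376-lkd-842", "jdn-689-qwe-417"],
        "merchant_state": ["CA", "NY", "TX", "FL", "WA"],
        "merchant_city": ["Los Angeles", "New York", "Dallas", "Tampa", "Seattle"],
        "merchant_volume": [50000, 75000, 30000, 90000, 45000],
    }
).set_index("merchant_id")

# Each condition is a boolean Series aligned with the rows of df.
high_volume = df["merchant_volume"] >= 50000
in_florida = df["merchant_state"] == "FL"
in_washington = df["merchant_state"] == "WA"

both = df.loc[high_volume & in_florida]       # both conditions must hold
either = df.loc[high_volume | in_washington]  # at least one condition holds

print(both.index.tolist())
print(either.index.tolist())
```

With this data, only the Tampa merchant satisfies both conditions, while every merchant except the Dallas one satisfies at least one.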

Example 9 filter on multiple conditions, select columns

We can expand on the above by once again selecting certain columns. Remember: the condition first, then a comma, then the columns you want shown.

  df.loc[(df['merchant_volume'] >= 50000) | (df['merchant_state'] == 'WA'), ['merchant_city', 'merchant_volume']]

Example 10 Modify data

Using loc, we can also modify data. In this example we change the state of the Tampa merchant from Florida to Kansas.

  df.loc['mfs-376-lkd-842', 'merchant_state'] = 'KS'
  df.head()
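The same assignment pattern also works with a list of columns or with a filter condition in the row position. The sketch below is illustrative only: the replacement cities (Wichita, Austin) are made-up values and not part of the tutorial data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "merchant_id": ["qtz-837-mnb-294", "vxl-512-ghj-763", "prk-924-zxc-105",
                        "mfs-376-lkd-842", "jdn-689-qwe-417"],
        "merchant_state": ["CA", "NY", "TX", "FL", "WA"],
        "merchant_city": ["Los Angeles", "New York", "Dallas", "Tampa", "Seattle"],
        "merchant_volume": [50000, 75000, 30000, 90000, 45000],
    }
).set_index("merchant_id")

# Update two columns of one row in a single assignment (hypothetical values).
df.loc["mfs-376-lkd-842", ["merchant_state", "merchant_city"]] = ["KS", "Wichita"]

# Update one column for every row that matches a condition (hypothetical value).
df.loc[df["merchant_state"] == "TX", "merchant_city"] = "Austin"

print(df[["merchant_state", "merchant_city"]])
```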

Example 11 single column

We highlighted earlier that you can also select columns, but never went over how to select a full column. For the rows we need to use a colon :. The colon is what lets you slice the dataframe, and a colon without a start or stop position grabs all the rows, which is what we want when selecting a whole column.

  df.loc[:, 'merchant_city']

Example 12 Slicing rows

Since the colon is used for slicing, it gives us the opportunity to grab a range of rows or columns in order. It’s worth noting that this works differently than in iloc: loc includes the end label, whereas iloc excludes the end position. In other words, loc slices are [start (included) : end (also included)], while iloc slices are [start (included) : end (not included)].

The code below returns every row from merchant qtz through merchant mfs, with both of them included.

  df.loc['qtz-837-mnb-294':'mfs-376-lkd-842']
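To see the difference in the end point, the following sketch slices the tutorial data once by label and once by position. The slice df.iloc[0:3] is my own choice of comparison, picked because position 3 is where merchant mfs sits.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "merchant_id": ["qtz-837-mnb-294", "vxl-512-ghj-763", "prk-924-zxc-105",
                        "mfs-376-lkd-842", "jdn-689-qwe-417"],
        "merchant_state": ["CA", "NY", "TX", "FL", "WA"],
        "merchant_city": ["Los Angeles", "New York", "Dallas", "Tampa", "Seattle"],
        "merchant_volume": [50000, 75000, 30000, 90000, 45000],
    }
).set_index("merchant_id")

by_label = df.loc["qtz-837-mnb-294":"mfs-376-lkd-842"]  # end label is included
by_position = df.iloc[0:3]                              # end position 3 is excluded

print(len(by_label), len(by_position))
```

The label slice returns four rows ending with mfs, while the position slice stops one row earlier.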

Example 13 Slicing Columns

The same loc rules for slicing are present for columns as well. The code below grabs state through city.

  df.loc[:, 'merchant_state':'merchant_city']

Example 14 Skip with slicing Row

We can utilize a second colon to add a step, which allows for skipping rows. The code below returns every second row.

  df.loc[::2]

Example 15 Skip with slicing Column

The same as above can be applied to columns. Just make sure to use a single colon first to select all the rows.

  df.loc[:, ::2]
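As a check on what a step of 2 keeps, here is a small sketch on the tutorial data. It assumes the column order after set_index is state, city, volume, so every second column should be state and volume.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "merchant_id": ["qtz-837-mnb-294", "vxl-512-ghj-763", "prk-924-zxc-105",
                        "mfs-376-lkd-842", "jdn-689-qwe-417"],
        "merchant_state": ["CA", "NY", "TX", "FL", "WA"],
        "merchant_city": ["Los Angeles", "New York", "Dallas", "Tampa", "Seattle"],
        "merchant_volume": [50000, 75000, 30000, 90000, 45000],
    }
).set_index("merchant_id")

every_other_row = df.loc[::2]        # rows at positions 0, 2, 4
every_other_column = df.loc[:, ::2]  # columns at positions 0, 2

print(every_other_row.index.tolist())
print(every_other_column.columns.tolist())
```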

Ryan is a Data Scientist at a fintech company, where he focuses on fraud prevention in underwriting and risk. Before that, he worked as a Data Analyst at a tax software company. He holds a degree in Electrical Engineering from UCF.
