How to select rows from a DataFrame based on column values in Python?


In Python, you can select rows from a Pandas DataFrame based on column values using boolean indexing. Here’s an example:

import pandas as pd

# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 32, 18, 47],
        'Country': ['USA', 'Canada', 'Australia', 'USA']}
df = pd.DataFrame(data)

# select rows where the Country column is 'USA'
usa_df = df[df['Country'] == 'USA']



In this example, we first create a Pandas DataFrame called df. Then, we use boolean indexing to select rows where the value in the ‘Country’ column is ‘USA’. Specifically, we use the expression df['Country'] == 'USA' to create a boolean mask, which is a Series of True and False values indicating which rows satisfy the condition. We then pass this boolean mask to the DataFrame using square brackets ([]), which selects only the rows where the mask is True.
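
To make the mask visible, here is a minimal sketch that stores it in a variable before using it. The variable name mask is only illustrative, and the df.loc line is an equivalent form added here for comparison, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 32, 18, 47],
    "Country": ["USA", "Canada", "Australia", "USA"],
})

# The comparison yields a boolean Series aligned with df's index.
mask = df["Country"] == "USA"
print(mask.tolist())  # [True, False, False, True]

# Indexing with the mask keeps only the rows where it is True.
usa_df = df[mask]
print(usa_df["Name"].tolist())  # ['Alice', 'David']

# df.loc[mask, columns] selects the same rows and restricts the columns too.
print(df.loc[mask, ["Name", "Age"]])
```

Note that the selected rows keep their original index labels (0 and 3 here); call reset_index(drop=True) on the result if you want them renumbered.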

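You can also use other comparison operators (such as >, <, >=, <= and !=) and combine multiple conditions with the element-wise operators & (and) and | (or). Wrap each condition in parentheses, because & and | bind more tightly than the comparison operators, and do not substitute the Python keywords and / or, which raise an error when applied to a Series. For example: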


# select rows where the Age column is greater than 30 and the Country column is 'USA' or 'Canada'
selected_df = df[(df['Age'] > 30) & ((df['Country'] == 'USA') | (df['Country'] == 'Canada'))]



In this example, we select rows where the ‘Age’ column is greater than 30 and the ‘Country’ column is either ‘USA’ or ‘Canada’. We use parentheses to group the two ‘Country’ conditions together, and use the & operator to combine the two conditions into a single boolean mask.
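
When one column is compared against several values, the chain of | conditions can be written more compactly with Series.isin. The sketch below is an alternative, not the method used above. It rebuilds the same DataFrame so that it runs on its own, and the names explicit, compact and not_usa are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 32, 18, 47],
    "Country": ["USA", "Canada", "Australia", "USA"],
})

# Explicit form: each condition sits in its own parentheses.
explicit = df[(df["Age"] > 30) & ((df["Country"] == "USA") | (df["Country"] == "Canada"))]

# Compact form: isin() tests membership in a list of values.
compact = df[(df["Age"] > 30) & df["Country"].isin(["USA", "Canada"])]
print(compact["Name"].tolist())  # ['Bob', 'David']

# Both forms select exactly the same rows.
print(explicit.equals(compact))  # True

# The ~ operator negates a mask: rows whose Country is not 'USA'.
not_usa = df[~(df["Country"] == "USA")]
print(not_usa["Name"].tolist())  # ['Bob', 'Charlie']
```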
