Skip to main content

Python

Boolean indexing using Pandas

Introduction

Here’s a quick reference for exploring data using Boolean indexing in Pandas. Remember that a boolean operation returns only True or False, and can only be performed against a series.

Using booleans in data is usually a two-step process: the first step is to evaluate the series and identify which cell is true and which is false relative to the given boolean. The second step is to filter the series based on the boolean result.

Operators

|Operator | Description |

| ----------- | ----------- |

|== a | True if a is True, else False|

|~a| True if a is False else False (Not a) |

|a > b| True if a is greater than b else False|

|a < b| True if a is less than b else False|

|a & b | True if both a and b are True, else False|

|a | b| True if either a or b is True. False if both are False|

Filtering a dataframe

Consider the following simple dataframe, which samples four people associated with the number of times the person has seen Star Wars:

Let's establish a boolean to evaluate which people have see the film fewer than 10 times.

`evaluation = df['number'] < 10

What this does is evaluate every entry in the column 'number', compare it with the boolean, and place the result in the series "evaluation." Now we have evaluation, a series that evaluates the truthiness of the boolean.

The second step is to use this true/false evaluation as a filter, creating a new series populated by values that are true.

`deprived_persons = df['evaluation']

Now we have a new series called deprivedpersons that contains two rows. If we like, we can filter just one column:

`deprived_persons = df['evaluation', "name"]

Selecting specific data

Select all rows whose country value is either Brazil or Venezuela

`brazil_venezuela = f500[(f500["country"] == "Brazil") | (f500["country"] == "Venezuela")]

Select the first five companies in the Technology sector for which the country is not the Germany

`outside_Germany = df[ ~(df["country"] == "Germany") & (df["sector"] == "Technology")].head(5)