Adding a new column in Pandas dataframe based on existing columns

A pandas dataframe lets you add a new column whose values are computed from its existing columns.

Example Scenario

Suppose a dataset contains the results of some football matches, as shown below:

Date	Venue	Opponent	GF	GA
08-27-2017	H	Arsenal FC	4	0
09-16-2017	H	Burnley FC	1	1

Here, GF stands for Goals For and GA stands for Goals Against (i.e., number of goals conceded by a team). Now we want to add a new column called Result to this dataframe that will contain the result of the match.

The condition to generate the Result column is:

If GF > GA, Result should contain W
If GF < GA, Result should contain L
Otherwise Result should contain D.
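The three conditions above can also be written down directly as a vectorized expression. The following is only a minimal sketch of that alternative, using numpy.select rather than the apply method this post is about. The dataframe is built in memory, and the third row ("Example FC") is a made-up match added purely to exercise the L branch.

```python
import numpy as np
import pandas as pd

# The first two rows come from the example table; the third is hypothetical.
df = pd.DataFrame({
    'Opponent': ['Arsenal FC', 'Burnley FC', 'Example FC'],
    'GF': [4, 1, 0],
    'GA': [0, 1, 2],
})

# Conditions are checked in order; rows matching none of them get the default.
conditions = [df['GF'] > df['GA'], df['GF'] < df['GA']]
choices = ['W', 'L']
df['Result'] = np.select(conditions, choices, default='D')

print(df)
```

Here the comparison runs on whole columns at once instead of one row at a time, which tends to matter only for large datasets.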

Solution using Dataframe apply method

We can use the pandas.DataFrame.apply method to add the new Result column.

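import pandas as pd


def get_result(row):
    # Compare goals scored (GF) with goals conceded (GA) for one match.
    if row['GF'] == row['GA']:
        return 'D'  # draw
    elif row['GF'] > row['GA']:
        return 'W'  # win
    return 'L'  # loss


# Load the match results from the CSV file.
data = pd.read_csv("scores.csv")
# axis=1 calls get_result once per row.
data['Result'] = data.apply(get_result, axis=1)
print(data)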

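Output:

The printed dataframe has the five original columns plus Result, which is W for the Arsenal FC match (4 > 0) and D for the Burnley FC match (1 = 1).

The input file scores.csv: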

Date,Venue,Opponent,GF,GA
08-27-2017,H,Arsenal FC,4,0
09-16-2017,H,Burnley FC,1,1
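To try the solution without creating a file on disk, the same CSV text can be read from a string. This is a minimal sketch, assuming the two rows above are the whole file. The CSV_TEXT name is introduced here for illustration and is not part of the original code.

```python
import io

import pandas as pd

# Same content as scores.csv, held in a string (illustrative name).
CSV_TEXT = """Date,Venue,Opponent,GF,GA
08-27-2017,H,Arsenal FC,4,0
09-16-2017,H,Burnley FC,1,1
"""


def get_result(row):
    # Same rule as in the solution: draw, win, otherwise loss.
    if row['GF'] == row['GA']:
        return 'D'
    elif row['GF'] > row['GA']:
        return 'W'
    return 'L'


# read_csv accepts a file-like object as well as a path.
data = pd.read_csv(io.StringIO(CSV_TEXT))
data['Result'] = data.apply(get_result, axis=1)
print(data)
```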

Explanation

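The apply method calls the get_result function once for every row of the existing dataframe, and the value returned for each row is stored in the newly created Result column. Passing axis=1 is what makes apply work row by row: each call receives one row, so row['GF'] and row['GA'] refer to a single match. With the default axis=0, the function would receive each column instead.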
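The effect of the axis argument can be seen on a small dataframe. The sketch below uses only the GF and GA values from the example and asks apply to report the name of whatever it hands to the function. The variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'GF': [4, 1], 'GA': [0, 1]})

# axis=1: the function receives one row at a time, as a Series whose
# name is the row's index label (0, then 1).
per_row = df.apply(lambda s: s.name, axis=1)

# axis=0 (the default): the function receives one column at a time, as a
# Series whose name is the column label ('GF', then 'GA').
per_col = df.apply(lambda s: s.name, axis=0)

# With axis=1, column labels can be used to pick values out of each row.
goal_diff = df.apply(lambda row: row['GF'] - row['GA'], axis=1)

print(per_row.tolist())
print(per_col.tolist())
print(goal_diff.tolist())
```

Because get_result looks values up by column label, it only makes sense with axis=1, where those labels index into a single row.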

Reference

Pandas official documentation on pandas.DataFrame.apply

By Ahmedur Rahman Shovon
