How do you handle duplicate data in a DataFrame?

Question

1 Answer

kvdevika · Answer 1 · 2023-09-04T02:03:56+0000

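You can handle duplicate data with two pandas methods: duplicated() flags rows that repeat an earlier row, and drop_duplicates() removes them. By default both compare all columns and keep the first occurrence; pass subset to compare only selected columns.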

Example Code:

import pandas as pd

data = {'A': [1, 2, 2, 3, 4],
        'B': ['X', 'Y', 'Y', 'Z', 'X']}
df = pd.DataFrame(data)

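# Remove rows that are exact duplicates across all columns (the first occurrence is kept).
# Here the second (2, 'Y') row is dropped; inplace=True modifies df directly.
df.drop_duplicates(inplace=True)

# Identify rows whose value in column 'B' repeats an earlier row.
# This runs on the already de-duplicated df, so it returns only the row (4, 'X').
duplicates = df[df.duplicated(subset=['B'])]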
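If you need more control over which copy survives, the keep parameter may help. The following is a minimal sketch that reuses the same sample data; the variable names (n_dupes, all_copies, deduped_last, deduped) are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 4],
                   'B': ['X', 'Y', 'Y', 'Z', 'X']})

# Count fully duplicated rows before removing anything.
n_dupes = int(df.duplicated().sum())

# keep=False marks every member of a duplicate group, not just the repeats.
all_copies = df[df.duplicated(keep=False)]

# keep='last' retains the final occurrence for each value of 'B'.
# Without inplace=True, a new DataFrame is returned and df is unchanged.
deduped_last = df.drop_duplicates(subset=['B'], keep='last')

# ignore_index=True renumbers the resulting rows from 0.
deduped = df.drop_duplicates(ignore_index=True)
```

Returning a new DataFrame instead of using inplace=True leaves the original data available, which can be handy for checking how many rows were removed.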
