in Artificial Intelligence (AI) by (159k points)
How do you handle duplicate data in a DataFrame?


1 Answer


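You can handle duplicate rows in a pandas DataFrame with two methods: duplicated() returns a boolean Series that marks each row repeating an earlier one, so you can inspect the duplicates, and drop_duplicates() removes those rows. By default both compare all columns and treat the first occurrence as the original; pass subset to compare only certain columns.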

Example Code:

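import pandas as pd

data = {'A': [1, 2, 2, 3, 4],
        'B': ['X', 'Y', 'Y', 'Z', 'X']}
df = pd.DataFrame(data)

# Remove fully duplicated rows, keeping the first occurrence of each.
# Reassigning the result is generally preferred over inplace=True.
df = df.drop_duplicates()

# Identify rows in the de-duplicated frame that repeat an earlier value in column 'B'.
# Here that is the row (4, 'X'), because 'X' already appears in the first row.
duplicates = df[df.duplicated(subset=['B'])]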
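
Both methods also accept a keep argument that controls which occurrence counts as the original. The sketch below is one possible extension of the example above, not the only way to do it; the variable names (n_duplicates, all_repeated_b, last_per_b, deduped) are made up for illustration, and ignore_index assumes pandas 1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 4],
                   'B': ['X', 'Y', 'Y', 'Z', 'X']})

# Count fully duplicated rows before removing anything
n_duplicates = int(df.duplicated().sum())

# keep=False flags every member of a duplicate group, not just the repeats
all_repeated_b = df[df.duplicated(subset=['B'], keep=False)]

# Keep the last occurrence of each value in 'B' instead of the first
last_per_b = df.drop_duplicates(subset=['B'], keep='last')

# Drop fully duplicated rows and renumber the index from 0
deduped = df.drop_duplicates(ignore_index=True)

print(n_duplicates)
print(all_repeated_b)
print(last_per_b)
print(deduped)
```

Counting the duplicates first is a cheap way to check how much data a later drop_duplicates() call will remove.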
