Outliers can significantly affect percentile calculations, especially for extreme percentiles and small datasets. One approach is to trim the dataset (remove the most extreme values) before calculating percentiles. Another is to rely on robust estimators such as the median (the 50th percentile), which is far less sensitive to outliers than the mean.
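To make the two approaches concrete, here is a minimal sketch. The dataset is hypothetical (one extreme value, 1000, was added for illustration), and the 1.5 * IQR rule used for trimming is just one common convention, assumed here rather than prescribed by the text above.

```python
import numpy as np

# Hypothetical sample: nine ordinary values plus one extreme value (1000)
data = np.array([12, 15, 18, 20, 22, 25, 28, 30, 35, 1000])

# Robust estimator: the mean is pulled toward the outlier, the median is not
mean_value = np.mean(data)
median_value = np.median(data)

# Trimming: drop values outside 1.5 * IQR of the quartiles (assumed rule)
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
trimmed = data[(data >= lower) & (data <= upper)]

# Compare a high percentile before and after trimming
p90_raw = np.percentile(data, 90)
p90_trimmed = np.percentile(trimmed, 90)

print("Mean:", mean_value, "Median:", median_value)
print("90th percentile (raw):", p90_raw)
print("90th percentile (trimmed):", p90_trimmed)
```

In this sample the mean is 120.5 while the median is 23.5, and the 90th percentile drops from 131.5 to 31.0 once the extreme value is trimmed. Whether trimming is appropriate depends on whether the extreme values are errors or genuine observations.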
Example Code: Calculating Percentiles in Python
import numpy as np
# Sample dataset
data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40]
# Calculate the 25th and 75th percentiles (first and third quartiles)
q1 = np.percentile(data, 25)
q3 = np.percentile(data, 75)
print("Q1 (25th percentile):", q1)
print("Q3 (75th percentile):", q3)
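In this example, NumPy's percentile function is used to calculate the desired percentiles. By default it interpolates linearly between the two nearest sorted values, so for this dataset Q1 is 18.5 and Q3 is 29.5.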
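It can help to see what the default calculation does under the hood. The sketch below reproduces linear interpolation by hand and checks it against NumPy. The helper name percentile_linear is hypothetical, introduced only for illustration, and it assumes a non-empty list and a percentile between 0 and 100.

```python
import math
import numpy as np

def percentile_linear(values, p):
    """Hypothetical helper: p-th percentile via linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted data
    rank = (p / 100) * (len(ordered) - 1)
    lo = math.floor(rank)
    hi = math.ceil(rank)
    frac = rank - lo
    # Interpolate between the two neighboring values
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])

data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40]
manual_q1 = percentile_linear(data, 25)
manual_q3 = percentile_linear(data, 75)

print("Manual Q1:", manual_q1, "NumPy Q1:", np.percentile(data, 25))
print("Manual Q3:", manual_q3, "NumPy Q3:", np.percentile(data, 75))
```

For the 25th percentile the rank is 0.25 * 9 = 2.25, so the result sits a quarter of the way between the values at sorted positions 2 and 3 (18 and 20), giving 18.5.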
Practicing with real datasets and experimenting with different percentiles and statistical measures will help you solidify these concepts for data science interviews.