I am working with a data set and trying to remove an outlier row: an employment length of 123. I had already removed some other outliers, and that worked fine. I create a new data set that is a copy of the original, just without the outlier rows. The original data set is named credit.
Here is my code:
# Outlier cutoff for annual income (bigger than Q3 + 1.5*IQR)
outlier_annual_inc <- quantile(credit$person_income, 0.75) + 1.5*IQR(credit$person_income)
outliers <- which(credit$person_income > outlier_annual_inc)
hist(no_outlier_data$person_income, 50, xlab = "annual income", main = "Person Income")

# Outlier cutoff for age
outlier_age <- which(credit$person_age > 122)
outlier_emp_length <- which(credit$person_emp_length > 100)
outlier_emp_length
outlier_age

clean_data <- (credit[-outlier_age, ])
clean_data <- clean_data[-outliers, ]
clean_data <- clean_data[-outlier_emp_length, ]
When I view clean_data, the age outliers and the income outliers have been removed, but the employment length outlier (outlier_emp_length) has not, and I am wondering why.
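One possible cause, offered as a guess rather than a confirmed diagnosis: all three index vectors are row positions in credit, but the second and third removals are applied to clean_data, whose rows have already shifted after the first removal. The positions then no longer point at the intended rows. A minimal sketch of one way around this is to combine the positions and drop them from credit in a single step. The toy credit data frame and the name drop_rows below are made up for illustration; only the column names come from the question.

```r
# Toy stand-in for `credit` (hypothetical values, not the real data)
credit <- data.frame(
  person_age        = c(25, 144, 30, 41, 22, 35),
  person_income     = c(50000, 52000, 48000, 51000, 6000000, 49000),
  person_emp_length = c(3, 4, 5, 2, 1, 123)
)

# Same cutoff as in the question: Q3 + 1.5 * IQR
outlier_annual_inc <- quantile(credit$person_income, 0.75) +
  1.5 * IQR(credit$person_income)

# All three vectors hold row positions in `credit`, not in `clean_data`
outliers           <- which(credit$person_income > outlier_annual_inc)
outlier_age        <- which(credit$person_age > 122)
outlier_emp_length <- which(credit$person_emp_length > 100)

# Combine the positions and remove them from `credit` once,
# so no position is applied to an already-shortened data frame
drop_rows <- unique(c(outliers, outlier_age, outlier_emp_length))

# Guard against an empty index: credit[-integer(0), ] would return zero rows
clean_data <- if (length(drop_rows) > 0) credit[-drop_rows, ] else credit
```

Filtering with logical conditions instead of positions would avoid the shifting problem as well, but any NA values in the columns would then need explicit handling, which which() does implicitly by skipping them.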
https://stackoverflow.com/questions/67014489/able-to-identify-outlier-in-r-but-when-i-attempt-to-remove-it-is-not-getting-del April 09, 2021 at 10:58AM