Thursday, April 8, 2021

Able to identify an outlier in R, but it is not deleted when I attempt to remove it

I am working with a data set and trying to remove an outlier row from it. The value is 123 for an employment length. There were some other outliers I had removed, and that worked fine. I created a new data set that is a copy of the previous one, just without the outliers. The original data set is named credit.

Here is my code:

    # Outlier cutoff for annual income (Bigger than Q3 + 1.5*IQR)
    outlier_annual_inc <- quantile(credit$person_income, 0.75) + 1.5*IQR(credit$person_income)
    outliers <- which(credit$person_income > outlier_annual_inc)
    hist(no_outlier_data$person_income, 50, xlab = "annual income", main = "Person Income")

    # Outlier cutoff for age
    outlier_age <- which(credit$person_age > 122)
    outlier_emp_length <- which(credit$person_emp_length > 100)
    outlier_emp_length
    outlier_age

    clean_data <- (credit[-outlier_age, ])
    clean_data <- clean_data[-outliers, ]
    clean_data <- clean_data[-outlier_emp_length, ]

When I view clean_data, the age outliers and the income outliers have been removed, but the employment length outlier (outlier_emp_length) has not, and I am wondering why.
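
One likely explanation, offered as a sketch and not a confirmed diagnosis: which() returns row positions in credit, but each removal shifts the positions of the rows that follow, so by the third step the stored position may point at a different row in clean_data. The toy data frame below is made up for illustration (only the column names come from the question), and the names income_cutoff, stale and is_outlier are hypothetical. It reproduces the symptom, then builds a single logical mask against the original data so that no index goes stale.

```r
# Toy stand-in for `credit`; the values are invented for illustration.
credit <- data.frame(
  person_age        = c(25, 144, 30, 41, 22, 35),
  person_income     = c(50000, 52000, 48000, 51000, 49000, 6000000),
  person_emp_length = c(3, 4, 5, 123, 2, 6)
)

# Positions are computed against the ORIGINAL data frame.
outlier_age        <- which(credit$person_age > 122)         # row 2
outlier_emp_length <- which(credit$person_emp_length > 100)  # row 4

# Sequential removal: after dropping row 2, the 123 sits at position 3,
# so removing position 4 deletes a different, valid row.
stale <- credit[-outlier_age, ]
stale <- stale[-outlier_emp_length, ]
123 %in% stale$person_emp_length  # the outlier is still there

# Alternative: one logical mask evaluated on the original data.
# `%in% TRUE` treats NA comparisons as "not an outlier".
income_cutoff <- quantile(credit$person_income, 0.75) +
  1.5 * IQR(credit$person_income)
is_outlier <- (credit$person_age > 122) %in% TRUE |
  (credit$person_income > income_cutoff) %in% TRUE |
  (credit$person_emp_length > 100) %in% TRUE

clean_data <- credit[!is_outlier, ]
```

A logical mask also avoids a separate pitfall of negative indexing: if which() finds nothing, credit[-integer(0), ] returns zero rows instead of the full data frame.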

https://stackoverflow.com/questions/67014489/able-to-identify-outlier-in-r-but-when-i-attempt-to-remove-it-is-not-getting-del April 09, 2021 at 10:58AM
