Tuesday, May 4, 2021

np.where condition is not getting satisfied

In the following line of code, I get the error shown below.

d3["WOE"] = np.where(((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0)) ,np.nan ,np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))  

If the numerator or denominator is 0, the condition should be satisfied and d3["WOE"] should be np.nan for that row. Why is the following error being produced?

---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)
<ipython-input-56-a9b015683238> in <module>
----> 1 final_iv, IV = data_vars(df_leads_short,df_leads_short.close_flag)
      2 IV.sort_values('IV')

<ipython-input-55-5530ad13fa5a> in data_vars(df1, target)
    122                 count = count + 1
    123             else:
--> 124                 conv = char_bin(target, df1[i])
    125                 conv["VAR_NAME"] = i
    126                 count = count + 1

<ipython-input-55-5530ad13fa5a> in char_bin(Y, X)
     92     d3["DIST_EVENT"] = d3.EVENT/d3.sum().EVENT
     93     d3["DIST_NON_EVENT"] = d3.NONEVENT/d3.sum().NONEVENT
---> 94     d3["WOE"] = np.where(((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0)) ,np.nan ,np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))
     95     #d3["WOE"] = np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT)
     96     d3["IV"] = np.where((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0 ),np.nan ,(d3.DIST_EVENT-d3.DIST_NON_EVENT)*np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
   1934         self, ufunc: Callable, method: str, *inputs: Any, **kwargs: Any
   1935     ):
-> 1936         return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
   1937
   1938     # ideally we would define this to avoid the getattr checks, but

/opt/conda/lib/python3.7/site-packages/pandas/core/arraylike.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
    356         # ufunc(series, ...)
    357         inputs = tuple(extract_array(x, extract_numpy=True) for x in inputs)
--> 358         result = getattr(ufunc, method)(*inputs, **kwargs)
    359     else:
    360         # ufunc(dataframe)

FloatingPointError: divide by zero encountered in log
https://stackoverflow.com/questions/67393952/np-where-condition-is-not-getting-satisfied May 05, 2021 at 09:06AM
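
A likely explanation, offered as a sketch rather than a confirmed diagnosis: np.where is an ordinary function, so Python evaluates all three of its arguments before the call. np.log therefore runs over the whole column, zero rows included, and only afterwards does np.where pick values by condition. NumPy normally reports log(0) as a RuntimeWarning, so a FloatingPointError suggests that np.seterr(divide="raise") or np.seterr(all="raise") is active somewhere in the session (an assumption; the question does not show it). The small frame below is a hypothetical stand-in for d3. It reproduces the eager evaluation and then shows one possible workaround: compute the log only on the rows that pass the mask.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for d3; the real values are not shown in the question.
d3 = pd.DataFrame({
    "DIST_EVENT": [0.0, 0.4, 0.6],
    "DIST_NON_EVENT": [0.5, 0.25, 0.25],
})

zero_mask = (d3["DIST_EVENT"] == 0) | (d3["DIST_NON_EVENT"] == 0)

# 1) Reproduce the problem: np.log is evaluated on every row *before*
#    np.where gets a chance to apply the condition.
#    divide="raise" mimics the error mode assumed to be set in the session.
raised = False
try:
    with np.errstate(divide="raise"):
        np.where(zero_mask, np.nan, np.log(d3["DIST_EVENT"] / d3["DIST_NON_EVENT"]))
except FloatingPointError:
    raised = True

# 2) One possible workaround: start from NaN and fill in only the valid rows,
#    so np.log never sees a zero.
valid = ~zero_mask
d3["WOE"] = np.nan
with np.errstate(divide="raise"):
    ratio = d3.loc[valid, "DIST_EVENT"] / d3.loc[valid, "DIST_NON_EVENT"]
    d3.loc[valid, "WOE"] = np.log(ratio)
```

The same masking would apply to the IV line, which also calls np.log on the full column. Wrapping the original np.where call in np.errstate(divide="ignore") is another option if silently discarding the intermediate -inf values is acceptable.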
