Friday, February 5, 2021

Pass arrays from DataFrame into function with arrays grouped and flattened

I have a DataFrame with X position data for each participant and three grouping variables (obsScenario, startPos, targetPos). Each participant's X array is 1000 points long. Preview of the DataFrame:

              X    Z  participantNum  obsScenario  startPos  targetPos
    16000 -16.0 -5.0         6950203            2         2          3
    16001 -16.0 -5.0         6950203            2         2          3
    16002 -16.0 -5.0         6950203            2         2          3
    16003 -16.0 -5.0         6950203            2         2          3
    16004 -16.0 -5.0         6950203            2         2          3
    16005 -16.0 -5.0         6950203            2         2          3
    16006 -16.0 -5.0         6950203            2         2          3
    16007 -16.0 -5.0         6950203            2         2          3
    16008 -16.0 -5.0         6950203            2         2          3
    16009 -16.0 -5.0         6950203            2         2          3

I need to pass all of the X data into a function, with the X data grouped by the 3 grouping variables and with each X data array in its own column. Right now they are all stacked on top of each other.
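To make the target layout concrete, here is a minimal sketch of turning stacked rows into one column per participant. It uses toy values, and the `sample` column is a hypothetical helper that is not in the original data. It assumes each participant has exactly one trial per grouping combination and that rows are already in time order.

```python
import pandas as pd

keys = ["obsScenario", "startPos", "targetPos"]

# Toy long-format data: 2 participants, 1 condition, 4 samples each (made-up values).
df = pd.DataFrame({
    "X": [-16.0, -15.0, -14.0, -13.0, -16.0, -15.5, -15.0, -14.5],
    "participantNum": [1] * 4 + [2] * 4,
    "obsScenario": 2,
    "startPos": 2,
    "targetPos": 3,
})

# Hypothetical helper column: position of each row within its trial (0, 1, 2, ...).
df["sample"] = df.groupby(["participantNum"] + keys).cumcount()

# One row per (condition, sample) and one column per participant.
wide = df.set_index(keys + ["sample", "participantNum"])["X"].unstack("participantNum")
print(wide)
```

With the real data, each grouping combination would then be a block of 1000 rows with one column per participant, which is the 2-D shape the functions below expect.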

These are the functions (the data goes through calc_confidence_interval first):

    import numpy as np
    import scipy.stats


    def mean_confidence_interval(data, confidence=0.95):
        # Mean and upper/lower confidence bounds for a 1-D sequence of values
        a = 1.0 * np.array(data)
        n = len(a)
        m, se = np.mean(a), scipy.stats.sem(a)
        h = se * scipy.stats.t.ppf((1 + confidence) / 2., n - 1)
        return m, m + h, m - h


    def calc_confidence_interval(data):
        # Expects 2-D data; each row of data.T is passed on as one array
        mean_ci = []
        top_ci = []
        bottom_ci = []
        for column in data.T:
            m, t, b = mean_confidence_interval(column)
            mean_ci.append(m)
            top_ci.append(t)
            bottom_ci.append(b)
        return mean_ci, top_ci, bottom_ci

And I'm trying to make something like this work:

    calc_CI = df.groupby(['obsScenario', 'startPos', 'targetPos'])['X'].apply(calc_confidence_interval)
    calc_CI = calc_CI.join(calc_CI.rename('calc_CI'),
             on = ['obsScenario', 'startPos', 'targetPos'])

But I'm getting the error `TypeError: object of type 'numpy.float64' has no len()`. This happens because each group's X data is currently passed as a single 1-D array rather than as separate columns for each participant, so iterating over `data.T` yields individual floats instead of arrays.
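The failure can be reproduced in isolation. This is a small sketch with made-up values that stands in for what one group looks like when `groupby(...)['X'].apply` hands it to the function:

```python
import numpy as np
import pandas as pd

# One group's X values as a 1-D Series (made-up numbers).
x = pd.Series([-16.0, -15.5, -15.0], name="X")

# Transposing a Series changes nothing, so iterating yields scalars, not arrays.
first = next(iter(x.T))

# Same first step as mean_confidence_interval: the scalar becomes a numpy.float64.
a = 1.0 * np.array(first)

try:
    len(a)
except TypeError as err:
    message = str(err)
    print(message)
```

So the functions themselves are not the problem. The input needs to be 2-D before `calc_confidence_interval` iterates over it.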

## Traceback

```python
--------------------------------------------------------------------------
    calc_CI = allDataF.groupby(['obsScenario', 'startPos', 'targetPos'])['X'].apply(calc_confidence_interval)

  File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 226, in apply
    return super().apply(func, *args, **kwargs)

  File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 870, in apply
    return self._python_apply_general(f, self._selected_obj)

  File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 892, in _python_apply_general
    keys, values, mutated = self.grouper.apply(f, data, self.axis)

  File "/opt/anaconda3/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 213, in apply
    res = f(group)

  File "/Users/lillyrigoli/Desktop/PhD/PhD_Projects/RouteSelection/Analysis_RS/load_filter_plot_CI_RS.py", line 221, in calc_confidence_interval
    m, t,b=mean_confidence_interval(column)

  File "/Users/lillyrigoli/Desktop/PhD/PhD_Projects/RouteSelection/Analysis_RS/load_filter_plot_CI_RS.py", line 210, in mean_confidence_interval
    n = len(a)

TypeError: object of type 'numpy.float64' has no len()
```

The functions return the confidence intervals (top, middle & bottom) as lists.

The output I should get at the end is like this, with the output (mean_ci, top_ci, bottom_ci arrays) for each grouping combination.

    obsScenario  startPos  targetPos  mean_ci                 top_ci                  bottom_ci
    0            1         1          [array of length 1000]  [array of length 1000]  [array of length 1000]
    0            2         2          [array of length 1000]  [array of length 1000]  [array of length 1000]
    1            1         1          [array of length 1000]  [array of length 1000]  [array of length 1000]
    1            2         2          [array of length 1000]  [array of length 1000]  [array of length 1000]

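One possible way to arrive at a table of this shape is sketched below on toy data. It is not necessarily the intended solution. The helper `ci_arrays` is hypothetical, and it assumes every participant contributes one equal-length trial per grouping combination, with rows in time order. It loops over the groups explicitly rather than using `apply`, so it does not depend on how pandas combines array-valued results.

```python
import numpy as np
import pandas as pd
import scipy.stats

keys = ["obsScenario", "startPos", "targetPos"]


def ci_arrays(group, confidence=0.95):
    """Per-sample mean and CI bounds across participants for one group."""
    # Rows are participants, columns are samples (assumes equal-length trials).
    trials = np.vstack([g["X"].to_numpy() for _, g in group.groupby("participantNum")])
    n = trials.shape[0]
    mean = trials.mean(axis=0)
    half = scipy.stats.sem(trials, axis=0) * scipy.stats.t.ppf((1 + confidence) / 2.0, n - 1)
    return mean, mean + half, mean - half


# Toy data: 3 participants, 2 grouping combinations, 5 samples per trial.
rng = np.random.default_rng(0)
rows = []
for pid in (101, 102, 103):
    for start, target in ((1, 1), (2, 2)):
        for x in rng.normal(size=5):
            rows.append({"X": x, "participantNum": pid, "obsScenario": 0,
                         "startPos": start, "targetPos": target})
df = pd.DataFrame(rows)

# Build one output row per grouping combination, with arrays stored in the cells.
records = []
for key_values, group in df.groupby(keys):
    mean, top, bottom = ci_arrays(group)
    records.append(dict(zip(keys, key_values), mean_ci=mean, top_ci=top, bottom_ci=bottom))
result = pd.DataFrame(records)
print(result)
```

With the real data, each of the three array columns would hold arrays of length 1000 instead of 5.
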
https://stackoverflow.com/questions/66057266/pass-arrays-from-dataffame-into-function-with-arrays-grouped-and-flattened February 05, 2021 at 11:42AM
