Thursday, January 21, 2021

Is there an efficient way in python to treat a single variable same as a list without explicit wrapping?

In a certain set of data, I have many cases where a value can be either a list or a single value of the same type (for context, the data comes from an ElasticSearch DB). For instance (not valid JSON, just to illustrate the idea):

    var_of_data_type_x = {
        item_a: { data_structure_a }
    }

or

    var_of_data_type_x = {
        item_a: [
            { data_structure_a },
            { data_structure_a },
            { data_structure_a }
        ]
    }

To make matters worse, the fields of data_structure_a can vary in the same way, down to the scalar / list-of-scalars level, possibly nested 2-3 levels deep.

So all my processing code needs to check whether an item is a list or a single value and unwrap the list if necessary, in the style shown below. This means a lot of code duplication, unless I create a great many tiny functions (each piece of processing code is around 5-10 lines in most cases). Even if I moved the common code into functions, the pattern shown below gets repeated, sometimes nested 2-3 levels deep.

    # list-checking-code
    if isinstance(var, list):
        for x in var:
            ...  # item-wise processing code for (x)
    else:
        ...  # exactly the same code as above, for (var)

I know, this is a nightmare design, I'd rather the data structures be consistent, but this is my input. I could write some simple preprocessing to make it consistent, to make all singular instances wrapped in lists. That would create a lot of single-element lists though, as in many cases the values are singular.
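As a rough sketch of what such a preprocessing pass might look like (the names `as_list` and `normalize` are made up for illustration, and it assumes the data is plain dicts, lists and scalars, with no lists nested directly inside lists):

```python
def as_list(value):
    """Return value unchanged if it is already a list, else wrap it in a one-element list."""
    return value if isinstance(value, list) else [value]


def normalize(node):
    """Recursively wrap every dict value in a list, so that consumers can always iterate.

    Hypothetical helper: scalars are returned as-is, dict values become lists.
    """
    if isinstance(node, dict):
        return {
            key: [normalize(item) for item in as_list(val)]
            for key, val in node.items()
        }
    return node


single = normalize({"item_a": {"x": 1}})
many = normalize({"item_a": [{"x": 1}, {"x": [2, 3]}]})
```

After this pass both `single["item_a"]` and `many["item_a"]` are lists, so the processing code only needs the `for` branch. Whether the extra single-element lists cost anything noticeable would have to be measured on the real data.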

What would be the best approach for tackling this? So far, all approaches I see have their own problems:

  1. duplicating the code (as above) for the list vs. singular cases: probably the most efficient, but a readability hell since this happens a lot, especially when nested! This is my preferred method for efficiency reasons, although it's a code/maintenance nightmare.
  2. preprocess data and wrap each singular item in a list: not sure how efficient creating a lot of single-element lists is. Also, most such items in data will be accessed only once.
  3. write a lot of functions for item-level processing, which will save some code complexity, but add a lot of 5-10 line functions.
  4. do (3) above, and additionally move the #list-checking-code pattern above into another function, which takes the function from (3) as an argument.
  5. write functions that accept var-args, and pass all arguments as unwrapped lists. This would eliminate the isinstance() check and the if-then-else, but I'm not sure whether unwrapping has its own overhead. (The lists in question typically have very few elements.)
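To make approaches (4) and (5) concrete, here is a minimal sketch. The names `iter_items`, `for_each` and `total` are hypothetical, not from any library. The list check lives in one small generator, so no wrapper list is built for single values:

```python
def iter_items(value):
    """Yield each element if value is a list, otherwise yield value itself."""
    if isinstance(value, list):
        yield from value
    else:
        yield value


def for_each(value, func):
    """Approach (4): apply func to a single value or to every element of a list."""
    return [func(item) for item in iter_items(value)]


def total(*numbers):
    """Approach (5): a var-args function; the caller unpacks with *."""
    return sum(numbers)


# The same call works for the single and the list shape.
names_single = for_each({"name": "a"}, lambda d: d["name"])
names_many = for_each([{"name": "a"}, {"name": "b"}], lambda d: d["name"])

# Nested levels become nested loops over iter_items instead of nested if/else.
doc = {"item_a": [{"tags": "x"}, {"tags": ["y", "z"]}]}
tags = [tag for item in iter_items(doc["item_a"]) for tag in iter_items(item["tags"])]

# Var-args still needs something like iter_items at the call site to unpack safely.
sum_single = total(*iter_items(5))
sum_many = total(*iter_items([1, 2, 3]))
```

Note that the `isinstance()` check does not disappear with var-args, it only moves to the call site. How this compares in speed with the duplicated if/else is something to measure (for example with `timeit`) rather than assume.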

What could be the best approach here, or is there a better more pythonic way? Performance and efficiency are concerns.

https://stackoverflow.com/questions/65835485/is-there-an-efficient-way-in-python-to-treat-a-single-variable-same-as-a-list-wi January 22, 2021 at 04:48AM
