In a certain data set I have many cases where a value can be either a list or a single value of the same type (for context, the data comes from an Elasticsearch database). For instance (not valid JSON, just to illustrate the idea):

    var_of_data_type_x = { item_a: { data_structure_a } }

or

    var_of_data_type_x = { item_a: [ { data_structure_a }, { data_structure_a }, { data_structure_a } ] }

To make matters worse, the fields of data_structure_a can be similar (a single value or a list), down to the scalar / list-of-scalars level, possibly nested 2-3 levels deep.
So all my processing code needs to check whether an item is a list or a single value and unwrap the list if necessary, in the style shown below. This means a lot of code duplication, unless I create a lot of tiny functions (each piece of processing code is around 5-10 lines in most cases). Even if I move the common code into functions, the pattern shown below gets repeated, sometimes nested 2-3 levels deep.

    # list-checking-code
    if isinstance(var, list):
        for x in var:
            # item-wise processing code for (x)
            ...
    else:
        # exactly the same code as above, for (var)
        ...

I know, this is a nightmare design, I'd rather the data structures be consistent, but this is my input. I could write some simple preprocessing to make it consistent, to make all singular instances wrapped in lists. That would create a lot of single-element lists though, as in many cases the values are singular.
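The "wrap singular values in lists" idea does not have to be a separate preprocessing pass; it can also be done lazily at the point of use. Below is a minimal sketch of that, where `as_list` and `process_items` are hypothetical names and the `"name"` field is made up for illustration. It assumes the multi-valued case is always a Python `list` (not a tuple or other sequence) and does not treat `None` specially.

```python
def as_list(value):
    """Return value unchanged if it is already a list, else wrap it in a one-element list."""
    if isinstance(value, list):
        return value
    return [value]


def process_items(var):
    # Hypothetical item-wise processing: collect the "name" field of each item.
    # The loop body is written once and works for both shapes of input.
    return [x["name"] for x in as_list(var)]


single = {"name": "a"}
many = [{"name": "a"}, {"name": "b"}]

print(process_items(single))  # ['a']
print(process_items(many))    # ['a', 'b']
```

The one-element list is only created when the value is actually accessed, and existing lists are returned as-is rather than copied.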
What would be the best approach for tackling this? So far, all approaches I see have their own problems:
- creating double code (as above) for list vs singular cases: probably the most efficient, but readability hell as this happens a lot, especially nested! This is my preferred method for efficiency reasons although it's a code/maintain nightmare.
- preprocess data and wrap each singular item in a list: not sure how efficient creating a lot of single-element lists is. Also, most such items in data will be accessed only once.
- write a lot of functions for item-level processing, which will save some code complexity but add a lot of 5-10 line functions.
- do (3) above, and additionally move the #list-checking-code pattern above into another function, which takes the function from (3) as an argument.
- write functions that accept var-args, and pass all arguments as unpacked lists. This would eliminate the isinstance() check and the if-then-else, but I'm not sure whether unpacking has its own overhead. (The lists in question typically have very few elements.)
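To make the last two options concrete, here is a rough sketch of each. All names (`for_each`, `total_length`, `call_unpacked`) are hypothetical and chosen only for illustration. One thing the sketch suggests: with var-args the type check does not really disappear, because the caller still has to decide whether to write `func(*var)` or `func(var)`. Unpacking a dict or a string with `*` would spread its keys or characters instead of passing it as one item.

```python
def for_each(var, func):
    """Higher-order option: apply func to var itself, or to each element if var is a list."""
    if isinstance(var, list):
        return [func(x) for x in var]
    return [func(var)]


def total_length(*items):
    """Var-args option: every item arrives as a separate positional argument."""
    return sum(len(item) for item in items)


def call_unpacked(func, var):
    # The isinstance check has moved to the call site rather than gone away.
    if isinstance(var, list):
        return func(*var)
    return func(var)


print(for_each("ab", str.upper))                   # ['AB']
print(for_each(["a", "b"], str.upper))             # ['A', 'B']
print(call_unpacked(total_length, "abc"))          # 3
print(call_unpacked(total_length, ["ab", "c"]))    # 3
```

With `for_each`, the list check lives in exactly one place and each item-level function stays free of it, at the cost of one extra function call per item.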
What would be the best approach here, or is there a better, more Pythonic way? Performance and efficiency are concerns.
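Since efficiency is the deciding factor, one way to settle it is to measure on representative data rather than guess. The following is a small sketch using the standard-library `timeit` module; the two `process_*` functions and the `x + 1` workload are placeholders assumed for illustration, and real results will depend on the actual processing code and Python version.

```python
import timeit


def process_branching(var):
    # Duplicated-code style: separate paths for list and single value.
    if isinstance(var, list):
        return [x + 1 for x in var]
    return [var + 1]


def process_wrapped(var):
    # Normalizing style: wrap a single value, then use one code path.
    items = var if isinstance(var, list) else [var]
    return [x + 1 for x in items]


def time_both(var, number=100_000):
    """Return (branching_seconds, wrapped_seconds) for `number` calls each."""
    t_branch = timeit.timeit(lambda: process_branching(var), number=number)
    t_wrap = timeit.timeit(lambda: process_wrapped(var), number=number)
    return t_branch, t_wrap


if __name__ == "__main__":
    for sample in (5, [1, 2, 3]):
        t_branch, t_wrap = time_both(sample)
        print(f"{sample!r}: branching={t_branch:.4f}s wrapped={t_wrap:.4f}s")
```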
https://stackoverflow.com/questions/65835485/is-there-an-efficient-way-in-python-to-treat-a-single-variable-same-as-a-list-wi January 22, 2021 at 04:48AM