Enhanced Lookup — Excel and PyXLL June 2023
I built two lookup functions suitable for engineering.
Occasionally in engineering, I have a list of possible answers — say a group of beams that are sufficient — and I want to find the minimum cost or minimum weight beam from that group. These functions allow finding the first 'passing' result.
Given an input value and a group of Excel ranges, from the list, I want to find the first 'passing' result, either:
- the next value greater than or equal to the input
- the next value less than than or equal to the input
pos_QQ is an optional parameter; one can select either the indexed value within the list or the index itself.
@xl_func("var inp_list, var inp_value, bool pos_QQ: var") def vlookup_LTE_for_xl(inp_list, inp_value, pos_QQ=False): """mimics Vlookup from MS Excel inp_list needn't be sorted pos_QQ = True, function returns the INDEX NUMBER of the greatest value less than or equal to the inp_value pos_QQ = False, function returns the greatest value less than or equal to the inp_value""" sorted_inp_list = sorted([e[0] for e in inp_list]) n = bisect.bisect(sorted_inp_list, inp_value) - 1 # return False if the n index is out of range if n<0 or n>(len(inp_list)-1): return False if pos_QQ: return n else: return sorted_inp_list[n] @xl_func("var inp_list, var inp_value, bool pos_QQ: var") def vlookup_GTE_for_xl(inp_list, inp_value, pos_QQ=False): """mimics Vlookup from MS Excel inp_list needn't be sorted pos_QQ = True, function returns the INDEX NUMBER of the least value greater than or equal to the inp_value pos_QQ = False, function returns the least value greater than or equal to the inp_value""" print("INP LIST: ", inp_list) # sorted_inp_list = sorted([e[0] for e in inp_list]) sorted_inp_list = sorted([e[0] for e in inp_list]) n = bisect.bisect_left(sorted_inp_list, inp_value) # return False if the n index is out of range if n<0 or n>(len(inp_list)-1): return False if pos_QQ: return n else: return sorted_inp_list[n]
Note that I use a helper function, union_for_xl(), to transform any group of Excel ranges to a flattened list of values. The helper function is available for manual input but hidden to deter use by an inexperienced user. The Excel ranges in the group may be rows, columns, single cells, or any combination.
This helper function is far more interesting than the lookups — put simply, values from a group of Excel ranges are concatenated into a 1‑D 'streamed' list of values (therefore, unsorted).
@xl_func("var *[][]=None: var", hidden=True) def union_for_xl(*inp_ranges): """accepts a parameter array of XL ranges returns a flattened list of values from the input""" inp_list = list(inp_ranges) # build a list of lists from range(s) input list_of_lists = [] for e in inp_list: if isinstance(e, list): list_of_lists.extend(e) else: list_of_lists.append([e]) # flatten the list_of_lists flattened_list = [] for e in itertools.chain.from_iterable(list_of_lists): flattened_list.append(e) return flattened_listFeatures of the Helper Function
Notice the PyXll signature: @xl_func("var *[][]=None: var", hidden=True)
The function expects an input of zero or more Excel ranges … "var *[][]=None" … :
- The [][] double brackets indicate a single range input or several range input.
- The * preceding the double brackets indicates an optional input; i.e. the parameter can be omitted entirely.
- The =None after the double brackets indicates a default input of None
- After experimenting, I say:
- Empty cells are taken as zero values
- Range values are 'streamed' in row order
- If a user prefers that empty cells are ignored, then the following signature and code is satisfactory:
@xl_func("var *numpy_array<float_nan>: var", hidden=False) def union_for_xl_no_empty_cells(*inp_ranges): """accepts a parameter array of XL ranges returns a flattened list of values from the input""" inp_list = list(inp_ranges) # build a list of list from range(s) input list_of_lists = [] for e in inp_list: if isinstance(e, list): list_of_lists.extend(e) else: list_of_lists.append([e]) # flatten the list_of_lists flattened_list = [] for e in itertools.chain.from_iterable(list_of_lists): if e != None: flattened_list.append(e) return flattened_list
According to PyXLL, if the signature *numpy_array<float_nan> is used, then empty input cells are represented as None. See the PyXLL User Guide, search for 'float_nan'.
As mentioned in other posts, I prefer using PyXLL because:
- it allows the function to reside in a central location outside of the Excel spreadsheet
- avoids XLAM files
- avoids broken Excel links
- clean — no clumsy Excel data sections/tabs
- it is easily programmed and maintained using Python
- it is easily customized
- multiple versions of the function can be produced as needed (to address building code changes, for instance)