2020年12月19日星期六

Bitwise binary operator for BYTES requires equal length of the inputs. Got 16 bytes on the left hand side and 4 bytes on the right hand side

I'm trying to match users to their geo locations using their IPs.

IPs is a mix of ipv4, ipv6 and some rows with invalid entries.

I used instructions from this article by Felipe Hoffa https://towardsdatascience.com/geolocation-with-bigquery-de-identify-76-million-ip-addresses-in-20-seconds-e9e652480bd2 to match my IPs with the IPs in the blocks file.

The problem is I get an error when trying to use the NET.SAFE_IP_FROM_STRING(ip_address) & NET.IP_NET_MASK(4,mask) functions. The error is:

Bitwise binary operator for BYTES requires equal length of the inputs. Got 16 bytes on the left hand side and 4 bytes on the right hand side.

So I tried to ensure IPs are valid before passing them to the function using regex. This works for ipv4. I've now figured out regex to match ipv6 as well, and from my checks, it seems accurate. However I still get the error. I can't figure out why and how to fix my query to get the correct result.

See whole query below:

(      SELECT        *,        NET.SAFE_IP_FROM_STRING(ip_address) & NET.IP_NET_MASK(4,          mask) network_bin      FROM (        SELECT          * EXCEPT (is_valid)        FROM (          SELECT            *,            CASE              WHEN (REGEXP_CONTAINS(ip_address, r'\A[a-f0-9]{1,4}(:[a-f0-9]{1,4}){7}\z') OR (NOT REGEXP_CONTAINS(ip_address, r'\A(.*?[a-f0-9](:|\z)){8}') AND REGEXP_CONTAINS(ip_address, r'\A([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,6})?::([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,6})?\z')) OR REGEXP_CONTAINS(ip_address, r'\A[a-f0-9]{1,4}(:[a-f0-9]{1,4}){5}:(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}\z') OR (NOT REGEXP_CONTAINS(ip_address, r'\A(.*?[a-f0-9]:){6}') AND REGEXP_CONTAINS(ip_address, r'\A([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,4})?::([a-f0-9]{1,4}:){0,5}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}\z')) ) THEN TRUE              WHEN REGEXP_CONTAINS(ip_address, r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$") THEN TRUE            ELSE            FALSE          END            AS is_valid          FROM (            SELECT              user,              ip_address,              date            FROM               `project.dataset.table`)        WHERE          is_valid IS TRUE),        UNNEST(GENERATE_ARRAY(9,32)) mask)  
https://stackoverflow.com/questions/65377005/bitwise-binary-operator-for-bytes-requires-equal-length-of-the-inputs-got-16-by December 20, 2020 at 12:48PM

没有评论:

发表评论