Wednesday, March 17, 2021

Split column into rows for every 6 chars in BigQuery

I need to migrate a simple Spark statement to BigQuery as part of a GCP migration. A table contains two columns, and col2 needs to be split in BigQuery the same way the Spark statements below split it.

In Spark:

from pyspark.sql.functions import col, explode, split, trim

t1 = spark.sql("select col1, col2 from db.table")

# split after every 6 chars (\G anchors at the end of the previous match), then trim each chunk
df1 = t1.withColumn("col2", explode(split(col("col2"), "(?<=\\G......)"))).withColumn("col2", trim(col("col2")))

Input Table:

+----+--------------------------+
|col1|                      col2|
+----+--------------------------+
|  d1|  X11   F11   1000KG123456|
|  d2| X22   F22   3500Kabcdefgh|
+----+--------------------------+

Expected output:

+----+------+
|col1|  col2|
+----+------+
|  d1|   X11|
|  d1|   F11|
|  d1|1000KG|
|  d1|123456|
|  d2|   X22|
|  d2|   F22|
|  d2|3500Ka|
|  d2|bcdefg|
|  d2|     h|
+----+------+

In reality col2 holds a long value such as "X11   F11   1000KG123456........."; the data in col2 needs to be considered in 6-character chunks. I tried a few queries but they did not work out. Can anyone suggest the equivalent BigQuery query for that Spark statement? Please suggest a generic query, because the table has a lot of records.
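A minimal BigQuery Standard SQL sketch (the table name `project.dataset.table` is a placeholder): REGEXP_EXTRACT_ALL pulls out every consecutive run of up to 6 characters, and UNNEST expands the resulting array into one row per chunk, mirroring the Spark split/explode/trim chain.

-- REGEXP_EXTRACT_ALL returns an ARRAY<STRING> of non-overlapping matches;
-- .{1,6} greedily takes 6 chars at a time, leaving any shorter remainder as the last element.
-- UNNEST in the FROM clause cross joins each row with its array of chunks.
SELECT
  col1,
  TRIM(chunk) AS col2
FROM
  `project.dataset.table`,
  UNNEST(REGEXP_EXTRACT_ALL(col2, r'.{1,6}')) AS chunk;

Because the pattern allows 1 to 6 characters, a trailing partial chunk (like the lone "h" for d2) still becomes its own row; note that rows where col2 is NULL would be dropped by the implicit cross join, so use LEFT JOIN UNNEST(...) if those rows must be kept.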

Source: https://stackoverflow.com/questions/66649540/split-column-into-rows-for-every-6-chars-in-bigquery (March 16, 2021 at 01:25PM)
