I need to migrate a simple Spark statement to BigQuery as part of a GCP migration. A table contains two columns, and col2 needs to be split into rows in BigQuery to match what the Spark statement below does.
In Spark:

from pyspark.sql.functions import col, explode, split, trim

t1 = spark.sql("select col1, col2 from db.table")
# split col2 into 6-character chunks (one row per chunk), then trim the padding
df1 = t1.withColumn("col2", explode(split(col("col2"), "(?<=\\G......)"))).withColumn("col2", trim(col("col2")))
Input table:

+----+----------------------+
|col1|                  col2|
+----+----------------------+
|  d1|  X11 F11 1000KG123456|
|  d2| X22 F22 3500Kabcdefgh|
+----+----------------------+

Expected output:

+----+------+
|col1|  col2|
+----+------+
|  d1|   X11|
|  d1|   F11|
|  d1|1000KG|
|  d1|123456|
|  d2|   X22|
|  d2|   F22|
|  d2|3500Ka|
|  d2|bcdefg|
|  d2|     h|
+----+------+

In reality col2 holds a long value such as "X11 F11 1000KG123456...". The data in col2 has to be treated as fixed-width chunks of 6 characters each. I tried a few queries but they did not work out. Can anyone suggest the BigQuery equivalent of that Spark statement? Please suggest a generic query, because the table has a lot of records.
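For reference, here is a minimal sketch of one way this might be expressed in BigQuery Standard SQL, assuming the source is the same db.table from the Spark code (in BigQuery it would normally be a fully qualified project.dataset.table name) and using a hypothetical chunk alias: REGEXP_EXTRACT_ALL with the pattern r'.{1,6}' returns an array of up-to-6-character pieces, and UNNEST expands that array into one row per piece, which mirrors explode(split(...)); TRIM then strips the padding like the Spark trim.

-- sketch only: `db.table` and the `chunk` alias are placeholders
SELECT
  col1,
  TRIM(chunk) AS col2          -- strip the fixed-width padding
FROM
  db.table,
  UNNEST(REGEXP_EXTRACT_ALL(col2, r'.{1,6}')) AS chunk   -- one row per 6-char slice

If trailing padding can produce all-blank slices, a WHERE TRIM(chunk) != '' filter could be added to drop them.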
https://stackoverflow.com/questions/66649540/split-column-into-rows-for-every-6-chars-in-bigquery March 16, 2021 at 01:25PM