Monday, April 12, 2021

remove quotes from Spark headers and fields

I have a CSV file and am using the following code to load it:

val bank = spark.read.format("com.databricks.spark.csv").
  option("header", true).
  option("ignoreLeadingWhiteSpace", true).
  option("inferSchema", true).
  option("quote", "").
  option("delimiter", ";").
  load("bank_dataset.csv")

I am getting the following:

"age ""job"" ""marital"" ""income""
"58 ""tech"" ""married"" 58000

Oddly enough, the first column has only a single quote character at the beginning, while the remaining string columns are wrapped in doubled quotes. Numeric values have no quotes at all, except for age, which has a quote in front of it.

I need to process it so that it looks like this:

age job marital income
58 tech married 58000
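
One possible way to get there, sketched under assumptions: the object StripQuotes and its helpers unquote and stripQuotes are hypothetical names, not part of Spark, and the sketch assumes the stray characters are always double quotes that can simply be deleted. It renames every column with withColumnRenamed and cleans string columns with regexp_replace.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.StringType

// Hypothetical helper object; the names here are illustrative only.
object StripQuotes {
  // Delete every double-quote character from a string.
  def unquote(s: String): String = s.replace("\"", "")

  // Remove double quotes from all column names, then from the
  // values of every string-typed column. Assumes the cleaned
  // column names do not collide with each other.
  def stripQuotes(df: DataFrame): DataFrame = {
    val renamed = df.columns.foldLeft(df) { (acc, name) =>
      acc.withColumnRenamed(name, unquote(name))
    }
    renamed.schema.fields
      .filter(_.dataType == StringType)
      .foldLeft(renamed) { (acc, field) =>
        // Backticks keep unusual characters in the name from being parsed.
        acc.withColumn(
          field.name,
          regexp_replace(col(s"`${field.name}`"), "\"", ""))
      }
  }
}
```

With the DataFrame from the question this would be called as StripQuotes.stripQuotes(bank). A column such as age would still be typed as a string afterwards, so a cast may be needed. It may also be worth checking the raw file first: option("quote", "") turns off quote handling in the CSV reader, so removing that option might already change how the quotes are parsed.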
https://stackoverflow.com/questions/67051695/remove-quotes-from-spark-headers-and-fields April 12, 2021 at 09:40AM
