Saturday, December 19, 2020

Reading a multiline CSV file in Spark

I am trying to read a multiline CSV file in Spark. My schema is: Id, name and mark. My input and actual output are given below. I am not getting the expected output. Can someone please point out what I am missing in my code?

Code:

val myMarkDF = spark
  .read
  .format("csv")
  .option("path", "mypath\\marks.csv")
  .option("inferSchema", "true")
  .option("multiLine", "true")
  .option("delimiter", ",")
  .load()

Input:

1,A,
97,,
1,A,98
1,A,
99,,
2,B,100
2,B,95

Actual output:

+---+----+----+
|_c0| _c1| _c2|
+---+----+----+
|  1|   A|null|
| 97|null|null|
|  1|   A|  98|
|  1|   A|null|
| 99|null|null|
|  2|   B| 100|
|  2|   B|  95|
+---+----+----+

Expected output:

+---+----+----+
|_c0| _c1| _c2|
+---+----+----+
|  1|   A|  97|
|  1|   A|  98|
|  1|   A|  99|
|  2|   B| 100|
|  2|   B|  95|
+---+----+----+

Thanks!
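A note on the behaviour shown above: as far as I know, the `multiLine` option only lets a record span several lines when the line break sits inside a quoted field. Here the breaks are unquoted, so each physical line is parsed as its own row. One possible workaround is to repair the lines before parsing. The sketch below is plain Scala and only illustrates the repair step. The `RepairBrokenCsv` object and its `repair` function are hypothetical names, not part of Spark, and the approach assumes that no field is legitimately empty and that no field contains a comma.

```scala
object RepairBrokenCsv {

  // Hypothetical helper, not a Spark API.
  // Splits every physical line into values, drops the empty ones, and
  // regroups the remaining values into records of `fieldCount` fields.
  // Assumption: no field is legitimately empty and none contains a comma.
  def repair(lines: Seq[String], fieldCount: Int): Seq[String] = {
    val values = lines
      .flatMap(_.split(","))   // "1,A," -> Array("1", "A")
      .map(_.trim)
      .filter(_.nonEmpty)      // discard padding left by the broken lines

    values
      .grouped(fieldCount)            // consecutive runs of `fieldCount` values
      .filter(_.size == fieldCount)   // ignore an incomplete trailing record
      .map(_.mkString(","))
      .toList
  }

  def main(args: Array[String]): Unit = {
    // The input from the question, one element per physical line.
    val input = Seq("1,A,", "97,,", "1,A,98", "1,A,", "99,,", "2,B,100", "2,B,95")
    repair(input, fieldCount = 3).foreach(println)
  }
}
```

If the file is small enough to repair on the driver, the resulting lines could then be turned into a `Dataset[String]` (with `import spark.implicits._` and `toDS()`) and passed to `spark.read.option("inferSchema", "true").csv(...)`, which accepts a `Dataset[String]` in Spark 2.2 and later. For large files the regrouping would need to preserve line order across partitions, which this sketch does not address.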



from Recent Questions - Stack Overflow https://stackoverflow.com/questions/65376761/reading-a-multiline-csv-file-in-spark user3103957
