Tuesday, March 9, 2021

Spark Scala UnitTest not serializable exception

I have the following code:

    class ATest extends FunSuite with SharedSparkContext with RDDComparisons with BeforeAndAfter with Serializable {
      val validData: String = "/users/data.csv"

      test("1.ATest") {
        val localConf = super.conf
        localConf.setAppName("Atest").
          set("spark.input.path", validData)
        val a = new A(localConf) {
          override def validate(): (List[String], List[String]) = {
            (List(validData), List[String]())
          }
        }
        a.process()

The actual class A is as follows:

    class A extends Serializable {
      def process() { validate() ... }
      def validate() {...}
    }

In the test case, I override validate so that it returns test values instead of doing the real validation. The problem is that the code shown above throws:

    org.apache.spark.SparkException: Task not serializable
    Cause: java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper
    [info] Serialization stack:

However, when I move the instance-level validData definition inside the test method, it works fine. Since ATest extends Serializable, shouldn't the whole class be serializable? Why am I getting this exception?

-Raj

https://stackoverflow.com/questions/66556946/spark-scala-unittest-not-serializable-exception March 10, 2021 at 09:03AM
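The behaviour described in the question can probably be reproduced without Spark or ScalaTest. The sketch below uses plain Java serialization and hypothetical stand-ins that are not in the original post: Helper plays the role of the non-serializable AssertionsHelper, Job plays the role of A, and Suite plays the role of ATest. It assumes the cause is that the anonymous subclass reads a field of the enclosing suite and therefore holds a reference to the whole suite instance. Mixing in Serializable only marks the class; it does not make its fields serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable member such as
// org.scalatest.Assertions$AssertionsHelper.
class Helper

// Hypothetical stand-in for class A from the question.
class Job extends Serializable {
  def validate(): List[String] = Nil
}

// Hypothetical stand-in for ATest: marked Serializable, but it owns a
// field whose type is not serializable.
class Suite extends Serializable {
  val helper = new Helper
  val validData: String = "/users/data.csv"

  // validate() reads a field of Suite, so the anonymous subclass keeps a
  // reference to the enclosing Suite, which is serialized along with it.
  def capturing(): Job = new Job {
    override def validate(): List[String] = List(validData)
  }

  // Mirrors the workaround in the question: the value is a local, so the
  // anonymous subclass only needs the String, not the enclosing Suite.
  def localCopy(): Job = {
    val path = validData
    new Job {
      override def validate(): List[String] = List(path)
    }
  }
}

object SerializationDemo {
  // Returns true if plain Java serialization of obj succeeds.
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val suite = new Suite
    println(s"suite itself:    ${serializes(suite)}")
    println(s"capturing field: ${serializes(suite.capturing())}")
    println(s"local copy:      ${serializes(suite.localCopy())}")
  }
}
```

The first two checks should fail to serialize because the Helper field is reached through the suite. The third is expected to succeed, which would match the observation that moving validData into the test method makes the error go away. Passing the path to the class under test as a constructor argument would be another way to avoid capturing the suite.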
