Saturday, January 23, 2021

How can I get a Cartesian product while filtering out the pairs of identical tuples?

I'm trying to define a function that gets the Cartesian product of a given list with itself; however, I need to filter out the elements that pair a tuple with itself.

For example: get the Cartesian product of rdd with itself and filter out the results ((1,0),(1,0)), ((2,0),(2,0)) and ((3,0),(3,0)).

rdd = sc.parallelize([(1,0), (2,0), (3,0)])

def get_cart(rdd):
    a = sorted(rdd.cartesian(rdd).collect())
    aRDD = sc.parallelize(a)
    return aRDD

I'm expecting to get the output:

[((1, 0), (2, 0)), ((1, 0), (3, 0)), ((2, 0), (1, 0)), ((2, 0), (3, 0)), ((3, 0), (1, 0)), ((3, 0), (2, 0))]  

Instead I'm getting:

[((1, 0), (1, 0)),
 ((1, 0), (2, 0)),
 ((1, 0), (3, 0)),
 ((2, 0), (1, 0)),
 ((2, 0), (2, 0)),
 ((2, 0), (3, 0)),
 ((3, 0), (1, 0)),
 ((3, 0), (2, 0)),
 ((3, 0), (3, 0))]
https://stackoverflow.com/questions/65866091/how-can-i-get-a-cartesian-product-filtering-out-the-pair-of-tuples-with-repeated January 24, 2021 at 08:42AM
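
One possible way to get the expected output, offered as a sketch and not taken from the original post: keep only the pairs whose two elements differ. The helper name is_distinct_pair is made up for illustration. The get_cart function below assumes it is passed a PySpark RDD and uses only the RDD methods cartesian and filter; the check at the bottom runs the same predicate in plain Python with itertools.product, so it can be tried without a Spark context.

```python
from itertools import product


def is_distinct_pair(pair):
    """Return True when the two elements of a product pair differ."""
    left, right = pair
    return left != right


def get_cart(rdd):
    """Sketch for a PySpark RDD: build the Cartesian product of the RDD
    with itself, then drop the pairs that match an element with itself."""
    return rdd.cartesian(rdd).filter(is_distinct_pair)


# Plain-Python check of the same predicate on the data from the question.
data = [(1, 0), (2, 0), (3, 0)]
result = sorted(p for p in product(data, repeat=2) if is_distinct_pair(p))
print(result)
```

With a Spark context available, sorted(get_cart(rdd).collect()) should give the same list. Filtering on the RDD before calling collect also avoids pulling the unwanted pairs back to the driver only to parallelize them again.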
