I have a dataframe with a column containing geometries defined as WKT, and a single image. I want to segment the image along the boundaries defined by these polygons.
I had planned to use rasterio: read the image once, then distribute it to the workers via a UDF. However, it appears that the open rasterio dataset cannot be pickled, so it cannot be distributed this way. Here is a code snippet showing what I was attempting:
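    # Imports added for completeness; the aliases are assumed from how they are used below.
    import rasterio
    import rasterio.mask
    import pyspark.sql.functions as pyspkF
    from pyspark.sql.types import StringType
    from shapely import wkt as shwkt

    # Opened once on the driver; this is the object that fails to pickle.
    img = rasterio.open('/dbfs/mnt/mymount/USDA_cropscape_exports/CDL_Iowa_2020.tif')

    def cliplu(wkt_in):
        poly = shwkt.loads(wkt_in)
        # mask() returns (array, transform); keep band 1 of the clipped array.
        sub_image_nd = rasterio.mask.mask(img, [poly], crop=True)[0][0]
        return sub_image_nd

    clip_landuseimg_udf = pyspkF.udf(lambda x: cliplu(x), StringType())
    section_boundaries_df = section_boundaries_df.withColumn('clipped_img', clip_landuseimg_udf('bbox_wkt'))

Is there any way to distribute the segmentation using libraries compatible with Databricks on Azure?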
PS: I am unable to install rasterframes due to an incompatibility with databricks on Azure.
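The pickling failure described above is not specific to rasterio: objects that wrap an open OS-level handle generally cannot be serialized. As a minimal sketch of the usual workaround (ship the path, open lazily on the worker), the following uses a plain temporary file as a stand-in for the raster. `RASTER_PATH`, `get_handle` and `read_window` are hypothetical names introduced only for illustration, and reading a byte range stands in for the actual clip step.

```python
import pickle
import tempfile
from functools import lru_cache

# Stand-in for the raster file (assumption: a real job would use the GeoTIFF path).
tmp = tempfile.NamedTemporaryFile(suffix=".bin", delete=False)
tmp.write(bytes(range(16)))
tmp.close()
RASTER_PATH = tmp.name

# 1) An open handle created up front cannot be pickled.
handle = open(RASTER_PATH, "rb")
try:
    pickle.dumps(handle)
    handle_pickles = True
except TypeError:
    handle_pickles = False
finally:
    handle.close()

# 2) A plain path string pickles fine, so ship that instead.
path_pickles = pickle.loads(pickle.dumps(RASTER_PATH)) == RASTER_PATH


@lru_cache(maxsize=None)
def get_handle(path):
    """Open the file once per process and reuse it across calls."""
    return open(path, "rb")


def read_window(path, start, length):
    """Hypothetical stand-in for the clip step: read a byte range."""
    fh = get_handle(path)
    fh.seek(start)
    return fh.read(length)


window = read_window(RASTER_PATH, 4, 3)
```

With rasterio, the analogous change would presumably be to pass the file path into the UDF and call `rasterio.open(path)` inside it, so that each executor opens its own dataset instead of receiving a pickled one. This assumes the `/dbfs/...` path is readable from the workers. The UDF would likely also need a return type able to hold the clipped array, for example `BinaryType()` with the array serialized to bytes, rather than `StringType()`.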
https://stackoverflow.com/questions/67238456/chopping-image-using-polygons-via-pyspark-udfs-on-azure-databricks April 24, 2021 at 09:07AM