I need to recognize text in an image captured by ARKit and draw a box around it on screen.
Apple's official RealtimeNumberReader example converts the recognized box with this code: layer.layerRectConverted(fromMetadataOutputRect: box.applying(self.visionToAVFTransform)), where layer is an AVCaptureVideoPreviewLayer.
However, I'm using the captured image from an ARSCNView, not an AVCaptureVideoPreviewLayer as in the example above, and layerRectConverted only exists on AVCaptureVideoPreviewLayer.
So I ran a lot of experiments to find out how a box detected through a VNImageRequestHandler can be converted to my ARSCNView's layer coordinates.
Strangely, the y-axis always gives a result I can't explain, while the x-axis is predictable. Say my region of interest (ROI) is (80, 100, 300, 50) and the recognized box is (x, y, w, h); then the rect returned by layerRectConverted is (300*x + 80, some unexplained number, 300*w, 50*h).
Can anybody help and explain why? Thank you very much!!!
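One likely piece of the puzzle is the origin: Vision reports normalized boxes with the origin at the lower-left corner, relative to the region of interest when one is set, while UIKit layers put the origin at the top-left. If that is what is happening here, the x formula stays as observed and y needs a flip inside the ROI. The sketch below illustrates only that arithmetic. The helper name `viewRect(forVisionBox:inROI:)` is hypothetical, and it assumes the ROI is already expressed in layer points and that the image fills the layer with no rotation, cropping, or aspect-fill scaling (with ARKit those extra steps would still have to be handled separately, for example via the frame's display transform).

```swift
import Foundation

/// Hypothetical helper (not part of Vision or ARKit): maps a normalized
/// Vision box, given relative to a region of interest, into layer points.
/// Assumes `roi` is in layer points with a top-left origin, and that no
/// rotation or aspect-fill cropping is involved.
func viewRect(forVisionBox box: CGRect, inROI roi: CGRect) -> CGRect {
    // Width and height simply scale with the ROI size.
    let width = box.width * roi.width
    let height = box.height * roi.height

    // x matches the observed formula: roi.width * x + roi.minX.
    let x = roi.minX + box.minX * roi.width

    // y is flipped: Vision measures from the bottom of the ROI,
    // the layer measures from the top.
    let y = roi.minY + (1 - box.minY - box.height) * roi.height

    return CGRect(x: x, y: y, width: width, height: height)
}

// ROI from the question, with a made-up normalized box for illustration.
let roi = CGRect(x: 80, y: 100, width: 300, height: 50)
let box = CGRect(x: 0.5, y: 0.25, width: 0.25, height: 0.5)
let converted = viewRect(forVisionBox: box, inROI: roi)
print(converted.minX, converted.minY, converted.width, converted.height)
```

With these inputs, x is 80 + 0.5 * 300 = 230 and y is 100 + (1 - 0.25 - 0.5) * 50 = 112.5, so y depends on both the box's y and its height, which could explain why it looks unpredictable when only y is considered.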
https://stackoverflow.com/questions/66756293/what-does-layerrectconvertedfrommetadataoutputrect-really-do March 23, 2021 at 10:07AM