Swift iOS – Vision does not return any observations from cgImage


I’m building an app that lets the user trace Japanese characters and then uses Vision to check whether they have been traced correctly. To test this, I’m feeding both English and Japanese characters through the pipeline, but neither returns any observations, and therefore no recognised strings.

import UIKit
import Vision

func convertCanvasToImage(view: UIView) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: view.bounds.size)
    return renderer.image { ctx in
        view.drawHierarchy(in: view.bounds, afterScreenUpdates: true)
    }
}

func runVisionRecognition(canvas: Canvas) {

    NSLog("Start runVisionRecognition")
    let uiImage = convertCanvasToImage(view: canvas)
    guard let cgImage = uiImage.cgImage else { return }

    let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    let request = VNRecognizeTextRequest(completionHandler: recognizeTextHandler)
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en-US"]
    //request.minimumTextHeight = 0.1
    request.usesLanguageCorrection = true
    // request.maximumRecognitionCandidates = 10
    
    do {
        try requestHandler.perform([request])
    } catch {
        NSLog("Uh oh! \(error).")
    }
}

func recognizeTextHandler(request: VNRequest, error: Error?) {
    guard let observations =
            request.results as? [VNRecognizedTextObservation] else {
        NSLog("Whoops, observations like \(request.results)")
        return
    }
    let recognizedStrings = observations.compactMap { observation in
        return observation.topCandidates(1).first?.string
    }
    
    NSLog("Observation: \(observations)")
    NSLog("Recognised Strings: \(recognizedStrings)")
}
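One detail worth flagging, given that the goal is reading Japanese: the request above only asks for `en-US`. A hedged sketch of configuring the request for both scripts (language codes and availability vary by iOS version, so this is an assumption to verify on-device):

```swift
import Vision

// Sketch: ask Vision for Japanese as well as English.
let request = VNRecognizeTextRequest(completionHandler: recognizeTextHandler)
request.recognitionLevel = .accurate
request.recognitionLanguages = ["ja-JP", "en-US"]

// Language correction is tuned for words; for isolated traced
// characters, disabling it may be worth trying.
request.usesLanguageCorrection = false

// On iOS 15+, the request can report which languages it actually
// supports for the chosen recognition level and revision.
if let supported = try? request.supportedRecognitionLanguages() {
    NSLog("Supported languages: \(supported)")
}
```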

My characters are drawn on a Canvas view, which I convert to a UIImage to feed into Vision (see convertCanvasToImage at the top).
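One common pitfall with this kind of conversion is that a rendered drawing view can come out as dark strokes on a fully transparent background, which text recognition may not handle well. A minimal sketch of a variant that flattens the view onto an opaque white background first (`convertCanvasToOpaqueImage` is a hypothetical name, not part of the original code):

```swift
import UIKit

// Assumed variant of convertCanvasToImage: renders the view over a
// solid white fill so the output image has no alpha channel.
func convertCanvasToOpaqueImage(view: UIView) -> UIImage {
    let format = UIGraphicsImageRendererFormat.default()
    format.opaque = true  // output has no transparency
    let renderer = UIGraphicsImageRenderer(size: view.bounds.size, format: format)
    return renderer.image { ctx in
        UIColor.white.setFill()   // solid background behind the strokes
        ctx.fill(view.bounds)
        view.drawHierarchy(in: view.bounds, afterScreenUpdates: true)
    }
}
```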

Below are some examples of my handwriting on the Canvas. NSLog prints ‘Observation: []’ and ‘Recognised Strings: []’, yet as you can see, the Photos app recognises the same input without trouble!

[Image: view of my app, with the letters drawn onto a Canvas]

[Image: screenshot from the Photos app, showing the letters being read clearly (both English and Japanese)]

My theory is that something is going wrong in the conversion from Canvas to CGImage, but if you look at the first image, the top ‘APPLE’ is the converted image produced from my bottom drawing of ‘APPLE’, and it looks correct.
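To narrow that theory down, it may help to inspect the CGImage before Vision ever sees it. A small debugging sketch (the helper name `logImageInfo` is hypothetical) that logs the pixel dimensions and alpha configuration, and optionally saves the image for visual inspection:

```swift
import UIKit

// Assumed debugging helper: confirms the CGImage exists and has
// plausible dimensions before it is handed to Vision.
func logImageInfo(_ uiImage: UIImage) {
    guard let cgImage = uiImage.cgImage else {
        NSLog("No backing CGImage at all")
        return
    }
    NSLog("CGImage size: \(cgImage.width) x \(cgImage.height)")
    NSLog("Alpha info: \(cgImage.alphaInfo.rawValue)")
    // Saving the image makes it easy to eyeball exactly what Vision
    // is seeing (requires photo-library add permission in Info.plist).
    UIImageWriteToSavedPhotosAlbum(uiImage, nil, nil, nil)
}
```

A very small rendered image (e.g. a few dozen pixels tall) or an all-transparent one would both produce empty observations even though the drawing looks fine on screen.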
