Cloud Development Kit

Clarity on Content Types and InputUnits in Text Recognition CDK API

contentTypes and inputUnits are arrays in the JSON request.

1. Can you please provide detailed documentation on how contentTypes and inputUnits are interrelated and how they affect recognition?

2. Can we send multiple types of fields (combinations of strokes) under inputUnits and specify their corresponding content types in the contentTypes array? For example, if I have "ELEPHANT" and "123456" as two fields, and I send the strokes for "ELEPHANT" in inputUnits[0] with contentTypes[0] set to "text", and the strokes for "123456" in inputUnits[1] with contentTypes[1] set to "number", how will the recognition take place?

1 Comment

Dear Piyush,

Thank you for contacting us. To answer your questions:
- Regarding input units: these can be "Multi Line" (the default), "Single Line", or "Character". If you use the Single Line input unit and write several lines, they will be merged into one line, which gives poor results. The Character input unit provides the best accuracy for recognizing words or numbers. However, it does not accept cursive handwriting: you must make sure each letter is clearly separated and written in its own box.
- Regarding content type: the default is "TEXT", which should be used for any note. If you want to recognize special words (e.g. medical terms), you can create your own content type, which will improve accuracy. For numbers, you can use the corresponding content type.

Regarding your question 2: you currently need to send two requests, because only one content type can be set per request. In request 1, send the strokes for "ELEPHANT" with the corresponding content type; in request 2, send the strokes for "123456" with the numbers content type.
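
As a rough illustration of the two-request approach, here is a minimal Python sketch that builds one payload per field, each carrying a single content type. Only the contentTypes and inputUnits keys come from this thread. The keys inside each input unit ("inputType", "strokes"), the "MULTI_LINE" spelling, the stroke layout, and the build_requests helper are assumptions made for illustration, so please check them against the CDK documentation before use.

```python
import json


def build_requests(fields):
    """Build one request payload per field, each with exactly one content type.

    The inner key names ("inputType", "strokes") are hypothetical placeholders,
    not confirmed CDK field names.
    """
    payloads = []
    for field in fields:
        payloads.append({
            # One content type per request, as described in the reply above.
            "contentTypes": [field["content_type"]],
            "inputUnits": [{
                # Assumed spelling of the default multi-line input unit.
                "inputType": field.get("input_type", "MULTI_LINE"),
                "strokes": field["strokes"],
            }],
        })
    return payloads


# Dummy stroke data; real strokes would come from the capture device.
fields = [
    {"content_type": "text", "strokes": [{"x": [0, 5], "y": [0, 5]}]},
    {"content_type": "number", "strokes": [{"x": [10, 15], "y": [0, 5]}]},
]

requests_to_send = build_requests(fields)
for payload in requests_to_send:
    print(json.dumps(payload))
```

Each payload would then be sent to the recognition service as its own request, and the two results combined on the client side.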

Best regards,
