Sorry about the repeated questions, but our customer would like a definitive answer on whether handwriting recognition can be improved for both English and Japanese. If possible, could you tell us whether each of the following is possible? If it is, how can it be done, and if not, why can't it be done?
The problem as stated before is that the quality of recognition for English text with an editor configured for Japanese is not so good. However, it is pretty good if the editor is configured for English. So we would like a way to somehow be able to submit handwritten strokes for recognition to either an English editor or a Japanese editor while editing a single document and display the result on the screen normally.
In a previous thread, we discussed transferring data between 2 editors configured for Japanese and English respectively, but if I am not mistaken, the sample code you provided transfers text that has already been converted by the Japanese editor to the English editor. What we would actually need is to transfer the handwritten input from the Japanese editor to the English editor to then do the recognition.
Here are our questions:
1. Using the UIReferenceImplementation framework, is it possible to create an editor configured for en_US alongside the one created by the EditorView (which we configured for ja_JP)?
2. Is it possible to export the handwritten input data from the Japanese editor?
3. Is it possible to import the handwritten data from 2. into the English editor?
4. Is it possible to convert the imported handwritten data in 3. with the English editor?
5. Is it possible to export the converted text from 4.?
6. Is it possible to import the exported text from 5. back into the Japanese editor?
If any of the above is not possible, could you tell us which and explain why it is not possible?
We also thought about another way to reach our goal. We would provide a button to select between English and Japanese input and switch the current editor between one configured for English and one configured for Japanese when the user presses the button. So the user would be able to input some text in Japanese using the Japanese editor, then switch to the English editor to input text in English and back and forth. In such a case, if the user enters Japanese text while the English editor is the one selected, the text would probably get converted to something strange in English, but that is OK. We have the following question about this scenario.
1. Is it possible to create text blocks associated with different editors in the same EditorView?
I tried creating a test project where I modified the code of EditorView to try to implement the above, but it is causing all kinds of display and crash problems.
Our customer is putting a lot of importance on improving the detection to decide whether or not to move on with the project, so if there is any way to reach our goal, either with the above suggestions or some other way, we would really like to know it.
P.S. Our customer is in a rush to get a reply about the above, so if possible, could you give us a quick reply as you usually do?
Thank you!
Olivier @MyScript said almost 4 years ago
Dear Nicolas,
Since you raised several questions about recognizing Japanese mixed with English, I am going to explain a bit further how our recognition works in general. This will give you a better overview of what can and cannot be achieved with our technology:
-Basically, for a given language, e.g. en_US, we have ink databases, which capture the way "American people write".
-We also have lexicons, which basically contain the words and first/last names of a country (the USA in this example).
-We also have a "language model" that, based on the context of the sentence, can give more or less weight to each candidate when our solution has several candidates available with basically the same probability.
-This allows us to get the best accuracy for a given country.
For example, for en_GB, the ink databases we have were mostly collected in the UK, and it includes British English words, such as "colour" rather than "color" (as in en_US).
This explains why our languages are specific to both a language and a country: it gives the best accuracy.
Also, as we saw that English words are commonly used in any language, our solution also embeds some English words in every language. This is meant to recognize isolated words rather than an English sentence inside another language (there is no advanced language model, among other things).
From this, you understand that our solution will never be perfect for recognizing English sentences mixed with Japanese sentences. The result may be acceptable, but it will never be as good as if the English sentences were identified and recognized with our en_US language, and the Japanese sentences identified and recognized with the Japanese recognizer.
This probably answers most of the complaints you have from your customer.
Now, to answer your specific questions:
1. Using the UIReferenceImplementation framework, is it possible to create an editor configured for en_US alongside the one created by the EditorView (which we configured for ja_JP)?
2. Is it possible to export the handwritten input data from the Japanese editor?
3. Is it possible to import the handwritten data from 2. into the English editor?
>>This cannot be done easily. The easiest way is to manually create a second editor and set it to the en_US configuration.
Then, you could create your own implementation of the InputController that takes 2 editors as input; this way, in the handleOnTouchForPointer function, you could feed both editors (the en_US one and the ja_JP one) at the same time.
4. Is it possible to convert the imported handwritten data in 3. with the English editor?
>>Our solution will not be able to properly recognize the Japanese ink with the English editor. It will return English words, and the result cannot be predicted.
5. Is it possible to export the converted text from 4. ?
>>Yes, just call the "export" function, but as said, the result for the Japanese ink will be anything but Japanese.
6. Is it possible to import the exported text from 5. back into the Japanese editor?
>>As-is, if the English ink is mixed with Japanese, there is no easy solution, as the "import" function will add the imported content, and the Japanese ink recognized afterwards will be concatenated after it.
If any of the above is not possible, could you tell us which and explain why it is not possible?
>>Let us know if the above helps.
We also thought about another way to reach our goal. We would provide a button to select between English and Japanese input and switch the current editor between one configured for English and one configured for Japanese when the user presses the button. So the user would be able to input some text in Japanese using the Japanese editor, then switch to the English editor to input text in English and back and forth. In such a case, if the user enters Japanese text while the English editor is the one selected, the text would probably get converted to something strange in English, but that is OK. We have the following question about this scenario.
>>This is indeed the solution we would recommend: you have "2 pens" (or a selector or button that lets the user choose between 2 languages), one for English, one for Japanese, and you send the ink to the proper editor. This will work around the limitations I explained above.
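The "2 pens" routing described above could be sketched roughly as follows. This is only a minimal illustration, not code from the iink SDK or from the uireferenceimplementation: `TwoPenRouter`, `Pen`, `InkTarget` and `RecordingTarget` are hypothetical names, and `InkTarget` merely stands in for whatever wraps each real editor. The sketch assumes that a stroke should stay with the editor that was selected when the pointer went down, even if the user switches pens mid-stroke.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Routes each stroke to the editor that matches the currently selected "pen". */
final class TwoPenRouter {
    /** The two "pens" the user can choose between. */
    enum Pen { JAPANESE, ENGLISH }

    /** Hypothetical stand-in for an editor wrapper; not an iink SDK type. */
    interface InkTarget {
        void pointerDown(float x, float y, long timestamp);
        void pointerMove(float x, float y, long timestamp);
        void pointerUp(float x, float y, long timestamp);
    }

    /** Test double that only records the kind of events it receives. */
    static final class RecordingTarget implements InkTarget {
        final List<String> events = new ArrayList<>();
        @Override public void pointerDown(float x, float y, long t) { events.add("down"); }
        @Override public void pointerMove(float x, float y, long t) { events.add("move"); }
        @Override public void pointerUp(float x, float y, long t) { events.add("up"); }
    }

    private final Map<Pen, InkTarget> targets = new EnumMap<>(Pen.class);
    private Pen selected = Pen.JAPANESE;
    private InkTarget strokeOwner; // target that received the current stroke's pointerDown

    void register(Pen pen, InkTarget target) { targets.put(pen, target); }

    void select(Pen pen) { selected = pen; }

    void onDown(float x, float y, long t) {
        strokeOwner = targets.get(selected); // latch the target for the whole stroke
        if (strokeOwner != null) strokeOwner.pointerDown(x, y, t);
    }

    void onMove(float x, float y, long t) {
        if (strokeOwner != null) strokeOwner.pointerMove(x, y, t);
    }

    void onUp(float x, float y, long t) {
        if (strokeOwner != null) strokeOwner.pointerUp(x, y, t);
        strokeOwner = null; // the next stroke picks up the current selection
    }

    public static void main(String[] args) {
        TwoPenRouter router = new TwoPenRouter();
        RecordingTarget ja = new RecordingTarget();
        RecordingTarget en = new RecordingTarget();
        router.register(Pen.JAPANESE, ja);
        router.register(Pen.ENGLISH, en);

        router.onDown(0f, 0f, 0L);   // stroke written with the Japanese pen
        router.onUp(5f, 5f, 10L);
        router.select(Pen.ENGLISH);  // user presses the language button
        router.onDown(9f, 0f, 20L);  // stroke written with the English pen
        router.onUp(12f, 5f, 30L);
        System.out.println("ja=" + ja.events + " en=" + en.events);
    }
}
```

To feed both editors at the same time instead, as mentioned in the answer to questions 1 to 3, the three handlers would loop over all registered targets rather than a single latched one.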
1. Is it possible to create text blocks associated with different editors in the same EditorView?
>>It seems you are thinking of having a "Text Document" part, in which you would add several text blocks: https://developer.myscript.com/docs/interactive-ink/1.4/android/fundamentals/editing/#content-blocks
With this approach, you would create the editor first (with a language already set), and then set the "Text Document" part. All your blocks would then have to be in the same language.
Based on the above, you understand we have no "magical solution" for your specific use-case, as our technology has not been designed for recognizing mixed languages; the only solution that seems acceptable from an accuracy point of view is to have 2 pens with 2 editors.
Best regards,
Olivier
1 person likes this
Nicolas Morin said almost 4 years ago
Thank you for your reply. We fully understand that the English recognition would not be as good with the Japanese configuration as with the English configuration. Thank you for the detailed explanation. The problem is that our customer is reluctant to proceed with the project if we cannot find a way to improve the English recognition while still supporting Japanese. This is why we are looking at ways to send data from specific text blocks to a different editor configured for English to have it do the conversion.
>> As-is, if the English ink is mixed with Japanese, there is no easy solution...
This does not appear to be an issue for our customer. We understand that the English configuration cannot convert text to Japanese, and the user would use it only for sentences written entirely in English.
As you noted we are using a "Text Document" Part for our UI because we need to be able to support paragraphs as well as shapes in the future. We discovered recently that some functionality, such as Import, available to "Text" Parts is not available for Text Documents, so solutions that would only work for "Text" parts would not work for us.
Thank you for your suggestion to create a custom InputController. This is something we might look into, but we were wondering if you ever tried such an implementation and had some sample code you could share.
One issue is that so far we have not been able to successfully create a second editor without a UI (we were able to create one that owns its own share of the screen). Any tips or sample code for that would also be helpful.
>>This is indeed the solution we would recommend: you have "2 pens" (or a selector or button that lets the user choose between 2 languages), one for English, one for Japanese, and you send the ink to the proper editor. This will work around the limitations I explained above.
We don't see at this time how we should proceed to display the output from both editors on the same page. Do you have any suggestion for how that could be achieved?
On our side, we started work on an experimental implementation using JIIX export. We looked at the contents of the JIIX string and found the stroke coordinates data. We removed some intermediate points in the sequence to make it lighter but left the start and end points unchanged. We then tried feeding that stroke data to the Editor using Editor.pointerEvents(). The stroke is indeed displayed with the correct shape but its scale is much smaller than the original stroke.
Do you know why this is happening and how it can be fixed?
Also, would it be possible to send those pointer events to the "Text" part of an English editor in the background to get an English conversion that we could then export to the foreground? Or is it impossible because the JIIX data is unfit for that or for some other reason?
Thanks again!
Olivier @MyScript said almost 4 years ago
Dear Nicolas,
to answer your questions:
Thank you for your suggestion to create a custom InputController. This is something we might look into, but we were wondering if you ever tried such an implementation and had some sample code you could share.
>>This has not been done on our side, but we do not see any reason it would not work. Indeed, the idea behind the uireferenceimplementation is that it is provided "as-is", and customers can re-implement and tune it according to their needs.
One issue is that so far we have not been able to successfully create a second editor without a UI (we were able to create one that owns its own share of the screen). Any tips or sample code for that would also be helpful.
>>Currently, this is done in our "batch mode" sample. You can refer to the latter: https://github.com/MyScript/interactive-ink-additional-examples-android/blob/master/java/samples/batch-mode/src/main/java/com/myscript/iink/samples/batchmode/MainActivity.java
We don't see at this time how we should proceed to display the output from both editors on the same page. Do you have any suggestion for how that could be achieved?
>>The only solution would be to have 2 renderers (one for each editor); you would then create a custom view on top, which would display the content of both renderers. To my knowledge, this has never been done by customers, and it is likely to be demanding to implement, with no guarantee that it will work as expected.
Do you know why this is happening and how it can be fixed?
>>The reason is that the coordinates in the JIIX file are in millimeters, while the display of your device is in pixels.
You can proceed as explained in the following topics:
https://developer-support.myscript.com/support/discussions/topics/16000028728
https://developer-support.myscript.com/support/discussions/topics/16000028588
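As a rough illustration of the unit mismatch described above, the conversion from millimeters to pixels only needs the display density: pixels = millimeters × DPI / 25.4. The sketch below is a minimal, hedged example; `MmToPixels` is a hypothetical helper, and the 160 DPI value in `main` is an arbitrary placeholder. In practice the horizontal and vertical DPI would be assumed to be the same values the renderer was created with.

```java
import java.util.Arrays;

/** Converts JIIX-style millimeter coordinates to pixel coordinates. */
final class MmToPixels {
    static final float MM_PER_INCH = 25.4f;

    private MmToPixels() {}

    /** Converts one coordinate along an axis whose density is {@code dpi}. */
    static float toPixels(float mm, float dpi) {
        return mm * dpi / MM_PER_INCH;
    }

    /** Converts a whole array of coordinates along the same axis. */
    static float[] toPixels(float[] mm, float dpi) {
        float[] px = new float[mm.length];
        for (int i = 0; i < mm.length; i++) {
            px[i] = toPixels(mm[i], dpi);
        }
        return px;
    }

    public static void main(String[] args) {
        float dpiX = 160f; // placeholder; use the real horizontal DPI of the display
        float[] xMm = {0f, 12.7f, 25.4f};
        System.out.println(Arrays.toString(toPixels(xMm, dpiX)));
    }
}
```

The x and y arrays would each be converted with their own DPI value before the points are sent back as pointer events.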
Also, would it be possible to send those pointer events to the "Text" part of an English editor in the background to get an English conversion that we could then export to the foreground? Or is it impossible because the JIIX data is unfit for that or for some other reason?
>>Proceeding this way may work: you export both your English and Japanese parts as JIIX.
With the JIIX, you know the coordinates/position of your English and Japanese text results. This allows you to re-create a "text string" with the English words inserted into the Japanese. You can then import that string into your Japanese part.
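The re-creation of the mixed "text string" from positions could look roughly like the sketch below. It is only an illustration under several assumptions: parsing of the JIIX export is left out, `MixedLineMerger` and `Word` are hypothetical names, each word is reduced to a label plus the top-left corner of its bounding box in millimeters, and lines are told apart with a fixed line height, which will misplace words that sit close to a line boundary. Japanese literals are written as Unicode escapes so the source stays ASCII.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Merges words recognized by two editors into one string, ordered by position. */
final class MixedLineMerger {
    /** Hypothetical recognized word: label, top-left corner in mm, and source language. */
    static final class Word {
        final String label;
        final float x;
        final float y;
        final boolean english;

        Word(String label, float x, float y, boolean english) {
            this.label = label;
            this.x = x;
            this.y = y;
            this.english = english;
        }
    }

    private MixedLineMerger() {}

    private static int lineOf(Word w, float lineHeightMm) {
        return (int) Math.floor(w.y / lineHeightMm);
    }

    static String merge(List<Word> japanese, List<Word> english, float lineHeightMm) {
        List<Word> all = new ArrayList<>(japanese);
        all.addAll(english);
        // Reading order: top to bottom by line, then left to right within a line.
        all.sort(Comparator
                .comparingInt((Word w) -> lineOf(w, lineHeightMm))
                .thenComparingDouble(w -> w.x));

        StringBuilder out = new StringBuilder();
        Word prev = null;
        for (Word w : all) {
            if (prev != null && lineOf(prev, lineHeightMm) != lineOf(w, lineHeightMm)) {
                out.append('\n');
            } else if (prev != null && prev.english && w.english) {
                out.append(' '); // only consecutive English words are space-separated
            }
            out.append(w.label);
            prev = w;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Word> ja = new ArrayList<>();
        ja.add(new Word("\u4eca\u65e5\u306f", 0f, 2f, false)); // "kyou wa"
        ja.add(new Word("\u3067\u3059", 60f, 2f, false));      // "desu"
        List<Word> en = new ArrayList<>();
        en.add(new Word("handwriting", 20f, 3f, true));
        en.add(new Word("test", 45f, 3f, true));
        System.out.println(merge(ja, en, 10f));
    }
}
```

The resulting string is what would then be imported into the Japanese part, subject to the "import" limitations discussed earlier in the thread.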
As you may guess from my answers, your use-case is really at the limits of our iink SDK, and I am afraid we may not be able to match the expectations of your customer. Maybe the specification of the application should be reviewed, or the use-case addressed differently.
If you could share more of the specification, maybe we could think of another way to proceed.
Best regards,
Olivier