alignUI: Indian Language support with utf-8 encoding #2

@adplearn

I find this tool very useful, so I compiled the jar following the instructions in manula.html. However, I am unable to work with Bangla (e.g., Vrinda in MS) and Oriya (e.g., Kalinga in MS) scripts. Sample parallel files are attached. The tool works correctly for test.hn (Hindi) and English.
The exact issue is that alignUI does not display the test.mn or test.or sentences; it shows blank boxes, as shown in the image file. However, when I link the words blindly, it creates the desired output file in UTF-8 as expected (the aligned.2 file in sample-utf8-data.zip). Neither the sentence pane nor the word buttons display the test.mn or test.or text properly.
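
Since the output file is correct UTF-8 and only the display shows blank boxes, the symptom looks like a font problem rather than an encoding problem. The following is a minimal diagnostic sketch, not alignUI code: it assumes alignUI is a Java AWT/Swing application (it is distributed as a jar), and the class and method names (`FontCheck`, `scriptsIn`, `familiesSupporting`) are made up for illustration. It reports which Unicode scripts a string contains and which installed font families can render it.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of alignUI.
class FontCheck {
    // Unicode scripts of the letters in the text (e.g. BENGALI, ORIYA).
    // Spaces, punctuation, and combining signs are ignored.
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> scripts = EnumSet.noneOf(Character.UnicodeScript.class);
        text.codePoints()
            .filter(Character::isLetter)
            .forEach(cp -> scripts.add(Character.UnicodeScript.of(cp)));
        return scripts;
    }

    // Installed font families that can render every character of the text.
    static List<String> familiesSupporting(String text) {
        List<String> ok = new ArrayList<>();
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        for (String family : ge.getAvailableFontFamilyNames()) {
            Font font = new Font(family, Font.PLAIN, 14);
            // canDisplayUpTo returns -1 when the whole string is displayable.
            if (font.canDisplayUpTo(text) == -1) {
                ok.add(family);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        String bangla = "\u09AC\u09BE\u0982\u09B2\u09BE"; // the word "Bangla" in Bengali script
        String oriya = "\u0B13\u0B21\u0B3C\u0B3F\u0B06";  // the word "Odia" in Oriya script
        System.out.println(scriptsIn(bangla) + " -> " + familiesSupporting(bangla));
        System.out.println(scriptsIn(oriya) + " -> " + familiesSupporting(oriya));
    }
}
```

If the lists include families such as Vrinda or Kalinga but the UI still shows boxes, the components are probably using a font that lacks these glyphs; setting one of the listed families on the sentence pane and word buttons (for example with `setFont`) would be the next thing to try.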

The other minor issues are:
(1) The warning pop-up described in the manual does not appear. It would help prevent a pair from being left incompletely aligned.
(2) Only the current sentence number is displayed. It would be really useful for the annotator to have the total number of sentences and the number of completed sentences displayed alongside the current sentence ID (pair_id) in the top frame.
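
To make request (2) concrete, here is a rough sketch of the kind of header text I have in mind. It is only an illustration under assumptions: `ProgressLabel` and `format` are hypothetical names, pair_id is treated as an integer, and how alignUI actually tracks completed pairs is unknown to me.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of alignUI.
class ProgressLabel {
    // Builds the header text from the current pair ID, the set of pair IDs
    // already aligned, and the total number of sentence pairs in the file.
    static String format(int pairId, Set<Integer> completed, int total) {
        return String.format("Sentence %d of %d (%d completed)",
                pairId, total, completed.size());
    }

    public static void main(String[] args) {
        Set<Integer> done = new TreeSet<>(Arrays.asList(1, 2, 3));
        System.out.println(format(4, done, 10)); // prints "Sentence 4 of 10 (3 completed)"
    }
}
```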

Can anyone help resolve the above issues? Thanks in advance.

sample-utf8-data.zip
