Page 126 - Proceedings of the International Scientific Conference - Applications of New Technologies in Green Buildings, 8th edition

The 8th International Scientific Conference ATiGB (The 8th ATiGB 2023), p. 109

Model accuracy will be benchmarked on test data across all languages. The optimization process will focus on maximizing accuracy under model size and latency constraints. Comparisons with single-language models will validate the multi-lingual approach. User feedback will be incorporated to refine the interface and experience. Figure 5 shows an example in which the proposed system recognizes the text on a student ID card with 100% accuracy using OCR.

IV. RESULTS AND ANALYSIS

The multi-lingual OCR model was evaluated on a held-out test set of student ID card images across the two target languages of our study. OCR accuracy was measured as the percentage of text characters correctly extracted from the ID card region. Table I summarizes the results and compares them with the individual uni-lingual models.

Table I. OCR Accuracy By Language

    Language      Multi-Lingual Model   Uni-Lingual Model
    English       95.2%                 96.5%
    Vietnamese    93.8%                 94.2%

As seen in Table I, the multi-lingual model achieves accuracy comparable to the specialized uni-lingual models, with a decrease of 1.3 percentage points for English and 0.4 percentage points for Vietnamese. Accuracy above 94% can be considered viable for practical use.

A. Error Analysis

    a)  Characters are mostly correct, but some are misread as spaces or dropped.
    b)  Extra spaces are predicted between characters.
    c)  Some characters are missed entirely.

These errors may stem from the challenging segmentation of cursive English or Vietnamese script (Figs. 6 and 7). Language disambiguation is also difficult when characters are analyzed individually rather than in context.

B. Optimizations

The following measures can further optimize multi-lingual accuracy:

   • Increased training data for low-accuracy languages.
   • Regularization and aggressive augmentation to reduce overfitting.
   • CTC loss training to improve sequence prediction.
   • Ensemble modelling to combine strengths of different architectures.
   • On-device dictionaries and NLP for disambiguation and context.

By targeting model improvements and leveraging mobile-optimized NLP, the multi-lingual approach can reach accuracy closer to that of specialized individual models.

V. CONCLUSIONS

This research demonstrated the feasibility of a multi-lingual OCR system for extracting text from international student ID cards. The approach built on existing Vietnamese/English models to train and optimize a multi-lingual OCR model with only a small decrease in accuracy relative to the individual uni-lingual models (0.4 to 1.3 percentage points per language in Table I). Averaged over English and Vietnamese, the multi-lingual model achieved 94.5% accuracy, compared with 95.35% for the uni-lingual models. This represents a promising proof of concept given the significant challenge of recognizing diverse scripts and designs. If accuracy can be further improved, the system would provide a convenient and inclusive student ID card reader usable by international populations.

The multi-lingual model was optimized for on-device deployment using TensorFlow Lite and mobile-specific architectures. This enables fast and reliable OCR performance on students' own devices without relying on network connectivity. The user interface and experience were designed to be intuitive across languages and scripts. These innovations built upon the initial Vietnamese/English OCR system described in our study to address the needs of international universities.

Fig. 6. 85.71% correct; 'ô' is not recognized and is read as the letter 'o'
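
The character-level accuracy used in Section IV (percentage of text characters correctly extracted) can be illustrated with a minimal sketch. The paper does not state how predicted and reference characters are aligned, so this sketch assumes a Levenshtein (edit-distance) alignment; the function names and the sample strings are hypothetical and are not taken from the test set.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference character
                            curr[j - 1] + 1,      # extra predicted character
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Character accuracy in percent (assumed definition, floored at 0)."""
    # Normalize so that 'ô' is one code point in both strings.
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        return 100.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref)) * 100.0


# Hypothetical 7-character example: one 'ô' is read as 'o'.
reference = "c\u00f4 gi\u00e1o"   # "cô giáo"
predicted = "co gi\u00e1o"        # "co giáo"
score = char_accuracy(reference, predicted)
print(f"{score:.2f}%")
```

With one substituted character out of seven, the score is 6/7 of 100%, which shows how a single missed diacritic noticeably lowers accuracy on a short field.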

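
Section IV.B lists CTC loss training as a way to improve sequence prediction. As background, the sketch below shows the standard CTC best-path decoding rule (merge repeated labels, then drop blanks), which is what lets a CTC-trained model emit a character sequence without per-character segmentation. It illustrates decoding only, not the loss itself; the blank symbol `-` and the frame sequence are hypothetical, and the paper does not describe its decoder.

```python
BLANK = "-"  # hypothetical symbol standing for the CTC blank label


def ctc_greedy_decode(frame_labels):
    """Collapse per-frame labels: merge consecutive repeats, then remove blanks."""
    decoded = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return "".join(decoded)


# Hypothetical per-frame argmax output of an OCR model.
frames = list("--hh-e-ll-ll-oo--")
text = ctc_greedy_decode(frames)
print(text)  # hello
```

The blank between the two runs of `l` is what preserves the double letter; without it, the repeated frames would merge into a single character.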

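
The averages quoted in the conclusions can be checked directly against Table I. The short sketch below recomputes them; the variable names are illustrative, and the values are copied from the table.

```python
# Table I: language -> (multi-lingual accuracy %, uni-lingual accuracy %)
table_i = {
    "English": (95.2, 96.5),
    "Vietnamese": (93.8, 94.2),
}

multi_avg = sum(multi for multi, _ in table_i.values()) / len(table_i)
uni_avg = sum(uni for _, uni in table_i.values()) / len(table_i)

# Per-language gap in percentage points (uni-lingual minus multi-lingual).
gaps = {lang: round(uni - multi, 2) for lang, (multi, uni) in table_i.items()}

print(round(multi_avg, 2), round(uni_avg, 2), gaps)
```

The averages come out to 94.5% and 95.35%, and the per-language gaps are 1.3 and 0.4 percentage points.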
                                                                                   ISBN: 978-604-80-9122-4