CIRCA:Bangla

From CIRCA


Current revision as of 18:21, 24 October 2021

In June 2017, the Government of Bangladesh initiated a large-scale project on the Bangla language, itself called Bangla. For the public benefit, the project will make more than 40 tools available free of charge. Although some of the tools had already been developed by September 2021, they have not yet been launched. The project is still incomplete and will continue until 2023. For this reason, much information about the project is not yet available.


Brief History

The interface of the project website.

The Bangladesh Computer Council (BCC) is a body of the Government of Bangladesh that, during the 2000s, mainly trained people to improve their digital skills and knowledge. After 2004, the president of Bangladesh placed greater emphasis on the development of the Bangla language. As a result, BCC started working with Unicode and developing a national Bangla keyboard. It now has a few Tier-3 and Tier-4 data centers. Digitalization of public facilities was one of the most important commitments of the ruling government in its three previous electoral manifestos, in 2009, 2014, and 2018. For that reason, the government is working to transform analog communication infrastructure into digital infrastructure. In the mid-2010s, the government contemplated a more comprehensive project on the Bangla language. It took the initiative in 2016, and in 2017 the Planning Ministry approved a project. BCC was appointed to carry out the project, now called Bangla. It aims to develop more than 40 Bangla language tools under 16 different packages, including voice recognition, speech-to-text, topic modeling, and sentiment analysis.


Technologies Used

The project relies on three main types of technology.

  • Machine learning or artificial intelligence (AI). Most parts of the project are based on data-driven machine learning and natural language processing (NLP). The main technologies used here are speech, text, and image processing.
  • Fonts, keyboards, and encoding, which involve no machine learning component.
  • Creating the standard, which is mainly documentation work.


People Working

The public-facing services of the project.

The project management body comprises one project administrator, nine consultants, and other managerial employees. The consultants are mainly language-technology experts, software architects, and machine learning experts; they are the core technical staff and carry out the main intellectual tasks of the project. The other members mainly handle clerical and processing tasks. Apart from the project members, two other parties are involved: the academic wing and the third-party vendors. The intellectual wing of the project body develops concepts for the academic wing, the academic wing conducts the research, project members review the reports, and the vendors then execute the tasks. In other words, the role of the project members is to supervise and set policy; the academic wing does the fieldwork and builds a theoretical model; and the vendors implement the model and produce real-life outputs. In this respect, the project makes collaboration between academia and industry mandatory.


Problems the Project will Solve

Bangla has long been a low-resource language. As a result, digital benefits are barely available in Bangla. For example, the University Grants Commission (UGC) of Bangladesh asked Turnitin to add Bangla to its list of supported languages, but Turnitin declined. Such examples are plentiful. Multinational companies are reluctant to work with the Bangla language. For these reasons, Bangla needs to become self-reliant by developing its own tools. This project can be a pioneering initiative in that journey.


Methodologies the Project Follows

The methodology adopted for this project is more or less the same as that of writing a research paper. First, the project team creates concept notes, which it calls inception reports. These are essentially the theoretical framework of the work. In the second phase, the bodies involved collect data, train models on it, and develop an application programming interface (API). Lastly, they develop tools and deploy them for public use.


Money Distribution

This project is entirely government-funded: the Government of Bangladesh bears all of its costs. The funding chain is as follows. The Planning Ministry allocates funds to the Information and Communication Technology (ICT) Ministry, and the ICT Ministry passes the money to the project. The project checks the quality of the work and decides how much money should be disbursed to the academic and vendor bodies. Since this is a large-scale, nationwide project, money management is a crucial part of it. However, because the project is government-funded and financial information is mostly confidential, it is difficult to determine whether the money is being used properly. Given the present progress of the project, the management appears satisfactory, and the tools are likely to be effective for the public.
