We believe in the power of language to change lives through our programs
The Language Digitization Initiative is solutions-oriented and focuses on actionable deliverables.
The goal of Translation Commons’s Language Digitization Initiative (LDI) is to provide a vast array of resources such as toolkits, pilot studies, guidelines, and training to indigenous communities in order to improve their access to information.
Zero to Digital Guidelines
High-level overview and roadmap: evaluating the digital status of your language; the remaining steps and workflow needed for digitization of your language; and a glossary of useful terms.
Instructions and workflow for documenting words, abbreviations, and phrases in your language relating to the various areas of specialized activity in your community. How to coin new words in your language as needed for the process of digitization.
Guide to the kinds of existing language materials (print, audio, video, and otherwise) to be collected and preserved in order to support the digitization of your language. Explanation of the pairing of language materials with the various digital applications that require them.
Guides for documenting and recording your language. TC recommends two excellent resources:
The “Language Gathering and Collection Guide”, created by Benjamin Chung of the First Peoples’ Cultural Council, guides you through the steps of “eliciting” or collecting words, sentences, conversations, stories, and more in your language. It provides user-friendly instructions on how to get started, some basic linguistic information to be aware of, ways to enhance accuracy, and different methods of collecting data appropriate for stories, conversations, word lists, and what he calls “Rapid Word Collection” and “Group Recordings”. This well-structured website is very easy to use and includes video clips, graphics for prompts, sample word lists, photos, audio clips, testimonials, and written step-by-step instructions. It is a robust treatment of language documentation for the non-linguist, and will be useful to beginners and language advocates with no prior training as well as those with experience in the field.
“Basic Oral Language Documentation”, written by Will Reiman and hosted by ScholarSpace at the University of Hawai’i, explains an oral-based approach to documenting your language. Many languages are primarily spoken and do not yet have a written tradition. In a nutshell, this approach uses three main steps without needing to wait for writing systems and spelling conventions to be established: “compile a sample of recordings of a full range of speech event types; comment on those recordings [also as audio recordings]; and archive the complete corpus of recordings with an institution that will provide long-term access.” Reiman explains the process, the equipment needed, different set-ups you can use, and the benefits of this method. The webpage provides a PDF of Reiman’s lecture slides with notes as well as an MP3 recording of his lecture explaining the method.
TC Videos and Slide Presentations
Building Bridges for Digitization of Indigenous Languages by Jeannette Stewart (video)
An introduction to font issues for language support by Gerry Leonidas (video)
Text, Keyboard, Font, Essentials for your language on the internet by Craig Cornelius (video)
The Script Encoding Initiative and Unicode by Deborah Anderson (slides)
Language Documentation: A Brief Overview by Anna Belew (slides)
Terminology Management for Indigenous Languages by Sue Ellen Wright (video)
Terminology Management for Indigenous Languages by Sue Ellen Wright (slides)
IYIL 2019 Translation Commons Projects, TC Advisors (video)
Machine Translation for Indigenous Communities, by Kirti Vashee (video)
Discussion with UNESCO on IDIL (video)
Indigenous Interpreters in Mexico, by Alexandra Hernandez Leon and Hector Santaella Barrera (video)
Modern NLP changes the requirements for building automatic Translation Systems, by Antonis Anastasopoulos (video)
Software Internationalization Principles and Strategies, by Tex Texin (video)
Bringing Newly Invented Scripts of South Asia into the Digital World, by Anshuman Pandey (video)
How Unicode Characters Become Glyphs on Your Screen, by Christopher Chapman (video)
Making Optical Character Recognition Systems for Telugu, by Atul Negi (video)
Zero to Digital Series of Guidelines for Indigenous Communities, by Sue Ellen Wright (video)
Accelerating Support for Indigenous Languages in Digital Systems, by Jeannette Stewart and Craig Cornelius (video)
Digitization Solutions for Indigenous Languages, by Julie Anderson and Craig Cornelius (video)
Indigenous Languages Concerns of Identity and having Independent Script: The case of India, by Siva Prasad Rambhatla (video)
Mukurtu CMS, by Michael Wynne (video)
Indigenous Community Presentations
Digitizing Koits Sunuwar, by Dev Kumar (video)
Sunuwar Digitization, Interview with Dev Kumar, by Julie Anderson (video)
Digitizing the Chakma Language, by Bivuti Chakma (video)
Chakma Digitization by Bivuti (video)
Sora Sompeng by Sony Salma (slides)
Children’s story in Mehri by Janet Watson (slides)
Selim and his shadow in Mehri by Janet Watson (slides)
UN – UNESCO Documents
UNESCO’s Internet Universality Indicators
The publication of Zero to Digital: A Guide to Bring Your Language Online was an important milestone on our journey to prevent and reverse global language extinction. But much work remains to ensure human rights and connectivity for all native languages. We call on localization and internationalization experts, language translation professionals, language community advocates, linguists, font designers, and marketing experts to join our mission to break down language and communication barriers!
Machine Translation Guidelines
Zero to Digital: bringing a language online (pilot with the Sunuwar language)
Mapping the digitization status of all languages (database)
Educational online teacher materials for sciences
Indigenous Translator Training