Computational Linguist (Volunteer)
Engineering Team
Position Description
Translation Commons is a nonprofit public charity dedicated to
providing a reliable and scalable method for digitally rendering
languages. Our mission is to ensure that all unrepresented languages
can participate fully in the global digital communications network,
and since 2015 we have helped digitize numerous endangered languages.
We are seeking a motivated and skilled Computational Linguist to
join our volunteer team within the Language Digitization Initiative.
This crucial role focuses on developing tools that address fundamental
challenges in digitizing Indigenous Languages, ensuring their presence
and vitality online. This is a remote volunteer position that gives
you the opportunity to apply your technical expertise
toward meaningful cultural preservation and language development.
Key Responsibilities
The Computational Linguist will focus on the practical application of linguistic data and algorithms to create essential language digitization tools.
- Tool Development: Design, develop, and test morphological tools (e.g., morphological analyzers and generators) tailored to the unique challenges of specific Indigenous Languages.
- Data Analysis: Work with linguistic experts and native speakers to process and structure language data (lexicons, corpora, grammars) necessary for building accurate computational models.
- Model Building: Apply expertise in computational linguistics and natural language processing (NLP) to create and refine models for analysis (e.g., tokenization, segmentation, part-of-speech tagging, and morphological disambiguation).
- Documentation & Resources: Document the developed tools, models, and data sets to ensure they are accessible, maintainable, and usable by community members and other researchers.
- Collaboration: Coordinate with the Program Managers, linguistic advisors, and community stakeholders to understand requirements and validate the utility and accuracy of the tools.
Desired Skills and Experience
We are seeking professionals with a strong background in both computer science and theoretical linguistics, and a passion for language preservation.
- Technical Expertise: Strong background or hands-on experience in Computational Linguistics, NLP, or Language Technology.
- Linguistic Knowledge: Demonstrated understanding of morphology, syntax, and phonology, particularly as they relate to under-resourced or polysynthetic languages.
- Programming Skills: Proficiency in relevant programming languages (e.g., Python) and experience with NLP libraries (e.g., NLTK, spaCy).
- Problem Solving: Ability to identify complex linguistic and technical challenges and use creativity and initiative to develop effective computational solutions.
- Collaboration: Strong communication skills and the ability to work effectively within an international, remote volunteer team.
- Commitment: A weekly commitment of at least 5 hours is required.