Invited Sessions
Special Invited Panel 1
New Horizons in Language Documentation: Making Dictionaries for Learners and Communities
Indigenous Presenters
- Skye Whiting - Tahltan
- Tanya Louie - Tahltan
- Kathy Cottrell - Tahltan
- Pauline Hawkins - Tahltan
- Patricia Louie - Tahltan
- Verna Vance - Tahltan
- Lesli Louie - Tahltan
- Vance Crookedarm - Crow
- Emerson Bull Chief - Crow
- Velma Pretty On Top - Crow
- Roanne Hill - Crow
- Bryan Hudson - Eastern Shoshone
Non-Indigenous Presenters
- Willem De Reuse - TLC Apache Linguist
- Elliot Thornton - TLC Dictionary Database Manager
- Abbie Hantgan-Sonko - TLC Linguist
- Bob Rugh - TLC Crow Specialist
AV Equipment
- Projector for ppt presentation
- Speakers to play audio
- Internet connection to showcase published online dictionaries
The Language Conservancy (TLC)—a 501(c)(3) non-profit with headquarters in Bloomington, Indiana—is a worldwide leader in supporting endangered Indigenous languages. We work closely with communities to determine what solutions best meet community needs. We offer unique solutions for language documentation and revitalization with a strong emphasis on providing the technical knowledge to quickly and effectively develop language resources such as dictionaries, mobile applications, and pedagogical materials.
There is an increasingly urgent need to document the lexical inventory of endangered languages quickly, as speaker populations diminish and the window for working with first-language speakers closes. Traditional approaches to dictionary development, which rely on multi-decade fieldwork and text-corpus development, are often inadequate in contexts where there are few speakers among an aging population. This panel will showcase eight different dictionary projects from across North America. We will present enhanced methods for developing accessible, interactive, online, mobile and print dictionaries for endangered Indigenous languages. The panel presenters are representatives of the Crow Agency in Montana, the Wind River Reservation in Wyoming, the San Carlos Indian Reservation, the Yavapai-Apache Nation in Arizona, and the Tahltan Nation in British Columbia. Each presenter will describe their tailored approach to dictionary development and show the current status of their project toward a published (online, mobile and print) dictionary.
The basis for most TLC dictionaries is the Rapid Word Collection (RWC) method, originally developed by SIL International (2010) to create practical dictionaries in a relatively short period of time. TLC has adapted the RWC semantic domain associations to the North American endangered language situation and has equipped the method to run in person or virtually. Using a specialized software tool developed by TLC, speakers work with a scribe to collect words, glosses and audio recordings, as well as supplemental information for each entry.
TLC has successfully conducted in-person and virtual events in which between 5,000 and 14,000 words have been collected for several First Nations and Tribes. These workshops spark a true sense of ownership, commitment, and community among the participants, who compete in a friendly manner for the most words collected in a day. Following an RWC event, all collected data are consolidated and organized for review by trained linguists to ensure standardized spelling, accurate transcription, and grammatical consistency. The data are then organized for further review in subsequent workshops, using another collection tool developed by TLC. In this paper, we illustrate this methodology with case studies from a variety of languages and community settings and present the resulting online (e.g., Ute 2022), mobile (e.g., Stoney 2021) and print (Rugh et al., 2022) dictionaries.
References
Rugh, Bob, Graczyk, Randolph and McCleary, Timothy P. (Eds.). (2022). Crow Dictionary, 1st Edition. Crow Language Consortium, Billings, MT.
Stoney Mobile Dictionary. (2021, December). The Language Conservancy. Google Play. https://play.google.com/store/apps/details?id=org.stoneynakoda.dictionary. Retrieved June 1, 2022.
SIL International. (2010). http://www.rapidwords.net/. Retrieved April 5, 2022.
Ute Mountain Ute Dictionary. (2022). Ute Mountain Ute. https://dictionary.utelanguage.org/. Retrieved June 1, 2022.
Warfel, Kevin. (2016). Dictionary Production: Rapid Word Collection Method. Brochure. SIL International. Retrieved 2020, from http://www.rapidwords.net/resources/files/rapid-word-collection-flyer
Special Invited Panel 2
New Horizons in Language Documentation: Making Dictionaries for Learners and Communities
Indigenous Presenters
- Shane YellowThunder - Ho-Chunk
- Alex Fire Thunder - Lakota
- Charleen Fisher - Gwich’in
- Paul Williams Jr - Gwich’in
- Šišóka Duta - Dakota
- Cherith Mark - Stoney Nakoda
- Terry Rider - Stoney Nakoda
- Juanita Plentyholes - Ute Mountain Ute
- Colleen Cuthair-Root - Ute Mountain Ute
Non-Indigenous Presenters
- Elliot Thornton - TLC Dictionary Database Manager
- Abbie Hantgan-Sonko - TLC Linguist
- Corey Telfer - TLC Stoney Nakoda Linguist
- Bob Rugh - TLC Crow Specialist
AV Equipment
- Projector for ppt presentation
- Speakers to play audio
- Internet connection to showcase published online dictionaries
The Language Conservancy (TLC)—a 501(c)(3) non-profit with headquarters in Bloomington, Indiana—is a worldwide leader in supporting endangered Indigenous languages. We work closely with communities to determine what solutions best meet community needs. We offer unique solutions for language documentation and revitalization with a strong emphasis on providing the technical knowledge to quickly and effectively develop language resources such as dictionaries, mobile applications, and pedagogical materials.
There is an increasingly urgent need to document the lexical inventory of endangered languages quickly, as speaker populations diminish and the window for working with first-language speakers closes. Traditional approaches to dictionary development, which rely on multi-decade fieldwork and text-corpus development, are often inadequate in contexts where there are few speakers among an aging population. This panel will showcase eight different dictionary projects from across North America. We will present enhanced methods for developing accessible, interactive, online, mobile and print dictionaries for endangered Indigenous languages. The panel presenters are representatives of the Ho-Chunk Nation in Wisconsin; Dakhóta Iápi Okhódakičhiye, a 501(c)(3) non-profit in Minnesota and South Dakota; Beaver Village in Alaska; the Ute Mountain Ute Tribe in Colorado; and the Stoney Nakoda Nation in Alberta. Each presenter will describe their tailored approach to dictionary development and show the current status of their project toward a published (online, mobile and print) dictionary.
The basis for most TLC dictionaries is the Rapid Word Collection (RWC) method, originally developed by SIL International (2010) to create practical dictionaries in a relatively short period of time. TLC has adapted the RWC semantic domain associations to the North American endangered language situation and has equipped the method to run in person or virtually. Using a specialized software tool developed by TLC, speakers work with a scribe to collect words, glosses and audio recordings, as well as supplemental information for each entry.
TLC has successfully conducted in-person and virtual events in which between 5,000 and 14,000 words have been collected for several First Nations and Tribes. These workshops spark a true sense of ownership, commitment, and community among the participants, who compete in a friendly manner for the most words collected in a day.
Following an RWC event, all collected data are consolidated and organized for review by trained linguists to ensure standardized spelling, accurate transcription, and grammatical consistency. The data are then organized for further review in subsequent workshops, using another collection tool developed by TLC. In this paper, we illustrate this methodology with case studies from a variety of languages and community settings and present the resulting online (e.g., Ute 2022), mobile (e.g., Stoney 2021) and print (Rugh et al., 2022) dictionaries.
References
Rugh, Bob, Graczyk, Randolph and McCleary, Timothy P. (Eds.). (2022). Crow Dictionary, 1st Edition. Crow Language Consortium, Billings, MT.
Stoney Mobile Dictionary. (2021, December). The Language Conservancy. Google Play. https://play.google.com/store/apps/details?id=org.stoneynakoda.dictionary. Retrieved June 1, 2022.
SIL International. (2010). http://www.rapidwords.net/. Retrieved April 5, 2022.
Ute Mountain Ute Dictionary. (2022). Ute Mountain Ute. https://dictionary.utelanguage.org/. Retrieved June 1, 2022.
Warfel, Kevin. (2016). Dictionary Production: Rapid Word Collection Method. Brochure. SIL International. Retrieved 2020, from http://www.rapidwords.net/resources/files/rapid-word-collection-flyer
Special Invited Roundtables
Special Invited Roundtable 1
New Horizons in Language Documentation: Making Modern Dictionaries Using New Software and Database Management Systems
Indigenous Presenters
- Skye Whiting - Tahltan
- Tanya Louie - Tahltan
- Kathy Cottrell - Tahltan
- Pauline Hawkins - Tahltan
- Patricia Louie - Tahltan
- Verna Vance - Tahltan
- Lesli Louie - Tahltan
- Vance Crookedarm - Crow
- Emerson Bull Chief - Crow
- Velma Pretty On Top - Crow
- Roanne Hill - Crow
- Bryan Hudson - Eastern Shoshone
Non-Indigenous Presenters
- Elliot Thornton - TLC Dictionary Database Manager
- Abbie Hantgan-Sonko - TLC Linguist
- Bob Rugh - TLC Crow Specialist
- Corey Telfer - TLC Stoney Nakoda Linguist
AV Equipment
- Projector for ppt presentation
- Speakers to play audio
- Internet connection to showcase published online dictionaries
This roundtable discussion will be a follow-up to the dictionary-making panel but will dive deeper into the technical aspects of the process. Presenters will share the word collection process and workshop preparation as well as database development procedures. We will showcase a number of software tools developed by The Language Conservancy (TLC) that allow for a variety of custom-tailored approaches to workshops and the creation of dictionaries.
For the Crow, Tahltan and Eastern Shoshone projects, TLC organized multi-week Rapid Word Collection (RWC) events that systematically worked through a set of 1,800 semantic domains in 4-10 groups of 3-5 speakers each. Following these events, we organized a series of Record and Review Collections (RRCs) to check all entries for accuracy and add information as needed.
For the Yavapai and Apache projects, by contrast, the teams used existing published and unpublished wordlists, dictionaries and databases to create prompts for which speakers could add recordings. TLC linguists and programmers organized the existing entries into semantic domains. Participants were given the option to add new entries in this hybrid RWC/RRC approach.
Once recorded, all data are imported into the dictionary database management tool TshwaneLex (TLex). The data are kept on a shared server so that editors within TLC and tribal members can access and edit the data collaboratively. TLex’s Lua scripting access supports complex automation of many dictionary-building operations, including merging duplicates, generating rich reversal entries, organizing and mapping example sentences, bulk reorganization of inflected forms, and conditional styling for rich exports.
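Two of the operations named above, merging duplicates and generating reversal entries, can be sketched in a few lines. The following is a minimal illustration in Python, not TLex's Lua API and not TLC's actual scripts. The entry structure, the function names, the placeholder headwords, and the choice to treat headwords that differ only in case or surrounding whitespace as duplicates are all assumptions made for the example.

```python
from collections import defaultdict


def merge_duplicates(entries):
    """Merge entries that share a headword, keeping glosses in first-seen order.

    Assumption: headwords differing only in case or surrounding whitespace
    count as duplicates. A real orthography may need a different rule.
    """
    merged = {}
    for entry in entries:
        key = entry["headword"].strip().lower()
        if key not in merged:
            merged[key] = {"headword": entry["headword"].strip(), "glosses": []}
        for gloss in entry["glosses"]:
            if gloss not in merged[key]["glosses"]:
                merged[key]["glosses"].append(gloss)
    return list(merged.values())


def build_reversal_index(entries):
    """Map each English gloss to the sorted headwords that carry it."""
    index = defaultdict(set)
    for entry in entries:
        for gloss in entry["glosses"]:
            index[gloss.lower()].add(entry["headword"])
    return {gloss: sorted(words) for gloss, words in sorted(index.items())}


# Placeholder data: "word_a" and "word_b" stand in for real headwords.
raw = [
    {"headword": "word_a", "glosses": ["water"]},
    {"headword": "Word_A ", "glosses": ["water", "river"]},
    {"headword": "word_b", "glosses": ["river"]},
]
merged = merge_duplicates(raw)
reversal = build_reversal_index(merged)
```

Here the two spellings of the first headword collapse into one entry carrying both glosses, and the reversal index lists both headwords under "river".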
The Crow, Ho-Chunk, and Tahltan RWC projects served to expand existing smaller dictionaries, so the existing data were converted to the TLex format and merged into the database.
In all steps of recording and editing, a chain of custody is maintained in TLex with ID numbers, modified dates, and tracking codes to ensure edits can be explained and validated at each step. TLC’s Record and Review software directly reads the .xml export from TLex, providing seamless integration for further recording collection and editing without requiring full TLex training.
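As a rough illustration of how a downstream tool might read an XML export and check the custody fields, here is a minimal Python sketch. The element and attribute names (`Entry`, `id`, `modified`, `tracking`, `Headword`, `Gloss`), the tracking code format, and the sample data are hypothetical. The actual TLex export schema and the Record and Review software's internals are not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: the element and attribute names are assumptions.
SAMPLE_EXPORT = """<Dictionary>
  <Entry id="1001" modified="2022-03-14" tracking="RWC-01">
    <Headword>word_a</Headword>
    <Gloss>water</Gloss>
  </Entry>
  <Entry id="1002" modified="2022-04-02" tracking="RRC-03">
    <Headword>word_b</Headword>
    <Gloss>river</Gloss>
  </Entry>
</Dictionary>"""


def read_entries(xml_text):
    """Parse the export into a list of dicts, one per entry."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.iter("Entry"):
        records.append({
            "id": node.get("id"),
            "modified": node.get("modified"),
            "tracking": node.get("tracking"),
            "headword": node.findtext("Headword", default=""),
            "gloss": node.findtext("Gloss", default=""),
        })
    return records


def missing_custody(records):
    """Return the IDs of records that lack any custody field."""
    return [r["id"] for r in records
            if not (r["id"] and r["modified"] and r["tracking"])]


def modified_since(records, iso_date):
    """Return records modified on or after iso_date (YYYY-MM-DD strings sort by date)."""
    return [r for r in records if r["modified"] and r["modified"] >= iso_date]


records = read_entries(SAMPLE_EXPORT)
recent_ids = [r["id"] for r in modified_since(records, "2022-04-01")]
```

Keeping the ID, modified date, and tracking code on every record is what allows a reviewer to ask which entries changed since the last workshop and where each edit originated.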
Established internal TLex training methods and materials are now being provided to teams of Indigenous linguists and editors from the Tahltan and Ho-Chunk dictionary projects to facilitate skills transfer and help create independent stakeholder-driven dictionary projects.
Tahltan will be integrating further expansion of their dictionary into the work of new speaker apprentice programs.
The outcome of a dictionary project is the dictionary itself. The philosophy at every stage of TLC’s TLex processes is to “always be proofing.” TLex is set up to reflect the final product, with print and digital styles maintained and updated and with audio available to all editors at all times.
The design of the digital dictionaries focuses on serving the needs of language learners. Each dictionary is backed by an advanced search engine that tolerates common phonological mistakes and gives users access to entries whether they are seeking to translate text or roughly spelling out a word they have heard. All of the text within an entry is automatically linked to the corresponding entries with the help of various styles of lemmatization in English and the target language. Mobile, desktop, and web users share a common experience, ensuring accessibility regardless of platform.
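One simple way to tolerate rough spellings is to index every headword under a reduced search key, so that a query typed without diacritics or with doubled letters still finds the entry. The Python sketch below illustrates that general idea only. It is not a description of the search engine behind TLC's dictionaries, and the normalization rules, function names, and headwords (invented strings, not words from any language) are assumptions.

```python
import unicodedata
from collections import defaultdict


def search_key(text):
    """Reduce a spelling to a rough key: lowercase, no diacritics, no doubled letters."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    collapsed = []
    for ch in base:
        # Keep letters only, and skip a letter that repeats the previous one.
        if ch.isalpha() and (not collapsed or collapsed[-1] != ch):
            collapsed.append(ch)
    return "".join(collapsed)


def build_index(headwords):
    """Group headwords by their search key."""
    index = defaultdict(list)
    for word in headwords:
        index[search_key(word)].append(word)
    return index


def lookup(index, query):
    """Return every headword whose key matches the query's key."""
    return index.get(search_key(query), [])


# Invented placeholder headwords, not words from any language.
HEADWORDS = ["kátoo", "šími", "katóo"]
INDEX = build_index(HEADWORDS)

hits_plain = lookup(INDEX, "kato")      # finds both accented variants
hits_doubled = lookup(INDEX, "Simmi")   # finds the single-m headword
```

A production search would likely add language-specific equivalences (for example, treating particular letter pairs as interchangeable), which this sketch leaves out.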
Because TLex makes it possible to always be proofing, much of the export process is automated after the first release, so additional updates can be produced with ever-increasing frequency.
Digital formats provide an easy way to receive community feedback, which in most cases is integrated annually or biannually.
This roundtable will demonstrate the tools used to create our online, mobile and print dictionaries and leave ample time for discussion between the presenters and the audience.
Special Invited Roundtable 2
New Horizons in Language Documentation: Making Modern Dictionaries Using New Software and Database Management Systems
Indigenous Presenters
- Shane YellowThunder - Ho-Chunk
- Alex Fire Thunder - Lakota
- Charleen Fisher - Gwich’in
- Paul Williams Jr - Gwich’in
- Šišóka Duta - Dakota
- Cherith Mark - Stoney Nakoda
- Terry Rider - Stoney Nakoda
- Juanita Plentyholes - Ute Mountain Ute
- Colleen Cuthair-Root - Ute Mountain Ute
Non-Indigenous Presenters
- Elliot Thornton - TLC Dictionary Database Manager
- Abbie Hantgan-Sonko - TLC Linguist
- Bob Rugh - TLC Crow Specialist
- Corey Telfer - TLC Stoney Nakoda Linguist
AV Equipment
- Projector for ppt presentation
- Speakers to play audio
- Internet connection to showcase published online dictionaries
This roundtable discussion will be a follow-up to the dictionary-making panel but will dive deeper into the technical aspects of the process. Presenters will share the word collection process and workshop preparation as well as database development procedures. We will showcase a number of software tools developed by The Language Conservancy (TLC) that allow for a variety of custom-tailored approaches to workshops and the creation of dictionaries.
For the Ute Mountain Ute, Ho-Chunk and Stoney projects, TLC organized multi-week Rapid Word Collection (RWC) events that systematically worked through a set of 1,800 semantic domains in 4-10 groups of 3-5 speakers each. Following these events, we organized a series of Record and Review Collections (RRCs) to check all entries for accuracy and add information as needed.
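To give a sense of the workload these figures imply, the sketch below deals the 1,800 domains out to groups in turn and reports how many each group would cover. It assumes the domains are divided among the groups rather than every group covering every domain, which the description above does not specify. The function name and the round-robin rule are illustrative only.

```python
def assign_domains(domain_count, group_count):
    """Deal domain numbers 1..domain_count to groups round-robin."""
    groups = [[] for _ in range(group_count)]
    for domain in range(1, domain_count + 1):
        groups[(domain - 1) % group_count].append(domain)
    return groups


# Workload per group at the two ends of the 4-10 group range.
for group_count in (4, 10):
    sizes = [len(g) for g in assign_domains(1800, group_count)]
    print(group_count, "groups:", min(sizes), "to", max(sizes), "domains each")
```

Under this assumption, four groups would each cover 450 domains and ten groups would each cover 180.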
For the Gwich’in project, by contrast, the teams used existing published and unpublished wordlists, dictionaries and databases to create prompts for which speakers could add recordings. TLC linguists and programmers organized the existing entries into semantic domains. Participants were given the option to add new entries in this hybrid RWC/RRC approach.
Once recorded, all data are imported into the dictionary database management tool TshwaneLex (TLex). The data are kept on a shared server so that editors within TLC and tribal members can access and edit the data collaboratively. TLex’s Lua scripting access supports complex automation of many dictionary-building operations, including merging duplicates, generating rich reversal entries, organizing and mapping example sentences, bulk reorganization of inflected forms, and conditional styling for rich exports.
The Crow, Ho-Chunk, and Tahltan RWC projects served to expand existing smaller dictionaries, so the existing data were converted to the TLex format and merged into the database.
In all steps of recording and editing, a chain of custody is maintained in TLex with ID numbers, modified dates, and tracking codes to ensure edits can be explained and validated at each step. TLC’s Record and Review software directly reads the .xml export from TLex, providing seamless integration for further recording collection and editing without requiring full TLex training.
Established internal TLex training methods and materials are now being provided to teams of Indigenous linguists and editors from the Tahltan and Ho-Chunk dictionary projects to facilitate skills transfer and help create independent stakeholder-driven dictionary projects.
Tahltan will be integrating further expansion of their dictionary into the work of new speaker apprentice programs.
The outcome of a dictionary project is the dictionary itself. The philosophy at every stage of TLC’s TLex processes is to “always be proofing.” TLex is set up to reflect the final product, with print and digital styles maintained and updated and with audio available to all editors at all times.
The design of the digital dictionaries focuses on serving the needs of language learners. Each dictionary is backed by an advanced search engine that tolerates common phonological mistakes and gives users access to entries whether they are seeking to translate text or roughly spelling out a word they have heard. All of the text within an entry is automatically linked to the corresponding entries with the help of various styles of lemmatization in English and the target language. Mobile, desktop, and web users share a common experience, ensuring accessibility regardless of platform.
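The automatic linking of entry text can be pictured as follows: each word in a definition or example is reduced to candidate base forms, and the first candidate that matches an existing entry becomes the link target. This Python sketch covers only a crude English suffix-stripping case. The suffix list, function names, and entry IDs are hypothetical, and lemmatization for a target language would need rules specific to that language.

```python
import re

# Assumption: a handful of English suffixes is enough for illustration.
SUFFIXES = ("ing", "ed", "es", "s")


def candidate_lemmas(token):
    """Yield the lowercased token, then forms with one common suffix removed."""
    token = token.lower()
    yield token
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            yield token[: -len(suffix)]


def link_text(text, entry_ids):
    """Return (token, entry_id or None) pairs for each word in text."""
    links = []
    for token in re.findall(r"[^\W\d_]+", text):
        match = next(
            (entry_ids[c] for c in candidate_lemmas(token) if c in entry_ids),
            None,
        )
        links.append((token, match))
    return links


# Hypothetical entry IDs keyed by English lemma.
entry_ids = {"horse": "E001", "run": "E002", "river": "E003"}
links = link_text("The horses crossed two rivers", entry_ids)
```

Checking candidates against the entry list, rather than trusting a single stripped form, lets "horses" resolve to "horse" even though removing "es" alone would produce a non-word.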
Because TLex makes it possible to always be proofing, much of the export process is automated after the first release, so additional updates can be produced with ever-increasing frequency.
Digital formats provide an easy way to receive community feedback, which in most cases is integrated annually or biannually.
This roundtable will demonstrate the tools used to create our online, mobile and print dictionaries and leave ample time for discussion between the presenters and the audience.