The AI Keeping Dying Languages Alive: Real communities around the world are using artificial intelligence to rescue languages on the edge of extinction, and the lessons reach right here to Ulster-Scots and Irish
Across the world, languages that were weeks away from silence are finding their voice again. Artificial intelligence is a big part of why. This is one of the genuinely good news stories of our time.
There is a statistic that tends to stop people mid-conversation: linguists estimate that roughly half the world's seven thousand languages will fall silent before the end of this century. Not because nobody cares, but because the last fluent speakers are elderly, communities are scattered, and the recording and teaching work needed to keep a language breathing is enormous. It takes years of painstaking effort, specialist knowledge and serious funding. Most endangered languages have none of those things in sufficient supply.
That is changing. Over the past three or four years, AI tools have moved into language preservation in a way that would have seemed implausible a decade ago: speech recognition models trained on tiny datasets, AI-assisted transcription of archive recordings, and automated grammar tools that help learners practise without needing a fluent teacher in the room. The results are not perfect, but they are real, and for communities that were watching their language disappear in real time, real is more than enough.
What is actually happening on the ground
The story that gets told most often, and rightly so, is from New Zealand. Te Hiku Media, a Māori-owned broadcasting organisation based in the country's Far North, spent years collecting thousands of hours of recordings from kaumātua, the community's elders, many of whom were among the last fully fluent speakers of te reo Māori. Working with AI researchers, they built Kōrero Māori, a speech recognition engine trained almost entirely on those recordings. It can now transcribe spoken Māori with accuracy that rivals commercial English-language tools. More importantly, Te Hiku Media owns it. They deliberately chose not to hand the data or the model to a large technology company, a decision that sparked a global conversation about data sovereignty and who gets to profit from a community's heritage.
That is not an isolated case. In Canada, the FirstVoices project has been helping Indigenous communities build their own digital language archives since the early 2000s, and AI-assisted tools have dramatically accelerated the transcription and searchability of those archives in recent years. In Wales, the Welsh Government has invested in Macsen, a Welsh-language voice assistant built on open-source AI, which can answer questions, set reminders and control smart home devices entirely in Welsh. Children growing up in Caernarfon or Aberystwyth can now talk to technology in their first language. That matters more than it might sound.
The science behind why AI is so useful here
Traditional language documentation work is slow because it is almost entirely manual. A linguist sits with a speaker, records them, transcribes the recording by hand, annotates the grammar, checks it against existing materials and eventually publishes something that other researchers can use. For a language with a handful of speakers and no living tradition of written text, that process can take a career. AI does not replace that careful scholarship, but it compresses the most time-consuming parts.
Modern speech recognition models can be fine-tuned on surprisingly small datasets, sometimes just a few dozen hours of clean audio, to produce workable transcriptions. Natural language processing tools can identify grammatical patterns and flag inconsistencies across large bodies of text far faster than a human reader. And generative AI, used carefully, can help produce learning materials, practice dialogues and vocabulary exercises that a small community organisation could never have afforded to commission from scratch. The technology is not doing the cultural work. The community still has to do that. But it is doing the administrative and technical heavy lifting that used to eat all the available time and money.
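To make the "flag inconsistencies" idea a little more concrete, here is a minimal sketch in Python. It is not how any of the projects mentioned here work, and it is far simpler than a real language-processing tool: it just counts the words in a body of text and flags rare spellings that closely resemble common ones, so that a human can decide whether each is a slip or a legitimate variant. The function name `flag_variants` and both thresholds are assumptions made for illustration.

```python
import re
from collections import Counter
from difflib import get_close_matches


def flag_variants(text, min_common=3, cutoff=0.75):
    """Map each rare word to the frequent words it closely resembles.

    min_common and cutoff are illustrative defaults, not tuned values.
    """
    # Runs of letters (accented letters included), allowing internal
    # hyphens and apostrophes.
    words = re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", text.lower())
    counts = Counter(words)

    # Words seen at least min_common times are treated as established forms.
    common = [word for word, n in counts.items() if n >= min_common]

    flagged = {}
    for word, n in counts.items():
        if n < min_common:
            # difflib scores string similarity between 0 and 1.
            matches = get_close_matches(word, common, n=3, cutoff=cutoff)
            if matches:
                flagged[word] = matches
    return flagged
```

Run over a transcript in which "hoose" appears often and "house" only once, this would flag "house" as a possible variant of "hoose". Whether that is an error or a perfectly good alternative is a question for someone who knows the language, which is rather the point.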
Why this matters for Northern Ireland
Northern Ireland sits in a genuinely unusual position when it comes to language. Irish and Ulster-Scots are both recognised regional languages under the European Charter for Regional or Minority Languages, and both have communities of speakers and learners who care deeply about their survival and growth. Neither is on the immediate brink of extinction in the way that some Indigenous languages are, but both face the familiar pressures: not enough teachers, not enough immersive content, not enough everyday digital presence to feel natural to a younger generation growing up with smartphones.
The lessons from Te Hiku Media and the Welsh Government are directly applicable here. Ulster-Scots, in particular, has a relatively small body of digitised text and audio compared to Irish, which means AI tools trained on general English or Irish data will perform poorly on it. Building good speech recognition or text tools for Ulster-Scots requires the same community-led data collection approach that worked in New Zealand. That is a project that organisations like the Ulster-Scots Agency could genuinely explore, and the technical barriers to starting it are lower now than they have ever been. For Irish, the infrastructure is more developed, but AI-assisted tutoring tools, voice interfaces and content generation could still dramatically expand what learners can access outside the classroom.
There is also a broader point about cultural heritage. Northern Ireland has archives, oral history collections and community recordings sitting in formats that are difficult to search, index or share. The Public Record Office of Northern Ireland holds material that researchers spend years trying to navigate. AI-assisted transcription and cataloguing tools are being used by archives across Britain and Ireland to make exactly this kind of material more accessible. That is not a distant possibility. It is something that could be scoped and started with a modest budget.
The people doing it right
What the best projects have in common is that the technology follows the community, not the other way around. Te Hiku Media did not start by asking what AI could do. They started by asking what their community needed, which was a way to make elder recordings searchable and a tool that would let learners practise pronunciation without needing a fluent speaker in the room. The AI was the answer to a specific problem, not a solution looking for one.
The same principle runs through the Endangered Languages Documentation Programme (ELDP), which has funded fieldwork across Africa, Asia and the Americas since the early 2000s. More recently, researchers working within that tradition have started using AI to process the backlog of recordings that exist but have never been fully transcribed. In some cases, recordings made in the 1970s and 1980s, the only record of a language variety that no longer has living speakers, are being made searchable for the first time. That is not a small thing. It is the difference between a language being recoverable and it being gone for good.
Where to start if you care about this
If you work in language preservation, heritage or community education in Northern Ireland, the practical starting point is not grand and it is not expensive. The first step is usually an audit: what recordings, texts and materials do you already have, what format are they in, and what would be most useful to make searchable or shareable? That audit takes time but it does not require specialist AI knowledge. It is just good information management, and it sets you up to use AI tools effectively when you do bring them in.
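For anyone whose materials already live on a shared drive, the first pass of that audit can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the grouping of file extensions in `KINDS` and the function names `classify`, `audit` and `report` are made up for illustration, and a real audit would also need to cover tapes, paper and anything else that has not been digitised.

```python
from collections import Counter
from pathlib import Path

# Hypothetical grouping of file extensions; adjust to your own collection.
KINDS = {
    "audio": {".wav", ".mp3", ".flac", ".m4a"},
    "text": {".txt", ".doc", ".docx", ".pdf"},
    "video": {".mp4", ".mov", ".avi"},
}


def classify(path):
    """Return the kind of material a file looks like, judged by extension."""
    ext = path.suffix.lower()
    for kind, extensions in KINDS.items():
        if ext in extensions:
            return kind
    return "other"


def audit(root):
    """Count every file under root, grouped by (kind, extension)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[(classify(path), path.suffix.lower())] += 1
    return counts


def report(counts):
    """Turn the counts into sorted, printable lines."""
    return [
        f"{kind:<6} {ext or '(none)':<8} {n}"
        for (kind, ext), n in sorted(counts.items())
    ]
```

Pointing `audit` at a folder and printing each line of `report(...)` gives a quick picture of how much audio, text and video there is and in which formats. It says nothing about what is in the files or what state they are in, which is where the human part of the audit begins.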
The second step is looking at what already exists. For Irish, there are open-source language models and datasets that have been built by researchers at institutions like Dublin City University and the University of Galway (formerly the National University of Ireland Galway). You do not need to start from scratch. For Ulster-Scots, the picture is thinner, but that is an argument for starting the data collection work now rather than waiting. The third step is finding the right technical partner, one who will take the time to understand what you are actually trying to achieve rather than arriving with a predetermined solution. The technology is genuinely exciting, but the cultural knowledge and community trust that surround it are what make it work.
A Friday thought
Language is one of those things that feels permanent until it suddenly is not. The generation of people who grew up speaking Irish as a first language in parts of Donegal and Antrim, or who absorbed Ulster-Scots at home in Ballymena or Comber, are not going to be here forever. The window for capturing that knowledge, that sound, that particular way of putting the world into words, is not infinite.
What is encouraging about the global story is that communities which looked like they had already lost the race are finding that they had not. Te reo Māori is being spoken by children who would not have had access to immersive learning a generation ago. Welsh is being used to talk to smart speakers in kitchens in Gwynedd. These are not miracles. They are the result of communities deciding what they wanted to protect and then finding the tools to do it. AI turned out to be one of those tools, and a rather useful one at that.
Got a culture, heritage or community project that could use a hand from AI?
Drop Verona AI a message for a free, no-pressure chat. We work with organisations of all sizes across Northern Ireland and would love to hear what you are trying to protect or build.