What makes an archive "living"
A living archive is not simply a digital archive with a better interface. It is a fundamentally different relationship between a collection and its community. At Rainforest Studio, we define a living archive through four pillars.
Searchable: semantic, not just keyword-matched
Traditional archives rely on keyword search. If the metadata says "textile workers" and you search "weavers," you get nothing. Semantic search (understanding what you mean, not just what you typed) changes this entirely. It connects related concepts, surfaces relevant material across formats, and makes the archive genuinely usable for people who don't know archival vocabulary.
This is the single biggest difference most communities notice first. An archive that understands natural language questions becomes something people actually use, not something they visit once and abandon.
Citable: with provenance intact
Every piece of material in a living archive carries its provenance. When the system surfaces a document, a photograph, or an oral history excerpt, it tells you where it came from, who contributed it, and how it connects to other materials in the collection. This isn't just good practice: it's essential for communities whose histories have been extracted, recontextualised, or erased.
Traditional platforms handle provenance through metadata fields, but it's rarely surfaced in a way that's meaningful to non-specialist users. In a living archive, citation and source attribution are built into every interaction.
Conversational: RAG-enabled dialogue with the archive
This is the biggest leap. RAG (Retrieval-Augmented Generation) combines language models with structured knowledge bases to let people have actual conversations with an archive. You can ask "What was life like for textile workers in Tottenham in the 1970s?" and receive a synthesised response drawn from oral histories, photographs, and documents in the collection, with sources cited.
Traditional digital archive software has no equivalent to this.
“The closest analogy would be a reference librarian who has read every item in the collection and can draw connections across them instantly. Except this librarian is available at 2am, speaks every language the community speaks, and never retires.
The most important distinction: a living archive is governed by the community it serves. This means decisions about what gets included, how materials are described, who has access, and how AI features behave are made by the community, not by a vendor or a funding body.
This is not the same as "self-hosted." Many traditional platforms are technically open-source, but community-controlled means the community sets the rules for their own cultural material. It means oral histories aren't scraped into training datasets. It means a diaspora community in South London decides how their stories are told, not an institution three steps removed.
It also means resilience. When platforms get acquired, shut down, or change their terms of service, communities using them can lose access to their own collections overnight.
“Community control isn't just a principle: it's a safeguard against the very real risk of digital dispossession.