Cracking the code: How a “prediction machine” is resurrecting the Singapore Stone
- Written by Francesco Perono Cacciafoco, Associate Professor in Linguistics, Xi'an Jiaotong-Liverpool University
Several years ago, my linguistic research team and I began developing a computational tool we call ‘Read-y Grammarian’. Our goal was to reconstruct the highly fragmentary text of the Singapore Stone[1], a relic from the 10th to 14th centuries that features an undeciphered script similar to Kawi.
The software uses a specialised algorithm that applies digital philology[2] (the study of oral and written texts) and epigraphy[3] (the study of written matter recorded on hard or durable material) techniques to analyse ancient inscriptions.
We faced several setbacks while developing ‘Read-y Grammarian’ over the years. However, thanks to our persistence, the system is finally complete and fully operational.
Originally implemented for the Singapore Stone, ‘Read-y Grammarian’ can be used to reconstruct any fragmentary text. Its adaptive framework allows it to restore a wide array of damaged historical records — including manuscripts and papyri — with a few simple adjustments.
This flexibility makes the tool a powerful new ally for researchers looking to piece together the missing fragments of human history.
Read more: The Singapore Stone’s carvings have been undeciphered for centuries – now we’re trying to crack the puzzle[4]
About the Stone
First recorded by the British in 1819 at the Singapore River’s mouth[5], the sandstone monolith was later demolished in 1843 to clear space for building projects. The roughly three-metre-square slab was largely destroyed, leaving only three surviving fragments.
After researchers sent those fragments to the Royal Asiatic Society’s Museum in Calcutta for study, their trail went cold. Their fate remained undocumented for decades until 1918, when — as far as records show — the institution returned only a single piece to what was then the Raffles Museum of Singapore[6].
The Stone originally bore an inscription spanning roughly 50 lines, but its full text has been lost. All that remains are a handful of rough sketches made before its destruction, along with copies of the three fragments recovered after 1843 and the single original fragment that still exists.
The inscription remains a major puzzle. Its script resembles regional writing systems like the Javanese Kawi[9], but actually it doesn’t match any known writing system found on Earth.
Because no one has been able to read the Stone yet, it remains undeciphered[10].
The monument’s turbulent history[11], coupled with the complex and unique nature of its script, has fuelled its myth over the centuries. This has turned the artefact into one of the most intriguing challenges for scholars.
Alongside legendary puzzles like the Phaistos Disc[12] and Linear A[13], the Singapore Stone remains one of History’s great unsolved codes.
Some link it to the legendary strongman Badang[14] and the sprawling Majapahit Empire[15]. Yet, until the conundrum is cracked, all connections — including those with ancient India[16] or Java’s influence — remain speculative.
Decoding the void
Among other computational methods[17], ‘Read-y Grammarian’ works by digitising the inscription and assigning a unique alphanumeric (composed of both letters and numerals)[18] code to every attested character. By tracking the exact position and line of every symbol, the algorithm identifies gaps in the text and reconstructs the likely original layout line by line.
It then applies frequency analysis and statistical and mathematical calculations — based on the linguistic patterns found in human languages — to predict which characters may have filled the missing spaces. This way, ‘Read-y Grammarian’ acts like a ‘prediction machine’, analysing the text step by step and converting its alphanumeric outputs back into actual characters.
We can also adjust specific settings in the system. For example, we can tweak the syntax[19] (grammar rules and word order) of a reference language or language family[20] — such asJavanese[21] or Austronesian[22] — or adjust the related morphology[23] (the internal structure of words and how they are formed).
The algorithm then generates different possible versions of the text and our research team reviews these versions to determine which ones make the most sense linguistically.
This approach allows us to map phonemes[24] (distinct units of sound that distinguish one word from another) to every character in the text, helping us identify possible words. It significantly streamlines the comparative process, letting scholars test the script against a variety of candidate languages to find a possible match.
What we know so far
Our ultimate goal is to solve the mystery of the elusive inscription. A text can be considered truly deciphered only when it can be read and understood in full.
Identifying an exact language will take time. However, we have already reached an important breakthrough: after centuries of silence, we have reconstructed several plausible versions of a complete text. This is a significant achievement, because the monolith itself was almost entirely destroyed. Reconstructing the original inscription would have been impossible without our custom algorithm.
Moreover, we are currently developing a more sophisticated model to expand the capabilities of ‘Read-y Grammarian’. The upgrade will allow us to generate systematic transcriptions at a much faster rate and in greater numbers. It will also incorporate advanced features, such as elements of historical phonology[27] (the study of how sound systems change and evolve over time), to further refine the results.
A work in progress
Deciphering the Singapore Stone has proven extremely difficult, primarily because so little of the inscription survives. The fragments are too small to support reliable frequency analysis[28] — assessing how often specific symbols appear in a text — or statistical studies.
Yet, pattern-recognition[29] methods are the core tools of cryptanalysis[30] (the art of cracking codes and ciphers), and they require far more data than the highly damaged epigraph can currently provide.
To break the deadlock, we must first focus on consistently reconstructing the full inscription. By restoring the text piece by piece and line by line, we can finally view the result as a complete composition.
Though this is not the same as actually reading the inscription, rebuilding the Stone’s original contents gives us the foundation we need to analyse the structure of its text.
Before we can interpret and decipher its fragmentary remains, this crucial first step will pave the way for decoding its writing system and finally identifying the unknown language hidden within.
References
- ^ the Singapore Stone (www.roots.gov.sg)
- ^ philology (www.britannica.com)
- ^ epigraphy (www.britannica.com)
- ^ The Singapore Stone’s carvings have been undeciphered for centuries – now we’re trying to crack the puzzle (theconversation.com)
- ^ First recorded by the British in 1819 at the Singapore River’s mouth (www.nlb.gov.sg)
- ^ Raffles Museum of Singapore (www.nhb.gov.sg)
- ^ https://www.mdpi.com/journal/histories (scholar.google.com)
- ^ CC BY (creativecommons.org)
- ^ Javanese Kawi (www.omniglot.com)
- ^ undeciphered (www.mdpi.com)
- ^ turbulent history (www.mdpi.com)
- ^ Phaistos Disc (heraklionmuseum.gr)
- ^ Linear A (www.britannica.com)
- ^ Badang (www.roots.gov.sg)
- ^ Majapahit Empire (www.britannica.com)
- ^ India (www.britannica.com)
- ^ Among other computational methods (www.mdpi.com)
- ^ alphanumeric (composed of both letters and numerals) (www.merriam-webster.com)
- ^ syntax (www.britannica.com)
- ^ language family (webspace.ship.edu)
- ^ Javanese (www.britannica.com)
- ^ Austronesian (www.britannica.com)
- ^ morphology (www.britannica.com)
- ^ phonemes (www.britannica.com)
- ^ https://www.mdpi.com/journal/information (www.mdpi.com)
- ^ CC BY (creativecommons.org)
- ^ historical phonology (learn.socratica.com)
- ^ frequency analysis (www.sciencedirect.com)
- ^ pattern-recognition (www.sciencedirect.com)
- ^ cryptanalysis (www.britannica.com)
Authors: Francesco Perono Cacciafoco, Associate Professor in Linguistics, Xi'an Jiaotong-Liverpool University





