Lasswitz and Borges: Indexing the Library of Everything

The library of all possible literature

Since ancient times, people have dreamed of collecting the totality of human knowledge in one place. The Library of Alexandria, probably founded during the third century B.C., is the best-known systematic effort to make this vision real. Some researchers estimate that the library's holdings may have peaked at one million scrolls, a huge number for the time. The largest modern libraries may well house more than 100 million items, printed material that may require more than 4000 kilometers of shelving but which could also be packed within a cubic container with a side only 62 meters long, a volume comparable in scale to the cargo capacity of an ultra large crude carrier. Though this is far from the totality of human knowledge, one can imagine that collecting the whole production of human intellect in print and packing it in a cube would not, after all, require too much space.

Science fiction has gone a step further, envisioning a library not only of the existing books but of all possible books, including nonsense and gobbledygook. In his short story “Die Universalbibliothek” (The Universal Library, 1901), the German teacher, scientist and author Kurd Lasswitz (1848 – 1910) explored the possible size of an imaginary library comprising all books of a certain format that could possibly be printed. Lasswitz’s idea is nicely expressed in a few lines by a character in the story, Professor Wallhausen:

“… the number of possible combinations of a given number of letters is limited. Therefore all possible literature must be printable in a finite number of volumes”.

Kurd Lasswitz

The argument is rather simple: Lasswitz calculates that a book of 500 pages, with 40 lines per page and 50 characters per line, consists of one million characters. Allowing 100 possible different characters, including the Latin alphabet, punctuation marks and “the space that keeps the words apart”, there are 100 possibilities for each of the 1000000 characters of a book. Wallhausen points out that “… if we take our one hundred characters, repeat them in any order often enough to fill a volume which has room for one million characters, we’ll get a piece of literature of some kind. Now if we produce all possible combinations mechanically we’ll ultimately get all the works which ever have been written in the past or can be written in the future […including] the lost works of Tacitus and their translations into all living and dead languages […] all forgotten and still undelivered speeches in all parliaments […] the history of the subsequent wars, all the compositions all of us wrote in school and college”.
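The arithmetic behind Wallhausen's argument can be sketched in a few lines of Python. The function name `library_size_log10` is ours, not Lasswitz's, and it works with base-10 logarithms because the number of books itself is far too large to print.

```python
import math

def library_size_log10(pages, lines_per_page, chars_per_line, alphabet_size):
    """Return (characters per book, log10 of the number of distinct books)."""
    chars_per_book = pages * lines_per_page * chars_per_line
    # There are alphabet_size ** chars_per_book books; keep only the exponent.
    return chars_per_book, chars_per_book * math.log10(alphabet_size)

chars, log_books = library_size_log10(500, 40, 50, 100)
print(chars)             # 1000000 characters per book
print(round(log_books))  # 2000000, i.e. a "1" followed by two million zeros
```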

Combinatorics and immensity

Elementary combinatorics shows that the number of possible books thus produced is 10^2000000, an ungraspable number consisting of a “1” followed by 2000000 zeros. Wallhausen argues that packing this immense number of books in boxes of 1000 volumes each would require a space immensely greater than the observable universe. Indeed, if each box has a volume of one cubic meter, then the total volume of all the boxes would be 10^1999997 cubic meters. Taking the volume of what is now considered the observable universe to be about 3,6X10^80 cubic meters, this is approximately 2,8X10^1999916 times that volume. In other words, it would take approximately 2,8X10^1999916 universes to accommodate Lasswitz’s Universal Library. The total number of different pages appearing in the library is 10^4000, vastly smaller than the number of books, while the total number of different lines is 10^100, equal to the exotically named number “ten duotrigintillion” (commonly named “googol”), a number with 101 digits. Since the library contains 5X10^2000002 pages and 2X10^2000004 lines in all, every specific line repeats itself 2X10^1999904 times and every specific page repeats itself 5X10^1996002 times throughout the books of the library. All these numbers are incomparably bigger than what is now accepted as the total number of atoms in the observable universe, estimated somewhere between 4X10^79 and 10^81, a number with up to 82 digits.
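The counts of distinct pages and lines, and of how often each one recurs, follow from the same logarithmic bookkeeping. The sketch below is only an illustration. The helper names `distinct_and_repeats` and `sci` are ours, and the "repeats" figure is simply the total number of pages (or lines) printed in the whole library divided by the number of distinct ones.

```python
import math

PAGES, LINES_PER_PAGE, CHARS_PER_LINE, SYMBOLS = 500, 40, 50, 100
CHARS_PER_BOOK = PAGES * LINES_PER_PAGE * CHARS_PER_LINE
LOG_BOOKS = CHARS_PER_BOOK * math.log10(SYMBOLS)   # log10 of the number of books

def distinct_and_repeats(units_per_book, chars_per_unit):
    """For a unit (page or line): log10 of distinct units, log10 of repeats of each."""
    log_distinct = chars_per_unit * math.log10(SYMBOLS)
    log_total = LOG_BOOKS + math.log10(units_per_book)  # all units in the library
    return log_distinct, log_total - log_distinct

def sci(log_value):
    """Format 10**log_value in the essay's notation, e.g. '5.0X10^3'."""
    exponent = math.floor(log_value)
    return f"{10 ** (log_value - exponent):.1f}X10^{exponent}"

log_pages, log_page_repeats = distinct_and_repeats(PAGES, LINES_PER_PAGE * CHARS_PER_LINE)
print(round(log_pages))        # 4000, i.e. 10^4000 distinct pages
print(sci(log_page_repeats))   # 5.0X10^1996002
```

Calling `distinct_and_repeats(PAGES * LINES_PER_PAGE, CHARS_PER_LINE)` gives the corresponding figures for lines.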

A printing machine for printing all possible 65-character lines using 50 possible symbols. The sketch, made by the nuclear physicist George Gamow, appears in his book “One Two Three … Infinity” (1947). The machine consists of separate discs with letters and signs along the rim. The full rotation of each disc moves the next one forward by one place.
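The carrying mechanism of such a machine behaves like an odometer, and a toy version is easy to simulate. The sketch below is our own illustration, not Gamow's design: it uses three symbols and two discs instead of 50 symbols and 65 discs, and the function name `odometer_lines` is hypothetical.

```python
def odometer_lines(symbols, length):
    """Yield every line of `length` symbols, advancing the discs like an odometer."""
    discs = [0] * length                      # current position of each disc
    while True:
        yield "".join(symbols[d] for d in discs)
        # Advance the last disc; a full rotation carries over to the next one.
        i = length - 1
        while i >= 0:
            discs[i] += 1
            if discs[i] < len(symbols):
                break
            discs[i] = 0
            i -= 1
        if i < 0:                             # every disc has completed a full turn
            return

lines = list(odometer_lines("ab ", 2))
print(len(lines))   # 9, i.e. 3**2
print(lines[:4])    # ['aa', 'ab', 'a ', 'ba']
```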

To gain some understanding of the Universal Library’s mind-bending numbers, assume that each atom in the universe is a printing factory with as many printing machines as there are atoms in the universe, and that each of these printing machines produces, every millionth of a second, as many books as there are atoms in the universe. Taking the upper estimate of 10^81 atoms, the production rate would be 10^249 books per second and, had this rate remained constant since the birth of the universe some 4X10^17 seconds ago, only about 10^267 books would have been produced up to this moment, a production corresponding to 0,00…….(1999730 zeroes)…….01% of the required number of books.
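This thought experiment reduces to adding exponents. The sketch below makes two assumptions of ours explicit: the upper estimate of 10^81 atoms, and an age of the universe of about 13.8 billion years (roughly 4.35X10^17 seconds).

```python
import math

LOG_ATOMS = 81                 # assumed: upper estimate of 10^81 atoms
LOG_TICKS_PER_SECOND = 6       # each machine prints once per microsecond
AGE_OF_UNIVERSE_S = 4.35e17    # assumed: about 13.8 billion years, in seconds
LOG_BOOKS_NEEDED = 2_000_000   # the library holds 10^2000000 books

# factories x machines per factory x books per tick x ticks per second
log_rate = 3 * LOG_ATOMS + LOG_TICKS_PER_SECOND
log_printed = log_rate + math.log10(AGE_OF_UNIVERSE_S)
log_fraction = log_printed - LOG_BOOKS_NEEDED

print(log_rate)              # 249
print(round(log_printed))    # 267
print(round(log_fraction))   # -1999733
```

A fraction of 10^-1999733 is 10^-1999731 percent, which is the decimal with 1999730 zeros after the comma.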

It is remarkable that the total number of books is highly sensitive to the chosen printing format. For example, reducing the size of each book by only two pages (i.e. 498 pages instead of 500) yields a library 10^8000 times smaller. Similarly, reducing the number of characters per page by one (i.e. 1999 characters instead of 2000) yields a library 10^1000 times smaller.
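Both shrinkage factors can be checked by comparing exponents. This is a minimal sketch, and the helper `log10_books` is a name of ours.

```python
import math

def log10_books(pages, chars_per_page, symbols=100):
    """log10 of the number of distinct books for a given format."""
    return pages * chars_per_page * math.log10(symbols)

base = log10_books(500, 2000)
print(round(base - log10_books(498, 2000)))   # 8000: two pages fewer
print(round(base - log10_books(500, 1999)))   # 1000: one character fewer per page
```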

From gobbledygook to undiscovered theorems

As Lasswitz describes in his short story, every possible work of literature, coherent or not, known, lost or future, in every possible language would eventually appear somewhere in the library, albeit transliterated into the Latin alphabet, together with an immense number of slight variations of it, containing one or more typographical errors. For example, for each book of the library there would be 99000000 versions containing exactly one typographical error and approximately 4,9X10^15 versions containing exactly two typographical errors. These numbers rise sharply with each additional typographical error considered, until the book contains so many errors that it becomes incoherent and unrecognizable. Each book is therefore accompanied by an unimaginable number of its variations or “relatives”. In between any two fully coherent works there is a huge number of common “relatives”, books that may result from both works after specific typographical alterations. Apart from this, each book whose pages are all different also belongs to a family of 500! (500 factorial, i.e. 2 times 3 times 4 times 5 times … times 500) books that are identical to it except that their pages are shuffled. Note that 500! is a number with 1135 digits, approximately equal to 1,22X10^1134.
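The counts of "relatives" are small enough for exact integer arithmetic. In the sketch below, a book with exactly k typographical errors is taken to mean one that differs from the original in exactly k of its one million positions. The helper name `typo_variants` is ours.

```python
import math

CHARS_PER_BOOK = 1_000_000
SYMBOLS = 100

def typo_variants(k):
    """Number of books differing from a given book in exactly k positions."""
    # Choose the k positions, then one of the 99 wrong symbols for each.
    return math.comb(CHARS_PER_BOOK, k) * (SYMBOLS - 1) ** k

print(typo_variants(1))                  # 99000000
print(f"{typo_variants(2):.2e}")         # 4.90e+15
print(len(str(math.factorial(500))))     # 1135 digits in 500!
```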

Almost all the books of the Universal Library fall into the “gobbledygook” category, while others are a succession of valid words in random order, some of them forming entertaining pieces of nonsense literature. One of them is completely blank. Assuming that each line consists of 10 words, so that a book holds 200000 words, and that each word could be any of the 600000 English words included in the Oxford dictionary, the total number of English language books with perfectly correct spelling, nonsensical or not, is 600000^200000, approximately 1,8X10^1155630. Books containing some number of typographical errors or misspellings but still recognizable as English are vastly greater in number. For extensively inflected languages, such as Greek and Russian, the number of books is much higher.

It is obvious that Lasswitz’s line of thinking leads to a highly redundant Universal Library. Lines, pages and whole books appear again and again, almost identical, an unimaginable number of times. Lasswitz rather generously permits 100 different characters since “…mathematicians have an enormous number of symbols […] which could be replaced by an agreement with small indices…”. Thus, apart from works of literary merit, Lasswitz’s Universal Library includes all past and future mathematical works, undiscovered proofs and theorems.

The reduced everything

Lasswitz’s library could be greatly reduced in size by simply using 100 language-independent symbols, such as &, @, #, $, * etc. The correspondence of each symbol to a specific character of any alphabet could be left open to the reader’s choice. For each specific book, the 100 symbols appearing could thus be thought of as corresponding to any set of 100 characters of a language in 100! different ways, the number of possible permutations of 100 objects. For example, the symbol sequence

&@#$*$

could be thought of as representing the words ABUSES, IMPEDE, SCORER etc. for an English language reader, the words ΣΟΒΑΡΑ, ΘΡΑΣΟΣ, ΕΙΡΗΝΗ etc. for a Greek language reader, and the words СОБАКА, ЯБЛОКО, ГИТАРА etc. for a Russian language reader. The overall size of the library would then be reduced by a factor of 100!, a number approximately equal to 10^158. The size ratio between the resulting library and the Universal Library is much smaller than the one between an atom and the whole universe.
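What the nine example words share with the symbol sequence is their pattern of repeated characters: six characters, with the fourth and sixth identical and all others distinct. A minimal sketch of that check follows. The helper name `pattern` is ours.

```python
import math

def pattern(word):
    """Map a word to its pattern of repeats, e.g. 'ABUSES' -> (0, 1, 2, 3, 4, 3)."""
    first_seen = {}
    # Each new character gets the next unused index; repeats reuse their index.
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word)

target = pattern("&@#$*$")
words = ["ABUSES", "IMPEDE", "SCORER",
         "ΣΟΒΑΡΑ", "ΘΡΑΣΟΣ", "ΕΙΡΗΝΗ",
         "СОБАКА", "ЯБЛОКО", "ГИТАРА"]

print(all(pattern(w) == target for w in words))   # True
print(pattern("LETTER") == target)                # False
print(len(str(math.factorial(100))))              # 158 digits, so 100! is just under 10^158
```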

The Library of Babel

Jorge Luis Borges

The short story “The Library of Babel” (1941) by the famous Argentine author Jorge Luis Borges (1899 – 1986) was based on Lasswitz’s “Die Universalbibliothek” and gave the idea of the library of everything a rather metaphysical twist. Using first person narration, Borges describes a universe (which, as the story mentions, some call “the Library”) occupied by planes of interlocking hexagonal rooms, connected to the other planes by spiral stairways that “sink abysmally and soar upwards to remote distances”. Each room is equipped with four walls of bookshelves and the absolutely elementary necessities for human sustenance. The inhabitants of this universe are people plagued by the meaninglessness and the immensity of the library surrounding them, a collection of books similar to the “Universalbibliothek”. Borges assumes a book format of 410 pages, with 40 lines on each page and 80 characters in each line, each character selected from 25 possible typographical symbols. The Library of Babel, comprising 25^1312000 books, a number approximately equal to 1,95X10^1834097, is therefore vastly smaller than Lasswitz’s Universal Library, yet still vastly bigger than the observable universe. At some point, the unnamed narrator discusses a mind-numbing possibility: “On some shelf in some hexagon (men reasoned) there must exist a book which is the formula and perfect compendium of all the rest: some librarian has gone through it and he is analogous to a god […] Many wandered in search of Him. For a century they have exhausted in vain the most varied areas. How could one locate the venerated and secret hexagon which housed Him? Someone proposed a regressive method: To locate book A, consult first book B which indicates A’s position; to locate book B consult first a book C, and so on to infinity… In adventures such as these, I have squandered and wasted my years”.
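The size of Borges's library can be estimated the same way as Lasswitz's, by working with the base-10 logarithm of 25^1312000 rather than the number itself. This is a rough sketch, and the variable names are ours.

```python
import math

chars_per_book = 410 * 40 * 80                 # pages x lines x characters
log_books = chars_per_book * math.log10(25)    # log10 of 25**1312000
exponent = math.floor(log_books)
mantissa = 10 ** (log_books - exponent)

print(chars_per_book)                  # 1312000
print(exponent)                        # 1834097
print(f"{mantissa:.3f}")               # about 1.956
# Lasswitz's 10^2000000 books exceed Borges's by roughly this power of ten:
print(round(2_000_000 - log_books))    # 165903
```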

A total book

The idea of a universal index, a catalogue of all books, would first require assigning a number to each book, corresponding to its position in the library. Thus, a one-to-one correspondence between the 25^1312000 books and numbers should be established. Any book indicating the position of another should provide enough printing space to accommodate up to 1834098 decimal digits, far more than the 1312000 characters provided by the 410-page book format. Therefore, a supposed index written in decimal notation proves insufficient to indicate the position of even a single book in the library, let alone the positions of all the rest. In any case, Borges does not allow numerical digits among the 25 accepted symbols of his Library of Babel. However, these 25 symbols could be thought of as representing the digits of a base-25 numerical system, i.e. a system expressing numbers using powers of 25, instead of the powers of 10 used by the ordinary base-10 system. The maximum number that could fit within a book of the library would then be

24X25^0+24X25^1+24X25^2+…+24X25^1311999

exactly equal to 25^1312000-1. If the first book of the library is numbered as “zero”, then the base-25 system makes certain that each and every book of the library indicates the position of exactly one book and that two different books cannot point to the same book. A book A in the library points to a book B, which in turn points to a book C and so on. Thus the “regressive method” described in “The Library of Babel” may indeed lead to adventures that could have one waste and squander whole eons. However, because no two books point to the same book, every such chain must sooner or later return to the book it started from. Depending on how the volumes happen to be shelved, some books would point to themselves, forming in this way dark spots, inaccessible from outside. In other cases, a book A would point to a book B, which in turn would point to A again: our brief adventure in the library would then be periodic. It is easy to imagine much longer chains of periodic adventures. Similar conclusions can be inferred for Lasswitz’s “Universalbibliothek”.
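The regressive method can be tried out on a toy library. The sketch below is only an illustration under assumptions of ours: three symbols and three-character books stand in for Borges's 25 symbols and 1312000 characters, and the shelving order is a seeded random shuffle, since the story specifies none. The names `book_to_number` and `regressive_walk` are hypothetical.

```python
import itertools
import random

ALPHABET = "abc"     # toy stand-in for the 25 symbols
BOOK_LENGTH = 3      # toy stand-in for 1312000 characters

def book_to_number(book):
    """Read a book as a base-len(ALPHABET) numeral, most significant symbol first."""
    number = 0
    for ch in book:
        number = number * len(ALPHABET) + ALPHABET.index(ch)
    return number

# Shelve all 3**3 = 27 books in an arbitrary order; positions start at zero.
shelf = ["".join(p) for p in itertools.product(ALPHABET, repeat=BOOK_LENGTH)]
random.Random(0).shuffle(shelf)

def regressive_walk(start):
    """Follow position -> book shelved there -> position it names, until it repeats."""
    visited, position = [], start
    while position not in visited:
        visited.append(position)
        position = book_to_number(shelf[position])
    return visited

loops = {frozenset(regressive_walk(p)) for p in range(len(shelf))}
print(book_to_number("ccc"))             # 26, the largest position: 3**3 - 1
print(sorted(len(c) for c in loops))     # loop lengths for this particular shelving
print(sum(len(c) for c in loops))        # 27: every book lies on exactly one loop
```

Which loops appear, and whether any of them has length one, depends entirely on the shelving order chosen.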

Obviously, in such a library a single book cannot become the catalogue of all the rest, not even by the regressive method: a chain of references can never leave its own closed loop, so the dark spots and loops lying outside it remain inaccessible. In other words, the only complete index of the library of everything is the library itself.

According to the narrator in “The Library of Babel”, this impossible one-book index, the universal catalogue, would be the only justification of the mind-bending library. “It does not seem unlikely to me”, he says, “that there is a total book on some shelf of the universe; I pray to the unknown gods that a man – just one, even though it were thousands of years ago! – may have examined and read it. If honor and wisdom and happiness are not for me, let them be for others. Let heaven exist, though my place be in hell. Let me be outraged and annihilated, but for one instant, in one being, let Your enormous Library be justified”.