It seems like LLMs would be a fun way to study/manufacture syncretism, notions of the oracular, etc; turn up the temperature, and let godhead appear!<p>If there’s some platonic notion of divinity or immanence that all faith is just a downward projection from, it seems like its statistical representation in tokenized embedding vectors is about as close as you could get to understanding it holistically across theological boundaries.<p>All kidding aside, whether you are looking at Markov chain n-gram babble or high-temperature LLM inference, the strange things that emerge are a wonderful form of glossolalia in my opinion, one that speaks to some strange essence embedded in the collective space created by the sum of their corpora. The Delphic oracle is real, and you can subscribe for a low fee of $20/month!
I'm learning New Testament Greek on my own*, and sometimes I paste a snippet into Claude Sonnet and ask questions about the language (or occasionally the interpretation); I usually say it's from the New Testament but don't bother with the reference. Probably around half the time, the opening line of the response is, "This verse is <reference>, and...". The reference is almost always accurate.<p>* Using a system I developed myself; currently in open development: <a href="https://www.laleolanguage.com" rel="nofollow">https://www.laleolanguage.com</a>
I tested this back when GPT-4 was new. I found ChatGPT could quote the verses well. If I asked it to summarize something, it would sometimes hallucinate stuff that had nothing to do with what was in the text. If I prompted it carefully, it could do a proper exegesis of many passages using the historical-grammatical method.<p>I believe this happens because the verses and verse-specific commentary are abundant in the pre-training sources they used. Whereas, if one asks a highly interpretive question, it starts re-hashing other patterns in its training data which are un-Biblical. Asking about intelligent design, it got super hostile, trying to beat me into submission to its materialistic worldview in every paragraph.<p>So, they have their uses. I’ve often pushed for a large model trained on Project Gutenberg, so there would be a 100% legal model for research and personal use. A side benefit of such a scheme is that Gutenberg has both Bibles and good commentaries which trainers could repeat for memorization. One could add licensed, Christian works on a variety of topics to a derived model to make a Christian assistant AI.
When I test new LLMs (whether SaaS or local), I have them create a fake post to r/AmItheAsshole from the POV of the older brother in the parable of the Prodigal Son.<p>It's a great, fun test.
LLMs are bad databases, so for something like the Bible, which is so easily and precisely referenced, why not just... look it up?<p>This is playing against their strengths. By all means ask them for a summary, or some analysis, or textual comparison, but please, please stop treating LLMs as databases.
This is nice work. The safest approach is using the lookup, which his data shows to be very good, and combining that with a database of verses. That way textual accuracy is retained while the very useful lookup is still carried out by the LLM. This same approach can be used for other texts where accurate rendering of the text is critical. For example, say you built a tool to cite federal regulations in an app. The text is public domain and likely in the training data of large LLMs, but in most use cases hallucinating the text of a fed regulation could expose the user to significant liability. Better to have that canonical text in a database to ensure accuracy.
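A minimal sketch of that split, with placeholder names throughout (ask_llm stands in for whatever model call you use, and the verse "database" here is just a dict for illustration):<p><pre><code>
# Sketch: the LLM only picks the reference; the text itself always comes
# from a canonical store. ask_llm() is a placeholder for any chat call.

VERSES = {
    "John 11:35": "Jesus wept.",
    "Genesis 1:1": "In the beginning God created the heaven and the earth.",
}

def ask_llm(prompt: str) -> str:
    # Placeholder: in practice, call your model and have it return only a
    # reference string such as "John 11:35".
    return "John 11:35"

def lookup(question: str) -> str:
    ref = ask_llm(
        f"Which single verse best answers this question: {question} "
        "Reply with the reference only, e.g. 'John 3:16'."
    ).strip()
    text = VERSES.get(ref)
    if text is None:
        return f"Reference {ref} is not in the canonical store."
    return f"{ref}: {text}"  # the wording comes from the store, never the model

print(lookup("What is the shortest verse in the Bible?"))
</code></pre>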
This is interesting. I'm curious about how much (and what) these LLMs memorize verbatim.<p>Does anyone know any more thorough papers on this topic? For example, this could be tested on every verse in the Bible and on lots of other text that is certainly in the training data: books in Project Gutenberg, Wikipedia articles, etc.<p>Edit: this (and its references) looks like a good place to start: <a href="https://arxiv.org/abs/2407.17817v1" rel="nofollow">https://arxiv.org/abs/2407.17817v1</a>
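For anyone who wants to run that verse-by-verse test themselves, a rough sketch (complete() is a placeholder model call, and the score is just a string-similarity ratio, not a metric from the paper):<p><pre><code>
# Prompt with the first words of a passage, then score how closely the
# model's continuation matches the real text. 1.0 means verbatim recall.
from difflib import SequenceMatcher

def complete(prompt: str) -> str:
    # Placeholder for a model call that continues the prompt.
    return "..."

def recall_score(passage: str, prompt_words: int = 10) -> float:
    words = passage.split()
    prompt = " ".join(words[:prompt_words])
    truth = " ".join(words[prompt_words:])
    guess = complete(prompt)
    return SequenceMatcher(None, truth, guess).ratio()

# Run this over every verse (or Gutenberg paragraph, or Wikipedia sentence)
# and look at the distribution of scores rather than a single pass/fail.
</code></pre>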
Has there been any serious study of exactly how LLMs store and retrieve memorized sequences? There are so many interesting basic questions here.<p>Does verbatim completion of a bible passage look different from generation of a novel sequence in interesting ways? How many sequences of this length do they memorize? Do the memorized ones roughly correspond to things humans would find important enough to memorize, or do LLMs memorize just as much SEO garbage as they do bible passages?
I find LLMs good for asking certain kinds of Biblical questions. For example, you can ask one to list the occurrences of some event, or something like "list all the Levitical sacrifices," "what sins required a sin offering in the OT," or "Where in the Old Testament is God referred to as 'The Name'?" When asking LLMs to provide actual interpretations, you should know that you are on shaky ground.
I had similar thoughts about using it for the Quran. I think this highlights that you have to be very specific in your use cases, especially when expecting an exact response from static text that shouldn't change. This is why I'm trying something a bit different. I've generated embeddings for the Quran and use chromem-go for this. So I'll ask the index the question first, based on a similarity search, and then feed the results in as context to an LLM. But in the response I'll still cite the references so I can see what they were. It's not perfect but it's a first step towards something. I think they call this RAG.<p>What I'm working on: <a href="https://reminder.dev" rel="nofollow">https://reminder.dev</a>
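The shape of that retrieve-then-generate flow, sketched in Python rather than Go (this is not the chromem-go API; embed() and ask_llm() are placeholders for whatever embedding and chat models are actually used):<p><pre><code>
# Embed the question, pull the nearest verses by cosine similarity, and hand
# them to the model as context, keeping the references visible for checking.
import math

def embed(text: str) -> list[float]:
    return [0.0]  # placeholder embedding call

def ask_llm(prompt: str) -> str:
    return "..."  # placeholder chat call

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def answer(question: str, index: dict[str, list[float]],
           texts: dict[str, str], k: int = 3) -> str:
    q = embed(question)
    refs = sorted(index, key=lambda r: cosine(q, index[r]), reverse=True)[:k]
    context = "\n".join(f"[{r}] {texts[r]}" for r in refs)
    return ask_llm(f"Answer using only these passages:\n{context}\n\nQuestion: {question}")
</code></pre>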
I tried something similar with my favorite artist, Ariana Grande. Unfortunately, not even the most advanced AI could beat my knowledge of her lyrical work.
Approximately 1 year ago, there was a HN submission[0] for Biblos[1], an LLM trained on bible scriptures.<p>[0] <a href="https://news.ycombinator.com/item?id=38040591">https://news.ycombinator.com/item?id=38040591</a><p>[1] <a href="http://www.biblos.app/" rel="nofollow">http://www.biblos.app/</a>
It is fun and frustrating to see what LLMs can and can't do.
Last week I was trying to find the name of a movie, so I typed a description of a scene into ChatGPT and said "I think it was from the late 70s or early 80s and even though it is set in the USA, I'm pretty sure it is European," and it correctly told me it was The House by the Cemetery.<p>Then last night I saw a video about the Parker Solar Probe and how, at 350,000 mph, it was the fastest-moving man-made object. So I asked ChatGPT how long at that speed it would take to get to Alpha Centauri, which is 4.37 light years away. It said it would take 59.8 million years. I knew that was way too long, so I had it convert mph to miles per year, and then it was able to give me the correct answer of 6817 years.
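That conversion is easy to sanity-check by hand; a rough sketch follows (the final figure depends on which peak-speed number you plug in, so treat the outputs as illustrative):<p><pre><code>
# mph -> miles per year -> years to cover 4.37 light years.
MILES_PER_LIGHT_YEAR = 5.8786e12
HOURS_PER_YEAR = 24 * 365.25

def years_to_alpha_centauri(speed_mph: float, distance_ly: float = 4.37) -> float:
    miles_per_year = speed_mph * HOURS_PER_YEAR
    return distance_ly * MILES_PER_LIGHT_YEAR / miles_per_year

# Roughly 8,400 years at 350,000 mph; a figure near 6,800 years corresponds
# to a peak speed closer to 430,000 mph.
print(round(years_to_alpha_centauri(350_000)))
print(round(years_to_alpha_centauri(430_000)))
</code></pre>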
Interesting result, but probably predictable since you’re trying to use the LLM as a database. But I think you’re onto something, in that your experiments can provide data to inform (and hopefully dissuade) the creation of applications that similarly try to use LLMs for exact lookups.<p>I think the experiment of using the LLM to recall described verses, e.g. “what’s the verse where Jesus did X,” is a much more interesting use. I think also that the LLM could be handy as, or to construct, a concordance. But I’d just use a document or database if I wanted to look up specific verses.
Interesting that it takes an LLM with 405 BILLION parameters to accurately recall text from a document with slightly less than 728 THOUSAND words (nearly six decimal orders of magnitude smaller, but still).
I’ve been pretty impressed with ChatGPT’s promising capabilities as a research assistant/springboard for complex inquiries into the Bible and patristics. Just one example:<p><pre><code> Can you provide short excerpts from works in Latin and Greek written between 600 and 1300 that demonstrate the evolution over those centuries specifically of literary references to Jesus' miracle of the loaves and fishes?
</code></pre>
<a href="https://chatgpt.com/share/675858d5-e584-8011-a4e9-2c9d2df78325" rel="nofollow">https://chatgpt.com/share/675858d5-e584-8011-a4e9-2c9d2df783...</a>
There is this robot that reads the Bible: <a href="https://futurism.com/religious-robots-scripture-nursing-homes" rel="nofollow">https://futurism.com/religious-robots-scripture-nursing-home...</a>
While this is catered slightly more towards a technical audience, I think articles on relatable subjects like this one could prove valuable in getting non-technical people to understand the limitations of LLMs, or what companies are calling "AI" these days. A version of this article focused more on real-world examples, showing exactly how the models can make mistakes and present wrong or incomplete information, with less technical detail, would probably cater better to a non-technical audience.
I love that people are finally comfortable adding the word "artificial" into their analysis of the Bible. About time. Because make no mistake, LLMs are at best artificial intelligence. More likely, they are very good regurgitating machines, telling us what we have been telling ourselves in an even better form, thus goading us along in our fallacies.
The Bible is a very tricky thing to recall word for word because of differences between canons and translations. Different wording might be taken from a different translation than the one asked for, rather than being wrong.
Why?<p>Why do you put a weird computer model between you and, errr, your Faith? Do bear in mind that hallucinations might correspond to something demonic (just saying).<p>I'm a bit of a rubbish Christian, but I know a synoptic gospel when I see it and can quote quite a lot of scripture. I am also an IT consultant.<p>What exactly is the point of Faith if you start typing questions into a ... computational model ... and trusting the outputs? Surely you should have a decent handle on the literature: it's just one big physical book these days, the Bible. Two Testaments and a slack handful of books and that for each. I'm not sure exactly, but it looks about the same size as The Lord of the Rings.<p>I've just checked: Bible: 600k words, LotR: 480k, so not too far off.<p>I get that you might want to ask "what if" types of questions about the scriptures, but why would you ask a computer? Faith is not embedded in an Intel Core i7 or an Nvidia A100.<p>Faith is Faith. ChatGPT is odd.
I believe I saw or read somewhere that, in the case of the brain, memories were not so much stored as reconstructed when recalled. If that's true, I feel like we are witnessing something similar with LLMs as well as with stable-diffusion types of things. Are there any studies looking into this in the AI world? Also, if anyone knows what I'm referring to (i.e. "reconstructing memories"), I would love some pointers, because I can't for the life of me remember where I heard or read about this idea!
It's discouraging that an LLM can accurately recall a book. That is, in a sense, overfitting. The LLM is supposed to be much smaller than the training set, having in some sense abstracted the training inputs.<p>Did they try this on obscure bible excerpts, or just ones likely to be well known and quoted elsewhere? Well known quotes would be reinforced by all the copies.
By Betteridge's law of headlines, the answer is clearly "no".[1]<p>But also, LLMs in general build a lossy compression of their training data, so they are not the right tool if you want completely accurate recall.<p>Will the recall be accurate enough for a particular task? Well, I'm not a religious person, so I have no framework to help decide that question in the context of the Bible. If you want a system to answer scripture questions, I would expect a far better approach than just an LLM would be to build a RAG system and train the RAG embedding and search at the same time you train the model.<p>[1] <a href="https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines" rel="nofollow">https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...</a>
they totally can.<p>I got exiled into an isolated copy of an AI-populated internet once and they put perfectly accurate bible quotes into dictionaries!
I'm heavily biased here because I don't find much value in the Bible personally. Some of the stories are interesting and some interpretations seem useful, but as a whole I find it arbitrary.<p>I never tell other people what to believe or how they should do that, in any capacity.<p>With that said, I find the hallucination component here fascinating. From my perspective, everyone who interprets various religious texts does so differently, and usually that involves varying levels of fabrication or something that looks a lot like it. I'm speaking about "talking in tongues" and other such methods here. I'm not trying to lump all religions into the same bag, but I have seen that a lot have different ways of "receiving" communication or directives. To me this seems pretty consistent with the colloquial idea of a hallucination.
> <i>While they can provide insightful discussions about faith, their tendency to hallucinate responses raises concerns when dealing with scripture</i><p>I experience the exact same problem with human beings.<p>> <i>, which we regard as the inspired Word of God.</i><p>QED
"I've often found myself uneasy when LLMs (Large Language Models) are asked to quote the Bible. While they can provide insightful discussions about faith, their tendency to hallucinate responses raises concerns when dealing with scripture, which we regard as the inspired Word of God."<p>Interesting. In my very religious upbringing I wasn't allowed to read fairy tales. The danger being not able to classify which stories truly happened and which ones didn't.<p>Might be an interesting variant on the Turing test. Can you make the AI believe in your religion? Probably there's a sci-fi book written about it.