ML 101: Do not evaluate on the training data.<p>Yes, of course it can, because the books fit in the context window. But this is an awful test of the model's capabilities, because it was certainly trained on these books and on websites discussing the books and the HP universe.
How much of that character map is already in its training data, and how much is actually derived from the input prompt?<p>I’m always suspicious of these kinds of tests. A test like this needs to be run with an unpublished book, not one of the most popular series of the 21st century.
Not sure about all of the Harry Potter books, but I gave it my entire data export from ChatGPT and it handled it very well. I was able to search through it and pick up past conversations again. It was good.
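If anyone wants to try the same thing, here's a rough sketch of how I'd flatten the export into one big prompt. This assumes the conversations.json layout my export used; yours may differ:

    import json

    # conversations.json sits at the top level of the ChatGPT export zip
    with open("conversations.json") as f:
        conversations = json.load(f)

    chunks = []
    for conv in conversations:
        chunks.append(f"# {conv.get('title') or 'untitled'}")
        # 'mapping' is a tree keyed by node id; iterating its values is
        # not guaranteed to be chronological, but it's fine for search
        for node in conv.get("mapping", {}).values():
            msg = node.get("message")
            if not msg:
                continue
            parts = (msg.get("content") or {}).get("parts") or []
            text = " ".join(p for p in parts if isinstance(p, str)).strip()
            if text:
                chunks.append(f"{msg['author']['role']}: {text}")

    dump = "\n".join(chunks)
    print(len(dump) // 4, "tokens, very roughly")  # ~4 chars per token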
So, according to Gemini pricing, the call would cost approx. $11.
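Back of the envelope, assuming the ~$7 per 1M input tokens that Gemini 1.5 Pro's long-context tier cost at the time (rates change, so check current pricing):

    input_tokens = 1_600_000   # all seven books, per the original post
    usd_per_1m_input = 7.00    # assumed long-context input rate
    print(f"${input_tokens / 1e6 * usd_per_1m_input:.2f}")  # -> $11.20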
Now, hopefully all goes to plan, the input is correct, and the result is what you wished for. If not, how many $11 calls do you need?
Sure, pricing will go down, but my observation is that people just ignore the cost of context. When it's all about the tech, that's fine, but not when it's about efficiency.
> <i>All the books have ~1M words (1.6M tokens). Gemini fits about 5.7 books out of 7. I used it to generate a graph of the characters and it CRUSHED it.</i><p>An LLM could read all of the books with <i>Infini-attention</i> (2024-04):
<a href="https://news.ycombinator.com/item?id=40001626#40020560">https://news.ycombinator.com/item?id=40001626#40020560</a>
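Roughly, the trick is a fixed-size compressive memory that each segment is folded into via a linear-attention update, blended with ordinary local attention. A toy single-head sketch of my reading of the paper (simplified; not the authors' code):

    import numpy as np

    def elu1(x):                       # sigma(x) = ELU(x) + 1, as in the paper
        return np.where(x > 0, x + 1.0, np.exp(x))

    d = 64                             # head dimension (toy size)
    M = np.zeros((d, d))               # compressive memory, fixed size
    z = np.zeros(d)                    # memory normalization term

    def infini_segment(Q, K, V, M, z):
        # 1) retrieve what earlier segments wrote into memory
        sq = elu1(Q)
        A_mem = (sq @ M) / (sq @ z + 1e-6)[:, None]
        # 2) fold this segment into memory (linear-attention update)
        sk = elu1(K)
        M = M + sk.T @ V
        z = z + sk.sum(axis=0)
        # 3) ordinary softmax attention within the segment
        #    (causal masking omitted for brevity)
        s = Q @ K.T / np.sqrt(d)
        w = np.exp(s - s.max(axis=1, keepdims=True))
        A_dot = (w / w.sum(axis=1, keepdims=True)) @ V
        # 4) learned gate blends memory retrieval with local attention
        beta = 0.5                     # stand-in for sigmoid(learned scalar)
        return beta * A_mem + (1 - beta) * A_dot, M, z

    # arbitrarily long input streams through in fixed-size segments,
    # while M and z stay O(d^2) no matter how many books you feed in
    for _ in range(10):
        Q, K, V = (np.random.randn(128, d) for _ in range(3))
        out, M, z = infini_segment(Q, K, V, M, z)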
OK, so my next question is: what can you do with a model loaded with Harry Potter context? Answer Harry Potter trivia at a superhuman level? Write the next Harry Potter adventure?<p>Having used GPTs for creative writing, I can report that they are good for solving the tyranny of the blank page, but then you have to read and edit hundreds of pages of dank AI prose, which never quite aligns with your creative vision, to harvest a few nuggets of creativity. Does it end up saving any time?
I can’t see how this map would be useful to anyone. While it gets some of the relationships right, it has a bunch of unneeded detail and focuses on areas not crucial to the stories.<p>At a surface level, LLMs wow, but when you dig into the details there are often still huge gaps in output quality for many tasks.
It would be more impressive (and cleaner, btw) if it were fed fan-fiction books rather than the originals. Then we could see what it actually builds from the context and what it "borrows" from the training data.<p>Why fan-fiction? Well, fan-fiction isn't famous enough to be included in any training corpus, I believe, but Harry Potter fan-fiction is voluminous enough to test the context limit. It also has both similarities to and departures from the originals, which a model would need genuine recall of the prompt to tell apart. That would be a good test, wouldn't it?
Shouldn't the title be rephrased to not be clickbait?<p>I refuse to even read it because clickbait makes me sad, but something like "Gemini 1.5 can read all the HP books at once" would be a more appropriate title for this forum, imo.
FWIW, I actually think this is pretty cool.<p>People created a map of all the Star Wars characters manually years ago. Being able to see all the characters mapped out from a story you’re interested in is pretty fun and helpful.