TechEcho
Show HN: Throw a Whole Book into an LLM to Extract Characters and Relationships

9 points by msuvakov 4 months ago

Let's try a small experiment with LLMs that have a large context length: feed an entire book into the context window and ask it to generate a list of characters, their relationships, and physical descriptions, data that can later be used for image generation.

In this repository, you can find two tools: a script that extracts data from book text using an LLM (Gemini or OpenRouter API) and an HTML/JS (D3) visualization of the character graph. An external text-to-image model can be used to generate character illustrations (a Google Colab example is provided).

Explore the visualizations, play with the script, and feel free to add more books to the repo via pull requests!
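The extraction step described in the post can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the prompt wording, the JSON keys (`characters`, `relationships`, `source`, `target`, `type`), and the helper names `build_prompt` and `to_graph` are all assumptions made for the example, and the LLM call itself is left out. The output uses the `nodes`/`links` layout that D3 force-directed graphs commonly consume.

```python
import json

# Hypothetical prompt and schema; the repository's real prompt and
# output format may differ.
PROMPT_TEMPLATE = (
    "Read the book below and return JSON with two keys:\n"
    '"characters": a list of objects with "name" and "description";\n'
    '"relationships": a list of objects with "source", "target", and "type".\n\n'
    "BOOK:\n{book}"
)


def build_prompt(book_text: str) -> str:
    """Place the full book text into a single long-context prompt."""
    return PROMPT_TEMPLATE.format(book=book_text)


def to_graph(llm_json: str) -> dict:
    """Convert the model's JSON reply into a nodes/links dict.

    Relationships that point at a character missing from the character
    list are dropped, so the graph never contains dangling edges.
    """
    data = json.loads(llm_json)
    nodes = [
        {"id": c["name"], "description": c.get("description", "")}
        for c in data.get("characters", [])
    ]
    known = {node["id"] for node in nodes}
    links = [
        {"source": r["source"], "target": r["target"], "type": r.get("type", "")}
        for r in data.get("relationships", [])
        if r["source"] in known and r["target"] in known
    ]
    return {"nodes": nodes, "links": links}
```

In this sketch, the string returned by `build_prompt` would be sent to whichever model API is in use, and the reply passed to `to_graph` before being handed to the visualization.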

2 comments

kodachi 3 months ago
I'd like to know the names of all the ladies the character "Chinaski" fucked in the book "Women".

Should be more than 20 at least.
eesmith 4 months ago
I'm trying to understand how to judge the quality of the result, that is, to better quantify what "relatively accurate character identification and relationship mapping" means.

For example, in the "The Adventures of Tom Sawyer" example, I see Bob Tanner connected to Huckleberry Finn but to no one else.

Is that supposed to be significant? I pulled up the source text from https://www.gutenberg.org/cache/epub/74/pg74-images.html :

The first occurrence is an exchange starting:

    Tom hailed the romantic outcast: “Hello, Huckleberry!” ... “No, I hain’t. But Bob Tanner did.”

The last line is spoken by Huckleberry. It is clear that both kids know who Bob Tanner is, because Tom mentions "he’s the wartiest boy in this town". (The image prompt says "might be holding a bean or have warts on his hands", but the bean is the method that Tom and Huck use.)

The other context is:

> “Well, I have too,” said Tom; “oh, hundreds of times. Once down by the slaughter-house. Don’t you remember, Huck? Bob Tanner was there, and Johnny Miller, and Jeff Thatcher, when I said it. Don’t you remember, Huck, ’bout me saying that?”

So what does it mean that Huck has a connection to Bob but Tom does not, when the connection seems equally strong in the text?

Or, we see in the graph that "Bull Harbison" is "a dog that howls outside the tannery", which isn't correct. Tom thinks it's Bull, but after another howl they realize it's actually a stray.

Why is Mr. Jones, "the Welshman", referred to as "old man"?

Why is Mrs. Thatcher not listed? Or the Rev. Mr. Sprague, the "Useful Minister" as chapter V's title describes him?

There are also some characters mentioned only once, like "Mr. Benton, an actual United States Senator" and "Major and Mrs. Ward; lawyer Riverson", who are not on the graph, while names like Benny Taylor ("Benny Taylor’s little wagon") and Jimmy Hodges ("he more than half envied Jimmy Hodges, so lately released") are in the graph. Why?

And there's "the cat" just hanging about with no connection.

Also, there should be no link from Jimmy to Huck, as it's not in the text, and we don't know if Benny or Jimmy are young boys, as characterized in the bio.

The graph shows a connection between Uncle Jake and Jim ("Friends") which doesn't exist in the book, where Jake is mentioned two times in a single paragraph. Is there a built-in assumption in the model that the two named Black characters in the book, both slaves, would be friends?

It says that Mary and Aunt Polly are niece/aunt respectively, but I don't see that in the text. We know that Mary is Tom's cousin, and Polly is Tom's aunt, but we don't know the relationship between Mary and Polly. Could they be mother/daughter? https://en.wikipedia.org/wiki/List_of_Tom_Sawyer_characters#Mary says it's never specified.

It seems like it would be a lot of work to verify both that the generated data is correct and that nothing is missing. Does it really save time and effort?

To be sure, these are small parts of the book, but then again, Tom Sawyer is one of the most analyzed books in the American canon, so there should be a lot of examples in the corpus describing relations between the main characters.
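Part of the manual checking described in this comment could be roughed out in code. The sketch below is only an assumption about how one might do it, not something from the repository: the function names are hypothetical, names are matched as literal substrings (so "Huck" and "Huckleberry Finn" count as different names, and "Jim" would also match inside "Jimmy"), and sharing a paragraph is used as a crude stand-in for textual support of a relationship.

```python
import re
from itertools import combinations


def mention_counts(text: str, names: list[str]) -> dict[str, int]:
    """Count literal, case-sensitive occurrences of each name in the text."""
    return {name: len(re.findall(re.escape(name), text)) for name in names}


def cooccurring_pairs(text: str, names: list[str]) -> set[frozenset[str]]:
    """Return the name pairs that appear together in at least one paragraph."""
    pairs: set[frozenset[str]] = set()
    # Paragraphs are assumed to be separated by blank lines.
    for paragraph in re.split(r"\n\s*\n", text):
        present = [name for name in names if name in paragraph]
        pairs.update(frozenset(pair) for pair in combinations(present, 2))
    return pairs


def unsupported_links(
    text: str, names: list[str], links: list[tuple[str, str]]
) -> list[tuple[str, str]]:
    """List graph links whose two endpoints never share a paragraph."""
    supported = cooccurring_pairs(text, names)
    return [(a, b) for a, b in links if frozenset((a, b)) not in supported]
```

A link flagged by `unsupported_links` is only a candidate for review: characters can be related without ever being named in the same paragraph, and two names sharing a paragraph does not prove the relationship the graph claims.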