Generating Text with Markov Chains

72 points by healeycodes over 4 years ago

9 comments

phaedrus over 4 years ago
I used to write Markov-based chat bots. Something I thought I observed, but tried and failed to show mathematically, is the possibility of well-connected neighborhoods of the graph (cliques?) leading to some outputs being more likely than their a priori likelihood in the input.

For example, once a simple Markov bot learns the input phrase "had had" it will also start to generate output phrases a human would assign 0% probability to, like "had had had" and "had had had had". This in itself isn't a violation of the principles behind the model (it would have to look at more words to distinguish these).

The question is whether more complicated loops among related words can create "thickets" where the output generation can get "tangled up" and generate improbable output at a slightly higher rate than the formal analysis says a Markov model of that order should for the given input frequencies. An example of such a thicket would be something like "have to have had had to have had".

Essentially, I'm hypothesizing that the percentage values of the weighted probabilities of transitions do not tell the whole story, because the high-level structure of the graph has an add-on effect. A weaker hypothesis is that the state space of Markov models contains such pathological examples, but that these states are not reachable by normal learning.

Unfortunately I lacked the mathematical chops/expertise to formalize these ideas myself, and I don't personally know anyone who could help me explore them.
Comment #26006637 not loaded
Comment #26006672 not loaded
Comment #26010824 not loaded
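The "had had" effect phaedrus describes is easy to reproduce. Here is a minimal first-order word chain in Python; the toy corpus is invented for illustration:

    import random
    from collections import defaultdict

    # Build a first-order word chain from a tiny corpus containing "had had".
    corpus = "she had had a long day and she had a nap".split()
    chain = defaultdict(list)
    for cur, nxt in zip(corpus, corpus[1:]):
        chain[cur].append(nxt)

    # chain["had"] is ["had", "a", "a"]: once the walk enters "had" it can
    # re-enter "had" and emit "had had had ...", which never occurs in the input.
    word, out = "had", ["had"]
    for _ in range(6):
        nexts = chain[word]
        if not nexts:  # dead end at the corpus's final word
            break
        word = random.choice(nexts)
        out.append(word)
    print(" ".join(out))

With one word of state, the probability of "had" following "had" here is 1/3, so runs of three or more "had"s appear with positive probability even though the training text never contains them; distinguishing those runs would require a higher-order model, as the comment notes.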
spiderxxxx over 4 years ago
I've done something similar without first learning about Markov chains. One of my more interesting experiments was creating messed-up laws: I fed it the Constitution and Alice in Wonderland, and it made the most surreal laws. The great thing about them is they don't need to know about language. You could make one to create images, another to create names for cities. I made one to create 'pronounceable passwords'. It took the top 1000 words of a dictionary, and then it would spit out things which could potentially be words, of any length. Of course, the pronounceability of a word like Shestatond is debatable.
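spiderxxxx's generator isn't shown, but the 'pronounceable passwords' idea maps naturally onto a character-level chain trained on a word list. A sketch in the same spirit, with a handful of seed words standing in for the top 1000:

    import random
    from collections import defaultdict

    # Character-level chain: learn which letter follows which, with ^ and $
    # marking word boundaries. The word list is a tiny stand-in for a real one.
    words = ["station", "shelter", "nation", "test", "best", "rest", "stand"]
    chain = defaultdict(list)
    for w in words:
        padded = "^" + w + "$"
        for cur, nxt in zip(padded, padded[1:]):
            chain[cur].append(nxt)

    def pseudo_word():
        c, out = "^", []
        while True:
            c = random.choice(chain[c])
            if c == "$":
                return "".join(out)
            out.append(c)

    print(pseudo_word())  # letter sequences that could plausibly be words

Lengths vary because the walk only stops when it happens to reach the end marker, which matches the "of any length" behavior described above.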
not2b over 4 years ago
I published a program to do basically this on Usenet in 1987.

Someone created a fixed version on GitHub at https://github.com/cheako/markov3
Comment #26005839 not loaded
zhengyi13 over 4 years ago
This is a neat introduction to the subject.

If you want to see more of what they can do, and you have had any exposure to Chomsky, you might also appreciate https://rubberducky.org/cgi-bin/chomsky.pl
hermitcrab over 4 years ago
I wrote 'Bloviate' to mess around with Markov chains. From Goldilocks and the Three Bears it produces gems such as:

“Someone’s been sitting my porridge,” said the bedroom.

You can download it here: https://successfulsoftware.net/2019/04/02/bloviate/
monokai_nl over 4 years ago
8 years ago I implemented something like this for your tweets at https://thatcan.be/my/next/tweet/

It still causes a spike of traffic every now and then from Twitter.
mkaic over 4 years ago
Thank you for this clear and instructive write-up! I appreciate the high-level, concise breakdown.
inbx0 over 4 years ago
For the Finns reading this, there's a Twitter bot "ylekov" [1] that combines headlines from different Finnish news outlets using Markov chains. Sometimes they come out pretty funny:

> Suora lähetys virkaanastujaisista – ainakin kaksi tonnia kokaiinia

("Live broadcast of the inauguration – at least two tonnes of cocaine")

[1]: https://twitter.com/ylekov_uutiset
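The headline mash-up trick is just one chain trained on several corpora at once: wherever two outlets share a word, the walk can cross from one headline into another. A rough sketch (these English headlines are invented placeholders, not real ylekov input):

    import random
    from collections import defaultdict

    # One transition table shared by headlines from "different outlets";
    # overlapping words act as crossover points between sources.
    headlines = [
        "live broadcast of the inauguration draws record crowds",
        "police seize two tonnes of cocaine at the harbor",
        "record crowds expected at the harbor festival",
    ]
    chain = defaultdict(list)
    for h in headlines:
        words = h.split() + ["$"]
        chain["^"].append(words[0])
        for cur, nxt in zip(words, words[1:]):
            chain[cur].append(nxt)

    word, out = "^", []
    while True:
        word = random.choice(chain[word])
        if word == "$":
            break
        out.append(word)
    print(" ".join(out))

A run might stitch together something like "live broadcast of cocaine at the harbor", which is the ylekov effect in miniature.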
kleer001 over 4 years ago
Yup, those are Markov chains. Not to sound snotty or mean, but... so what? Why should we be interested?

Looks kinda like you followed a tutorial and did a write-up.
Comment #26003840 not loaded
Comment #26003830 not loaded
Comment #26004505 not loaded