TechEcho

14 comments

cardigan over 9 years ago

Ooh, front page; I guess this calls for a bit of an explanation!

First off, I used this code for training the models: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

It's very easy to set up and train; I highly recommend playing around with your own training data (just a text file!).

This project's code: github.com/shariq/burgundy

I styled and deployed the website about a year ago at a hackathon; back then it used a nice wordlist of hand-picked words. (repo/wordserver/old_burgundy_words.txt)

A few days ago: I got the server to start training a bunch of models (~200) with randomized parameters, using the original wordlist as the training data. (repo/rnn/rnn.py:forever)

Yesterday: I woke up at 3 AM after my sleep schedule rolled around and started exploring the output of models trained to different numbers of epochs and run at different temperatures. I looked at the outputs subjectively, decided some model/epoch/temperature tuples were horrible, and got rid of those. I wrote a few different scoring functions (just using intuition about which kinds of bad output kept occurring) to score the model/epoch/temperature tuples. I took the top ~10 scoring tuples from each scoring function, added some other interesting ones along the way, and then used a pronunciation scoring function (repo/rnn/pronounce.py) to select the top 5 of all of these. Funnily enough, the top 5 tuples all used different models and a varying range of temperatures (i.e., not the same model from different epochs, and picking the right temperature significantly improved how well the model performed). (repo/rnn/explore.py)

Since the models would still occasionally output words which were completely unpronounceable, I put some code on top of the models which generates a bunch of words and then discards the least pronounceable third. A significant portion of generated words from these models also started with a "c" or "b" for some reason, so I gave those a high chance of being discarded. Short words were also uninteresting, and extremely long words would occasionally show up, so I added probabilistic filters for length. Finally, the initialization time of LuaJIT is very high, so I had the server keep a pool of words which gets reseeded as it runs out. (repo/rnn/rnnserver.py)

If you want to train your own word generator and you need some pointers, I would love to help: @shariq
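The post-filtering and word-pool steps described above might look roughly like the sketch below. This is not the code in repo/rnn/rnnserver.py or repo/rnn/pronounce.py: the `pronounceability` score, the keep probabilities, the length cut-offs, and the `WordPool` class are all assumptions made up for illustration, and `generate_batch` stands in for whatever slow call produces raw words from the models.

```python
import random

VOWELS = set("aeiouy")

def pronounceability(word):
    # Hypothetical stand-in score: the shorter the longest run of
    # consonants, the higher the score. Not the scoring in pronounce.py.
    longest = run = 0
    for ch in word:
        run = 0 if ch in VOWELS else run + 1
        longest = max(longest, run)
    return -longest

def keep_probability(word):
    # Illustrative numbers only; the original thresholds are not given.
    if not word:
        return 0.0
    p = 1.0
    if word[0] in "cb":   # over-represented initial letters
        p *= 0.2
    if len(word) < 5:     # short words are uninteresting
        p *= 0.3
    if len(word) > 12:    # extremely long words
        p *= 0.1
    return p

def filter_words(candidates, rng):
    # Rank by score, drop the bottom third, then apply the
    # probabilistic filters for initial letter and length.
    ranked = sorted(candidates, key=pronounceability, reverse=True)
    kept = ranked[: len(ranked) - len(ranked) // 3]
    return [w for w in kept if rng.random() < keep_probability(w)]

class WordPool:
    """Keeps filtered words on hand so requests never wait on the model."""

    def __init__(self, generate_batch, low_water=10, rng=None):
        self.generate_batch = generate_batch  # slow call into the model
        self.low_water = low_water
        self.rng = rng or random.Random()
        self.words = []

    def pop(self):
        # Reseed when the pool runs low. Assumes the generator
        # eventually yields words that survive the filters.
        while len(self.words) < self.low_water:
            self.words.extend(filter_words(self.generate_batch(), self.rng))
        return self.words.pop()
```

A server would build one `WordPool` at startup and call `pool.pop()` per request; in practice the refill would probably run in a background thread so that no request pays the model start-up cost.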

gliese1337 over 9 years ago

No "about" info? No "how it works", "how it was trained", etc.?

It seems to only generate words that match English phonotactics & spelling conventions: things that could be English words. Can it be retargeted to other languages, or to arbitrary word-shape constraints?

I am particularly interested because I've recently undertaken a survey of word-generation software for conlangers (people who create artificial languages, like Quenya or Klingon or Na'vi), and while the tools do come in widely varying degrees of sophistication, with varying degrees of built-in linguistic knowledge, there are none yet publicly available that are based on neural networks.

DanBC over 9 years ago

I got cacurine, which is less pleasant.

Is it really using recurrent neural networks, or is it using Markov chains?

jgalt212 over 9 years ago

I got carantil, which is not great, but with a small tweak it's Carancil, which is a perfectly good name for a new drug. Companies like Brand Institute charge good money for these services.

pizza over 9 years ago

These are all pretty cool. What determines how it could "improve" generated words? Perhaps a larger, "pleasant"-words-only corpus?

namuol over 9 years ago

Is there source available for http://burgundy.io:8080/ ?

argonaut over 9 years ago

The results are indistinguishable from Markov chains. There's really no need to use RNNs for everything...
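For anyone wanting to run that comparison, a character-level Markov chain generator takes only a few lines. The sketch below is a generic order-2 model, not anything from the burgundy repo; the function names and the `^`/`$` padding markers are assumptions made for illustration, and the training words are assumed not to contain those two characters.

```python
import random
from collections import defaultdict

def train_markov(words, order=2):
    # Map each `order`-character context to the list of characters
    # observed after it. "^" pads the start, "$" marks the end.
    table = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"
        for i in range(len(padded) - order):
            table[padded[i:i + order]].append(padded[i + order])
    return table

def sample_word(table, order=2, rng=random, max_len=20):
    # Walk the chain from the start context until the end marker
    # is drawn or the length cap is reached.
    state = "^" * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(table[state])
        if nxt == "$":
            break
        out.append(nxt)
        state = state[1:] + nxt
    return "".join(out)
```

Training both this and the RNN on the same wordlist and reading the outputs side by side would be one way to check whether the difference is noticeable.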

throwaway24997 over 9 years ago

I got 'mingerrot', which isn't beautiful at all; it sounds like some kind of unpleasant infection.

ogig over 9 years ago

Train it with some Tolkien appendixes and it could be a good RPG name generator.

Also, real-world usernames may be fun. You could make a Twitter username generator or something.

gregw134 over 9 years ago

Any ideas on how to generate startup names with a neural network?

jastanton over 9 years ago

Amamanus. Any word with "anus" in it isn't exactly pretty :)

smcnally over 9 years ago

vermocharen -- certainly works in some contexts. A coffee roaster, e.g.

No small feat to get even marginally euphonious words from an open, available code base.

Next up came tintilu picolera fangon

seqizz over 9 years ago

Thanks for my new hostname generator.

abrkn over 9 years ago

turdurine

Word generator using recurrent neural networks
