Did a Human or a Computer Write This?

172 points by tomek_zemla about 10 years ago

33 comments

vonnik about 10 years ago

Former reporter here who has been in professional talks with a natural-language generation firm, dabbles in NLP, and has some insight into this process.

A lot of what machines churn out is based on templates that humans have created. Many of the things we read that are supposedly written by machines required major human intervention, and the application of strict constraints on the machine so that it wouldn't screw up.

Number stories about baseball and the stock market are good for machines to write about. Most qualitative things aren't, for the moment. Even Google's results generating captions from images last November actually included a lot of variance. The algorithm was good at recognizing some things and not others.

To the commenters on this thread who criticize news organizations, I would simply ask: Do you consume your news for free? Do you believe that synthesizing complex events using multiple sources of data happens without cost? Do you believe that news organizations pay reporters much to write stories? Please reflect on your role in the news ecosystem.

Most readers today are free riders, and unwilling to subsidize quality journalism. That's one reason why it's dying.

joe_the_user about 10 years ago

What does "written by a computer" even mean?

I mean, computer-written can be anything from filling in some blanks to rendering a formula by reverse-parsing. And suppose that an apparently fancy formula actually results in blank-filling-in most of the time?

Lots of professional writers are described as following a formula. Lots of writers and non-writers construct text using word-processors and even an occasional search-and-replace.

-- Oh, and add to that "The Eliza Effect"[1], in which it's pretty easy for humans to ascribe greater meaning to some kinds of computer generated text than it really has.

[1] http://en.wikipedia.org/wiki/ELIZA_effect

hammerandtongs about 10 years ago

There is a non-zero amount of reporting that would be improved by algorithmic and model-driven story generation.

Quite a bit of reporting has become thinly veiled rewrites of press releases. Couldn't that be improved by an algorithm that actually had background context and a consistent model for a type of story, like a simple preview of a coming game?

What about the amount of statistics in reporting that lacks any context for the numbers spewed out? Why couldn't a machine do a better job of enforcing context for the numbers?

For example -- "there is a 50% murder increase in the first six months of this year" (a common problem with human reporting) vs "there is a 50% murder increase, from 2 in the first six months of last year to 3 in the first six months of this year; this is down from 20 in the previous year" (an algorithm that automatically enforces context)?

I realize this is a complex topic but wow the average ;) news story is soooo bad that...
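
A minimal Python sketch of the context-enforcing idea in the comment above: a sentence builder that refuses to state a percent change without the raw counts behind it. The function name `describe_change`, its parameters, and the exact wording are all hypothetical, not anything the commenter or a real newsroom system is known to use.

```python
def describe_change(label, current, previous, baseline=None):
    """Build a sentence that always pairs a percent change with its raw counts."""
    if previous == 0:
        # A percentage from a zero base is undefined, so report counts only.
        text = f"{label} went from 0 to {current}"
    else:
        pct = (current - previous) / previous * 100
        direction = "increase" if pct >= 0 else "decrease"
        text = (f"a {abs(pct):.0f}% {direction} in {label}, "
                f"from {previous} to {current}")
    if baseline is not None:
        # Optional longer-run comparison, so a small swing is not read as a trend.
        if current < baseline:
            trend = "down from"
        elif current > baseline:
            trend = "up from"
        else:
            trend = "level with"
        text += f"; {trend} {baseline} in the earlier period"
    return text + "."


# The numbers from the comment: 2 last year, 3 this year, 20 the year before.
print(describe_change("murders", current=3, previous=2, baseline=20))
# a 50% increase in murders, from 2 to 3; down from 20 in the earlier period.
```

The design choice is that the counts are required arguments, so the bare "50% increase" sentence simply cannot be produced.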

Udik about 10 years ago

Completely meaningless. "Written by a computer" doesn't really mean anything. What's important is the breadth and variety of the content an algorithm can generate, its ability to choose the most relevant among its input data, and the amount of meaningless content that has to be discarded by a human supervisor before publishing.

Take fragment 6: “Tuesday was a great day for W. Roberts, as the junior pitcher threw a perfect game to carry Virginia to a 2-0 victory over George Washington at Davenport Field.”

Cool. And who told the computer it was a great day for him? Who told the computer he was playing? Who told it his game was "perfect"? This sentence can be written by a professional human journalist in about ten seconds. How long does it take to input into a computer the data that make up the story, or to check among hundreds of possible variations for one that doesn't contain obvious mistakes?

Or: “Kitty couldn’t fall asleep for a long time. Her nerves were strained as two tight strings, and even a glass of hot wine, that Vronsky made her drink, did not help her. Lying in bed she kept going over and over that monstrous scene at the meadow.”

Ok, it's a novel, written by a computer. Now, anybody can write software that produces one single novel: just store it as a single string in the program and print it out. The magic happens when the computer can write something that goes way beyond the data that was stored in it exactly for that purpose. So how many different novels can this program write? Does the result considerably exceed the effort the programmers put into the program itself? Most probably not, otherwise we'd be talking about a general AI.

cowpig about 10 years ago

I got 7/10, but I think I would do better on prose with 3-4 sentences rather than a single one. Poetry is probably a bit tougher though.

I don't find computer-generated snippets that impressive--I mean, a lot of it is just pumping out variations or Markov-generated mix-and-matches of things humans wrote in the first place. More impressive would be passing the Turing Test :D

tdubhro1 about 10 years ago

Leaving aside the gimmicks in some of these examples, it is a fact that when you need to communicate quantitative information, machine-generated text is a great solution. Our company builds analytics and dashboards for fund managers and traders (the people who manage most pension funds). Infographics and charts only go so far; eventually, the user has to extract the key information, or communicate it with a colleague, which means verbalizing.

[ShowHN:] We built our report generation system (in Haskell) that can create a custom report for every portfolio or market index, and can be tuned to the user's risk profile. We've released a public version that anyone can use for free that covers most global indices and sectors:

https://apps.otastech.com/morningreport?listtype=marketindex&listid=11&apikey=5AA46437DEDE6DC9C7ABCD56DF5CB&search=true

In our experience, the turns of phrase that initially give the impression that the text has been written by a human quite quickly become irritating noise. Our users need to absorb information quickly and accurately, and comprehension is aided by adhering to a standard structure and avoiding figurative language.

hurin about 10 years ago

Some of these are clearly computer-generated in a way where the human has done most of the work. Take

“In truth, I’d love to build some verse for you
To churn such verse a billion times a day
So type a new concept for me to chew
I keep all waiting long, I hope you stay.”

Now I'm going to bet a generator algorithm does not actually comprehend the meaning of any of that - most likely the generator aspect was filling in some blanks, or picking one random phrase structure out of pre-coded structures.

nether about 10 years ago

"Hi"

Did a human or computer write this?

ridiculous_fish about 10 years ago

Here's a humorous failure mode of computer-authored articles. Zillow bought the real estate company Trulia. The website equities.com then shared its wisdom:

"Trulia Inc (TRLA) established a new 52-week low yesterday, and could be a company to watch at the open. After opening at $0.00, Trulia Inc dropped to $0.00 for a new 52-week low."

The article goes on to speculate as to whether this is a buy or sell signal. It's since been deleted but I put the text up here: http://pastebin.com/ihgWNVJU
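
One way a generator could avoid the failure described above is to sanity-check its inputs before filling in any sentence. The Python sketch below is only an illustration of that idea: the function `low_sentence`, its parameters, the ticker, and the prices are all made up, and nothing here reflects how equities.com actually produced its article.

```python
def low_sentence(ticker, open_price, low_price, prior_low):
    """Return a 52-week-low sentence, or None when the inputs look implausible."""
    prices = (open_price, low_price, prior_low)
    if any(p is None or p <= 0 for p in prices):
        # Zero or missing prices usually point to a bad or stale data feed
        # (for example, a ticker that stopped trading), not a real market move.
        return None
    if low_price >= prior_low:
        # Not actually a new low, so there is no story to write.
        return None
    return (f"{ticker} set a new 52-week low of ${low_price:.2f} "
            f"after opening at ${open_price:.2f}.")


# Hypothetical inputs: a $0.00 feed is rejected instead of being published.
print(low_sentence("XYZ", open_price=0.0, low_price=0.0, prior_low=30.0))   # None
print(low_sentence("XYZ", open_price=10.0, low_price=9.5, prior_low=9.75))
```

Returning `None` leaves the decision to skip the story, or to flag it for a human, with the caller.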

hyperion2010 about 10 years ago

I missed 3 because I have a low opinion of human punctuation. For sports and business, or most things with numbers, I assume that it is a computer, because that is the kind of data for which it is reasonably easy to write domain-specific sentence construction. For poetry, looking for whether there seems to be underlying meaning or intent that surfaces without excessive mental gymnastics also seems like a reasonable strat.

dkarapetyan about 10 years ago

I missed 3. Some of these are getting pretty good but they probably chose the best examples. I'm guessing with a bit more context it would have been much easier to disambiguate.

jerf about 10 years ago

For computer graphics, I have a standard that I use where instead of asking "Is this rendering technology 'realistic'?", as in, a binary question, I ask "At what resolution is this render indistinguishable from reality?" For instance, there's a lot of car photos and architecture renderings that use certain expensive rendering techniques that look great even at 720p, but you start getting into 1080p or above and it once again becomes clear it's a computer rendering. Other techniques may only be able to work up to 320x200 or something.

Similarly, telling whether a computer has written something or not is very challenging at this snippet size because there's hardly any room for "voice" to shine through. I actually did pretty well, but to be honest I got more mileage out of a meta-heuristic ("how is the author trying to fool me? ah, this one seems really, really human so it must be computer... yup...") than actual analysis of the text. I mean, drop those computer-generated sports sentences into the middle of a human sports column and you're not going to pick them out specially... they're facts. They fit.

However, an entire column written like that can be pretty obvious. I get some financial news from some Google Alerts on a couple of companies and it's incredibly obvious that there are computer algorithms out there that can take the daily outcome for a stock, how the market did that day, and how the entire industry did that day, and spin that into several hundred words of completely and utterly useless speculation about "why" the stock did a certain thing. (Not that it hasn't become clear to me just how shallow a lot of the "free" analysis is, but, well, in no way does outsourcing the shallow analysis job to a computer make it any better...!)

(One of them in particular that I've come to enjoy reading in an almost Dadaist sort of way really loves the phrase "The bears had a field day with..." as in, "The bears had a field day with $STOCK as it dropped 0.01% in light trading.")

Increase the sample size and you'd probably get a better sense of whether or not it is fooling you. I was going to write "and you might do better", but that's not necessarily true... for instance, to be honest I've never been "into" poetry, I've even tried seriously a couple of times, just can't do it, and I'm pretty sure the poetry-writing program could fool me for quite a few stanzas before I eventually caught on because, to me, it's all the same. [1] I'd eventually guess more on meta-analysis like observing grammatical structures being repeated for what would not be a good reason.

[1]: Have I caveated this sufficiently that nobody will feel compelled to reply and explain to me just how objectively awesome poetry is? I'd say my thing is more music, but to be honest, a surprising number of "human" composers already sound pretty computer-y to me....

Guillaume86 about 10 years ago

7/8. Number 5 got me, but it also helped me get the last ones right. I would probably have made more errors if the answers were all given at the end.

adventured about 10 years ago

There are still far too many obvious tells here.

"Apple’s holiday earnings for 2014 were record shattering."

An algorithm isn't going to use the slightly unusual "holiday earnings" reference (it would have said X quarter or end of year perhaps), and it's also not going to understand (yet) that they were record shattering without human direction.

"Benner had a good game at the plate for Hamilton A’s-Forcini. Benner went 2-3, drove in one and scored one run. Benner singled in the third inning and doubled in the fifth inning."

That's maybe the easiest out of all of them. Repeating Benner that way over and over again is nothing like how a sports writer would write.

Maybe another six to ten years of evolution, and it'll be nearly impossible to tell the difference. It's certainly a significant improvement over what you would have seen ten years ago in this sort of exercise.

eob about 10 years ago

The Shakespeare sonnet wasn't written by a computer. IIRC Nate just had a text editor with a type-ahead suggest box that was weighted by an n-gram lookup into your corpus of choice. It was up to the human to choose which word to use.
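
A rough Python sketch of the kind of tool this comment describes: a suggestion list ranked by how often each word follows the previous one in a corpus, with the final pick left to the person typing. It uses bigrams only, and the function names and the tiny made-up corpus are assumptions for illustration, not the actual editor.

```python
from collections import Counter, defaultdict


def build_bigram_counts(corpus):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for previous, following in zip(words, words[1:]):
        counts[previous][following] += 1
    return counts


def suggest(counts, previous_word, k=3):
    """Return up to k candidate next words, most frequent first."""
    ranked = counts.get(previous_word.lower(), Counter()).most_common(k)
    return [word for word, _ in ranked]


# Made-up corpus; a real tool would load a large body of text.
corpus = "my love is true my love is kind my love was blind"
counts = build_bigram_counts(corpus)
print(suggest(counts, "love"))  # ['is', 'was']
```

The program only ranks candidates; choosing among them, which is where the meaning comes from, stays with the human.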

tomasien about 10 years ago

The problem with this quiz, and so many like it, is that a bunch of the passages were so badly written. A badly written passage could be by a human or a computer; there's no real way to distinguish that. There are 3-4 in here that are well constructed, 1 by a human and a couple by computers. For each, to me, it was obvious which wrote it. The rest I was like "who knows, either a bad writer or a computer but it could be either".

TheSpiceIsLife about 10 years ago

There are at least three comments here saying "got x/10".

Every time I load the page I get the same eight enumerated questions.

Is there something going on here I'm not aware of?

demarq about 10 years ago

Only got two wrong [bragging!], and both were sports commentary. But then again, I'm not really a sports guy.

Either way that was impressive... I could imagine that in the future your phone/tablet scans your favourite websites and generates news stories for you to read, replacing traditional media. If you are into food you can have an entire newspaper generated about the goings-on of food!

stretchwithme about 10 years ago

Are computers actually writing things? Or filling in a template covering an event that produces data that is expected?

I could write an app that fills out this template:

"Wow, did you see that post on Hacker News? Got X number of views within Y minutes!"

That's not writing. The guy who made the template wrote most of it. The rest is plug and play.
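
For what it's worth, the app imagined above is only a few lines of Python with the standard library's `string.Template`. The field names `views` and `minutes`, the slightly reworded template, and the sample numbers are hypothetical.

```python
from string import Template

# The human-written part: everything except the two blanks.
HEADLINE = Template(
    "Wow, did you see that post on Hacker News? "
    "Got $views views within $minutes minutes!"
)


def render(event):
    """Fill the template from an event record; a missing field raises KeyError."""
    return HEADLINE.substitute(views=event["views"], minutes=event["minutes"])


print(render({"views": 5000, "minutes": 30}))
# Wow, did you see that post on Hacker News? Got 5000 views within 30 minutes!
```

All of the wording lives in the template string, which is the commenter's point: the program only supplies the two numbers.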

GizaDog about 10 years ago

Humans are computers so it does not matter.

arithma about 10 years ago

"6"

Was this number written by a human or a computer?

If you sift through a stream of mumbo jumbo written by a thousand chimps and selectively extract meaningful words, is it written by chimps or selected by humans?

That is just to say, an algorithm can't be "judged" by single outputs, much less output selected by a human.

tek-cyb-org about 10 years ago

It was all written by humans. Computers just do what humans want them to do. Period.

jafingi about 10 years ago

But, if the algorithm is written by a human, isn't it a human writing the text?

joelrunyon about 10 years ago

Would love this technology to auto-analyze politicians' (or anyone's) statements and automatically 'fact check' or reference them to keep them honest & improve news quality.

backlava about 10 years ago

print "This sentence was written by a computer."

yzh about 10 years ago

6/8. Got 5 and 6 wrong. I wonder which kind of error is more common: treating a computer-generated text as the work of a human, or vice versa?

getdavidhiggins about 10 years ago

I brought up this scenario a few months ago with 'content engines', or 'content as a service'. Natural Language Generation (NLG) is no trivial matter. The ones who have mastered algorithmic writing are the new gods of our time. Can you imagine not having to pay the team of writers at NYT? Exactly, you can't imagine: http://blog.higg.im/2014/03/14/percolate-content-marketing/

jimkri about 10 years ago

This actually gives me an idea for an independent study I need to do to finish my computer science minor. It is basically what the article is about: an algorithm that I could use to write my papers. It could be for major papers or for a cover letter or even a blog post that I have to write for a class.

Does anyone have any thoughts on that?

I have not done any research on the topic of algorithms writing articles other than reading this article and the other article that is on the front page as well.

ricardobeat about 10 years ago

Got 10/10. I already knew about Quill, so I knew what to expect. The funny thing is that perfectly correct sentences are more likely to have been written by a computer than a human; it's a pretty good signal.

Tyguy7 about 10 years ago

I got 1 wrong. Those algorithms still need work.

weitzj about 10 years ago

Ah. 9/10. Failed nr. 3

hcarvalhoalves about 10 years ago

Surprised by the poetry.

daniel134 about 10 years ago

Got 7/8 correct!