> <i>Start with manual processes</i><p><i>This actually goes for all companies. Early on, whenever you can replace code with a manual process, you should; if for no other reason than it can help you to iterate faster. We do it religiously at 42Floors. Everything starts manually. Save your precious engineering cycles for the times when you actually need it.</i><p>God, there needs to be a sexy phrasing of this, the way quick iteration cycles have "Move fast and break things." People today are so dependent on computers that they seem to forget there used to be a way to just <i>do things</i> without an app.<p>I've helped with or consulted on various research projects for fun, and it always pains me to hear a very smart person say something like, "I saw that there's a great Java library for sentiment analysis. Should we build our project on Java?"<p>Never mind the layperson's ignorance of software engineering evident in that question. My problem with it is more basic: <i>why do we need any kind of software to do "sentiment analysis"?</i><p>The reason isn't self-evident. If the questioner would just take the time to do "sentiment analysis" manually, as if no computers existed, what they actually <i>need</i> would become easy to explain to a software engineer. By "manually", I mean taking a sample of input and a spreadsheet, and marking down the "sentiment" you yourself detect as you read each unit of input.<p>After an hour of that, you'll have a much better idea of what you actually <i>want</i> and <i>need</i>. What <i>kind</i> of sentiment are you looking for? How granular does it need to be: just "happy" and "angry", or "very happy/happy/neutral/unhappy/angry"? And if the latter, how did you, as a human, tell an "unhappy" input from an "angry" one?<p>You may realize that you don't need very much granularity. 
Or that the kind of input you're analyzing, such as tweets, does not lend itself very easily to accurate sentiment analysis, or at the very least will require certain tweaks in the software so that it isn't thrown off by common styles of phrasing. Or you may find that the input offers a nice shortcut that greatly speeds up judging sentiment, such as when your sample tends to use a lot of emoji.<p>These are all computational thought processes that require no machine learning to just <i>do</i>. We as humans can do them ourselves, whether to efficiently prototype a machine learning model or because an EMP bomb just went off. As a programmer, I'm all for automating the hell out of everything, but it really irks me when people have no idea <i>how</i> to automate something, have no reason <i>why</i> it should be automated, and then hope that the machine (or its mercenary operator) can figure it all out for them.
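<p>For what it's worth, the only code the spreadsheet step might want is a few lines to pull a random sample and, later, to count what you wrote down. Here's a minimal sketch in Python, assuming your input is already a list of strings. The function names, the column names, and the fixed seed are all made up for illustration, and none of this does any sentiment analysis itself; the human still does the labeling.

```python
import csv
import io
import random
from collections import Counter


def make_labeling_sheet(texts, sample_size, out, seed=0):
    """Write a random sample of texts to a CSV with blank columns to fill in by hand."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so the sample is repeatable
    sample = rng.sample(list(texts), min(sample_size, len(texts)))
    writer = csv.writer(out)
    writer.writerow(["text", "sentiment", "notes"])  # hypothetical column names
    for text in sample:
        writer.writerow([text, "", ""])  # "sentiment" and "notes" are left empty on purpose
    return sample


def tally_labels(sheet):
    """Count the labels a human typed into the 'sentiment' column, skipping blank rows."""
    reader = csv.DictReader(sheet)
    return Counter(
        row["sentiment"].strip().lower()
        for row in reader
        if row["sentiment"].strip()
    )


if __name__ == "__main__":
    # Placeholder inputs; in practice these would be your tweets, reviews, etc.
    inputs = ["placeholder input %d" % i for i in range(500)]
    buffer = io.StringIO()
    make_labeling_sheet(inputs, sample_size=50, out=buffer)
    print(buffer.getvalue()[:200])
```

Open the resulting sheet in any spreadsheet program, fill in the sentiment column as you read, and run the tally on the saved file afterward. The set of distinct labels you ended up using is a decent first answer to the granularity question.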