When a photographer takes or edits a picture, she doesn't need to predict or simulate her own reaction. No model or training is necessary, because the real outcome is immediately accessible to her. However, she is only one person, and may be a poor proxy for a larger group.<p>The model is in the reverse situation, of course: it cannot perfectly guess the emotional response of any one person, but it has access to a larger assortment of data.<p>In addition, in some contexts it may be easier or cheaper to place a machine rather than a human in a given locale to get a picture.<p>If my theorizing makes any sense, it suggests that this technology would be useful in contexts where the locale is hard to reach and the topic is likely to evoke a wide variety of emotional responses.