5000 human feelings in real time - from blogs

1 point by khmelover 12 years ago

2 comments

khmelover 12 years ago
At the core of We Feel Fine is a data collection engine that automatically scours the Internet every ten minutes, harvesting human feelings from a large number of blogs. Blog data comes from a variety of online sources, including LiveJournal, MSN Spaces, MySpace, Blogger, Flickr, Technorati, Feedster, Ice Rocket, and Google.

We Feel Fine scans blog posts for occurrences of the phrases "I feel" and "I am feeling".

Once a sentence containing "I feel" or "I am feeling" is found, the system looks backward to the beginning of the sentence, and forward to the end of the sentence, and then saves the full sentence in a database.

Once saved, the sentence is scanned to see if it includes one of about 5,000 pre-identified "feelings". This list of valid feelings was constructed by hand, but basically consists of adjectives and some adverbs. The full list of valid feelings, along with the total count of each feeling, and the color assigned to each feeling, is here.

If a valid feeling is found, the sentence is said to represent one person who feels that way.

If an image is found in the post, the image is saved along with the sentence, and the image is said to represent one person who feels the feeling expressed in the sentence.

Because a high percentage of all blogs are hosted by one of several large blogging companies (Blogger, MySpace, MSN Spaces, LiveJournal, etc.), the URL format of many blog posts can be used to extract the username of the post's author. Given the author's username, we can automatically traverse the given blogging site to find that user's profile page. From the profile page, we can often extract the age, gender, country, state, and city of the blog's owner. Given the country, state, and city, we can then retrieve the local weather conditions for that city at the time the post was written. We extract and save as much of this information as we can, along with the post.

This process is repeated automatically every ten minutes, generally identifying and saving between 15,000 and 20,000 feelings per day.
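For anyone curious how the pipeline described above might look in code, here is a rough Python sketch. It is not We Feel Fine's actual implementation. The tiny `FEELINGS` set stands in for the hand-built list of about 5,000 feelings, the function names are made up for illustration, sentence boundaries are approximated by `.`, `!`, and `?`, and the username step assumes hosts where the username is the leading subdomain (as with `name.livejournal.com` or `name.blogspot.com`).

```python
import re
from urllib.parse import urlparse

# Hypothetical stand-in for the hand-built list of about 5,000 feelings.
FEELINGS = {"happy", "sad", "depressed", "fine", "lonely"}

# The two trigger phrases named in the comment above.
TRIGGER = re.compile(r"\bI (?:feel|am feeling)\b", re.IGNORECASE)
# Assumption: a sentence ends at '.', '!' or '?'.
SENTENCE_END = re.compile(r"[.!?]")

# Assumed host suffixes where the username is the leading subdomain.
SUBDOMAIN_HOSTS = (".livejournal.com", ".blogspot.com")


def extract_feeling_sentences(text):
    """Return each full sentence containing "I feel" or "I am feeling"."""
    sentences = []
    last_end = 0
    for match in TRIGGER.finditer(text):
        if match.start() < last_end:
            continue  # trigger sits inside a sentence that was already saved
        # Look backward to the most recent sentence terminator.
        start = 0
        for terminator in SENTENCE_END.finditer(text, 0, match.start()):
            start = terminator.end()
        # Look forward to the next sentence terminator (or end of text).
        end_match = SENTENCE_END.search(text, match.end())
        end = end_match.end() if end_match else len(text)
        sentences.append(text[start:end].strip())
        last_end = end
    return sentences


def match_feeling(sentence, feelings=FEELINGS):
    """Return the first word in the sentence that is a known feeling, else None."""
    for word in re.findall(r"[a-z']+", sentence.lower()):
        if word in feelings:
            return word
    return None


def username_from_url(url):
    """Guess the author's username from a post URL, or return None."""
    host = (urlparse(url).hostname or "").lower()
    for suffix in SUBDOMAIN_HOSTS:
        if host.endswith(suffix):
            name = host[: -len(suffix)]
            if name and name != "www":
                return name
    return None
```

A real crawler would also need feed fetching, HTML stripping, profile-page scraping, and a weather lookup, none of which is sketched here.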
khmelover 12 years ago
When the applet is first opened, the initial dataset consists of the most recent 1,500 feelings collected by our system. The applet's panel can then be used to arbitrarily specify different populations, constrained by any combination of:

- Feeling (happy, sad, depressed, etc.)
- Age (in ten year increments - 20s, 30s, etc.)
- Gender (male or female)
- Weather (sunny, cloudy, rainy, or snowy)
- Location (country, state, and/or city)
- Date (year, month, and/or day)
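The "any combination of" filtering could be sketched roughly like this in Python. The `Feeling` record and `select_population` function are hypothetical names, not the applet's real data model, and the sketch assumes that a record missing a constrained field is simply excluded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Feeling:
    """One saved feeling; every field except `feeling` may be unknown."""
    feeling: str
    age: Optional[int] = None
    gender: Optional[str] = None
    weather: Optional[str] = None
    country: Optional[str] = None
    state: Optional[str] = None
    city: Optional[str] = None
    posted: Optional[date] = None


def select_population(records, *, feeling=None, decade=None, gender=None,
                      weather=None, country=None, state=None, city=None,
                      year=None, month=None, day=None):
    """Keep records matching every constraint that is not None.

    `decade` is the start of a ten-year age bucket, e.g. 20 for the 20s.
    """
    def keep(r):
        exact = [(feeling, r.feeling), (gender, r.gender), (weather, r.weather),
                 (country, r.country), (state, r.state), (city, r.city)]
        if any(want is not None and want != have for want, have in exact):
            return False
        if decade is not None and (r.age is None or r.age // 10 * 10 != decade):
            return False
        for want, part in ((year, "year"), (month, "month"), (day, "day")):
            if want is not None and (
                r.posted is None or getattr(r.posted, part) != want
            ):
                return False
        return True

    return [r for r in records if keep(r)]
```

For example, `select_population(records, feeling="sad", decade=20, weather="rainy")` would return the sad feelings from people in their 20s posted in rainy weather, out of whatever list of records is passed in.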