How does this work?
Umbria collects and analyzes postings from millions of blogs, message boards, chat rooms, etc. to find out what is being discussed and who is engaging in discussions in the blogosphere. How do they do that? Check out this FAQ to find out.
What does Umbria consider a blog?
Although there are as many definitions as people doing the defining, most people agree that a weblog, or "blog," is an online diary or journal in which people write about any topic that interests them and make it available for others to read via the Internet. Blogs have also been described as personal Web sites with a much simpler means of adding and posting new material. Blogs can also let others comment on what has been written.
What is the difference between a blog, a message board and online chat?
Like blogs, message boards provide a forum for people to post their thoughts online. They tend to be more conversational than blogs, with hundreds or even thousands of people posting their thoughts, raising questions and responding to one another. Message boards also tend to be focused on a single subject of common interest to all of the people writing on the board. For example, a message board may focus on a specific model of car or music group.
Like message boards, online chat provides instantaneous conversation between and among people, typically about a single topic of common interest.
Blogs, on the other hand, usually have a single primary poster, can cover a variety of topics and tend to have fewer other people posting their opinions.
What gets discussed in a blog?
The subjects of blogs are as varied as the people writing them. Most people discuss the ongoing events of their lives in their blogs, while others use them as a medium to express their opinions about specific topics. For example, many people talk about their school, work, relationships, vacations and day-to-day activities, while others use their blogs as a forum to express their opinions on politics, music, or technology.
The earliest adopters of blogging were technologists searching for a simpler method of expressing their opinions through the Internet, and those wishing to make their political opinions and observations available to others quickly followed. Shortly after the adoption by technologists and politicos, younger Internet users rapidly began adopting blogs as a journal or diary to share with others. Today, however, people of all ages and technological abilities use blogs. For example, new mothers have adopted blogs as a convenient and time-saving way to provide updates to friends and family on developments with the new child without having to answer dozens of e-mails every day.
How many blogs are there?
Estimates place the number of blogs as high as 60 million by the end of 2006, with many more added each week.
How does Umbria determine what is being discussed on a blog?
Umbria uses a combination of approaches to understand what is being discussed in blogs. Some of these are the same approaches used by traditional search engines. However, Umbria takes its analysis further to ensure the text it identifies directly relates to the topic of interest. By way of example, compare the following two sentences:
(1) "I went to Dairy Queen yesterday to have a blizzard."
(2) "I got stuck in a blizzard yesterday on my way to Dairy Queen."
If you are interested in the Dairy Queen Blizzard, only sentence 1 is of interest. However, many traditional keyword-based search approaches may treat both sentences as equally relevant. Umbria's technology analyzes the entire sentence instead of just keywords to ensure only relevant comments are included in the topics it identifies.
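The difference between the two approaches can be illustrated with a minimal Python sketch. This is not Umbria's technology, which the FAQ does not describe in detail. It is a toy illustration that assumes a hand-picked list of context words; the names PRODUCT_CUES, WEATHER_CUES, keyword_match and sentence_match are all hypothetical.

```python
import re

# Hypothetical cue words chosen only for this illustration.
PRODUCT_CUES = {"have", "had", "ate", "eat", "order", "ordered", "bought"}
WEATHER_CUES = {"stuck", "snow", "storm", "weather", "icy"}

SENTENCE_1 = "I went to Dairy Queen yesterday to have a blizzard."
SENTENCE_2 = "I got stuck in a blizzard yesterday on my way to Dairy Queen."


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def keyword_match(sentence):
    """Keyword-only test: true if all of the keywords appear anywhere."""
    tokens = tokenize(sentence)
    return all(word in tokens for word in ("dairy", "queen", "blizzard"))


def sentence_match(sentence):
    """Keyword test plus a look at the words just before 'blizzard'."""
    if not keyword_match(sentence):
        return False
    tokens = tokenize(sentence)
    idx = tokens.index("blizzard")
    window = tokens[max(0, idx - 4):idx]  # the four tokens before the keyword
    product = sum(token in PRODUCT_CUES for token in window)
    weather = sum(token in WEATHER_CUES for token in window)
    return product > weather


for sentence in (SENTENCE_1, SENTENCE_2):
    print(keyword_match(sentence), sentence_match(sentence), sentence)
```

With these toy cue lists, keyword_match accepts both sentences, while sentence_match accepts only sentence 1. A production system would need far richer linguistic analysis than a fixed word window.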
How does Umbria deal with spam?
Spam is a large and growing problem when it comes to blogs. Depending on the topic, spam blogs rather than genuine author-generated blogs account for up to 80% of all blog postings in some categories. Spam that is not removed prior to analysis may skew the results by as much as 50%.
Umbria takes a three-pronged approach that combines automated and human inspection to eliminate up to 95% of spam blogs from the data prior to analysis.
How does Umbria determine the age or gender of bloggers?
From the words the blogger uses. Umbria has developed a number of systems to help identify age and gender. One of these systems decomposes postings into their parts of speech (nouns, verbs, adjectives, adverbs, etc.) and then uses mathematical models to compare the decomposed speech with nouns, verbs, adjectives, phrases and other forms of speech from people of known ages and genders. The technology Umbria uses is a new application of linguistic analysis.
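The idea of comparing decomposed speech against speech from known groups can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Umbria's models: the tiny POS_LEXICON stands in for a real part-of-speech tagger, the reference posts are invented, the neutral labels group_a and group_b stand in for real age and gender groups, and cosine similarity is a swapped-in stand-in for the unspecified mathematical models.

```python
import math
import re
from collections import Counter

# Hypothetical hand-made lexicon; a real system would use a trained tagger.
POS_LEXICON = {
    "homework": "noun", "teacher": "noun", "mortgage": "noun", "meeting": "noun",
    "love": "verb", "hate": "verb", "scheduled": "verb", "refinanced": "verb",
    "awesome": "adj", "boring": "adj", "quarterly": "adj",
    "totally": "adv", "recently": "adv",
}

# Invented reference posts from two made-up groups of "known" authors.
REFERENCE_POSTS = {
    "group_a": ["My teacher gave us so much homework, totally boring.",
                "I love this band, they are awesome."],
    "group_b": ["We recently refinanced the mortgage.",
                "The quarterly meeting was scheduled for Monday."],
}


def pos_features(text):
    """Count (part of speech, word) pairs for words found in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter((POS_LEXICON[w], w) for w in words if w in POS_LEXICON)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_group(post, reference_posts):
    """Return the group whose combined profile is most similar to the post."""
    features = pos_features(post)
    profiles = {
        group: sum((pos_features(p) for p in posts), Counter())
        for group, posts in reference_posts.items()
    }
    return max(profiles, key=lambda group: cosine(features, profiles[group]))


print(closest_group("Homework again. I hate how boring it is.", REFERENCE_POSTS))
```

A prediction like this is only as good as the reference data behind it, which is why any such estimate carries a margin of error.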
As a rough example, compare the speech of a 14-year-old female with that of a 43-year-old male. To the extent they use different nouns, verbs, adjectives, adverbs and phrases, or speak about different topics, these differences offer clues to help predict age and gender.
All market research has a margin of error that depends on the type of analysis being conducted. Likewise, Umbria's analysis of blogs and other online opinion sources is limited to the perspectives of the pool of individuals who have gone online to offer opinions on products, services, brands, candidates, etc.