AI and Opinion Mining (2010)

ABSTRACT

The main goal of this project is to extract, classify, understand and assess the opinions expressed in various online news sources. Here, opinion mining refers to the computational techniques used to analyze opinions extracted from those sources. Current opinion research focuses on business and e-commerce applications, such as product reviews and movie ratings.

We developed a framework for analysis with four major stages: stakeholder analysis, topical analysis, sentiment analysis and stock modeling. During the stakeholder analysis stage, we identify the stakeholder groups participating in Web forum discussions. In the topical analysis stage, we determine the major topics of discussion driving communication in the Web forum. In the sentiment analysis stage, we assess the opinions expressed by the Web forum participants in their discussions. Finally, in the stock modeling stage, we examine the relationships between various attributes of Web forum discussions and the firm’s stock behavior.

Extracting opinions from online sources relies on three definitions: the opinion target, the opinion holder and the opinion itself. An opinion can be expressed in two forms: (1) direct opinion and (2) comparative opinion. All the opinions are stored in a document. The steps for extracting opinions are as follows.

·        Identify the objects.

·        Feature extraction and synonym grouping.

·        Opinion orientation determination.

·        Integration.

Existing System:

Existing approaches are based on various supervised and unsupervised methods that use opinion words and phrases together with grammatical information. One key issue is identifying opinion words and phrases (such as good, bad, poor, or great), which are instrumental to sentiment analysis. However, there is a seemingly unlimited number of expressions that people use to express opinions, and these can differ significantly across domains. Even in the same domain, the same word might indicate different opinions in different contexts.

Proposed System:

            Our work builds on previous studies focusing on the relationship between the discussions held in firm-specific finance Web forums and public stock behavior. However, instead of assuming a shareholder view of participants in a finance Web forum as in previous research, and considering them to be uniformly representative of investors, we adopted a stakeholder perspective. This perspective more accurately represents the diversity of the constituency groups participating in the Web forum and closely aligns the analysis with the corporation’s stakeholder theory.

To address the broad questions posed in this research, and guided by the literature reviewed, we developed a framework for analysis with four major stages: stakeholder analysis, topical analysis, sentiment analysis, and stock modeling. During the stakeholder analysis stage, we identified the stakeholder groups participating in Web forum discussions. In the topical analysis stage, the major topics of discussion driving communication in the Web forum are determined.

The sentiment analysis stage consists of assessing the opinions expressed by the Web forum participants in their discussions. Finally, in the stock modeling stage, we examine the relationships between various attributes of Web forum discussions and the firm’s stock behavior.

IMPLEMENTATION MODULES:

·        Posting opinions

·        Object identification

·        Feature extraction

·        Opinion-orientation determination

·        Integration

MODULE DESCRIPTION:

 Posting opinions:

In this module, we collect opinions that people post online about businesses, e-commerce and products. An opinion may be of two types: direct or comparative. A direct opinion is a comment made directly about the components and attributes of a product. A comparative opinion is a comment based on a comparison of two or more products. Comments may be positive or negative.

Object identification:

In general, people can express opinions on any target entity like products, services, individuals, organizations, or events. In this project, the term object is used to denote the target entity that has been commented on. For each comment, we have to identify an object. Based on objects, we have to integrate and generate ratings for opinions.

The object is denoted “O”. An opinionated document contains opinions on a set of objects {o1, o2, o3, …, or}, where r is the number of objects.
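As a rough illustration of object identification, the sketch below matches each comment against a small catalog of object names. The catalog, the product names "AlphaPhone" and "BetaPhone", and the function `identify_objects` are all hypothetical and not part of the project; a real system would need a more robust entity-recognition step.

```python
import re

# Hypothetical catalog: object name -> lowercase aliases that may appear in comments.
OBJECT_ALIASES = {
    "AlphaPhone": ["alphaphone", "alpha phone"],
    "BetaPhone": ["betaphone", "beta phone"],
}

def identify_objects(comment, aliases=OBJECT_ALIASES):
    """Return the sorted list of catalog objects mentioned in a comment."""
    text = comment.lower()
    found = set()
    for obj, names in aliases.items():
        for name in names:
            # Whole-word match so an alias is not matched inside a longer word.
            if re.search(r"\b" + re.escape(name) + r"\b", text):
                found.add(obj)
                break
    return sorted(found)
```

A comment that mentions no catalog entry yields an empty list, which a caller could treat as "object unknown" rather than guessing.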

Feature extraction:

An object can have a set of components (or parts) and a set of attributes (or properties) which we collectively call the features of the object. For example, a cellular phone is an object. It has a set of components (such as battery and screen) and a set of attributes (such as voice quality and size), which are all called features (or aspects). An opinion can be expressed on any feature of the object and also on the object itself.

With these concepts in mind, we can define an object model, a model of an opinionated text, and the mining objective, which are collectively called the feature-based sentiment analysis model. In the object model, an object “O” is represented with a finite set of features,

                            F = {f1, f2,…, fn}

 which includes the object itself as a special feature. Each feature fi ∈ F can be expressed with any one of a finite set of words or phrases

                          Wi = {wi1,wi2, …, wim}

 which are the feature’s synonyms.
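The object model above can be sketched as a mapping from each feature fi to its synonym set Wi, inverted so that any synonym found in a sentence resolves to its canonical feature. The synonym lists and the function names below are illustrative assumptions, not taken from the project.

```python
import re

# Hypothetical synonym sets W_i for a cellular phone; the keys are the features f_i.
FEATURE_SYNONYMS = {
    "phone": ["phone", "handset"],  # the object itself as a special feature
    "battery": ["battery", "battery life"],
    "screen": ["screen", "display"],
    "voice quality": ["voice quality", "sound quality", "call quality"],
    "size": ["size"],
}

def build_synonym_index(synonyms):
    """Invert {feature: [words]} into {word: feature} for lookup."""
    index = {}
    for feature, words in synonyms.items():
        for word in words:
            index[word.lower()] = feature
    return index

def extract_features(sentence, synonyms=FEATURE_SYNONYMS):
    """Return the canonical features whose synonyms occur in the sentence."""
    index = build_synonym_index(synonyms)
    text = sentence.lower()
    found = []
    # Try longer phrases first so multi-word synonyms are checked before their parts.
    for word in sorted(index, key=len, reverse=True):
        feature = index[word]
        if feature not in found and re.search(r"\b" + re.escape(word) + r"\b", text):
            found.append(feature)
    return found
```

Grouping synonyms under one key means "display" and "screen" contribute to the same feature when opinions are later summarized.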

Opinion-orientation determination:

The opinion holder is the person or organization that expresses the opinion. In the case of product reviews and blogs, opinion holders are usually the authors of the posts. An opinion on a feature f (or object o) is a positive or negative view or appraisal of f (or o) from an opinion holder. Positive and negative are called opinion orientations. In addition to its orientation, we have to determine whether each opinion is a direct opinion or a comparative opinion.

  • Direct opinion:

                A direct opinion is a quintuple (oj, fjk, ooijkl, hi, tl),

                where oj is an object,

                           fjk is a feature of the object oj,

                          ooijkl is the orientation of the opinion on feature  fjk of object oj,

                          hi is the opinion holder, and

                          tl is the time when the opinion is expressed by hi.

The opinion orientation ooijkl  can be positive, negative, or neutral.
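The quintuple can be represented as a small record, with the orientation assigned by counting opinion words. This is only a minimal sketch: the word lists are tiny hypothetical examples, the names `DirectOpinion` and `orientation_of` are assumptions, and the counting rule ignores negation and context, which the Existing System section identifies as a real difficulty.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical opinion-word lexicon; a real one would be larger and domain-aware.
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}

@dataclass(frozen=True)
class DirectOpinion:
    obj: str          # o_j, the object
    feature: str      # f_jk, a feature of the object
    orientation: str  # oo_ijkl: "positive", "negative" or "neutral"
    holder: str       # h_i, the opinion holder
    time: date        # t_l, when the opinion was expressed

def orientation_of(sentence):
    """Classify a sentence by counting lexicon hits (no negation handling)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `DirectOpinion("AlphaPhone", "battery", orientation_of("The battery is great"), "reviewer1", date(2010, 1, 15))` builds one quintuple for a hypothetical product and reviewer.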

  • Comparative opinion:

A comparative opinion expresses a preference relation of two or more objects based on their shared features. A comparative opinion is usually conveyed using the comparative or superlative form of an adjective or adverb, such as “Coke tastes better than Pepsi.”
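A crude way to flag comparative sentences is to look for comparative or superlative word forms. The heuristic below is an assumption made for illustration only: it checks a short hypothetical word list plus the pattern of an "-er" word, "more" or "less" followed later by "than", and it will both miss comparatives and produce false positives on real text.

```python
import re

# Hypothetical list of irregular comparative and superlative forms.
STRONG_COMPARATIVES = {"better", "best", "worse", "worst"}

def is_comparative(sentence):
    """Heuristically decide whether a sentence expresses a comparison."""
    words = re.findall(r"[a-z]+", sentence.lower())
    for i, word in enumerate(words):
        if word in STRONG_COMPARATIVES:
            return True
        # Regular pattern such as "cheaper than" or "more reliable than".
        looks_comparative = word.endswith("er") or word in {"more", "less"}
        if looks_comparative and "than" in words[i + 1:]:
            return True
    return False
```

Sentences that are not flagged would be treated as candidates for direct opinions.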

Integration:

Integrating these tasks is also complicated because we need to match the five pieces of information in the quintuple. That is, the opinion ooijkl must be given by opinion holder hi on feature fjk of object oj at time tl. To make matters worse, a sentence might not explicitly mention some pieces of information; instead, they are implied through pronouns, language conventions, and context. Once the quintuples are matched, we generate ratings from the results of the above tasks, so that we can clearly see how opinion holders view the different features of each product.
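Once quintuples are available, rating generation can be sketched as a tally of orientations per (object, feature) pair. The rating formula used here, the share of positive opinions among the non-neutral ones, is an assumption; the project description does not say how ratings are computed. The function names are hypothetical as well.

```python
from collections import defaultdict

def summarize(opinions):
    """Count orientations per (object, feature).

    opinions: iterable of (obj, feature, orientation, holder, time) tuples.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for obj, feature, orientation, _holder, _time in opinions:
        counts[(obj, feature)][orientation] += 1
    return {key: dict(tally) for key, tally in counts.items()}

def rating(tally):
    """Share of positive opinions among non-neutral ones, or None if there are none."""
    decided = tally["positive"] + tally["negative"]
    return tally["positive"] / decided if decided else None
```

Returning `None` when every opinion is neutral keeps "no evidence" distinct from a rating of zero.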

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

System               : Pentium IV 2.4 GHz

Hard Disk          : 40 GB

RAM                   : 512 MB

SOFTWARE REQUIREMENTS:

Microsoft Visual Studio 2008 (ASP.NET, C#)

SQL Server 2005
