Saturday, November 1, 2014

Recommendation Systems: Effective Weapons Against Choice-Phobia!

Happy Halloween! Halloween falls on a Friday this year, so the festival runs straight into the weekend. It is a fantastic opportunity for people to relax. There are so many ways to relax: watching a movie at the theatre, enjoying a banquet at a restaurant, playing sports with friends, and so on. Most of the time, though, we spend a lot of time wondering, “Which option suits me best right now?” I believe this is a common phenomenon in our daily lives. Generally speaking, when we have to make a choice without any thoughtful recommendation, it is hard to pick a good one.

So here comes the problem: is there any reliable method to help people make the right choice? In other words, we are looking for scientific recommendation mechanisms that can recommend the things or people that suit us well. Over the last two weeks, Prof. Rosanna’s class has focused on exactly this issue.

The concept of “recommendation mechanism” actually refers to recommender systems. They are computational solutions that use computers to process huge amounts of information and filter it for users. They analyze the tastes and preferences of different users. Apart from users, these systems also pay attention to items: they can analyze the characteristics of different items. Finally, they generate personalized recommendations based on users’ past activities and feedback.

Several recommendation methods were mentioned in class. Let us take content-based recommendation (CBR) as an example. Suppose I am the recommendation service provider. When someone asks me for assistance, it is wise to check this client’s history, namely the products and services he has received. In most cases, recommending items similar to those in his history will not cause problems. Nevertheless, a person’s past activities are private, and it can be difficult or even illegal to sniff out that information. Under this circumstance, we need to shift our attention from items to users. Here another recommendation method is adopted: collaborative filtering (CF). When we study the users themselves, several approaches may be applied, for instance grouping similar users and generating recommendations based on user-item interactions rather than on the users or items alone.
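
To make the idea of “grouping similar users” more concrete, here is a minimal sketch of user-based collaborative filtering. It is only an illustration and not the exact method from the class: the rating data, the function names and the choice of cosine similarity are all my own assumptions.

```python
import math

# Hypothetical toy data (not from the class): user -> {item: rating}.
ratings = {
    "alice": {"movie_a": 5, "movie_b": 4, "movie_c": 1},
    "bob":   {"movie_a": 4, "movie_b": 5, "movie_d": 5},
    "carol": {"movie_c": 5, "movie_e": 4},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(target, ratings, top_n=2):
    """Rank items the target has not rated by similarity-weighted ratings."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]

print(recommend("alice", ratings))
```

In this toy data, alice’s tastes are closest to bob’s, so the item that bob rated highly and alice has not seen ends up at the top of her list.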

Afterwards, a lot of mathematical derivations were presented in class. Frankly speaking, they look complex and sophisticated, and I need more time to study them further. In a word, although having difficulty choosing looks like a normal part of our daily lives, there is a lot of profound theory behind it.


Hello November, please be good!

Tuesday, October 14, 2014

The Social Network: A Magic Net That Connects Everyone

When we register accounts in online communities, the system automatically recommends some people as potential friends. It is highly likely that these recommended people are our friends or acquaintances in reality. It is amazing that the system seems to know us so well. After taking Prof. Rosanna’s class last week, we may find some reasonable explanations.

As a matter of fact, these so-called “online community assistants” are nothing but computer programs that are good at analyzing social networks. After acquiring some personal information from us (e.g. name, gender, hobbies, school, university or something else), it is easy for them to help us broaden our social network.

When I was an undergraduate, I heard an empirical theory which says, “Any two people in the world can be connected through no more than six relationships.” It seems incredible, but it is quite reasonable. To simplify the model of people’s relationship networks, we may adopt graph theory.

In a prodigiously large social network, each individual can be represented as a vertex (node). Relationships between people (friends, followers, etc.) can be regarded as edges (lines). The number of edges connected to a vertex is defined as that vertex’s degree. Roughly speaking, the higher a vertex’s degree, the more important the role it plays in the graph. Graph theory plays an important role in social network analysis. Apart from degree, there are other measures for evaluating vertices, such as closeness and other kinds of centrality. Graph theory also has many branches, and there is no denying that further study in this realm should be full of fun.
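
As a small illustration of these ideas, the sketch below stores a friendship network as a graph, counts a vertex’s degree, and uses breadth-first search to find how many relationships separate two people. The names and friendships are made up, and the helper functions are my own assumptions rather than anything shown in class.

```python
from collections import deque

# Hypothetical undirected friendship network: person -> set of friends.
friends = {
    "ann": {"ben", "cat"},
    "ben": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"ben", "cat", "eve"},
    "eve": {"dan"},
}

def degree(graph, vertex):
    """Number of edges connected to the vertex."""
    return len(graph[vertex])

def separation(graph, start, goal):
    """Fewest relationships linking start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

print(degree(friends, "dan"))             # dan has the most friends here
print(separation(friends, "ann", "eve"))  # steps from ann to eve
```

With a real network, the same search could be used to check the “six relationships” claim for any pair of people who are connected at all.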


After all, graph theory is just one of the tools used in analyzing networks, alongside others such as natural language processing and text classification methods. If we combine these tools wisely, there is no doubt that we can learn more from people’s relationships.

Thursday, October 2, 2014

Second Summary for the Course of Social Network Analytics. Come and Leave Your Comments!

Time flies: it has been over four weeks since I started taking the class. I have absorbed something new since I wrote the last blog post, so it is time to summarize the last two weeks’ material.

I extracted four keywords from the last two classes: “document comparison”, “text classification”, “text clustering” and “sentiment analysis and opinion mining”. Let’s talk about these topics in order. When it comes to the similarity between documents, in my view we only need to pay attention to a few keywords. If two documents share several frequently appearing words, there is a high possibility that they focus on the same research area. This is only a rule of thumb, though. There are quantitative solutions with higher accuracy.

The TF*IDF rule is one of these solutions, and it is simple and effective. In the TF*IDF rule, TF means the term frequency: it is simply the number of times a given keyword appears within a specific document. Meanwhile, the IDF (Inverse Document Frequency) is obtained by dividing the number of documents in the whole collection by the number of documents containing the given keyword (in practice, the logarithm of this ratio is usually taken). The bigger the IDF is, the fewer documents in the whole collection contain the keyword. Such keywords are highly specific; namely, they are unique. If two documents have several unique keywords in common, they are very likely similar and can be classified into the same group.
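
Here is a minimal sketch of the TF*IDF calculation. The three tiny documents are made up for illustration, and the function names are my own. Note one assumption: the IDF below takes the logarithm of the ratio, which is the common variant, instead of using the plain ratio.

```python
import math

# Hypothetical mini-collection: document id -> list of words.
docs = {
    "d1": "graph theory studies vertices and edges".split(),
    "d2": "social network analysis uses graph theory".split(),
    "d3": "halloween is a festival".split(),
}

def tf(term, tokens):
    """Term frequency: how many times the term appears in one document."""
    return tokens.count(term)

def idf(term, docs):
    """Log of (number of documents / number of documents containing the term)."""
    containing = sum(1 for tokens in docs.values() if term in tokens)
    if containing == 0:
        return 0.0
    return math.log(len(docs) / containing)

def tf_idf(term, doc_id, docs):
    """Weight of a term in one document of the collection."""
    return tf(term, docs[doc_id]) * idf(term, docs)

for term in ("graph", "halloween"):
    print(term, [round(tf_idf(term, d, docs), 3) for d in docs])
```

A word such as “halloween”, which appears in only one document, gets a higher IDF than “graph”, which appears in two, so it counts as the more unique keyword.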

Actually, the first three topics can be discussed together. In this era of information explosion, documents are generated at an amazing speed. Under this circumstance, people have realized the importance of comparing documents and classifying similar files into the same group. It increases efficiency and makes work easier.

Apart from text classification, analyzing people’s moods is also a useful tool in social network analytics. People like to share their feelings with friends on platforms such as Facebook and Twitter. Some express themselves explicitly, while others prefer implicit expressions. In most cases, people’s moods can be divided into three types: positive, negative and objective (neutral). The three types of mood are three classes, people’s posts are the text documents, and the analyzer’s task is text classification, as discussed above. We may create a dictionary of typical keywords for each class. By comparing a person’s post with the keywords of each class, we can estimate that person’s current feelings.
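
The dictionary idea can be sketched in a few lines. The keyword lists here are tiny and made up, and treating “neutral” as the fallback when neither class wins is my own simplification, so this is a toy illustration and not a real sentiment analyzer.

```python
# Hypothetical keyword dictionaries, one per mood class.
LEXICON = {
    "positive": {"happy", "great", "love", "fantastic"},
    "negative": {"sad", "terrible", "hate", "awful"},
}

def classify(text):
    """Label a post by counting how many of its words hit each dictionary."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = {
        label: sum(1 for w in words if w in vocab)
        for label, vocab in LEXICON.items()
    }
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    # Assumption: no hits, or a tie, is treated as objective (neutral).
    return "neutral"

for post in ("I love this fantastic movie!",
             "What a terrible, awful day.",
             "The class is on Friday."):
    print(post, "->", classify(post))
```

Simple keyword counting like this only catches explicit expressions. The implicit ones mentioned above would need more than a dictionary.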

All in all, the content covered in the last two classes is interesting. Although it is simple to understand, it still needs to be measured by rigorous mathematical methods.

Wednesday, September 17, 2014

Welcome to SHEN, Zhiheng's first blog @ Blogger.com

Hi everyone! I am SHEN, Zhiheng and I am a postgraduate student at the IE Department, CUHK. You can follow me on Sina Weibo @ My SinaWeibo, or become my friend on My Google+, My Facebook Account or My Renren Account.

This is my first personal blog @ Blogger.com. I am currently trying to adapt to different SNSs because I have chosen Prof. Rosanna Chan's course "Social Media Analytics" for my first term. I will therefore record my learning process in this course by writing blog posts here.

I have attended the course (IEMS5723_Social Media Analytics) for the last two weeks. After the classes, it occurred to me that social network analysis is a profound discipline. People can get abundant information by analyzing one person's SNS accounts; even some private information can be acquired. Research in this field will help people realize the significance of "Online Relationships".

Nowadays, more and more SNSs are playing significant roles in people's daily lives. Apart from sharing news and happiness with friends, SNSs have many other functions as well, especially commercial ones. The picture below shows pros and cons of some famous SNSs. Here is its original address: Social Media Cheat Sheet.

Although there may be many barriers to analyzing a particular social media object, we have invented plenty of useful tools, such as content analysis, NLP technologies and others (e.g. the Google Books Ngram Viewer).

Writing my maiden English blog post seems to be a difficult task for me. Nevertheless, I do believe this will be a good start. By the end of this term, I hope to have learned something useful that I have not encountered before.

P.S.:
There may be some grammar or spelling mistakes in this blog. I appreciate your patience and I will try my best to write a better one next time!