Assignment 2 – RPG Video Games and Real-World Influence

Project Inspiration

As a huge fan of video games, I was, unsurprisingly, super excited when I heard Jiayu’s idea of visualizing games for our upcoming digital humanities project, and I decided to work with him immediately. When it came to the details, we found it interesting to analyze the relationships among video games’ elements, themes, and factors, and how those may change over time. Amazed by the vast number of games released since 1975, we decided to narrow our data source to RPGs (role-playing games) only; since plot counts as an important factor in most RPGs, it should be easier to extract their elements, factors, and themes simply from the summaries of their Wikipedia pages.

Corpus Construction

As mentioned, we decided to use the summaries of Wikipedia pages as our text source. Since Wikipedia records more than 4,000 RPGs from 1975 onward, it is not surprising that we needed a software package to help us get the texts (copying and pasting every summary by hand would be a pain). We chose the wikipedia package for Python to do the work. We built a list of all the game names in an Excel file, had the program read the file, used the wikipedia package to fetch each game’s summary, and wrote the summary paragraphs into a file.

Then a problem appeared. Because of the way Jigsaw is implemented, a very large file produces terrible results and often slow processing; for Voyant, if we have too many pieces of text, the generated trend line is very difficult to read and interpret. So we finally decided to slice the texts: around 30 files for Voyant, and around 2,000 files for Jigsaw to help it understand our corpus well.

When we looked into individual files carefully, we found another problem: because of the limitations of both our algorithm and the wikipedia package’s, we got a lot of junk information. Therefore, we decided to use an online algorithm called DedupeFS to eliminate it.
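The steps above can be sketched roughly as follows. This is a minimal illustration, not our actual script: the function names (`fetch_summaries`, `drop_duplicates`, `slice_corpus`) are hypothetical, a plain list of titles stands in for the Excel file, and a simple exact-match de-duplication stands in for DedupeFS, whose internals are not described here. The `fetch` argument is assumed to be a function such as `wikipedia.summary` from the wikipedia package.

```python
from typing import Callable, Iterable, List


def fetch_summaries(titles: Iterable[str], fetch: Callable[[str], str]) -> List[str]:
    """Fetch one summary per title, skipping titles whose lookup fails."""
    summaries = []
    for title in titles:
        try:
            summaries.append(fetch(title))
        except Exception:
            # Missing or ambiguous pages are simply skipped in this sketch.
            continue
    return summaries


def drop_duplicates(texts: Iterable[str]) -> List[str]:
    """Drop empty texts and exact repeats (ignoring case and spacing).

    Hypothetical stand-in for the junk-removal step; the first copy is kept.
    """
    seen = set()
    unique = []
    for text in texts:
        key = " ".join(text.split()).lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def slice_corpus(texts: List[str], n_files: int) -> List[str]:
    """Join the texts into at most n_files chunks of nearly equal size."""
    if n_files < 1:
        raise ValueError("n_files must be at least 1")
    n_files = min(n_files, len(texts))
    chunks = []
    start = 0
    for i in range(n_files):
        # The first (len(texts) % n_files) chunks get one extra text.
        size = len(texts) // n_files + (1 if i < len(texts) % n_files else 0)
        chunks.append("\n\n".join(texts[start:start + size]))
        start += size
    return chunks
```

With the wikipedia package installed, `fetch_summaries(titles, wikipedia.summary)` would do the download; `slice_corpus(texts, 30)` would then give a Voyant-sized corpus and `slice_corpus(texts, 2000)` a Jigsaw-sized one.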

Voyant Analysis (Jiayu Huang)

Screen Shot 2015-09-23 13:32:14 +0000

The result of the relationship analysis is pretty amazing to me. Because the vast majority of RPGs take a fantasy approach to their themes, terms like “dragon”, “demon”, and “monster” appear frequently throughout the corpus. This is not surprising, because that is often the clue or the ultimate goal of a fantasy RPG: to kill the biggest enemy threatening the world. What is surprising is that the “good” people are not mentioned often; words like “hero”, “angel”, and “warrior” don’t appear that often, even though their help might be essential for completing quests. Then there is “quests”. It is not surprising to find it in the middle of the figure, as almost every essential element of RPGs connects with it; it is the bridge that connects all the essential elements with each other.

In addition, some interesting facts:

  1. The princess is connected with the knight only by one word: “book”. So even fantasies know that the knight and the princess live in a fantasy world.
  2. Boys are connected with the word “named”, while heroines connect to the word “unnamed”.
  3. Angel, although mentioned fairly frequently, has a very weak relationship with the other essential factors, and it connects to them only through the word “dungeon”.
  4. Also, though perhaps less interesting, female characters do not occur frequently. Only the words “princess” and “heroine” could reflect female characters and, unfortunately, they are not strongly connected to the main frame.

Timeline Analysis

Screen Shot 2015-09-23 13:55:21 +0000

Screen Shot 2015-09-23 13:46:51 +0000

Tactical, action, and strategy are three main types of RPGs, and a very interesting pattern shows up: when tactical and action games are dominant, strategy games are not; when strategy games become popular, the other two fall out of favor. We think this is due to the wider acceptance of video games over time: action and tactical games are usually considered “hardcore” games, which are difficult to play and consumed only by hardcore gamers. As video games become more and more accepted, easy-to-learn games like strategy games become more and more popular in the industry.

Screen Shot 2015-09-23 13:49:44 +0000

More interestingly, it seems that gamers don’t like love; instead, they like war. Of course, this is because most RPG plots are based on wars. But, surprisingly, love is rarely mentioned until a certain point in time. After that, the frequency of “love” usually follows the same trend as that of “war” over time, which suggests that love usually comes with war.
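Voyant drew the trend lines discussed above for us. As a rough illustration of the kind of number behind such a line, the sketch below counts how often a term appears per 1,000 words in each corpus file. The function name `relative_frequencies`, the simple regex tokenizer, and the per-1,000 scaling are assumptions made for illustration; this is not a description of how Voyant computes its trends.

```python
import re
from typing import List


def relative_frequencies(segments: List[str], term: str, per: int = 1000) -> List[float]:
    """Return the occurrences of `term` per `per` words in each segment."""
    term = term.lower()
    trend = []
    for segment in segments:
        # Naive tokenizer: lowercase runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", segment.lower())
        if not words:
            trend.append(0.0)
            continue
        trend.append(per * words.count(term) / len(words))
    return trend
```

Calling this once with “love” and once with “war” over the same ordered list of files would give two series that can be compared side by side, which is the comparison made in the paragraph above.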

Jigsaw Analysis (Zhengri Fan)

Compared to Voyant, Jigsaw is not that useful overall, especially in our circumstances. Jigsaw’s advantage lies in identifying entities, that is, grouping nouns so that the members of each group share a common feature. But in our game analysis, only certain nouns recur across the corpus. For the most part, different games have different terms and settings, which makes it hard to identify entities in our corpus. Moreover, for word relationships, Jigsaw lacks the ability to put multiple words in a single figure to show their direct relationships, while Voyant is easier to use and shows the relationships between different keywords more clearly.

But we could still get some use out of Jigsaw.

For example:


This is a list view in Jigsaw, listing organizations and game names in order of frequency. It is not surprising that “Playstation 2” ranks highest among all organizations, as it is one of the most popular game consoles in gaming history. Furthermore, “Final Fantasy” is mentioned the most among all the game names, as it is the world’s most influential RPG.


Another useful feature of Jigsaw is the word tree feature.



Based on the previous Voyant analysis, we could see that female characters are not mentioned frequently in RPGs, and the word tree feature in Jigsaw supports this. As we can see, far more words connect to “boy” than to “princess”, which could suggest that princesses are not often mentioned in RPGs, or at least are not important characters in the games.

As mentioned, since Jigsaw groups nouns, it is useful for our project in that it could help us identify the relationships of the games with different people and organizations. But since that is not the main concern of our project, we chose not to discuss it in detail. Voyant, being a plainer statistical tool than Jigsaw, helped us more in surfacing particular aspects of the corpus we created, and in many more ways than Jigsaw.

Reflection on Tanya’s Reading

Our corpus is no more than plain text. By applying Voyant and Jigsaw to it, we could see those cold-blooded words become more meaningful reflections of the real world, of humanity. Voyant and Jigsaw are very different tools; by applying both of them, we could analyze our raw data from different angles. And by knowing the data from different angles, we learned the subject in a detailed and profound way: we are not focusing on only one specific plane of the world, but are seeing the whole issue as a three-dimensional object established by the tools; we are not only seeing the cold-blooded data, but also seeing people’s viewpoints on it. And, at last, we connected the digital tools with humanity.




Fact, Digital, and… Art?

Some say that Luke DuBois’s work combines art and the digital to represent data. That is true in some contexts, in that his visualizations neither simply accumulate numbers nor superficially show off the complex, magical algorithms that he writes. Aesthetically, they do make me feel some distinctive differences from the nerdy visualizations that science people, possibly including me, make. But are they art, or could we call them art? We definitely didn’t get in touch with all his works, and the works that our gallery showed might not be the best among all his “arts”. But I cannot feel the artistic attributes of the works on display. Yes, the moving pictures I can call art. But the word cloud and the keyword maps: I highly doubt their definition as “art”, if anyone would define them that way.

The reason is simple: I cannot see anything that he intended to express. I’m not an art person, so my evaluation might be invalid, or full of BS. However, in my own view, any art, no matter what format it is in, is trying to tell the viewer, through some kind of human instinct, the passionate, detailed feelings or words that its maker wants to tell. But I cannot see his passionate words. As for the word cloud, well, it is interesting, and that is all I felt. It just expresses some facts, or not even true “facts”, but some biased expressions based on the “lies” of the presidents. As for the word map, yes, it is interesting, very interesting. But a figure made by hand could not be categorized as art, as I could not hear him screaming anything; he was simply generating words from some amazing black box. And that is why I like the “self-portrait”. It is obviously based on fact, and was probably organized by computer, and most importantly, it is art. He is expressing. He is, from my point of view, telling of the vastness of human communication and networks. His conversations make a picture of beautiful “stars” orbiting the vast star called Luke DuBois. He is not telling this fact using any numbers, bar graphs, or pie charts, but using a beautiful “photograph” of stars.






PS: I would like to post a picture of the self-portrait, but I cannot find a suitable one on the internet.

Visualizations of My Interest

The reason I picked these two visualizations is that, at a glance, they looked pretty cool and profound, and both are very interesting to me.





The first one, Musicovery, provides a two-dimensional plane to represent two different aspects of the music in the library: its mood, either dark or positive, and its pace, either energetic or calm. In addition, each piece of music is assigned a color to represent its genre. The most interesting feature is that clicking on a spot that represents a particular combination of mood and pace brings up a list of music that best represents that combination. Just as its name suggests, it could help the user discover new music.





The second one is a static visualization about Oscar-winning actors. It shows the directors of the movies for which the actors won their prizes, and other non-Oscar-winning actors who have worked with those directors. Interestingly, it is something of a tree diagram: the Oscar-winning movies sit at the very top, followed by their directors, the Oscar-winning actors, and the actors who worked with them. As mentioned, it seems to be a very beautiful and cool visualization. More than that, it sufficiently addresses the question that motivated it: what the relationships are between directors, Oscar-winning actors, and non-Oscar-winning actors, and whether patterns can be found in who actors work with. As the figure clearly shows, only a few of the outermost names have many connections with the inner names. However, unlike the previous one, this one serves only its own purpose, and it is very hard to view from another perspective. With the first one, because one can click anywhere to get music that represents that combination of mood and pace, different aspects can be obtained, such as how rock spreads over the mood-pace plane, or how the popularity of different music trends over time. Also, the first one’s main aim is to help the user explore the data (the music), whereas the second one aims to show the relationship between Oscar prizes, directors, and actors.