Assignment 2 - Yifu

An Analysis of News and the Chinese Stock Market

  • Construction of corpus

The topic I focused on is the Chinese stock market. I wanted to see whether there is any clear connection between news reports on the Chinese stock market and its swings between a bull (good and rising) market and a bear (bad and falling) market. I chose two critical periods: the biggest bull market, in 2007, and the recent bear market, in 2015. For each period I manually collected around 30 news articles from 5 major English news portals: BBC, USNEWS, China Daily, Reuters and NY Times. These articles are highly relevant to the topic because I selected them by hand, and I picked ones from slightly earlier in the timeline because I wondered whether news reports foresee an upcoming rise or fall in the stock market.

  • In Voyant

[Figures: Voyant word-frequency pictures for the bear and bull markets]

These two pictures show the word frequencies for the bull and bear markets. After I cleared out some misleading words like “china” and “stock”, the two pictures look reasonable enough. You need to look at a fairly subtle level and ignore generic stock terms like “percent” and “index”. In the bear-market picture, “selling”, “fell”, “brokerages” and “lost” show up quite a few times. In the bull-market picture none of these appears. Instead, “large”, “development” and “growth” appear in many of the news reports.
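The filtering step described above can be sketched in Python. This is only an illustration of the idea, not how Voyant works internally: the stop list, the function name `top_words`, and the two sample sentences are my own assumptions and are not taken from the corpus.

```python
import re
from collections import Counter

# Hypothetical stop list: a few common English words plus the
# corpus-specific terms mentioned above ("china", "stock", "percent", "index").
STOPWORDS = {
    "the", "a", "an", "of", "in", "and", "to", "on", "by", "as", "is",
    "china", "chinese", "stock", "stocks", "percent", "index", "market",
}

def top_words(texts, n=10, stopwords=STOPWORDS):
    """Return the n most frequent words across texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic runs only, so "Fell," and "fell" match.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Invented sample sentences, used only to show the call.
bear_sample = [
    "Shares fell as brokerages lost ground and selling spread.",
    "The index fell 5 percent; investors lost confidence.",
]
print(top_words(bear_sample, n=2))
```

Running the same function on the bull and bear articles separately would give two ranked lists that can be compared side by side, which is roughly what the two pictures do visually.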

[Figures: Voyant word-connection views for the bull and bear markets]

Voyant also shows the connections between words. In the bull-market view, “rise” has 17 connections while “fell” has 16. In the bear-market view, “rise” has only 3 connections but “fell” has 21. And in the bear-market articles the word “rise” is sometimes connected to “crisis”.
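One simple way to approximate these "connections" is to count which words occur within a few tokens of a keyword. The sketch below assumes that definition; I do not know exactly how Voyant computes its links, and the function name `collocates`, the window size, and the sample sentences are hypothetical.

```python
import re
from collections import Counter

def collocates(texts, keyword, window=3, stopwords=frozenset()):
    """Count words appearing within `window` tokens of `keyword`."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            # Take up to `window` tokens on each side of the keyword.
            lo = max(0, i - window)
            neighbours = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            counts.update(
                t for t in neighbours if t not in stopwords and t != keyword
            )
    return counts

# Invented sample sentences, used only to show the call.
sample = ["Prices rise amid crisis fears", "Shares fell sharply"]
links = collocates(sample, "rise", window=2)
print(len(links), sorted(links))
```

The number of distinct collocates (`len(links)`) plays the role of the connection count, so comparing it for “rise” and “fell” in each corpus would mirror the comparison made above.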

  • In Jigsaw

The picture is almost the same in Jigsaw, but Jigsaw focuses more on entities than on individual words. So for my research on the connection between news and the stock market, Jigsaw is somewhat less useful than Voyant. But one thing I found very interesting is the sentiment analysis.

[Figures: Jigsaw sentiment bars for the bear and bull markets]

These two bars represent the sentiment value of each text. The more blue, or the further to the right, the more sad or negative words in the text. The more red, or the further to the left, the more happy words in the text. Without even guessing, people can clearly see that the top one refers to the bull market while the bottom one refers to the bear market.
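A very rough version of such a score can be built by counting happy and sad words from a word list. The sketch below assumes that approach; it is not Jigsaw's actual method, and the two tiny word lists and the function name `sentiment_score` are hypothetical (a real lexicon would be much larger).

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"growth", "rise", "gain", "development", "large"}
NEGATIVE = {"fell", "lost", "selling", "crisis", "fall"}

def sentiment_score(text):
    """Return a score in [-1, 1]: +1 all positive, -1 all negative words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    # A text with no lexicon words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total

print(sentiment_score("Growth and development continue"))
print(sentiment_score("Shares fell and investors lost money"))
```

Scoring every article this way and plotting the scores in order would give one bar per corpus, similar in spirit to the two bars above.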

This sentiment analysis is crucial because when people want to know how the stock market is behaving, the most convenient resource for them is news reports. And with sentiment analysis, they could hopefully foresee the upcoming turbulence or jump in the market.

  • Comparison

I would say that these two platforms are both very useful, but they work in different ways. Voyant seems to be more compatible with every kind of text, no matter how long or short, because Voyant analyzes all the texts together (compared to Jigsaw). Jigsaw is pickier about texts. You have to give Jigsaw a lot of different texts so that it can do the entity identification. I’d say that for certain areas of research, Jigsaw would be more appropriate. I used Voyant more throughout my research.

  • Conclusion

As for Tenya’s argument, I don’t know if there is a right answer, but I do think that by doing text analysis people can see many things and connections that they would not notice if they only read the text as a whole. “These cameras and the resulting images” did provide me with multidimensional and interesting views of the sources. I would like to work on this more with my major topic if I have the chance in the future. I believe it would be very thought-provoking to link math with statistical texts.


The Multi-functional Visualizations


I chose this one as my first example because it caught my eye right away. This visualization is very clear and simple in its format and color. The piece, made for an Airbnb presentation, maps out Airbnb’s top 50 markets. The thickness of the lines corresponds to the relative volume of travel between each pair of markets, which are labeled along the outside of this radial network visualization. The interesting thing I found is that it gives you several impressions at the same time. Instead of numbers coming at you, the color, the thickness of the lines, the bigger names and the crossing of the lines all reach you at once. So people can form a kind of intuition of what the picture wants to deliver. It is easy to see that New York, Paris and San Francisco are the three places with the most lines connecting to the other places. The lines between New York and Sydney and between New York and Paris are much thicker than the others, so people grasp the general shape of the information before they look for exact data (in fact, there is no exact data).



This one is of the same type, but the differences between the colors and lines are clearer. It shows the connections between companies in the Hadoop world. Again, a lot of information can be taken in at first sight. The big names, thick lines and colors each represent different information, and together they give you the whole picture of the topic. With such an explosive amount of data, though, you need to find out more about the topic as you dig into the picture to form your own idea.

Art by DuBois

The work done by Luke DuBois really shocked me last Friday. I felt this strongly because I am a math major and I had never learned that data could be shown in such ways. I had assumed that data visualization was about making data look more beautiful and user-friendly to the reader. The work by DuBois taught me that it is not only a comfortable experience to look at the artful pictures and printed works, but that there is also a more natural, deep and human way to approach data. People do not read boring numbers anymore; they see some real “things” instead. They see the data as something touchable and readable.

My favorite one is definitely the map of America based on those words. To me it is more than a map, but ironically it works the same as a regular map in my mind. I did not notice any difference in how my brain processed the data I acquired from the map. Those words were, again, more like real things than words themselves, and you know that those things stand for certain places as soon as you see them. I also think it is a very thought-provoking idea to replace some data with its characteristics in future study.