Bucknell Curriculum Visualization

When high school students begin the college search, they are repeatedly bombarded with the same information about class size, department strength, learning goals, etc. from every university they encounter.  Each institution, in the interest of attracting students to apply, wants to put its best foot forward.  Understanding this motive behind the information Bucknell (as well as other colleges) makes publicly available on its website invites further scrutiny: does the information change once students commit to Bucknell?

The Bucknell mission statement, learning goals, college core curriculum goals, and department summaries are available to anyone on the university website.  All of this information essentially communicates the same thing: enabled by a Bucknell education, students grow into more mindful, critically thinking, capable, creative, and culturally aware contributing members of the global community.  Does the information available only to people with a Bucknell login, such as course descriptions and the specific classes that fill particular CCC requirements, carry the same content and cadence?  Is the public face of Bucknell, constructed through its publicly accessible website information, representative of a Bucknell student’s educational reality?

My personal stake in this research has to do with the difficulty I had selecting a major.  Every adviser tells incoming freshmen to take their time exploring and to start by filling general education requirements before settling into a major.  I was told I had plenty of time to decide, but when the time came to declare a major I didn’t feel as though twelve credits’ worth of experience was enough to go on.  Coming from a fairly generic high school, I had no idea what it would mean to be an anthropologist, economist, creative writer, or comparative humanist because I had no experience and knew of no one who had experience in these fields.  If the publicly accessible department descriptions are not truly representative of the field, more pressure falls on course selection as the way for students to gain insight into a branch of knowledge.  But how can students be expected to choose courses they will enjoy and gain meaningful experiences from if the selection process is a gamble?

I began with a specific interest in the materials studied in the three comparative humanities core courses.  Visualizing genre and author/artist gender and ethnicity drew attention to the gaps in the courses’ coverage, specifically a lack of women and non-Western authors.  (Visualizations below created in Palladio: on the left a graph view dividing the course materials based on gender, on the right a map view plotting the materials’ location of publication.)

palladio graph author sex     palladio map sized

From there, I became interested in broadening the scope of the visualization to the university as a whole.  Since I do not have access to all the syllabi in every department, I had to shift the focus of the visualization to a different, but related, set of data: course descriptions and requirements as seen in the online course catalog.  This data is especially intriguing because, although it is easily accessible for all Bucknell students making choices about which classes to take, its presentation (a glorified spreadsheet) is indigestible and makes comparison difficult.  My goal was to find a way to view all, or as much as possible, of the data at once in order to access a macro-perspective.  Initially I planned to use Stefanie Posavec’s “Writing Without Words” (below left) as a guide for the tree-like structure I wanted to create.  As “Writing Without Words” reveals Kerouac’s structural style in On the Road, I thought a similar design could reveal the structure of Bucknell’s course offerings.  After some experimentation, I realized my data appeared confusing and sloppy in such a format.  Instead, I borrowed the circular structure of Boris Müller’s “Poetry on the Road” (below right) to give shape to my data.

writing without words                   poetry viz

The “Poetry on the Road” model enabled me to more closely follow Tufte’s principles of display architecture, which include: “(1) documenting the sources and characteristics of the data,” which the visualization accomplishes through its shape, designed to reflect the relationships between departments via CCC requirements; “(2) insistently enforcing appropriate comparisons,” made possible through the various options for node sizing; “(3) demonstrating mechanisms of cause and effect,” by the simple organization of data into the democratic, circular structure in which the viewer’s eye is not drawn to a particular area for any reason other than the concentration of edges; “(4) expressing those mechanisms quantitatively,” as I did by sizing and connecting each node based on quantitative data from the course catalog; “(5) recognizing the inherently multivariate nature of analytic problems,” shown through the combination of variables such as node color, size, and location, and different CCC requirements; “and (6) inspecting and evaluating alternative explanations,” as we explore in Nadeem’s interactive network visualization for each department (Tufte 53).

template

Inspired by “Poetry on the Road,” I organized all of Bucknell’s academic departments into rings based on the size of each College/School (above).  The outer two rings, with nodes colored purple, represent the College of Arts and Sciences.  Since the College is so big, I split it further into an Arts and Humanities ring and a Science (hard and social) ring in order to make the visualization easier on the eyes.  The center ring, with red nodes, represents the College of Engineering.  The inner ring, with blue/green nodes, represents the School of Management.  In this particular visualization I chose to size nodes based on the number of unique courses offered in each department for the Fall 2015 semester.  For example, the music department has the highest number of unique courses (73), so it is represented by the largest node, and astronomy is one of the departments tied for the lowest number of unique courses (1), so it is represented by the smallest node.  I initially intended to make node size a variable for comparison by creating alternative visualizations with nodes sized based on the total number of courses offered or the number of possible ways to fill CCC requirements in a particular department, but altering node size did not fit seamlessly into the narrative of the project as a whole.
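
The ring placement and node sizing described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the code behind the published visualization: the function names, the radii, and the pixel range are invented for the example, and only the Music (73) and Astronomy (1) course counts come from the catalog figures quoted above.  The two "Placeholder" departments and their counts are hypothetical.

```python
import math

def ring_positions(names, radius):
    """Space department nodes evenly around a circle of the given radius."""
    step = 2 * math.pi / len(names)
    return {
        name: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, name in enumerate(names)
    }

def node_size(count, lo, hi, min_px=4.0, max_px=40.0):
    """Linearly map a course count onto a node diameter (pixel range is assumed)."""
    if hi == lo:
        return min_px
    return min_px + (count - lo) / (hi - lo) * (max_px - min_px)

# Ring name -> (assumed radius, departments on that ring).
rings = {
    "Arts and Humanities": (400.0, ["Music", "Placeholder Dept A"]),
    "Sciences": (300.0, ["Astronomy", "Placeholder Dept B"]),
}

# Music and Astronomy counts are from the text; the placeholders are hypothetical.
unique_courses = {
    "Music": 73,
    "Astronomy": 1,
    "Placeholder Dept A": 20,
    "Placeholder Dept B": 12,
}

lo, hi = min(unique_courses.values()), max(unique_courses.values())
sizes = {dept: node_size(n, lo, hi) for dept, n in unique_courses.items()}

positions = {}
for radius, departments in rings.values():
    positions.update(ring_positions(departments, radius))
```

Swapping `unique_courses` for a different count (total courses, or ways to fill a CCC requirement) would resize the nodes without moving them, which is the comparison I originally had in mind.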

circle.unique.allpub  circle.unique.CCQR.DUSCpub

Since my intention was to create a means to view as much of the course catalog information at once as possible, I first tried to map the edges for all the CCC requirements at once (above left).  Although it made for a decent website header image, the colorful quagmire is too cluttered to be analytically useful.  Even including as few as two CCC requirements on the same image does more harm in the clarity department than it does good for comparison purposes (above right, Quantitative Reasoning and Diversity in the US requirements pictured).

ARHC with nodes  ARHC

Although visualizing one CCC requirement at a time on top of the department nodes is simple enough to convey the data clearly, I decided to simplify even further by removing the nodes (Arts and Humanities requirement pictured above).  It became necessary to include a template of the nodes without any CCC requirements under the narrative tab in order for the visualization to make sense, but the visualization is still legible because the division of the different rings is intuitive enough to grasp without looking directly at the location of the nodes.  The image is also more visually impactful with just the edges.

macro–>  relationship –> micro

When it came time to combine the static and interactive aspects into a single visualization with a reasonably linear narrative, we decided to use the macro > relationship > micro view structure.  Starting with a macro view, a visualization will “facilitate the understanding of the network’s topology, the structure of the group as a whole, but not necessarily of its constituent parts” through a holistic view of the visualization, enabling users to see its overall pattern (Lima 91).  Our macro view is located in the narrative (above left).  It offers both an overview of Bucknell’s academic structure through the listing of learning goals and college core curriculum design taken directly from Bucknell’s website, and a color-coded comparison of Bucknell’s learning goals to its CCC design.  This choice contextualizes the visualization for viewers who may not be familiar with Bucknell’s academic mission.  From the narrative tab, the viewer is prompted to select the college core curriculum tab to access the relationship view (above center), which “is concerned with an effective analysis of the types of relationships among the mapped entities” (Lima 92).  The edges of our static relationship view offer a perspective on the relationships between different departments through CCC requirements.  Finally, the user can click on a node to explore a single department in more depth in the micro view (above right).  Although the micro view offers the narrowest perspective, it offers comprehensive, explicit, and “detailed information, facts, and characteristics on a single-node entity,” which helps to “clarify the reasons behind the overall connectivity pattern” (Lima 92).

 

http://curriculum.blogs.bucknell.edu/

Curriculum visualization (Nadeem Nasimi’s) http://nadeem.io/270/

Assignment 5

I collected my data from the Comparative Humanities core courses’ syllabuses.  The three core courses are: HUMN 128 – Myth Reason Faith (18th Century BCE-1295), HUMN 150 – Art Nature Knowledge (1486-1859), and HUMN 250 – Nihilism Modernism Uncertainty (1882-1957).  Together, these courses are advertised as the history of human thought.  They cover works starting in the 18th Century BCE with the Enuma Elish and ending in 1952 with Ralph Ellison’s Invisible Man.  I listed the title of each work we studied in each of the courses, along with its author, date of publication, coordinates, author’s sex, course it was taught in, and author’s ethnicity.  Because the curricula aim to cover such a large time scale, they cannot include every notable humanities work in every genre.  Visualizations of the course data can draw attention to the areas that may have become invisible in the process of simplification.

palladio map sized

Using the Palladio mapping feature, I plotted the location of publication/creation of each work we study in the humanities core curriculum.  The highest concentration of works is in London and Paris, with Europe in general heavily represented.  The curriculum does primarily cover Western thought, so this pattern is unsurprising.  However, South America, Africa, and Southeast Asia are entirely unrepresented.

palladio graph course ethnicity     gephi author-ethnicity

The gaps in coverage of author nationality/ethnicity are less immediately obvious in Palladio’s graph function and in Gephi than in the Palladio map.  The multitude of nodes gives the false impression of ethnic diversity.  The array of colored nodes in Gephi makes it seem like a broad range of ethnic groups is represented in the courses, but the network visualizations only show the groups that are represented, not the ones that are invisible.  However, the Palladio graph can compare the relative diversity of one course to another.
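
One way to surface what a network graph hides is to compare the categories present in the data against an explicit reference list.  The sketch below is illustrative only: the function name is my own, the reference list of regions is an assumption chosen for the example, and the observed values are placeholders standing in for a column of the course dataset rather than the real data.

```python
def absent_categories(reference, observed):
    """Return the reference categories that never appear in the observed data."""
    return sorted(set(reference) - set(observed))

# Hypothetical reference list; a real analysis would need a defensible taxonomy.
reference_regions = [
    "Africa",
    "Europe",
    "North America",
    "South America",
    "Southeast Asia",
]

# Placeholder values standing in for the region of each work in the dataset.
observed_regions = ["Europe", "Europe", "North America", "Europe"]

missing = absent_categories(reference_regions, observed_regions)
print(missing)
```

A list like `missing` could be added to the graph as unconnected nodes, so that the absent groups appear on screen instead of being silently left out.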

palladio graph course genre     gephi title-genre labeled

In both Palladio and Gephi I visualized the genre of each work, with the Palladio graph additionally dividing the genres based on the course they were covered in.  In the Gephi network, the dominance of philosophy and literature is obvious due to the color coding.  The Palladio network is more useful for showing the genres studied in each individual course rather than the total popularity of a single genre across the courses.

palladio graph author sex       gephi author-gender

I separated the authors into categories based on their sex, revealing the obvious and enormous disparity between the number of male and female authors.  Of the eight female authors we discuss in the humanities core courses, five (Wollstonecraft, Shelley, Woolf, de Beauvoir, Kaplan) are almost exclusively analyzed within the context of feminism.  None of the authors is considered non-binary.  Since the force-directed graph in Gephi shrinks the distance between “male” nodes, the gender gap is more visible in Palladio.
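
The tally behind this comparison is simple to reproduce outside of Palladio or Gephi.  The following is a minimal sketch, assuming the dataset is available as (author, sex) pairs; the four rows shown are placeholders, not the actual syllabus data, and the variable names are my own.  The only figures taken from the text are the five of eight female authors read mainly through feminism.

```python
from collections import Counter

# Placeholder rows (author label, sex); not the actual syllabus data.
authors = [
    ("Author A", "male"),
    ("Author B", "male"),
    ("Author C", "female"),
    ("Author D", "male"),
]

# Count authors per category, then convert the counts to shares of the whole.
by_sex = Counter(sex for _, sex in authors)
shares = {sex: n / len(authors) for sex, n in by_sex.items()}

# Figures stated in the text: 5 of the 8 female authors are read
# almost exclusively in the context of feminism.
feminism_share = 5 / 8
```

Reporting shares alongside the graph makes the disparity explicit even when a layout pulls the nodes of one category close together.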

 

gephi title-genre labeled fr modularity      gephi title-genre

The above visualizations of the different genres represented in humanities core syllabuses, both made in Gephi, are examples of different syntax using the same data.  In the Fruchterman-Reingold radial implosion, the most popular genres do not stand out as obviously as in the force-directed centralized burst.  The centralized burst concentrates the most relevant nodes at the center of the visualization, drawing the viewer’s eyes more quickly to the differences between nodes.  In the Fruchterman-Reingold layout, viewers must search for node color among the evenly spaced nodes.  One well-connected node is immediately noticeable, but most of the rest are lost in the sphere.

 

In both platforms I was unsure of how to visualize all of my data at once.  I can’t color code individual nodes in Palladio as I’d like to, and Gephi gets confused with too many variables.

Assignment 3

slav-wordle-7

I chose to visualize data from a discourse analysis of Mount Carmel Daily Item newspaper articles containing the word “Slav” from 1892-1910.  I created the dataset in order to make a word cloud (above) representing the perception of Slavs in the coal region during the turn of the 20th Century.  The analysis initially included the data categories: date of publication, location, article title, epithet (specifically the words in the article used to describe Slavs, split into the three categories: modifiers, verbs, and nouns), and people.  Refining the dataset for visualization in Palladio, I added geographic coordinates, removed people, and reorganized the epithets into epithet frequency categories: race, class, and total.

graph location.epithet

Although it was scrambled and confusing at first, I found Palladio’s graph view to be most interesting for this dataset.  The visualization above is a result of inputting total epithet frequency (highlighted) and location name (un-highlighted), and sizing the nodes based on article frequency.  Since the labels are too small to read without zooming in and losing the effect of the big picture, this view is most effective for seeing which entities occur in the most articles.

graph location.date.epithet 1

In an attempt to enhance the graph view’s most helpful feature, I input location and date, and sized the nodes based on total epithet frequency.  The resulting visualization (above) is cluttered, but more easily readable.

graph year

Simplifying the visualization, I organized the “year” nodes in chronological order like a timeline.  This might be the most effective visualization I created in Palladio because it is simple and clear.

graph race.year

After organizing the nodes for year (highlighted) and race epithet frequency (un-highlighted), the visualization (above) is more revealing.  The organization is inexact because the nodes were dragged by hand, but the viewer is able to see in which years the most articles containing the most race epithets were published.
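
Because the hand-dragged layout is inexact, the same comparison can be checked with a small tally.  This is a sketch under assumptions: the rows are hypothetical (year, race epithet count) pairs, one per article, and are not the real Daily Item dataset; the names `articles`, `totals`, and `peak_year` are my own.

```python
from collections import defaultdict

# Hypothetical rows: (year, number of race epithets in one article).
articles = [(1892, 2), (1892, 1), (1897, 4), (1902, 0), (1902, 3), (1902, 5)]

# Accumulate the article count and the race epithet total for each year.
totals = defaultdict(lambda: {"articles": 0, "race_epithets": 0})
for year, count in articles:
    totals[year]["articles"] += 1
    totals[year]["race_epithets"] += count

# Rank years from most to fewest race epithets.
ranked = sorted(
    totals.items(), key=lambda item: item[1]["race_epithets"], reverse=True
)
peak_year = ranked[0][0]
```

Sorting `ranked` by year instead would give the chronological order used for the timeline arrangement.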

map zoom

Using the geographic coordinates in the map view, I plotted the location of each article on a world map.  I wanted to use different shades of red to represent the density of entities in articles of a particular location, but Palladio would only show one color at a time even though there is an option to add multiple layers to the map.

map title 1  map title 2

Instead I used the node size option to represent the racial epithet frequency.  Zoomed out (above left), the viewer can see all the articles scattered across the world, but the upper east coast of the United States is taken over by a single blob of color because the view is not detailed enough to show the individual article representations.  Zoomed in (above right), the Coal Region is visible in more detail.  Since I am not familiar enough with Pennsylvania geography to be able to identify the town each of the nodes is located in without a label, this visualization is not very useful to me.  If this visualization could be laid over a road map of Pennsylvania, it would be interesting to see which town’s newspaper articles contained the greatest frequency of racial words.

table date.race.class

Although the table view can group the data by a chosen row dimension (I chose “year” for the visualization above), it has almost the same function as the spreadsheet in which the dataset was originally organized.

gallery

Similarly, the gallery view does not seem to provide any more insight into the data than a spreadsheet.  It is frustrating to use because only a small fraction of the data is visible at once, and the format is uninteresting because none of the articles I used to compile the dataset had accompanying images.