Category: Week 2

Week 2 – Rennie Heza

Ignorance, in the realm of digital scholarship, is not taking on a project too large, or asking a question too difficult to answer. Instead, ignorance is allowing yourself to believe a project is perfect.

As I talked through project ideas with library staff this week, the main goal of writing my project charter remained the same: seek to answer a question you would like to see solved, and worry about the details later. After a week of struggle, I am proud to share my first official proposed project summary:


“Hockey is undeniably old, but that doesn’t mean the game isn’t constantly advancing. In recent history, play has become quicker, and strategy has changed immensely, but the goal of each team year in and year out has remained the same: win the Stanley Cup. The Stanley Cup is the championship trophy of North America’s National Hockey League (NHL), widely regarded as the best hockey league in the world. Each of the 30 current NHL teams will do anything to gain a competitive edge over their opponents. The newest arms race in the world of sports: advanced data analysis. But answering the question: “What does a team have to do to win?” is far from simple.

This project is intended to compare the performance of each NHL team during the 2016-2017 regular season, and further, during the 2017 Stanley Cup Playoffs. Through regression and correlation analysis, the project will first identify key factors and metrics involved in constructing a successful NHL roster, in both the regular season and the playoffs. Though the two settings are similar, I expect the main factors contributing to success in each will differ, leading to interesting decisions for team management when analyzing a team’s potential.

Then, using this data, a model will be created to predict a team’s success in both the regular season and playoffs. The ultimate goal is an interactive model that lets fans, players, and general managers track their team’s projected success given roster moves, player injuries, line shuffling, etc. Using simple inputs (the metrics determined to be most influential in the first portion of the project), the model will graph a team’s projected future performance in terms of regular season wins and playoff wins, player scoring, and goalie performance. This interface will ensure that even the most casual hockey fans can understand the purpose of the model, without having to input or output complex metrics. However, the implications of such a model could involve high-level personnel decisions. Thus, this project will appeal to a wide range of individuals.

This outwardly simple model will cater to fans and hockey professionals alike, while providing statistically backed predictions of team performance. The goal is to identify the building blocks needed to create a contender, and then allow fans and hockey professionals to fiddle with a team’s makeup to identify the best strategy for any NHL team to improve its season predictions. This will allow complex analysis to be performed on any team, by anyone with Internet access. Such a tool has yet to be created, but this summer, that will change.”
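To make the "regression and correlation analysis" step in the summary a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the metric (shot-attempt share), the five data points, and the function names are invented, and the real project may well use different metrics and a statistics package rather than hand-written formulas.

```python
from math import sqrt

# Hypothetical per-team numbers, invented purely for illustration:
# shot-attempt share (percent) and regular-season wins.
shot_share = [45.0, 48.0, 50.0, 52.0, 55.0]
wins = [31, 35, 41, 43, 50]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


r = pearson_r(shot_share, wins)
slope, intercept = fit_line(shot_share, wins)

# Projected wins for a hypothetical team with a 51% shot-attempt share.
projected_wins = slope * 51.0 + intercept
print(f"r = {r:.3f}, projected wins = {projected_wins:.1f}")
```

A strong correlation on a toy sample like this says nothing about real NHL teams; the point is only to show the shape of the calculation that would sit behind the "simple inputs" the summary describes.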


Far from perfect? Sure. A project I can’t wait to begin? Absolutely.

Week 2 recap: Let’s talk about text

This week was all about the text.

On Monday, we were joined by Diane Jakacki as we dived into distant reading using Voyant. Voyant is a pretty easy tool to get the hang of, so in no time our students were out of the gates analyzing the corpus we provided them as well as their own documents. I think what is great about Voyant is that the interface is so user-friendly, yet the sheer number of tools available on the platform can keep you busy for hours. Between the seven people in the room I think we only explored a fraction of Voyant’s capabilities.

What I really appreciated from Diane’s introduction to text analysis on Monday was the emphasis on distant and close reading working in tandem with one another. You can infer a lot from the visualizations in Voyant, but if you don’t have knowledge of the text you are working with, you will be left with a very superficial understanding. This set us up perfectly for a deep dive into close reading through TEI on Tuesday and Wednesday.
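For anyone wondering what distant reading boils down to at its simplest, counting terms is a reasonable starting point. The sketch below is not how Voyant works internally; it is a minimal stand-in, and the stopword list and sample sentence are invented for illustration (the sentence is not taken from any of the documents we used).

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real tools ship much longer ones.
STOPWORDS = {"the", "a", "and", "to", "of", "in", "we", "at"}


def term_frequencies(text):
    """Count lowercase word occurrences, skipping stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


# Invented sample text, standing in for a real corpus.
sample = (
    "We marched to the river. The river was high, "
    "and the camp by the river flooded."
)

freqs = term_frequencies(sample)
top_term = freqs.most_common(1)[0]
print(top_term)
```

The counts tell you that "river" dominates this passage, but not why it matters, which is exactly the gap close reading fills.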

We started close reading at the very beginning with transcribing diary entries of James Merrill Linn from May 1862. James Merrill Linn was in the first graduating class at Bucknell and a local of Lewisburg. He went on to enlist in the Union army at the beginning of the Civil War and kept extensive diaries throughout. Linn’s papers are part of the collection in Special Collections and University Archives, and so far two classes have worked on transcribing and marking up his letters and diaries, so our fellows’ work this week will contribute to that ongoing project. As we transcribed and began to put the diary entries in order, everyone became pretty invested in Linn’s exploits. This was good news because the next step in the process was marking up the documents.

I think when our fellows downloaded Oxygen onto their computers there was some apprehension as to what would come next. However, everyone caught on to TEI very quickly. Debates arose on the appropriate tags to use, and I still don’t think we reached consensus on whether death is an event, state, or trait. Once again it was great to watch the collaboration as the fellows worked through how to mark up and tag each of their documents together.

Since we still had unanswered questions about Linn, I arranged a visit on Wednesday to Special Collections and University Archives with Assistant Archivist Crystal Matjasic. It gave us an opportunity to take a look at the originals of our letters and shed light on some transcription questions the image files left us with. While we were there, Tyler figured out that Bucknell is in possession of not one, but two clocks that are made of coal, so I foresee another trip to the Archives in Tyler’s future.

The week ended with each student presenting their Project Charters to the group and individual meetings with the students. Carrie and I are so impressed by the progress each student has made in the last two weeks! It is amazing to watch the continual refinement of research questions, scope, and which tools will add the most to each project. As librarians we don’t always get to see the entire process unfold; we only see snippets, so this has been an incredibly rewarding two weeks. Also completely terrifying, as we realize we are a quarter of the way done! I will be gone the next two weeks at DHSI and visiting family, so I can’t even imagine the progress that will be made by the time I come back. Carrie will be keeping all of us posted 🙂

My Refined Project Idea

  • Project Name
    • The Representation of Anthracite Coal Miners: An Artistic Movement
  • Project Summary
      • “Through analyzing different artistic media, I will identify the ways in which coal miners have been represented and the impact of their lives on the anthracite region.”
      • Areas: Northumberland, Luzerne, Schuylkill, Lackawanna, Columbia, and Carbon
        • In 8 weeks, I want to create a multimedia platform that includes artwork, poems/brief stories, songs, monuments, photography, and possibly film representing coal miners. I would also like to get audio clips from citizens from each area in the coal region.
        • I want to use my ideas of social memory and spatial memory while outlining the importance of where the monuments are situated.
        • I think Scalar would be the best platform to display all of my media. I would include forms of text analysis on the poems, brief stories, and songs. I would also like to do some sentiment visualizations with the songs. I want to create some data visualizations as well.  
        • My audience will be the people of the anthracite region. I want to show a collection of the representation of coal miners to the people who chose to remember the coal miners, or those who maybe did not choose to remember (Shamokin).
  • Environmental Scan
    • My project mapping monuments, “Shamokin and Coal Township: An Interactive Map” is the only useful website I could find.
    • There are really no other similar projects, except some authors discuss the representation or do a “study” of the “Appalachian” region.
    • This project will focus on anthracite coal miners, and include an abundance of different artistic representations.
  • Requirements for Development
    • Copyright Concerns: song lyrics, poems/stories, photography
    • I can go to the different historical societies of each region and online databases to find some of the materials.
    • I will need resources on the ideas of social memory. I would also like some books on art aesthetic theory.
    • The coal miners union
    • PA state records: WPA
  • Bulleted List of Deliverables
    • Find at least three examples of each artistic element: artwork, monuments, photography, poems/stories, and songs for each of the six counties
    • Gather audio clips from at least two people from each of the six counties
    • Analyze the representations, using Marx, social memory, spatial memory theories
    • Create a Scalar website
        • Include the media components
        • Use text analysis tools on the songs and poems/stories
        • Use sentiment analysis tools on the song lyrics
        • Use theories to discuss artwork, monuments, and photography
      • Critique bad or fetishized representations of coal miners
      • Include a brief history of anthracite coal miners
  • End of Life/Future Plans
    • This project could easily be expanded. One could continue finding different artistic representations in the anthracite region, or begin researching other coal regions of PA or other states. Then, it could be connected with different coal regions from around the world.
    • Also, the monuments could be mapped, like my first research project, and one could research each town individually.
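The sentiment analysis deliverable for song lyrics could start from something as simple as a word list with scores. The sketch below is only that: the lexicon and the two lyric lines are invented for illustration (they are not real anthracite-region songs), and an actual project would more likely rely on an established sentiment tool or lexicon.

```python
import re

# Hypothetical mini-lexicon, invented for illustration only:
# negative words score -1, positive words score +1.
LEXICON = {
    "dark": -1,
    "weary": -1,
    "lost": -1,
    "proud": 1,
    "home": 1,
    "bright": 1,
}


def sentiment_score(line):
    """Sum the lexicon scores of the words in one line of text."""
    words = re.findall(r"[a-z']+", line.lower())
    return sum(LEXICON.get(word, 0) for word in words)


# Invented lyric lines, standing in for real song texts.
lyrics = [
    "Down in the dark and weary mine",
    "Proud we walk the long road home",
]

scores = [sentiment_score(line) for line in lyrics]
print(scores)
```

Scores like these could feed a line-by-line chart for each song, though a word list this crude will miss irony and context, which is where the close reading and theory in the other deliverables come in.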


As I begin looking over my bulleted list of deliverables and thinking about how I could refine my project, I believe my scope will instead be defined by my media. I will use only three different mediums: written art, visual art, and newspapers. My area will be the anthracite region broadly conceived, and I think I will use the time periods of 1900-1930 and 1970 to present to look at how miners were represented during the height of the coal mining industry and after the fall of the industry. I believe this will neither limit my research nor overwhelm me, since I will be finding information through newspapers and artifacts that interest me in terms of written and visual art.

Week 2 – Justin Guzman

LGBTQ+ Representation in Cinema: Hollywood vs Indie


Representation in Hollywood Cinema is and has always been a recurring issue. Despite the pace at which society is advancing in terms of identity acceptance, Hollywood Cinema is somewhat behind. Indie (independent) Cinema, however, seems to be doing a far better job. A few of my research questions are: Why does Hollywood lack proper representation of the LGBTQ+ community? Is there such a thing as proper representation? What are the reasons representation is such an issue? Is it the audiences of America, or are producers and directors playing the movie business safe by respecting older values?

My project will identify the different ways in which LGBTQ+ representation in cinema has changed over the last 30-40 years, as that is roughly the period in which the LGBTQ+ movement has become one of the most powerful movements in the United States. The scope of my project is limited to about 50 films, with plans to expand. By the end of the summer I need to have at least 35 films included in my research, but my ultimate goal is to get 50 included. This includes films that are both Hollywood produced and independently produced. I want this project to identify some of the problems with the film industry, but I also want people to be able to use it as a way of finding films that have LGBTQ+ representation in them, so that they can judge each film’s approach to representation for themselves while also being able to express what they think, respond, and share. This has the possibility for growth if the proper seed is planted and people are actually interested. So far I haven’t found any projects that really discuss representation using digital tools, but I have found projects that gather films for preservation and restoration. I can build upon these projects by starting an actual conversation about them, even if that conversation is one that’s centered at Bucknell. With digital tools I will be able to identify key parts of screenplays (as they’re available), significant moments in the motion picture, and the audience’s response to the film (again, as available). One issue I can see myself running into is copyright, since I am using films in my project, but fair use might help me out, since I would only be using portions of the films for educational and noncommercial purposes, with proper credit given.

Jason Snyder was a huge help in figuring out whether the project I had in mind before was actually feasible and whether it was doing anything that John Hunter’s database or the Library Catalog was already doing. Instead of creating a place for people to find LGBTQ films, I’d rather help people start a conversation while being able to see actual data from the film to reference. “Show me the data.”

Over the next six weeks I plan to gather and analyze data on whatever films I can find that have some sort of representation of LGBTQ+ people, whether it’s negative or positive. Within the next four weeks I hope to have a site designed that will allow people to easily access all of the data used and exchange information through something such as a comment section.

As I said before, I want this project to be fueled by conversation after the data is gathered and presented. I don’t know if I’ll keep this going after I graduate, but it would be fantastic if the LGBTQ+ office on campus wanted to get involved after it’s done.



  • Week 3
    • Develop a clear list of gay, lesbian, bisexual, and transgender films
      • Budget, release date, profit, number of screens
    • Organize them by decade
    • Talk to Glynnis, Erica, Rebecca, Ken
    • Find as many screenplays as possible
    • Develop Scalar site
      • Timeline
      • Discussion
  • Week 4
    • Voyant analysis
    • Analysis of films using John Hunter’s method + ELAN
    • Continue Scalar site development
  • Week 5
    • Continued analysis of films (screenplays and motion picture)
    • Continued development of site
    • Start loading films onto site
  • Week 6
    • Complete analysis of films
    • Complete development of site
  • Week 7 + 8
    • Final edits
    • Conclusions

Project Charter: China’s Internet Wave

Over the past week, we were introduced to several text analysis tools and it was interesting to learn that there are so many ways we can look at a text. Even the concept of close and distant reading is new to me, and it showed me the value of both being aware of the big picture and also delving into some of the details. We also spent some time working on our individual project charters, and below is a summary of mine:

Internet use in China over the past few years has increased tremendously and its growth shows no sign of slowing down. In just the first 6 months of 2016, there was an increase of 21.32 million Internet users in the country, and Internet penetration continues to climb at a steady rate. As a result, the areas in which business opportunities lie have drastically changed and people are becoming increasingly reliant on the Internet in their everyday lives. My project will explore the social and economic impacts of China’s internet wave on different regions in the country by mapping out the significant milestones between 2005 (when Internet usage in China really took off) and 2016 (the latest data I could find). After analyzing the data on Internet development in China, I will then focus on case studies in the areas of online shopping, meal-ordering, car transportation services, and instant messaging. There are several case studies I am hoping to look further into, including Alibaba (阿里巴巴), Taobao (淘宝), WeChat (微信), and Dianping (大众点评). If possible, I hope to create interactive timelines for each of these companies, illustrating their growth over the past decade or so, and incorporate these timelines into StoryMap. I also hope to ultimately formulate a clear picture of where in the country the effects of these advancements are most felt and whether the urban-rural gap in China is widening.

There are several available reports on the Internet development in China, with one of the most comprehensive being the statistical report created by CNNIC which was last published in July 2016. One of the most recent reports is Acquisdata’s report on China Information Technology published on 31 May 2017. These reports provide very useful statistics and numbers with regards to my research questions. However, they only very briefly touch upon the specific developments that the big players in the market are making and hardly provide in-depth qualitative analyses of the trends mentioned. On the other hand, there are also many books and articles written about specific companies – their stories and their progress, without putting them in the context of the other developments that have been introduced to the country. I hope to be able to combine both qualitative and quantitative analyses of the topic while at the same time looking at the bigger picture and mapping out where the important changes are taking place and further exploring the implications of my findings.

I am excited to be spending more time on case studies of the individual companies next week and perhaps (hopefully) stumbling upon new questions related to my project that I have never thought about!