MSCS Capstone Showcase
MSCS Seniors
Welcome to the MSCS Capstone Showcase! Please join us in exploring our 2020-2021 seniors’ capstone projects.
All of our seniors complete a significant project from their capstone course. Below are the projects completed by students this year.

Sudoku Solving as a Graph Problem
Adeline Steward-Nolan ’21 (Oak Park, IL),
Computer science
-
Sudoku: Given a 9×9 grid, place the digits 1 through 9 so that each digit appears exactly once in every row, column, and 3×3 box.
My teammates and I wanted to try to build a sudoku solver using ideas from graph coloring problems. Instead of just thinking about numbers in rows, columns, and boxes, what if we thought about how the rows, columns, and boxes were connected? Using this framework, we were able to build a fast and accurate sudoku solving algorithm while learning about various other solutions.
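As a rough illustration of the graph view (not the team's actual solver), the sketch below builds a graph with one vertex per cell and an edge between any two cells that share a row, column, or box. A solved grid is then a proper 9-coloring of that graph. The function names `sudoku_graph` and `is_valid_coloring` are hypothetical and chosen only for this example.

```python
from itertools import product


def sudoku_graph():
    """Return adjacency sets for the 81-cell sudoku graph.

    Two cells are adjacent when they share a row, a column, or a 3x3 box.
    """
    cells = list(product(range(9), range(9)))
    neighbors = {cell: set() for cell in cells}
    for (r1, c1), (r2, c2) in product(cells, cells):
        if (r1, c1) == (r2, c2):
            continue
        same_row = r1 == r2
        same_col = c1 == c2
        same_box = (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)
        if same_row or same_col or same_box:
            neighbors[(r1, c1)].add((r2, c2))
    return neighbors


def is_valid_coloring(grid, neighbors):
    """Check that no two adjacent cells of a filled 9x9 grid share a digit."""
    return all(
        grid[r][c] != grid[nr][nc]
        for (r, c), adjacent in neighbors.items()
        for (nr, nc) in adjacent
    )
```

Every cell ends up with 20 neighbors (8 in its row, 8 in its column, and 4 more in its box), so a solver only needs to keep each cell's digit different from those 20.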

Difference-in-Differences Designs with Panel Data
Aidan Toner-Rodgers ’21 (Sebastopol, CA),
Applied mathematics & statistics
-
The canonical difference-in-differences (DD) setting has two groups and two periods. While this framework is well understood, real world applications often involve multiple treated units and variation in treatment timing. This project considers DD designs in such settings. In particular, I explore three distinct approaches: a two-way fixed effects model, a “stacked” DD design, and an event study analysis. In addition to discussing the strengths and weaknesses of these strategies, I illustrate their application by estimating the causal effect of collective bargaining rights on police use of force.
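For readers new to the canonical two-group, two-period setting, here is a minimal sketch of the DD estimate computed from four group means. The function name `did_estimate` and the numbers in the usage note are made up for illustration and are not results from this project.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 difference-in-differences estimate.

    Each argument is the mean outcome for one group in one period.
    The change in the control group stands in for what would have
    happened to the treated group without treatment.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change
```

With hypothetical means of 10 and 14 for the treated group and 8 and 9 for the control group, `did_estimate(10, 14, 8, 9)` gives 3: the treated group rose by 4, of which 1 is attributed to the common trend.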

Predicting EA Sports FIFA Team of the Season in Europe’s Top Five Leagues
Alexander Denzler ’21 (New York City),
Applied mathematics & statistics
-
The exact way that EA Sports FIFA chooses their Teams of the Season is unknown, and often faced with criticism. Although EA Sports’ decision is based on real life player performance, how they come to their conclusions, and which stats are most important to them, is unclear. We use machine learning techniques to uncover which statistics are most important when determining the Team of the Season in Europe’s top five leagues.

Blending Art and Real Life: Neural Style Transfer Exploration and Comparison
Alison Garrett-Engele ’21 (Needham, MA),
Computer science
-
As the field of artificial intelligence expands, so do applications of AI art generation and deep learning such as Neural Style Transfer (NST). This popular optimization approach blends content and style images by accessing the feature maps of the content image stored in the intermediate layers of a convolutional neural network. For this experiment, some fellow students and I modified a published NST approach (Yuan 2018) to create an NST script using the VGG-19 model. We experimented with different optimizers and intermediate layers of the model to quantitatively determine which optimizers and layers produced the best blended image.

Factors Influencing the Social Help-Seeking Behavior of Introductory Programming Students
Anael Kuperwajs Cohen ’21 (Redmond, WA),
Applied mathematics & statistics, Computer science
-
In a classroom setting, working with others can increase motivation to attempt more challenges, reduce the difficulty of complicated concepts, and bring about greater overall success. Despite extensive research in other domains, there has been minimal exploration within computing on what impacts a student’s decision to seek social assistance. To understand what affects introductory programming students’ social help-seeking behavior, we conducted 32 semi-structured interviews and performed thematic analysis and qualitative coding on the ensuing transcripts. Our qualitative analysis revealed 18 significant factors that can fit into four broad categories: Internal Drivers, Social Constraints, Classroom Policy and Culture, and Practical Limitations.

Quantification and Visualization of Racist Rhetoric Through Network Analysis
Anael Kuperwajs Cohen ’21 (Redmond, WA),
Applied mathematics & statistics, Computer science
-
Mathematics is often viewed as a domain that is disconnected from social movements. However, mathematical tools can help explore a wide range of topics and reveal new insights. The goal of this project was to use mathematics, specifically network science, to quantify and visualize racism. We chose five historical and modern texts that each offer a different perspective on race in order to develop a deeper understanding of racist rhetoric used in writing over time. The networks we created and analyzed were based on the important words in each text and their connections to the neighboring important words.

What Affects Independence Among the Elderly?
Analeidi Barrera ’21 (Round Lake Beach, Illinois),
Applied mathematics & statistics, Computer science
-
When people get older, they may have to rely on others to complete daily tasks that they were once able to do alone. However, there is obvious variation in the age and the degree to which people become dependent. This brings up an interesting question about what factors could make people more prone to early dependence than others. To study this question, we use the National Health Interview Survey data where we use bathing, toileting (using the bathroom), and eating as measures of dependence.

Sana: An interactive agent that simulates human speech
Analeidi Barrera ’21 (Round Lake Beach, Illinois),
Applied mathematics & statistics, Computer science
-
A chatbot is an agent or a computer program that is able to have a conversation with a user. Joseph Weizenbaum created Eliza, the first chatbot, in 1966. Ever since, we have seen many new and more sophisticated chatbots like Siri, Alexa, and Google Assistant. What is so fascinating about chatbots in the Artificial Intelligence world is that they use Natural Language Processing and machine learning. Using a Transformer model, a relatively new model used in NLP, Sana was made to maintain a conversation with users in hopes that it would be indistinguishable from humans.

You Can Be The Next Van Gogh: An Evaluation of Neural Style Transfer
Anh Nguyen ’21 (Ho Chi Minh City, Vietnam),
Computer science
-
Do you wish to draw like your favorite artist but lack the time or ability? Neural style transfer (NST) might be the thing you need! NST is an application of deep learning in which a content image is “painted” in the style of another image, such as a texture image or a famous painting. It is an important technique in AI art generation and is the driving technology behind numerous applications like photo filters. Join me to hear more about how NST utilizes convolutional neural networks to create unique and impressive artworks!

Success and Failure in NBA Free Agency
Arif Zamil ’21 (Albany, CA),
Computer science, Mathematics
-
NBA free agency is the yearly summer frenzy where teams throw millions of dollars at players in hopes they can lead them to a championship. Every year players outperform and underperform their contracts, but is there a certain strategy to the free agency market in general? My capstone will dive into multiple networks spanning 2015-2020 to determine how and why teams were successful or unsuccessful during NBA free agency.

Donald Trump, the Stock Market, and the Underlying Sentiment that Connects Them
Arif Zamil ’21 (Albany, CA),
Computer science, Mathematics
-
Donald Trump has made significant use of the resources at his disposal to influence the American people. With nearly 13,000 tweets from 2017-2019, POTUS’ twitter account has become a primary source of relevant market information, particularly as it relates to the ongoing trade war and negotiations of a trade deal with China. This considered, we spent this semester using lexicological and machine learning approaches to sentiment analysis to explore potential relationships between Trump’s tweets and the stock market.

Codenames: An Implementation
Autumn Kinchen ’21 (Inver Grove Heights, Minnesota),
Computer Science
-
Essentially, we wanted to know whether or not artificial intelligence can act as an opponent, particularly in the famous game ‘Codenames’. We achieved this by condensing the game into a game of two players – you and your computer.

Quantification and Visualization of Racist Rhetoric Through Network Analysis
Ava Cutler ’21 (New York City),
Mathematics
-
What happens when you apply the objectivity of mathematics to a social concept such as racism? The goal of this project is to connect these seemingly unrelated subjects to visualize and quantify racist rhetoric using network science. Texts by three core authors, Thomas Jefferson, David Walker, and Barack Obama, are analyzed and compared through a unique visualization that allows a viewer to easily detect connections between racist and anti-racist key words like “White,” “Black,” “equal,” “inferior,” and “free.” The project helps to understand the development of racist rhetoric over time, and the complexity of language in racist dialogue.

Political Polarization on Twitter During the 2020 Election
Avik Bosshardt ’21 (Boca Raton, FL),
Computer science
-
During the 2020 presidential election, Twitter acted as the front line of an extremely polarized political war. As the campaign wore on, the number of political tweets grew, and these tweets proved to be a treasure trove of insights. We examined a random sample of 10,000 election-related tweets from July through November, and conducted sentiment analysis to uncover trends in how positively or negatively each of the candidates was perceived. The data tells the story of a unique election which became increasingly emotional over time.

Moran’s I: An Investigation in Spatial Autocorrelation
Brian White ’21 (Worcester, MA),
Applied mathematics & statistics
-
For the MATH/STAT 455: Mathematical Statistics Course, taught by Professor Kelsey Grinde, my classmate, Mr. Jack Acomb, and I did an in-depth analysis of the Moran’s I measurement of spatial autocorrelation. If two variables that are related to each other are correlated, then a variable that is related to itself is autocorrelated. While autocorrelation usually extends over time, autocorrelation that extends over an area is known as spatial autocorrelation. It is a measurement of how clustered a specific variable is in the world. This project examines how this measurement is calculated and provides some real-world examples.
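To give a sense of the calculation, here is a minimal sketch of the standard Moran's I formula: the number of locations divided by the total weight, times the weighted sum of cross-products of deviations from the mean, divided by the sum of squared deviations. The function name `morans_i` and the toy weights in the usage note are assumptions made for this example, not code from the project.

```python
def morans_i(values, weights):
    """Compute Moran's I for values observed at n locations.

    values  -- list of n numbers, one per location
    weights -- n x n nested list; weights[i][j] > 0 means locations
               i and j are treated as neighbors (diagonal should be 0)
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products between neighboring locations.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in deviations)
    return (n / total_weight) * cross / squared
```

For four locations in a row, each neighboring only the next one, the clustered values `[1, 1, 0, 0]` give a positive result (1/3), while the alternating values `[1, 0, 1, 0]` give -1.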

Falling Through: Finding the Gravitational Trajectory of an Object Inside a Planetoid
Charlie Thole ’21 (Maplewood/St. Paul, MN),
Mathematics
-
If an object were to fall through a planetoid, where would the gravitational forces of the planetoid take the object? What would the path look like? Is the motion perpetual? The answers become more complicated when varying the location of the motion’s initiation and the shape of the planetoid, and the resulting paths take interesting shapes.

2020 US Presidential Election Tweet Sentiment
Charlotte Giang ’21 (Hanoi, Vietnam),
Applied mathematics & statistics, Computer science
-
Playing an important role in navigating the discourse on the 2020 US presidential election, Twitter allows users to voice their opinions and access live updates about this critical election, which stirs various emotions for voters as well as observers. A question of interest is, does the sentiment of election-related tweets change over time and across political affiliations? We seek to answer this question by analyzing an Election Tweets dataset from IEEE Data Port. Text from tweets is sentiment-scored by VADER using lexicon and rule-based sentiment algorithms.

How Long After Founding Does a Company Go Public?
Charlotte Giang ’21 (Hanoi, Vietnam),
Applied mathematics & statistics, Computer science
-
From the founding date, how long does it take an average company to go public through an IPO? What characteristics of the company are associated with this duration of time? With survival parametric models and Cox proportional hazards models, we will shed light on companies’ time to IPO in relationship with their profitability, size, sector and leadership, using Kaggle’s IPO data up until 1/1/2018. Familiar probability and statistical concepts will prove useful in our discovery and analysis, and the beauty of survival analysis will be unveiled.

Analyzing StarCraft AI
Chase Yoo ’21 (South Korea),
Computer science
-
AI has been conquering traditional board games like chess and Go. How has it been developed for StarCraft, a far more complex real-time strategy video game? How is it performing?

Causal Concepts – When Correlation Does Mean Causation
Ciara Moore ’21 (Colorado),
Applied mathematics & statistics
-
One of the most common phrases in an introductory statistics class is “correlation does not equal causation!” This is not always the case; causal inference is an area of statistics that explores when a relationship between variables can be determined as causal by modeling interventions that are not possible. This project contains two blog posts: one of which is a “crash course” in causal inference reviewing some of the most important causal inference concepts, and the other is an example of these concepts in context through an exploration of the causal relationship between methamphetamine use and dental damage.

Lonely? Build a Recurrent Neural Network Chatbot that Sounds Just Like You!
Colin Kirby ’21 (Philadelphia, PA),
Computer science
-
Fatigued from keeping up contact with friends, family, co-workers, or maybe you just want someone to talk to? Come learn about a recurrent neural network chatbot we made using our own Facebook messenger data. This chatbot can emulate the way you have conversations with anyone over text!

Does Straighter = Safer? The Impact of Sexual Orientation on Perceived Safety in the Twin Cities
Colleen Minnihan ’21 (Greenfield, New Hampshire),
Applied mathematics & statistics
-
Is heterosexuality a protective factor for Macalester students to feel safe in the Twin Cities at night? I investigated other variables that might be at play in this relationship, such as anxiety and race, and held needed variables constant in order to see if there was a causal relationship between straightness and perceived safety. In the end, I found that changing a single variable in my model equations resulted in substantial changes in conclusions, and showed that being heterosexual does cause Macalester students to feel safer (than those who are not heterosexual) in the areas near campus at night.

Mapping Macalester’s Majors: A Network Analysis of Academics at Macalester
Corey Pieper ’21 (Chicago, IL),
Applied mathematics & statistics, Computer science
-
Students at Macalester College are encouraged to take classes in a variety of subjects and are reminded that “it’s all connected”. But how exactly are different areas of study related to one another? By looking at cross-listed courses and graduates with multiple majors or minors, we can get a sense of how disciplines are related to each other from both a departmental and student perspective. We can then use networks to represent these relationships in a visual and captivating way, while diving deep into the connections that we see.

How to Train Your Chatbot
Corey Pieper ’21 (Chicago, IL),
Applied mathematics & statistics, Computer science
-
Chatbots are growing increasingly ubiquitous and advanced, with applications ranging from virtual assistants like Siri and Alexa to the automated chats you find on websites. We built a chatbot using a type of neural network called sequence-to-sequence models. These neural networks are often used for neural machine translation tasks, or translating between languages, but they can also be applied to a single language to predict responses to statements. Come find out how we used deep learning to train the chatbot on thousands of comments from Reddit!

Genetric Kart
David Barrette ’21 (Highland Park, IL),
Computer science
-
Merging biological and computer science concepts: a genetic-based machine learning algorithm teaching a bot to race around Luigi’s Raceway in Mario Kart 64.

Quarks: How mathematics helped in solving a millennia-long search for what we are made of.
Diego Lopez Gutierrez ’21 (Lima, Peru),
Mathematics
-
What are we made of? For millennia, scientists have pondered the fundamental building blocks of matter. From cells to molecules to atoms to subatomic particles, every time we found the smallest forms of matter, we were wrong. That changed in the 1960s, when physicists Gell-Mann and Ne’eman proposed their theory of the Eightfold-Way, which predicted one of the indivisible particles that make up reality: quarks. For this prediction, they relied on knowledge of representation theory, studying abstract algebraic structures as matrices. I will guide you through the Eightfold-Way by exploring nature’s symmetries and how representation theory can teach us about subatomic particles.

Determinants of Broadway Show Survival
Ekaterina Hofrenning ’21 (St. Paul, MN),
Applied mathematics & statistics
-
I explore the survival analysis literature relating to the success of Broadway shows and the related significant predictors. In doing so, I identify key method differences and discuss the implications on the papers’ results. Across the board, early success, show type, and Tony Awards are significantly related to longevity.

Language Competition: An Agent Based Model
Elizabeth Cain ’21 (Newtown, CT),
Applied mathematics & statistics
-
This project is an examination of the dynamics of language competition using agent based modeling. My application is set in approximated council areas of Scotland with populations of monolingual speakers of either Scots or English and bilingual speakers of both languages. The model shows an increasing number of English speakers and a decreasing number of Scots speakers over a period of more than 100 years. This model suggests that as dominant languages grow, “unattractive” languages will die out (barring intervention measures). These findings are in agreement with the results of previous work on this topic.

Dependent Data Without Taking All Day
Ellen Graham ’21 (Olympia, WA),
Applied mathematics & statistics
-
Has this ever happened to you? You choose your model assuming independent outcomes, then find out your outcomes are spatially and temporally correlated! But then, when you try to estimate a model accounting for dependent outcomes, it takes years to fit! Sounds like you need a computationally efficient spatiotemporal linear mixed model! Come learn about my capstone, which will teach you all about a model that produces more accurate model estimates without taking all day!

Bitstrings, the Hypercube, and Odd Notation: How Did We Get Here?
Eric Fong ’21 (Woodbury, MN),
Mathematics
-
The Hyperoctahedral Group has an interesting representation: the signed permutation matrices. But why is this a good way to think about the Hyperoctahedral group? In this presentation, we’ll start on the hypercube, make observations about moving around vertices, coming up with notation that feels useful and concrete along the way, and then talk about the tools of mathematics available to us to play around with this new notation. By the end, we’ll (hopefully) have arrived at the representation theory of the Hyperoctahedral group without even realizing it!

Facial Mask Detection using Transfer Learning
Fan Zhang ’21 (China),
Computer science, Mathematics
-
Currently, wearing face masks is necessary for controlling the spread of COVID-19. I built a face mask detection system as our capstone project. Stores and restaurants can adopt the system to ensure customers wear a facial mask in public spaces. I used transfer learning to classify whether a face is wearing a mask by applying MobileNetV2 as our base model. After training for 10 epochs, I get about 99% accuracy on the validation dataset. Also, I implemented an interface to supervise whether people wear a mask in the picture or video stream.

Network of Empires
Fan Zhang ’21 (China),
Computer science, Mathematics
-
In this project, I create a complex player network based on the matches of Age of Empires II on Voobly. I explore the centrality and the geological features of the network. With community detection, I discover that in Voobly most people are casual players, and there is a small community of professional players. With centrality measures, I confirm that players with high ratings are not central in the network. Overall, I conclude that professional players play within the small professional community, and casual players play in the majority casual community.

Penalized Regressions and the Bias-Variance Trade-off
Federico Chung ’21 (Buenos Aires, Argentina),
Applied mathematics & statistics
-
90% of the data today has been created in the last 2 years, yet sometimes it is hard to extract the signal within all the noise. Penalized regression is a technique used to prevent overfitting, making models less susceptible to capturing noise or random fluctuations particular to the training dataset. By minimizing the effects of overfitting, penalized regression allows for better generalization to new data. This project explores the theory behind penalized regression estimates, and how it compares to classical linear regression. The project then uses data simulation to explore the bias-variance trade-off across different penalized and linear regressions.

Polarized America and Polarized Elections: What does Twitter tell us?
Gabriel Ramos ’21 (Puerto Rico),
Computer science
-
As highlighted in Political Science literature with the occurrence of terms such as “50-50 Nation”, “Red vs Blue”, “Culture War”, and “Partisan dislike”, the American Public is seen to increasingly behave in a bimodal pattern. Given those developments in politics and the public in recent years, the media and academic sources portray America as a highly polarized country divided along rigid partisan lines. In the era of technology and social media, Twitter’s role in political polarization, as a representation of the broader social media structure and especially in relation to the Presidential election, is a revealing phenomenon deserving deeper computational analysis.

Networking Lost
George Clare Kennedy ’21 (Oakland, CA),
Computer science, Mathematics
-
Lost is one of the best-known and most influential shows of the 2000s, and arguably in all of television. It follows the survivors of the crash of Oceanic 815, which crashes on a remote and mysterious island in the show’s pilot. As everyone struggles to survive, we slowly learn about the characters’ backstories and lives before the crash, and how their pasts drive them in what they do on the island. I used a networking algorithm developed by Andrew Beveridge to visually map out the relationships between characters.

An Exploration of the Graceful Labeling Problem
George Clare Kennedy ’21 (Oakland, CA),
Computer science, Mathematics
-
The graceful labeling problem is a famous open problem in mathematics and computer science, first described by Alexander Rosa in 1967. The problem asks: given a graph, is there a way to label the vertices of that graph uniquely with the numbers 0 to m, where m is the size of the graph, such that when its edges are labeled with the absolute differences of the vertex labels, the edges are labeled uniquely? This project looks at the problem and a couple of algorithms for it, and shows some interesting results concerning the Petersen graph.
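The definition can be made concrete with a small sketch. It assumes m counts the edges of the graph, and it uses plain brute force, which is only practical for tiny graphs and is not one of the algorithms studied in the project. The names `is_graceful` and `find_graceful_labeling` are hypothetical.

```python
from itertools import permutations


def is_graceful(edges, labels):
    """Check whether `labels` (vertex -> int) is a graceful labeling.

    Vertex labels must be distinct values in 0..m, where m is the number
    of edges, and the induced edge labels |label(u) - label(v)| must be
    exactly the numbers 1..m, each used once.
    """
    m = len(edges)
    vertex_labels = list(labels.values())
    if len(set(vertex_labels)) != len(vertex_labels):
        return False
    if any(not 0 <= x <= m for x in vertex_labels):
        return False
    edge_labels = {abs(labels[u] - labels[v]) for u, v in edges}
    return edge_labels == set(range(1, m + 1))


def find_graceful_labeling(vertices, edges):
    """Try every assignment of distinct labels; return one that works or None."""
    vertices = list(vertices)
    m = len(edges)
    for perm in permutations(range(m + 1), len(vertices)):
        labels = dict(zip(vertices, perm))
        if is_graceful(edges, labels):
            return labels
    return None
```

For example, labeling the path a-b-c-d with 0, 3, 1, 2 produces edge labels 3, 2, 1, so that path is graceful, while the search finds no graceful labeling for the 5-cycle.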

Strange Triangles
Giselle Cohen ’21 (Portland, Oregon),
Mathematics
-
The Robbins numbers enumerate two families of combinatorial objects, the Alternating Sign Matrices (ASM’s) and the Totally Symmetric Self-Complementary Plane Partitions (TSSCPP’s). Despite these two families of objects being equinumerous, there is no known bijection between them. We explored the relationships between object families that are also equinumerous to try to start to elucidate connections between the ASM’s and the TSSCPP’s. We found new equinumerous objects, proved subfamily connections, and explored the structures of these object families. As this bijection remains an open question, any connections between these object families have value.

The Hyperoctahedral Group
Hannah Chonkan-Urow ’20 (Highland Park, IL),
Mathematics
-
Hypercubes! Bit strings! Signed permutations! Oh my!
This project explores the hyperoctahedral group, Hn, and its representation theory. The group Hn is the symmetry group of the hypercube, Qn. This means that Hn acts on Qn by permuting the vertices while preserving all edge-vertex connections. When we label the vertices of the hypercube with bit strings, the hyperoctahedral group additionally reveals fascinating and beautiful properties of the graph.
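One way to see the action on bit strings is the sketch below. It assumes a common encoding of a signed permutation as a pair: a permutation of the bit positions plus a choice of which positions to flip. The helper names are hypothetical and the code is not taken from the project.

```python
from itertools import permutations, product


def hyperoctahedral_elements(n):
    """Yield every signed permutation as a (perm, flips) pair."""
    for perm in permutations(range(n)):
        for flips in product((0, 1), repeat=n):
            yield perm, flips


def act(element, bits):
    """Apply a signed permutation to a bit string (a tuple of 0s and 1s).

    Position i of the result takes the bit from position perm[i],
    then flips it when flips[i] is 1.
    """
    perm, flips = element
    return tuple(bits[perm[i]] ^ flips[i] for i in range(len(bits)))


def hamming(a, b):
    """Number of positions where two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))
```

Two vertices of the hypercube are joined by an edge exactly when their bit strings differ in one position. Because `act` only reorders and flips positions, it keeps the Hamming distance between any two vertices unchanged, so edges map to edges. For n = 3 this gives 3! · 2³ = 48 symmetries of the cube.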

Detect Face Masks Using Machine Learning
Huichang Zhao ’21 (Jiangsu, China),
Computer science
-
Living under the global pandemic, our team decided to build a face mask detection system as our capstone project. We trained a machine learning model to classify faces with or without masks, and implemented an interface to detect whether people on a live stream wear a facial mask.

Climate Change and Hurricane Severity: Is anthropogenic climate change exacerbating hurricanes in the United States?
Jackson Henningfield ’21 (Celebration, Florida),
Applied mathematics & statistics
-
The issue of climate change and its effects on not only natural processes but on our way of life is highly contested. There is no denying that the Earth is getting warmer, and we are playing an attributable role in this growing energy imbalance. With data collected from Berkeley Earth, NOAA, and the UN, we attempt to discern a causal relationship between greenhouse gases and the severity of hurricanes through inverse probability weighting and sensitivity analysis.

Survival analysis on employee turnover
Jacky (Yuhao) Xiao ’21 (Shenzhen, China),
Applied mathematics & statistics
-
We are still in a severe pandemic, in which people have lost their jobs and their sources of income. But what could be the reasons/factors for some employees quitting their job, if we did not have COVID? My goal is to predict individual quitting risks, using survival analysis models.

Word Clustering: Exploring How Computers Break Down a Corpus
James Crayton ’21 (Texas),
Computer science
-
How do algorithms view and categorize words? In this project, we explored word clustering algorithms and ways of interpreting their output by using the LOLCAT translation of the book of Genesis as our input.

Ranked Choice Voting on the Permutahedron, as Explained Through TikTok Sea Shanty
Jennifer DeJong ’21 (Sammamish, WA),
Computer science, Mathematics
-
In this project, we investigate new methods to analyze the overall structure of and find patterns within ranked choice voting data with tools from graph signal processing and representation theory. The main idea is to decompose the signal into a linear combination of building block signals, each of which has interpretable symmetry and smoothness properties. I then translated these methods and concepts into TikTok Sea Shanties.

Coding an Epidemic Model on a Dual Layer Network
Jennifer DeJong ’21 (Sammamish, WA),
Computer science, Mathematics
-
Living through Covid exposed a contradiction in epidemic modeling. Many of the current interventions for Covid must be adopted at an individual level – mask wearing, physical distancing, and hand washing. However, we are interested in the predictive modeling of the spread of Covid across a state and don’t always have accurate medical data at an individual level.
For our Network Science capstone, we used the NetworkX Python package to build a model that is specifically designed to showcase the impact of household level precautionary behavior on the spread of a disease at a state level.

What Factors Determine the Survival of Broadway Shows?
Jeong-Won Tak ’21 (Seoul, South Korea),
Applied mathematics & statistics
-
Did you know that most Broadway shows close after 10 or fewer performances? You may be surprised by this because the Broadway shows that we know such as The Phantom of the Opera, Chicago, and The Lion King have been open and running for more than 20 years. I explore the survival analysis literature relating to the duration of time a Broadway show lasts, and its significant predictors, with a particular focus on Tony Award nominations and wins. In doing so, I identify key differences between study types and discuss their implications on each paper’s results.

Pseudo-BMA: A (Real) Bayesian Model Selection Method
Josh Upadhyay ’21 (Bangkok, TH),
Applied mathematics & statistics
-
Cross-Validation, R^2, train/test splits. These are all common model selection approaches most students have used to evaluate the ‘best’ model to use in a particular situation. But in contrast to the mostly-used frequentist models, how can we apply some of the same model selection criteria to Bayesian models? I’ll attempt to explain the Pseudo Bayesian Model Averaging technique and illustrate its performance with the famous candy dataset, based on what we learned in Bayesian Statistics.

L I F T ! B Y T E ! Q-Learning in Swarm Robotics
Keara Berlin ’21 (Seattle, WA),
Computer science
-
Q-learning is a type of machine learning that uses positive reinforcement (rewards) to train an agent. In our project, we use Q-learning to teach a simulated robot swarm to explore an unknown environment. We implemented the simulation and Q-learning algorithm in NetLogo, a language designed to easily create visual simulations with multiple agents (such as a swarm).
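The core of Q-learning is a single update rule, sketched here in Python rather than NetLogo. This is the textbook tabular update, not the project's swarm code. The function name `q_update`, the learning rate `alpha`, and the discount factor `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new value.

    q       -- mapping from (state, action) to estimated value
    actions -- the actions available in next_state
    alpha   -- learning rate (how far to move toward the new target)
    gamma   -- discount factor for future rewards
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Unseen (state, action) pairs start with a value of 0.
q_table = defaultdict(float)
```

Each call nudges the stored value for a state-action pair toward the reward just received plus the discounted value of the best action available afterwards, so rewarded behavior gradually earns higher estimates.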

Modeling Language Death: an Agent-Based Approach
Kelsey Stender-Moore ’21 (Bethesda, MD),
Applied mathematics & statistics
-
In this work, I examine the dynamics of language death and spread using an agent-based model set in a simplified version of Scotland. Agents in the model can be monolingual or bilingual speakers of Scots and English and are impacted by local and global pressure to communicate with speakers of both languages. This work is an extension of past work done by Abrams, Strogatz, Minett and Wang about language death and endangered languages. The model predicts trends in line with past work, suggesting that, without intervention, Scots and other minor languages will become extinct.

A Dynamic Model of the Jellyfish Life Cycle
Kevin Omodt ’21 (Minneapolis, MN),
Applied mathematics & statistics
-
Recent years have seen a phenomenon known as “Jellyfish Blooms,” in which large populations of jellyfish spawn from a very small initial population. This has been brought about by changes to our environment, such as rising temperatures and the removal of natural predators. In my project, I replicated a dynamic model for jellyfish populations, which captures key aspects of the real-world dynamics by utilizing differential equations, and wrote code to easily manipulate the model. One of the key goals is to demonstrate how a population of two adults can explode into a population of many thousands.

A Tour of Causal Inference through Coronavirus and Precautionary Measures
Laurel Hickey ’21 (Chatham, NJ),
Applied mathematics & statistics
-
Would we have been able to stop the coronavirus pandemic potentially with different mitigating measures done earlier or would we still be in the same position we are in no matter what actions we would have taken? This project walks through the coronavirus pandemic, looking into how we could determine whether there is a causal relationship between the case numbers and mitigating measures such as social distancing, mask wearing, school closures, etc. What are some tools that we could use in order to determine how effective these measures are in preventing or slowing the spread of the virus? And what study designs make the most sense for a causal analysis of slowing the spread of the virus?

The Synthetic Difference in Differences Estimator: An Application to Gentrification and Employment Outcomes
Liam Purkey ’21 (Portland, Oregon),
Applied mathematics & statistics
-
In this presentation, I propose the Synthetic Difference in Differences Estimator presented in Arkhangelsky et al. (2020) as a way to identify the economic effects of gentrification without quasi-experimental variation in the incidence of gentrification. I use this methodological innovation to study the employment effects of gentrification in New York City from 2010 to 2018. I find that gentrification increases the number of neighborhood jobs in Accommodation and Food Services while reducing the number of neighborhood mid-wage and service jobs that are held by residents of those neighborhoods.

Rendering Soft Shadows on a Bunny
Lily Irvin ’21 (Portland, Oregon),
Computer science
-
Ever wondered how graphics programmers get shadows to actually show up in your favorite video game or animated film? This project attempts to answer that question by rendering “soft” shadows on the model of a rabbit. I’ll walk you through how graphics shaders work, and how we got somewhat-realistic shadows to show up in a basic C++/OpenGL scene.

Understanding the Global Arms Trade Network: Analysis of Different Community Detection Methods
Ling Ma ’21 (Beijing, China),
Applied mathematics & statistics, Computer science
-
The arms trade is a good way to understand global military partnerships and power dynamics. In today’s highly globalized world, we are interested in understanding the (hidden) community structure of this network. To do so, we explored community detection algorithms that combine the original topological network information (hub score and PageRank centrality) with node attribute information (machine learning) to create a clustering (adjusted ANCA, replacing KNN with a minimum spanning tree due to the sparsity of the network) that perhaps tells a better story.

Causal Analysis: How does U.S. foreign aid affect a recipient’s GDP growth?
Ling Ma ’21 (Beijing, China),
Applied mathematics & statistics, Computer science
-
A lot of studies have been done on the effectiveness of U.S. foreign aid, which accounts for up to 1% of the total federal budget. Naturally, we wonder what factors might affect the effectiveness of the aid on the recipient. In addition to GDP, the recipient’s growth is also measured by the GDP per capita growth rate. The relevant factors include US_ODA, Rest_of_world_aid_total, polity2, recipient_trade_world, and the effect of potential unmeasured variables, among others. Since they form convoluted relationships, we conduct causal analysis to inform our model construction.

If You Give a Robot Swarm a Cookie, They Will Gradually Learn How to Navigate a Randomized Environment
Linnea Prehn ’20 (Apple Valley, Minnesota),
Computer science
-
What is swarm robotics? How does it differ from the development of other kinds of artificial intelligence, and how can it be used practically today? Our project focuses on the use of a multi-agent system (or robot swarm) to navigate a simple simulated environment. Swarm robotics is becoming more popular in fields including terrestrial, aerial, and aquatic navigation. While our project unfortunately does not literally enter any of these environments, you can use your imagination to pretend that each agent is maneuvering around deadly subaquatic terrain as they avoid blue squares in our experimental simulation.

LOLCATZ Word Clustering
Logan Caraco ’21 (Los Angeles, CA),
Computer science, Mathematics
-
We created a novel natural language semantics analyzer by building on top of existing word clustering methods. We tested this algorithm on the LOLCATZ dialect of English.

Young Symmetrizer Coefficient Matrix Properties
Logan Caraco ’21 (Los Angeles, CA),
Computer science, Mathematics
-
We studied the efficient construction and curious properties of the Young Symmetrizer formula. The artifact presents a game version of the difficult aspects of constructing the Young Symmetrizer.

Cartograph: Visualizing Non-Spatial Data as Maps
Lu Li ’21 (Tianjin, China),
Computer science, Mathematics
-
Targeted search using search engines like Google is useful when one has a specific question they hope to answer, e.g. “how does the K means clustering algorithm work?”. However, it might not be the best tool for the initial exploration of a topic, e.g. “tell me about the field of computer science”. Cartograph captures meaningful relationships among non-spatial data and visualizes the data as maps. This allows people to explore a new field in a way that is familiar to them — geographically and interactively. Come see how to visualize Wikipedia as a map!

Uncovering the shape of data with Topology: an empirical study of Minimal Cycle Representatives in Persistent Homology using Linear Programming
Lu Li ’21 (Tianjin, China),
Computer science, Mathematics
-
Topological Data Analysis uncovers structure in data by quantifying its shape using a tool called persistent homology, which tracks holes of various dimensions across a proximity parameter. These holes are encoded as cycle representatives of persistent homology classes. However, the non-uniqueness of these representatives can lead to different interpretations of the same set of classes. One approach is to minimize representatives against a meaningful measure, for instance, length. In this project, we provide a study of the effectiveness and computational cost of several optimization procedures for constructing homological cycle bases. Come learn about TDA and how to find minimal representatives for topological features!

Methods in Time-Varying Treatments
Lydia Yoder ’21 (Concord, MA),
Applied mathematics & statistics
-
Many experiments are designed so that an experimental group receives one treatment, but in many observational and clinical settings, a participant’s treatment status can vary over time. For example, in a longitudinal study, a participant’s compliance to a given treatment might vary over time. This presentation explains how experimenters can still use longitudinal data to make causal inferences through the use of inverse probability weighting and regression.

Modeling and Forecasting US News and World Report College Rankings
Maddie AlQatami ’21 (Boulder, CO),
Computer science, Mathematics
-
The U.S. News & World Report (USNWR) annual college ranking lists are widely-circulated measures of the comparative quality of higher education institutions. In this project, we effectively model the 2020 ranking list using an elastic net approach, forecast future data values for all ranked institutions based on historical data using univariate time series simulation, and apply an elastic net model to our forecasted data to project rankings for 2021. We use these projected rankings to understand what constitutes a significant change in rank and determine the most effective course of action if Macalester College wishes to efficiently improve its rank.

Examining Alt-Right Communities in Subreddit Hyperlink Networks
Maddie AlQatami ’21 (Boulder, CO),
Computer science, Mathematics
-
Should social media platforms be taking proactive stances to censor communities that disseminate hate content? If so, how can they be identified quickly and accurately? In this project, I analyze hyperlink networks in alt-right communities on Reddit to find and remove only the most important community members in order to efficiently incapacitate the community. I propose that given a subset of subreddits with known ideology classification, important subreddits can be identified by their high betweenness-centrality scores and removed. This increases the network diameter and average path length of alt-right networks, thus reducing the spread of violent ideology between subcommunities.
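The idea of removing high-betweenness nodes can be illustrated on a toy graph. The following is a minimal sketch, not the project's code: the seven-node network and its node names are invented for illustration, betweenness is computed with Brandes' algorithm for unweighted graphs, and only average path length (not diameter) is measured before and after removal.

```python
from collections import deque

# Hypothetical toy network: two triangles (a*, b*), a hub "h" linked to every
# node, and one direct bridge a1-b1 so the graph stays connected without "h".
TOY_EDGES = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a1", "b1"),
] + [("h", n) for n in ("a1", "a2", "a3", "b1", "b2", "b3")]

def build_graph(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: x / 2 for v, x in score.items()}  # each pair was counted twice

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def remove_node(adj, node):
    """Return a copy of the graph with one node and its edges deleted."""
    return {v: {w for w in nbrs if w != node} for v, nbrs in adj.items() if v != node}
```

On this toy graph the hub has the highest betweenness score, and removing it lengthens the average path between the remaining nodes, which is the effect the abstract describes at a much larger scale.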

Designing Visa-Vis: Navigating International Students through the OPT Application Process
Michael Maldonado ’20 (Texas),
Computer science
-
International students must successfully complete an application for Optional Practical Training (OPT) in order to receive employment opportunities in the United States. Information about this process and relevant deadlines is not easily accessible, making planning for this requirement challenging and confusing for many students. In this capstone project, we seek to create a user interface that facilitates this process as much as possible. We iterate through several intermediate designs, at each step receiving user feedback on the design’s ease of use.

Does the language we use affect what we see on Google search images?
Mohamed Abdi Mohamed ’21 (Somalia),
Computer science
-
Does the language we use affect what we see on Google search images?

Summarize!
Nana A. Odeibea Amoah ’21 (Accra, Ghana),
Computer science
-
Can you think back to a time when you simply had too much to read and too little time to really read and understand the material you were given for your classes? How did that impact your learning in class? Imagine there was a quick way to sift through these materials and pull out the important information that you would need. We made this solution real by creating a text summarizer using techniques from artificial intelligence. Visit the blog post to learn more!

Causal Effect of Moderate Alcohol Consumption on Health
Nicole Frey ’21 (Madison, WI),
Applied mathematics & statistics
-
After a tour of causal inference topics, test your new knowledge on the literature! We analyze two articles about the causal effect of alcohol consumption on health. Moderate drinking has long been considered healthy, but experts are beginning to advocate for stricter recommendations.

Competing Risks in Survival Analysis
Precious Dlamini ’21 (The Kingdom of Eswatini),
Applied mathematics & statistics
-
In survival analysis, there are numerous models that are used to analyze survival data. However, traditional models are not designed to take into account cases where there are multiple competing risks that may lead to the same terminating event. When these models are used, they result in inaccurate estimates of the probability of the cause-specific event of interest. This project discusses in detail the use of less frequently applied but better suited models and illustrates how they can be utilized to analyze survival data with competing risks.

Message in a Bottle: Can You Tell the Difference?
Randy Jose Beidelschies ’21 (Southwest Florida),
Computer science
-
Steganography: what is it and why is it important to me? As individuals who frequent the internet, we should be cognizant and protective of our information. Whether it be buying from a website, sending information to an employer, or relaying important records, there are things that we might not want the wrong eyes to see. In this project, my classmates and I from Fall 2020 decided to show the importance of protecting information through our steganography encryption application. For this presentation, I discuss what methods we used to encode information and which is more secure. Can you tell the difference between these images and which one contains sensitive information?

Economic Proofs: A Topological Approach
Rayyan Saad Mobarak ’21 (Dhaka, Bangladesh),
Applied mathematics & statistics
-
Problems in economics rely on the formulation of models consisting of agents who maximize their benefits given their constraints. Proving the existence of such outcomes requires rigorous mathematics. In this project, I show how Topology can provide a foundation in building such proofs. I take a simple economic market under perfect competition and use the Intermediate Value Theorem to prove the existence of a point of equilibrium. Once models become more abstract where intuition becomes difficult, readers are encouraged to explore Topology as an approach to proving equilibrium points.

Network of Transfers
Redon Kurti ’21 (Albania),
Computer science, Mathematics
-
Welcome to the Network of Transfers. Together, we will explore and analyze the transfer market of the top European Football Leagues using the magic of Network Science. We will determine the clubs and leagues that spend/earn the most money from transfers in each position. We will highlight big clubs like Real Madrid, Manchester United, Juventus, Liverpool, etc. and explain their importance to our networks using a range of statistical measures. Enjoy!

Implementation of an Informed Search Algorithm to Solve Crossword Puzzles
Redon Kurti ’21 (Albania),
Computer science, Mathematics
-
This project presents an algorithmic approach to crossword puzzle solving, drawing from previous successful programs and the research on problem-solving techniques extrapolated from understanding human approaches. An informed search algorithm is implemented which recursively traverses the solution space to a given puzzle, relying on external APIs for information retrieval and implementing collision resolving algorithms and a metric for evaluating the best solution. The performance of this algorithm is first tested on small example puzzles, then progressively challenged by increasing the difficulty and size of the puzzle.

Exploring the Dynamics between Languages and Google Search Results via Cloud AI Vision
Richard Tian ’21 (Beijing, China),
Computer science
-
Have you noticed the different results you get when using Google Search in different languages? For example, when you search the term “Mogadishu” (a city in Somalia) on Google, the search results are pretty different depending on the search language. It shows wars, soldiers, and poor villagers when you search in English, whereas it presents a glorious high-tech city when you search in Somali. In this project, we leverage Cloud AI Vision to label images (mainly cities) gathered from different languages’ search results and generate word cloud images from the analysis. We hope that this will shed some light on how the search results differ when we use different languages.

Painting for the Lazy: Neural Style Transfer Exploration and Comparison
Rose Lutz ’21 (Minneapolis, MN),
Computer science
-
In my capstone, we will explore how to use neural style transfer to convert one content image into the style of another image. In this particular project, we used famous paintings for our style image to make the content image look as if it was painted in the same style as the famous painting. Come learn how we accomplished this process, and how to use AI to make your own art.

Causal Effect of Alcohol Consumption on Life Expectancy
Ruijing Zhang ’20 (Chengdu, China),
Applied mathematics & statistics
-
In this project, my teammates and I investigate whether alcohol consumption significantly affects life expectancy. We collected data across countries, such as alcohol consumption per capita, average life expectancy at birth, average literacy rate, and national income. By joining these data and conducting regression and inverse-probability weighting analyses, we hope to uncover the true relationship between alcohol consumption and life expectancy.

Why All Country Music Sounds the Same: A Network Approach
Ryan Specht ’21 (Somerville, Ohio),
Computer science, Mathematics
-
If you turn your radio dial to the country station, you will quickly find that the songs all sound the same. I use a co-star network made of producers, singers, and songwriters to investigate some of the potential causes of the musical monotony. I combine network analysis with a general knowledge of the country music industry to come to conclusions relevant to both country music lovers and those new to the genre.

The Importance of Empathy in Robots
Ryan Specht ’21 (Somerville, Ohio),
Computer science, Mathematics
-
As robots become more entrenched in our lives and we begin to expect them to have a social component, we must ensure human-robot interactions are productive. I argue that producing robots with the ability to show empathy is the way to do it. I will combine research in robotics with psychological research to lay out the case for the importance of empathy in robots as well as to give a realistic path toward creating such robots.

Google Image Analysis: Measuring changes of perception based on the language of the search term
Sam Rosevear ’21 (Lewisburg, Pennsylvania),
Computer science
-
When we search a city name in Google Images, what can the results tell us? We can begin to understand the perception of a city by looking at popular images of it. But will the city’s perception change based on what language we search for it in? Might English-speakers perceive “New York” differently than others? By expanding on techniques practiced in the Collective Intelligence course, this project analyzed the Google Image results for 4 different cities in 4 different languages, and measured the differences in perception.

Polarization in the 2020 General Election
Sarah Lipstone ’21 (Oak Park, CA),
Computer science
-
United States presidential elections seem to become more polarized and bitter every year. It can be easy to get caught up in the petty back-and-forth among partisans during debates, on TV news networks, and, especially, on Twitter. With the ability of users to share short and provocative posts quickly and President Trump’s frequent use of it, Twitter has become a popular forum for political opinions and discourse. For this project, tweets from a dataset provided by IEEE are analyzed using sentiment analysis.

Accounting for Interference in Causal Inference studies
Siddhant Singh ’21 (Lucknow, India),
Applied mathematics & statistics
-
Causal inference tries to address the question of whether the “treatment” in a situation actually affects the result, or whether the correlation arises from other unmeasured variables. A core assumption in standard causal inference studies is the lack of “interference” between participants – that is, the treatment of one individual doesn’t affect the result of another. However, this assumption breaks down in certain situations, like vaccine studies, where herd immunity affects the outcomes of even those people who don’t receive the vaccine. We explore how interference can be incorporated into causal inference and provide a sample analysis using a cholera vaccine simulation.

A Network of Macalester Alumni Majors
Siguo Li ’21 (Shanghai, China),
Applied mathematics & statistics
-
In this project, we created a network to explore the relationship between a student’s major and their career using data from Macalester students and alumni on LinkedIn. We want to explore the relationship between majors and the industries people work in during and after their time at an undergraduate institution. We found that, on LinkedIn, more Mac alums working in different companies are Economics majors and Political Science majors. The Economics major contributes the most to companies in this LinkedIn network, which means that among all the Macalester alumni on LinkedIn, the most popular major for companies is Economics. We also found that majors that are related to each other also end up in companies in the same community.

The Effect of COVID-19 on Urban Mobility in New York City
Son Phan ’21 (Maui, HI),
Applied mathematics & statistics
-
With the onset of the COVID-19 pandemic, travel has come to a halt and countries around the world have instituted various lockdown procedures. New York in particular has, since March 2020, closed non-essential businesses, limited social gatherings, and required quarantine for any travelers. In this project, we aim to take a look at the abrupt effect of the coronavirus procedures on how people commute in New York City. Utilizing a data set of taxi travel times between popular areas in the city, we analyze the difference in speed and mobility before and during the pandemic.

Lonely? Build a Recurrent Neural Network Chatbot that Sounds Like You!
Sophie Pincus-Kazmar ’21 (Evansville, WI),
Computer science
-
How can technology communicate with us and what makes it so convincing at times? In this project, we explore natural language processing and text generation through short story generators and by building a simple chatbot. In order to improve and personalize the bot, we train it on our own Facebook Messenger data and develop a custom evaluation system for the bot’s responses. We also experiment with different data sources and formats to optimize the chatbot.

MapReduce: Optimizing Large-scale Distributed Data Processing
Stephanie (Thy) Le ’21 (Hanoi, Vietnam),
Computer science
-
In this project, my teammate and I explore MapReduce, a programming model that can handle processing a massive amount of data on a network of worker machines. I will discuss the model and our implementation, including the partitioning of data and the communication scheme between worker machines, and present our own optimization technique that cut the runtime threefold.
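For readers new to the model, here is a minimal single-process sketch of the map, partition, and reduce phases. It is an illustration under assumptions, not the project's implementation: the word-count task, the function names, and the byte-sum partitioning rule are all chosen here for the example, and there is no network of worker machines or optimization technique.

```python
from collections import defaultdict

def map_phase(chunk):
    """Mapper: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def partition(pairs, num_reducers):
    """Shuffle: send each key to exactly one reducer bucket, grouping its values."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Assumed partitioning rule: a simple deterministic hash of the key,
        # so the same word always lands in the same bucket.
        index = sum(key.encode("utf-8")) % num_reducers
        buckets[index][key].append(value)
    return buckets

def reduce_phase(bucket):
    """Reducer: combine all values collected for each key in one bucket."""
    return {key: sum(values) for key, values in bucket.items()}

def word_count(chunks, num_reducers=2):
    """Run the three phases in sequence; a real system would run them on many workers."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    counts = {}
    for bucket in partition(pairs, num_reducers):
        counts.update(reduce_phase(bucket))
    return counts
```

Because every occurrence of a key is routed to the same bucket, each reducer can work independently, which is what lets the real model spread the work across machines.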

Parallelization with MPI4PY
Sun Gyu Park ’21 (Seoul, South Korea),
Computer science
-
During 2020 summer research, I worked on the parallelization of a computational program relevant to viscous fluid mechanics provided by Professor William Mitchell. There were two versions of the code; the simpler one is called “Slender-Body Theory” and a more complicated one is called the “full-PDE.” The most time-costly part of both computations was the numerical evaluation of many integrals. Both versions were implemented using multidimensional array operations in NumPy. My attempts were to parallelize the Slender-Body Theory computation by flattening the arrays and using “scatter-gather” operations or simple message-passing with the MPI4PY package in Python.

T-Gammon: Applying Temporal Difference Learning Methods to the Game of Backgammon
Tate Munnich ’21 (Port Townsend, Washington),
Computer science, Mathematics
-
Have you ever wondered how computer opponents in games are created? How is it that some AI opponents are strong or human-like, while others may be erratic or unintelligent? In this project I explore the ancient game of backgammon and various algorithms aimed at creating intelligent computer players. I begin by exploring strategies that use mental shortcuts thought up by human players to find strong moves. Eventually, I implement a type of reinforcement learning inspired by those used by the strongest AI players today. This final type of learning is “knowledge-free,” meaning that it devises its strategy entirely from repeated self-play.

The Hyperoctahedral Group and the Hypercube
Tate Munnich ’21 (Port Townsend, Washington),
Computer science, Mathematics
-
How on earth would you rotate a 4D cube? What do bit strings have to do with the vertices of a square? This “multidimensional” project answers these questions and many more through the lens of representation theory. We use high-dimensional cubes to reveal interesting and useful properties of a notable set of symmetries called the hyperoctahedral group.

A Tour of Causal Inference
Thy Nguyen ’21 (Hanoi, Vietnam),
Applied mathematics & statistics
-
“Correlation is not causation”. Well then, what is? What are the conditions that must be met before a claim can be made about the relationship between two variables? When can we say that one causes the other? And if so, to what extent? Most importantly, why do we need to be so careful when throwing the words “cause” and “causal” around? In this research project, I attempt to provide an overview of causal inference, or the study of cause-and-effect relationships, and investigate a link between smoking and COVID-19 to demonstrate why the field is so important.

Custom dlib mouth feature detector and shape predictor algorithms
Tianrui Liu ’21 (Beijing, China),
Computer science
-
Having explored convolutional neural networks and decision trees in the Intro to AI class, we were curious to learn about more algorithms that are widely applied in image processing and computer vision. In this project, we investigated the default model ensemble regression tree in the dlib library and implemented our custom dlib mouth feature detector. Additionally, we explored different tuning techniques by comparing the standard grid search and global optimization.

What can affect employment turnover?
Tiantian Sun ’21 (Beijing),
Applied mathematics & statistics
-
Employees’ length of service has received more and more attention recently because it might reflect the well-being of the employees, the natural progression of career paths, characteristics of the industries, and so on. Therefore, it is important to measure the employment turnover rate and explore the variables that might affect length of service; the results could inform suggestions for both companies and employees. To study which set of factors can affect employment turnover, we used length of service, the duration of time that an employee worked in the company, as the outcome variable with the methods of survival analysis. We employed non-parametric, semi-parametric, and parametric approaches and tested the parametric results with both AIC (Akaike’s Information Criterion) and the likelihood-ratio test (LRT) for the robustness of our analysis.

Sudoku Solver: Using Graph Coloring to Solve Sudoku Puzzles
Toshini Sharma ’21 (New Delhi, India),
Computer science
-
Do you like solving Sudoku puzzles? Do you like graph theory? Well, this is the perfect project for you to explore! I set out to make a Sudoku solver and on my journey, discovered this wonderful notion of representing Sudoku puzzles as graphs. Being a graph theory nerd, I naturally then wondered whether I could now solve the Sudoku graph using a classic graph theory method like graph coloring! So, I created a program to do just that! Come explore the logic and journey behind my Sudoku Solver!
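The graph view of Sudoku can be sketched in a few lines. This is an illustrative sketch under assumptions, not the project's program: each cell is a vertex, two cells are adjacent when they share a row, column, or 3×3 box, and a solved puzzle is a proper 9-coloring that extends the given clues. The function names and the backtracking strategy (always color the most constrained cell next) are choices made for this example.

```python
from itertools import combinations

def sudoku_graph():
    """Adjacency sets for the 81-vertex Sudoku graph."""
    cells = [(r, c) for r in range(9) for c in range(9)]
    adj = {cell: set() for cell in cells}
    for a, b in combinations(cells, 2):
        same_row = a[0] == b[0]
        same_col = a[1] == b[1]
        same_box = (a[0] // 3, a[1] // 3) == (b[0] // 3, b[1] // 3)
        if same_row or same_col or same_box:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def solve(grid):
    """Extend a partial coloring (0 = uncolored) to a full one; return None if impossible."""
    adj = sudoku_graph()
    color = {(r, c): grid[r][c] for r in range(9) for c in range(9)}

    def allowed(cell):
        # Colors 1..9 not already used by any neighbor of this cell.
        used = {color[n] for n in adj[cell]}
        return [d for d in range(1, 10) if d not in used]

    def backtrack():
        empty = [cell for cell, d in color.items() if d == 0]
        if not empty:
            return True
        # Choose the uncolored cell with the fewest remaining colors.
        cell = min(empty, key=lambda x: len(allowed(x)))
        for d in allowed(cell):
            color[cell] = d
            if backtrack():
                return True
        color[cell] = 0  # undo before returning to the caller
        return False

    if not backtrack():
        return None
    return [[color[(r, c)] for c in range(9)] for r in range(9)]
```

Calling `solve` on a 9×9 list of lists, with 0 marking empty cells, returns the completed grid. Every vertex in this graph has exactly 20 neighbors: 8 in its row, 8 in its column, and 4 more in its box.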

Analysis of Breast Cancer Treatments
Will Madairy ’21 (Charlotte, North Carolina),
Applied mathematics & statistics
-
How do differing breast cancer treatments compare in effectiveness? When considering which treatment is best, we would expect more invasive procedures to lead to better survival outcomes, but is this truly the case? Using data from the Netherlands Cancer Institute, we were able to analyze survival outcomes across different treatments. In this analysis we compare outcomes for patients who underwent chemotherapy, hormonal, and amputation treatments and determine whether any of the treatments leads to longer survival time compared to the others.

Two-dimensional Spatially Varying Carrying Capacity & Elephant Migration
Xilin Niu ’21 (Wuhan, China),
Mathematics
-
In this project, Chuan Ping ’21 and I tried to elucidate the mathematical mechanisms behind elephant migrations. We extended the 1-D scenario of animal migration we learned in class, with regard to spatially varying carrying capacities, to a 2-D space based on real-world data. We seek to model the growth and migration of the elephant population so that after a reasonable amount of time, the distribution of animals in the 2-D space should carry attributes associated with the environment reflected by the carrying capacity. We therefore employed the geographic data of parts of Angola, a country in southwest Africa, and constructed the model to test the hypothesis that our model approximates the real-life distribution.

Signal Processing on the Permutahedron: Tight Spectral Frames for Ranked Data Analysis
Yilin Chen ’21 (Beijing, China),
Computer science, Mathematics
-
In ranked choice voting, voters order their preferences for n candidates. We can view ranked data as residing on the vertices of the permutahedron, with vertices labeled by permutations and edges representing adjacent transpositions. The goal of this project is to investigate a novel ranked data analysis method to identify, interpret, and exploit structure in ranked data. Our method combines spectral graph decomposition from signal processing with symmetry decomposition from representation theory to create an overcomplete dictionary of atoms, each of which captures both smoothness and structural information of data on the permutahedron.

Custom dlib Mouth Feature Detector and Shape Predictor Algorithms
Yilin Chen ’21 (Beijing, China),
Computer science, Mathematics
-
In this project, we investigate the default model ensemble regression tree in the dlib library and implement our custom dlib mouth feature detector. Through literature review of papers closely related to face detection and facial landmarks prediction, we look into various shape predictor algorithms such as histogram of oriented gradients (HOG) and support vector machines (SVM), and also explore different tuning techniques by comparing the standard grid search and global optimization.

What are the key factors affecting the survival of breast cancer patients?
Yiming Miao ’21 (China),
Applied mathematics & statistics, Computer science
-
Cancer has become a critical health problem. Among all sorts of cancer, breast cancer is the second most common one in American women. Motivated to provide statistical insight into this disease, my teammate and I analyzed breast cancer data from the Rotterdam Tumor Bank. This analysis aimed to investigate potential key factors affecting the survival time of breast cancer patients and also to promote awareness of breast cancer screening.

Face Mask Detection Using Machine Learning
Yiming Miao ’21 (China),
Applied mathematics & statistics, Computer science
-
Since 2020, COVID-19 has spread across the whole world, and this historic pandemic has impacted the life of every person on Earth. To prevent the spread of COVID-19, the CDC calls on everyone to wear masks in any indoor public space. To relieve the pressure of supervising this, we built a face mask detection system that automatically detects violations of mask-wearing requirements.

How Do Search Results Differ When Using Different Languages
Yiping Zhong ’21 (Beijing, China),
Computer science
-
When we search “Somalia” in Google, the search engine gives very different results based on our search language. It shows wars, soldiers, and poor villagers when we search in English, whereas it presents a glorious high-tech city in the Somali language. The difference really caught our eye and raised our curiosity to explore how the search results differ when we use different languages.

Appa
Zahara M. Spilka ’21 (Wisconsin),
Computer science
-
For our project, my partner and I created a scene in which a 3D model of the fictional flying bison, Appa, from Avatar: The Last Airbender, glows. To create this glow, we used a bloom shader which makes Appa’s eyes emit a soft blue light.

Which music genre is the most appealing?
Zhangchi Liu ’20 (Shanghai, China),
Applied mathematics & statistics
-
Music recommendation services have grown significantly in the past few years. Services such as Spotify and Apple Music use different algorithms to determine which songs a user is most likely to enjoy based on their past listening habits and on other users who are similar to that user. In real life, our music preferences are also greatly influenced by our friends and families. Therefore, I explore how neighbors’ music preferences affect an agent’s music taste, and aim to determine which genre would end up being the most popular via an agent-based model.

Real-time Facial Expression Recognition Using Machine Learning with OpenCV
Zhenggang Tan ’21 (China),
Applied mathematics & statistics, Computer science
-
When I was in high school, I was a huge fan of the TV series Lie to Me, which is about how a doctor and detective identifies lies by looking at facial expressions, body language, and microexpressions. What if we could have a machine or program that does the detection for us accurately? In this project, we examine a classic way of building an accurate facial expression recognition system using machine learning and the OpenCV package.

Factors that Affect Breast Cancer Survival — A Survival Analysis with Rotterdam Dataset
Zhenggang Tan ’21 (China),
Applied mathematics & statistics, Computer science
-
According to the World Health Organization, cancer is a leading cause of death worldwide, accounting for an estimated 9.6 million deaths in 2018. Among all cancer types, breast cancer (along with lung cancer) is the most common, with 2.09 million cases in 2018. According to the CDC, breast cancer is also the second most common cancer among women in the United States, comprising 22.9% of invasive cancers in women and 16% of all female cancers. We think it would be interesting to take a look at survival and many related statistics of breast cancer.

Network of Odyssey: A Mathematical Quest to Understand Classical Canon
Zhiyuan (Simon) Wang ’21 (Shanghai),
Mathematics
-
Greek Mythology 101.

A Bayesian Analysis of College Rankings
Zuofu Huang ’21 (Nanjing, China),
Applied mathematics & statistics
-
We develop a Bayesian model for predicting the US News ranking of universities and liberal arts colleges. In real life, we observe a lot of variability in the rankings of universities with similar qualifications. This variability is even more unpredictable for liberal arts colleges. With four easily accessible predictors (SAT score, admissions rate, location, and the number of undergraduate students), we are able to achieve a reasonable estimate of the school rankings. Last, we experiment with a Shiny application as a tool for students to “find” their fit with potential undergraduate institutions.