Guanqun Yang 杨冠群

Ph.D. Candidate in Computer Science @ Stevens Institute of Technology

I am currently working towards building NLP systems that are well-tested, secure against adversaries, and interpretable for domain specifications.

Previously, my projects focused mainly on statistical machine learning and its applications in speech signal processing, recommendation systems, and text mining. I am also interested in optimization and graph structures in real-world settings. I also spent considerable time on both the lower-level design and fabrication and the higher-level control and pose estimation of mobile robots.

Statistical Machine Learning and its Applications

Statistical machine learning is the process of incrementally improving the performance of computer systems under specific metrics by employing statistical techniques rather than explicit programming. The core of (supervised) machine learning is to find patterns in given data and apply them to previously unseen data in order to make predictions. Because of the universal need for pattern recognition, machine learning algorithms are applied in a wide range of areas, including art, the humanities, economics, and many others. Examples include neural style transfer in art, KuroNet for recognizing ancient Japanese writing, and algorithmic trading.

Fairness-Preserving Machine Learning

Given the availability of colossal datasets and ever-increasing computing power, the responsibility for decision-making is gradually shifting to algorithms. From something as small as distributing free admissions to a movie premiere to more consequential matters like granting an automobile loan, these systems make decisions that affect people's lives without their awareness. However, many recent cases show that algorithmic decision-making systems can inherit and amplify the bias encoded in data. Applicant Tracking Systems (ATS) used by Amazon showed gender bias against women when evaluating job applicants' resumes, so that equally qualified women failed to get interview opportunities that men received. Credit card approval systems wrongly associate race and gender information with whether or not the applicant would default, creating significant racial and gender disparity in the final approval statistics. Commercial facial recognition systems exhibit as much as 30% performance degradation when the subject has darker skin, which is often associated with race.
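One simple way to quantify the kind of disparity in approval statistics described above is the gap in approval rates between groups (often called the demographic parity gap). The following is a minimal sketch on made-up data; the function names, the group labels, and the decisions are illustrative assumptions and are not taken from the reports listed below.

```python
def approval_rate(decisions):
    """Fraction of positive (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)


def demographic_parity_gap(decisions, groups):
    """Return (gap, per-group rates), where gap = max rate - min rate."""
    by_group = {}
    for decision, group in zip(decisions, groups):
        by_group.setdefault(group, []).append(decision)
    rates = {g: approval_rate(ds) for g, ds in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical approval decisions (1 = approved) for two groups "a" and "b".
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_gap(decisions, groups)
```

A gap of zero means both groups are approved at the same rate. Fairness-preserving algorithms typically trade some predictive accuracy for a smaller gap under a metric of this kind.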

Guanqun Yang, Lingxiao Wang, Deep Learning under Fairness Constraint, 2019
[Full Text]
Guanqun Yang, Fairness: What is the Right Thing to Do? A Comparative Study of Fairness-Preserving Algorithms, 2018
[Presentation Slide]-[Full Text]

Autonomous Motion Planning by Deep Reinforcement Learning for Fall Prevention in Hospitals

The cost and availability of elderly care services have gradually become a major concern with the ever-increasing elderly population. At the same time, mobile robots can perform repetitive tasks with high efficiency and accuracy in these environments, which makes them suitable alternatives to nursing professionals. In this project, we aim to implement a system that detects the danger factors in the environment and then guides people in need to their destination along a secure route found by deep reinforcement learning.

The system consists of two parts in cascade: object detection and motion planning.

Object detection: detect danger factors from the environment with deep learning.
Motion planning: generate a secure path by reinforcement learning.

The environment map is first fed into the system, and the mobile robot explores the environment and annotates each area of the map as safe or dangerous until all areas are annotated. The robot then finds a secure route from the user to his or her destination. Finally, the robot leads the user to the destination.
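The motion-planning stage can be illustrated on a toy annotated map. The sketch below is not the project's deep reinforcement learning system: it uses tabular Q-value iteration (the model-based counterpart of Q-learning) on a hypothetical 4x4 grid, and the grid size, danger cells, rewards, and discount factor are all assumptions chosen for illustration.

```python
ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
DANGER = {(1, 1), (1, 2), (2, 1)}  # hypothetical cells annotated as dangerous
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.95  # discount factor (assumed)


def step(state, action):
    """Deterministic move; bumping into a wall leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < ROWS and 0 <= c < COLS):
        r, c = state
    nxt = (r, c)
    if nxt == GOAL:
        return nxt, 10.0
    if nxt in DANGER:
        return nxt, -10.0
    return nxt, -1.0


def q_value_iteration(sweeps=200):
    """Repeatedly apply the Bellman optimality update to every (state, action)."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    q = {(s, a): 0.0 for s in states for a in range(len(ACTIONS))}
    for _ in range(sweeps):
        for s in states:
            if s == GOAL:  # terminal state, no update needed
                continue
            for a, move in enumerate(ACTIONS):
                nxt, reward = step(s, move)
                future = 0.0 if nxt == GOAL else max(
                    q[(nxt, b)] for b in range(len(ACTIONS)))
                q[(s, a)] = reward + GAMMA * future
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL is reached."""
    path, s = [START], START
    while s != GOAL and len(path) <= max_steps:
        a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
        s, _ = step(s, ACTIONS[a])
        path.append(s)
    return path


q_table = q_value_iteration()
safe_path = greedy_path(q_table)
```

Because entering a danger cell is penalized far more heavily than taking an ordinary step, the greedy route follows the border of the grid instead of cutting through the annotated cells.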

Guanqun Yang, Japan-US-Canada Advanced Collaborative Education Program (JUACEP), 2018 at Nagoya University
Advisor: Prof. Yoji Yamada
[Presentation Slide]-[Full Text]

Speech Segment Identification for Person Recognition

The similarity measurement and estimation system is an important component of an automatic speaker recognition (ASR) system, and it largely determines the performance of such systems. However, the accuracy of similarity evaluation degrades when the speech segment is short or corrupted by noise. In this project, we develop a speech segment verification system specifically designed to address these issues.

Multiple traditional algorithms, including SVM, GMM, and neural networks, were attempted but proved ineffective because of the problems mentioned above. The dynamic time warping (DTW) algorithm, however, turned out to be successful. Specifically, the similarity is measured by the DTW algorithm and used to derive a threshold at training time. With the help of this threshold, test speech segments are compared, and a similarity measure is produced.
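The DTW-plus-threshold idea can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it aligns one-dimensional sequences with an absolute-difference cost, whereas a real system would compare per-frame acoustic feature vectors, and the function names and the threshold value are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def is_similar(a, b, threshold):
    """Accept the pair when the DTW distance does not exceed the threshold."""
    return dtw_distance(a, b) <= threshold
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated frame is absorbed by the warping path, which is what makes DTW tolerant to differences in speaking rate.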

Two critical measures, TPR (sensitivity) and FPR (the complement of specificity), are less than 15% under clean conditions and less than 35% under noisy conditions with 10 dB babble noise.

Yucong Wang, Jingjing Zhang, Guanqun Yang, Zhengtao Zhou, Design and Implementation of Speaker Similarity Estimation System based on UCLA Variability Database, 2018
[Full Text]

Popularity Analysis of Twitter Hashtags for SuperBowl 2015

Twitter is a major social networking service where people can post and interact with messages known as tweets. One of the most important features of this service is the hashtag. Spontaneously added by users, hashtags can be used to categorize tweets by topic. In this project, we explore six hashtags collected over the three weeks around SuperBowl 2015 (specifically, from two weeks before to one week after the event): #GoHawks, #GoPatriots, #NFL, #Patriots, #SB49, and #SuperBowl.

We employ only the linear model and investigate its power and shortcomings for time-series prediction. In more detail, five features (the number of tweets, the number of retweets, the sum and the maximum of the number of followers for a specific hashtag, and the time of day in UTC) are extracted from the Twitter metadata, and a linear model trained on data from time-step \(k\) (in hours) is used to predict the Twitter activity at time-step \(k+1\). In order to validate the predictions of the linear models, \(t\)-tests are conducted, and the associated p-values are compared. Our analysis shows the following:

Dynamics of some hashtags are more difficult to predict than others, and this does not come from their internal statistics. For example, the hashtag #GoHawks could fit the linear model well while the fit for #GoHawks is surprisingly poor.
Linear models can still make fair predictions when nonlinearity is mild. However, when the system dynamics are strongly nonlinear, the predictions fail. For example, the hashtag activity on Feb. 01, the event day, could not be well predicted for any hashtag.
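The one-step-ahead setup described above (features at hour \(k\), target at hour \(k+1\)) can be sketched with ordinary least squares. The data below are synthetic and the coefficients are made up so that the fit can be checked; the function names are assumptions, and nothing here reproduces the actual Twitter data or results.

```python
import numpy as np


def design_matrix(features):
    """Prepend an intercept column to an (hours, n_features) array."""
    return np.column_stack([np.ones(len(features)), features])


def fit_linear(features, targets):
    """Ordinary least squares fit; returns [intercept, coefficients...]."""
    coef, *_ = np.linalg.lstsq(design_matrix(features), targets, rcond=None)
    return coef


def predict(coef, features):
    return design_matrix(features) @ coef


# Synthetic hourly data: five features per hour (stand-ins for tweet count,
# retweet count, follower sum, follower maximum, and hour of day).
rng = np.random.default_rng(0)
hours = 48
feats = rng.uniform(0.0, 100.0, size=(hours, 5))
true_coef = np.array([3.0, 0.5, 0.2, 0.01, 0.05, -0.1])  # made-up values

# Activity at hour k+1 depends linearly on the features at hour k.
tweets = np.zeros(hours)
tweets[1:] = design_matrix(feats[:-1]) @ true_coef

# Pair the features at hour k with the activity at hour k+1.
coef = fit_linear(feats[:-1], tweets[1:])
next_hour = predict(coef, feats[-1:])
```

On noise-free linear data the fit recovers the generating coefficients. On real hashtag activity, the residuals of such a fit are what reveal the nonlinearity around the event day.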

Graph Structure in Real World Problems

The elements of a graph (vertices, edges, and the weights associated with edges) make it suitable for representing relations, which leads to many applications of graph structures in social networks, transportation, and other areas. The following are two of my projects that explore the graph structure of cooperative relations in the entertainment industry and the general properties of public transportation.

Cooperative Relation between Movie Actors/Actresses

The IMDb database provides a comprehensive summary of numerous properties of movies, including actor/actress lists, genres, ratings, and many others. In this project, we explore cooperative relations in the entertainment industry and evaluate the influence of actors/actresses.

The collaboration graph between actors/actresses is a weighted directed graph where each vertex represents an actor or actress and the weight

\[ w_{ij}=\frac{\vert S_i \cap S_j\vert}{\vert S_i\vert} \]

\(\vert S_i \cap S_j \vert\): number of movies in which actors/actresses \(i\) and \(j\) have collaborated
\(\vert S_i \vert\): number of movies in which actor/actress \(i\) has appeared

indicates the level of collaboration between two people. When we explore the edges associated with some famous actors/actresses, the results match our expectations (see table alongside). However, when we rank the ten most influential actors/actresses by PageRank score, contrary to our prior knowledge, none of them is famous.
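The edge weight \(w_{ij}\) defined above can be computed directly from each person's set of movies. The sketch below uses a tiny made-up filmography; the names `actor_a`, `actor_b`, `actor_c` and the movie identifiers are placeholders, not IMDb data.

```python
from itertools import permutations


def collaboration_weights(filmography):
    """Return {(i, j): |S_i & S_j| / |S_i|} for every ordered pair that
    shares at least one movie. The graph is directed because the weight is
    normalized by the size of the source vertex's movie set."""
    weights = {}
    for i, j in permutations(filmography, 2):
        shared = len(filmography[i] & filmography[j])
        if shared:
            weights[(i, j)] = shared / len(filmography[i])
    return weights


# Hypothetical data: each person maps to the set of movies they appeared in.
filmography = {
    "actor_a": {"m1", "m2", "m3", "m4"},
    "actor_b": {"m1", "m2"},
    "actor_c": {"m5"},
}
weights = collaboration_weights(filmography)
```

Here the edge from `actor_b` to `actor_a` has weight 1.0 while the reverse edge has weight 0.5: all of `actor_b`'s movies involve `actor_a`, but only half of `actor_a`'s movies involve `actor_b`. This asymmetry is why the graph is directed.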

Uber Movement in San Francisco

This project analyzes the graph structure of more than 1.6 million trip records in the San Francisco area during December 2017. The major components of the graph (vertices, edges, and weights) are first extracted from the metadata. They are then used to generate the graph and find its giant connected component (GCC), which is finally converted to a valid undirected, simple, weighted graph.

Multiple graph structures are explored in the resulting graph, including the minimum spanning tree (MST) and maximum flow. In addition, an approximation algorithm is applied to the traveling salesman problem on this graph.
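As an illustration of one of the structures mentioned above, the following sketch computes a minimum spanning tree with Kruskal's algorithm and a union-find structure. The choice of Kruskal's algorithm and the four-vertex example graph are assumptions for illustration; the weights are not Uber Movement data.

```python
def kruskal_mst(num_vertices, edges):
    """Return the MST edges of a connected undirected weighted graph.

    `edges` is a list of (weight, u, v) tuples with vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Find the component root, compressing the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges):  # consider cheapest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # the edge joins two components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Hypothetical graph: weights stand in for mean travel times between zones.
edges = [(4.0, 0, 1), (1.0, 0, 2), (2.0, 1, 2), (7.0, 1, 3), (3.0, 2, 3)]
mst = kruskal_mst(4, edges)
total_weight = sum(weight for weight, _, _ in mst)
```

A spanning tree like this is also a common starting point for approximate traveling salesman tours: when the edge weights satisfy the triangle inequality, visiting the vertices in preorder of the tree gives a tour of at most twice the optimal length.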

The metadata is available here.

Convex Optimization

Convex optimization is the backbone for solving problems arising in areas such as machine learning, control, estimation, signal processing, and even finance. Many machine learning algorithms owe their popularity to the underlying convexity of their formulation, including linear and logistic regression, the SVM (Support Vector Machine), and others. Even though convex optimization is largely seen as a technology, which reflects the maturity of many existing algorithms, an SVM classifier is implemented from scratch with the help of CVX in order to better understand the mechanism of this process.
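For reference, the standard soft-margin linear SVM is the following convex quadratic program, which is the kind of problem a solver such as CVX handles directly. It is stated here as an assumption about the textbook formulation, not necessarily the exact variant used in this project; \(x_i\) are the training points, \(y_i \in \{-1, +1\}\) their labels, \(\xi_i\) the slack variables, and \(C > 0\) a regularization parameter.

\[ \min_{w,\, b,\, \xi} \ \frac{1}{2}\Vert w \Vert_2^2 + C\sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i\left(w^\top x_i + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n \]

The objective is a convex quadratic and the constraints are linear in \((w, b, \xi)\), so any local minimum is a global minimum.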

A Generic Linear Classifier Implementation for Image Recognition

Linear SVM is a popular linear classification algorithm extensively used in many applications, including the classification of images, texts, and even chemical compounds. Depending on the number of classes, the tasks are categorized into binary and multiclass classification problems. Multiclass classification can be solved by several strategies, including transformation into binary classification, extension of binary classification, and hierarchical classification. In this project, we classify handwritten digits from the MNIST dataset using the first strategy. Suppose we have \(K\) classes of data points to be classified (in our dataset, \(K=10\)); then we have two schemes:

One-versus-One (OvO): \(\frac{K(K-1)}{2}\) classifiers are generated at training time. At test time, each query point is tested on all classifiers. For each classifier, the membership of the query point is determined and recorded. After \(\frac{K(K-1)}{2}\) rounds of comparisons, the query point will have votes for class 1, class 2, \(\cdots\), class \(K\). Finally, the query point is assigned to the class with the maximum number of votes.
One-versus-All (OvA): \(K\) classifiers are generated at training time. At test time, each query point is also tested on all classifiers. For each classifier, the membership can be determined only when the query point is assigned to that classifier's specific class. In other cases, the decision is no longer valid, and a random membership is assigned.
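The OvO voting step can be sketched as follows. The pairwise classifiers here are hypothetical nearest-centroid rules on one-dimensional points, used only as stand-ins for the trained binary SVMs; the function names and centroid values are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def ovo_predict(x, pairwise, num_classes):
    """Run all K(K-1)/2 pairwise classifiers and return the majority class."""
    votes = Counter()
    for i, j in combinations(range(num_classes), 2):
        votes[pairwise[(i, j)](x)] += 1
    # Ties are broken in favor of the smallest class index.
    return max(range(num_classes), key=lambda c: votes[c])


# Stand-in binary classifiers: pick whichever of two centroids is closer.
centroids = [0.0, 5.0, 10.0]


def make_pairwise(i, j):
    return lambda x: i if abs(x - centroids[i]) <= abs(x - centroids[j]) else j


num_classes = len(centroids)
pairwise = {(i, j): make_pairwise(i, j)
            for i, j in combinations(range(num_classes), 2)}
label = ovo_predict(4.2, pairwise, num_classes)
```

With \(K=3\) there are three pairwise classifiers; for MNIST with \(K=10\) the same loop would run 45 of them per query point.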

From the description of the two schemes, the OvO scheme should provide more accurate predictions, which is evident in the figure alongside. Note that the Sampling Ratio is the fraction of data points used from the original dataset.

Previous Projects

The following projects were completed during my undergraduate studies at Northeastern University, China. These projects mainly focused on the design and implementation of robotics control systems.

Pose Estimation of Mobile Robots Based on the Integration of IMU and Vision

Pose estimation is an integral part of mobile robot locomotion, and it directly determines the locomotion accuracy of such systems. Both IMU-based kinematic methods and vision-based methods are able to provide pose information for robots. However, the former is sensitive to uneven ground, while the performance of the latter degrades when illumination fluctuates. In this project, these two methods are integrated under the Kalman filter framework. The two sources of information are first integrated using the Extended Kalman Filter (EKF). Then, in order to handle sudden maneuvers of the mobile robot and to adaptively adjust the contribution of the two sources, the Interacting Multiple Model (IMM) framework is employed.
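The core idea of weighting two sources by their uncertainty can be shown with a scalar Kalman measurement update. This is a deliberately simplified linear stand-in, not the EKF/IMM pipeline of the thesis: the state is a single heading value, and the measurement values and variances are made up.

```python
def kalman_update(x, p, z, r):
    """Scalar Kalman measurement update.

    x, p: prior estimate and its variance
    z, r: measurement and its noise variance
    """
    gain = p / (p + r)
    return x + gain * (z - x), (1.0 - gain) * p


def fuse(prior_x, prior_p, imu_z, imu_r, vision_z, vision_r):
    """Apply the IMU and vision measurements one after the other."""
    x, p = kalman_update(prior_x, prior_p, imu_z, imu_r)
    return kalman_update(x, p, vision_z, vision_r)


# Hypothetical heading readings (in degrees) with equal noise variance and
# a nearly uninformative prior.
heading, variance = fuse(0.0, 1e6, imu_z=1.0, imu_r=1.0,
                         vision_z=3.0, vision_r=1.0)
```

With equal noise variances the fused heading lands midway between the two readings and its variance is about half that of either sensor. Raising `vision_r` (for example, when illumination fluctuates) shifts the estimate toward the IMU reading.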

Experimental results indicate the validity of such systems.

The mobile robot is instructed to move in a square, and the left part of the figure is the ground truth pose acquired using a high-precision motion capturing device. The pose estimation provided by Kinect lost track of features because of environment change and therefore provided inaccurate information about the pose (the top figure on the right). After integrating this pose with the one provided by IMU, the position estimation is improved, and the orientation estimation is corrected (see marked part), showing the success of this integration.

Pose Estimation of Mobile Robots Based on the Integration of IMU and Vision (in Chinese)
Bachelor Thesis, Northeastern University (China), 2017
[Full Text in Chinese]-[Abstract in English]

SVPWM Controller Implementation for 3-Phase Asynchronous Motor

The three-phase asynchronous motor features non-linearity in its mathematical model and tight electromagnetic coupling, which makes this family of motors hard to control effectively using regular PID controllers alone. The space vector pulse width modulation (SVPWM) algorithm generates sinusoidal waveforms through simple ON/OFF operations of power electronic devices such as IGBTs. By incorporating PID controllers into the system, this project implements a double-loop SVPWM governing system in C++ on the TMS320F2812 DSP by Texas Instruments. The 3-phase asynchronous motor in the SVPWM system could be governed like the AC motor with high precision and low latency.
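The central calculation in SVPWM is to locate the reference voltage vector in one of six sectors and split the switching period between the two adjacent active vectors and the zero vectors. The sketch below uses the standard textbook dwell-time formulas in Python for readability; it is not the project's C++ DSP code, and the function name, variable names, and numeric values are assumptions. It is valid in the linear modulation range, where the reference magnitude does not exceed \(V_{dc}/\sqrt{3}\).

```python
import math


def svpwm_dwell_times(v_alpha, v_beta, v_dc, t_s):
    """Return (sector, t1, t2, t0) for a reference vector in the
    stationary alpha-beta frame.

    t1, t2: dwell times of the two active vectors bounding the sector
    t0:     remaining time shared by the zero vectors
    """
    magnitude = math.hypot(v_alpha, v_beta)
    theta = math.atan2(v_beta, v_alpha) % (2.0 * math.pi)
    sector = min(int(theta // (math.pi / 3.0)), 5) + 1  # sectors 1..6
    local = theta - (sector - 1) * math.pi / 3.0        # angle inside sector
    scale = math.sqrt(3.0) * t_s * magnitude / v_dc
    t1 = scale * math.sin(math.pi / 3.0 - local)
    t2 = scale * math.sin(local)
    t0 = t_s - t1 - t2
    return sector, t1, t2, t0


# Hypothetical operating point: 300 V DC bus, 100 us switching period,
# 100 V reference aligned with the alpha axis.
sector, t1, t2, t0 = svpwm_dwell_times(100.0, 0.0, v_dc=300.0, t_s=1e-4)
```

In a double-loop drive, the inner current loop would supply `v_alpha` and `v_beta` every switching period, and the resulting dwell times would be converted into compare values for the PWM timers.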

Design and Implementation of a Flapping Propulsion Plant for Underwater Robot

Some underwater tasks, such as seabed exploration, require relatively low velocity but high flexibility, which makes the traditional screw propulsion solution no longer applicable. In this project, we design and implement a bionic propulsion system inspired by sea turtles, in which the flipper can move in four degrees of freedom (elevating/heaving and walking/surging), making low-velocity underwater maneuvering possible.

This propulsion system consists of two cylinders, each of which controls two degrees of freedom. The flipper is driven by an AC motor in the outer cylinder, which is in turn controlled by another AC motor fixed in the inner cylinder. Based on the prototype of our design, multiple key parameters of the system model were identified experimentally and analytically. Furthermore, different control algorithms were evaluated to achieve a rapid response of the prototype's mechanical structure. Specifically, the deadbeat and ripple-free algorithm showed the best dynamic and static characteristics. The prototype of this propulsion system demonstrated its flexibility and extensibility for underwater exploration in the low-speed setting.

Guanqun Yang, Qiwang Jia, Dong Zhao, Hongpeng Yang, Undergraduate Innovation Initiative, Chinese Academy of Sciences, 2016
Advisor: Prof. Haitao Gu