Date & Time:
November 8, 2019 1:30 pm – 2:30 pm
Location:
Crerar 298, 5730 S. Ellis Ave., Chicago, IL
MS Presentation: Zhi Hong

Enabling Generalizable Scientific Named Entity
Recognition

Over the past decades, we have witnessed explosive growth in
computer hardware capabilities. Machine learning and deep learning
models, whose theoretical foundations were established
long ago, are finally computationally feasible. This does not only
affect computer science. In fact, more and more disciplines are
turning into “data sciences”, with cheaper, safer, easier data-based
simulations providing insights and guidance for traditional
experiments. These data-based methods require large amounts of
data, especially structured data that can be easily understood and
processed by computers. Yet scientists have relied on written papers,
not digital databases, to disseminate their discoveries for several
centuries. Scientific papers are intended to be read by humans, and
most adequately convey not only discoveries, but the conditions and
methods by which those discoveries were made. Unfortunately, the
ambiguity and variability inherent in natural language make the
automated extraction of claims from scientific papers very difficult.
Even apparently simple tasks, such as isolating reported values for
physical quantities (e.g., “the melting point of X is Y”), can be
complicated by such factors as domain-specific conventions about how
named entities (the X in the example) are referenced. Although there
are domain-specific toolkits that can handle such complications in
certain areas, a generalizable, adaptable model for scientific texts
is still lacking. In this thesis, we present our first step towards
automating this process. We have designed, implemented, and
evaluated models based on classifiers and neural networks for
recognizing scientific entities in free text in multiple domains.
Experiments show that our neural network model outperforms a leading
domain-specific extraction toolkit by up to 50%, as measured by F1
score, while also being easily adapted to new domains.

Zhi Hong

M.S. Candidate, University of Chicago

Zhi's advisor is Prof. Ian Foster
