Program Overview
Day | Date | Focus | Location |
---|---|---|---|
Day 1 | Tuesday, Dec. 3 | Training workshops | Spieker Forum at Chou Hall |
Day 2 | Wednesday, Dec. 4 | Talks and posters | Spieker Forum at Chou Hall |
Day 3 | Thursday, Dec. 5 | Talks and posters | Spieker Forum at Chou Hall |
Day 4 | Friday, Dec. 6 | Collaboration and coding | BIDS (190 Doe) |
Register for TextXD 2019 - Submit a poster application
Day 1: Tuesday, December 3rd (Workshops)
Location: Spieker Forum at Chou Hall
These workshops will generally be interactive coding sessions with Jupyter notebooks, so we strongly recommend bringing a laptop with a working installation of Anaconda / Python. No prior experience with text analysis is assumed.
Time | Topic | Speaker | Institution |
---|---|---|---|
9am | Breakfast | ||
9:30am | Welcome | Claudia von Vacano | D-Lab |
9:40am | Text as Data Introduction | Jaren Haber | UC Berkeley, Sociology |
10:35am | Web APIs and Scraping | Geoff Bacon | UC Berkeley, Linguistics |
11:30am | Coffee Break | ||
11:45am | Topic modeling | Ilya Akdemir | UC Berkeley, Law |
12:45pm | Lunch | ||
1:40pm | Word embeddings | Alina Arseniev-Koehler | UCLA, Sociology |
2:45pm | Supervised machine learning | Caroline Le Pennec-Caldichoury | UC Berkeley, Economics |
3:45pm | Coffee Break | ||
4pm | Deep learning | Dima Lituiev | UC San Francisco, Bakar Computational Health Sciences Institute |
5pm | Discussion | ||
Day 2: Wednesday, December 4th (Talks)
Location: Spieker Forum at Chou Hall
Time | Topic | Speaker | Institution |
---|---|---|---|
9am | Breakfast | ||
9:30am | Welcome | Heather Haveman | UC Berkeley, Sociology & Business |
9:40am | Keynote | Chris Potts | Stanford University, Linguistics |
10:30am | Session 1 - Psychological Threads | ||
“I come before you a changed man”: Historical Changes in the Vocabulary of Parole Release Decisions | Isaac Dalke | UC Berkeley, Sociology | |
“The words of trauma” - Text Analysis of the effect of World War II on Salinger’s literature | Anat Talmon, Chen Edelsburg, Nimrod Talmon | Stanford University, Psychology and Tel Aviv University | |
11:15am | Coffee Break | ||
11:30am | Session 2 - Policy | ||
Gender Stereotypes in Professor-Student Interactions | Zachary Bleemer | UC Berkeley, Economics | |
State-level racial attitudes and adverse birth outcomes: applying natural language processing to Twitter data to quantify state context for pregnant women | Thu Nguyen | UC San Francisco, Epidemiology & Biostatistics | |
NLP approaches to detecting behavioral failures in sustainable transportation infrastructure | Omar Isaac Asensio | Georgia Institute of Technology, Public Policy | |
12:30pm | Lunch + Poster session | ||
Exploratory Expansion of Accounting Word Lists using Word-Embedding Models on SEC Filings | Brian Chivers | ||
Title TBD | Raquel Coelho | ||
Refugee Education: A Survey of Topics and Trends in Newswires and Press Releases, 2009 to 2018 | Seungah Lee | ||
Who is cuing whom? The dual process of shaping knowledge gap in climate change communication | Yijyun Lin | ||
The Limits of Interest: Capture, Financialization, or Contestation in the Politics of Rule-Making on Derivatives | Konrad Posch | ||
Predicting Semantic Fluency Using Large-scale Language Corpora | Zhihao Zhang | ||
Applying natural language processing algorithms to detect behavioral failures in emerging electric vehicle infrastructure | Sooji Ha | ||
1:30pm | Keynote: Towards Universal Language Understanding | Yunyao Li | IBM, Scalable Knowledge Intelligence |
2:15pm | Session 3 - Theory and Methods | ||
Interpreting and improving NLP models via disentangled interpretations | Chandan Singh | UC Berkeley, Computer Science | |
Cross-domain classification | Barea Sinno | University of Texas at Austin, Ohio State University | |
Automated methods enable direct computation on phenotypic descriptions for novel candidate gene prediction | Ian Braun | Iowa State University, Computational Biology | |
3:15pm | Coffee Break | ||
3:30pm | Session 4 - Politics | ||
Detecting Meaningful Multi-word Expressions in Political Text | Kenneth Benoit | London School of Economics, Methodology | |
Who speaks for Women in the Indian Parliament? | Saloni Bhogale | Ashoka University, Trivedi Centre | |
Sentiment is Not Stance: Target-Aware Classification for Political Text Analysis | Samuel E. Bestvater, Burt Monroe | The Pennsylvania State University, Political Science | |
4:30pm | Keynote | Justin Grimmer | Stanford University, Political Science |
5:30pm | Reception - Berkeley Institute for Data Science (190 Doe Library) | ||
Day 3: Thursday, December 5th (Talks)
Location: Spieker Forum at Chou Hall
Time | Topic | Speaker | Institution |
---|---|---|---|
9am | Breakfast | ||
9:30am | Welcome | ||
9:40am | Keynote | Kathleen Carley | Carnegie Mellon University, Computer Science |
10:30am | Session 5 - Innovation | ||
Quantifying Innovation with BERT: Linguistic Prescience and Firm Stock Returns | Paul Vicinanza | Stanford University, Graduate School of Business | |
Identifying (Dis)Continuities in Ed Tech’s Discourse of Invention | Sebastian Muñoz-Najar Galvez | Stanford University, Graduate School of Education | |
11:15am | Coffee Break | ||
11:30am | Session 6 - Public Health | ||
NLP for conversational dialog | Orianna DeMasi | UC Davis, Computer Science | |
#Vape: Measuring E-cigarette Influence on Instagram with Deep Learning and Text Analysis | Julia Vassey | UC Berkeley, Public Health | |
No More Silence: Monitoring Bias with Word2Vec | Lauren Kaplan | UC San Francisco, Medicine | |
12:30pm | Lunch + Poster session | ||
Natural Language Processing for Materials Discovery and Design | John Dagdelen | ||
Teaching machine synthesis: collecting dataset of “codified synthesis recipes” extracted from millions of publications | Olga Kononova | ||
A Transparent and Adaptable Method to Extract Colonoscopy and Pathology Data Using Natural Language Processing | Liyan Liu | ||
Understanding emerging forms of cannabis use through online communities | Meredith Meacham | ||
Making Sense of Clinical Trial Descriptions: A Text Analysis Approach | Munif Ishad Mujib | ||
Impacts of the Arts | Gabriel Harp | ||
FrameNet and Natural Language Processing | Miriam R L Petruck, Collin Baker | ||
1:30pm | Session 7 - Lightning Talks | ||
Hidden Political Dynasties in China: Analyzing Chinese Baby Names as Ultra-Short Political Text Data | Tao Li | University of Macau, Government & Public Administration | |
Are both policemen and policewomen police officers? The gender connotations of gender-fair language | Alina Arseniev-Koehler | UCLA, Sociology | |
Uses of the Machine-learning Protest Event Database System | Alex Hanna | Google, ML Fairness | |
A pipeline for analyzing Akkadian texts | Aleksi Sahala | University of Helsinki, Linguistics | |
Summer Institute in Computational Social Science in the San Francisco Bay Area: Computational Social Science for Social Good | Jaren Haber and Jae Yeon Kim | UC Berkeley | |
2pm | Session 8 - Biomedical | ||
Application of text mining methods to identify lupus nephritis from electronic health records | Milena Gianfrancesco | UC San Francisco, Medicine | |
Unstructured Text Analysis in Electronic Health Records to Characterize Sepsis Presentation | Meghana Bhimarao | Kaiser Permanente, Division of Research | |
Extracting patient-reported functional status and disease activity information from electronic health records | Tome Eftimov | Stanford University, Biomedical Data Science | |
Natural language processing for automated rapid cancer ascertainment | Liyan Liu | Kaiser Permanente, Division of Research | |
3:15pm | Coffee Break | ||
3:30pm | Session 9 - News and Media | ||
“Downloading” the news: Reproducible access to text as data | Cody Hennesy | University of Minnesota, Libraries | |
Media Attention and Bureaucratic Responsiveness | Aaron Erlich | McGill University, Political Science | |
Using Text Data as Alternative | Jae Yeon Kim | UC Berkeley, Political Science | |
4:30pm | Keynote | Brandon Stewart | Princeton University, Sociology |
5:30pm | Reception - Tap Haus, 2518 Durant Ave | ||
Day 4: Friday, December 6th (Collaboration)
Location: Berkeley Institute for Data Science (190 Doe Library)
Theme: Text Analysis for Social Good
Day 4 will be at BIDS and will include a hackathon component as well as parallel breakout sessions for discussing major issues in text analysis / NLP. The hackathon will feature multiple projects with associated datasets and starter Jupyter notebooks. Participants will form teams and apply text analysis methods of their choice, potentially leading to future research collaborations. Breakout sessions will feature introductory presentations followed by facilitated discussions leading to summary recommendations on the chosen topic.
Time | Topic | Breakout session(s) |
---|---|---|
9am | Breakfast | |
9:30am | Welcome - David Mongeau, BIDS | |
9:40am | Project introductions | |
10am | Coding / collaboration | Pedagogy of Text Analysis - Evan Muzzall |
11am | Coffee Break | |
11:15am | Coding / collaboration | Text Analysis for Social Good |
12:30pm | Lunch | |
1:30pm | Coding / collaboration | TextXD 2020 priorities |
3pm | Coffee Break | |
3:15pm | Coding / collaboration | |
4pm | Report back & conference close | |