Archive for the ‘Natural Language Processing (NLP)’ Category

Leveraging Clinical Narrative Text at Scale: Applying Natural Language Processing to Understand the Patient Care Experience @NIH.gov



November 5, 2020
1:00 – 2:30 pm EST

Leveraging Clinical Narrative Text at Scale

View Webinar


Recent advances in biomedical informatics, such as the application of machine learning and natural language processing (NLP) to Electronic Health Record (EHR) data, have resulted in breakthroughs in the ability to identify, classify, and predict aspects of patient care, symptoms, and communication in the clinical record. These new tools allow researchers and clinicians to generate and examine new research questions about patient care quality and health-related quality of life.

This webinar presents four research studies highlighting different NLP applications in EHR free text. Topics include (1) phenotyping for treatment-related side effects and symptom severity, (2) cancer treatment-related side effects, (3) non-adherence to cancer therapy and recurrence, and (4) documentation of patient care goals.

Presenters discuss how these methodologies support the capture and use of real-world data for precision oncology care and population health management, and how to scale up applications within and across health care systems.


Julian Hong, MD, MS
Assistant Professor
Healthcare Delivery Research Program
Department of Radiation Oncology
UCSF Medical Center

Charlotta Lindvall, MD, PhD
Instructor, Medicine, Harvard Medical School
Physician, Department of Psychosocial Oncology and Palliative Care
Dana-Farber Cancer Institute

Tina Hernandez-Boussard, PhD
Associate Professor
Stanford University School of Medicine

Imon Banerjee, PhD
Assistant Professor
Department of Biomedical Informatics
Emory University School of Medicine

Intended Audience

Extramural investigators at junior, mid, and senior levels. Basic familiarity with data science methods may be helpful, but not required.

For more information, please contact Roxanne Jensen.

This webinar will be archived on the HDRP Events webpage.




re:Invent 2020 – Virtual 3-Week Conference, Nov. 30 – Dec. 18, 2020: How Healthcare & Life Sciences leaders are using AWS to transform their businesses and innovate on behalf of their customers.


Preview the tracks that will be available, along with the general agenda, on the website. 





AWS re:Invent 2020 – Life Sciences Attendee Guide

AWS re:Invent routinely fills several Las Vegas venues with standing-room-only crowds, but this year we are bringing it to you as an all-virtual, free event. This year’s conference is gearing up to be our biggest yet, and we have an exciting program planned with five keynotes, 18 leadership sessions, and over 500 breakout sessions beginning November 30. Hear how AWS experts and inspiring members of the Life Sciences & Genomics industry are using cloud technology to transform their businesses and innovate on behalf of their customers. For Life Sciences attendees looking to get the most out of their experience, follow these steps:

  • Register for re:Invent.
  • Take a look at all of the Life Sciences sessions available, as well as lots of other information and additional activities, in our curated Life Sciences Attendee Guide coming soon!
  • Check back on this post regularly, as we’ll continually update it to reflect the newest information.

Life Sciences at re:Invent 2020

AWS enables pharma and biotech companies to transform every stage of the pharma value chain, with services that enhance data liquidity, operational excellence, and customer engagement. AWS is the trusted technology provider with the cost-effective compute and storage, machine learning capabilities, and customer-centric know-how to help companies embrace innovation and bring differentiated therapeutics to market faster.

Life Sciences sessions

Bookmark this blog and check back for direct links to each session and add to your re:Invent schedule as soon as the session catalog is released:

LFS201: Life Sciences Industry: Executive Outlook
Learn how AWS technology is helping organizations improve their data liquidity, achieve operational excellence, and enhance customer engagement.

LFS202: Improving data liquidity in Roche’s personalized healthcare platform
Learn how Roche’s personalized healthcare platform is accelerating drug discovery and transforming the patient journey with digital technology.

LFS302: AstraZeneca genomics on AWS: from petabytes to new medicines
Learn how AstraZeneca built an industry leading genomics pipeline on AWS to analyze 2 million genomes in support of precision medicine.

LFS303: Building patient-centric virtualized trials
Learn how Evidation Health architects on AWS to create patient-centric experiences in decentralized and virtual clinical trials.

LFS304: Streamlining manufacturing and supply chain at Novartis
Learn how Novartis is creating real-time analytics and transparency in the pharma manufacturing process and supply chain to bring innovative medicines to market.

LFS305: Accelerating regulatory assessments in life sciences manufacturing
Learn how Merck leveraged Amazon Machine Learning to build an evaluation and recommendation engine for streamlining pharma manufacturing change requests.

Other related sessions of interest:

ENT203: How BMS automates GxP compliance for SAP systems on AWS

GPS211: AWS Partners driving innovation amidst COVID-19

HLC203: Securing Personal Health Information and High Risk Data Sets

WPS202: Transform research environments with Service Workbench on AWS

AIM310: Intelligent document processing for healthcare organizations

Healthcare Attendee Guide

AWS re:Invent routinely fills several Las Vegas venues with standing-room-only crowds, but this year we are bringing it to you as an all-virtual, free event. This year’s conference is gearing up to be our biggest yet, and we have an exciting program planned for the Healthcare industry with five keynotes, 18 leadership sessions, and over 500 breakout sessions beginning November 30. See how AWS experts and talented members of the Healthcare industry are using cloud technology to transform their businesses and innovate on behalf of their customers. For Healthcare attendees looking to get the most out of their experience, follow these steps:

  • Register for re:Invent.
  • Take a look at all of the Healthcare sessions available, as well as lots of other information and additional activities, in our curated Healthcare Attendee Guide coming soon!
  • Check back on this post regularly, as we’ll continually update it to reflect the newest information.

Healthcare at re:Invent 2020

AWS is the trusted technology partner to the global healthcare industry. For over 12 years, AWS has established itself as the most mature, comprehensive, and broadly adopted cloud platform and is trusted by thousands of healthcare customers around the world—including the fastest-growing startups, the largest enterprises, and leading government agencies. The secure and compliant AWS technology enables the highly regulated healthcare industry to improve outcomes and lower costs by providing the tools to unlock the potential of healthcare data, predict healthcare events, and build closer relationships with patients and consumers. The healthcare track at re:Invent 2020 will feature customer-led sessions focused on each of these critical components, accelerating the transformation of healthcare.

Healthcare sessions

Learn more and bookmark each Healthcare session:

HCL201: Healthcare Executive Outlook: Accelerating Transformation
Learn how AWS is working with industry leaders to increase their pace of innovation, unlock the potential of their healthcare data, help predict patient health events, and personalize the healthcare journey for their patients, consumers, and members.

HLC202: Making Healthcare More Personal with MetroPlus Health
Learn how MetroPlus Health leveraged AWS technology to quickly build and deploy an application that personally and proactively reached out to its members during a time of critical need.

HLC203: Securing Personal Health Information and High Risk Data Sets
Learn how Arcadia developed a HITRUST CSF Certified platform by leveraging AWS technology to enable the secure management of data from over 100 million patients.

HLC204: Accelerating the Transition to Virtual Care with AWS
Learn how MedStar Health developed and deployed two call centers in less than a week that are supporting more than 3,500 outpatient telehealth sessions a day.

WPS202: Transform research environments with Service Workbench on AWS
Learn how Harvard Medical School is using AWS to procure and deploy domain-specific data, tools, and secure IT environments to accelerate research.

WPS209: Reinventing medical imaging with machine learning on AWS
Learn how Radboud University Medical Center uses AWS to power its machine learning imaging platform with 45,000+ registered researchers and clinicians from all over the world.

WPS211: An introduction to healthcare interoperability and FHIR Works on AWS
Learn about FHIR Works on AWS, an open-source project designed to accelerate the industry’s use of the interoperability standard Fast Healthcare Interoperability Resources (FHIR).

WPS304: Achieving healthcare interoperability with FHIR Works on AWS
Learn how Black Pear Software leveraged AWS to build an integration toolkit to help their customers share healthcare data more effectively.

Extras you won’t want to miss out on!

LFS201: Life Sciences Industry: Executive outlook

LFS302: AstraZeneca genomics on AWS: From petabytes to new medicines

LFS303: Building patient-centric virtualized trials

AIM303: Using AI to automate clinical workflows

AIM310: Intelligent document processing for the insurance industry

INO204: Solving societal challenges with digital innovation on AWS

ZWL208: Using cloud-based genomic research to reduce health care disparities

GPS211: AWS Partners driving innovation amidst COVID-19


Open Data Science Conference, Virtual and In-Person | October 27th – 30th, 2020, Natural Language Processing Track

Virtual and In-Person | October 27th – 30th, 2020

Natural Language Processing Track

Learn the latest models, advancements, and trends from the top practitioners and researchers behind NLP

Conference Website



Thursday – 10/29/2020

09:00 AM – 10:30 AM – ODSC Keynotes

10:30 AM – 5:30 PM – ODSC Hands-on Trainings and Workshops

10:00 AM – 4:30 PM – Partner Demo Talks

10:30 AM – 5:00 PM – Breakout Talk Sessions

09:30 AM – 4:30 PM – Applied AI Free Virtual Event

12:00 PM – 2:00 PM – Women Ignite Session

1:00 PM – 1:45 PM – Virtual Networking Event

4:00 PM – 5:30 PM – AI Investors Reverse Pitch

3:30 PM – 4:30 PM – Meet the Expert


Friday – 10/30/2020 

09:00 AM – 10:30 AM – ODSC Keynotes

10:30 AM – 5:30 PM – ODSC Hands-on Trainings and Workshops

10:30 AM – 5:00 PM – Breakout Talk Sessions

10:30 AM – 5:00 PM – Career Mentor Talks

11:30 AM – 12:00 PM – Meet the Speaker

4:00 PM – 5:30 PM – Learning from Failure

Are We Ready for the Era of Analytics Heterogeneity? Maybe… but the Data Says No


Wed, October 28, 9:00 AM

Marinela Profi | Global Strategist AI & Model Management | Data Science Evangelist | SAS | WOMEN TECH NETWORK


Type: Keynote


Session Details & Prerequisites Q&A Slack Channel

Keynote Session – Suchi Saria


Wed, October 28, 9:30 AM

Suchi Saria, PhD | Director, Machine Learning & Healthcare Lab | Johns Hopkins University


Type: Keynote


Q&A Slack Channel

A Secure Collaborative Learning Platform


Wed, October 28, 10:00 AM

Raluca Ada Popa, PhD | Assistant Professor | Co-Founder | Berkeley | PreVeil


Type: Keynote


Session Details & Prerequisites Q&A Slack Channel


Data for Good: Ensuring the Responsible Use of Data to Benefit Society


Thu, October 29, 9:00 AM

Jeannette M. Wing, PhD | Avanessians Director of the Data Science Institute and Professor of Computer Science | Columbia University

  • Causal inference – estimating effects
  • Over- and under-estimation with instrumental variables
  • Confounders: model assigned causes – over- and under-estimation
  • De-confounder: estimate substitute confounders – over- and under-estimation
  • Convolutional neural network models
  • Economics: monopsony, robo-advising
  • History: topic modeling with NLP
  • Trustworthy Computing vs Trustworthy AI: safety, fairness, robustness
  • Classifiers: fair/unfair – make them more robust to a class of distributions
  • Image recognition systems: DeepXplore – semantic perturbation
  • DP and ML: PixelDP – STOP sign vs Yield sign
  • Healthcare @Columbia University: 600 million EHRs
  1. The Medical De-confounder: treatment effects on A1c in type 2 diabetes (DM2)

Type: Keynote, Level: All Levels, Focus Area: AI for Good, Machine Learning

Session Details & Prerequisites Q&A Slack Channel

Keynote Session – Ben Taylor

Thu, October 29, 9:30 AM

Ben Taylor, PhD | Chief AI Evangelist | DataRobot

  • Convolutional NN – clustering of countries: Latin America, Asia
  • Storytelling
  • Acceleration:
  1. GPT-3 from OpenAI – Q&A, translation, grammar
  2. Image GPT
  • Can AI predict?

Type: Keynote, Level: All Levels, Focus Area: Data Science Track

Q&A Slack Channel

Applying AI to Real World Use Cases

Thu, October 29, 10:00 AM

John Montgomery | Corporate Vice President, Program Management, AI Platform | Microsoft

Type: Keynote

  • Machine comprehension
  • Massive ML models: vision model – ResNet
  • Alternative on Azure: OpenAI (a Microsoft partner) released GPT-3 (175B parameters)
  • Azure ML: create models, operationalize models, build models responsibly
  • Model interpretability – data science, government regulation: feature-importance dashboard
  • Building accurate models
  1. Little Caesars “Hot-N-Ready” – demand forecasting of pizza supply by combination of ingredients

Predict: X Quantity by Auto ML

  • Deploy and manage many models: MMM Accelerator – ten models at AGL (Australian renewable energy)

Model for Responsible ML: Fairness & Interpretability

  • EY – Bank denies a LOAN
  • Mitigation of Bias detection for Men and Women in Loan Applications

Loan Approval

  • Explanation dashboard – Aggregate model: Top feature in loan approval: Education Level
  • Fairness – Hazard performance for Accuracy: Disparity in prediction by Gender

ML is part of AZURE Platform

Bonsai – is Reinforcement Learning: Simulation Scenarios

AutoML – when you know the standard algorithms vs when you do not

Session Details & Prerequisites Q&A Slack Channel

TALKS on 10/29/2020


Thu, October 29, 10:30 AM


Tian Zheng, PhD | Chair, Department of Statistics | Associate Director | Columbia University | Data Science Institute

Type: Track Keynote, Level: Intermediate, Focus Area: NLP

  • Stochastic variational inference
  • Case-control likelihood approximation
  • Sampling node system


  • LDA – Latent Dirichlet Allocation

Probability distribution over the vocabulary of words: Topic assignment
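The note above can be made concrete with a toy sketch. The topic-word probabilities below are invented for illustration (not a fitted LDA model): each topic is a probability distribution over the vocabulary, and each word is assigned the topic under which it is most probable.

```python
# Toy illustration of LDA-style topic assignment. The topic-word
# probabilities are hypothetical, not output of a fitted model: each
# topic is a distribution over the vocabulary, and a word is assigned
# the topic that gives it the highest probability.

topics = {
    "genetics": {"gene": 0.4, "dna": 0.3, "cell": 0.2, "data": 0.1},
    "computing": {"data": 0.4, "model": 0.3, "gpu": 0.2, "cell": 0.1},
}

def assign_topic(word):
    """Return the topic under which `word` is most probable."""
    scored = {t: dist.get(word, 0.0) for t, dist in topics.items()}
    return max(scored, key=scored.get)

doc = ["gene", "data", "model"]
assignments = [assign_topic(w) for w in doc]
print(assignments)  # ['genetics', 'computing', 'computing']
```

A real LDA fit would also infer per-document topic proportions; this sketch only shows the word-level assignment step.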


  • MMSB – Mixed Membership Stochastic Blockmodel

Detect communities in networks

blockmodel – profile of social interaction in different nodes

  • LMV – Pairwise-Link-LDA – same topic proportions have equal % for citing


  1. Draw topic
  2. Draw Beta
  3. For each document
  4. For each document pair

Variational Inference – fully factored model

  • article visibility

Stochastic Variational Inference

  • local (specific to each node) & global (across nodes)
  • At each iteration minibatch of nodes

Sampling Document pairs

  • Stratified sampling scheme – shorter links
  • Informative set sampling [informative vs non-informative sets]
  • These schemes – mean estimation problem: inclusion probability – all links are included
  • Stochastic gradient updates for global parameters
  • Comparison with alternative approaches
  1. LDA + regression
  2. Relational topic model
  3. Pairwise-Link-LDA combines LDA and MMSB [same priors]
  • Predictive ranks (vs random guessing) and runtimes (compact, distinct, no overlap)
  1. Evaluate model fit: average predictive rank of held-out documents – top articles

Cora dataset

LMVS – better predictive performance than

KDD Dataset

Citation trends in HEP: Relevance of Topics vs Visibility

Article recommendation by Rank Topic Proportions

Visibility as a topic-adjusted measure

More recent articles are more visible

Citation is not a strong indicator of visibility

Visibility as a topic-adjusted measure

Making Deep Learning Efficient

Thu, October 29, 11:20 AM


Kurt Keutzer, PhD | Professor, Co-founder, Investor | UC Berkeley, DeepScale

Type: Track Keynote

  • ML subsets
  1. Deep learning – training for classification – neural nets – LeNet vs AlexNet: 7 layers, 140x FLOPs – using parallelism
  2. Shallow learning – deterministic and linear classifiers
  3. ML algorithms: Core ML, audio analysis (speech and audio recognition), multimedia
  4. NLP: translation
  5. McKinsey & Co. – AI as a Service (AIaaS)


Image Classification

  • Object Detection
  • Semantic Segmentation
  • Convolutional NN

Audio Enhancement at BabbleLabs 

Video Sentiment Analysis – Recommendations to Watch or to search

Natural Language Processing & Speech

  • Translation
  • Document understanding
  • Question answering
  • General Language Understanding Evaluation (GLUE)

BerkeleyDeepDrive (BDD)

BERT – Transformer – 7 seconds per sentence

  • BERT-base
  • Q-BERT
  • Transformer

Computational Patterns of Deep NN (DNN) – TRAINING required for DNN



Recommendation Models – DNN – Parallelism

  • Facebook – 80% is recommendation = advertisement
  • No sharing of data by collectors: Alibaba, Facebook, Twitter


  • Latency – NETWORK WIFI
  • Energy
  • Computation power
  • Privacy
  • Quantization: Fewer Memory Accesses
  • Lower Precision implies higher
  • Flat Loss Landscape – Precision Layer by Layer
  • Move computation to the EDGE
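The quantization bullets above can be sketched minimally. This is symmetric uniform int8 quantization with a max-abs scale, an illustrative choice rather than any specific framework’s scheme:

```python
# Minimal symmetric uniform quantization sketch: map floats to int8
# codes and back, trading precision for fewer bits (and hence fewer
# memory accesses at inference time).

def quantize(xs, bits=8):
    qmax = 2 ** (bits - 1) - 1             # 127 for int8
    scale = max(abs(x) for x in xs) / qmax
    q = [round(x / scale) for x in xs]     # integer codes in [-qmax, qmax]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_err, 4))  # [50, -127, 0, 100] 0.003
```

The rounding error is bounded by half a quantization step, which is the sense in which lower precision trades accuracy for memory.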


Language Complexity and Volatility in Financial Markets: Using NLP to Further our Understanding of Information Processing

Thu, October 29, 12:10 PM


Ahmet K. Karagozoglu, Ph.D. | C.V. Starr Distinguished Professor of Finance | Visiting Scholar, Volatility and Risk Institute | Hofstra University | New York University Stern School of Business

Type: Track Keynote, Level: All Levels, Focus Area: NLP


Intelligibility Throughout the Machine Learning Life Cycle

Thu, October 29, 2:00 PM


Jenn Wortman Vaughan, PhD | Senior Principal Researcher | Microsoft Research

Type: Talk, Level: Beginner-Intermediate, Focus Area: Machine Learning

  • A Human-centered Agenda for Intelligibility
  • Beyond the model: Data, objectives, performance metrics
  • context of relevant stakeholders
  • Properties of system design vs Properties of Human behavior

Learning with Limited Labels

Thu, October 29, 3:05 PM


Shanghang Zhang, PhD | Postdoc Researcher | University of California, Berkeley

Type: Talk, Level: Intermediate-Advanced, Focus Area: Deep Learning, Research frontiers


How AI is Changing the Shopping Experience

Thu, October 29, 3:05 PM


Sveta Kostinsky | Director of Sales Engineering | Samasource
Marcelo Benedetti | Senior Account Executive | Samasource

Type: Talk, Level: Intermediate, Focus Area: Machine Learning, Deep Learning

  • quality rubric
  • Internal QA Sampling
  • Client QA Sampling
  • Auto QA

Transfer Learning in NLP

Thu, October 29, 3:40 PM


Joan Xiao, PhD | Principal Data Scientist | Linc Global

Type: Talk, Level: Intermediate, Focus Area: NLP, Deep Learning

Transfer learning enables leveraging knowledge acquired from related data to improve performance on a target task. The advancement of deep learning and the availability of large labelled datasets such as ImageNet have made high-performing pre-trained computer vision models possible. Transfer learning, in particular fine-tuning a pre-trained model on a target task, has become far more common practice than training from scratch in computer vision.

In NLP, starting from 2018, thanks to the various large language models (ULMFiT, OpenAI GPT, the BERT family, etc.) pre-trained on large corpora, transfer learning has become a new paradigm and new state-of-the-art results have been achieved on many NLP tasks.

In this session we’ll learn about the different types of transfer learning, the architecture of these pre-trained language models, and how different transfer learning techniques can be used to solve various NLP tasks. In addition, we’ll show a variety of problems that can be solved using these language models and transfer learning.
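As a minimal sketch of the feature-extraction variety of transfer learning described above: the “pretrained” featurizer below is a hand-made stand-in (a real pipeline would use embeddings from a model such as BERT); it stays frozen while a small logistic-regression head is trained on the target task.

```python
import math

# Sketch of feature extraction: a frozen "pretrained" featurizer (a
# hand-made stand-in for real pretrained embeddings) feeds a small
# logistic-regression head, which is the only part that is trained.

def featurizer(text):
    """Frozen 'pretrained' feature: count of positive words."""
    positives = {"great", "good", "love"}
    return sum(word in positives for word in text.split())

def train_head(data, lr=0.5, epochs=100):
    """Train a 1-D logistic-regression head on the frozen features."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for text, y in data:                     # y in {0, 1}
            f = featurizer(text)
            p = 1 / (1 + math.exp(-(w * f + b)))  # sigmoid
            g = p - y                             # log-loss gradient
            w -= lr * g * f
            b -= lr * g
    return w, b

data = [("great good movie", 1), ("love it", 1),
        ("boring plot", 0), ("waste of time", 0)]
w, b = train_head(data)
preds = [int(w * featurizer(t) + b > 0) for t, _ in data]
print(preds)  # [1, 1, 0, 0]
```

Fine-tuning differs only in that the featurizer’s own parameters would also receive gradient updates.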

  •  Transfer learning: computer vision – ImageNet classification
  •  AlexNet (ILSVRC’12), VGG and GoogLeNet (ILSVRC’14), ResNet (ILSVRC’15)
  •  Feature extractor vs fine-tune
  •  Transfer learning: NLP
  • T5 – Text-to-Text Transfer Transformer
  1. Word embeddings: no context is taken into account – Word2vec, GloVe
  2. ELMo – Embeddings from Language Models: contextual
  3. BERT – Bidirectional Encoder Representations from Transformers
  4. MLM – Masked Language Model: forward, backward, masked
  5. Next Sentence Prediction
  6. Achieved SOTA on 11 tasks: GLUE, SQuAD 1.1
  • Prediction models:
  • Input
  • Label – IsNext vs NotNext
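The masked-language-model objective in the notes above can be sketched as follows. BERT masks 15% of tokens at random; here the masked positions are fixed so the example is deterministic.

```python
# Sketch of BERT-style masked-language-model input preparation:
# selected tokens are replaced with [MASK], and the training targets
# are the original tokens at those positions. (BERT masks 15% of
# tokens at random; the positions here are fixed for illustration.)

def mask_tokens(tokens, positions):
    masked = list(tokens)
    targets = {}
    for i in positions:
        targets[i] = masked[i]   # label = original token
        masked[i] = "[MASK]"
    return masked, targets

tokens = ["the", "model", "predicts", "hidden", "words"]
masked, targets = mask_tokens(tokens, positions=[1, 3])
print(masked)   # ['the', '[MASK]', 'predicts', '[MASK]', 'words']
print(targets)  # {1: 'model', 3: 'hidden'}
```

The model is then trained to predict each target token from the full masked sequence, which is what makes the resulting representations bidirectional.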

 GLUE Test score


  • Featured-based approach

BERT variants – TinyBERT, ALBERT, RoBERTa, DistilBERT

Multilingual BERT, and BERT models for other languages

A Primer in BERTology: How BERT Works

 OpenAI built a text generator – too dangerous to release

OpenAI GPT-3 – trained on 300B tokens – three settings:

  1. Zero-shot – English to French with no examples
  2. One-shot
  3. Few-shot – the goal of GPT-3
  4. GPT-3 is large-scale NLP
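Zero-, one-, and few-shot in the GPT-3 sense refer to how many worked examples are placed in the prompt; the model’s weights are never updated. A sketch of prompt construction (the format is illustrative, not OpenAI’s API):

```python
# Sketch of zero- vs few-shot prompt construction in the GPT-3 sense:
# "shots" are worked examples included in the prompt text, so no
# gradient updates happen. The prompt format here is illustrative.

def build_prompt(task, examples, query):
    lines = [task]
    for src, tgt in examples:          # k examples => k-shot
        lines.append(f"{src} => {tgt}")
    lines.append(f"{query} =>")
    return "\n".join(lines)

zero_shot = build_prompt("Translate English to French:", [], "cheese")
few_shot = build_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(zero_shot)
print(few_shot)
```

The completion the model writes after the final `=>` is the prediction; adding more in-context examples is what distinguishes few-shot from zero-shot.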

Examples – Feature extraction

  • English to SQL
  • English to CSS
  • English to LaTeX

Semantic textual similarity

NL inference 

ULMFiT – fine-tuning: the larger the number of training examples, the better the performance

  1. LM pre-training – start from scratch: BART, Big Bird, ELECTRA, Longformer
  2. LM fine-tuning
  3. Classifier fine-tuning
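Stage 2 above (LM fine-tuning) uses ULMFiT’s discriminative fine-tuning: each lower layer gets a smaller learning rate, with the paper’s suggested factor of 2.6 between adjacent layers. A quick sketch of the per-layer schedule:

```python
# Per-layer learning rates for ULMFiT-style discriminative
# fine-tuning: the top layer trains at the base rate, and each layer
# below trains at the layer above's rate divided by 2.6 (the factor
# suggested in the ULMFiT paper).

def layer_lrs(base_lr, n_layers, factor=2.6):
    # index 0 = bottom layer, last index = top layer
    return [base_lr / factor ** (n_layers - 1 - i) for i in range(n_layers)]

lrs = layer_lrs(0.01, 4)
print([round(lr, 5) for lr in lrs])  # [0.00057, 0.00148, 0.00385, 0.01]
```

The intuition is that lower layers hold more general features and should change less during fine-tuning than the task-specific top layers.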

Data augmentation

Contextual Augmentation

  1. Original sentence
  2. masked
  3. augmented

Text generation

  1. boolean questions
  2. from structured data, e.g., RDF – Resource Description Framework


Generalized Deep Reinforcement Learning for Solving Combinatorial Optimization Problems


Fri, October 30, 9:00 AM

Azalia Mirhoseini, PhD | Senior Research Scientist | Google Brain

Type: Keynote


Many problems in systems and chip design are in the form of combinatorial optimization on graph structured data. In this talk, I will motivate taking a learning based approach to combinatorial optimization problems with a focus on deep reinforcement learning (RL) agents that generalize. I will discuss our work on a new domain-transferable reinforcement learning methodology for optimizing chip placement, a long pole in hardware design. Our approach is capable of learning from past experience and improving over time, resulting in more optimized placements on unseen chip blocks as the RL agent is exposed to a larger volume of data. Our objective is to minimize PPA (power, performance, and area), and we show that, in under 6 hours, our method can generate placements that are superhuman or comparable on modern accelerator chips, whereas existing baselines require human experts in the loop and can take several weeks.


Azalia Mirhoseini is a Senior Research Scientist at Google Brain. She is the co-founder/tech-lead of the Machine Learning for Systems Team in Google Brain, where they focus on deep reinforcement learning based approaches to solve problems in computer systems and meta-learning. She has a Ph.D. in Electrical and Computer Engineering from Rice University. She has received a number of awards, including the MIT Technology Review 35 Under 35 award, the Best Ph.D. Thesis Award at Rice, and a Gold Medal in the National Math Olympiad in Iran. Her work has been covered in various media outlets including MIT Technology Review, IEEE Spectrum, and Wired.

Session Details & Prerequisites Q&A Slack Channel
  • Learning-based approaches vs branch & bound, hill climbing, ILP
  • Scale on distributed platforms
  • Device placement – model too big to fit – PARTITION among multiple devices – evaluate runtime per alternative placement
  • Learned placement on NMT – profiling placement on NMT
  • CPU + layers of encoders and decoders – overhead tradeoffs – parallelization for work balancing
  • RL-based placement vs expert placement
  • Memory-copying task
  • Generalization to be achieved for device placement architecture
  • Embeddings that transfer knowledge across graphs
  • Graph partitioning – normalized cuts objective: volume, cuts
  • Learning-based approach: train a NN on the nodes of the graph to assign each node a probability of belonging to a given partition
  • Continuous relaxation of normalized cuts
  • Optimize expected normalized cuts
  • Generalized graph partitioning framework
Chip placement problem (floorplanning) – chip design – resource optimization, canonical reinforcement learning
  • Placement optimization using AGENTS to place the nodes
  • Train a policy to be used for placement of ALL chips
  • Compiling a dataset of chip placements
  • Policy/value model architecture to reduce wire length
  • RISC-V placement visualization: training from scratch (human, 6–8 weeks) vs pre-trained (24 hours)
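The normalized-cuts objective from the notes above is cut(A,B)/vol(A) + cut(A,B)/vol(B), where cut counts edges crossing the partition and vol sums the degrees on each side. A toy computation (the 4-node graph is mine, for illustration):

```python
# Toy computation of the normalized-cut objective:
#   Ncut(A, B) = cut(A, B) / vol(A) + cut(A, B) / vol(B)
# cut = number of edges crossing the partition; vol = sum of node
# degrees on each side. The 4-node graph is illustrative.

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

def degree(node):
    return sum(node in e for e in edges)

def ncut(part_a, part_b):
    cut = sum((u in part_a) != (v in part_a) for u, v in edges)
    vol = lambda part: sum(degree(n) for n in part)
    return cut / vol(part_a) + cut / vol(part_b)

# Balanced split {a,b} | {c,d}: cut = 2, vol = 4 on each side
print(ncut({"a", "b"}, {"c", "d"}))  # 1.0
```

Dividing the cut by each side’s volume is what penalizes splitting off tiny partitions, which is why the learning-based approach optimizes the expected normalized cut rather than the raw cut.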

Keynote Session – Zoubin Ghahramani

Fri, October 30, 9:30 AM

Zoubin Ghahramani, PhD | Distinguished Scientist and Sr Research Director | Professor of Information Engineering | ex-Chief Scientist and VP of AI | Google | University of Cambridge | Uber

Type: Keynote

Q&A Slack Channel

  • Data → models → predictions → decisions → understanding
  • AI & games
  • AI + ML
  • Deep Learning (DL)
  1. NN – tunable nonlinear functions with many parameters
  2. Parameters are the weights of the NN
  3. Optimization + statistics
  4. DL – a re-branding of NN
  5. Many layers – ReLUs, attention
  6. Cloud resources
  7. SW – TensorFlow, JAX
  8. Industry investment in DL

DL – very successful

  • non-parametric statistics
  • use huge data – simulated data
  • automatic differentiation
  • stay close to identity – makes models deep: ReLUs, LSTMs, GRUs, ResNets
  • symmetry, parameter tying

Limitations of DL

  • data hungry
  • adversarial examples
  • black-boxes – difficult to trust
  • uncertainty – not easily incorporated

Beyond DL

  • ML as Probabilistic Modeling: Data observed from a system
  • uncertainty
  • inverse probability
  • Bayes rule: priors and measured quantities → inference for the posterior
  • learning and predicting can be seen as forms of inference – likelihood
  • approximations from estimation of Likelihoods
  1. Learning
  2. Prediction
  3. Model Comparison
  4. Sum rule: Product rule
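The sum, product, and Bayes rules in the notes can be checked numerically on a toy two-hypothesis example (the numbers are illustrative):

```python
# Numeric check of the sum, product, and Bayes rules on a toy
# two-hypothesis example (illustrative numbers): infer P(H | D) from a
# prior P(H) and likelihoods P(D | H).

prior = {"H1": 0.3, "H2": 0.7}
likelihood = {"H1": 0.9, "H2": 0.2}      # P(D | H)

# Sum rule over the product rule: P(D) = sum_H P(D | H) P(H)
p_data = sum(likelihood[h] * prior[h] for h in prior)

# Bayes rule: P(H | D) = P(D | H) P(H) / P(D)
posterior = {h: likelihood[h] * prior[h] / p_data for h in prior}

print(round(p_data, 2), {h: round(p, 3) for h, p in posterior.items()})
```

Even though H2 has the larger prior, its low likelihood under the data flips the posterior in favor of H1, which is the "inverse probability" point in the notes.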

Why do probabilities matter in AI and DS?

  • Complexity control and structure learning
  • exploration–exploitation trade-offs
  • building prior knowledge into algorithms for small and large data sets
  • BDL – Bayesian deep learning
  • Gaussian processes – linear and logistic regressions, SVMs
  • BDL – Bayesian NN / GP hybrids
  • deep sum-product networks – discriminative programming

Probabilistic Programming Languages

Languages: Tensors, Turing,

Automatic Statistician –

  • model discovery from data and explain the results

Probabilistic ML

  • Learn from data – decision theory – probabilistic AI: BDL, probabilistic programming

Zoubin Ghahramani (2015). Probabilistic machine learning and artificial intelligence. Nature 521: 452–459.


The Future of Computing is Distributed

Fri, October 30, 10:00 AM

Ion Stoica, PhD | Professor of Computer Science Division | Co-Founder | Berkeley | Anyscale | Databricks | Conviva Networks
  • 1970s – ARPANET – distributed
  • 1980s – high-performance computing (HPC)
  • 1990s – the Web – Amazon
  • 2000s – big data – Google

Distributed computing – Few courses at universities

  • Rise of deep learning (DL)
  • Application becomes AI centered: Healthcare, FIN, Manufacturing
  • Moore’s law is dead: memory and processors
  • Specialized hardware: CPU, GPU, TPU
  • Memory dwarfed by demand
  • Memory: Turing project, 17B parameters
  • GPT-2 8.3B
  • GPT-1
  • Micro-services: Clusters of clouds – integrating with distributed workloads
  • AI is overlapping with HPC
  • AI and Big Data

AI Applications

  • MPI,
  • Stitching several existing systems

Ray (RISELab @Berkeley) – a universal framework for distributed computing (Python and Java) across different libraries

  • Asynchronous execution enables parallelism
  • Function -> Task (API)
  • Object ID – every task scheduled
  • Library Ecosystem – Native Libraries 3rd Party Libraries
  • Amazon and AZURE SPARK, MARS (Tensor)
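Ray’s “function → task” model returns a future immediately on submission and gathers results later. Since Ray itself may not be installed, the same shape can be sketched with the standard library’s `concurrent.futures` instead of Ray’s actual `@ray.remote` / `ray.get` API:

```python
# The "function -> task, asynchronous execution enables parallelism"
# idea, sketched with the standard library rather than Ray's actual
# @ray.remote / ray.get API: submitting returns a future immediately,
# and results are gathered once the tasks complete.

from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(square, i) for i in range(8)]  # non-blocking
    results = [f.result() for f in futures]               # gather

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Ray generalizes this pattern across machines, with object IDs standing in for the futures.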


  • Number of contributors is increasing fast (N=300)


TALKS on 10/30/2020


Advances and Frontiers in Auto AI & Machine Learning – Lisa Amini

Lisa Amini, Director | IBM Research – Cambridge
  • Auto AI – holistic approach
  • Auto ML – Models: Feature creation, modeling, training & testing

AI AUTOmation for Enterprise

  • Feature preprocessor → feature transformer → feature selector → estimator
  • Joint-optimization problem
  1. Method selection
  2. Hyper-parameter Optimization
  3. Black-box constraints
  • Bias Mitigation Algorithms
  1. Pre-processing algo
  2. In-processing Algo
  3. Post-processing algo
  • Automation for Data – READINESS for ML
  • relational data –
  • knowledge augmentation
  • Data readiness reporting
  • Labeling Automation: Enhance

Knowledge augmentation – Federated Learning

  • External data sources
  • existing data
  • documents containing domain knowledge
  • Automating Augmenting Data with knowledge: feature-concept mapping


  • Time Series Forecasting

AI to decision Optimization

  • Demand forecasting from standard AutoAI, ADDING historical decisions and historical business impact → reinforcement learning – a model created automatically from the past and AutoAI


  • Meta-learning for performance prediction
  • Train the META data
  • Score production data with AI


  • staged deployment with contextual bandits


  • Performance prediction meta model applied over windows of production traffic


  • End-to-end AI life cycle
  • expanding scope of automation; Domain knowledge and decision optimization


The State of Serverless and Applications to AI


Fri, October 30, 11:20 AM

Joe Hellerstein, PhD | Chief Strategy Officer, Professor of Computer Science | Trifacta, Berkeley

The Cloud and practical AI have evolved hand-in-hand over the last decade. Looking forward to the next decade, both of these technologies are moving toward increased democratization, enabling the broad majority of developers to gain access to the technology.

Serverless computing is a relatively new abstraction for democratizing the task of programming the cloud at scale. In this talk I will discuss the limitations of first-generation serverless computing from the major cloud vendors, and ongoing research at Berkeley’s RISELab to push forward toward “stateful” serverless computing. In addition to system infrastructure, I will discuss and demonstrate applications including data science, model serving for machine learning, and cloud-bursted computing for robotics.


Joseph M. Hellerstein is the Jim Gray Professor of Computer Science at the University of California, Berkeley, whose work focuses on data-centric systems and the way they drive computing. He is an ACM Fellow, an Alfred P. Sloan Research Fellow and the recipient of three ACM-SIGMOD “Test of Time” awards for his research. Fortune Magazine has included him in their list of 50 smartest people in technology , and MIT’s Technology Review magazine included his work on their TR10 list of the 10 technologies “most likely to change our world”. Hellerstein is the co-founder and Chief Strategy Officer of Trifacta, a software vendor providing intelligent interactive solutions to the messy problem of wrangling data. He has served on the technical advisory boards of a number of computing and Internet companies including Dell EMC, SurveyMonkey, Captricity, and Datometry, and previously served as the Director of Intel Research, Berkeley.

Type: Talk, Level: Intermediate, Focus Area: AI for Good, Machine Learning

Session Details & Prerequisites Q&A Slack Channel
  • What happened with the Cloud – no app
  • Parallelism – distributed computers – scale up or down, consistency and partial failure
  • Serverless computing: Functions-as-a-Service (FaaS)
  • Lets developers outside AWS, Azure, and Google program the Cloud
  • Python for the Cloud
  • Autoscaling – yes
  • Limitations of FaaS (AWS Lambda): I/O bottlenecks, 15-minute lifetime, no inbound network communication
  • Program state: local data must be managed across invocations
  • Data gravity – data is expensive to move
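The program-state limitation above can be sketched as follows: a FaaS function keeps no local state between invocations, so any state must be round-tripped through external storage on every call. This is a hypothetical handler (not the real AWS Lambda API); the in-memory dict stands in for a store like S3 or DynamoDB.

```python
# Minimal sketch of the FaaS statelessness problem (illustrative only).
# A real function instance can be torn down at any time, so local variables
# do not survive -- state must be loaded and saved through an external store.

EXTERNAL_STORE = {}  # stands in for S3/DynamoDB in this sketch

def handler(event):
    # 1. Load state from external storage (extra I/O on every invocation)
    count = EXTERNAL_STORE.get("count", 0)
    # 2. Compute
    count += event.get("increment", 1)
    # 3. Write state back before the instance disappears
    EXTERNAL_STORE["count"] = count
    return {"count": count}

result = handler({"increment": 2})
```

This round-trip on every call is exactly the I/O bottleneck the talk attributes to first-generation serverless, and what stateful platforms like Cloudburst aim to avoid by caching state close to compute.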

Distributed consistency – data replication: replicas must agree on the value of a mutable variable x after an update takes place

  • Two-phase commit [consensus – Paxos]
  • Coordination avoidance: waiting for control adds tail latency and widens the distribution of performance
  • Slowdown cascades: I/O
  • Application semantics determine whether a program requires coordination
  • The program must have the property of monotonicity
  • Monotonicity: as input grows, output grows – wait on information, not on coordination

CALM (Consistency As Logical Monotonicity) – infinitely scalable systems with no coordination → parallelism and smooth scalability

Monotonicity can be checked syntactically in a logic language
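The CALM idea can be illustrated with the simplest monotone structure, a grow-only set: its merge (set union) is commutative, associative, and idempotent, so replicas can apply updates in any order, without coordination, and still converge. A minimal sketch (my illustration, not code from the talk):

```python
# Grow-only set (G-Set): a monotone structure in the CALM sense.
# Inputs only grow and outputs only grow, so merges need no coordination.

class GSet:
    def __init__(self):
        self.items = set()

    def add(self, x):          # input grows ...
        self.items |= {x}

    def merge(self, other):    # ... and output only grows: monotone
        self.items |= other.items

# Two replicas receive updates in different orders.
a, b = GSet(), GSet()
a.add("x"); a.add("y")
b.add("y"); b.add("z")

# Merging in either direction yields the same converged state,
# with no locking or consensus round required.
a.merge(b)
b.merge(a)
assert a.items == b.items == {"x", "y", "z"}
```

A non-monotone operation (say, deleting an element) would break this property: the result would depend on whether the delete was seen before or after a concurrent add, which is precisely when coordination becomes necessary.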

Hydro: a Platform for Programming the Cloud

Anna Serverless KVS – Hydro Project

  • Shared-nothing at all scales (even across threads)
  • Fast under contention: 90% request handling

Cloudburst: a stateful serverless platform – caches close to compute; cache consistency

Latency comparison – Python, Cloudburst, AWS, AWS Lambda:

  • AWS Lambda is slow for AI workloads compared with plain Python and Cloudburst

Scaling AWS Lambda invocations simultaneously

  • Motion planning compute
  • Cloudburst + Anna requirement


Bloom Lab




Just Machine Learning

Fri, October 30, 1:10 PM


Tina Eliassi-Rad, PhD | Professor | Core Faculty | Northeastern University | Network Science Institute

Type: Talk, Level: All Levels, Focus Area: Machine Learning

In 1997, Tom Mitchell defined the well-posed learning problem as follows: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” In this talk, I will discuss current tasks, experiences, and performance measures as they pertain to fairness in machine learning. The most popular task thus far has been risk assessment. We know this task comes with impossibility results (e.g., see Kleinberg et al. 2016, Chouldechova 2016). I will highlight new findings in terms of these impossibility results. In addition, most human decision-makers seem to use risk estimates for efficiency purposes and not to make fairer decisions. I will present an alternative task definition whose goal is to provide more context to the human decision-maker. The problems surrounding experience have received the most attention. Joy Buolamwini (MIT Media Lab) refers to these as the “under-sampled majority” problem. The majority of the population is non-white, non-male; however, white males are overrepresented in the training data. Not being properly represented in the training data comes at a cost to the under-sampled majority when machine learning algorithms are used to aid human decision-makers. In terms of performance measures, a variety of definitions exist from group- to individual- to procedural-fairness. I will discuss our null model for fairness and demonstrate how to use deviations from this null model to measure favoritism and prejudice in the data.


  • Assessing risk
  • Ranking
  • Statistical parity across groups

Parity with an imperfect classifier – cannot satisfy all three conditions simultaneously:

  • Precision (calibration) parity
  • True-positive-rate parity
  • False-positive-rate parity

Classifiers do not consider context or allow for uncertainty
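The three quantities above can be computed directly per group; the impossibility results cited in the abstract (Kleinberg et al. 2016, Chouldechova 2016) show that when base rates differ, an imperfect classifier cannot equalize all three at once. A toy sketch with made-up labels:

```python
# Per-group confusion-matrix rates for checking the parity conditions.
# The labels below are invented purely for illustration.

def rates(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "tpr": tp / (tp + fn) if tp + fn else 0.0,   # true-positive rate
        "fpr": fp / (fp + tn) if fp + tn else 0.0,   # false-positive rate
    }

# Two hypothetical groups with different base rates:
group_a = rates([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
group_b = rates([1, 0, 0, 0, 0], [1, 1, 0, 0, 0])
```

Comparing `group_a` and `group_b` shows the rates diverging across groups; equalizing one of them (say precision) generally forces the others apart unless the classifier is perfect or the base rates match.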

  • Learning to Place within existing cases
  • Incentives/values of the human decision-maker, who incorporates external factors into the decision
  • Game-theoretic framework
  • How human exemplars make decisions
  • Are algorithms value-free?

Computational Ethics

  • Logically consistent principles
  • Camouflage – the machine learned not the task but the cloudiness of the sky
  • Model Cards for Model Reporting
  • The “undersampled majority”
  • Experience and demonstration: should we learn from demonstrations or from simulations?
  • Complex networks: guilt by association vs. privilege and prejudice, individual fairness
  • Datasheets for Datasets
  • Algorithms are like prescription drugs: adverse events should be tracked

Human vs Machine judgement

  • Performance measures – fairness: group, individual, procedural
  • Normativity throughout the entire well-posed learning problem
  • Incentives/values
  • Should humans or machines make decisions?
  • Laws are needed if algorithms are used as expert witnesses


Machine Learning for Biology and Medicine

Fri, October 30, 2:00 PM

Sriram Sankararaman, PhD | Professor, Computer Science | University of California – Los Angeles

Type: Talk, Focus Area: Machine Learning


Biology and medicine are deluged with data, so techniques from machine learning and statistics will increasingly play a key role in extracting insights from the vast quantities of data being generated. I will provide an overview of the modeling and inferential challenges that arise in these domains.

In the first part of my talk, I will focus on machine learning problems arising in the field of genomics. The cost of genome sequencing has decreased by over 100,000-fold over the last decade. Availability of genetic variation data from millions of individuals has opened up the possibility of using genetic information to identify the causes of diseases, develop effective drugs, predict disease risk, and personalize treatment. While genome-wide association studies offer a powerful paradigm for discovering disease-causing genes, the hidden genetic structure of human populations can confound these studies. I will describe statistical models that can infer this hidden structure and show how these inferences lead to novel insights into the genetic basis of diseases.

In the second part of my talk, I will discuss how the availability of large-scale electronic medical records is opening up the possibility of using machine learning in clinical settings. These electronic medical records are designed to capture a wide range of data associated with a patient including demographic information, laboratory tests, images, medications and clinical notes. Using electronic records from around 60,000 surgeries over five years in the UCLA hospital, I will describe efforts to use machine learning algorithms to predict mortality after surgery. Our results reveal that these algorithms can accurately predict mortality from information available prior to surgery indicating that automated predictive systems have great potential to augment clinical care.


Sriram Sankararaman is an assistant professor in the Departments of Computer Science, Human Genetics, and Computational Medicine at UCLA, where he leads the machine learning and genomics lab. His research interests lie at the interface of computer science, statistics, and biology; he is interested in developing statistical machine learning algorithms to make sense of large-scale biomedical data and in using these tools to understand the interplay between evolution, our genomes, and traits. He received a B.Tech. in Computer Science from the Indian Institute of Technology, Madras, a Ph.D. in Computer Science from UC Berkeley, and was a post-doctoral fellow at Harvard Medical School before joining UCLA. He is a recipient of the Alfred P. Sloan Foundation fellowship (2017), Okawa Foundation grant (2017), the UCLA Hellman fellowship (2017), the NIH Pathway to Independence Award (2014), a Simons Research fellowship (2014), and a Harvard Science of the Human Past fellowship (2012), as well as the Northrop-Grumman Excellence in Teaching Award at UCLA (2019).

  • ML & BioMedicine

Biomedical data: high-dimensional, heterogeneous, noisy

  1. Clinical data & DL
  • Predict death after surgery – deaths and complications: sepsis, acute kidney injury
  • Mortality during and after surgery
  • Collaboration: Anesthesiology, PeriOps, UCLA Health
  • Data warehouse – EMR, 4/2013 – 12/2018
  • 60,000 patients in the data: age, height, weight, gender, ASA status (input from a physician)

Pre-operative mortality risk prediction – false positives, missing data: which lab data was collected, and what were the values

2% of admissions are associated with mortality

SMOTE: synthetic over-sampling of the minority (at-risk) class
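SMOTE handles the 2% class imbalance by synthesizing new minority examples: it interpolates between a minority sample and one of its minority-class nearest neighbours. A minimal sketch of the idea (my illustration, not the imbalanced-learn implementation):

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """SMOTE-style over-sampling: generate n_new synthetic points by linear
    interpolation between a minority sample and one of its k nearest
    minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (by squared distance), excluding x itself
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))[:k]
        n = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, n)))
    return synthetic

# Toy 2-D minority class (e.g., surgeries that ended in mortality):
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority, n_new=4)
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay inside the minority region rather than duplicating existing rows, which is what distinguishes SMOTE from plain over-sampling.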

Learning setup: temporal training/testing split, hyperparameter tuning

Models: logistic regression, random forest, gradient-boosted trees

Feature sets: ASA status, surrogate ASA

  • ASA status did not contribute – results were the same with and without it
  • Lab values and the timing of labs are the most important features
  • A random forest model was selected
  • Precision/recall curve
  • The model reduced the number of patients flagged by around 20x
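The precision/recall trade-off behind "flagging fewer patients" comes from sweeping the score threshold: raising it flags fewer patients at higher precision but lower recall. A toy sketch with invented scores (not the UCLA data):

```python
# Precision/recall at a score threshold, as used to pick an operating
# point on the curve. Labels and scores below are made up.

def pr_at_threshold(y_true, scores, threshold):
    flagged = [s >= threshold for s in scores]
    tp = sum(f and t for f, t in zip(flagged, y_true))
    precision = tp / sum(flagged) if any(flagged) else 1.0
    recall = tp / sum(y_true)
    return precision, recall, sum(flagged)

y_true = [1, 0, 0, 1, 0, 0, 0, 1]              # 1 = adverse outcome (toy labels)
scores = [0.9, 0.8, 0.1, 0.7, 0.3, 0.2, 0.4, 0.6]  # model risk scores (toy)

p50, r50, n50 = pr_at_threshold(y_true, scores, 0.5)    # lenient threshold
p75, r75, n75 = pr_at_threshold(y_true, scores, 0.75)   # strict threshold
```

Here the strict threshold flags half as many patients (`n75 = 2` vs. `n50 = 4`) at the cost of recall, which is the same trade the talk describes when the selected model cut flagged patients by around 20x.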

Open problems: interoperability, learning over private data

2. Epidemiological data & ML – social distancing in the COVID-19 pandemic

  • Effectiveness of social distancing
  • Susceptible–Exposed–Infectious–Removed (SEIR) model
  • Average duration of infection
  • R-naught under social distancing – roughly, the ratio of new exposures (S→E) to removals (I→R); the lower, the better
  • Social distancing and relaxation – relaxation in 2022
  • COVID spread – estimate when social distancing needs to end
  • UK, NY, Spain, France, Germany, Denmark
  • Hierarchical Bayesian model: shared global parameters, location-specific parameters, observations
  • Hierarchical Bayesian SEIR model: data-generation process
  • Empirical Bayes: maximize the likelihood of the global parameters
  • Trajectory based on model fit
  • Estimation of uncertainty
  • End of social distancing – a time distribution around a mean
  • Assumptions: no seasonality, no infinite immunity, no vaccine
  • Quantify uncertainty
  • Working with domain-knowledge experts is valuable
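The SEIR dynamics in the notes above can be sketched with a few lines of forward simulation. This is the textbook compartmental model with illustrative parameters, not the hierarchical Bayesian model from the talk:

```python
def seir_step(S, E, I, R, beta, sigma, gamma, dt=1.0):
    """One Euler step of the SEIR model (compartments as population fractions).
    beta: transmission rate, sigma: 1/incubation period, gamma: 1/infectious period.
    Social distancing is modeled by scaling beta down; R0 = beta / gamma."""
    N = S + E + I + R
    dS = -beta * S * I / N
    dE = beta * S * I / N - sigma * E
    dI = sigma * E - gamma * I
    dR = gamma * I
    return S + dS * dt, E + dE * dt, I + dI * dt, R + dR * dt

# Toy run: 1% initially infectious, R0 = 0.5 / 0.2 = 2.5, 5-day incubation.
S, E, I, R = 0.99, 0.0, 0.01, 0.0
for day in range(100):
    S, E, I, R = seir_step(S, E, I, R, beta=0.5, sigma=1/5, gamma=0.2)
```

The talk's hierarchical model wraps dynamics like these in a data-generation process, shares global parameters across locations, and fits them by empirical Bayes; the deterministic step above is just the core the uncertainty is layered on.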

The Bayesians are Coming! The Bayesians are Coming, to Time Series – Aric LaBarr

Fri, Oct 30, 2020 5:50 PM – 6:35 PM EDT

Aric LaBarr, Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
With the computational advances over the past few decades, Bayesian analysis approaches are starting to be fully appreciated. Forecasting and time series also have Bayesian approaches and techniques, but most people are unfamiliar with them due to the immense popularity of the Exponential Smoothing and autoregressive integrated moving average (ARIMA) classes of models. However, Bayesian modeling and time series analysis have a lot in common! Both use historical information to help inform future modeling and decisions. Using past information is key to any time series analysis because the data typically evolve over time in a correlated way. Bayesian techniques rely on new data to update their models from previous instances for better estimates of posterior distributions. This talk will briefly introduce the differences between classical frequentist approaches to statistics and their Bayesian counterparts, as well as the difference between time series data made for forecasting and traditional cross-sectional data. From there, it will compare the classical Exponential Smoothing and ARIMA classes of time series models to Bayesian models with autoregressive components. Comparing the results of these models across the same data set allows the audience to see the potential benefits and disadvantages of each technique. This talk aims to allow people to update their own skill set in forecasting with these Bayesian techniques. At the end, the talk explores the technique of model ensembling in a time series context. From these ensembles, the benefits of all types of models are potentially blended together. These models and their respective outputs will be displayed in R.
  • Single exponential smoothing
  • Autoregressive (AR) models – long memory
  • Moving average (MA) models – short memory
  • Integrated AR + MA = ARIMA
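Single exponential smoothing, the first technique in the notes above, is short enough to sketch in full: the level is an exponentially weighted average of past observations, and the forecast is flat at the last level. A minimal illustration (toy data, not from the talk):

```python
def single_exponential_smoothing(series, alpha):
    """Single exponential smoothing: level_t = alpha * y_t + (1 - alpha) * level_{t-1}.
    alpha in (0, 1) controls how quickly old observations are discounted.
    Returns the smoothed levels; the one-step-ahead forecast is the last level."""
    level = series[0]
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

data = [10, 12, 11, 13, 12, 14]           # toy series
smoothed = single_exponential_smoothing(data, alpha=0.5)
forecast = smoothed[-1]                   # flat forecast for all future horizons
```

The "update the estimate as each observation arrives" structure is exactly the parallel the abstract draws with Bayesian updating; a Bayesian AR model makes the same idea explicit with a prior and a posterior instead of a single decayed level.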

Learning Intended Reward Functions: Extracting all the Right Information from All the Right Places

Fri, October 30, 3:45 PM


Anca Dragan, PhD | Assistant Professor, EECS | Head | UC Berkeley | InterACT lab

Type: Talk, Focus Area: Deep Learning



Content: AI work tends to focus on how to optimize a specified reward function, but rewards that lead to the desired behavior consistently are not so easy to specify. Rather than optimizing specified reward, which is already hard, robots have the much harder job of optimizing intended reward. While the specified reward does not have as much information as we make our robots pretend, the good news is that humans constantly leak information about what the robot should optimize. In this talk, we will explore how to read the right amount of information from different types of human behavior — and even the lack thereof.
Learning outcomes: After participating, you should be able to articulate the common pitfalls we face in defining an AI reward, loss, or objective function. You should also develop a basic understanding of the main algorithmic tools we have for avoiding these pitfalls.

Target audience: Participants with some AI experience, be it supervised or reinforcement learning.


Anca Dragan is an Assistant Professor in EECS at UC Berkeley, where she runs the InterACT lab. Her goal is to enable robots to work with, around, and in support of people. She works on algorithms that enable robots to a) coordinate with people in shared spaces, and b) learn what people want them to do. Anca did her PhD in the Robotics Institute at Carnegie Mellon University on legible motion planning. At Berkeley, she helped found the Berkeley AI Research Lab, is a co-PI for the Center for Human-Compatible AI, and has been honored by the Presidential Early Career Award for Scientists and Engineers (PECASE), the Sloan fellowship, the NSF CAREER award, the Okawa award, MIT’s TR35, and an IJCAI Early Career Spotlight.

  • Sequential decision making
  • Defining what the robot’s goal is
  • Autonomous cars
  • AI should optimize the intended reward, not the specified reward
  • Parametrization of the reward function
  • Agents over-learn from the specified reward but under-learn from other sources
  • Observe feedback and express the human feedback in an observation (human) model
  • How can we model reward design/specification as a noisy and suboptimal process?
  • Development vs. deployment environment
  • The robot trusts the development environment
  • Rewards should incentivize good behavior
  • Maximizing winning vs. maximizing score; minimizing winning vs. minimizing score
  • Model the demonstration as a reward-rational implicit choice
  • Human feedback as a reward-rational implicit choice
  • The state of the environment as a reward-rational implicit choice
  • Task specification → reward
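A common way to formalize "human behavior leaks reward information" is a Boltzmann-rational observation model: the human's choice is more likely the higher its reward, P(choice | θ) ∝ exp(β · r(choice, θ)), and Bayes' rule then turns an observed choice into evidence about θ. The sketch below is my illustration of that standard model, with invented names and numbers, not code from the talk:

```python
import math

def boltzmann_posterior(options, chosen, thetas, reward, beta=2.0):
    """Posterior over candidate reward parameters theta given one observed
    human choice, assuming a Boltzmann-rational human and a uniform prior.
    beta controls how rational (noise-free) the human is assumed to be."""
    posterior = []
    for theta in thetas:
        weights = [math.exp(beta * reward(o, theta)) for o in options]
        likelihood = math.exp(beta * reward(chosen, theta)) / sum(weights)
        posterior.append(likelihood)
    z = sum(posterior)
    return [p / z for p in posterior]

# Hypothetical example: theta is the driving speed the human actually wants.
options = [0.5, 1.0, 1.5]                        # speeds the human could pick
reward = lambda speed, theta: -(speed - theta) ** 2
post = boltzmann_posterior(options, chosen=1.5, thetas=[0.5, 1.5], reward=reward)
# The hypothesis theta = 1.5 explains the observed choice far better.
```

The same likelihood template covers the bullets above: a demonstration, a correction, or even the pre-existing state of the environment can each be treated as a "choice" from some option set, and each then updates the robot's belief about the intended reward.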



ODSC West Keynotes

Suchi Saria, PhD
Suchi Saria, PhD

Director Of The Machine Learning And Healthcare Lab, John C. Malone Endowed Chair, Founder Of Bayesian Health, MIT Technology Review’s 35 Innovators Under 35, And A World Economic Forum Young Global Leader

Johns Hopkins University

Jeannette M. Wing, PhD
Jeannette M. Wing, PhD

Avanessians Director Of The Data Science Institute, Professor Of Computer Science Columbia University, Former Corporate Vice President Microsoft, Former Assistant Director, National Science Foundation

Columbia University

Ion Stoica, PhD
Ion Stoica, PhD

Professor Of Computer Science, Head Of RISELab. Co-Founder Of Anyscale, Databricks, And Conviva Networks, ACM Fellow, SIGOPS Hall Of Fame Award (2015), SIGCOMM Test Of Time Award (2011)

UC Berkeley

Raluca Ada Popa, PhD
Raluca Ada Popa, PhD

Cybersecurity & Applied Cryptography Professor, MIT Technology Review’s 35 Under 35, Recipient Of Intel Early Career Faculty Honor Award, George M. Sprowls Award For Best MIT CS Doctoral Thesis, Co-Founder Of PreVeil

UC Berkeley

Zoubin Ghahramani, PhD
Zoubin Ghahramani, PhD

Chief Scientist, Founding Director Of The Alan Turing Institute, Prof. Of Information Engineering & Deputy Director Of The Leverhulme Centre For The Future Of Intelligence, Fellow Of St John’s College Cambridge And Of The Royal Society

Uber | The University of Cambridge

Azalia Mirhoseini, PhD
Azalia Mirhoseini, PhD

Senior Research Scientist At Google Brain. Advisor At Cmorq. Co-Founder Machine Learning For Systems Moonshot At Brain Focusing On Deep RL. MIT Technology Review 35 Under 35 Award

Google Brain

Marinela Profi
Marinela Profi

Global Strategist For AI, Global Ambassador For The Women Tech Network, Author Of “Mastering Model Lifecycle Orchestration: An Interactive Guide”


John Montgomery
John Montgomery

Corporate Vice President, Visual Studio, Microsoft Azure AI Lead, Former Chief Information Officer At Imagine Publishing, Author At Visual Studio


Ben Taylor,PhD
Ben Taylor,PhD

Chief AI Evangelist, Deep Learning & HPC Expert, Co-Founder & Chief Scientist At Zeff.Ai, Former Chief Scientist At HireVue, ProductCraft Contributor



Open Data Science


Pre-conference Day

ODSC BootCamp

10:00 am

Fundamentals | Morning Sessions


Choose from 6 foundation sessions in Programming, Mathematics for Data Science, and Statistics

Virtual break


11:00 am
12:00 pm
1:00 pm
2:00 pm

Fundamentals | Afternoon Sessions


Choose from 6 foundation sessions in Programming, Mathematics for Data Science, and Statistics

3:00 pm
4:00 pm
5:00 pm


Open Data Science


Day 1

ODSC Trainings, Workshops, AI Expo, Ai x Talks and Ai x Keynotes

10:00 am

Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Networking break


Morning Partners Demo Talks


Choose from 12 Partners Sessions

11:00 am

Virtual Exhibitor Showcase


Visit 30+ Virtual Partner booths

12:00 pm
1:00 pm

Networking break


2:00 pm

Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Afternoon Partners Demo Talks


Choose from 12 Partners Sessions

3:00 pm
4:00 pm
5:00 pm


Open Data Science


Day 2

ODSC Keynotes, Talks, Trainings, Workshops, AI Expo & Events

9:00 am

ODSC Keynote


10:00 am

Morning Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Networking break


Virtual Exhibitor Showcase & Partners Demo Talks


Choose from 12 Morning Partners Sessions & visit 25+ Virtual Partner booths

11:00 am

Breakout Talk Sessions


Choose from 7 talk presentations

12:00 pm
1:00 pm

Networking break


2:00 pm

Afternoon Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Virtual Exhibitor Showcase & Partners Demo Talks


Choose from 12 Afternoon Partners Sessions & visit 25+ Virtual Partner booths

Breakout Talk Sessions


Choose from 7 talk presentations

3:00 pm
4:00 pm
5:00 pm


Open Data Science


Day 3

ODSC Keynotes, Talks, Trainings, Workshops, Events, & Career Expo

9:00 am

ODSC Keynote


10:00 am

Morning Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Virtual Lunch & Networking break


Virtual Career Expo


Get in touch with 30+ Hiring Partners and choose from 12 Mentor Talks

11:00 am

Breakout Talk Sessions


Choose from 7 talk presentations

12:00 pm
1:00 pm

Virtual Lunch & Networking break


2:00 pm

Afternoon Hands-on Training and Workshops


Choose from five 3.5-hour Training Sessions and six 90-minute Workshop Sessions

Breakout Talk Sessions


Choose from 7 talk presentations

3:00 pm
4:00 pm
5:00 pm



NLP track speakers:

  • Nadja Herger, PhD
  • Viktoriia Samatova
  • Nina Hristozova
  • Daniel Whitenack, PhD
  • David Talby, PhD – NLP for healthcare
  • Tian Zheng, PhD
  • Phoebe Liu
  • Frank Zhao

TOPICS – trends in NLP, including pre-trained models, with use cases focusing on deep learning, speech-to-text, and semantic search.

  • Natural Language Processing
  • NLP Transformers
  • Pre-trained Models
  • Text Analytics
  • Natural Language Understanding
  • Sentiment Analysis
  • Natural Language Generation
  • Speech Recognition
  • Named Entity Extraction


  • BERT
  • XLNet
  • GPT-2
  • Transformers
  • Word2Vec
  • Deep Learning Models
  • RNN & LSTM
  • Machine Learning Models
  • ULMFiT
  • Transfer Learning


  • TensorFlow 2.0
  • Hugging Face Transformers
  • PyTorch
  • Theano
  • SpaCy
  • NLTK
  • AllenNLP
  • Stanford CoreNLP
  • Keras

Read Full Post »