
Archive for the ‘Machine Learning’ Category

Data Science: Step by Step – A Resource for LPBI Group One-Year Internship in IT, IS, DS

Reporter: Aviva Lev-Ari, PhD, RN

9 Free Harvard Courses to Learn Data Science

In this article, I will list 9 free Harvard courses that you can take to learn data science from scratch. Feel free to skip any of these courses if you already possess knowledge of that subject.

Step 1: Programming

The first step you should take when learning data science is to learn to code. You can do this with the programming language of your choice, ideally Python or R.

If you’d like to learn R, Harvard offers an introductory R course created specifically for data science learners, called Data Science: R Basics.

This program will take you through R concepts like variables, data types, vector arithmetic, and indexing. You will also learn to wrangle data with libraries like dplyr and create plots to visualize data.

If you prefer Python, you can choose to take CS50’s Introduction to Programming with Python offered for free by Harvard. In this course, you will learn concepts like functions, arguments, variables, data types, conditional statements, loops, objects, methods, and more.
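To give you a flavor of what those first concepts look like in practice, here is a tiny illustrative Python sketch (my own example, not course material):

```python
# A small taste of the basics an introductory Python course covers:
# functions, arguments, data types, conditionals, and loops.

def grade(score: float) -> str:
    """Map a numeric score to a letter grade."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    else:
        return "C or below"

scores = [95, 84, 72]          # a list, one of Python's core data types
for s in scores:               # a loop over the list
    print(s, "->", grade(s))   # function calls and printing
```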

Both programs above are self-paced. However, the Python course is more detailed than the R program, and requires a longer time commitment to complete. Also, the rest of the courses in this roadmap are taught in R, so it might be worth learning R to be able to follow along easily.

Step 2: Data Visualization

Visualization is one of the most powerful techniques for communicating your findings from data to another person.

With Harvard’s Data Visualization program, you will learn to build visualizations using the ggplot2 library in R, along with the principles of communicating data-driven insights.
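The course itself teaches ggplot2 in R; as a rough illustration of the same grammar-of-graphics idea, here is a minimal sketch using plotnine, a Python port of ggplot2 (the data frame is a toy placeholder, not course data):

```python
# Grammar-of-graphics sketch with plotnine (pip install plotnine),
# a Python port of R's ggplot2. The data frame below is toy data.
import pandas as pd
from plotnine import ggplot, aes, geom_point, labs

df = pd.DataFrame({
    "population": [1.2e6, 0.5e6, 3.1e6],
    "cases": [120, 45, 300],
})

plot = (
    ggplot(df, aes(x="population", y="cases"))
    + geom_point()
    + labs(title="Cases vs. population (toy data)")
)
plot.save("scatter.png")  # or display the object directly in a notebook
```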

Step 3: Probability

In this course, you will learn essential probability concepts that are fundamental to conducting statistical tests on data. The topics taught include random variables, independence, Monte Carlo simulations, expected values, standard errors, and the Central Limit Theorem.

The concepts above will be introduced with the help of a case study, which means that you will be able to apply everything you learned to an actual real-world dataset.
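As a concrete taste of these ideas, here is a minimal Python sketch of a Monte Carlo simulation illustrating expected values, standard errors, and the Central Limit Theorem (my own illustration, not course code):

```python
# Monte Carlo illustration of the Central Limit Theorem: means of many
# samples from a skewed distribution are approximately normally distributed.
import random
import statistics

def sample_mean(n: int) -> float:
    # Draw n values from a skewed (exponential) distribution and average them.
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(50) for _ in range(10_000)]     # 10,000 simulated means
print("expected value ~", statistics.fmean(means))   # close to 1.0
print("standard error ~", statistics.stdev(means))   # close to 1/sqrt(50)
```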

Step 4: Statistics

After learning probability, you can take this course to learn the fundamentals of statistical inference and modeling.
This program will teach you to define population estimates and margins of error, introduce you to Bayesian statistics, and provide you with the fundamentals of predictive modeling.
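To make the first of those ideas concrete, here is the standard formula from introductory statistics (a general fact, not specific course material): for a sample proportion $\hat{p}$ estimated from $n$ observations, the 95% margin of error is

$$\text{MoE} \approx 1.96\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

so a poll of $n = 1{,}000$ respondents with $\hat{p} = 0.5$ carries a margin of error of roughly ±3.1 percentage points.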

Step 5: Productivity Tools (Optional)

I’ve included this project management course as optional since it isn’t directly related to learning data science. Rather, it teaches you to use Unix/Linux for file management, Git and GitHub for version control, and R Markdown for creating reports.

The ability to do the above will save you a lot of time and help you better manage end-to-end data science projects.

Step 6: Data Pre-Processing

The next course in this list is called Data Wrangling, and will teach you to prepare data and convert it into a format that is easily digestible by machine learning models.

You will learn to import data into R, tidy data, process string data, parse HTML, work with date-time objects, and mine text.

As a data scientist, you often need to extract data that is publicly available on the Internet in the form of a PDF document, HTML webpage, or a Tweet. You will not always be presented with clean, formatted data in a CSV file or Excel sheet.

By the end of this course, you will be able to wrangle and clean data to extract critical insights from it.
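For a sense of what these wrangling steps look like in code, here is a minimal pandas sketch in Python (the file name, column names, and formats are hypothetical placeholders; the course itself works in R):

```python
# Sketch of typical wrangling steps with pandas. The file and columns
# ("raw_records.csv", "name", "visit_date") are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("raw_records.csv")                   # import the data
df.columns = df.columns.str.strip().str.lower()       # tidy column headers
df["name"] = df["name"].str.title()                   # process string data
df["visit_date"] = pd.to_datetime(df["visit_date"])   # parse date-time objects
tidy = df.dropna(subset=["name", "visit_date"])       # drop unusable rows
print(tidy.describe(include="all"))
```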

Step 7: Linear Regression

Linear regression is a machine learning technique used to model a linear relationship between two or more variables. It can also be used to identify and adjust for the effects of confounding variables.

This course will teach you the theory behind linear regression models, how to examine the relationship between two variables, and how confounding variables can be detected and removed before building a machine learning algorithm.
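Here is a minimal Python sketch, on synthetic data, of the core idea of adjusting for a confounder by including it as a regressor (my own illustration; the course itself uses R):

```python
# Adjusting for a confounder by including it as a regressor.
# All data is synthetic; recovering coefficients near the true values
# (2 for exposure, 5 for confounder) shows the effect estimated correctly.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
confounder = rng.normal(size=n)
exposure = confounder + rng.normal(size=n)       # exposure depends on confounder
outcome = 2 * exposure + 5 * confounder + rng.normal(size=n)

X = np.column_stack([np.ones(n), exposure, confounder])  # intercept + both terms
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(beta)  # approximately [0, 2, 5]: exposure effect separated from confounder
```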

Step 8: Machine Learning

Finally, the course you’ve probably been waiting for! Harvard’s machine learning program will teach you the basics of machine learning, techniques to mitigate overfitting, supervised and unsupervised modeling approaches, and recommendation systems.
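As a small illustration of one of those topics, overfitting, here is a scikit-learn sketch that diagnoses it by comparing training accuracy against held-out accuracy (my own example, not course code):

```python
# Detecting overfitting by comparing train vs. held-out accuracy,
# using scikit-learn's bundled toy digits dataset.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(max_depth=None).fit(X_tr, y_tr)   # unconstrained
shallow = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)   # regularized

for name, model in [("deep", deep), ("shallow", shallow)]:
    gap = model.score(X_tr, y_tr) - model.score(X_te, y_te)
    print(f"{name}: train-test accuracy gap = {gap:.2f}")  # large gap => overfit
```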

Step 9: Capstone Project

After completing all the above courses, you can take Harvard’s data science capstone project, where your skills in data visualization, probability, statistics, data wrangling, data organization, regression, and machine learning will be assessed.

With this final project, you will get the opportunity to put together all the knowledge learned from the above courses and gain the ability to complete a hands-on data science project from scratch.

Note: All the courses above are available on edX’s online learning platform and can be audited for free. If you want a course certificate, however, you will have to pay for one.

Building a data science learning roadmap with free courses offered by MIT.

8 Free MIT Courses to Learn Data Science Online

I enrolled in an undergraduate computer science program and decided to major in data science. I spent over $25K in tuition fees over the span of three years, only to graduate and realize that I wasn’t equipped with the skills necessary to land a job in the field.

I barely knew how to code, and was unclear about the most basic machine learning concepts.

I took some time out to try and learn data science myself — with the help of YouTube videos, online courses, and tutorials. I realized that all of this knowledge was publicly available on the Internet and could be accessed for free.

It came as a surprise that even Ivy League universities started making many of their courses accessible to students worldwide, for little to no charge. This meant that people like me could learn these skills from some of the best institutions in the world, instead of spending thousands of dollars on a subpar degree program.

In this article, I will provide you with a data science roadmap I created using only freely available MIT online courses.

Step 1: Learn to code

I highly recommend learning a programming language before going deep into the math and theory behind data science models. Once you learn to code, you will be able to work with real-world datasets and get a feel of how predictive algorithms function.

MIT Open Courseware offers a beginner-friendly Python course called Introduction to Computer Science and Programming.

This course is designed to help people with no prior coding experience write programs that tackle useful problems.

Step 2: Statistics

Statistics is at the core of every data science workflow — it is required when building a predictive model, analyzing trends in large amounts of data, or selecting useful features to feed into your model.

MIT Open Courseware offers a beginner-friendly course called Introduction to Probability and Statistics. After taking this course, you will know the basic principles of statistical inference and probability. Some concepts covered include conditional probability, Bayes’ theorem, covariance, the central limit theorem, resampling, and linear regression.
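One of those concepts, Bayes’ theorem, in its standard form:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Worked through, it explains the classic rare-disease result: a positive result from a test with 99% sensitivity and 99% specificity for a 1-in-1,000 condition still leaves only about a 9% chance of actually having the condition.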

This course will also walk you through statistical analysis using the R programming language, which is useful as it adds to your tool stack as a data scientist.

Another useful program offered by MIT for free is called Statistical Thinking and Data Analysis. This is another elementary course in the subject that will take you through different data analysis techniques in Excel, R, and MATLAB.

You will learn about data collection, analysis, different types of sampling distributions, statistical inference, linear regression, multiple linear regression, and nonparametric statistical methods.

Step 3: Foundational Math Skills

Calculus and linear algebra are two other branches of math used in the field of machine learning. Taking a course or two in these subjects will give you a different perspective on how predictive models function and the workings of their underlying algorithms.
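A single formula illustrates where both subjects enter machine learning, the gradient descent update rule used to train most predictive models:

$$\theta \leftarrow \theta - \eta\,\nabla_{\theta} L(\theta)$$

Calculus supplies the gradient $\nabla_{\theta} L$ of the loss function, linear algebra supplies the vector and matrix operations on the parameters $\theta$, and $\eta$ is the learning rate.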

To learn calculus, you can take Single Variable Calculus offered by MIT for free, followed by Multivariable Calculus.

Then, you can take this Linear Algebra class by Prof. Gilbert Strang to get a strong grasp of the subject.

All of the above courses are offered by MIT Open Courseware, and are paired with lecture notes, problem sets, exam questions, and solutions.

Step 4: Machine Learning

Finally, you can use the knowledge gained in the courses above to take MIT’s Introduction to Machine Learning course. This program will walk you through the implementation of predictive models in Python.

The core focus of this course is on supervised and reinforcement learning problems, and you will be taught concepts such as generalization and how overfitting can be mitigated. Apart from working with structured datasets, you will also learn to process image and sequential data.

MIT’s machine learning program cites three prerequisites: Python, linear algebra, and calculus, which is why it is advisable to take the courses above before starting this one.

Are These Courses Beginner-Friendly?

Even if you have no prior knowledge of programming, statistics, or mathematics, you can take all the courses listed above.

MIT has designed these programs to take you through each subject from scratch. However, unlike many MOOCs out there, the pace picks up quickly and the courses cover material in considerable depth.

Due to this, it is advisable to do all the exercises that come with the lectures and work through all the reading material provided.

SOURCE

Natassha Selvaraj is a self-taught data scientist with a passion for writing. You can connect with her on LinkedIn.

https://www.kdnuggets.com/2022/03/8-free-mit-courses-learn-data-science-online.html


2022 EmTechDigital @MIT, March 29-30, 2022

Real Time Coverage: Aviva Lev-Ari, PhD, RN 

#EmTechDigital

@AVIVA1950

@pharma_BI

@techreview

SPEAKERS

https://event.technologyreview.com/emtech-digital-2022/speakers

Ali Alvi, Turing Group Program Manager, Microsoft

Refik Anadol, CEO, RAS Lab; Lecturer, UCLA

Lauren Bennett, Group Software Engineering Lead, Spatial Analysis and Data Science, Esri

Elizabeth Bramson-Boudreau, CEO, MIT Technology Review

Tara Chklovski, Founder & CEO, Technovation

Sheldon Fernandez, CEO, DarwinAI

David Ferrucci, Founder, CEO, & Chief Scientist, Elemental Cognition

Anthony Green, Podcast Producer, MIT Technology Review

Agrim Gupta, PhD Student, Stanford Vision and Learning Lab, Stanford University

Mike Haley, VP of Research, Autodesk

Will Douglas Heaven, Senior Editor for AI, MIT Technology Review

Natasha Jaques, Senior Research Scientist, Google Brain

Tony Jebara, VP of Engineering and Head of Machine Learning, Spotify

Clinton Johnson, Racial Equity Unified Team Lead, Esri

Danny Lange, SVP of Artificial Intelligence, Unity Technologies

Julia (Xing) Li, Deputy General Manager, Baidu USA

Darcy MacClaren, Senior Vice President, Digital Supply Chain, SAP North America

Haniyeh Mahmoudian, Global AI Ethicist, DataRobot

Andrew Moore, GM and VP, Google Cloud AI, Google

Mira Murati, SVP, Research, Product, & Partnerships, OpenAI

Prem Natarajan, Vice President Alexa AI, Head of NLU, Amazon

Andrew Ng, Founder and CEO, Landing AI

Amy Nordrum, Editorial Director, Special Projects & Operations, MIT Technology Review

Kavitha Prasad, VP & GM, Datacenter, AI and Cloud Execution and Strategy, Intel Corporation

Bali Raghavan, Head of Engineering, Forward

Rajiv Shah, Principal Data Scientist, Snorkel AI

Sameena Shah, Managing Director, J.P. Morgan AI Research, JP Morgan Chase

David Simchi-Levi, Director, Data Science Lab, MIT

Jennifer Strong, Senior Editor for Podcasts and Live Journalism, MIT Technology Review

Fiona Tan, CTO, Wayfair

Zenna Tavares, Research Scientist, Columbia University; Co-Founder, Basis

Nicol Turner Lee, Director, Center for Technology Innovation, Brookings Institution

Raquel Urtasun, Founder & CEO, Waabi

Oriol Vinyals, Principal Scientist, DeepMind

MIT Inside Track

David Cox, IBM Director, MIT-IBM Watson AI Lab

Luba Elliott, Curator, Producer, and Researcher, Creative AI

Charlotte Jee, Reporter, News, MIT Technology Review

Naveen Kamat, Executive Director, Data and AI Services, Kyndryl

Joseph Lehar, Senior Vice President, R&D Strategy, Owkin

Stefanie Mueller, Associate Professor, MIT CSAIL

Jianxiong Xiao, Founder and CEO, AutoX

TUESDAY, MARCH 29

 

Data-Centric AI

Better Data, Better AI

Data powers AI. Good data can mean the difference between an impactful solution or one that never gets off the ground. Re-assess the foundational AI questions to ensure your data is working for, not against, you.

Innovation to Reality

The challenges of implementing AI are many. Avoid the common pitfalls with real-world case studies from leaders who have successfully turned their AI solutions into reality.

Harness What’s Possible at the Edge

Because of its potential for near-instantaneous decision making, pioneers are moving AI to the edge. We examine the pros and cons of moving AI decisions to the edge, with the experts getting it right.

Generative AI Solutions

The use of generative AI to boost human creativity is breaking boundaries in creative areas previously untouched by AI. We explore the intersection of data and algorithms enabling collaborative AI processes to design and create.

Day 1: Data-Centric AI (9:00 a.m. – 5:20 p.m.)

9:00 AM

Welcome Remarks

Will Douglas Heaven

Senior Editor for AI, MIT Technology Review

Better Data, Better AI (9:10 a.m. – 10:35 a.m.)

Data powers AI. Good data can mean the difference between an impactful solution or one that never gets off the ground. Re-assess the foundational AI questions to ensure your data is working for, not against, you.

9:10 AM

Empowering Data-Centric AI

Data is the most under-valued and de-glamorized aspect of AI. Learn why shifting the focus from model/algorithm development to data quality is the next, and most efficient, way to improve the decision-making abilities of AI.

Andrew Ng

Founder and CEO, Landing AI

9:40 AM

The Mechanics of Data-First AI

Data labeling is key to determining the success or failure of AI applications. Learn how to implement a data-first approach that can transform AI inference, resulting in better models that make better decisions.

Rajiv Shah

Principal Data Scientist, Snorkel AI

10:10 AM

Thought Leadership in Responsible AI

Question the status quo. Build stakeholder trust. These are foundational elements of thought leadership in AI. Explore how organizations can use their data and algorithms in ethical and responsible ways while building bigger and more effective systems.

Haniyeh Mahmoudian

Global AI Ethicist, DataRobot

Mainstage Break (10:35 a.m. – 11:05 a.m.)

Networking and refreshments for our live audience and a selection of curated content for those tuning in virtually.

10:35 AM

MIT Inside Track: From AI Startup to Tech “Unicorn” (available online only)

With its next-generation machine learning models fueling precision medicine, the French biotech company Owkin captured the attention of the pharma industry. Learn how they did it and get tips to navigate the complex task of scaling your innovation.

Joseph Lehar

Senior Vice President, R&D Strategy, Owkin

Networking Break

Networking and refreshments for our live audience.

Innovation to Reality (11:05 a.m. – 12:30 p.m.)

The challenges of implementing AI are many. Avoid the common pitfalls with real-world case studies from leaders who have successfully turned their AI solutions into reality.

11:05 AM

Secrets of Successful AI Deployments

Deploying AI in real-world environments benefits from human input before and during implementation. Get an inside look at how organizations can ensure reliable results with the key questions and competing needs that should be considered when implementing AI solutions.

Andrew Moore

GM and VP, Google Cloud AI, Google

11:35 AM

From Research Lab to Real World

AI is evolving from the research lab into practical real world applications. Learn what issues should be top of mind for businesses, consumers, and researchers as we take a deep dive into AI solutions that increase modern productivity and accelerate intelligence transformation.

Julia (Xing) Li

Deputy General Manager, Baidu USA

12:00 PM

Closing the 20% Performance Gap

Getting AI to work 80% of the time is relatively straightforward, but trustworthy AI requires deployments that work 100% of the time. Unpack some of the biggest challenges that come up when eliminating the 20% gap.

Bali Raghavan

Head of Engineering, Forward

Lunch and Networking Break (12:30 p.m. – 1:30 p.m.)

12:30 PM

Lunch and Networking Break

Lunch served at the MIT Media Lab and a selection of curated content for those tuning in virtually.

Harness What’s Possible at the Edge (1:30 p.m. – 3:15 p.m.)

Because of its potential for near-instantaneous decision making, pioneers are moving AI to the edge. We examine the pros and cons of moving AI decisions to the edge, with the experts getting it right.

1:30 PM

AI Integration Across Industries – Presented by Intel

To create sustainable business impact, AI capabilities need to be tailored and optimized to an industry or organization’s specific requirements and infrastructure model. Hear how customers’ challenges across industries can be addressed in any compute environment from the cloud to the edge with end-to-end hardware and software optimization.

Kavitha Prasad

VP & GM, Datacenter, AI and Cloud Execution and Strategy, Intel Corporation

Elizabeth Bramson-Boudreau

CEO, MIT Technology Review

1:55 PM

Explainability at the Edge

Decision making has moved from the edge to the cloud before settling into a hybrid setup for many AI systems. Through the examination of key use cases, take a deep dive into understanding the benefits and drawbacks of operating a machine-learning system at the point of inference.

Sheldon Fernandez

CEO, DarwinAI

2:25 PM

AI Experiences at the Edge

Enable your organization to transform customer experiences through AI at the edge. Learn about the required technologies, including teachable and self-learning AI, that are needed for a successful shift to the edge, and hear how deploying these technologies at scale can unlock richer, more responsive experiences.

Prem Natarajan

Vice President Alexa AI, Head of NLU, Amazon

2:50 PM

The Road Ahead

Reimagine AI solutions as a unified system, instead of individual components. Through the lens of autonomous vehicles, discover the pros and cons of using an all-inclusive AI-first approach that includes AI decision-making at the edge and see how this thinking can be applied across industry.

Raquel Urtasun

Founder & CEO, Waabi

Mainstage Break (3:15 p.m. – 3:45 p.m.)

Networking and refreshments for our live audience and a selection of curated content for those tuning in virtually.

3:15 PM

Networking Break

Networking and refreshments for our live audience.

MIT Inside Track: The Impact of Creative AI (available online only)

Advances in machine learning are enabling artists and creative technologists to think about and use AI in new ways. Discuss the concept of creative AI and look at project examples from London’s art scene that illustrate the various ways creative AI is bridging the gap between the traditional art world and the latest technological innovations.

Luba Elliott

Curator, Producer, and Researcher, Creative AI

Generative AI Solutions (3:45 p.m. – 5:10 p.m.)

The use of generative AI to boost human creativity is breaking boundaries in creative areas previously untouched by AI. We explore the intersection of data and algorithms enabling collaborative AI processes to design and create.

3:45 PM

Enhancing Design through Generative AI

Change the design problem with AI. The creative nature of generative AI enhances design capabilities, finding efficiencies and opportunities that humans alone might not conceive. Explore business applications including project planning, construction, and physical design.

Mike Haley

VP of Research, Autodesk

4:15 PM

Using Synthetic Data and Simulations

Deep learning is a data-hungry technology. Manually labeled training data has become cost-prohibitive and time-consuming to produce. Get a glimpse at how interactive large-scale synthetic data generation can accelerate the AI revolution, unlocking the potential of data-driven artificial intelligence.

Danny Lange

SVP of Artificial Intelligence, Unity Technologies

4:40 PM

The Art of AI

Push beyond the typical uses of AI. Explore the nexus of art, technology, and human creativity through the unique innovation of kinetic data sculptures that use machines to give physical context and shape to data to rethink how we engage with the physical world.

Refik Anadol

CEO, RAS Lab; Lecturer, UCLA

Last Call with the Editors (5:10 p.m. – 5:20 p.m.)

5:10 PM

Last Call with the Editors

Before we wrap day 1, join our last call with all of our editors to get their analysis on the day’s topics, themes, and guests.

Networking Reception (5:20 p.m. – 6:20 p.m.)

WEDNESDAY, MARCH 30

Evolving the Algorithms

What’s Next for Deep Learning

Deep learning algorithms have powered most major AI advances of the last decade. We bring you into the top innovation labs to see how they are advancing their deep learning models to find out just how much more we can get out of these algorithms.

AI in Day-To-Day Business

Many organizations are already using AI internally in their day-to-day operations, in areas like cybersecurity, customer service, finance, and manufacturing. We examine the tools that organizations are using when putting AI to work.

Making AI Work for All

As AI increasingly underpins our lives, businesses, and society, we must ensure that AI works for everyone – not just those represented in datasets, and not just 80% of the time. Examine the challenges and solutions needed to ensure AI works fairly, for all.

Envisioning the Next AI

Some business problems can’t be solved with current deep learning methods. We look at what’s around the corner: the new approaches and most revolutionary ideas propelling us toward the next stage in AI evolution.

Day 2: Evolving the Algorithms (9:00 a.m. – 5:25 p.m.)

9:00 AM

Welcome Remarks

Will Douglas Heaven

Senior Editor for AI, MIT Technology Review

What’s Next for Deep Learning (9:10 a.m. – 10:25 a.m.)

Deep learning algorithms have powered most major AI advances of the last decade. We bring you into the top innovation labs to see how they are advancing their deep learning models to find out just how much more we can get out of these algorithms.

9:10 AM

Transforming Traditional Algorithms

Transformer-based language models are revolutionizing the way neural networks process natural language. This deep dive looks at how organizations can put their data to work using transformer models. We consider the problems that businesses may face as these massive models mature, including training needs, managing parallel processing at scale, and countering offensive data.

Ali Alvi

Turing Group Program Manager, Microsoft

9:35 AM

Human-like Problem Solving

Critical thinking may be one step closer for AI by combining large-scale transformers with smart sampling and filtering. Get an early look at how AlphaCode’s entry into competitive programming may lead to a human-like capacity for AI to write original code that solves unforeseen problems.

Oriol Vinyals

Principal Scientist, DeepMind

10:00 AM

Aligning AI Technologies at Scale

As advanced AI systems gain greater capabilities in our search for artificial general intelligence, it’s critical to teach them how to understand human intentions. Look at the latest advancements in AI systems and how to ensure they can be truthful, helpful, and safe.

Mira Murati

SVP, Research, Product, & Partnerships, OpenAI

Mainstage Break (10:25 a.m. – 10:55 a.m.)

Networking and refreshments for our live audience and a selection of curated content for those tuning in virtually.

10:25 AM

Networking Break

Networking and refreshments for our live audience.

Business-Ready Data Holds the Key to AI Democratization – Presented by Kyndryl

Good data is the bedrock of a self-service data consumption model, which in turn unlocks insights, analytics, and personalization at scale through AI. Yet many organizations face immense challenges setting up a robust data foundation. Dive into a pragmatic perspective on abstracting the complexity and untangling the conflicts in data management for better AI.

Naveen Kamat

Executive Director, Data and AI Services, Kyndryl

AI in Day-To-Day Business (10:55 a.m. – 12:20 p.m.)

Many organizations are already using AI internally in their day-to-day operations, in areas like cybersecurity, customer service, finance, and manufacturing. We examine the tools that organizations are using when putting AI to work.

10:55 AM

Improving Business Processes with AI

Effectively operationalized AI/ML can unlock untapped potential in your organization. From enhancing internal processes to managing the customer experience, get the pragmatic advice and takeaways leaders need to better understand their internal data to achieve impactful results.

Fiona Tan

CTO, Wayfair

11:25 AM

Accelerating the Supply Chain

Use AI to maximize reliability of supply chains. Learn the dos and don’ts to managing key processes within your supply chain, including workforce management, streamlining and simplification, and reaping the full value of your supply chain solutions.

Darcy MacClaren

Senior Vice President, Digital Supply Chain, SAP North America

David Simchi-Levi

Director, Data Science Lab, MIT

11:55 AM

Putting Recommendation Algorithms to Work

Machine and reinforcement learning enable Spotify to deliver the right content to the right listener at the right time, allowing for personalized listening experiences that facilitate discovery at a global scale. Through user interactions, algorithms suggest new content and creators that keep customers both happy and engaged with the platform. Dive into the details of making better user recommendations.

Tony Jebara

VP of Engineering and Head of Machine Learning, Spotify

Lunch and Networking Break (12:20 p.m. – 1:15 p.m.)

12:20 PM

Lunch and Networking Break

Lunch served at the MIT Media Lab and a selection of curated content for those tuning in virtually.

Making AI Work for All (1:15 p.m. – 2:35 p.m.)

As AI increasingly underpins our lives, businesses, and society, we must ensure that AI works for everyone – not just those represented in datasets, and not just 80% of the time. Examine the challenges and solutions needed to ensure AI works fairly, for all.

1:15 PM

Mapping Equity

Walk through the practical steps to map and understand the nuances, outliers, and special cases in datasets. Get tips to ensure ethical and trustworthy approaches to training AI systems that grow in scope and scale within a business.

Lauren Bennett

Group Software Engineering Lead, Spatial Analysis and Data Science, Esri

Clinton Johnson

Racial Equity Unified Team Lead, Esri

1:45 PM

Bridging the AI Accessibility Gap

Get an inside look at the long- and short-term benefits of addressing inequities in AI opportunities, ranging from educating the tech youth of the future to a 10,000-foot view of what it will take to ensure that equity is top of mind within society and business alike.

Tara Chklovski

Founder & CEO, Technovation

2:10 PM

The AI Policies We Need

Public policies can help to make AI more equitable and ethical for all. Examine how policies could impact corporations and what it means for building internal policies, regardless of what government adopts. Identify actionable ideas to best move policies forward for the widest benefit to all.

Nicol Turner Lee

Director, Center for Technology Innovation, Brookings Institution

Mainstage Break (2:35 p.m. – 3:05 p.m.)

Networking and refreshments for our live audience and a selection of curated content for those tuning in virtually.

2:35 PM

Networking Break

Networking and refreshments for our live audience.

MIT Inside Track: Accelerating the Advent of Autonomous Driving (available online only)

From the U.S. to China, the global robo-taxi race is gaining traction with consumers and regulators alike. Go behind the scenes with AutoX – a Level 4 driving technology company – and hear how it overcame obstacles while launching the world’s second and China’s first public, fully driverless robo-taxi service.

Jianxiong Xiao

Founder and CEO, AutoX

Envisioning the Next AI (3:05 p.m. – 4:50 p.m.)

Some business problems can’t be solved with current deep learning methods. We look at what’s around the corner: the new approaches and most revolutionary ideas propelling us toward the next stage in AI evolution.

3:05 PM

How AI Is Powering the Future of Financial Services – Presented by JP Morgan Chase

The use of AI in finance is gaining traction as organizations realize the advantages of using algorithms to streamline and improve the accuracy of financial tasks. Step through use cases that examine how AI can be used to minimize financial risk, maximize financial returns, optimize venture capital funding by connecting entrepreneurs to the right investors, and more.

Sameena Shah

Managing Director, J.P. Morgan AI Research, JP Morgan Chase

3:30 PM

Evolution of Mind and Body

In a study of simulated robotic evolution, it was observed that more complex environments and evolutionary changes to the robot’s physical form accelerated the growth of robot intelligence. Examine this cutting-edge research and decipher what this early discovery means for the next generation of AI and robotics.

Agrim Gupta

PhD Student, Stanford Vision and Learning Lab, Stanford University

4:00 PM

A Path to Human-like Common Sense

Understanding human thinking and reasoning processes could lead to more general, flexible and human-like artificial intelligence. Take a close look at the research building AI inspired by human common-sense that could create a new generation of tools for complex decision-making.

Zenna Tavares

Research Scientist, Columbia University; Co-Founder, Basis

4:25 PM

Social Learning Bots

Look under the hood at this innovative approach to AI learning with multi-agent and human-AI interactions. Discover how bots work together and learn together through personal interactions. Recognize the future implications for AI, plus the benefits and obstacles that may come from this new process.

Natasha Jaques

Senior Research Scientist, Google Brain

Closing Segment (4:50 p.m. – 5:25 p.m.)

4:50 PM

Pulling Back the Curtain on AI

David Ferrucci was the principal investigator for the team that led IBM Watson to its landmark Jeopardy success, awakening the world to the possibilities of AI. We pull back the curtain on AI for a wide-ranging discussion on explainable models and the next generation of human and machine collaboration, creating AI thought partners with limitless applications.

David Ferrucci

Founder, CEO, & Chief Scientist, Elemental Cognition

5:15 PM

Closing Remarks

Closing Toast (5:25 p.m. – 5:45 p.m.)


AI enabled Drug Discovery and Development: The Challenges and the Promise

Reporter: Aviva Lev-Ari, PhD, RN

 

Early Development

Caroline Kovac (the first IBM GM of Life Sciences) started the in silico development of drugs in 2000, using a large database of substances and computing power. She transformed an idea into a $2B business, with most of the revenue coming from big pharma. She would ask what new drugs they were planning to develop and, based on the in silico work, provide the four most probable combinations of substances.

Carol Kovac

General Manager, Healthcare and Life Sciences, IBM

From her speaker profile at a 2005 conference:

Carol Kovac is General Manager of IBM Healthcare and Life Sciences responsible for the strategic direction of IBM’s global healthcare and life sciences business. Kovac leads her team in developing the latest information technology solutions and services, establishing partnerships and overseeing IBM investment within the healthcare, pharmaceutical and life sciences markets. Starting with only two employees as an emerging business unit in the year 2000, Kovac has successfully grown the life sciences business unit into a multi-billion dollar business and one of IBM’s most successful ventures to date with more than 1500 employees worldwide. Kovac’s prior positions include general manager of IBM Life Sciences, vice president of Technical Strategy and Division Operations, and vice president of Services and Solutions. In the latter role, she was instrumental in launching the Computational Biology Center at IBM Research. Kovac sits on the Board of Directors of Research!America and Africa Harvest. She was inducted into the Women in Technology International Hall of Fame in 2002, and in 2004, Fortune magazine named her one of the 50 most powerful women in business. Kovac earned her Ph.D. in chemistry at the University of Southern California.

SOURCE

https://www.milkeninstitute.org/events/conferences/global-conference/2005/speaker-detail/1536

 

In 2022

The use of artificial intelligence in drug discovery, when coupled with new genetic insights and the increase of patient medical data of the last decade, has the potential to bring novel medicines to patients more efficiently and more predictably.

WATCH VIDEO

https://www.youtube.com/watch?v=b7N3ijnv6lk

SOURCE

https://engineering.stanford.edu/magazine/promise-and-challenges-relying-ai-drug-development?utm_source=Stanford+ALL

Conversation among three experts:

Jack Fuchs, MBA ’91, an adjunct lecturer who teaches “Principled Entrepreneurial Decisions” at Stanford School of Engineering, moderated and explored how clearly articulated principles can guide the direction of technological advancements like AI-enabled drug discovery.

Kim Branson, Global head of AI and machine learning at GSK.

Russ Altman, the Kenneth Fong Professor of Bioengineering, of genetics, of medicine (general medical discipline), of biomedical data science and, by courtesy, of computer science.

 

Synthetic Biology Software applied to the development of Galectin Inhibitors at LPBI Group

 

The Map of human proteins drawn by artificial intelligence and PROTAC (proteolysis targeting chimeras) Technology for Drug Discovery

Curators: Dr. Stephen J. Williams and Aviva Lev-Ari, PhD, RN

Using Structural Computation Models to Predict Productive PROTAC Ternary Complexes

Ternary complex formation is necessary but not sufficient for target protein degradation. In this research, Bai et al. address questions to better understand the rate-limiting steps between ternary complex formation and target protein degradation. They have developed a structure-based computational modeling approach to predict the efficiency and sites of target protein ubiquitination by CRBN-binding PROTACs. Such models will allow a more complete understanding of PROTAC-directed degradation and the crafting of increasingly effective and specific PROTACs for therapeutic applications.

Another major feature of this research is that it is the result of a collaboration between research groups at Amgen, Inc. and Promega Corporation. In the past, commercial research laboratories have shied away from collaboration, but the last several years have found researchers more open to collaborative work. This increased collaboration allows scientists to bring their different expertise to a problem or question and speed up discovery. According to Dr. Kristin Riching, Senior Research Scientist at Promega Corporation, “Targeted protein degraders have broken many of the rules that have guided traditional drug development, but it is exciting to see how the collective learnings we gain from their study can aid the advancement of this new class of molecules to the clinic as effective therapeutics.”

Literature Reviewed

Bai, N., Riching, K.M., et al. (2022) Modeling the CRL4A ligase complex to predict target protein ubiquitination induced by cereblon-recruiting PROTACs. J. Biol. Chem.

The researchers used NanoBRET assays as part of their model validation. Learn more about NanoBRET technology at the Promega.com website.

SOURCE

https://www.promegaconnections.com/protac-ternary-complex/?utm_campaign=ms-2022-pharma_tpd&utm_source=linkedin&utm_medium=Khoros&utm_term=sf254230485&utm_content=030822ct-blogsf254230485&sf254230485=1


 

Medical Startups – Artificial Intelligence (AI) Startups in Healthcare

Reporters: Stephen J. Williams, PhD; Aviva Lev-Ari, PhD, RN; and Shraga Rottem, MD, DSc

The motivation for this post is threefold:

First, we present an application of AI, NLP, and DL to our own medical text in the genomics space. Here we present the first section of Part 1 of the following book. Part 1 has six subsections that yielded 12 plots. The entire book is represented by 38 x 2 = 76 plots.

Second, we bring to the attention of the e-Reader the list of 276 Medical Startups – Artificial Intelligence (AI) Startups in Healthcare as a hot universe of R&D activity in Human Health.

Third, we highlight one academic center with an AI focus.

Dear friends of the ETH AI Center,

We would like to provide you with some exciting updates from the ETH AI Center and its growing community. The ETH AI Center now comprises 110 research groups in the faculty, 20 corporate partners and has led to nine AI startups.

As the Covid-19 restrictions in Switzerland have recently been lifted, we would like to hear from you what kind of events you would like to see in 2022! Participate in the survey to suggest event formats and topics that you would enjoy being a part of. We are already excited to learn what we can achieve together this year.

We already have many interesting events coming up; we look forward to seeing you at our main and community events!

SOURCE

https://news.ethz.ch/html_mail.jsp?params=%2FUnFXUQJ%2FmiOP6akBq8eHxaXG%2BRdNmeoVa9gX5ArpTr6mX74xp5d78HhuIHTd9V6AHtAfRahyx%2BfRGrzVL1G8Jy5e3zykvr1WDtMoUC%2B7vILoHCGQ5p1rxaPzOsF94ID

 

 

LPBI Group is applying AI for Medical Text Analysis with Machine Learning and Natural Language Processing: Statistical and Deep Learning

Our Book 

Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS & BioInformatics, Simulations and the Genome Ontology

Medical Text Analysis of this Book shows the following results, obtained by Madison Davis by applying Wolfram NLP for Biological Languages to our own text. See an example below:

Part 1: Next Generation Sequencing (NGS)

 

1.1 The NGS Science

1.1.1 BioIT Aspect

 

Hypergraph Plot #1 and Tree Diagram Plot #1 for Section 1.1.1, based on 16 articles and 12 keywords:

protein, cancer, dna, genes, rna, survival, immune, tumor, patients, human, genome, expression
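For readers curious how such keyword co-occurrence structure can be computed, here is a minimal Python sketch using networkx; the article-keyword sets below are hypothetical placeholders, and the published plots themselves were produced with Wolfram's NLP tooling, not this code:

```python
# Building a keyword co-occurrence graph from article keyword sets.
# The article -> keyword mapping below is a hypothetical placeholder
# (the real analysis drew on 16 articles and 12 keywords).
from itertools import combinations
import networkx as nx

articles = [
    {"protein", "cancer", "expression"},
    {"dna", "genes", "genome"},
    {"cancer", "tumor", "patients", "survival"},
]

G = nx.Graph()
for keywords in articles:
    for a, b in combinations(sorted(keywords), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1        # count repeated co-occurrences
        else:
            G.add_edge(a, b, weight=1)

# Each edge links two keywords that appear in the same article.
print(sorted(G.edges(data="weight")))
```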


@MIT Artificial intelligence system rapidly predicts how two proteins will attach: The model, called EquiDock, focuses on rigid-body docking, which occurs when two proteins attach by rotating or translating in 3D space without their shapes squeezing or bending

Reporter: Aviva Lev-Ari, PhD, RN

This paper introduces a novel SE(3) equivariant graph matching network, along with a keypoint discovery and alignment approach, for the problem of protein-protein docking, with a novel loss based on optimal transport. The overall consensus is that this is an impactful solution to an important problem, whereby competitive results are achieved without the need for templates, refinement, and are achieved with substantially faster run times.
28 Sept 2021 (modified: 18 Nov 2021), ICLR 2022 Spotlight

Keywords: protein complexes, protein structure, rigid body docking, SE(3) equivariance, graph neural networks

Abstract: Protein complex formation is a central problem in biology, being involved in most of the cell’s processes, and essential for applications such as drug design or protein engineering. We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no three-dimensional flexibility during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right location and the right orientation relative to the second protein. We mathematically guarantee that the predicted complex is always identical regardless of the initial placements of the two structures, avoiding expensive data augmentation. Our model approximates the binding pocket and predicts the docking pose using keypoint matching and alignment through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running time improvements over existing protein docking software and predict qualitatively plausible protein complex structures despite not using heavy sampling, structure refinement, or templates.

One-sentence Summary: We perform rigid protein docking using a novel independent SE(3)-equivariant message passing mechanism that guarantees the same resulting protein complex independent of the initial placement of the two 3D structures.
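The abstract’s mention of a differentiable Kabsch algorithm refers to a classical alignment step. The sketch below shows the standard (NumPy, non-differentiable) Kabsch computation of the optimal rotation and translation between matched point sets; it illustrates the alignment idea only and is not EquiDock’s actual code:

```python
# Classical Kabsch algorithm: given two matched point clouds, find the
# rotation R and translation t that minimize the RMSD between them.
# Illustration of the alignment step only, not EquiDock's implementation.
import numpy as np

def kabsch(P: np.ndarray, Q: np.ndarray):
    """Return (R, t) such that R @ p + t superposes each point p of P onto Q."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)            # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # correct for improper rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # optimal rotation
    t = q_mean - R @ p_mean
    return R, t

rng = np.random.default_rng(0)
P = rng.random((10, 3))
R_true = np.linalg.qr(rng.normal(size=(3, 3)))[0]
R_true *= np.linalg.det(R_true)                  # force a proper rotation (det = +1)
Q = P @ R_true.T + 0.5                           # rotated and shifted copy of P
R, t = kabsch(P, Q)
print(np.allclose(P @ R.T + t, Q, atol=1e-6))    # True: alignment recovered
```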
 
SOURCE
 

MIT researchers created a machine-learning model that can directly predict the complex that will form when two proteins bind together. Their technique is between 80 and 500 times faster than state-of-the-art software methods, and often predicts protein structures that are closer to actual structures that have been observed experimentally.

This technique could help scientists better understand some biological processes that involve protein interactions, like DNA replication and repair; it could also speed up the process of developing new medicines.

“Deep learning is very good at capturing interactions between different proteins that are otherwise difficult for chemists or biologists to write experimentally. Some of these interactions are very complicated, and people haven’t found good ways to express them. This deep-learning model can learn these types of interactions from data,” says Octavian-Eugen Ganea, a postdoc in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

Ganea’s co-lead author is Xinyuan Huang, a graduate student at ETH Zurich. MIT co-authors include Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in CSAIL, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering in CSAIL and a member of the Institute for Data, Systems, and Society. The research will be presented at the International Conference on Learning Representations.

Significance of the Scientific Development by the @MIT Team

EquiDock wide applicability:

  • Our method can be integrated end-to-end to boost the quality of other models (see above discussion on runtime importance). Examples are predicting functions of protein complexes [3] or their binding affinity [5], de novo generation of proteins binding to specific targets (e.g., antibodies [6]), modeling back-bone and side-chain flexibility [4], or devising methods for non-binary multimers. See the updated discussion in the “Conclusion” section of our paper.

 

Advantages over previous methods:

  • Our method does not rely on templates or heavy candidate sampling [7], aiming at the ambitious goal of predicting the complex pose directly. This should be interpreted in terms of generalization (to unseen structures) and scalability capabilities of docking models, as well as their applicability to various other tasks (discussed above).

 

  • Our method obtains a competitive quality without explicitly using previous geometric (e.g., 3D Zernike descriptors [8]) or chemical (e.g., hydrophilic information) features [3]. Future EquiDock extensions would find creative ways to leverage these different signals and, thus, obtain more improvements.

   

Novelty of theory:

  • Our work is the first to formalize the notion of pairwise independent SE(3)-equivariance. Previous work (e.g., [9,10]) has incorporated only single object Euclidean-equivariances into deep learning models. For tasks such as docking and binding of biological objects, it is crucial that models understand the concept of multi-independent Euclidean equivariances.

  • All propositions in Section 3 are our novel theoretical contributions.

  • We have rewritten the Contribution and Related Work sections to clarify this aspect.

   


Footnote [a]: We have fixed an important bug in the cross-attention code. We have done a more extensive hyperparameter search and understood that layer normalization is crucial in layers used in Eqs. 5 and 9, but not on the h embeddings as it was originally shown in Eq. 10. We have seen benefits from training our models with a longer patience in the early stopping criteria (30 epochs for DIPS and 150 epochs for DB5). Increasing the learning rate to 2e-4 is important to speed-up training. Using an intersection loss weight of 10 leads to improved results compared to the default of 1.

 

Bibliography:

[1] Protein-ligand blind docking using QuickVina-W with inter-process spatio-temporal integration, Hassan et al., 2017

[2] GNINA 1.0: molecular docking with deep learning, McNutt et al., 2021

[3] Protein-protein and domain-domain interactions, Kangueane and Nilofer, 2018

[4] Side-chain Packing Using SE(3)-Transformer, Jindal et al., 2022

[5] Contacts-based prediction of binding affinity in protein–protein complexes, Vangone et al., 2015

[6] Iterative refinement graph neural network for antibody sequence-structure co-design, Jin et al., 2021

[7] Hierarchical, rotation-equivariant neural networks to select structural models of protein complexes, Eismann et al, 2020

[8] Protein-protein docking using region-based 3D Zernike descriptors, Venkatraman et al., 2009

[9] SE(3)-transformers: 3D roto-translation equivariant attention networks, Fuchs et al, 2020

[10] E(n) equivariant graph neural networks, Satorras et al., 2021

[11] Fast end-to-end learning on protein surfaces, Sverrisson et al., 2020

SOURCE

https://openreview.net/forum?id=GQjaI9mLet


Reporter: Stephen J. Williams, Ph.D.

From: Heidi Rehm et al. GA4GH: International policies and standards for data sharing across genomic research and healthcare. (2021): Cell Genomics, Volume 1, Issue 2.

Source: https://doi.org/10.1016/j.xgen.2021.100029

Highlights

  • Siloing genomic data in institutions/jurisdictions limits learning and knowledge
  • GA4GH policy frameworks enable responsible genomic data sharing
  • GA4GH technical standards ensure interoperability, broad access, and global benefits
  • Data sharing across research and healthcare will extend the potential of genomics

Summary

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

In order for genomic and personalized medicine to come to fruition, it is imperative that data siloes around the world are broken down, allowing international collaboration in the collection, storage, transfer, access, and analysis of molecular and health-related data.

We have talked on this site in numerous articles about the problems data siloes produce. By data siloes we mean that the collection and storage of not only data but also intellectual thought are held behind physical, electronic, and intellectual walls, inaccessible to other scientists not belonging to a particular institution or even a collaborative network.

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

Standardization and harmonization of data are key to this effort to share electronic records. The EU has taken bold action in this matter. The following section is about the General Data Protection Regulation of the EU and can be found at the following link:

https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en

Fundamental rights

The EU Charter of Fundamental Rights stipulates that EU citizens have the right to protection of their personal data.

Protection of personal data

Legislation

The data protection package adopted in May 2016 aims at making Europe fit for the digital age. More than 90% of Europeans say they want the same data protection rights across the EU and regardless of where their data is processed.

The General Data Protection Regulation (GDPR)

Regulation (EU) 2016/679 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data. This text includes the corrigendum published in the OJEU of 23 May 2018.

The regulation is an essential step to strengthen individuals’ fundamental rights in the digital age and facilitate business by clarifying rules for companies and public bodies in the digital single market. A single law will also do away with the current fragmentation in different national systems and unnecessary administrative burdens.

The regulation entered into force on 24 May 2016 and applies since 25 May 2018. More information for companies and individuals.

Information about the incorporation of the General Data Protection Regulation (GDPR) into the EEA Agreement.

EU Member States notifications to the European Commission under the GDPR

The Data Protection Law Enforcement Directive

Directive (EU) 2016/680 on the protection of natural persons regarding processing of personal data connected with criminal offences or the execution of criminal penalties, and on the free movement of such data.

The directive protects citizens’ fundamental right to data protection whenever personal data is used by criminal law enforcement authorities for law enforcement purposes. It will in particular ensure that the personal data of victims, witnesses, and suspects of crime are duly protected and will facilitate cross-border cooperation in the fight against crime and terrorism.

The directive entered into force on 5 May 2016 and EU countries had to transpose it into their national law by 6 May 2018.

The following paper by the organization The Global Alliance for Genomics and Health discusses these types of collaborative efforts to break down data silos in personalized medicine. This organization has over 2,000 subscribers in over 90 countries, encompassing over 60 organizations.

Enabling responsible genomic data sharing for the benefit of human health

The Global Alliance for Genomics and Health (GA4GH) is a policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework.

The Global Alliance for Genomics and Health (GA4GH) is an international, nonprofit alliance formed in 2013 to accelerate the potential of research and medicine to advance human health. Bringing together 600+ leading organizations working in healthcare, research, patient advocacy, life science, and information technology, the GA4GH community is working together to create frameworks and standards to enable the responsible, voluntary, and secure sharing of genomic and health-related data. All of our work builds upon the Framework for Responsible Sharing of Genomic and Health-Related Data.

GA4GH Connect is a five-year strategic plan that aims to drive uptake of standards and frameworks for genomic data sharing within the research and healthcare communities in order to enable responsible sharing of clinical-grade genomic data by 2022. GA4GH Connect links our Work Streams with Driver Projects—real-world genomic data initiatives that help guide our development efforts and pilot our tools.

From the article on Cell Genomics GA4GH: International policies and standards for data sharing across genomic research and healthcare

Source: Open Access, https://doi.org/10.1016/j.xgen.2021.100029

The Global Alliance for Genomics and Health (GA4GH) is a worldwide alliance of genomics researchers, data scientists, healthcare practitioners, and other stakeholders. We are collaborating to establish policy frameworks and technical standards for responsible, international sharing of genomic and other molecular data as well as related health data. Founded in 2013, the GA4GH community now consists of more than 1,000 individuals across more than 90 countries working together to enable broad sharing that transcends the boundaries of any single institution or country (see https://www.ga4gh.org). In this perspective, we present the strategic goals of GA4GH and detail current strategies and operational approaches to enable responsible sharing of clinical and genomic data, through both harmonized data aggregation and federated approaches, to advance genomic medicine and research. We describe technical and policy development activities of the eight GA4GH Work Streams and implementation activities across 24 real-world genomic data initiatives (“Driver Projects”). We review how GA4GH is addressing the major areas in which genomics is currently deployed including rare disease, common disease, cancer, and infectious disease. Finally, we describe differences between genomic sequence data that are generated for research versus healthcare purposes, and define strategies for meeting the unique challenges of responsibly enabling access to data acquired in the clinical setting.

GA4GH organization

GA4GH has partnered with 24 real-world genomic data initiatives (Driver Projects) to ensure its standards are fit for purpose and driven by real-world needs. Driver Projects make a commitment to help guide GA4GH development efforts and pilot GA4GH standards (see Table 2). Each Driver Project is expected to dedicate at least two full-time equivalents to GA4GH standards development, which takes place in the context of GA4GH Work Streams (see Figure 1). Work Streams are the key production teams of GA4GH, tackling challenges in eight distinct areas across the data life cycle (see Box 1). Work Streams consist of experts from their respective sub-disciplines and include membership from Driver Projects as well as hundreds of other organizations across the international genomics and health community.

Figure 1: Matrix structure of the Global Alliance for Genomics and Health


Box 1: GA4GH Work Stream focus areas

The GA4GH Work Streams are the key production teams of the organization. Each tackles a specific area in the data life cycle, as described below (URLs listed in the web resources).

  • (1) Data use & researcher identities: Develops ontologies and data models to streamline global access to datasets generated in any country
  • (2) Genomic knowledge standards: Develops specifications and data models for exchanging genomic variant observations and knowledge
  • (3) Cloud: Develops federated analysis approaches to support the statistical rigor needed to learn from large datasets
  • (4) Data privacy & security: Develops guidelines and recommendations to ensure identifiable genomic and phenotypic data remain appropriately secure without sacrificing their analytic potential
  • (5) Regulatory & ethics: Develops policies and recommendations for ensuring individual-level data are interoperable with existing norms and follow core ethical principles
  • (6) Discovery: Develops data models and APIs to make data findable, accessible, interoperable, and reusable (FAIR); a query sketch follows this list
  • (7) Clinical & phenotypic data capture & exchange: Develops data models to ensure genomic data are most impactful through rich metadata collected in a standardized way
  • (8) Large-scale genomics: Develops APIs and file formats to ensure harmonized technological platforms can support large-scale computing
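
To make the Discovery Work Stream's goal concrete, here is a minimal sketch of a variant-presence query against a Beacon-style discovery service. The endpoint URL and genomic coordinates below are placeholders, not a real service; the query fields follow the public GA4GH Beacon v1 protocol.

```python
import requests

# Hypothetical Beacon-style endpoint; substitute a real GA4GH Beacon service.
BEACON_URL = "https://beacon.example.org/query"

# Ask whether any dataset behind the beacon contains a specific variant,
# without retrieving any individual-level records (a FAIR "findable" query).
params = {
    "assemblyId": "GRCh38",
    "referenceName": "17",
    "start": 43045700,        # placeholder 0-based position on chromosome 17
    "referenceBases": "G",
    "alternateBases": "A",
}

response = requests.get(BEACON_URL, params=params, timeout=30)
response.raise_for_status()

# A beacon answers yes/no at the dataset level rather than returning records,
# which is what keeps the query privacy-preserving.
print("Variant observed:", response.json().get("exists"))
```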

For more articles on Open Access, Science 2.0, and Data Networks for Genomics on this Open Access Scientific Journal see:

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

eScientific Publishing a Case in Point: Evolution of Platform Architecture Methodologies and of Intellectual Property Development (Content Creation by Curation) Business Model 

UK Biobank Makes Available 200,000 whole genomes Open Access

Systems Biology Analysis of Transcription Networks, Artificial Intelligence, and High-End Computing Coming to Fruition in Personalized Oncology

Read Full Post »

The Vibrant Philly Biotech Scene: Proteovant Therapeutics Using Artificial Intelligence and Machine Learning to Develop PROTACs

Reporter: Stephen J. Williams, Ph.D.

It has been a while since I have added to this series, but there has been a plethora of exciting biotech startups in the Philadelphia area, many of them combining technology, biotech, and machine learning. One such exciting biotech is Proteovant Therapeutics, which is combining the new PROTAC (Proteolysis-Targeting Chimera) technology with its in-house ability to use machine learning and artificial intelligence to design these types of compounds against multiple intracellular targets.

PROTAC is actually a trademark of Arvinas Operations; the class is also referred to as protein degraders. PROTACs take advantage of the cell's protein-homeostatic mechanism of ubiquitin-mediated protein degradation, a highly specific, targeted process that regulates the protein levels of various transcription factors, proto-oncogenes, and receptors. In essence, this regulated proteolytic process is needed for normal cellular function, and alterations in it may lead to oncogenesis, or to a proteotoxic crisis leading to mitophagy, autophagy, and cellular death. The key to this technology is using chemical linkers to associate an E3 ligase with a protein target of interest. E3 ligases catalyze the rate-limiting step in marking proteins bound for degradation by the proteasome with ubiquitin chains.

Model of PROTAC Ternary Complex

A review of this process as well as PROTACs can be found elsewhere in articles (and future articles) on this Open Access Journal.

Proteovant has formed two important collaborations:

  1. Oncopia Therapeutics: came out of the University of Michigan Innovation Hub and the lab of Shaomeng Wang, who developed a library of BET- and MDM2-based protein degraders. Oncopia was acquired by Roivant Sciences in 2020.
  2. Roivant Sciences: uses computer-aided design of protein degraders.

Proteovant Company Description:

Proteovant is a newly launched development-stage biotech company focusing on discovery and development of disease-modifying therapies by harnessing natural protein homeostasis processes. We have recently acquired numerous assets at discovery and development stages from Oncopia, a protein degradation company. Our lead program is on track to enter IND in 2021. Proteovant is building a strong drug discovery engine by combining deep drugging expertise with innovative platforms including Roivant’s AI capabilities to accelerate discovery and development of protein degraders to address unmet needs across all therapeutic areas. The company has recently secured $200M funding from SK Holdings in addition to investment from Roivant Sciences. Our current therapeutic focus includes but is not limited to oncology, immunology and neurology. We remain agnostic to therapeutic area and will expand therapeutic focus based on opportunity. Proteovant is expanding its discovery and development teams and has multiple positions in biology, chemistry, biochemistry, DMPK, bioinformatics and CMC at many levels. Our R&D organization is located close to major pharmaceutical companies in Eastern Pennsylvania with a second site close to biotech companies in Boston area.

Protein degradation

Source: Proteovant

The ubiquitin proteasome system (UPS) is responsible for maintaining protein homeostasis. Targeted protein degradation by the UPS is a cellular process that involves marking proteins and guiding them to the proteasome for destruction. We leverage this physiological cellular machinery to target and destroy disease-causing proteins.

Unlike traditional small molecule inhibitors, our approach is not limited by the classic “active site” requirements. For example, we can target transcription factors and scaffold proteins that lack a catalytic pocket. These classes of proteins have historically been very difficult to drug. Further, we selectively degrade target proteins rather than isozymes or paralogous proteins with high homology. Because of the catalytic nature of the interactions, it is possible to achieve efficacy at lower doses with prolonged duration while decreasing dose-limiting toxicities.

Biological targets once deemed “undruggable” are now within reach.

About Roivant Sciences: from PRNewswire https://www.prnewswire.com/news-releases/roivant-unveils-targeted-protein-degradation-platform-301186928.html

Roivant develops transformative medicines faster by building technologies and developing talent in creative ways, leveraging the Roivant platform to launch “Vants” – nimble and focused biopharmaceutical and health technology companies. These Vants include Proteovant, but also Dermavant, Immunovant, and others.

Roivant’s drug discovery capabilities include the leading computational physics-based platform for in silico drug design and optimization as well as machine learning-based models for protein degradation.

The integration of our computational and experimental engines enables the rapid design of molecules with high precision and fidelity to address challenging targets for diseases with high unmet need.

Our current modalities include small molecules, heterobifunctionals and molecular glues.

Roivant Unveils Targeted Protein Degradation Platform

– First therapeutic candidate on track to enter clinical studies in 2021

– Computationally-designed degraders for six targets currently in preclinical development

– Acquisition of Oncopia Therapeutics and research collaboration with lab of Dr. Shaomeng Wang at the University of Michigan to add diverse pipeline of current and future compounds

– Clinical-stage degraders will provide foundation for multiple new Vants in distinct disease areas

– Platform supported by $200 million strategic investment from SK Holdings

Other articles in this Vibrant Philly Biotech Scene on this Online Open Access Journal include:

The Vibrant Philly Biotech Scene: PCCI Meeting Announcement, BioDetego Presents Colon Cancer Diagnostic Tool

The Vibrant Philly Biotech Scene: Focus on KannaLife Sciences and the Discipline and Potential of Pharmacognosy

The Vibrant Philly Biotech Scene: Focus on Vaccines and Philimmune, LLC

The Vibrant Philly Biotech Scene: Focus on Computer-Aided Drug Design and Gfree Bio, LLC

Philly Biotech Scene: Biobots and 3D BioPrinting (Now called Allevi)

Philly Biotech Scene: November 2015 PCCI Meeting Showcasing ViFant (Penn Center For Innovation)

Spark Therapeutics’ $4.8Billion deal Confirmed as Biggest VC-backed Exit in Philadelphia

Read Full Post »

A laboratory for the use of AI in drug development has been launched in collaboration with Pfizer, Teva, AstraZeneca, Merck, and Amazon

Reporter: Aviva Lev-Ari, PhD, RN

AION Labs unites pharma companies, technology companies, and investment funds, including IBF, to invest in startups that integrate developments in cloud computing and artificial intelligence to improve drug development capabilities. AION Labs, the first innovation lab of its kind in the world and a pioneer in adopting cloud technologies, artificial intelligence, and computer science to solve the R&D challenges of the pharma industry, today announces its launch. The alliance brings together four leading pharmaceutical companies – AstraZeneca, Merck, Pfizer, and Teva – and two leading players in high-tech and biotech investment, respectively: AWS (Amazon Web Services, Inc.) and the Israeli investment fund IBF (Israel Biotech Fund). Together they will establish groundbreaking ventures that use artificial intelligence and computer science to change the way new therapies are discovered and developed.

“We are excited to launch the new innovation lab in favor of discoveries of drugs and medical devices using groundbreaking computational tools,” said Mati Gill, CEO of AION Labs. “We are prepared and ready to make a difference in the process of therapeutic discoveries and their development. With a strong pool of talent from Israel and the world, cloud technology and artificial intelligence at the heart of our activities, and a significant commitment by the State of Israel, we are ready to contribute to the health and well-being of the human race and to promote industry in Israel. I thank the partners for their trust, and it is an honor for me to lead such a significant initiative.”

In addition, AION Labs has announced a strategic partnership with BioMed X, an independent biomedical research institute operating in Heidelberg, Germany. BioMed X has a proven track record of advancing research innovation in biomedicine at the interface between academic research and the pharmaceutical industry. BioMed X’s innovation model, based on global crowdsourcing and incubators to cultivate the most brilliant talent and ideas, will serve as the R&D engine driving AION Labs’ venture model.

SOURCE

Read Full Post »

Science Policy Forum: Should we trust healthcare explanations from AI predictive systems?

Some in industry voice their concerns

Curator: Stephen J. Williams, PhD

Post on AI healthcare and explainable AI

In a Policy Forum article in Science, “Beware explanations from AI in health care”, Boris Babic, Sara Gerke, Theodoros Evgeniou, and Glenn Cohen discuss the caveats of relying on explainable versus interpretable artificial intelligence (AI) and machine learning (ML) algorithms to make complex health decisions. The FDA has already approved some AI/ML algorithms that analyze medical images for diagnostic purposes. These have been discussed in prior posts on this site, along with issues arising from multi-center trials. The authors of this perspective article argue that the choice between explainable and interpretable algorithms may have far-reaching consequences in health care.

Summary

Artificial intelligence and machine learning (AI/ML) algorithms are increasingly developed in health care for diagnosis and treatment of a variety of medical conditions (1). However, despite the technical prowess of such systems, their adoption has been challenging, and whether and how much they will actually improve health care remains to be seen. A central reason for this is that the effectiveness of AI/ML-based medical devices depends largely on the behavioral characteristics of its users, who, for example, are often vulnerable to well-documented biases or algorithmic aversion (2). Many stakeholders increasingly identify the so-called black-box nature of predictive algorithms as the core source of users’ skepticism, lack of trust, and slow uptake (3, 4). As a result, lawmakers have been moving in the direction of requiring the availability of explanations for black-box algorithmic decisions (5). Indeed, a near-consensus is emerging in favor of explainable AI/ML among academics, governments, and civil society groups. Many are drawn to this approach to harness the accuracy benefits of noninterpretable AI/ML such as deep learning or neural nets while also supporting transparency, trust, and adoption. We argue that this consensus, at least as applied to health care, both overstates the benefits and undercounts the drawbacks of requiring black-box algorithms to be explainable.

Source: https://science.sciencemag.org/content/373/6552/284?_ga=2.166262518.995809660.1627762475-1953442883.1627762475

Types of AI/ML Algorithms: Explainable and Interpretable algorithms

  1. Interpretable AI: A typical AI/ML task requires constructing an algorithm that maps vector inputs to an output related to an outcome (like diagnosing a cardiac event from an image). Generally, the algorithm has to be trained on past data with known parameters. When an algorithm is called interpretable, it uses a transparent or “white box” function that is easily understandable; an example might be a linear function for determining relationships, whose parameters are simple rather than complex. Although interpretable algorithms may not be as accurate as the more complex explainable AI/ML algorithms, they are open, transparent, and easily understood by their operators.
  2. Explainable AI/ML: This type of algorithm depends on multiple complex parameters. A first round of predictions comes from a “black box” model, and a second algorithm, built from an interpretable function, is then used to approximate the outputs of the first model. The second algorithm is trained not on the original data but on the first model’s predictions, over multiple iterations of computing. This method is therefore deemed more accurate or more reliable in prediction; however, it is very complex and not easily understandable. Many medical devices that use an AI/ML algorithm are of this type. An example is deep learning and neural networks. (A minimal contrast between the two approaches is sketched in code below.)

Both methodologies aim to deal with the problem of opacity: AI predictions coming from a black box undermine trust in the AI.
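
The contrast can be made concrete in a few lines. The following is a minimal sketch, assuming scikit-learn and a synthetic dataset; the models and numbers are illustrative, not the authors' method. It shows a directly interpretable logistic regression versus a black-box random forest explained post hoc by a shallow surrogate tree trained on the black box's predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)

# (1) Interpretable AI: coefficients of a "white box" model are directly readable.
interpretable = LogisticRegression(max_iter=1000).fit(X, y)
print("Readable coefficients:", interpretable.coef_.round(2))

# (2) Explainable AI/ML: a black-box model, plus a simple surrogate trained on
# the black box's *predictions* rather than on the original labels.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
bb_probs = black_box.predict_proba(X)[:, 1]
surrogate = DecisionTreeRegressor(max_depth=3).fit(X, bb_probs)

# The shallow tree approximates (explains) the black box's behavior; its R^2
# against the black-box outputs measures how faithful the explanation is.
print("Surrogate fidelity (R^2):", round(surrogate.score(X, bb_probs), 2))
```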

For a deeper understanding of these two types of algorithms see here:

https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

or https://www.bmc.com/blogs/machine-learning-interpretability-vs-explainability/

(a longer read but great explanation)

From the above blog post by Jonathan Johnson:

  • How interpretability is different from explainability
  • Why a model might need to be interpretable and/or explainable
  • Who is working to solve the black box problem—and how

What is interpretability?

Does Chipotle make your stomach hurt? Does loud noise accelerate hearing loss? Are women less aggressive than men? If a machine learning model can create a definition around these relationships, it is interpretable.

All models must start with a hypothesis. Human curiosity propels a being to intuit that one thing relates to another. “Hmm…multiple black people shot by policemen…seemingly out of proportion to other races…something might be systemic?” Explore.

People create internal models to interpret their surroundings. In the field of machine learning, these models can be tested and verified as either accurate or inaccurate representations of the world.

Interpretability means that the cause and effect can be determined.

What is explainability?

ML models are often called black-box models because they allow a pre-set number of empty parameters, or nodes, to be assigned values by the machine learning algorithm. Specifically, the back-propagation step is responsible for updating the weights based on its error function.
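
The back-propagation step mentioned here is easy to see in code. Below is a minimal sketch, assuming PyTorch; the network and data are toy placeholders, shown only to illustrate weights being updated from an error function.

```python
import torch
import torch.nn as nn

# A tiny black-box model: 4 inputs, one hidden layer, one output logit.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.BCEWithLogitsLoss()          # the error function

x = torch.randn(32, 4)                    # a batch of 32 examples
y = torch.randint(0, 2, (32, 1)).float()  # binary outcomes

optimizer.zero_grad()
loss = loss_fn(model(x), y)   # measure the error
loss.backward()               # back-propagation: gradients w.r.t. every weight
optimizer.step()              # update the weights based on the error
```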

To predict when a person might die—the fun gamble one might play when calculating a life insurance premium, and the strange bet a person makes against their own life when purchasing a life insurance package—a model will take in its inputs, and output a percent chance the given person has at living to age 80.

Below is an image of a neural network. The inputs are the yellow; the outputs are the orange. Like a rubric to an overall grade, explainability shows how significantly each of the parameters, all the blue nodes, contributes to the final decision.

In this neural network, the hidden layers (the two columns of blue dots) would be the black box.

For example, we have these data inputs:

  • Age
  • BMI score
  • Number of years spent smoking
  • Career category

If this model had high explainability, we’d be able to say, for instance (see the sketch after this list):

  • The career category is about 40% important
  • The number of years spent smoking weighs in at 35% important
  • The age is 15% important
  • The BMI score is 10% important
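
One way such percentages can be produced in practice is permutation importance, normalized to sum to 100%. Below is a hedged sketch, assuming scikit-learn; the data are synthetic and the feature names simply mirror the hypothetical inputs above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(20, 80, n),   # age
    rng.normal(26, 4, n),      # BMI score
    rng.integers(0, 40, n),    # number of years spent smoking
    rng.integers(0, 5, n),     # career category (integer-encoded)
])
# Synthetic outcome ("lives to 80"), loosely driven by smoking and age.
y = (X[:, 2] * 0.05 + X[:, 0] * 0.02 + rng.normal(0, 1, n) < 2.5).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Normalize raw importances into the percentage framing used in the text.
pct = 100 * imp.importances_mean / imp.importances_mean.sum()
for name, p in zip(["age", "BMI score", "smoking years", "career"], pct):
    print(f"{name}: {p:.0f}% important")
```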

Explainability: important, not always necessary

Explainability becomes significant in the field of machine learning because, often, it is not apparent. Explainability is often unnecessary. A machine learning engineer can build a model without ever having considered the model’s explainability. It is an extra step in the building process—like wearing a seat belt while driving a car. It is unnecessary for the car to perform, but offers insurance when things crash.

The benefit a deep neural net offers to engineers is that it creates a black box of parameters, like fake additional data points, that the model can base its decisions on. These fake data points go unknown to the engineer. The black box, or hidden layers, allows a model to make associations among the given data points to predict better results. For example, if we are deciding how long someone might have to live, and we use career data as an input, it is possible the model sorts the careers into high- and low-risk career options all on its own.

Perhaps we inspect a node and see it relates oil rig workers, underwater welders, and boat cooks to each other. It is possible the neural net makes connections between the lifespan of these individuals and puts a placeholder in the deep net to associate these. If we were to examine the individual nodes in the black box, we could note this clustering interprets water careers to be a high-risk job.

In the previous chart, each one of the lines connecting from the yellow dot to the blue dot can represent a signal, weighing the importance of that node in determining the overall score of the output.

  • If that signal is high, that node is significant to the model’s overall performance.
  • If that signal is low, the node is insignificant.

With this understanding, we can define explainability as:

Knowledge of what one node represents and how important it is to the model’s performance.
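
In that spirit, the "signal" leaving each input node can be read straight off the first layer's weights. Below is a minimal sketch, assuming PyTorch and an untrained toy network; mean absolute weight is only a crude proxy for importance, and dedicated explainability tools (e.g., SHAP) go much further.

```python
import torch
import torch.nn as nn

# Toy network: 4 inputs, one hidden layer (the "black box"), one output.
net = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(),
    nn.Linear(8, 1), nn.Sigmoid(),
)

first_layer = net[0].weight.detach().abs()  # shape: (hidden_units, inputs)
signal = first_layer.mean(dim=0)            # average |weight| per input node

for name, s in zip(["age", "BMI score", "smoking years", "career"], signal):
    print(f"{name}: mean |weight| = {s.item():.3f}")
```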

So how does the choice between these two types of algorithms make a difference with respect to health care and medical decision-making?

The authors argue: 

“Regulators like the FDA should focus on those aspects of the AI/ML system that directly bear on its safety and effectiveness – in particular, how does it perform in the hands of its intended users?”

The authors suggest:

  • Enhanced, more involved clinical trials
  • Providing individuals added flexibility when interacting with a model, for example inputting their own test data
  • More interaction between users and model generators
  • Determining which situations call for interpretable AI versus explainable AI (for instance, predicting which patients will require dialysis after kidney damage)

Other articles on AI/ML in medicine and healthcare on this Open Access Journal include

Applying AI to Improve Interpretation of Medical Imaging

Real Time Coverage @BIOConvention #BIO2019: Machine Learning and Artificial Intelligence #AI: Realizing Precision Medicine One Patient at a Time

LIVE Day Three – World Medical Innovation Forum ARTIFICIAL INTELLIGENCE, Boston, MA USA, Monday, April 10, 2019

Cardiac MRI Imaging Breakthrough: The First AI-assisted Cardiac MRI Scan Solution, HeartVista Receives FDA 510(k) Clearance for One Click™ Cardiac MRI Package

 

Read Full Post »

AI is on the way to lead critical ED decisions on CT

Curator and Reporter: Dr. Premalata Pati, Ph.D., Postdoc

Artificial intelligence (AI) has infiltrated many organizational processes, raising concerns that robotic systems will eventually replace many humans in decision-making. The advent of AI as a tool for improving health care provides new prospects to improve patients’ and clinical teams’ performance, reduce costs, and impact public health. Examples include, but are not limited to, automation; information synthesis for patients, “fRamily” (friends and family unpaid caregivers), and health care professionals; and suggestions and visualization of information for collaborative decision making.

In the emergency department (ED), patients with Crohn’s disease (CD) are routinely subjected to abdominopelvic computed tomography (APCT). It is necessary to diagnose clinically actionable findings (CAF), since they may require immediate intervention, which is typically surgical. Repeated APCTs, on the other hand, result in higher ionizing radiation exposure. The majority of APCT performance guidance is clinical and empiric, and emergency surgeons struggle to identify which Crohn’s disease patients actually require a CT scan to determine the source of acute abdominal distress.

Image Courtesy: Jim Coote via Pixabay https://www.aiin.healthcare/media/49446

Aid seems to be on the way. Researchers employed machine learning to accurately distinguish these patients from Crohn’s patients who present with the same complaint but can safely avoid the repeated exposure to contrast material and ionizing radiation that CT would otherwise impose on them.

The study entitled “Machine learning for selecting patients with Crohn’s disease for abdominopelvic computed tomography in the emergency department” was published on July 9 in Digestive and Liver Disease by gastroenterologists and radiologists at Tel Aviv University in Israel.

Retrospectively, Jacob Ollech and his fellow researchers analyzed 101 emergency department treatments of patients with Crohn’s disease who underwent abdominopelvic CT.

They were looking for examples where a scan revealed clinically actionable results. These were classified as intestinal blockage, perforation, intra-abdominal abscess, or complex fistula by the researchers.

On CT, 44 of the 101 cases reviewed (43.5%) had such findings.

Ollech and colleagues utilized a machine-learning technique to design a decision-support tool that required only four basic clinical factors to test an AI approach for making the call.

The approach was successful in categorizing patients into low- and high-risk groups, allowing the researchers to risk-stratify patients by the likelihood of clinically actionable findings on abdominopelvic CT.
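
In outline, such a decision-support tool can be a simple classifier whose predicted probability is binned into risk groups. The sketch below is a hedged illustration only: the four clinical factors, the model, and the thresholds are placeholders, not those of the published study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 101  # mirrors the cohort size reported above; data here are synthetic
X = rng.normal(size=(n, 4))        # four standardized clinical factors
y = rng.integers(0, 2, size=n)     # 1 = clinically actionable finding on CT

model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]   # likelihood of an actionable finding

# Bin predicted probability into risk groups (thresholds are illustrative).
risk = np.where(p < 0.2, "low", np.where(p < 0.6, "intermediate", "high"))

# Low-risk patients might safely defer CT; high-risk patients proceed to CT;
# intermediate-risk patients would still need a standard workup.
print(dict(zip(*np.unique(risk, return_counts=True))))
```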

Ollech and co-authors admit that their limited sample size, retrospective strategy, and lack of external validation are shortcomings.

Moreover, several patients fell into an intermediate risk category, implying that a standard workup would have been required to guide CT decision-making in a real-world situation anyhow.

Consequently, they reach the following conclusion:

We believe this study shows that a machine learning-based tool is a sound approach for better selecting patients with Crohn’s disease admitted to the ED with acute gastrointestinal complaints for abdominopelvic CT: reducing the number of CTs performed while ensuring that patients with high risk for clinically actionable findings undergo abdominopelvic CT appropriately.

Main Source:

Konikoff, Tom, Idan Goren, Marianna Yalon, Shlomit Tamir, Irit Avni-Biron, Henit Yanai, Iris Dotan, and Jacob E. Ollech. “Machine learning for selecting patients with Crohn’s disease for abdominopelvic computed tomography in the emergency department.” Digestive and Liver Disease (2021). https://www.sciencedirect.com/science/article/abs/pii/S1590865821003340

Other Related Articles published in this Open Access Online Scientific Journal include the following:

AI App for People with Digestive Disorders

Reporter: Irina Robu, Ph.D.

https://pharmaceuticalintelligence.com/2019/06/24/ai-app-for-people-with-digestive-disorders/

Machine Learning (ML) in cancer prognosis prediction helps the researcher to identify multiple known as well as candidate cancer driver genes

Curator and Reporter: Dr. Premalata Pati, Ph.D., Postdoc

https://pharmaceuticalintelligence.com/2021/05/04/machine-learning-ml-in-cancer-prognosis-prediction-helps-the-researcher-to-identify-multiple-known-as-well-as-candidate-cancer-diver-genes/

AI System Used to Detect Lung Cancer

Reporter: Irina Robu, Ph.D.

https://pharmaceuticalintelligence.com/2019/06/28/ai-system-used-to-detect-lung-cancer/

Artificial Intelligence: Genomics & Cancer

https://pharmaceuticalintelligence.com/ai-in-genomics-cancer/

Yet another Success Story: Machine Learning to predict immunotherapy response

Curator and Reporter: Dr. Premalata Pati, Ph.D., Postdoc

https://pharmaceuticalintelligence.com/2021/07/06/yet-another-success-story-machine-learning-to-predict-immunotherapy-response/

Systemic Inflammatory Diseases as Crohn’s disease, Rheumatoid Arthritis and Longer Psoriasis Duration May Mean Higher CVD Risk

Reporter: Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2017/10/09/systemic-inflammatory-diseases-as-crohns-disease-rheumatoid-arthritis-and-longer-psoriasis-duration-may-mean-higher-cvd-risk/

Autoimmune Inflammatory Bowel Diseases: Crohn’s Disease & Ulcerative Colitis: Potential Roles for Modulation of Interleukins 17 and 23 Signaling for Therapeutics

Curators: Larry H. Bernstein, MD, FCAP and Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2016/01/23/autoimmune-inflammtory-bowl-diseases-crohns-disease-ulcerative-colitis-potential-roles-for-modulation-of-interleukins-17-and-23-signaling-for-therapeutics/

Inflammatory Disorders: Inflammatory Bowel Diseases (IBD) – Crohn’s and Ulcerative Colitis (UC) and Others

Curators: Larry H. Bernstein, MD, FCAP and Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/gama-delta-epsilon-gde-is-a-global-holding-company-absorbing-lpbi/subsidiary-5-joint-ventures-for-ip-development-jvip/drug-discovery-with-3d-bioprinting/ibd-inflammatory-bowl-diseases-crohns-and-ulcerative-colitis/

Read Full Post »
