Medical Text Analysis (NLP)
Research Internship
Medical Text Analysis (MTA) with Natural Language Processing (NLP)
Employer Partner: Leaders in Pharmaceutical Business Intelligence (LPBI) Group
Career(s)
Life Science Research, Bioinformatics, AI, Machine Learning, Statistical NLP, Semantic Text Analysis, Big Learning
Overview
LPBI Internships offer the following:
- Affiliation with and mentorship by esteemed scientists and research graduate students.
- Skills development in NLP, ML, AI applications to Medical Text and Drug Discovery
- References/letter of recommendation
- Description of accomplishments and goals achieved during the internship
- Opportunity to contribute to publications
- Exploration of opportunities in life sciences in the US
- Opportunity to collaborate with professionals from various fields such as medicine and Natural Language Processing as well as with other interns.
Internship Description
This research internship introduces students to medical text analysis with NLP. Students learn to curate cutting-edge medical articles and study the data curation and data annotation methodologies that support Natural Language Processing applications for text analysis.
STEP 1: Domain Knowledge Expert Specifies the selection criteria for a collection of articles:
1.1 Curated & authored articles vs scientific reports
1.2 All articles in a chapter in a book, [N = 1,2,3, ..,18]
1.3 Selection of articles within a research category [N = 1,2,3, ..,730]
1.4 Selection of articles within several research categories
STEP 2: Create .TXT file for each article in the collection
STEP 3: Create one MERGED .TXT File for all the articles in the collection
STEP 4: Use WordItOut.com and .TXT file per article to generate one word cloud per article
4.1 Edit Graph – remove connective words
4.2 Upload word clouds to the media gallery and record article title as legend and source for the graph, add your name as image producer and date
4.3 Insert word cloud in the article following the author/curator’s name
4.4 Place word cloud in one PowerPoint presentation for the entire article collection
STEP 5: Use .TXT file per article to create a Bar Diagram for the word frequencies in the article
5.1 Edit bar diagram and remove connective words
5.2 Place each bar diagram in the PowerPoint presentation for the article collection
5.3 To generate the bar diagram USE Wolfram CODE and Instructions in Dropbox
STEP 6: Use the one MERGED .TXT file to create ONE Hyper-graph for the entire article collection
6.1 Edit hyper-graph
6.2 Place hyper-graph in the PowerPoint presentation
6.3 To generate the hyper-graph USE Wolfram CODE and Instructions in Dropbox
STEP 7: Use the one MERGED .TXT file to create ONE Tree Diagram for the entire article collection
7.1 Edit tree diagram
7.2 Place tree diagram in the PowerPoint presentation
7.3 To generate the tree diagram USE Wolfram CODE and Instructions in Dropbox
STEP 8: Transfer all visualizations in the PowerPoint presentation into a Domain Knowledge Expert Interpretation Folder
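The internship itself uses the Wolfram code and instructions kept in Dropbox, which are not reproduced here. Purely as an illustration of what STEPS 3 and 5 involve, the following is a minimal Python sketch, not the program's actual code. The function names, the folder layout (one .txt file per article in a single directory), and the short list of connective words are all assumptions made for this example.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical list of connective words to drop (see 4.1 and 5.1); extend as needed.
CONNECTIVES = {
    "a", "an", "and", "are", "as", "at", "by", "for", "in",
    "is", "of", "on", "or", "that", "the", "to", "with",
}


def merge_articles(article_dir: Path, merged_path: Path) -> Path:
    """STEP 3 (sketch): concatenate every per-article .txt file into one merged file."""
    parts = []
    for path in sorted(article_dir.glob("*.txt")):
        # Skip the merged file itself if it already sits in the same folder.
        if path.resolve() == merged_path.resolve():
            continue
        parts.append(path.read_text(encoding="utf-8").strip())
    merged_path.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return merged_path


def word_frequencies(text: str, top_n: int = 10) -> list[tuple[str, int]]:
    """STEP 5 (sketch): count words, ignoring case and connective words."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(w for w in words if w not in CONNECTIVES)
    return counts.most_common(top_n)
```

The (word, count) pairs returned by `word_frequencies` are the data a bar diagram or word cloud would display; running the function on the merged file gives collection-level counts instead of per-article counts.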
Types of Students Desired
College students majoring in life sciences or computer science
Master's students will be given additional challenging tasks
Structure/Schedule
- Cohort/Individual
- Summer and academic school year internships
- 16 weeks – (flexible schedule)
- Zoom 1 time per week with supervisor
- Zoom 1 time during internship with group
- Some additional group meetings related to code review, new code instructions, etc.
Internship Project Work Examples
- Sample project: Curate articles from various medical fields and extract specific information from articles on cancer.*
- Sample project: Curate Natural Language Processing resources for use in shaping a proof-of-concept pilot project to be used in LPBI Group’s business plan.
Skills Used/Gained
- Machine Learning (ML)
- Artificial Intelligence (AI)
- Applications to medical text analysis
- Article Classification
- Creation of Text File Format
- Merging Text File Formats
- Creation of word clouds
- Embedding word clouds in original articles
- Creation of bar diagrams
- Generate hyper-graphs from merged files
- Create tree diagrams from merged files
- PowerPoint presentation skills
- File management in Dropbox
* Other articles on genomics, cardiovascular, immunology, infectious diseases, metabolomics, precision medicine and reproductive genomics are available
Verifiable Certifications Offered
- Medical Text Analysis using statistical NLP, semantic NLP, and Deep Learning
ADDITIONAL INFORMATION
Medical Text Analysis using Wolfram Language for Biological Sciences
Work on ENTIRE books we published – one book per student [six are assigned already]
18 Books in Medicine
https://lnkd.in/ekWGNqA
Perform Deep Learning NLP on 25 Concepts in each article, as follows:
- Generate WordClouds for each article in the Book,
- Generate Bar Diagrams for each article in the Book,
- Generate Hypergraphs – ONE for all articles in EVERY chapter of the book assigned to the student
- Generate Tree Diagrams – ONE for all articles in EVERY chapter of the book assigned to the student
- Migrate all artifacts into a Knowledge Graph Flur.ee Blockchain Database – Open source – See #2, below
Trends in Development of Databases and Blockchain | IEEE Conference Publication | IEEE Xplore
- Create a PowerPoint Presentation ONE per Chapter in the Book – SEE attachment [Domain Knowledge Expert Interpretation – Work-in-Progress]
- All the PowerPoints will be included in a NEW GENRE of Scientific Book we will publish – Name of student will be on the NLP section
- Structure of the New Book:
Part 1: eTOCs of the Original Book in English and Spanish: Text & Audio Podcast – Original Authors
Part 2: Deep Learning Wolfram NLP – SEE attachment – NAME OF STUDENT who produced the NLP results with Wolfram
Part 3: Editorial [Preface, Introduction, Summary, Epilogue] – English Audio Podcast – Original Authors