Bioinformatics, MS Online

The last decade has seen unprecedented changes in biotech, biomedicine, biomanufacturing, and bioengineering. Most of this change is fueled by new genomics and other omics technologies that generate massive amounts of data at ever higher resolution, down to single molecules and single cells. The resulting data need to be interpreted carefully, because a single-base mutation (e.g., a SNP, or single-nucleotide polymorphism) could be the cause of a disease. The data volume is also enormous: biotech's version of Moore's law has data doubling every five months, compared with computers' doubling every eighteen months.

We at Tandon are educating and nurturing tomorrow's biotech rock stars, who can address infectious diseases (e.g., Zika or Ebola), genetic diseases (e.g., cancer, Alzheimer's, or autism), public health (Personalized Health Care Program, diabetes, or obesity), agriculture (e.g., GMOs, genetically modified organisms), and green technology (e.g., energy or greenhouse gas (GHG) sequestration).

This NY State-approved program meets industry's demand for professionals with solid foundations in genomics, proteomics, and transcriptomics; algorithms, statistics, and biotechnology; programming (Python, Perl, and R); data science, AI, and ML; and sequence and pathway analysis, as well as a host of genome informatics tools and algorithms such as BLAST, Biopython, BioPerl, Bioconductor, and the UCSC Genome Browser.

Tandon also provides a bridge program to prepare students with insufficient background in core computer science before admission.

Students who earn a Bioinformatics Advanced Certificate may apply those credits toward the Bioinformatics Master's Degree. Note that only 9 credits from the Advanced Certificate can be applied toward the Bioinformatics Master's Degree program.

    About Bioinformatics

    Bioinformatics is an academic field that seeks to create and advance algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.

    As the global bioinformatics market is projected to reach USD 16.18 billion by 2021 from USD 6.21 billion in 2016, growing at a CAGR (Compound Annual Growth Rate) of 21.1% during the forecast period, the demand for Bioinformatics professionals remains tremendously high.
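    The growth rate quoted above can be checked from the two endpoint figures. The following is a minimal sketch, assuming the forecast period is the five years from 2016 to 2021 and that CAGR is defined as (end / start)^(1 / years) - 1; the function name `cagr` is illustrative only.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.21 means 21%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the projection: USD 6.21 billion (2016) to USD 16.18 billion (2021).
# The five-year span is an assumption read off the two dates.
rate = cagr(6.21, 16.18, 2021 - 2016)
print(f"CAGR: {rate:.1%}")
```

    Under these assumptions the result agrees with the quoted figure of about 21.1% per year.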


    Our 30-credit multidisciplinary program offers the student a refined skill set including, but not limited to, functional annotation, statistical analysis, algorithm development, genomics, and proteomics.


    A sampling of companies and institutions that are hiring in the bioinformatics field.*

    • Albert Einstein College of Medicine
    • Celgene
    • Columbia University
    • Genewiz
    • Indegene
    • Lallemand
    • Memorial Sloan Kettering Cancer Center
    • Mount Sinai
    • New York Blood Center
    • New York Genome Center
    • NYU Langone Health
    • Regeneron
    • Simons Foundation
    • Weill Cornell Medical Center

    *Information compiled through indeed.com.


    More About the Program

    The faculty at NYU Tandon School of Engineering are highly regarded for their extensive knowledge and professional industry experience.

    Mgavi Elombe Brathwaite,
    NYU Tandon

    Marco Antoniotti,
    Università degli Studi di Milano Bicocca

    Vera Cherepinsky,
    Fairfield University

    Jonathan Flowers,
    NYU Tandon

    Erik K. Grimmelmann,
    NY Tech Alliance

    Benjamin Hughes,
    NYU Tandon

    Lavanya Kannan,
    NYU Langone

    Manpreet S. Katari,
    NYU Tandon

    Hossein Khiabanian,
    Rutgers Cancer Institute of New Jersey

    Judith Kleinberg

    Bud Mishra,
    NYU Tandon

    Ravi Sachidanandam,
    NYU Tandon

    Jeremy Seto,
    NYU Tandon

    Hear from our current students and alumni why they chose the Bioinformatics Master's Degree at NYU Tandon School of Engineering.

    The Bioinformatics program at NYU Tandon has a rigorous yet flexible curriculum. It not only allows us to tailor our curriculum toward our interest, it also prepares us to succeed as bioinformaticians.

    Less than a month after completing my degree, I was hired as the first Computer Systems Manager for the Bioinformatics Unit at NYC Department of Health and Mental Hygiene.

    -Jade Wang, Class of 2017; current Computer Systems Manager for the Bioinformatics Unit at NYC Department of Health and Mental Hygiene, Bureau of the Public Health Laboratory

    I came into NYU Tandon Bioinformatics with practically no programming experience. . . . There really was no need to worry as all the courses are tailored to make sure you understand the whole picture, the computer science/programming aspect and the biology aspects, not just one part or the other.

    -Carmen Wickware, Class of 2017; current PhD student in the Department of Animal Sciences at Purdue University

    My background is in quantitative finance, and I have some good programming skills. However, I had not had any biology since high school. The Tandon program got me up to speed very quickly. In just a year I have already been able to make a material contribution to the work at a cancer research lab in NYU Langone, and co-author a presentation made at this year's American Society of Clinical Oncology conference.

    -John Cadley, Class of 2017; current PhD student


    Below is a showcase of current NYU Tandon School of Engineering student projects from courses related to our Bioinformatics master's degree program. Please check back often to learn more about our new student projects.

    Check out the new BioStar Handbook!

    Danny Simpson, MS 2016

    From Mollusks to Medicine: A Venomics Approach for the Discovery and Characterization of Therapeutics from Terebridae Peptide Toxins

    Rajeeva Lochan Musunuri, MS 2015

    Validating somatic structural variants with local assembly
    Interning at the New York Genome Center

    Detecting structural variants (SVs) from sequencing data is complex and prone to a high false-negative rate. It is therefore necessary to use multiple orthogonal methodologies (such as read depth, read pairs, and split reads) to detect structural variants. When searching for somatic SVs in cancer samples (tumor/normal paired analysis), a false-negative call in the normal sample will lead to a false-positive somatic call in the tumor. This can be problematic because SVs are known to be highly relevant in cancer development and metastasis.

    Previous studies have shown that assembly-based methods have the highest resolution in determining SV breakpoints, reaching base-pair precision. In this project, I have created a modular framework for validating and also identifying SV calls by performing local assembly of the reads around the breakpoints with different assembly tools such as TIGRA, SGA, SPAdes, CORTEX, and FERMI. The framework provides a way to obtain a high-quality, clinically actionable set of structural variant calls.
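    The tumor/normal logic described above can be illustrated with a toy example. This is a minimal sketch, not the project's framework (which relies on local assembly): it assumes each SV call is reduced to a chromosome, an approximate breakpoint position, and a type, and that two calls describe the same event when they agree on chromosome and type and lie within a hypothetical tolerance of 100 bp. All names and coordinates are made up for illustration.

```python
from typing import List, NamedTuple


class SVCall(NamedTuple):
    chrom: str
    pos: int       # approximate breakpoint position
    sv_type: str   # e.g. "DEL", "DUP", "INV"


def matches(a: SVCall, b: SVCall, tolerance: int = 100) -> bool:
    """Two calls describe the same event if chromosome and type agree
    and the breakpoints lie within `tolerance` base pairs."""
    return (a.chrom == b.chrom and a.sv_type == b.sv_type
            and abs(a.pos - b.pos) <= tolerance)


def somatic_calls(tumor: List[SVCall], normal: List[SVCall],
                  tolerance: int = 100) -> List[SVCall]:
    """Keep tumor calls that have no matching call in the normal sample."""
    return [t for t in tumor
            if not any(matches(t, n, tolerance) for n in normal)]


tumor = [SVCall("chr1", 10_000, "DEL"), SVCall("chr2", 50_000, "INV")]

# Case 1: the chr1 deletion is germline and is detected in the normal sample.
normal_complete = [SVCall("chr1", 10_020, "DEL")]
# Case 2: the caller missed that deletion in the normal (a false negative).
normal_missed: List[SVCall] = []

print(somatic_calls(tumor, normal_complete))  # only the chr2 inversion remains
print(somatic_calls(tumor, normal_missed))    # the germline deletion is wrongly kept
```

    In the second case the germline deletion survives the subtraction and is reported as somatic, which is the false positive that orthogonal validation is meant to catch.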

    Marina Hoashi, MS Class of 2015

    Mammals have evolved to nourish their offspring exclusively with maternal milk for around half of the lactation period, a crucial infant developmental window. In view of the oral-breast contact during lactation and the altered oral microbiota in Caesarean section (C-section) born infants, we expected differences in milk composition by delivery mode. Here we performed a cross-sectional study of microbes and glycosylation patterns in human milk at different times postpartum, and found differences by time after birth only in women who delivered vaginally. These results warrant further research into the role of microbes in milk glycosylation and its developmental functions.

    Rama Srinivasan

    About the Project
    Identification of Novel Peptides from the Venom Duct Transcriptome of Marine Snail Cinguloterebra anilis

    Abstract
    Molecules produced in nature that are biologically active continue to be the source and inspiration for a vast number of drugs, diagnostics, and pharmacological tools. However, it remains challenging not only to find new organisms that produce natural products, but also to identify all of the bioactive molecules produced by these organisms.

    Marine snails have proven to be good sources of neuroactive peptides in the past. Whereas toxins from species like cone snails have been moderately well characterized, toxins from the vermivorous terebrid snails remain poorly characterized.

    Working in collaboration with the Holford Lab at Hunter College of CUNY, I focus on discovering neuroactive peptides from the venom tissues of the snail Cinguloterebra anilis. We are working on Illumina RNA-Seq data from the C. anilis venom duct, and aim to assemble, annotate, and filter our way to discovering new toxins, later progressing to physiological assays.

    Oscar L Rodriguez

    About the Project
    Joint Automated Genome Annotation of 73 Human Cell Types

    Abstract
    The ENCODE consortium produced functional genomics data in many cell types. Our goal is to annotate the active genomic functional elements in this diverse set of cell types. The challenge is that many of these cell types have little data available. We aim to leverage existing high quality annotations from six well-studied cell types in the production of annotations for the remaining cell types.

    Novel classification and visualization of genome-wide expression patterns in known breast cancer subtypes | Alexander R. Mankovich, Class of 2014

    Introduction to cancer subtyping and signatures for outcome prediction:
    Breast cancer research, while making steady advances in the disease's diagnosis and the discovery of new therapies, is still limited in its capacity to characterize disease subtypes in full. Five molecular subtypes have been described in the past: HER2+/ERBB2+, basal-like, Luminal A, Luminal B, and normal-like. There are several approaches used to classify these subtypes: histopathology, arising from the examination of tissue to assign a grade and particular physiological manifestation of the tumor; molecular pathology, which measures key proteins expressed by the majority of tumor cells; genetic analysis, which identifies genome-wide changes in tumor cells (such as copy number alterations); and gene expression, the analysis of particular genes driving tumor biology. These four approaches are used together to assign a patient's tumor a detailed subclassification that drives the clinical outlook, such as risk of metastasis, likelihood of recurrence, and potential curative therapies.

    Utilizing various analytical, statistical, and visual methods, RNA-seq expression signatures can more precisely guide clinical understanding of the driving forces behind tumor biology and further demarcate diverse breast cancer subtypes based on signature motifs and their associated prognostic or predictive factors - such as possible therapies, metastatic potential, recurrence risk, and survival probability. I propose to create a framework which generates long-range expression signatures from tumor samples, selects signatures which are alike, identifies significant correlating prognostic and predictive factors, and visualizes those relationships in a biologically intuitive manner.
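    One step of the proposed framework, selecting signatures that are alike, can be sketched with a simple similarity measure. The following is only an illustration under stated assumptions: each signature is taken to be a vector of expression values over the same ordered set of genes, similarity is measured with the Pearson correlation coefficient, and the 0.9 threshold, sample names, and numbers are hypothetical. The source does not say which measure the author used.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def similar_pairs(signatures: Dict[str, List[float]],
                  threshold: float = 0.9) -> List[Tuple[str, str, float]]:
    """Return every pair of samples whose signatures correlate at or above
    the threshold, together with the correlation value."""
    pairs = []
    for (name_a, sig_a), (name_b, sig_b) in combinations(signatures.items(), 2):
        r = pearson(sig_a, sig_b)
        if r >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Toy expression values over the same four genes (made-up numbers).
signatures = {
    "sample_A": [1.0, 2.0, 3.0, 4.0],
    "sample_B": [2.0, 4.0, 6.0, 8.0],   # same pattern as A, scaled
    "sample_C": [4.0, 3.0, 2.0, 1.0],   # reversed pattern
}
pairs = similar_pairs(signatures)
print(pairs)
```

    Here only sample_A and sample_B are grouped, since sample_C follows the opposite pattern; a real analysis would more likely use a clustering method over many genes.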

    STAT-GPS: a complete functional genome annotation tool focusing on extensive downstream analysis of genes | Michael D’Eletto, Class of 2014

    After generations of sequencing the genomes of various organisms, there exists an abundance of sequencing data that must be analyzed and annotated. Bioinformaticians are left with the challenge of using open-source programs to align and assemble these millions of reads. From these genome assemblies, functional properties of individual genes must be annotated before being loaded into databases like GenBank. Numerous annotation pipelines have been developed; however, emphasis on extensive downstream functional annotation has been lacking. Software such as the MAKER pipeline provides gene models based on multiple sources of evidence, but stops short of providing any functional information. Other tools, such as DAVID, are accessible only via a web site and hence would require submitting large amounts of data over the web, something many companies are not comfortable with. Tools such as AutoFACT are not currently maintained and are primarily aimed at RNA transcript annotation. Corporations also face special needs in that they (1) require high levels of security for their information and (2) are not always able to pay for software that may be free for academics. In addition, the level of support, documentation, maintenance, and integration for bioinformatics tools varies greatly and is often at too low a level for a small bioinformatics group to deal with.

    This thesis is a continuation of a graduate project centered on the development of an extensive functional annotation pipeline that emphasizes downstream analysis of genes. Initial development of the pipeline focused on primary annotations involving ab initio gene prediction and protein/EST alignment to known hits in various databases. These primary annotations merely scratched the surface of the overall function of each annotated gene. Continued development of the pipeline has delved into the functional and structural analyses of each gene and its proteins, as well as prediction of regulatory, non-coding elements in the DNA. These analyses include, but are not limited to: (1) automated homology modeling, (2) pathway assignment, (3) ncRNA prediction, and (4) de novo promoter element discovery.

    This pipeline, known as STAT-GPS (Solazyme Total Annotation Tool for Genomic and Protein Sequences), utilizes a combination of open-source software and remote servers to attain the most reliable, accurate, and thorough functional annotation possible. This program, which is developed in Python, is intended for both genomic and RNA transcripts, although genomic transcripts are the main goal. The source code is available for download and redistribution on GitHub. A formal paper intended for publication in the journal Bioinformatics is being written concurrently and will include supplementary data about the efficiency of this pipeline.

    Malcolm Houtz, Class of 2015

    In 2011, Gan et al. published work indicating that different accessions of Arabidopsis thaliana use gene models that differ from those annotated in the reference genome. An implication of this finding is that a large proportion of genes predicted to be damaged or knocked out (using the reference genome annotation) in non-reference accessions were in fact not affected by these mutations. The transcriptomes were reassembled for 18 accessions, and new annotation files were created.

    Using RNA-Seq data already sequenced and assembled by the Purugganan Laboratory, we propose to study and potentially re-annotate the transcriptomes of 4 rice accessions.

    The first phase of the project involves matching gene-ids with known polymorphisms or indels to a large FPKM matrix. A summarized categorization of expressed and unexpressed genes will be delivered. The summary will give an indication of expression levels for genes predicted to be damaged. Each accession was tested under many different conditions – summary at different levels may make sense. This piece of the project is intended to extend Malcolm’s very basic R skills.
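    The first phase described above can be sketched as follows. This is a minimal illustration, assuming the FPKM matrix is held as a mapping from gene ID to per-condition values and that a gene counts as expressed when its FPKM reaches a hypothetical cutoff of 1.0 in at least one condition; the gene IDs and values are placeholders. It is written in Python here, whereas the project itself planned to use R.

```python
from typing import Dict, Iterable, List


def summarize_damaged_genes(fpkm: Dict[str, List[float]],
                            damaged_ids: Iterable[str],
                            cutoff: float = 1.0) -> Dict[str, List[str]]:
    """Split genes predicted to be damaged into expressed, unexpressed,
    and missing (absent from the FPKM matrix)."""
    summary: Dict[str, List[str]] = {
        "expressed": [], "unexpressed": [], "missing": [],
    }
    for gene_id in sorted(set(damaged_ids)):
        values = fpkm.get(gene_id)
        if values is None:
            summary["missing"].append(gene_id)
        elif max(values, default=0.0) >= cutoff:
            summary["expressed"].append(gene_id)
        else:
            summary["unexpressed"].append(gene_id)
    return summary


# Placeholder gene IDs with FPKM values across three conditions.
fpkm_matrix = {
    "gene_001": [12.4, 8.1, 15.0],
    "gene_002": [0.0, 0.2, 0.1],
    "gene_003": [0.4, 3.5, 0.0],
}
# Genes with known polymorphisms or indels predicted to damage them.
predicted_damaged = ["gene_001", "gene_002", "gene_004"]

summary = summarize_damaged_genes(fpkm_matrix, predicted_damaged)
print(summary)
```

    Summarizing per condition rather than across all conditions would only require applying the same cutoff to one column of the matrix at a time.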

    If a significant number of genes that are predicted to be damaged are in fact expressed, the transcriptomes will be reassembled and annotated. Using an existing General Feature Format (GFF) file, we will find additional, novel transcripts and create new GFFs for each of the 4 sequenced accessions.

    Although familiar software (Cufflinks) does allow the discovery of novel transcripts, the method for updating an existing GFF with additional transcripts is currently unclear.

    Final deliverables will be pipelines for transcript reassembly and updating GFFs with additional annotations.

    Oscar Rodriguez, Undergrad Class of 2014

    Background: 
    The ENCODE consortium produced functional genomics data in many cell types. Our goal is to annotate the active genomic functional elements in this diverse set of cell types. The challenge is that many of these cell types have little data available. We aim to leverage existing high quality annotations from six well-studied cell types in the production of annotations for the remaining cell types.

    Approach:
    We use the genome annotation software Segway to perform annotations, augmented with entropic graph-based regularization (EGBR) to leverage existing annotations. We chose cell types that had at least two out of four distinct types of assays (DNase-seq, RNA-seq, histone modification ChIP-seq, and transcription factor ChIP-seq).
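    The selection rule for cell types can be expressed in a few lines. This is a minimal sketch of that rule only (it does not involve Segway or EGBR), assuming the available data are summarized as a mapping from cell type to the set of assay types performed on it; the cell-type names and the inventory are hypothetical.

```python
from typing import Dict, List, Set

# The four assay categories named in the approach.
ASSAY_TYPES = frozenset({
    "DNase-seq", "RNA-seq", "histone ChIP-seq", "TF ChIP-seq",
})


def eligible_cell_types(assays_by_cell_type: Dict[str, Set[str]],
                        minimum: int = 2) -> List[str]:
    """Return cell types covered by at least `minimum` distinct assay types."""
    return sorted(
        cell_type
        for cell_type, assays in assays_by_cell_type.items()
        if len(assays & ASSAY_TYPES) >= minimum
    )


# Hypothetical inventory; the cell-type names are placeholders.
inventory = {
    "cell_type_1": {"DNase-seq", "RNA-seq", "histone ChIP-seq"},
    "cell_type_2": {"RNA-seq"},
    "cell_type_3": {"DNase-seq", "TF ChIP-seq"},
}
selected = eligible_cell_types(inventory)
print(selected)
```

    Intersecting with `ASSAY_TYPES` means that any assay outside the four categories does not count toward the minimum.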

    Results:
    We will produce functional annotations of 73 cell types.  These annotations will be made publicly available on the UCSC Genome Browser.  In addition, the project has successfully migrated the Segway+EGBR annotation software to the DNAnexus cloud computing platform.


    The faculty for the online Bioinformatics Master's Degree program is drawn from across NYU and the Tandon School of Engineering. The dedicated faculty focus on the careful study and practice of bioinformatics, engaging students day-to-day while participating in research.


    The NYU Tandon School of Engineering's Advisory Board is composed of experienced leaders from several industries and academia who provide valuable insights and recommendations to NYU Tandon Online, the Online Learning Unit. The Board meets twice a year to review the program's curriculum and progress and to consider new ideas needed to meet industry demands.


    Bud Mishra,
    Courant Institute

    Mgavi Elombe Brathwaite,
    NYU Tandon

    Constantin Aliferis

    Erik K. Grimmelmann,
    NY Tech Alliance

    Judith Klein-Seetharaman

    Eugene Kolker,
    IBM

    David Kreutter

    Nasir Memon,
    NYU Tandon

    Giuseppe Narzisi,
    New York Genome Center

    Laxmi Parida,
    IBM

    Clyde Rodriguez

    Larry Rudolph,
    Two Sigma

    Ravi Sachidanandam,
    Mt. Sinai Hospital

    Rahul Satija,
    New York Genome Center

    Adam Siepel,
    Cold Spring Harbor Lab

    Admission Requirements

    In order to be eligible to apply for any of our Master’s programs, you must meet the following criteria:

    You must hold a Bachelor's Degree from an accredited institution, which includes a minimum of four years of full-time study. Bachelor of Engineering degrees (based on 180+ ECTS credits) may also be considered. Attention will be given to programs accredited by ABET and to programs accredited or approved by other regional accrediting associations.


    This program requires graduate status and certain prerequisite courses, depending on your background. If you have a background in computer science or a similar field, you are required to take a course on the chemical and biological foundations of bioinformatics. If you come from a chemical or biological science background, you are required to take Introduction to Programming and Problem Solving and Data Structures and Algorithms.


    The following is a list of all action items required to apply.

    • Application
    • Application Fee
    • Personal Statement
    • Resume
    • Official Transcripts
    • Letters of Recommendation
    • GRE or GMAT Score

      The GRE is required for full-time applicants to this program and is not required for part-time applicants. It cannot be substituted with the GMAT.

    • English Language Proficiency Testing

    For more details on the above list, please review the Master’s and Advanced Certificate Application Checklist section.


    Curriculum

    Degree Requirements: 30 Credits

    3 Credits Algorithms and Data Structures for Bioinformatics BI-GY7453
    The online course is aimed at introducing the foundational ideas from computer science for designing and implementing bioinformatics algorithms. The goal of the underlying algorithms and data structures is to accurately abstract and model the biological problems and to devise provably correct procedures with efficient computational complexity bounds. The algorithms will be described in pseudocode in order to simplify the correctness and complexity analysis, but with sufficient detail to enable students to implement them in any suitable software pipelines and hardware architectures.
    Prerequisites: MA-UY 2314
    3 Credits Proteomics for Bioinformatics BI-GY7543
    The online proteomics course contributes an application-focused specialty class to the bioinformatics curriculum. It will be a tour de force of modern proteomics methods and analysis in the context of practical research and clinical applications. The course will teach fundamentals, applications, experiments, and predictions in parallel. Thus, each week will include a mix of interactive approaches, from background learning, to understanding the pros and cons of experimental methodology, to software usage and sophisticated bioinformatics approaches to prediction. Limitations and complementarity of prediction methods will be emphasized. It is desirable (but not required) for students to complete a Biochemistry course before taking this course.
    Prerequisites: Bioinformatics I.
    3 Credits Bioinformatics III: Functional Prediction BI-GY7553
    The course covers functional classifications of proteins; prediction of function from sequence and structure; orthologs and paralogs; representations of biological pathways; and available systems for the analysis of whole genomes and for human-assisted and automatic functional prediction.
    Prerequisites: Bioinformatics II
    3 Credits Next Generation Sequence Analysis for Bioinformatics BI-GY7653
    The online course is aimed at developing practical bioinformatics skills in next-generation sequencing analysis. Students will be introduced to current best practices in high-throughput sequence data analysis, and they will have the opportunity to analyze real data in a high-performance Unix-based computing environment. Special attention will be given to understanding the advantages, limitations, and assumptions of the most widely used bioinformatics methods and the challenges involved in the analysis of large-scale datasets. Topics that will be covered include current sequencing platforms, data formats (FASTA, SAM, BAM, VCF), sequence alignment, sequence assembly, variant calling, RNA-seq analysis, and their biological applications. Students enrolling in this course should have basic knowledge of programming, Unix tools, and shell scripting.
    3 Credits Biology and Biotechnology for Bioinformatics BI-GY7683
    The online course is aimed at introducing the key ideas from biology and biochemistry and how they are used in modern biotechnology. The goal of this course is to develop students' critical thinking and analytical reasoning skills in the specific context of biotechnology and its modern applications. This course will explore a plethora of technologies used in the fields of genetic engineering, forensics, agriculture, bioremediation, and medicine in order to give students a basic but fundamental experimental skill set that can be applied in conjunction with computational skills to solve biological problems in a scalable manner. Students enrolling in this course should have knowledge of the basic sciences (biology, physics, and chemistry).


    3 Credits Biological Foundation for Bioinformatics BI-GY7523
    This course intensively reviews the aspects of biochemistry, molecular biology, and cell biology necessary to begin research in bioinformatics and to enter graduate courses in biology. The areas covered include cell structure, intracellular sorting, cellular signaling (i.e., receptors), cytoskeleton, cell cycle, DNA replication, transcription, and translation. This course extensively uses computer approaches to convey the essential computational and visual nature of the material to be covered.
    Prerequisites: General Chemistry, General Physics, Organic Chemistry, Calculus or permission of instructor.
    3 Credits Special Topics in “Informatics in Chemical and Biological Sciences” BI-GY7573
    This course covers various advanced or specialized topics in chemo- or bioinformatics that are presented at intervals.
    3 Credits Introduction to Systems Biology BI-GY7613
    This course explains the functioning of basic circuit elements in transcription regulation, signal transduction and developmental networks of living cells, using simplified mathematical models. The course focuses on design principles and information processing in biological circuits. It discusses network motifs, modularity, robustness, evolutional optimization and error minimization by kinetic proofreading in specific applications to bacterial chemotaxis, developmental patterning, neuronal circuits and immune recognition in several well-studied biological systems.
    Prerequisites: Bioinformatics II
    3 Credits Systems Biology: -omes and -omics BI-GY7623
    This course summarizes knowledge in genomics, proteomics, transcriptomics, metabolomics, and related molecular technologies. Topics include an overview of technologies in functional genomics (DNA chip arrays); whole-genome expression analysis (EST, MPSS, SAGE, arrays); proteome analysis technology (2D electrophoresis, protein in situ digestion for mass spectrometric analysis, yeast two-hybrid analysis, 2-D PAGE, MALDI-TOF spectroscopy); and the principles of Nuclear Magnetic Resonance Spectroscopy and Mass Spectrometry technologies for metabolomics, including general principles, the strengths and weaknesses of each technique, the requirements for sample preparation, and the options for the management of output data. This course explains how to exploit different -ome database resources for investigations through special practical tasks that accompany the lectures. Special attention is focused on nutrigenomics, a multidisciplinary science that uses genomics, transcriptomics, and proteomics to study metabolic health. This relatively new area of metabolomics has the potential to contribute significantly to advances in nutrition and health.
    Prerequisites: Bioinformatics II, Bioinformatics III
    3 Credits Transcriptomics BI-GY7633
    Screening for differential gene expression using microarray technology creates opportunities for personalized medicine, which will soon converge with medical informatics and our health care system. The course will start with a discussion of gene expression biology, presenting microarray platforms, design of experiments, and Affymetrix file structures and data storage. R programming is introduced for preprocessing Affymetrix data: image analysis, quality control and array normalization, log transformation, and putting the data together. Bioconductor software will be used for data importing, filtering, annotation, and analysis. Machine learning concepts and tools for statistical genomics will be addressed, along with the concept of distance, cluster analysis, heat maps, and class discovery. Case studies link the methodology to biomolecular pathways, gene ontology, genome browsing, and drug signatures.
    3 Credits Problem Solving for Bioinformatics BI-GY7663
    This course will introduce students to programming in Bioinformatics. The focus will be on object oriented techniques of scripting. Cancer data will be used as examples throughout the course.
    3 Credits Applied Biostatistics for Bioinformatics BI-GY7673
    This online course will introduce the basics of statistics and its applications in various fields of biology. It will lean towards practical applications, allowing for an intuitive understanding of concepts and some rigor in the application of statistics. It will use R for all the programming exercises. The course will not require a lot of programming, and the requisite skills will be introduced in the lectures. The problems, exercises, and assignments will be drawn from real-life problems in research papers and books. By the end of the course, the student should be able to initiate and solve problems in the field. Students enrolling in this course should have knowledge of the basics of programming, probability, and statistics.
    3 Credits Population Genetics and Evolutionary Biology for Bioinformatics BI-GY7693
    The online course is aimed at introducing the key ideas from population genetics and how they are used to understand the interaction of basic evolutionary processes (including mutation, natural selection, genetic drift, inbreeding, recombination, and gene flow) that determine the genetic composition and evolutionary trajectories of natural populations. The goal of this course is to develop students' critical thinking and analytical reasoning skills in the specific context of the many mechanisms shaping genetic variation within and between populations. This course will equip students with mathematical and experimental skills to address public health issues.
    3 Credits Statistics and Mathematics for Bioinformatics BI-GY7723
    The online course is aimed at introducing the fundamental concepts from mathematics, probability and statistics, as relevant to bioinformatics and computational biology. Students enrolling in this course should have knowledge of Calculus and Discrete Mathematics.
    3 Credits Translational Genomics and Computational Biology BI-GY7733
    This online course will expose students to different aspects of the data analysis and modeling activities that are expected of a Bioinformatician or a Computational Biologist. This course will offer a wide spectrum of example applications, roughly divided into two broad parts: (a) data analysis in "translational" settings and (b) more "computational" approaches to Biology pertaining to the simulation of biological systems. This course will explore a set of online resources that contain complex data models (e.g., cancer data from TCGA and ICGC); the data thus collected will be used to introduce novel model reconstruction tools. Other online resources and related exchange formats will be explored in order to show how the simulation of biological system models (and the related problem of their parameter tuning), in its different forms, has become increasingly usable and an important tool for biomedicine. Students enrolling in this course should have basic knowledge of programming, undergraduate calculus, probability and statistics, and introductory cell biology.
    Prerequisites: BI-GY 7673
    3 Credits Machine Learning and Data Science for Bioinformatics BI-GY7743
    This online course is aimed at developing practical machine learning and econometric (time series) skills with applications to biological data. The course will use examples from bioinformatics application areas throughout and will emphasize translational aspects.


    To satisfy the Capstone requirement, students may choose to take either the Guided Studies or the Thesis course.

    Guided Studies (maximum 6 Credits)

    3 Credits Guided Studies in Bioinformatics I BI-GY7583
    This research/case course can be handled in different ways at the faculty adviser’s discretion. The course may involve a series of cases that are dissected and analyzed, or it may involve teaming students with industry personnel for proprietary or non-proprietary research projects. Generally, the student works under faculty supervision, but the course is intended to be largely self-directed within the guidelines established by the supervising faculty member. Master’s degree candidates must submit an unbound copy of their report to their adviser(s) one week before the last day of classes.
    Prerequisite: degree status.
    3 Credits Guided Studies in Bioinformatics II BI-GY7593
    This research/case course can be handled in different ways at the faculty adviser’s discretion. The course may involve a series of cases that are dissected and analyzed, or it may involve teaming students with industry personnel for proprietary or non-proprietary research projects. Generally, the student works under faculty supervision, but the course is intended to be largely self-directed within the guidelines established by the supervising faculty member. Master’s degree candidates must submit an unbound copy of their report to their adviser(s) one week before the last day of classes.
    Prerequisite: degree status.

    (OR)

    Thesis Course (maximum 9 Credits)

    You can register for the Thesis course each semester, up to three times, for a maximum of 9 credits.

    MS Thesis in Bioinformatics BI-GY997X
    Original research, which serves as the basis for the master’s degree. Minimum research registration requirements for the master’s thesis: 12 units. Registration for research is required each consecutive semester until students have completed adequate research projects and acceptable theses and have passed the required oral examinations. Research credits registered for each semester should realistically reflect the time devoted to research.
    Prerequisites for MS candidates: Degree status and consent of graduate adviser and thesis director.


    Suggested Courses

    Either the thesis (9 credits cumulatively over three semesters) or Guided Studies (6 credits over two semesters) satisfies the Capstone requirement for completion of the MS in Bioinformatics.

    All courses are subject to change.


    3 Credits Algorithms and Data Structures for Bioinformatics BI-GY7453
    The online course is aimed at introducing the foundational ideas from computer science for designing and implementing bioinformatics algorithms. The goal of the underlying algorithms and data structures is to accurately abstract and model the biological problems and to devise provably correct procedures with efficient computational complexity bounds. The algorithms will be described in pseudocode in order to simplify the correctness and complexity analysis, but with sufficient detail to enable students to implement them in any suitable software pipelines and hardware architectures.
    Prerequisites: MA-UY 2314
    3 Credits Problem Solving for Bioinformatics BI-GY7663
    This course will introduce students to programming in Bioinformatics. The focus will be on object-oriented scripting techniques. Cancer data will be used as examples throughout the course.
    3 Credits Biology and Biotechnology for Bioinformatics BI-GY7683
    The online course is aimed at introducing the key ideas from biology and biochemistry and how they are used in modern biotechnology. The goal of this course is to develop students’ critical thinking and analytical reasoning skills in the specific context of biotechnology and its modern applications. This course will explore a range of technologies used in the fields of genetic engineering, forensics, agriculture, bioremediation and medicine in order to give students a basic but fundamental experimental skill set which can be applied in conjunction with computational skills to solve biological problems in a scalable manner. Students enrolling in this course should have knowledge of the basic sciences (Biology, Physics and Chemistry).


    3 Credits Bioinformatics III: Functional Prediction BI-GY7553
    The course covers functional classifications of proteins; prediction of function from sequence and structure; orthologs and paralogs; representations of biological pathways; and available systems for the analysis of whole genomes and for human-assisted and automatic functional prediction.
    Prerequisites: Bioinformatics II
    3 Credits Next Generation Sequence Analysis for Bioinformatics BI-GY7653
    The online course is aimed at developing practical bioinformatics skills for next-generation sequencing analysis. Students will be introduced to current best practices in high-throughput sequence data analysis, and they will have the opportunity to analyze real data in a high-performance Unix-based computing environment. Special attention will be given to understanding the advantages, limitations, and assumptions of the most widely used bioinformatics methods and the challenges involved in the analysis of large-scale datasets. Topics covered include current sequencing platforms, data formats (FASTA, SAM, BAM, VCF), sequence alignment, sequence assembly, variant calling, RNA-seq analysis, and their biological applications. Students enrolling in this course should have basic knowledge of programming, Unix tools, and shell scripting.
    3 Credits Applied Biostatistics for Bioinformatics BI-GY7673
    This online course will introduce the basics of statistics and its applications in various fields of biology. It will lean towards practical applications, allowing for an intuitive understanding of concepts and some rigor in the application of statistics. It will use R for all the programming exercises. The course will not require a lot of programming, and the requisite skills will be introduced in the lectures. The problems, exercises and assignments will be drawn from real-life problems in research papers and books. By the end of the course, the student should be able to initiate and solve problems in the field. Students enrolling in this course should have basic knowledge of programming, probability and statistics.
    3 Credits Statistics and Mathematics for Bioinformatics BI-GY7723
    The online course is aimed at introducing the fundamental concepts from mathematics, probability and statistics, as relevant to bioinformatics and computational biology. Students enrolling in this course should have knowledge of Calculus and Discrete Mathematics.


    3 Credits Guided Studies in Bioinformatics I BI-GY7583
    This research/case course can be handled in different ways at the faculty adviser’s discretion. The course may involve a series of cases that are dissected and analyzed, or it may involve teaming students with industry personnel for proprietary or non-proprietary research projects. Generally, the student works under faculty supervision, but the course is intended to be largely self-directed within the guidelines established by the supervising faculty member. Master’s degree candidates must submit an unbound copy of their report to their adviser(s) one week before the last day of classes.
    Prerequisite: degree status.
    3 Credits Transcriptomics BI-GY7633
    Screening for differential gene expression using microarray technology creates opportunities for personalized medicine, which is converging with medical informatics and our health care system. The course will start with a discussion of gene expression biology, presenting microarray platforms, design of experiments, and Affymetrix file structures and data storage. R programming is introduced for preprocessing Affymetrix data: image analysis, quality control, array normalization, log transformation and assembling the data. Bioconductor software will be used for data import, filtering, annotation and analysis. Machine learning concepts and tools for statistical genomics will be addressed, along with the concept of distance, cluster analysis, heat maps and class discovery. Case studies link the methodology to biomolecular pathways, gene ontology, genome browsing and drug signatures.
    3 Credits Proteomics for Bioinformatics BI-GY7543
    The online proteomics course adds an application-focused specialty class to the bioinformatics curriculum. It will be a tour de force of modern proteomics methods and analysis in the context of practical research and clinical applications. The course will teach fundamentals, applications, experiments and predictions in parallel. Thus, each week will include a mix of interactive approaches, from background learning, to understanding the pros and cons of experimental methodology, to software usage and sophisticated bioinformatics approaches to prediction. The limitations and complementarity of prediction methods will be emphasized. It is desirable (but not required) for students to complete a Biochemistry course before taking this course.
    Prerequisites: Bioinformatics I.


    3 Credits Special Topics in “Informatics in Chemical and Biological Sciences” BI-GY7573
    This course covers various advanced or specialized topics in chemo- or bioinformatics, presented at intervals.
    3 Credits Guided Studies in Bioinformatics II BI-GY7593
    This research/case course can be handled in different ways at the faculty adviser’s discretion. The course may involve a series of cases that are dissected and analyzed, or it may involve teaming students with industry personnel for proprietary or non-proprietary research projects. Generally, the student works under faculty supervision, but the course is intended to be largely self-directed within the guidelines established by the supervising faculty member. Master’s degree candidates must submit an unbound copy of their report to their adviser(s) one week before the last day of classes.
    Prerequisite: degree status.
    3 Credits Population Genetics and Evolutionary Biology for Bioinformatics BI-GY7693
    The online course is aimed at introducing the key ideas from population genetics and how they are used to understand the interaction of basic evolutionary processes (e.g., mutation, natural selection, genetic drift, inbreeding, recombination and gene flow) that determine the genetic composition and evolutionary trajectories of natural populations. The goal of this course is to develop students’ critical thinking and analytical reasoning skills in the specific context of the many mechanisms shaping genetic variation within and between populations. This course will equip students with mathematical and experimental skills to address public health issues.
    3 Credits Translational Genomics and Computational Biology BI-GY7733
    This online course will expose students to different aspects of the data analysis and modeling activities that are expected of a Bioinformatician or a Computational Biologist. This course will offer a wide spectrum of example applications, roughly divided into two broad parts: (a) data analysis in "translational" settings and (b) more "computational" approaches to Biology pertaining to the simulation of biological systems. This course will explore a set of online resources that contain complex data models (e.g., cancer data from TCGA and ICGC); the data thus collected will be used to introduce novel model reconstruction tools. Other online resources and related exchange formats will be explored in order to show how the simulation of biological system models (and the related problem of their parameter tuning), in its different forms, has become increasingly usable and an important tool for biomedicine. Students enrolling in this course should have basic knowledge of programming, undergraduate calculus, probability and statistics, and introductory cell biology.
    Prerequisites: BI-GY 7673