Day 2. Introduction to Programming

Much of bioinformatics requires manipulating data sets, running multiple external programs, and summarizing and analyzing the results. Many programming languages, such as Perl and Python, excel at these tasks. Python is a powerful language with many external packages that permit sophisticated analysis workflows. Here, we will provide an overview of the Python language, demonstrate basic programming concepts, and show how to create figures and use Jupyter notebooks.


Schedule:

Session | Time                | Topics
I       | 9:00-10:15 AM       | Intro to Python and Programming Concepts
        | 10:15-10:30 AM      | Coffee Break
II      | 10:30 AM-12:00 PM   | Variables, Data Structures, and I/O
        | 12:00-1:00 PM       | Lunch
III     | 1:00-2:15 PM        | Numpy, Control Structures and Functions
        | 2:15-2:30 PM        | Coffee Break
IV      | 2:30-4:00 PM        | Pandas and Plotting


Instructors:

Cristina Mitrea


Topics:

I) Intro to Python and Programming Concepts [1:15 hr] (Slides)

  • What is programming?
  • Programming concepts
  • What is a programming language?
  • What is Python?
  • Python syntax

--- Coffee Break [15 mins] ---

II) Variables, Data Structures, and I/O [1:30 hr] (Notebook)

  • Python notebooks
  • Basic code writing
  • What is a variable?
  • Data Structures:
    • Mutable vs Immutable
    • Lists vs Tuples
    • Dictionaries and Sets
  • Input/Output and File Handling
  • numpy
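
As a taste of the data structures listed above, here is a minimal sketch (not taken from the course notebook; the variable names and the DNA examples are illustrative assumptions) contrasting a mutable list with an immutable tuple, and showing a dictionary and a set:

```python
# Lists are mutable: they can be changed in place.
bases = ["A", "C", "G"]
bases.append("T")

# Tuples are immutable: trying to modify one raises a TypeError.
codon = ("A", "T", "G")
try:
    codon[0] = "C"
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Dictionaries map keys to values (here, each base to its complement).
complement = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Sets keep only unique items; order is not guaranteed.
unique_bases = set("GATTACA")

print(bases)
print(tuple_changed)
print(complement["A"])
print(sorted(unique_bases))
```

Choosing a tuple over a list is a way of signaling that a collection should not change after it is created.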

--- Lunch Break [1 hr] ---

III) Numpy, Control Structures and Functions [1:15 hr] (Notebook)

  • numpy
  • Control structures and loops
  • Functions
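
The control structures and functions covered in this session can be sketched roughly as follows. This is an illustrative example only, not the notebook's own code; the function name `gc_content` and the sample sequences are assumptions made for the sketch:

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    if not sequence:  # conditional: guard against dividing by zero
        return 0.0
    gc_count = 0
    for base in sequence.upper():  # loop over each character in the string
        if base in ("G", "C"):
            gc_count += 1
    return gc_count / len(sequence)


# Call the function once per sequence and collect the results in a list.
sequences = ["GATTACA", "GGCC", ""]
results = [gc_content(s) for s in sequences]
print(results)
```

Wrapping the loop in a function lets the same logic be reused on any number of sequences without copying code.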

--- Coffee Break [15 mins] ---

IV) Pandas and Plotting [1:30 hr] (Notebook)

  • pandas
  • Tabular data analysis with pandas
  • Plotting using plotnine, matplotlib and seaborn

--- End/Wrap-Up ---


Resources

Link | Description
Cheat Sheet | Basic Python beginner’s cheat sheet
Codecademy | An interactive online Python tutorial for beginners
Hitchhiker’s Guide to Python | Guide for both novice and expert Python developers to installation, configuration, and usage best practices
Google Python Style Guide | Python is the main dynamic language used at Google. This style guide is a list of dos and don’ts for Python programs
Matplotlib Gallery | Some examples of the power of matplotlib
Jupyter Project | Project Jupyter is a non-profit, open-source project that supports interactive data science and scientific computing across all programming languages
Binder Project | How the material for this day’s sessions is being served
NumFOCUS | NumFOCUS offers many programs in support of its mission to promote sustainable high-level programming languages, open code development, and reproducible scientific research