Comp Lesson 1: Introduction to Python


Good afternoon

The goal today is to give you a whirlwind introduction to the Python programming language. If you know Python already, today will be boring for you. I will not be offended if you choose to open your devices and do other work now. Because, in some ways, this will be a grueling line-by-line introduction to code, I will include a brief break (I wouldn’t be surprised if our numbers dwindle as we go along; no worries). The goal is to get you set up and comfortable exploring Python. As part of this, the first Homework will test that you can install new packages, find functionality that you need, write your own functions, use Jupyter Notebooks, load, analyze, and visualize data, and, along the way, demonstrate some use of basic Python operations, flow control, and data types.

Installation

Mac

If you are on a Mac, please make sure that Xcode is installed!!! This takes a while, and can be done through the App Store. The pre-installed Python is old and should not be used here, but don’t remove it!

Windows

If you are using Windows, note that you will need to use Chrome or Firefox (not Edge) for Jupyter Notebooks.

How to install

There are several ways to install Python locally on your computer.

Installation method 1 (old school): You can download a binary of Python directly from the official Python site corresponding to your platform. Once you follow the corresponding instructions for the installation process, you can launch the Python interpreter as described below. You should also install Jupyter Lab or Jupyter Notebook from here using the pip python package manager.

Installation method 2 (recommended): Use the Anaconda distribution, which includes the conda package manager. If you already have Anaconda from a Python < 3.8 installation, please uninstall it.

Now, to install Anaconda (and an up-to-date version of Python, too), just download the installer from here and follow the instructions to install Python 3.8. Go ahead and submit your Hopkins email address when you register. As the install proceeds, make sure that you install Anaconda to your home directory, not to root.

Option: Install node.js for additional Jupyter features.
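Once the installation finishes, a quick way to confirm which interpreter you ended up with is to ask Python itself. This is only a minimal sketch using the standard-library sys module; the helper name meets_minimum is made up for illustration, and the 3.8 threshold simply mirrors the version mentioned above.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_minimum` is a hypothetical helper, not part of Python itself.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


print(sys.version)       # full version string of the running interpreter
print(meets_minimum())   # True when running Python 3.8 or newer
```

If the second line prints False, you are probably still picking up the old pre-installed Python rather than the new one.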

Launch

Each line of Python code is translated into instructions for your machine as the program runs. This is different from compiled languages like C, in which a program is translated into machine format prior to execution. The Python interpreter is what does this translation. If Python is installed, you can launch the interpreter on Unix-like systems from the command line … (walk through process of finding Terminal, etc)

Python

Here in this class, I am never (at least as far as I can predict) going to illustrate Python analysis by accessing the interpreter via the command line. Rather, we are going to use what are called Jupyter Notebooks. These notebooks are files that contain executable code and non-executable content such as text and images. They were designed both to encourage reproducible analysis and to facilitate productive exploration of data. They are also great for teaching. Jupyter should have been installed using either of the methods listed above. You can launch it from the terminal with either

jupyter-lab # installed from pip

# or

jupyter lab # installed from anaconda

# or, if you are using Jupyter Notebook

jupyter notebook

Walk through process and Lab IDE, and components of a Notebook


Google Colab is a cloud-backed compute platform. To save your notebooks you will need to do so either to Google Drive or to GitHub. If you intend to use the latter, then you must get a GitHub account (free).

You will use Jupyter Notebooks to complete your homework. You can either install Python and Jupyter on your own system, or, you can use Google Colab. You will need to turn in .ipynb and .html files. Saving a Notebook to .html is easy with Jupyter (show how). There are several ways to do so from Colab, including this method.

Python fundamentals

Let’s start with the canonical first line of code when you are learning a new language, Hello, World!

[86]:
print("Hello, World")
Hello, World

What can we learn from this trivial example? First, that Python has built-in functions, like print(). And we can see that this function takes as an argument a string, and it displays this string in the standard output.

You might ask: “What is an argument?”

The official definition from PC Magazine:

In programming, a value that is passed between programs, subroutines or functions. Arguments are independent items, or variables, that contain data or codes. When an argument is used to customize a program for a user, it is typically called a “parameter.”

OK, so what is a string? In this case, a string is a built-in data type that represents sequences of characters. Another built-in data type is int, which represents whole numbers: positive, negative, and zero.

[87]:
314
-1
0
2**300 + 1
[87]:
2037035976334486086268445688409378161051468393665936250636140449354381299763336706183397377

Note that the size of Python’s integers is limited only by machine memory, not by a fixed number of bytes.

There are several other built-in data types, including:

  • bool: boolean, True or False

  • float: floating point double precision numbers

We will come back to booleans later today.
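Before moving on, here is a small sketch of how these built-in types behave. The variable names and values are invented for illustration. It shows that floats are approximations, that comparisons produce bool values, and that ints grow as large as they need to.

```python
# Hypothetical values chosen only for illustration.
x = 0.1 + 0.2                  # floats are double-precision approximations
print(type(x))                 # <class 'float'>
print(x == 0.3)                # False: binary floating point rounding error
print(abs(x - 0.3) < 1e-9)     # True: compare floats with a tolerance instead

flag = 3 > 2                   # a comparison produces a bool
print(type(flag), flag)        # <class 'bool'> True

big = 2**300 + 1               # ints are not capped at 32 or 64 bits
print(big.bit_length())        # 301
```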

Let’s go back to strings for a minute, and talk about object references or variables. Variables are ways to store values or data, refer to them later and manipulate or use them. Some important aspects of Python variables:

  • There are some restrictions on variable names, or identifiers (e.g. they cannot clash with keywords, and they need to start with _ or a letter).

  • Dynamic typing: you do not need to tell the interpreter what data type a variable will point to.

An example will help to illustrate how they work:

[88]:
a = "Winter"
b = "Spring"
c = a
print(a, b, c)
Winter Spring Winter
[89]:
c = b
print(a, b, c)
Winter Spring Spring
[90]:
a = c
print(a, b, c)
Spring Spring Spring

Like I said before, strings can be thought of as sequences of characters. Python has nice functionality for dealing with sequences of things. For example, square brackets ([]) instruct the interpreter to fetch the indicated item in the given sequence. Let’s assign the first sentence of a classic stem cell paper to a variable:

[93]:
first_sent ='''Normal mammalian hemopoietic tissue produces a continuous supply of differentiated blood cells whose functions are essential for life.'''
print(first_sent[3])

print(first_sent[0])

m
N

What you can see from this example is that Python starts indexing sequences at 0, not at 1. Also, Python does not automatically parse this string into words. Finally, note that a triple-quoted string can contain line breaks, quotes, etc. without escaping them:

[34]:
first_sent ='''Normal mammalian hemopoietic tissue produces a continuous
supply of differentiated blood cells whose functions are essential for life.'''
print(first_sent)
Normal mammalian hemopoietic tissue produces a continuous
supply of differentiated blood cells whose functions are essential for life.
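Since indexing a string gives you characters rather than words, you have to ask for words explicitly if you want them. One way, sketched below on a shortened copy of the sentence (the variable names are made up), is the built-in str.split() method, which breaks a string on whitespace and returns a list.

```python
# Shortened copy of the sentence so the sketch runs on its own.
sentence = "Normal mammalian hemopoietic tissue produces a continuous supply"

words = sentence.split()    # split on whitespace; returns a list of strings
print(sentence[3])          # m       (fourth character)
print(words[3])             # tissue  (fourth word)
print(len(words))           # 8
```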

type() returns the datatype of a variable

[94]:
a = "42"
type(a)
[94]:
str

Python provides functions to convert between types, with mostly the outcomes that you would expect:

[95]:
b = int(a)
print(b)
type(b)
42
[95]:
int
[9]:
int(42)
[9]:
42
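A few more conversions, including a couple where the outcome might not be what you expect. This is a sketch with invented inputs; the functions themselves (int, float, str, bool) are all built in.

```python
print(float("3.14"))       # 3.14
print(int(3.9))            # 3: int() truncates toward zero, it does not round
print(str(42) + "!")       # 42!
print(bool(0), bool(7))    # False True

try:
    int("forty-two")       # text that is not a number cannot be converted
except ValueError as err:
    print("conversion failed:", err)
```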

Lots of times, you want to store more than one value or object; you want something like a collection or list. Python has the following four data types (or classes) that can hold more than one object:

  • list: holds arbitrary number of objects / values. can be altered (mutable). entries are ordered

  • tuple: same as list, except it cannot be altered (immutable)

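  • dictionary: holds an arbitrary number of key:value pairs. can be altered (mutable). entries are looked up by key rather than by position (insertion order is preserved as of Python 3.7)

  • set: holds an arbitrary number of unique values, with no duplicates. can be altered (mutable). entries are un-ordered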

NB: These collections are allowed to contain values of different types. For now, let’s just look at lists and tuples.

Tuples

Tuples are created with the “,” operator:

[96]:
"Bill", "Ted", "Kermit", "Ernie"
[96]:
('Bill', 'Ted', 'Kermit', 'Ernie')
[97]:
"OnlyOne",
[97]:
('OnlyOne',)
[98]:
"OnlyOne"
[98]:
'OnlyOne'

Lists

Lists are created by enclosing comma separated sequences with square brackets:

[99]:
[10,20,30,40]
[99]:
[10, 20, 30, 40]
[100]:
[a, b, c]
[100]:
['42', 42, 'Spring']
[101]:
[a,b,b,192]
[101]:
['42', 42, 42, 192]
[102]:
type([])
[102]:
list
[103]:
type(())
[103]:
tuple

Collections are nestable.

[105]:
newList = [a,b,c]
biggerList = [newList, "Summer"]
print(biggerList)
[['42', 42, 'Spring'], 'Summer']
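To reach into a nested collection, you can chain square brackets. The sketch below copies the values from the cell above into new, made-up names so that it runs on its own.

```python
# Values copied from the cell above so the sketch is self-contained.
new_list = ["42", 42, "Spring"]
bigger_list = [new_list, "Summer"]

print(bigger_list[0])       # ['42', 42, 'Spring']  (the inner list)
print(bigger_list[0][2])    # Spring                (third item of the inner list)
print(bigger_list[1])       # Summer
```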

The built-in function len returns the length of a collection:

[106]:
len(newList)
[106]:
3
[107]:
len(biggerList)
[107]:
2

Remember that you can alter lists:

[40]:
newList = [a,b,c]
newList.append("quasiSpring")
print(newList)
['42', 42, 'Spring', 'quasiSpring']
[108]:
newList[0] = "Fall"
print(newList)
['Fall', 42, 'Spring', 'quasiSpring']
[109]:
newTuple = (a,b,c)
print(newTuple)
('42', 42, 'Spring')
[110]:
newTuple[0] = "quasiSpring"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-cb303cfb194e> in <module>
----> 1 newTuple[0] = "quasiSpring"

TypeError: 'tuple' object does not support item assignment
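If you do need a changed version of a tuple, one common approach is to build a new one. The sketch below, which uses made-up variable names, first catches the error shown above and then goes through a list to produce an updated tuple; the original tuple is left untouched.

```python
seasons = ("42", 42, "Spring")      # stand-in for newTuple above

try:
    seasons[0] = "quasiSpring"      # tuples do not allow item assignment
except TypeError as err:
    print("cannot modify:", err)

as_list = list(seasons)             # copy the items into a mutable list
as_list[0] = "quasiSpring"          # edit the copy
updated = tuple(as_list)            # freeze it back into a new tuple

print(updated)                      # ('quasiSpring', 42, 'Spring')
print(seasons)                      # ('42', 42, 'Spring')
```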

Logical operations

All programming languages have operators (functions) that return true or false values. There are 4 types of these operators in Python:

  • identity operators

  • comparison operators

  • membership operators

  • logical operators

Let’s start with identity operators first:

Identity operators

When you use these, you want to know if two objects are equivalent in some sense. Here, we want to know whether variables point to the same object or not.

[111]:
newA = ["iPhone", "Ipad", 99]
newB = ["iPhone", "Ipad", 99]
newA is newB
[111]:
False
[112]:
newB = newA
newA is newB
[112]:
True

These comparisons can be counter-intuitive, but they are fast: they just check whether variables point to the same location in memory.

Comparison operators

Most often, we want to know whether variables have the same value. We can use the typical comparison operators for this and related tasks:

[113]:
a = 10
b = 70
a == b
[113]:
False
[114]:
a < b
[114]:
True
[57]:
a > b, a == b, b <= a, a == a
[57]:
(False, False, False, True)
[115]:
a = 60
b = 60
a == b

[115]:
True
[116]:
a is b # What do you think should happen here???
[116]:
True
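Why does a is b come out True for two separately assigned 60s? In the standard CPython interpreter, small integers are cached and reused, so both names end up pointing at the same object. That is an implementation detail rather than something to rely on. The sketch below (with made-up names) uses lists, where the difference between == and is shows up reliably.

```python
first = ["iPhone", "Ipad", 99]
second = ["iPhone", "Ipad", 99]
alias = first

print(first == second)    # True: same contents
print(first is second)    # False: two separate list objects
print(first is alias)     # True: two names for one object

alias.append("Watch")     # a change made through one name...
print(first)              # ['iPhone', 'Ipad', 99, 'Watch']  ...is visible through the other
print(second)             # ['iPhone', 'Ipad', 99]
```

The practical rule of thumb: use == to compare values, and keep is for questions about object identity.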

Membership operators

Very handy – we can quickly test whether a value is in a collection with membership operators:

[117]:
crazyList = ["blue", "pancake", "popcorn", "soda", 192, "gastrulation", 24]
3 in crazyList
[117]:
False
[118]:
"Albert" not in crazyList
[118]:
True

Remember our long sentence from before? You can test for substring occurrences:

[119]:
print(first_sent)
"are" in first_sent
Normal mammalian hemopoietic tissue produces a continuous supply of differentiated blood cells whose functions are essential for life.
[119]:
True

Logical operators

  • and

  • or

  • not

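The and and or operators return one of their operands rather than always returning a Boolean. If both operands are Booleans, the result is a Boolean; otherwise, the result is the operand that determined the outcome of the operation, and a conditional statement then treats that value as true or false. The not operator always returns a Boolean.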

[70]:
True and True
[70]:
True
[71]:
True and False
[71]:
False
[72]:
True or False
[72]:
True
[73]:
not True or False
[73]:
False
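The cells above only use Boolean operands. Here is a short sketch, with invented values, of what and and or hand back when the operands are not Booleans. Empty strings and 0 count as false; non-empty strings and non-zero numbers count as true.

```python
print("" or "default")        # default: "" is falsy, so or returns the second operand
print("name" or "default")    # name: the first operand is truthy and is returned as-is
print(0 and 5)                # 0: and stops at the first falsy operand
print(3 and 5)                # 5: both are truthy, so and returns the last operand
print(not "text")             # False: not always returns a bool
print(not True or False)      # False: evaluated as (not True) or False
```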

Control Flow

Most programs will alter their behavior based on input or outcomes of prior computations. And often, the same set of computations will be repeated. To implement these behaviors, Python has some flow control statements, including the following:

  • if

  • while

  • for

Let’s start with the if statement:

[74]:
if True:
    print("This is always true")
else:
    print("This is never true")
This is always true
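In practice the condition is usually a comparison rather than a literal True, and you can chain more than two branches with elif. The sketch below wraps this in a small function; describe is a made-up name used only for illustration.

```python
def describe(number):
    """Classify a number; `describe` is a hypothetical helper for this sketch."""
    if number < 0:
        return "negative"
    elif number == 0:
        return "zero"
    else:
        return "positive"


print(describe(-3))    # negative
print(describe(0))     # zero
print(describe(7))     # positive
```

Only the first branch whose condition is true runs; the else branch is the fallback when none of them are.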

The while statement executes a block of code (or suite) repeatedly, for as long as its boolean condition is true:

[75]:
a = 10
while a > 0:
    print(a)
    a = a - 1

10
9
8
7
6
5
4
3
2
1

The for statement uses in, and iterates over the values in the supplied collection:

[76]:
crazyList = ["blue", "pancake", "popcorn", "soda", 192, "gastrulation", 24]
for item in crazyList:
    print(item)

blue
pancake
popcorn
soda
192
gastrulation
24
[85]:
for item in crazyList:
    if isinstance(item, str):
        print("I do not like " + item)


I do not like blue
I do not like pancake
I do not like popcorn
I do not like soda
I do not like gastrulation
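Two built-in helpers often show up alongside for: range(), which produces a sequence of integers, and enumerate(), which pairs each item with its position. The sketch below uses a shortened, made-up copy of the list and also collects the matching items into a new list instead of printing them.

```python
crazy_list = ["blue", "pancake", 192]    # shortened copy for illustration

for i in range(3):                       # 0, 1, 2
    print(i)

for position, item in enumerate(crazy_list):
    print(position, item)                # 0 blue / 1 pancake / 2 192

strings_only = []
for item in crazy_list:
    if isinstance(item, str):            # keep only the strings
        strings_only.append(item)
print(strings_only)                      # ['blue', 'pancake']
```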