Commit 8682c50

authored
Merge pull request #9 from fhdsl/Season_5
update
2 parents 5cb2e2e + 785c942 commit 8682c50

82 files changed

Lines changed: 6524 additions & 119 deletions

01-Fundamentals.qmd

Lines changed: 10 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -220,6 +220,16 @@ You can add more key-value pairs via `my_dict[new_key] = new_value` syntax. If t
220220
sentiment['dog'] = 5
221221
```
222222

223+
### Dictionary vs. List
224+
225+
Both data types help you organize values, but they differ in how you access those values. You access a list's values via a numerical index, and you access a dictionary's values via a key.
226+
227+
![](https://open.oregonstate.education/app/uploads/sites/6/2016/10/II.8_1_dict_list_indices_values.png#fixme)
228+
229+
![](https://open.oregonstate.education/app/uploads/sites/6/2016/10/II.8_2_dict_keys_values.png#fixme)
230+
231+
Source: <https://open.oregonstate.education/computationalbiology/chapter/dictionaries/>
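
To make the difference concrete, here is a minimal sketch. The names `heartrates_example` and `sentiment_example` and their values are assumptions made up for this illustration:

```python
# Illustrative values only; these names are assumptions for this sketch.
heartrates_example = [68, 54, 72]
sentiment_example = {"happy": 8, "sad": 2}

# A list is accessed by numerical position (counting starts at 0).
first_rate = heartrates_example[0]

# A dictionary is accessed by key.
happy_score = sentiment_example["happy"]

print(first_rate)
print(happy_score)
```
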
232+
223233
### Application for Data Cleaning
224234

225235
Suppose that you want to do some data recoding. You want to look at the "case_control" column of `simple_df` and change "case" to "experiment" and "control" to "baseline". This correspondence relationship can be stored in a dictionary. You can use the `.replace()` method for Series objects with a dictionary as an input argument.
@@ -314,21 +324,3 @@ isinstance(simple_df, list)
314324
## Exercises
315325

316326
Exercise for week 1 can be found [here](https://colab.research.google.com/drive/1nskVV4XFDVjkN_6OIQJDtDettOEr-n5W?usp=sharing).
317-
318-
## Appendix: Why Dictionaries?
319-
320-
If we didn't have a tool such as Dictionary, we could have tried to implement the following via Pandas Dataframes:
321-
322-
```{python}
323-
sentiment_df = pd.DataFrame(data={'word': ["happy", "sad", "joy", "embarrassed", "restless", "apathetic", "calm"],
324-
'sentiment': [8, 2, 7.5, 3.6, 4.1, 3.8, 7]})
325-
sentiment_df
326-
```
327-
328-
But to access a word's sentiment value, you have to write a complex syntax:
329-
330-
```{python}
331-
sentiment_df.loc[sentiment_df.word == "joy", "sentiment"]
332-
```
333-
334-
Besides the cumbersome syntax, it is not very fast: the program has to find which row "joy" is at. Whereas, in the dictionary data structure, the lookup is immediate. The time it takes for dictionary to take a key and retrieve a value *does not depend on the size of the dictionary,* whereas it does for the Dataframe implementation.

02-Iteration.qmd

Lines changed: 49 additions & 76 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ It turns out that we can iterate over *many* types of data structures in Python.
2727

2828
- DataFrame
2929

30-
- Ranges (to be introduced in exercises)
30+
- Ranges
3131

3232
- Dictionary
3333

@@ -62,7 +62,7 @@ for rate in heartrates:
6262
    print("Current heartrate:", rate)
6363
```
6464

65-
Here is what the Python interpretor is doing:
65+
Here is what the Python interpreter is doing:
6666

6767
1. Assign `heartrates` as a list.
6868
2. Enter For-Loop: `rate` is assigned to the next element of `heartrates`. If it is the first time, `rate` is assigned as the first element of `heartrates`.
@@ -142,110 +142,83 @@ Let's see this example step by step:
142142

143143
If it doesn't load properly, here is the [link](https://pythontutor.com/render.html#code=import%20math%0Aheartrates%20%3D%20%5B68,%2054,%2072,%2066,%2090,%20102,%2049%5D%0Aprint%28%22Before%3A%22,%20heartrates%29%0A%0Afor%20index,%20m%20in%20enumerate%28heartrates%29%3A%0A%20%20print%28%22Index%3A%22,%20index,%20%22%20%20%20m%3A%22,%20m%29%0A%20%20heartrates%5Bindex%5D%20%3D%20math.log%28m%29%0A%20%20%23heartrates%5Bindex%5D%20%3D%20math.log%28heartrates%5Bindex%5D%29%20%23this%20is%20okay%20also.%0A%20%20%0Aprint%28%22After%3A%22,%20heartrates%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=311&rawInputLstJSON=%5B%5D&textReferences=false).
144144

145-
## Conditional Statements
145+
## For-Loops on other iterable data structures
146146

147-
As we iterate through values in a data structure, it is quite common to want to have your code do different things depending on the value. For instance, suppose you are recoding `heartrates`, and the numerical values should be "low" if it is between 0 and 60, "medium" if it is between 60 and 100, "high" if it is above 100, and "unknown" otherwise (when it is below 0 or other data type).
147+
### Tuple
148148

149-
Stepping back, we have been working with a *linear* way of executing code - we have unconditionally executing every line of code in our program. Here, we are create a "control flow" via **conditional statements** in which the your code will run a certain section *if* some conditions are met.
149+
You can loop through a Tuple just like you did with a List, but remember that you can't modify it!
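
As a small sketch of what "can't modify" means in practice (the tuple `heartrates_tuple` and the flag `modified` are hypothetical names for this example), looping works as usual, but assigning to an element raises a `TypeError`:

```python
# Hypothetical example tuple for illustration.
heartrates_tuple = (68, 54, 72)

# Looping over a tuple works exactly like looping over a list.
for rate in heartrates_tuple:
    print("Current heartrate:", rate)

# Assigning to an element raises a TypeError because tuples are immutable.
try:
    heartrates_tuple[0] = 70
    modified = True
except TypeError as err:
    modified = False
    print("Cannot modify a tuple:", err)
```
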
150150

151-
Here is how the syntax looks like for conditional statements:
151+
### String
152152

153-
```
154-
if <expression1>:
155-
block of code 1
156-
elif <expression2>:
157-
block of code 2
158-
else:
159-
block of code 3
160-
161-
block of code 4
162-
```
153+
You can loop through a String by iterating over each character within the String, including spaces.
163154

164-
There are three possible ways the code can run:
155+
```{python}
156+
message = "I am hungry"
157+
for text in message:
158+
    print(text)
159+
```
165160

166-
1. If `<expression1>` is evaluated as `True`, then `block of code 1` will be run. When done, it will continue to `block of code 4`.
167-
2. If `<expression1>` is evaluated as `False`, then it will ask if `<expression2>` is `True` or not. If `True`, then `block of code 2` will be run. When done, it will continue to `block of code 4`.
168-
3. If `<expression1>` and `<expression2>` are both evaluated as `False`, then `block of code 3` is run. When done, it will continue to `block of code 4`.
161+
However, Strings are **immutable**, similar to Tuples. So if you iterate via `enumerate()`, you won't be able to modify the original String.
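
One possible workaround, sketched here under the assumption that you want an upper-case copy of the text, is to build a new String instead of changing the original. The names `shouted_letters` and `shouted` are made up for this example:

```python
message = "I am hungry"

# Assigning to message[index] would raise a TypeError, so collect the
# transformed characters in a list and join them into a new String instead.
shouted_letters = []
for index, letter in enumerate(message):
    shouted_letters.append(letter.upper())

shouted = "".join(shouted_letters)

print(shouted)
print(message)  # the original String is unchanged
```
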
169162

170-
An important takeaway is that *only one block of code can be run*.
163+
### Dictionary
171164

172-
Let's see how this apply to the data recoding example. Let's just assume the data we want to recode is just a single value in a variable `rate`, not the entire list `heartrates`:
165+
When you loop through a Dictionary, you loop through the Keys of the Dictionary:
173166

174167
```{python}
175-
rate = heartrates[0]
176-
print(rate)
177-
178-
if rate > 0 and rate <= 60:
179-
rate = "low"
180-
elif rate > 60 and rate <= 100:
181-
rate = "medium"
182-
elif rate > 100:
183-
rate = "high"
184-
else:
185-
rate = "unknown"
186-
187-
print(rate)
168+
sentiment = {'happy': 8, 'sad': 2, 'joy': 7.5, 'embarrassed': 3.6, 'restless': 4.1, 'apathetic': 3.8, 'calm': 7}
169+
for key in sentiment:
170+
    print("key:", key)
188171
```
189172

190-
You don't always need multiple `if`, `elif`, `else` statements when writing conditional statements. In its simplest form, a conditional statement requires only an `if` clause:
173+
The `.items()` method for Dictionary is similar to the `enumerate()` function: it returns an iterable view of tuples, and within each tuple the first element is a key, and the second element is a value.
191174

192175
```{python}
193-
x = -12
176+
sentiment.items()
177+
```
194178

195-
if x < 0:
196-
x = x * -1
197-
198-
print(x)
179+
```{python}
180+
for key, value in sentiment.items():
181+
    print(key, "corresponds to ", value)
199182
```
200183

201-
Then, you can add an `elif` or `else` statement, if you like. Here's `if`-`elif`:
184+
### Ranges
202185

203-
```{python}
204-
x = .25
186+
**Ranges** are a collection of sequential numbers, such as:
205187

206-
if x < 0:
207-
x = x * -1
208-
elif x >= 0 and x < 1:
209-
x = 1 / x
210-
211-
print(x)
188+
- 1, 2, 3, 4, 5
212189

213-
```
190+
- 1, 3, 5
214191

215-
Here's `if`-`else`. The `in` statement asks whether an element (102) is found in an iterable data structure `heartrates`, and returns `True` if so:
192+
- 10, 15, 20, 25, 30
216193

217-
```{python}
218-
if 102 in heartrates:
219-
print("Found 102.")
220-
else:
221-
print("Did not find 102.")
222-
```
194+
It seems natural to treat Ranges as Lists, but the neat thing about them is that only the bare minimum information is stored: the start, end, and step size. This can be a huge reduction in memory. If you need a sequence of numbers between 1 and 1 million, you can either store all 1 million values in a list, or you can just have a Range that holds the start: 1, the end: 1 million, and the step size: 1. That's a big difference!
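
A rough way to see this difference is `sys.getsizeof()`, which reports the size of the container object itself. This is only a sketch: the exact byte counts depend on your Python version and platform, and for the list it does not even include the memory used by the 1 million integers themselves. The variable names are assumptions for this example.

```python
import sys

big_range = range(1, 1_000_001)  # stores only start, stop, and step
big_list = list(big_range)       # stores a reference to each of the 1 million values

range_bytes = sys.getsizeof(big_range)
list_bytes = sys.getsizeof(big_list)

print("Range:", range_bytes, "bytes")
print("List: ", list_bytes, "bytes")
```
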
223195

224-
Finally, let's put the data recoding example within a For-Loop:
196+
You can create a Range in the following ways:
225197

226-
```{python}
227-
heartrates = [68, 54, 72, 66, 90, 102]
198+
- `range(stop)` which starts at 0 and ends at `stop` - 1.
199+
200+
- `range(start, stop)` which starts at `start` and ends at `stop` - 1.
201+
202+
- `range(start, stop, step)` which starts at `start`, increases by `step` each time, and stops before reaching `stop`.
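
As a quick sketch of the three forms, the example sequences listed earlier can be produced like this. Each Range is wrapped in `list()` only so that its values can be printed, and the variable names are assumptions for this example:

```python
zero_to_four = list(range(5))          # stop only: starts at 0
one_to_five = list(range(1, 6))        # start and stop
odd_numbers = list(range(1, 6, 2))     # start, stop, and step
by_fives = list(range(10, 31, 5))      # stop is 31 so that 30 is included

print(zero_to_four)  # [0, 1, 2, 3, 4]
print(one_to_five)   # [1, 2, 3, 4, 5]
print(odd_numbers)   # [1, 3, 5]
print(by_fives)      # [10, 15, 20, 25, 30]
```
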
203+
204+
When you create a Range object and display it, it just shows you the input values you gave it.
228205

229-
for index, rate in enumerate(heartrates):
230-
if rate > 0 and rate <= 60:
231-
heartrates[index] = "low"
232-
elif rate > 60 and rate <= 100:
233-
heartrates[index] = "medium"
234-
elif rate > 100:
235-
heartrates[index] = "high"
236-
else:
237-
heartrates[index] = "unknown"
238-
239-
print(heartrates)
206+
```{python}
207+
range(5, 50, 5)
240208
```
241209

242-
Let's see this in action step by step:
210+
Convert to a list to see its actual values:
243211

244-
<iframe width="800" height="500" frameborder="0" src="https://pythontutor.com/iframe-embed.html#code=heartrates%20%3D%20%5B68,%2054,%2072,%2066,%2090,%20102%5D%0A%0Afor%20index,%20rate%20in%20enumerate%28heartrates%29%3A%0A%20%20if%20rate%20%3E%200%20and%20rate%20%3C%3D%2060%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22low%22%0A%20%20elif%20rate%20%3E%2060%20and%20rate%20%3C%3D%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22medium%22%0A%20%20elif%20rate%20%3E%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22high%22%0A%20%20else%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22unknown%22%0A%20%20%20%20%0Aprint%28heartrates%29&amp;codeDivHeight=400&amp;codeDivWidth=350&amp;cumulative=false&amp;curInstr=0&amp;heapPrimitives=nevernest&amp;origin=opt-frontend.js&amp;py=311&amp;rawInputLstJSON=%5B%5D&amp;textReferences=false">
212+
```{python}
213+
list(range(5, 50, 5))
214+
```
245215

246-
</iframe>
216+
Using a Range in a For-Loop is straightforward:
247217

248-
If it doesn't load properly, you can find it [here](https://pythontutor.com/render.html#code=heartrates%20%3D%20%5B68,%2054,%2072,%2066,%2090,%20102%5D%0A%0Afor%20index,%20rate%20in%20enumerate%28heartrates%29%3A%0A%20%20if%20rate%20%3E%200%20and%20rate%20%3C%3D%2060%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22low%22%0A%20%20elif%20rate%20%3E%2060%20and%20rate%20%3C%3D%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22medium%22%0A%20%20elif%20rate%20%3E%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22high%22%0A%20%20else%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22unknown%22%0A%20%20%20%20%0Aprint%28heartrates%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=311&rawInputLstJSON=%5B%5D&textReferences=false)
218+
```{python}
219+
for i in range(5, 50, 5):
220+
    print(i)
221+
```
249222

250223
## Exercises
251224

03-Conditionals.qmd

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
# Conditional Statements
2+
3+
As we develop more complex code, it is quite common to want your code to do different things depending on a value. For instance, suppose you are recoding `heartrates`, and each numerical value should become "low" if it is greater than 0 and at most 60, "medium" if it is greater than 60 and at most 100, "high" if it is above 100, and "unknown" otherwise (when it is 0 or below).
4+
5+
Stepping back, we have been working with a *linear* way of executing code: we have been unconditionally executing every line of code in our program. Here, we create a "control flow" via **conditional statements**, in which your code will run a certain section *if* some conditions are met.
6+
7+
Here is what the syntax looks like for conditional statements:
8+
9+
```
10+
if <expression1>:
11+
    block of code 1
12+
elif <expression2>:
13+
    block of code 2
14+
else:
15+
    block of code 3
16+
17+
block of code 4
18+
```
19+
20+
There are three possible ways the code can run:
21+
22+
1. If `<expression1>` is evaluated as `True`, then `block of code 1` will be run. When done, it will continue to `block of code 4`.
23+
2. If `<expression1>` is evaluated as `False`, then it will ask if `<expression2>` is `True` or not. If `True`, then `block of code 2` will be run. When done, it will continue to `block of code 4`.
24+
3. If `<expression1>` and `<expression2>` are both evaluated as `False`, then `block of code 3` is run. When done, it will continue to `block of code 4`.
25+
26+
An important takeaway is that *only one of blocks 1, 2, and 3 can be run*; `block of code 4` sits outside the conditional statement and always runs.
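
Here is a minimal sketch of that takeaway, using made-up conditions that overlap on purpose. The variable `blocks_run` is a hypothetical name used only to record which blocks executed:

```python
x = 150
blocks_run = []

if x > 50:
    blocks_run.append("block 1")
elif x > 100:
    # Also True for x = 150, but skipped because the first condition already matched.
    blocks_run.append("block 2")
else:
    blocks_run.append("block 3")

# This line is outside the conditional statement, so it always runs.
blocks_run.append("block 4")

print(blocks_run)
```

Even though `x > 100` is also `True`, Python stops checking once the first condition matches, so only block 1 and block 4 are recorded.
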
27+
28+
Let's see how this applies to the data recoding example. Let's assume the data we want to recode is a single value in a variable `rate`, not the entire list `heartrates`:
29+
30+
```{python}
31+
heartrates = [68, 54, 72, 66, 90, 102, 49]
32+
33+
rate = heartrates[0]
34+
print(rate)
35+
36+
if rate > 0 and rate <= 60:
37+
    rate = "low"
38+
elif rate > 60 and rate <= 100:
39+
    rate = "medium"
40+
elif rate > 100:
41+
    rate = "high"
42+
else:
43+
    rate = "unknown"
44+
45+
print(rate)
46+
```
47+
48+
You don't always need multiple `if`, `elif`, `else` statements when writing conditional statements. In its simplest form, a conditional statement requires only an `if` clause:
49+
50+
```{python}
51+
x = -12
52+
53+
if x < 0:
54+
    x = x * -1
55+
56+
print(x)
57+
```
58+
59+
Then, you can add an `elif` or `else` statement, if you like. Here's `if`-`elif`:
60+
61+
```{python}
62+
x = .25
63+
64+
if x < 0:
65+
    x = x * -1
66+
elif x >= 0 and x < 1:
67+
    x = 1 / x
68+
69+
print(x)
70+
71+
```
72+
73+
Here's `if`-`else`. The `in` operator asks whether an element (102) is found in an iterable data structure `heartrates`, and returns `True` if so:
74+
75+
```{python}
76+
if 102 in heartrates:
77+
    print("Found 102.")
78+
else:
79+
    print("Did not find 102.")
80+
```
81+
82+
Finally, let's put the data recoding example within a For-Loop:
83+
84+
```{python}
85+
heartrates = [68, 54, 72, 66, 90, 102]
86+
87+
for index, rate in enumerate(heartrates):
88+
    if rate > 0 and rate <= 60:
89+
        heartrates[index] = "low"
90+
    elif rate > 60 and rate <= 100:
91+
        heartrates[index] = "medium"
92+
    elif rate > 100:
93+
        heartrates[index] = "high"
94+
    else:
95+
        heartrates[index] = "unknown"
96+
97+
print(heartrates)
98+
```
99+
100+
Let's see this in action step by step:
101+
102+
<iframe width="800" height="500" frameborder="0" src="https://pythontutor.com/iframe-embed.html#code=heartrates%20%3D%20%5B68,%2054,%2072,%2066,%2090,%20102%5D%0A%0Afor%20index,%20rate%20in%20enumerate%28heartrates%29%3A%0A%20%20if%20rate%20%3E%200%20and%20rate%20%3C%3D%2060%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22low%22%0A%20%20elif%20rate%20%3E%2060%20and%20rate%20%3C%3D%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22medium%22%0A%20%20elif%20rate%20%3E%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22high%22%0A%20%20else%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22unknown%22%0A%20%20%20%20%0Aprint%28heartrates%29&amp;codeDivHeight=400&amp;codeDivWidth=350&amp;cumulative=false&amp;curInstr=0&amp;heapPrimitives=nevernest&amp;origin=opt-frontend.js&amp;py=311&amp;rawInputLstJSON=%5B%5D&amp;textReferences=false">
103+
104+
</iframe>
105+
106+
If it doesn't load properly, you can find it [here](https://pythontutor.com/render.html#code=heartrates%20%3D%20%5B68,%2054,%2072,%2066,%2090,%20102%5D%0A%0Afor%20index,%20rate%20in%20enumerate%28heartrates%29%3A%0A%20%20if%20rate%20%3E%200%20and%20rate%20%3C%3D%2060%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22low%22%0A%20%20elif%20rate%20%3E%2060%20and%20rate%20%3C%3D%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22medium%22%0A%20%20elif%20rate%20%3E%20100%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22high%22%0A%20%20else%3A%0A%20%20%20%20heartrates%5Bindex%5D%20%3D%20%22unknown%22%0A%20%20%20%20%0Aprint%28heartrates%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=311&rawInputLstJSON=%5B%5D&textReferences=false)
107+
108+
## Exercises
109+
110+
Exercise for week 3 can be found [here](https://colab.research.google.com/drive/1kPVyALVVn7__x0q6kfE9PKRpGNQgtu0a?usp=sharing).
File renamed without changes.

_quarto.yml

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,9 +18,10 @@ book:
1818
- index.qmd
1919
- 01-Fundamentals.qmd
2020
- 02-Iteration.qmd
21-
- 03-Functions.qmd
22-
- 04-Iteration_Styles.qmd
23-
- 05-Reference_vs_Copy.qmd
21+
- 03-Conditionals.qmd
22+
- 04-Functions.qmd
23+
- 05-Iteration_Styles.qmd
24+
- 06-Reference_vs_Copy.qmd
2425
- references.qmd
2526

2627
sidebar:
