Python for Devops
Notes written while reading the book Python for DevOps. (Ongoing.)
Basic Math
- Integer division operator: //.
5//2 = 2.
- Modulo operator: %.
5%2 = 1.
Comments
# one line
"""
Multi line
"""
'''
Also multi line
'''
Range
>>> range(10)
range(0, 10)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(5, 10))
[5, 6, 7, 8, 9]
>>> list(range(5, 10, 3))
[5, 8]
>>> list(range(15,10,-2))
[15, 13, 11]
- Range is technically a type representing a sequence of numbers, not a function.
- With three arguments: the last value is the step. The step can be negative.
if/elif/else
if [CONDITION1]:
    [DO_SOMETHING]
elif [CONDITION2]:
    [DO_SOMETHING_ELSE]
else:
    [DO_A_THIRD_THING]
for loop
for i in range(INT):
    [DO_SOMETHING]
while loop
while [CONDITION]:
    [DO_SOMETHING]
break and continue
- continue skips the rest of the commands in the current iteration of a loop.
- break terminates the loop. In a while loop it can be used as an alternative control statement:
while True:
    if [SOMETHING]:
        break
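A minimal sketch of both keywords in one loop. The variable names are illustrative only, not from the book:

```python
# Collect the odd numbers below 10 using continue and break.
odds = []
n = 0
while True:
    n += 1
    if n >= 10:
        break        # leave the loop entirely
    if n % 2 == 0:
        continue     # skip the rest of this iteration
    odds.append(n)

print(odds)
```

Here break replaces the loop condition, and continue avoids nesting the append inside another if block.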
Exceptions
try:
    [DO_SOMETHING]
except [ERROR_TYPE] as [IDENTIFIER]:
    [DO_SOMETHING_ELSE]
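As a concrete instance of the template above, here is a small sketch that catches the ValueError raised by int(). The function name parse_port is hypothetical:

```python
def parse_port(value):
    """Return value as an int, or None if it is not a valid integer string."""
    try:
        return int(value)
    except ValueError as err:
        # err holds the exception instance, including its message.
        print(f"Could not parse {value!r}: {err}")
        return None


good = parse_port("8080")
bad = parse_port("http")
```

Naming a specific error type keeps unrelated exceptions from being silently swallowed.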
Type function
- type(OBJECT) returns the type of an object.
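A quick sketch of type() applied to a few built-in values. The sample values are arbitrary:

```python
samples = (1, 1.5, "one", [1], (1,), {"a": 1}, None)

# type() returns the class of each object; __name__ gives its short name.
type_names = [type(value).__name__ for value in samples]
print(type_names)
```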
Classes and functions
>>> class FancyCar():
...     wheels = 4
...     def driveFast(self):
...         print("Driving so fast")
...
>>> my_car = FancyCar()
>>> my_car.wheels
4
>>> my_car.driveFast()
Driving so fast
Functions
def <FUNCTION NAME>(<PARAMETERS>):
    '''A doc string.
    '''
    <CODE BLOCK>
- If a string using multiline syntax is provided first in the indented block, it acts as documentation.
- Arguments can be passed with keywords in addition to the usual positional method. When using keyword parameters, all parameters defined after a keyword parameter must be keyword parameters as well.
- This allows default values to be specified and the values to be passed in any order.
>>> def keywords(first=1, second=2):
...     print(f"first: {first}")
...     print(f"second: {second}")
...
>>> keywords(0)
first: 0
second: 2
>>> keywords(second='one', first='two')
first: two
second: one
- All functions return a value. The return keyword is used to set this value. If no value is set in the function definition, the function returns None.
- Functions are objects. They can be passed around, or stored in data structures.
>>> def double(input):
...     return input*2
...
>>> double
<function double at 0x107d34ae8>
>>> type(double)
<class 'function'>
>>> def triple(input):
...     return input*3
...
>>> functions = [double, triple]
>>> for function in functions:
...     print(function(3))
...
6
9
- lambda functions are unnamed (anonymous) functions.
lambda <PARAM>: <RETURN EXPRESSION>
- These functions should be very short and should usually only be used when calling another function.
>>> items = [[0, 'a', 2], [5, 'b', 0], [2, 'c', 1]]
>>> sorted(items, key=lambda item: item[1])  # sort by the second list value
[[0, 'a', 2], [5, 'b', 0], [2, 'c', 1]]
>>> sorted(items, key=lambda item: item[2])  # sort by the third list value
[[5, 'b', 0], [2, 'c', 1], [0, 'a', 2]]
Sequences
- The sequence types are list, tuple, range, string and the binary types.
Lists
- Ordered collection of items of any type.
- Items can be of different types.
- Square brackets indicate a list.
>>> list()
[]
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list("Henry Miller")
['H', 'e', 'n', 'r', 'y', ' ', 'M','i', 'l', 'l', 'e', 'r']
List methods
- list.append([SOMETHING]) adds an item to the end of a list.
- list.insert([INDEX], [SOMETHING]) adds an item at a specific index.
- list1.extend(list2) appends the items of list2 to the end of list1.
- list.pop() pops the last item from a list.
- list.pop(1) pops the item at index 1 from a list. Inefficient, because the items after it have to be shifted.
- list.remove([SOMETHING]) removes the first occurrence of an item from a list.
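The list methods can be sketched together as follows. The host names are made-up sample data:

```python
hosts = ["web1", "web2"]

hosts.append("db1")              # ['web1', 'web2', 'db1']
hosts.insert(1, "cache1")        # ['web1', 'cache1', 'web2', 'db1']
hosts.extend(["web1", "db2"])    # adds each item of the second list

last = hosts.pop()               # removes and returns the last item
second = hosts.pop(1)            # removes and returns the item at index 1
hosts.remove("web1")             # removes only the first 'web1'

print(hosts)
```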
List comprehension
- Populate a list in a concise manner.
- The expression that would be the body of the equivalent for loop is written first.
- Filtering can be done using if statements.
>>> squares = [i*i for i in range(10)]
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> squares = [i*i for i in range(10) if i%2==0]
>>> squares
[0, 4, 16, 36, 64]
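To make the ordering concrete, the sketch below puts the filtered comprehension next to an explicit loop that builds the same list. The name squares_loop is introduced here for comparison only:

```python
# Comprehension form: the expression (i * i) comes first.
squares = [i * i for i in range(10) if i % 2 == 0]

# Equivalent explicit loop: the same expression is the innermost statement.
squares_loop = []
for i in range(10):
    if i % 2 == 0:
        squares_loop.append(i * i)

print(squares == squares_loop)
```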
Strings
- Python 3 strings are sequences of Unicode characters; UTF-8 is the default encoding when converting to and from bytes.
- Strings are immutable.
- str() creates an empty string.
- Strings can be delimited with single or double quotes.
- Use str(OBJECT) to turn an object into a string.
- Use triple quotes for multi-line strings.
>>> multi_line = """This is a
... multi-line string,
... which includes linebreaks.
... """
- string.strip() removes whitespace from the beginning and end of a string.
- string.rstrip() removes whitespace from the end of a string.
- string.lstrip() removes whitespace from the beginning of a string.
- string.ljust(x): if the string’s length is less than x, adds whitespace to the end of the string to bring the length to x.
- string.rjust(y, [CONSTANT]): if the string’s length is less than y, adds the fill character [CONSTANT] to the beginning of the string to bring the length to y.
- string.split() splits a string into a list. The default delimiter is whitespace. An alternative delimiter can be specified inside the parentheses.
>>> text = "Mary had a little lamb"
>>> text.split()
['Mary', 'had', 'a', 'little', 'lamb']
>>> url = "gt.motomomo.io/v2/api/asset/143"
>>> url.split('/')
['gt.motomomo.io', 'v2', 'api', 'asset', '143']
- string1.join(sequence) creates a new string, where string1 is the delimiter between the items of the sequence.
>>> items = ['cow', 'milk', 'bread', 'butter']
>>> " and ".join(items)
'cow and milk and bread and butter'
- string.capitalize() capitalises the first letter.
- string.upper() converts all characters to uppercase.
- string.title() converts the first character of every word to uppercase.
- string.swapcase() toggles the case of every character.
- string.lower() converts all characters to lowercase.
- string1.startswith(string2) returns True if a string starts with a certain substring.
- string1.endswith(string2) returns True if a string ends with a certain substring.
- string.isalnum() returns True if a string contains only alphanumeric characters.
- string.isalpha() returns True if a string contains only alphabetic characters.
- string.isnumeric() returns True if a string contains only numeric characters.
- string.istitle() returns True if the first character in every word is capitalised.
- string.islower() returns True if all alphabetic characters in the string are lowercase.
- string.isupper() returns True if all alphabetic characters in the string are uppercase.
- The old printf equivalent:
>>> "%s + %s = %s" % (1, 2, "Three")
'1 + 2 = Three'
>>> "%.3f" % 1.234567
'1.235'
- Can cause errors, so not recommended.
- Recommended alternative is string.format:
>>> '{} comes before {}'.format('first', 'second')
'first comes before second'
>>> '{1} comes after {0}, but {1} comes before {2}'.format('first', 'second', 'third')
'second comes after first, but second comes before third'
- Keyword arguments and dict values can also be used:
>>> '''{country} is an island.
... {country} is off of the coast of
... {continent} in the {ocean}'''.format(ocean='Indian Ocean',
...                                      continent='Africa',
...                                      country='Madagascar')
'Madagascar is an island.\nMadagascar is off of the coast of\nAfrica in the Indian Ocean'
>>> values = {'first': 'Bill', 'last': 'Bailey'}
>>> "Won't you come home {first} {last}?".format(**values)
"Won't you come home Bill Bailey?"
- Format specifications are done using the format specification mini-language:
>>> text = "|{0:>22}||{0:<22}|"
>>> text.format('O','O')
'|                     O||O                     |'
>>> text = "|{0:<>22}||{0:><22}|"
>>> text.format('O','O')
'|<<<<<<<<<<<<<<<<<<<<<O||O>>>>>>>>>>>>>>>>>>>>>|'
- Python f-strings use the same formatting language as the format method, but offer a more straightforward and intuitive mechanism for using it.
>>> a = 1
>>> b = 2
>>> f"a is {a}, b is {b}. Adding them results in {a + b}"
'a is 1, b is 2. Adding them results in 3'
>>> count = 43
>>> f"|{count:5d}"
'|   43'
>>> padding = 10
>>> f"|{count:{padding}d}"
'|        43'
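Several of the string methods above tend to be chained when cleaning up input. A small sketch, using a made-up host name as sample data:

```python
raw = "  web-server-01  \n"

name = raw.strip()                  # whitespace removed from both ends
parts = name.split("-")             # split on an explicit delimiter
joined = "_".join(parts)            # join with a different delimiter
padded = parts[2].rjust(5, "0")     # pad on the left with a fill character

# f-string using the format mini-language: left-align in 16 columns.
report = f"{name.upper():<16}|{parts[2].isnumeric()}"
print(report)
```

Because strings are immutable, each method returns a new string and raw itself is left unchanged.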
Dictionaries
- It’s possible to convert a nested list to a dict:
>>> kv_list = [['key-1', 'value-1'], ['key-2', 'value-2']]
>>> dict(kv_list)
{'key-1': 'value-1', 'key-2': 'value-2'}
- dict.values() returns all dict values.
- dict.keys() returns all keys.
- Dict comprehension:
>>> letters = 'abcde'
>>> # mapping individual letters to their upper-case representations
>>> cap_map = {x: x.upper() for x in letters}
>>> cap_map['b']
'B'
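A short sketch tying these together. The keys and values are invented sample data, and items(), which yields key/value pairs, is not covered in the notes above:

```python
kv_list = [["stage", "prod"], ["region", "eu-west-1"]]

config = dict(kv_list)              # nested list -> dict
keys = list(config.keys())          # views are wrapped in list() for display
values = list(config.values())

# Dict comprehension that swaps keys and values.
inverted = {value: key for key, value in config.items()}
print(inverted)
```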
Tuples
- Tuples are ordered and immutable.
- Tuples are defined using parentheses.
- An empty tuple can be created with () or tuple().
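Immutability can be observed directly: assigning to an index raises a TypeError. A minimal sketch with arbitrary values:

```python
point = (3, 4)
empty = ()

try:
    point[0] = 10                   # tuples do not allow item assignment
except TypeError as err:
    error_message = str(err)

print(error_message)
```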
Sequence operations
in/not in operators
>>> 2 in [1,2,3]
True
>>> 'a' not in 'cat'
False
>>> 10 in range(12)
True
>>> 10 not in range(2, 4)
True
Referencing a sequence
- Square brackets and an integer, like most other languages.
- 0 is the first item.
- -1 is the last item.
- -2 is the second to last item.
>>> my_sequence = "Bill Cheatham"
>>> my_sequence[-1]
'm'
>>> my_sequence[-2]
'a'
>>> my_sequence[-13]
'B'
index method
- Searches a sequence for the first occurrence of an item.
- The second and third arguments define a sub-range to search.
>>> my_sequence = "Bill Cheatham"
>>> my_sequence.index('C')
5
>>> my_sequence.index('a',9, 12)
11
Slicing
the_sequence[start:stop:step]
- If values aren’t specified, defaults are used.
- Defaults are: [0:len(the_sequence):1]
>>> my_sequence = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_sequence[2:5]
['c', 'd', 'e']
>>> my_sequence[:5]
['a', 'b', 'c', 'd', 'e']
>>> my_sequence[3:]
['d', 'e', 'f', 'g']
- Negative numbers can be used to index backwards:
>>> my_sequence[-6:]
['b', 'c', 'd', 'e', 'f', 'g']
>>> my_sequence[3:-1]
['d', 'e', 'f']
length, min, max
- Min and max only work on sequences with items that are comparable.
- len(my_sequence)
- min(my_sequence)
- max(my_sequence)
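A brief sketch of the three functions, including what happens when the items cannot be compared. The sample lists are arbitrary:

```python
versions = [3, 11, 7]

count = len(versions)
lowest = min(versions)
highest = max(versions)

# Mixing ints and strings makes the items non-comparable.
try:
    max([1, "two"])
except TypeError:
    comparable = False
else:
    comparable = True

print(count, lowest, highest, comparable)
```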
Regular expressions
- import re
- re.search(r'regex', [WHAT_TO_SEARCH]) searches for text matching a regular expression. In these examples, cc_list is a string of names and email addresses.
>>> import re
>>> re.search(r'Rostam', cc_list)
<re.Match object; span=(32, 38), match='Rostam'>
- re.search can be used as a condition in an if statement, to test whether a value was matched.
- Standard regex syntax:
  - [RB]: R or B. (Written as [R,B], the class also matches a literal comma.)
  - [iy]: i or y.
  - [a-z]: any single lowercase alphabetic character.
  - [A-Za-z]: any single alphabetic character.
  - +: quantifier for one or more.
  - \ is the escape character.
  - \w is the equivalent of [a-zA-Z0-9_] (for ASCII text).
  - \d is the equivalent of [0-9] (for ASCII text).
- Groups can be defined with parentheses and accessed using the match object's group(INT) method. group(0) is the whole match.
>>> matched = re.search(r'(\w+)\@(\w+)\.(\w+)', cc_list)
>>> matched.group(0)
'ekoenig@vpwk.com'
>>> matched.group(1)
'ekoenig'
>>> matched.group(2)
'vpwk'
>>> matched.group(3)
'com'
- Names can be supplied for the groups by adding ?P<NAME> in the group definition. Groups can then be accessed by name instead of by number:
>>> matched = re.search(r'(?P<name>\w+)\@(?P<SLD>\w+)\.(?P<TLD>\w+)', cc_list)
>>> matched.group('name')
'ekoenig'
>>> print(f'''name: {matched.group("name")}
... Secondary Level Domain: {matched.group("SLD")}
... Top Level Domain: {matched.group("TLD")}''')
name: ekoenig
Secondary Level Domain: vpwk
Top Level Domain: com
- findall can be used to return all of the matches as a list of strings (or a list of tuples when the pattern contains groups):
>>> matched = re.findall(r'\w+\@\w+\.\w+', cc_list)
>>> matched
['ekoenig@vpwk.com', 'rostam@vpwk.com', 'ctomson@vpwk.com', 'cbaio@vpwk.com']
>>> matched = re.findall(r'(\w+)\@(\w+)\.(\w+)', cc_list)
>>> matched
[('ekoenig', 'vpwk', 'com'), ('rostam', 'vpwk', 'com'), ('ctomson', 'vpwk', 'com'), ('cbaio', 'vpwk', 'com')]
>>> names = [x[0] for x in matched]
>>> names
['ekoenig', 'rostam', 'ctomson', 'cbaio']
- For dealing with large input, such as logs, it may be necessary to use finditer, which returns an iterator that yields one match object at a time instead of building a full list:
>>> matched = re.finditer(r'\w+\@\w+\.\w+', cc_list)
>>> matched
<callable_iterator object at 0x108e68748>
>>> next(matched)
<re.Match object; span=(13, 29), match='ekoenig@vpwk.com'>
>>> next(matched)
<re.Match object; span=(51, 66), match='rostam@vpwk.com'>
>>> next(matched)
<re.Match object; span=(83, 99), match='ctomson@vpwk.com'>
- The iterator object, matched, can be used in a for loop as well:
>>> matched = re.finditer(r"(?P<name>\w+)\@(?P<SLD>\w+)\.(?P<TLD>\w+)", cc_list)
>>> for m in matched:
...     print(m.groupdict())
...
{'name': 'ekoenig', 'SLD': 'vpwk', 'TLD': 'com'}
{'name': 'rostam', 'SLD': 'vpwk', 'TLD': 'com'}
{'name': 'ctomson', 'SLD': 'vpwk', 'TLD': 'com'}
{'name': 'cbaio', 'SLD': 'vpwk', 'TLD': 'com'}
- Regex can also be used to substitute:
>>> re.sub(r"\d", "#", "The passcode you entered was 09876")
'The passcode you entered was #####'
>>> users = re.sub(r"(?P<name>\w+)\@(?P<SLD>\w+)\.(?P<TLD>\w+)", r"\g<TLD>.\g<SLD>.\g<name>", cc_list)
>>> print(users)
Ezra Koenig <com.vpwk.ekoenig>, Rostam Batmanglij <com.vpwk.rostam>, Chris Tomson <com.vpwk.ctomson, Chris Baio <com.vpwk.cbaio
- If the same match is going to happen many times, re.compile can be used to compile the regex, which is more efficient for repeated use:
>>> regex = re.compile(r'\w+\@\w+\.\w+')
>>> regex.search(cc_list)
<re.Match object; span=(13, 29), match='ekoenig@vpwk.com'>
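The points about compiling a pattern, naming groups, and using a search result as an if condition can be combined in one sketch. The log lines and the pattern are hypothetical sample data, not the cc_list from the book:

```python
import re

# Hypothetical sample data for illustration.
log_lines = [
    "2024-01-01 ERROR disk full on /dev/sda1",
    "2024-01-01 INFO backup finished",
    "2024-01-02 ERROR timeout talking to db01",
]

# Compiled once, reused for every line.
pattern = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2}) ERROR (?P<message>.+)")

errors = []
for line in log_lines:
    match = pattern.search(line)
    if match:                       # search returns None when nothing matches
        errors.append(match.groupdict())

print(errors)
```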
Lazy Evaluation
- Lazy evaluation is the idea that, especially when dealing with large amounts of data, you do not want to process all of the data before using the results.
- You have already seen this with the range type, where the memory footprint is the same, even for one representing a large group of numbers.
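The claim about range can be checked with sys.getsizeof. This sketch assumes CPython, where a range object stores only its start, stop and step:

```python
import sys

small = range(10)
huge = range(10**9)

# Both objects occupy the same amount of memory on CPython.
small_size = sys.getsizeof(small)
huge_size = sys.getsizeof(huge)

# Items are computed on demand rather than stored.
last_item = huge[-1]
print(small_size, huge_size, last_item)
```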
Generators
- You can use generators in a similar way to range objects. They perform some operation on data in chunks as requested. They pause their state in between calls. This means that you can store variables that are needed to calculate output, and they are accessed every time the generator is called.
- To write a generator function, use the yield keyword rather than a return statement. Every time the generator is called, it returns the value specified by yield and then pauses its state until it is next called.
>>> def count():
...     n = 0
...     while True:
...         n += 1
...         yield n
...
>>> counter = count()
>>> counter
<generator object count at 0x10e8509a8>
>>> next(counter)
1
>>> next(counter)
2
>>> next(counter)
3
- Note that the generator keeps track of the value of n.
- Generators can also be used in for loops:
>>> def fib():
...     first = 0
...     last = 1
...     while True:
...         first, last = last, first + last
...         yield first
...
>>> f = fib()
>>> for x in f:
...     print(x)
...     if x > 12:
...         break
...
1
1
2
3
5
8
13
Generator Comprehensions
- We can use generator comprehensions to create one-line generators.
- They are created using a syntax similar to list comprehensions, but parentheses are used rather than square brackets:
>>> list_o_nums = [x for x in range(100)]
>>> gen_o_nums = (x for x in range(100))
>>> list_o_nums
[0, 1, 2, 3, ... 97, 98, 99]
>>> gen_o_nums
<generator object <genexpr> at 0x10ea14408>
# Memory consumption:
>>> import sys
>>> sys.getsizeof(list_o_nums)
912
>>> sys.getsizeof(gen_o_nums)
120
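One consequence of laziness worth sketching: a generator can be consumed only once. The names below are illustrative, and exact byte counts vary by Python version, so only the comparison is relied on:

```python
import sys

squares_list = [n * n for n in range(1000)]
squares_gen = (n * n for n in range(1000))

list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

total = sum(squares_gen)        # consumes the generator item by item
leftover = list(squares_gen)    # nothing is left after one pass

print(gen_bytes < list_bytes, total, leftover)
```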
iPython
- To run a shell command, prepend an exclamation mark: ls = !ls.
- The output of the command is assigned to a Python variable, ls.
iPython grep, fields and sort
- The type of this variable is IPython.utils.text.SList. The SList type wraps the output of a regular shell command in an object that has three main methods: fields, grep, and sort.
In [6]: df = !df
In [7]: df.sort(3, nums = True)
In [10]: ls = !ls -l /usr/bin
In [11]: ls.grep("kill")
Magic commands
- Magic commands are used to run something external within the iPython shell. Cell magics use two percentage signs (%%) and apply to the whole cell; line magics, such as %who, use one.
- Writing a quick bash script:
In [13]: %%bash
    ...: uname -a
    ...:
    ...:
Darwin nogibjj.local 18.5.0 Darwin Kernel Version 18.5.0: Mon Mar ...
- Creating a python script file:
In [16]: %%writefile print_time.py
    ...: #!/usr/bin/env python
    ...: import datetime
    ...: print(datetime.datetime.now().time())
    ...:
    ...:
    ...:
Writing print_time.py
In [18]: !python print_time.py
19:06:00.594914
- Checking what is loaded into memory:
In [20]: %who
df ls var_ls
Automating Files and the Filesystem
open
- You can use the open function to create a file object that can read and write files.
- It takes two arguments, the path of the file and the mode (the mode is optional and defaults to reading).
- You use the mode to indicate, among other things, if you want to read or write a file and if it is text or binary data.
- You can open a text file using the mode r to read its contents.
- The file object has a read method that returns the contents of the file as a string:
In [1]: file_path = 'bookofdreams.txt'
In [2]: open_file = open(file_path, 'r')
In [3]: text = open_file.read()
In [4]: len(text)
Out[4]: 476909
In [5]: text[56]
Out[5]: 's'
In [6]: open_file
Out[6]: <_io.TextIOWrapper name='bookofdreams.txt' mode='r' encoding='UTF-8'>
In [7]: open_file.close()
- It is good practice to close a file when you finish with it. Python closes a file when it is out of scope, but until then the file consumes resources and may prevent other processes from opening it.
readlines
- You can also read a file using the readlines method.
- This method reads the file and splits its contents on newline characters. It returns a list of strings. Each string is one line of the original text:
In [8]: open_file = open(file_path, 'r')
In [9]: text = open_file.readlines()
In [10]: len(text)
Out[10]: 8796
In [11]: text[100]
Out[11]: 'science, when it admits the possibility of occasional hallucinations\n'
In [12]: open_file.close()
with
- A handy way of opening files is to use with statements.
- You do not need to close a file explicitly in this case. Python closes it and releases the file resource at the end of the indented block:
In [13]: with open(file_path, 'r') as open_file:
    ...:     text = open_file.readlines()
    ...:
In [14]: text[101]
Out[14]: 'in the sane and healthy, also admits, of course, the existence of\n'
In [15]: open_file.closed
Out[15]: True
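For a version that runs without the book's text file, the sketch below writes a small scratch file first. The file name sample.txt and its contents are made up, and tempfile is used only to keep the example self-contained:

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not depend on bookofdreams.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")

    with open(file_path, "w") as open_file:
        open_file.write("first line\nsecond line\n")

    with open(file_path, "r") as open_file:
        lines = open_file.readlines()

    # The file object still exists after the block, but it has been closed.
    closed_after_block = open_file.closed

print(lines, closed_after_block)
```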
Windows vs Unix line breaks
- Different operating systems use different escaped characters to represent line endings.
- Unix systems use \n and Windows systems use \r\n.
- Python converts these to \n when you open a file as text. If you are opening a binary file, such as a .jpeg image, you are likely to corrupt the data by this conversion if you open it as text.
- You can, however, read binary files by appending a b to mode:
In [15]: file_path = 'bookofdreamsghos00lang.pdf'
In [16]: with open(file_path, 'rb') as open_file:
    ...:     btext = open_file.read()
    ...:
In [17]: btext[0]
Out[17]: 37
In [18]: btext[:25]
Out[18]: b'%PDF-1.5\n%\xec\xf5\xf2\xe1\xe4\xef\xe3\xf5\xed\xe5\xee\xf4\n18'
- Adding this opens the file without any line-ending conversion.
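The conversion can be observed by writing Windows-style line endings as raw bytes and reading them back in both modes. This is a sketch with a made-up file name, relying on the default newline handling of open in text mode:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "windows.txt")

    # Write raw bytes so the \r\n line endings land on disk unchanged.
    with open(file_path, "wb") as open_file:
        open_file.write(b"one\r\ntwo\r\n")

    with open(file_path, "r") as open_file:     # text mode: \r\n becomes \n
        as_text = open_file.read()

    with open(file_path, "rb") as open_file:    # binary mode: bytes as stored
        as_bytes = open_file.read()

print(repr(as_text), as_bytes)
```

Indexing the bytes object returns an integer, which is why btext[0] above is 37 rather than '%'.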
write mode
- To write to a file, use the write mode, represented as the argument w.
- The tool direnv is used to automatically set up some development environments.
- You can define environment variables and application runtimes in a file named .envrc; direnv uses it to set these things up when you enter the directory with the file.
- You can set the environment variable STAGE to PROD and TABLE_ID to token-storage-1234 in such a file in Python by using open with the write flag:
In [19]: text = '''export STAGE=PROD
    ...: export TABLE_ID=token-storage-1234'''
In [20]: with open('.envrc', 'w') as opened_file:
    ...:     opened_file.write(text)
    ...:
In [21]: !cat .envrc
export STAGE=PROD
export TABLE_ID=token-storage-1234
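One detail not stated above but worth keeping in mind: opening an existing file with w truncates it before writing. A sketch using a temporary directory, so that no real .envrc is touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = os.path.join(tmp_dir, ".envrc")

    with open(env_path, "w") as opened_file:
        # write returns the number of characters written.
        written = opened_file.write("export STAGE=DEV\n")

    # Opening with "w" again discards the previous contents.
    with open(env_path, "w") as opened_file:
        opened_file.write("export STAGE=PROD\nexport TABLE_ID=token-storage-1234\n")

    with open(env_path) as opened_file:
        contents = opened_file.read()

print(written)
print(contents)
```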