5. Objects 201

Python objects are the basic abstraction over data in Python; every value is an object in Python. Every object has an identity, a type and a value. Once the interpreter creates an object, its identity never changes. The id(obj) function returns an integer representing the obj's identity. The is operator compares the identity of two objects returning a boolean. In CPython, the id() function returns an integer that is a memory location of the object thus uniquely identifying such object. This value is an implementation detail, and implementations of Python are free to return whatever value uniquely identifies objects within the interpreter.
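The distinction between identity and value can be illustrated with a minimal sketch. The variable names below are chosen only for illustration; the sketch shows that the is operator agrees with comparing the results of id(), while == compares values.

```python
# Two names bound to one list share an identity; an equal but separate list does not.
a = [1, 2, 3]
b = a            # b refers to the same object as a
c = [1, 2, 3]    # c is a distinct object with an equal value

same_object = a is b           # identical objects
equal_value = a == c           # values compare equal
distinct_object = a is not c   # but the identities differ

# `is` compares identities, so it agrees with comparing id() results
ids_agree = (id(a) == id(b)) and (id(a) != id(c))
print(same_object, equal_value, distinct_object, ids_agree)
```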

The type() function returns an object’s type; the type of an object is also an object itself. An object’s type is also normally unchangeable. An object’s type determines the operations that the object supports and also defines the possible values for objects of that type. Python is a dynamic language because types are not associated with variables so a variable, x, may refer to a string and later refer to an integer as shown in the following example.

Listing 4.0: A Python variable that is reassigned

x = 1
x = "Nkem"

However, unlike dynamic languages such as JavaScript, Python is strongly typed: the interpreter never implicitly changes the type of an object. Strong typing means that actions such as adding a number to a string raise an exception in Python, as shown in the following snippet:

Listing 4.1: Strong typing in Python

>>> x = "Nkem"
>>> x + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly

Listing 4.1 is unlike JavaScript, where the equivalent expression succeeds because the interpreter implicitly converts the integer to a string and then concatenates the two strings. The exact wording of the error message varies between Python versions.

Python objects are either:

Mutable objects: These are objects whose value can change. For example, a list is a mutable data structure because we can grow or shrink it in place, as in Listing 4.2.

Listing 4.2: Mutating a Python object

>>> x = [1, 2, 4]
>>> y = [5, 6, 7]
>>> x.extend(y)  # grows the existing list in place
>>> x
[1, 2, 4, 5, 6, 7]

Programmers new to Python from other languages may find some behaviour of mutable objects puzzling. Python is a pass-by-object-reference language: names hold references to objects, and it is these references that are passed to function and method calls. For example, consider the snippet shown in Listing 4.3.

Listing 4.3: Mutating an object referenced by multiple names

>>> x = [1, 2, 3]
# now x and y refer to the same list
>>> y = x
# a change made through x is also visible through y
>>> x.extend([4, 5, 6])
>>> y
[1, 2, 3, 4, 5, 6]

y and x refer to the same object, so a change made through either name shows up through the other. This is because the variable x does not hold the list [1, 2, 3]; instead, it holds a reference that points to the location of that object. When the variable y is bound to the value referenced by x, it also holds a reference to the original list. Any operation on x finds the list that x refers to and carries out the operation on that list; since y refers to the same list, the change is also visible through y.

Immutable objects: These objects have values that cannot be changed. A tuple is an example of an immutable data structure because the constituent objects cannot change once the tuple is created as in Listing 4.4.

Listing 4.4: An immutable Python type

>>> x = (1, 2, 3, 4)
>>> x[0]
1
>>> x[0] = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

However, if an immutable object contains a mutable object, the mutable object can have its value changed even if it is part of an immutable object. For example, a tuple is immutable, but if it contains a list object, a mutable object, then we can change the value of the list object as shown in the snippet in Listing 4.5.

Listing 4.5: Mutating a list contained in a tuple

>>> t = [1, 2, 3, 4]
>>> x = t,
>>> x
([1, 2, 3, 4],)
>>> x[0]
[1, 2, 3, 4]
>>> x[0].append(10)
>>> x
([1, 2, 3, 4, 10],)
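The pass-by-object-reference behaviour described earlier in this section also applies to function calls. The following is a minimal sketch, with the hypothetical helper functions append_item and rebind, showing that mutating an argument is visible to the caller while rebinding the parameter name is not.

```python
def append_item(items):
    # Mutates the list object that the caller passed in
    items.append(99)


def rebind(items):
    # Rebinds the local name only; the caller's list is untouched
    items = [0]
    return items


numbers = [1, 2, 3]
append_item(numbers)
after_mutation = list(numbers)   # snapshot after the in-place change

rebind(numbers)
after_rebind = list(numbers)     # unchanged by the rebinding inside rebind()

print(after_mutation, after_rebind)
```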

5.1 Strong and Weak Object References

Python objects get references when they are bound to names. This binding can take the form of an assignment, a function or method call that binds objects to argument names, and so on. Every time an object gets a reference, the object’s reference count increases. We can get the reference count for an object using the sys.getrefcount function, as shown in Listing 4.6.

Listing 4.6: Getting the reference count of an object

>>> import sys
>>> l = []
>>> m = l
# note that there are 3 references to the list object: l, m and the binding
# to the argument of the sys.getrefcount function
>>> sys.getrefcount(l)
3

Python has two kinds of references, strong and weak references, but when discussing references, we are almost certainly referring to a strong reference. The example in Listing 4.6, for instance, has three references and these are all strong references. The defining characteristic of a strong reference in Python is that whenever the interpreter creates a new strong reference, the reference count for the referenced object increments by 1. This means that an object is never collected while it is still reachable through a strong reference; in CPython, an object is normally reclaimed as soon as its reference count drops to 0. Weak references, on the other hand, do not increase the reference count of the referenced object; one can create a weak reference using the weakref module. The snippet in Listing 4.7 shows a weak reference in action.

Listing 4.7: Creating a weak reference

>>> import sys
>>> import weakref
>>> class Foo:
...     pass
...
>>> a = Foo()
>>> b = a
>>> sys.getrefcount(a)
3
>>> c = weakref.ref(a)
>>> sys.getrefcount(a)
3
>>> c()
<__main__.Foo object at 0x1012d6828>
>>> type(c)
<class 'weakref'>

The weakref.ref function returns an object that, when called, returns the weakly referenced object. The weakref module also provides weakref.proxy as an alternative to weakref.ref for creating weak references. This function creates a proxy object that can be used just like the original object without the need for a call, as in Listing 4.8.

Listing 4.8: Creating a weak reference using weakref.proxy

>>> d = weakref.proxy(a)
>>> d
<weakproxy at 0x10138ba98 to Foo at 0x1012d6828>
>>> d.__dict__
{}

When all the strong references to an object are gone, the weak reference no longer refers to the object, and the object is ready for garbage collection. Any further attempt to use a proxy raises an exception, while calling a weakref.ref object returns None, as shown in Listing 4.9.

Listing 4.9: Using a weak reference after its referent is gone

>>> del a
>>> del b
>>> d
<weakproxy at 0x10138ba98 to NoneType at 0x1002040d0>
>>> d.__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ReferenceError: weakly-referenced object no longer exists
>>> c()
>>> c().__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute '__dict__'
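A weak reference can also notify us when its referent goes away. The sketch below assumes CPython-style reference counting (the gc.collect() call is included only as a precaution for other implementations) and uses the hypothetical class Node; weakref.ref accepts an optional callback that is invoked when the referent is about to be finalized.

```python
import gc
import weakref


class Node:
    pass


events = []
node = Node()

# The callback receives the weak reference object itself
ref = weakref.ref(node, lambda r: events.append("finalized"))
alive_before = ref() is node   # the referent is still reachable

del node        # drop the only strong reference
gc.collect()    # not required under CPython's reference counting; harmless elsewhere

alive_after = ref() is not None
print(alive_before, alive_after, events)
```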

5.2 The Type Hierarchy

Python comes with its own set of built-in types, and these built-in types broadly fall into one of the following categories:

`None` Type

The None type is a singleton object that has a single value, and this value is accessed through the built-in name None. It signifies the absence of a value in many situations, e.g., it is returned by functions that don’t explicitly return a value as illustrated below:

Listing 4.10: The None type

>>> def print_name(name):
...     print(name)
...
>>> name = print_name("nkem")
nkem
>>> name
>>> type(name)
<class 'NoneType'>

The None object has a truth value of false.
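Because None is a singleton and other falsy values exist (0, empty strings, empty containers), a check for a missing value is usually written with the is operator rather than with a truth test. The sketch below uses a hypothetical find_index function to show why.

```python
def find_index(items, target):
    """Return the index of target in items, or None when it is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None


hit = find_index(["a", "b"], "a")    # index 0, which is falsy but is not None
miss = find_index(["a", "b"], "z")   # None

# Compare against None by identity; a plain truth test would misread index 0
found_hit = hit is not None
found_miss = miss is not None
print(hit, miss, found_hit, found_miss)
```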

`NotImplemented` Type

The NotImplemented type is another singleton object that has a single value. The value of this object is accessed through the built-in name NotImplemented. This object should be returned by binary special methods such as __eq__() when they do not support the operand they are given; returning it delegates the search for an implementation to the interpreter rather than raising an exception. It should not be confused with the NotImplementedError exception. For example, consider the two types, Foo and Bar, below:

Listing 4.11: The NotImplemented Type

class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Foo):
            print('Comparing an instance of Foo with another instance of Foo')
            return other.value == self.value
        elif isinstance(other, Bar):
            print('Comparing an instance of Foo with an instance of Bar')
            return other.value == self.value
        print('Could not compare an instance of Foo with the other class')
        return NotImplemented

class Bar:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Bar):
            print('Comparing an instance of Bar with another instance of Bar')
            return other.value == self.value
        print('Could not compare an instance of Bar with the other class')
        return NotImplemented

When we attempt to compare both objects in Listing 4.11, we can observe the effect of returning NotImplemented. In Python, a == b results in a call to a.__eq__(b). In this example, the instances of Foo and Bar have implementations for comparing themselves to other instances of the same class, and Foo additionally handles instances of Bar, as shown in Listing 4.12:

Listing 4.12: Comparing instances of Foo and Bar class

>>> f = Foo(1)
>>> b = Bar(1)
>>> f == b
Comparing an instance of Foo with an instance of Bar
True
>>> f == f
Comparing an instance of Foo with another instance of Foo
True
>>> b == b
Comparing an instance of Bar with another instance of Bar
True

What happens when we compare f with b? The implementation of __eq__() in Foo checks that the other argument is an instance of Bar and handles it accordingly returning a value of True; see Listing 4.13.

Listing 4.13: Comparing two objects

>>> f == b
Comparing an instance of Foo with an instance of Bar
True

When b is compared with f, b.__eq__(f) is invoked and the NotImplemented object is returned because the implementation of __eq__() in Bar only supports comparison with Bar instances. However, Listing 4.14 shows that the comparison still succeeds when we swap the order of the objects. What has happened?

Listing 4.14: Comparing two objects

>>> b == f
Could not compare an instance of Bar with the other class
Comparing an instance of Foo with an instance of Bar
True

The call to b.__eq__(f) returned NotImplemented, causing the interpreter to try the reflected operation by invoking the __eq__() method of Foo as f.__eq__(b). Since the __eq__() method in Foo defines a comparison between Foo and Bar, the correct result, True, is returned.
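It is also worth seeing what happens when neither operand supports the comparison. In the sketch below, the hypothetical classes Metres and Seconds both return NotImplemented for foreign types; the interpreter then falls back to its default behaviour for ==, which is an identity comparison.

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Metres):
            return self.value == other.value
        return NotImplemented   # let the interpreter try the other operand


class Seconds:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Seconds):
            return self.value == other.value
        return NotImplemented


m, s = Metres(1), Seconds(1)
mixed_equal = (m == s)           # both sides decline, so == falls back to identity
mixed_not_equal = (m != s)       # the default != is the inverse of that result
same_type_equal = (m == Metres(1))
print(mixed_equal, mixed_not_equal, same_type_equal)
```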

The NotImplemented object has a truth value of true, although evaluating it in a boolean context is deprecated since Python 3.9.

`Ellipsis` Type

This is another singleton object type that has a single value. The value of this object is accessed through the literal ... or the built-in name Ellipsis. The Ellipsis object has a truth value of True. The Ellipsis object is used mainly in numeric Python for indexing and slicing matrices. The numpy documentation provides more insight into how the Ellipsis object is used.
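Without relying on numpy, the way the Ellipsis object reaches indexing code can be sketched with a small class. Recorder is a hypothetical helper that simply returns whatever index it receives, which makes it visible that the literal ... is passed to __getitem__ as the Ellipsis object.

```python
class Recorder:
    """Hypothetical helper that returns the index object it is given."""

    def __getitem__(self, index):
        return index


r = Recorder()
single = r[...]          # the Ellipsis object itself
multi = r[0, ..., 1]     # a tuple containing Ellipsis

literal_is_name = ... is Ellipsis   # the literal and the built-in name are one object
truth_value = bool(...)
print(single, multi, literal_is_name, truth_value)
```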

Numeric Type

Python numeric types, also referred to as numbers are immutable types that fall into one of the following categories:

Integers: These represent elements from the set of positive and negative integers. Python 2 distinguished between the two integer types below; Python 3 merges them into the single int type:
1. Plain integers: In Python 2, these were numbers in the range of -2147483648 through 2147483647 on a 32-bit machine; the range depended on the machine word size. Results that fell outside this range were returned as long integers. For the purpose of shift and mask operations, integers are assumed to have a binary, 2’s complement notation using 32 or more bits, hiding no bits from the user.
2. Long integers: Long integers hold integer values that are as large as the system’s virtual memory can support. In Python 3, every int behaves this way. Listing 4.15 is an example of such a large integer.

Listing 4.15: A large integer

>>> 238**238
422003234274091507517421795325920182528086611140712666297183769
390925685510755057402680778036236427150019987694212157636287196
316333783750877563193837256416303318957733860108662430281598286
073858990878489423027387093434036402502753142182439305674327314
588077348865742839689189553235732976315624152928932760343933360
660521328084551181052724703073395502160912535704170505456773718
101922384718032634785464920586864837524059460946069784113790792
337938047537052436442366076757495221197683115845225278869129420
5907022278985117566190920525466326339246613410508288691503104

Note that from the perspective of a user, there is no difference between small and large integers, as the interpreter handles all conversions without any user input.
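The seamless handling of large integers can be checked with a short sketch, assuming Python 3, where a value far beyond the 64-bit range still has the type int.

```python
small = 7
big = 2 ** 100   # far larger than any 64-bit machine word

same_type = type(small) is type(big)   # both are plain int objects
bits = big.bit_length()                # number of bits needed to represent big
print(big, same_type, bits)
```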

Booleans: These represent the truth values False and True. The Boolean type is a subtype of the integer type. The False and True values behave like 0 and 1 respectively, except when converted to a string, in which case the strings “False” and “True” are returned, as in Listing 4.16.

Listing 4.16: The Boolean type

>>> x = 1
>>> y = True
>>> x + y
2
>>> a = 1
>>> b = False
>>> a + b
1
>>> b == 0
True
>>> y == 1
True
>>> str(True)
'True'
>>> str(False)
'False'
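Since Booleans behave like the integers 0 and 1, they can be summed to count how many items satisfy a condition. The scores list below is made-up data used only for illustration.

```python
is_subclass = issubclass(bool, int)   # bool derives from int

scores = [72, 45, 90, 38, 61]
# Each comparison yields True or False, which sum() treats as 1 or 0
passed = sum(score >= 50 for score in scores)
print(is_subclass, passed)
```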

Float: These represent machine-level double-precision floating-point numbers. The underlying machine architecture and the specific Python implementation determine the accepted range and the handling of overflow, so CPython is limited by the underlying C language, while Jython is limited by the underlying Java language.
Complex Numbers: These represent complex numbers as a pair of machine-level double-precision floating-point numbers. The same caveats apply as for floating-point numbers. Complex numbers can be created using the built-in complex constructor, as in Listing 4.17.

Listing 4.17: Complex numbers

>>> complex(1,2)
(1+2j)

We can also create complex numbers by using a number literal followed by a j character. For instance, the previous complex number example can be created by the expression, 1+2j. The real and imaginary parts of a complex number z are retrieved through the read-only attributes z.real and z.imag.
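A brief sketch of the attributes mentioned above follows; it also demonstrates that real and imag are read-only by attempting an assignment and catching the resulting AttributeError.

```python
z = 3 + 4j

real, imag = z.real, z.imag   # both parts are stored as floats
magnitude = abs(z)            # distance from the origin
conjugate = z.conjugate()

try:
    z.real = 10               # the attributes cannot be assigned to
    read_only = False
except AttributeError:
    read_only = True

print(real, imag, magnitude, conjugate, read_only)
```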

Sequence Type

Sequence types are finite ordered collections of objects that can be indexed by integers; using negative indices in Python is legal. Sequences fall into two categories - mutable and immutable sequences.

Immutable sequences: An immutable sequence type object is one whose value cannot change once it is created. This means that the collection of objects that are directly referenced by an immutable sequence is fixed. The collection of objects referenced by an immutable sequence may be composed of mutable objects whose value may change at runtime, but the mutable object itself that is directly referenced by an immutable sequence cannot be changed. For example, a tuple is an immutable sequence but if one of the elements in the tuple is a list, a mutable sequence, then the list can change, but the reference to the list object that tuple holds cannot be changed as in Listing 4.18.

Listing 4.18: Immutable collection of mutable objects

>>> t = [1, 2, 3], "obi", "ike"
>>> type(t)
<class 'tuple'>
>>> t[0].append(4) # mutate the list
>>> t
([1, 2, 3, 4], 'obi', 'ike')
>>> t[0] = [] # attempt to change the reference in tuple
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

The following are built-in immutable sequence types:

Strings: A string is an immutable sequence of Unicode code points or more informally an immutable sequence of characters. There is no char type in Python, so a character is just a string of length, 1. Strings in Python can represent all Unicode code points in the range U+0000 - U+10FFFF. All text in Python is Unicode, and the type of the objects used to hold such text is str.
Bytes: A bytes object is an immutable sequence of 8-bit bytes. Each byte is represented by an integer in the range 0 <= x < 256. Bytes literals such as b'abc' and the built-in function bytes() are used to create bytes objects. Bytes objects have an intimate relationship with strings. Strings are abstractions over text, while the computer stores text internally as bytes. Decoding a sequence of bytes with an encoding such as UTF-8 produces a string, and encoding the abstract characters of a string with an encoding such as UTF-8 produces its binary representation as a bytes object. Listing 4.19 illustrates the relationship between bytes and strings.

Listing 4.19: Converting between bytes and strings

>>> b = b'abc'
>>> b
b'abc'
>>> type(b)
<class 'bytes'>
>>> b = bytes('abc', 'utf-16') # encode a string to bytes using UTF-16 encoding
>>> b
b'\xff\xfea\x00b\x00c\x00'
>>> b.decode("utf-8") # decoding fails as encoding has been done with utf-16
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>>> b.decode("utf-16") # decoding to string passes
'abc'
>>> type(b.decode("utf-16"))
<class 'str'>
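The difference between characters and bytes becomes clearer with non-ASCII text. In the sketch below, the sample string is an arbitrary choice; it has five characters, but its UTF-8 encoding needs six bytes.

```python
text = "naïve"                       # five Unicode code points

utf8 = text.encode("utf-8")          # 'ï' needs two bytes in UTF-8
latin1 = text.encode("latin-1")      # 'ï' needs one byte in Latin-1

round_trip = utf8.decode("utf-8")    # decoding with the same encoding restores the text
first_byte = utf8[0]                 # indexing bytes yields an integer

print(len(text), len(utf8), len(latin1), round_trip, first_byte)
```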

Tuple: A tuple is a sequence of arbitrary Python objects. A tuple of two or more items is created by comma-separated lists of expressions. A tuple of one item is formed by affixing a comma to an expression while an empty pair of parentheses form an empty tuple. Listing 4.20 illustrates the various ways we create a tuple.

Listing 4.20: Creating tuples

>>> names = "Obi",  # tuple of 1
>>> names
('Obi',)
>>> type(names)
<class 'tuple'>

>>> names = ()  # tuple of 0
>>> names
()
>>> type(names)
<class 'tuple'>

>>> names = "Obi", "Ike", 1  # tuple of 2 or more
>>> names
('Obi', 'Ike', 1)
>>> type(names)
<class 'tuple'>

Mutable sequences: A mutable sequence type is one whose value can change after it has been created. There are currently two built-in mutable sequence types: byte arrays and lists.
1. Byte Arrays: Bytearray objects are mutable arrays of bytes. Byte arrays are created using the built-in bytearray() constructor. Apart from being mutable and thus unhashable, byte arrays provide the same interface and functionality as immutable bytes objects. Bytearrays are very useful when the efficiency offered by their mutability is required. For example, when receiving an unknown amount of data over a network, byte arrays are efficient because the array can be extended as more data is received without having to allocate new objects, as is the case with the immutable bytes type.
2. Lists: Lists are sequences of arbitrary Python objects. Lists are created by placing a comma-separated list of expressions in square brackets. The empty list is formed with empty square brackets, []. A list can be created from any iterable by passing the iterable to the list constructor. The list is one of the most widely used data types in Python.
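The network scenario described for byte arrays can be sketched as follows. The chunks list is a stand-in for data that would arrive from a socket; no real network code is involved.

```python
# Stand-in for chunks of data arriving over a network connection
chunks = [b"GET / ", b"HTTP/1.1", b"\r\n"]

buffer = bytearray()
buffer_id = id(buffer)

for chunk in chunks:
    buffer.extend(chunk)      # grows the same object in place

same_object = id(buffer) == buffer_id
message = bytes(buffer)       # take an immutable snapshot when finished
print(same_object, message)
```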

Sequence types have some operations that are common to all sequence types. These are described in the following table; x is an object, s and t are sequences and n, i, j, k are integers.

Operation	Result
x in s	True if an item of s is equal to x, else False
x not in s	False if an item of s is equal to x, else True
s + t	the concatenation of s and t
s * n or n * s	n shallow copies of s concatenated
s[i]	ith item of s, origin 0
s[i:j]	slice of s from i to j
s[i:j:k]	slice of s from i to j with step k
len(s)	length of s
min(s)	smallest item of s
max(s)	largest item of s
s.index(x[, i[, j]])	index of the first occurrence of x in s (at or after index i and before index j)
s.count(x)	total number of occurrences of x in s
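The operations in the table above can be tried on any sequence. The sketch below applies them to a small list; the values are arbitrary.

```python
s = [3, 1, 4, 1, 5]

membership = (4 in s, 9 not in s)
concatenated = s + [9, 2]
repeated = [0] * 3
first, last = s[0], s[-1]
middle = s[1:4]               # items at indices 1, 2 and 3
every_other = s[::2]          # step of 2
summary = (len(s), min(s), max(s))
first_one = s.index(1)        # first occurrence of 1
next_one = s.index(1, 2)      # first occurrence at or after index 2
ones = s.count(1)

print(membership, concatenated, repeated, first, last)
print(middle, every_other, summary, first_one, next_one, ones)
```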

Note

Values of n that are less than 0 are treated as 0 and this yields an empty sequence of the same type as s such as below:

Listing 4.21: Creating a string

>>> x = "obi"
>>> x*-2
''

Copies made using the * operation are shallow copies; any nested structures are not copied. This can result in confusion when trying to create copies of a structure such as a nested list as in Listing 4.22.

Listing 4.22: Shallow copying

>>> lists = [[]] * 3  # shallow copy: all three items reference the same list
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]

To avoid shallow copies when dealing with nested lists, follow the example in Listing 4.23.

Listing 4.23: Creating copies of a nested list

>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]

When i or j is negative, the index is relative to the end of the sequence; len(s) + i or len(s) + j is substituted for the negative value of i or j.

Concatenating immutable sequences such as strings results in a new object, as shown in Listing 4.24:

Listing 4.24: Concatenating strings creates a new object

>>> name = "Obi"
>>> id(name)
4330660336
>>> name += " Ike-Nwosu"
>>> id(name)
4330641208

Python defines the interfaces (that’s the closest word that can be used) Sequence and MutableSequence in the collections.abc module, and these define all the methods a type must implement to be considered an immutable or a mutable sequence. The concept of interfaces will become more apparent when we discuss abstract base classes.
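As a small preview, the sketch below checks built-in types against these interfaces and defines a hypothetical Deck class. By implementing only __getitem__ and __len__ and inheriting from Sequence, Deck receives the remaining sequence methods as mixins.

```python
from collections.abc import MutableSequence, Sequence

checks = (
    isinstance([], MutableSequence),   # lists are mutable sequences
    isinstance((), Sequence),          # tuples are sequences
    isinstance((), MutableSequence),   # but not mutable ones
    isinstance(set(), Sequence),       # sets are not sequences at all
)


class Deck(Sequence):
    """Hypothetical immutable sequence built on a tuple."""

    def __init__(self, cards):
        self._cards = tuple(cards)

    def __getitem__(self, index):
        return self._cards[index]

    def __len__(self):
        return len(self._cards)


deck = Deck(["A", "K", "Q"])
has_king = "K" in deck                # __contains__ comes from Sequence
queen_at = deck.index("Q")            # so do index() and count()
backwards = list(reversed(deck))      # and __reversed__
print(checks, has_king, queen_at, backwards)
```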

Set

These are unordered, finite collections of unique Python objects. Sets are unordered, so they cannot be indexed by integers. The members of a set must be hashable, so mutable built-in objects such as lists cannot be members of a set. This is because sets in Python are implemented using a hash table; a hash table uses a hash function to compute an index into a slot. If the value of a member could change, the computed index would change with it, so mutable values are not allowed in sets. Sets provide efficient solutions for membership testing, de-duplication, and computing intersections, unions and differences. Sets can be iterated over, and the built-in function len() returns the number of items in a set. There are currently two intrinsic set types: the mutable set type and the immutable frozenset type. Both have several common methods, shown in Table 4.0.

Table 4.0: Mutable and Immutable Set operations
Method	Description
len(s)	return the cardinality of the set, s.
x in s	Test x for membership in s.
x not in s	Test x for non-membership in s.
isdisjoint(other)	Return True if the set has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set.
issubset(other), set <= other	Test whether every element in the set is in other.
set < other	Test whether the set is a proper subset of other, that is, set <= other and set != other.
issuperset(other), set >= other	Test whether every element in other is in the set.
set > other	Test whether the set is a proper superset of other, that is, set >= other and set != other.
union(other, …), set | other | …	Return a new set with elements from the set and all others.
intersection(other, …), set & other & …	Return a new set with elements common to the set and all others.
difference(other, …), set - other - …	Return a new set with elements in the set that are not in the others.
symmetric_difference(other), set ^ other	Return a new set with elements in either the set or other but not both.
copy()	Return a shallow copy of the set.
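A few of the operations from Table 4.0 are sketched below on two small sets; the values are arbitrary.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

union = evens | small
intersection = evens & small
difference = evens - small
symmetric = evens ^ small

subset = {0, 2} <= evens
proper_subset = {0, 2} < evens
disjoint = evens.isdisjoint({1, 3})

unique = set([1, 1, 2, 2, 3])   # de-duplication of a list
print(union, intersection, difference, symmetric)
print(subset, proper_subset, disjoint, unique)
```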

Frozen set: This represents an immutable set. A frozen set is created by the built-in frozenset() constructor. A frozenset is immutable and thus hashable so it can be used as an element of another set, or as a dictionary key.
Set: This represents a mutable set, and it is created using the built-in set() constructor. The mutable set is not hashable and cannot be part of another set. A non-empty set can also be created using a set literal such as {1, 2, 3}; note that {} creates an empty dictionary, not an empty set. Methods unique to the mutable set are in Table 4.1:

Table 4.1: Mutable Set operations
Method	Description
update(other, …), set |= other | …	Update the set, adding elements from all others.
intersection_update(other, …), set &= other & …	Update the set, keeping only elements found in it and all others.
difference_update(other, …), set -= other | …	Update the set, removing elements found in others.
symmetric_difference_update(other), set ^= other	Update the set, keeping only elements found in either set, but not in both.
add(elem)	Add element elem to the set.
remove(elem)	Remove element, elem, from the set. Raises KeyError if elem is not contained in the set.
discard(elem)	Remove element, elem, from the set if it is present.
pop()	Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
clear()	Remove all elements from the set.
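The sketch below exercises some of the mutating methods from Table 4.1 and contrasts the hashability of set and frozenset. The tag names are made up for illustration.

```python
tags = {"python"}
tags.add("objects")
tags.update(["types", "python"])   # duplicates are ignored
tags.discard("missing")            # no error when the element is absent

try:
    tags.remove("missing")         # remove() is strict about missing elements
    remove_failed = False
except KeyError:
    remove_failed = True

frozen = frozenset(tags)
chapters = {frozen: "chapter 5"}   # a frozenset is hashable, so it can be a key

try:
    nested = {tags}                # a mutable set cannot be a set member
    set_hashable = True
except TypeError:
    set_hashable = False

empty_braces_type = type({})       # {} is a dict, not a set
print(sorted(tags), remove_failed, set_hashable, empty_braces_type)
```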

Mapping

A Python mapping is a finite set of objects (values) indexed by a set of hashable Python objects (keys). The keys in the mapping must be hashable for the same reason given previously in describing set members, thus eliminating mutable types like lists, sets, mappings etc. The expression a[k] selects the item indexed by the key k from the mapping a, and can be used in expressions and as the target of assignments or del statements. The dictionary, mostly called dict for convenience, is the only intrinsic mapping type built into Python:

Dictionary: Dictionaries are created by placing a comma-separated sequence of key: value pairs within braces, for example: {'name': "obi", 'age': 18}, or by the dict() constructor. The main operations supported by the dictionary type are the addition, deletion and selection of values using a given key. Adding a key that already exists to a dict replaces the old value associated with the key. Attempting to access a value with a non-existent key will result in a KeyError exception. Dictionaries are perhaps one of the most important types within the interpreter. Without explicitly making use of a dictionary, the interpreter is already using them in several different places. For example, namespaces in Python, which are discussed in a subsequent chapter, are implemented using dictionaries; this means that every time a symbol is referenced within a program, a dictionary access occurs. Objects are also built on dictionaries in Python; by default, the attributes of a Python object are stored in a dictionary attribute, __dict__. These are but a few applications of this type within the Python interpreter.
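The main dictionary operations, and the dictionary behind an ordinary object, can be sketched as follows. Point is a hypothetical class used only to show the __dict__ attribute.

```python
person = {"name": "obi", "age": 18}

person["age"] = 19        # an existing key has its value replaced
person["city"] = "Lagos"  # a new key is added
del person["city"]        # and can be deleted again

try:
    person["email"]       # selecting a missing key raises KeyError
    missing_raised = False
except KeyError:
    missing_raised = True

email = person.get("email")   # get() returns None instead of raising


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
attributes = p.__dict__       # instance attributes live in a dictionary
print(person, missing_raised, email, attributes)
```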

Python supplies more advanced forms of the dictionary type in its collections library. These include the OrderedDict, which introduces order into a dictionary and thus remembers the insertion order of items, and the defaultdict, which takes a factory function that is called to produce a value when a key is missing. Listing 4.25 illustrates the use of a defaultdict.

Listing 4.25: Using a defaultdict

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d
defaultdict(<class 'int'>, {})
>>> d[7]
0
>>> d
defaultdict(<class 'int'>, {7: 0})
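A common use of defaultdict is grouping items without first checking whether a key exists. The sketch below groups a made-up list of words by their first letter, using list as the factory function.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = defaultdict(list)       # missing keys start out as an empty list
for word in words:
    by_letter[word[0]].append(word)

grouped = dict(by_letter)           # convert to a plain dict for display
print(grouped)
```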

Callable Types

These are types that support the function call operation, which is the use of () after an object, optionally with arguments. Functions are not the only callable types in Python; any object whose type implements the __call__ special method is callable. The built-in function callable(obj) checks whether a given object is callable. The following are built-in callable Python types:

User-defined functions: these are functions that a user defines with the def statement, such as the print_name function from the previous section.
Methods: these are functions defined within a class and accessible within the scope of the class or a class instance. These methods could either be instance methods, static or class methods.
Built-in functions: These are functions available within the interpreter core such as the len function.
Classes: Classes are also callable types. The process of creating a class instance involves calling the class, such as Foo().

We discuss each of the above types in detail in subsequent chapters.
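Ahead of those chapters, the sketch below shows the __call__ special method in action. Multiplier is a hypothetical class whose instances can be called like functions.

```python
class Multiplier:
    """Hypothetical callable object that multiplies by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


double = Multiplier(2)   # calling the class creates an instance
result = double(21)      # calling the instance invokes __call__

checks = (
    callable(double),      # instance with __call__
    callable(len),         # built-in function
    callable(Multiplier),  # class
    callable(42),          # plain integer
)
print(result, checks)
```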

Custom Type

Custom types are created using the class statement. Custom class objects have a type of type. These are types created by user-defined programs, and they are discussed in the chapter on object-oriented programming.

Module Type

A module is one of the organizational units of Python code just like functions or classes. A module is also an object just like every other value in Python. The module type is created by the import system as invoked either by the import statement or by calling functions such as importlib.import_module() and built-in __import__().
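That a module is an ordinary object can be verified with a short sketch. The standard library json module is used here purely as an example of something to import.

```python
import importlib
import sys
import types

json_module = importlib.import_module("json")   # same effect as `import json`

is_module = isinstance(json_module, types.ModuleType)
module_name = json_module.__name__
cached = sys.modules["json"] is json_module     # imported modules are cached
print(is_module, module_name, cached)
```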

File/IO Types

A file object represents an open file. Files are created using the built-in open function, which opens a file on the local file system and returns a file object; the file object can be opened in either binary or text mode. Other methods for creating file objects include:

os.fdopen, which takes a file descriptor and creates a file object from it. The os.open function, not to be confused with the built-in open function, creates a file descriptor that can then be passed to os.fdopen to create a file object, as shown in the following example.

Listing 4.26: Opening a file

>>> import os
>>> fd = os.open("test.txt", os.O_RDWR|os.O_CREAT)
>>> type(fd)
<class 'int'>
>>> fd
3
>>> fo = os.fdopen(fd, "w")
>>> fo
<_io.TextIOWrapper name=3 mode='w' encoding='UTF-8'>
>>> type(fo)
<class '_io.TextIOWrapper'>

os.popen(): this opens a pipe to or from a command and returns a file object connected to that pipe.
makefile() method of a socket object, which opens and returns a file object associated with the socket on which it was called.

The built-in objects, sys.stdin, sys.stdout and sys.stderr, are also file objects corresponding to the python interpreter’s standard input, output and error streams.
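The text and binary modes of the built-in open function can be compared with the sketch below. It writes to a temporary directory so that nothing is left behind; the file name test.txt is an arbitrary choice.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "test.txt")

    # Text mode: str objects are encoded on the way out
    with open(path, "w", encoding="utf-8") as text_out:
        text_out.write("hello")

    # Text mode: bytes on disk are decoded back into str
    with open(path, "r", encoding="utf-8") as text_in:
        text = text_in.read()

    # Binary mode: the raw bytes are returned unchanged
    with open(path, "rb") as binary_in:
        raw = binary_in.read()

closed_after_with = text_in.closed   # the with statement closes the file
print(text, raw, closed_after_with)
```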

Built-in Types

These are objects used internally by the python interpreter but accessible by a user program. They include traceback objects, code objects, frame objects and slice objects.

Code Objects

Code objects represent compiled executable Python code, or bytecode. Code objects hold the instructions for the Python virtual machine along with everything necessary to execute the bytecode they represent. They are normally created when a block of code is compiled. A code object can be executed directly using the exec or eval built-in functions. Listing 4.27 defines a simple function and shows how we obtain the code object associated with that function.

Listing 4.27: Function code objects

>>> def return_author_name():
...     return "obi Ike-Nwosu"
...
>>> return_author_name.__code__
<code object return_author_name at 0x102279270, file "<stdin>", line 1>
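Code objects can also be created explicitly with the built-in compile function and then run with eval or exec. The sketch below compiles two small source strings; the file name "<example>" is just a label that would appear in tracebacks.

```python
# Compile an expression and evaluate it against a supplied namespace
expression = compile("a + b", "<example>", "eval")
result = eval(expression, {"a": 2, "b": 3})

# Compile a statement and execute it; names it binds land in the namespace
statement = compile("total = sum(range(5))", "<example>", "exec")
namespace = {}
exec(statement, namespace)

kind = type(expression).__name__   # the type of a code object is named 'code'
print(result, namespace["total"], kind)
```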

We can go further as in Listing 4.28 and inspect the code object using the dir function to see the attributes of the code object.

Listing 4.28: Code object attributes

>>> dir(return_author_name.__code__)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
 '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__',
 '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
 '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts',
 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_kwonlyargcount',
 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']

Of particular interest to us at this point are the non-special attributes, that is, the attributes whose names do not start with an underscore. The following table gives a brief description of the most commonly used of these attributes.

Table 4.2: Code object attributes
Attribute	Description
co_argcount	number of arguments (not including * or ** args)
co_code	bytes object containing the raw compiled bytecode
co_consts	tuple of constants used in the bytecode
co_filename	name of the file in which this code object was created
co_firstlineno	number of the first line in Python source code
co_flags	bitmap of flags: 1=optimized | 2=newlocals | 4=*arg | 8=**arg
co_lnotab	encoded mapping of line numbers to bytecode indices
co_name	name with which this code object was defined
co_names	tuple of names used by the bytecode other than arguments and local variables (for example, global and attribute names)
co_nlocals	number of local variables
co_stacksize	virtual machine stack space required
co_varnames	tuple of names of arguments and local variables
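
To make the table concrete, the following sketch reads a few of these attributes from a small function. The function greet and its parameter names are made up for this illustration; only the co_* attribute names come from the table above.

```python
def greet(name, punctuation="!"):
    # 'message' is a local variable; 'len' is a global (built-in) name.
    message = "Hello, " + name + punctuation
    return len(message)

code = greet.__code__

arg_count = code.co_argcount    # 2: name and punctuation
local_names = code.co_varnames  # ('name', 'punctuation', 'message')
other_names = code.co_names     # ('len',)
local_count = code.co_nlocals   # 3
defined_as = code.co_name       # 'greet'
```

Note that co_varnames lists the arguments first and the remaining local variables after them, while len, which is neither an argument nor a local, shows up in co_names.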

We can view the raw bytecode for the function using the co_code attribute of the code object, as shown in Listing 4.29. The exact bytes vary between Python versions.

Listing 4.29: Raw bytecode of a function

>>> return_author_name.__code__.co_code
b'd\x01\x00S'

The raw bytecode, however, is of little use to a person investigating code objects. This is where the dis module comes into play. The dis module generates a human-readable disassembly of the bytecode in a code object. In Listing 4.30, we use the dis function from the dis module to disassemble the return_author_name function so that we can see the instructions that the interpreter executes.

Listing 4.30: Disassembling a function with the dis module

>>> import dis
>>> dis.dis(return_author_name)
  2           0 LOAD_CONST               1 ('obi Ike-Nwosu')
              3 RETURN_VALUE

The LOAD_CONST instruction reads a value from the co_consts tuple and pushes it onto the top of the stack (the CPython interpreter is a stack-based virtual machine). The RETURN_VALUE instruction pops the top of the stack and returns it to the calling scope, signalling the end of the execution of that Python code block.
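
The same information can be collected programmatically. The sketch below uses dis.get_instructions to list the instruction names and checks that the returned string lives in co_consts. The exact instruction names and offsets depend on the Python version (newer releases may emit different or additional instructions), so the example avoids assuming a fixed sequence.

```python
import dis

def return_author_name():
    return "obi Ike-Nwosu"

# Names of the bytecode instructions, in execution order.
opnames = [ins.opname for ins in dis.get_instructions(return_author_name)]

# The constant loaded by the function is stored in the co_consts tuple.
constants = return_author_name.__code__.co_consts

has_return = any(name.startswith("RETURN") for name in opnames)
has_constant = "obi Ike-Nwosu" in constants
```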

Code objects serve several purposes. They carry information, such as file names and line numbers, that aids interactive debugging and makes readable tracebacks possible when an exception occurs.
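
Code objects do not have to come from a function definition. As a minimal sketch, the built-in compile() function turns source text into a code object, which can then be handed to exec() or eval(). The variable names and the "<example>" file name below are made up for this illustration.

```python
# Compile a statement into a code object and run it with exec().
statement_code = compile("total = price * quantity", "<example>", "exec")
namespace = {"price": 3, "quantity": 4}
exec(statement_code, namespace)
total = namespace["total"]  # 12

# Compile an expression into a code object and evaluate it with eval().
expression_code = compile("price + quantity", "<example>", "eval")
value = eval(expression_code, {"price": 3, "quantity": 4})  # 7

kind = type(statement_code).__name__  # 'code'
```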

Frame Objects

Frame objects represent execution frames; Python code blocks are executed within execution frames. The interpreter's call stack stores information about the currently executing subroutines and is made up of frame objects, one for each active subroutine call. A frame object references the code object being executed together with everything the runtime needs to execute it, including references to the local and global namespaces. Frame objects are linked together to form the call stack. To simplify how this all fits together, we can think of the call stack as a stack data structure (which it is): every time a subroutine is called, a frame object is created and pushed onto the stack, and then the code object referenced by the frame is executed. Some special read-only attributes of frame objects include:

f_back points to the previous stack frame (towards the caller), or None if this is the bottom stack frame.
f_code is the code object being executed in this frame.
f_locals is the dictionary used to look up local variables.
f_globals is the dictionary used to look up global variables.
f_builtins is the dictionary used to look up built-in names.
f_lasti gives the precise instruction; it is an index into the bytecode string of the code object.
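
The attributes above can be explored from ordinary code. The following sketch assumes CPython, where inspect.currentframe() returns the frame of the function that calls it; the function names callee and caller and the variable marker are made up for this illustration.

```python
import inspect

def callee():
    frame = inspect.currentframe()  # the frame executing callee()
    caller_frame = frame.f_back     # one step towards the caller
    return {
        "running": frame.f_code.co_name,
        "called_from": caller_frame.f_code.co_name,
        "caller_local": caller_frame.f_locals["marker"],
        "same_globals": frame.f_globals is globals(),
    }

def caller():
    marker = "set in caller()"
    return callee()

info = caller()
# info["running"] is 'callee' and info["called_from"] is 'caller'
```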

Some special writable attributes include:

f_trace: If this is not None, this is a function called at the start of each source code line.
f_lineno: This is the current line number of the frame. Writing to this from within a trace function jumps to the given line only for the bottom-most frame. A debugger can implement a Jump command by writing to f_lineno.
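
As a rough sketch of how a per-frame trace function sees each source line, the example below installs a global trace function with sys.settrace. For the frame we care about, it returns a local trace function, which the interpreter then uses as that frame's trace function. The helper name record_line_offsets and the function sample are made up for this illustration.

```python
import sys

def record_line_offsets(func):
    """Run func() and record which of its source lines were executed."""
    target = func.__code__
    offsets = []

    def local_trace(frame, event, arg):
        # A "line" event fires at the start of each source line in the frame.
        if event == "line":
            offsets.append(frame.f_lineno - target.co_firstlineno)
        return local_trace

    def global_trace(frame, event, arg):
        # Called when a new frame is entered; the returned function
        # becomes the trace function for that frame only.
        if event == "call" and frame.f_code is target:
            return local_trace
        return None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        func()
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return offsets

def sample():
    a = 1
    b = 2
    return a + b

executed = record_line_offsets(sample)  # [1, 2, 3], relative to the def line
```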

Frame objects support one method:

frame.clear(): This method clears all references to local variables held by the frame. If the frame belonged to a generator, the generator is finalized. This helps break reference cycles involving frame objects. A RuntimeError is raised if the frame is currently executing.
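
A small sketch of frame.clear() in action follows, again assuming CPython's inspect.currentframe(). The function names and the payload variable are made up for this illustration.

```python
import inspect

def capture_frame():
    payload = [1, 2, 3]
    return inspect.currentframe()  # keeps the frame alive after the call returns

frame = capture_frame()
before = dict(frame.f_locals)  # snapshot: {'payload': [1, 2, 3]}
frame.clear()                  # drop the frame's references to its locals
after = dict(frame.f_locals)   # 'payload' is no longer present

def clearing_running_frame_fails():
    # Clearing a frame that is still executing raises RuntimeError.
    try:
        inspect.currentframe().clear()
    except RuntimeError:
        return True
    return False

refused = clearing_running_frame_fails()
```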

Traceback Objects

Traceback objects represent the stack trace of an exception. A traceback object is created when an exception occurs. The interpreter searches for an exception handler by unwinding the execution stack, inserting a traceback object in front of the current traceback for each frame it unwinds. When an exception handler is entered, the stack trace is made available to the program. The stack trace object is accessible as the third item of the tuple returned by sys.exc_info(). When the program contains no suitable handler, the stack trace is written to the standard error stream; if the interpreter is interactive, it is also made available to the user as sys.last_traceback. A few important attributes of a traceback object are shown in the following table.

Table 4.3: Traceback object attributes
Attribute	Description
tb_next	the next level in the stack trace (towards the frame where the exception occurred), or None if there is no next level
tb_frame	the execution frame of the current level
tb_lineno	the line number where the exception occurred
tb_lasti	the precise instruction, as an index into the bytecode

The line number and last instruction in the traceback might differ from the line number of its frame object if the exception occurred in a try statement with no matching except clause or with a finally clause.
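
The chain of traceback objects can be walked by following tb_next. In the sketch below, the function names fail, outer and collect_traceback_levels are made up for this illustration; the walk starts at the frame that caught the exception and ends at the frame that raised it.

```python
import sys

def fail():
    return 1 / 0

def outer():
    return fail()

def collect_traceback_levels():
    try:
        outer()
    except ZeroDivisionError:
        tb = sys.exc_info()[2]  # third item is the traceback object
        levels = []
        while tb is not None:
            levels.append((tb.tb_frame.f_code.co_name, tb.tb_lineno))
            tb = tb.tb_next     # move towards where the exception occurred
        return levels

levels = collect_traceback_levels()
names = [name for name, _ in levels]
# names == ['collect_traceback_levels', 'outer', 'fail']
```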

Slice Objects

Slice objects represent slices for the __getitem__() methods of sequence-like objects (more on special methods such as __getitem__() in the chapter on object-oriented programming). Applying a slice to a sequence returns a subset of that sequence, as shown in Listing 4.31.

Listing 4.31: Slicing objects

>>> t = [i for i in range(10)]
>>> t
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> t[:10:2]
[0, 2, 4, 6, 8]

Slice objects are also created by the built-in slice([start,] stop[, step]) function, as in Listing 4.32. The returned object can be placed between the square brackets in place of the usual colon notation.

Listing 4.32: Creating slice objects using the built-in slice function

>>> t = [i for i in range(10)]
>>> t
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> t[:10:2]
[0, 2, 4, 6, 8]
>>> s = slice(None, 10, 2)
>>> s
slice(None, 10, 2)
>>> t[s]
[0, 2, 4, 6, 8]
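
To see the slice object that the interpreter builds from the colon notation, we can define a class whose __getitem__() simply returns whatever key it receives. The class name KeyRecorder is made up for this illustration.

```python
class KeyRecorder:
    """Returns the key passed to __getitem__ so it can be inspected."""

    def __getitem__(self, key):
        return key

recorder = KeyRecorder()

full = recorder[1:10:2]   # slice(1, 10, 2)
partial = recorder[:5]    # slice(None, 5, None)
index = recorder[3]       # 3, a plain integer, not a slice

bounds = (full.start, full.stop, full.step)  # (1, 10, 2)
```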

Slice object read-only attributes include:

Table 4.4: Slice object attributes
Attribute	Description
`start`	the lower bound
`stop`	the upper bound
`step`	the step value

Each attribute is None if omitted. Slices can take several forms in addition to the standard slice(start, stop[, step]). Other forms include:

Listing 4.33: Slice object forms

a[start:end] # items start to end-1; equivalent to a[slice(start, end)]
a[start:]    # items from start to the end of the array; equivalent to a[slice(start, None)]
a[:end]      # items from the beginning to end-1; equivalent to a[slice(None, end)]
a[:]         # a shallow copy of the whole array; equivalent to a[slice(None, None)]

The start or end values may also be negative, in which case we count from the end of the array, as shown below:

Listing 4.34: Slice objects with negative indices

a[-1]    # last item in the array; a plain integer index, so no slice object is involved
a[-2:]   # last two items in the array; equivalent to a[slice(-2, None)]
a[:-2]   # everything except the last two items; equivalent to a[slice(None, -2)]
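
The correspondence between the colon notation and explicit slice objects can be checked directly. The list a below is made up for this illustration. Note in particular that a single-argument call such as slice(2) sets the stop value, not the start value.

```python
a = list(range(10))

checks = {
    "a[2:5]": a[2:5] == a[slice(2, 5)],
    "a[2:]": a[2:] == a[slice(2, None)],
    "a[:5]": a[:5] == a[slice(None, 5)],
    "a[:]": a[:] == a[slice(None, None)],
    "a[-2:]": a[-2:] == a[slice(-2, None)],
    "a[:-2]": a[:-2] == a[slice(None, -2)],
}

# slice(2) is the same as slice(None, 2, None): the single argument is stop.
single_argument_is_stop = slice(2) == slice(None, 2, None)

last_two = a[-2:]  # [8, 9]
```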

Slice objects support one method:

slice.indices(self, length): This method takes a single integer argument, length, and returns a tuple of three integers, (start, stop, stride), that describes how the slice would apply to a sequence of that length. The start and stop values are concrete indices: None and negative values are resolved, and out-of-range values are clipped to the bounds of the sequence. An example is shown in Listing 4.35.

Listing 4.35: Computing slice indices for a given length

>>> s = slice(10, 30, 1)
# applying slice(10, 30, 1) to sequence of length 100 gives [10:30]
>>> s.indices(100)
(10, 30, 1)
# applying slice(10, 30, 1) to sequence of length 15 gives [10:15]
>>> s.indices(15)
(10, 15, 1)
# applying slice(10, 30, 1) to sequence of length 1 gives [1:1]
>>> s.indices(1)
(1, 1, 1)
# applying slice(10, 30, 1) to an empty sequence gives [0:0]
>>> s.indices(0)
(0, 0, 1)
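
One common use of indices() is to turn a slice into the concrete positions it selects, for example when implementing __getitem__() for a custom container. The helper name slice_positions below is made up for this illustration.

```python
def slice_positions(s, length):
    """Return the list of indices that slice s selects in a sequence of the given length."""
    start, stop, stride = s.indices(length)
    return list(range(start, stop, stride))

clipped = slice_positions(slice(10, 30, 1), 15)          # [10, 11, 12, 13, 14]
from_end = slice_positions(slice(-3, None), 5)           # [2, 3, 4]
backwards = slice_positions(slice(None, None, -1), 4)    # [3, 2, 1, 0]
```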

Generator Objects

Calling a generator function creates a generator object. Generator functions are functions that use the yield keyword. This type of function is discussed in detail in the chapter on Sequences and Generators.
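
As a brief preview, the sketch below defines a generator function and shows that calling it returns a generator object rather than running the body. The function name countdown is made up for this illustration.

```python
def countdown(n):
    # The presence of yield makes this a generator function.
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)                  # no code in the body has run yet
type_name = type(gen).__name__      # 'generator'
first = next(gen)                   # 3
remaining = list(gen)               # [2, 1]
```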

With a strong understanding of the built-in type hierarchy, the stage is set for examining object-oriented programming and how users can create their own type hierarchies and even make such types behave like built-in types.
