Friday, March 13, 2020

Python - Exceptions


Handling Exceptions

Ø Exception handling is a mechanism for stopping normal program flow and continuing at some surrounding context or code block.

Ø The event of interrupting normal flow is called the act of raising an exception.

Ø In some enclosing context the raised exception must be handled, upon which control flow is transferred to the exception handler. If an exception propagates up the call stack to the start of the program, then the unhandled exception will cause the program to terminate.

Ø An exception object containing information about where and why an exceptional event occurred is transported from the point at which the exception was raised to the exception handler so that the handler can interrogate the exception object and take appropriate action.

Ø The try-except construct can be used to handle exceptions. Both the try and except keywords introduce new blocks. The try block contains code that could raise an exception, and the except block contains the code which performs error handling in the event that an exception is raised.

Ø Each try block can have multiple corresponding except blocks, which intercept exceptions of different types.
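
A minimal sketch of one try block with two except blocks. The function name describe and the returned labels are made up for illustration only.

```python
def describe(value):
    """Try to convert value to int and report which handler ran."""
    try:
        return int(value)
    except ValueError:
        # Right type (a string) but not a valid integer literal.
        return "bad value"
    except TypeError:
        # A type that int() cannot convert at all, e.g. a list.
        return "bad type"


print(describe("42"))
print(describe("forty-two"))
print(describe([4, 2]))
```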

Ø When multiple exception handlers contain the same code, we can remove the duplication by collapsing them into one, using the ability of the except statement to accept a tuple of exception types.

def convert(s):
    try:
        x = int(s)
        print("conversion success")
    except (ValueError, TypeError):
        print("Conversion failed")

Ø Almost anything that goes wrong with a Python program results in an exception, but some, such as IndentationError, SyntaxError, and NameError, are the result of programmer errors, which should be identified and corrected during development rather than handled at runtime. The fact that these things are exceptions is mostly useful if you're creating a Python development tool such as a Python IDE, embedding Python itself in a larger system to support application scripting, or designing a plug-in system which dynamically loads code.

Ø The pass Statement: It's a special statement which does precisely nothing. It's a no-op, and its only purpose is to allow us to construct syntactically permissible blocks which are semantically empty.
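
A small sketch of pass filling an except block that would otherwise be empty. The name convert_or_none is hypothetical, and silently swallowing errors like this is shown only to illustrate the syntax.

```python
def convert_or_none(s):
    """Return int(s), or None when the conversion is not possible."""
    try:
        return int(s)
    except (ValueError, TypeError):
        pass  # the block must contain a statement; pass does nothing
    return None


print(convert_or_none("12"))
print(convert_or_none("twelve"))
```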

Ø Named Reference to exception Object: To get ahold of the exception object and interrogate it for more details of what went wrong, we can get a named reference to the exception object by tacking an “as” clause onto the end of the except statement.
Ex:
In [108]: import sys
     ...: def convert(s):
     ...:     try:
     ...:         x = int(s)
     ...:         print("conversion success")
     ...:     except (ValueError, TypeError) as e:
     ...:         print("Conversion failed: {}".format(str(e)), file=sys.stderr)
     ...:         return -1

In [109]: convert("Sukul")
Conversion failed: invalid literal for int() with base 10: 'Sukul'
Out[109]: -1

The example above shows how to print to standard error: we import the sys module and pass sys.stderr as the keyword argument called file to the print function.
Also note that exception objects can be converted to strings using the str constructor.

Ø Re-raising Exceptions: We can re-raise the exception object we're currently handling simply by using a bare 'raise' statement at the end of the except block. Without a parameter, raise re-raises the exception that is currently being handled.
This can be useful when we want to log some information before letting the exception propagate.
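
A short sketch of logging and then re-raising, reusing the convert idea from earlier. The variable caught exists only so the result can be inspected afterwards.

```python
import sys


def convert(s):
    """Convert s to int, logging any failure before re-raising it."""
    try:
        return int(s)
    except (ValueError, TypeError) as e:
        print("Conversion failed: {}".format(e), file=sys.stderr)
        raise  # bare raise: re-raise the exception being handled


caught = None
try:
    convert("Sukul")
except ValueError as e:
    # The caller still sees the original exception.
    caught = e

print(type(caught).__name__)
```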

Ø Exceptions are part of a function's API: Exceptions form an important aspect of the API of a function. Callers of a function need to know which exceptions to expect under various conditions so that they can ensure appropriate exception handlers are in place. We should also write the docstring so that it makes plain which exception types will be raised and under what circumstances. The exceptions which are raised are as much a part of a function's specification as the arguments it accepts, and as such must be implemented and documented appropriately.
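
One possible way to document a raised exception in a docstring. The function percentage and the "Raises:" layout are illustrative choices, not something these notes prescribe.

```python
def percentage(part, whole):
    """Return part as a percentage of whole.

    Raises:
        ValueError: If whole is zero.
    """
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return 100 * part / whole


print(percentage(1, 4))
```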

Ø Standard Python Exceptions: Python provides us with several standard exception types to signal common errors. If a function parameter is supplied with an illegal value, it is customary to raise a ValueError. We can do this by using the raise keyword with a newly created exception object, which we can create by calling the ValueError constructor. The ValueError constructor accepts an error message.

In [114]: import sys

In [115]: def cubeme(x):
     ...:     if x < 0:
     ...:         raise ValueError("Dont want to work with negative numbers")
     ...:     return x * x * x

In [116]: try:
     ...:     cubeme(1)
     ...:     cubeme(97)
     ...:     cubeme(-1)
     ...: except ValueError as v:
     ...:     print(v, file=sys.stderr)
Dont want to work with negative numbers

Ø There are a handful of common exception types in Python, and usually when we need to raise an exception in our own code one of the built-in types is a good choice.
o   IndexError is raised when an integer index is out of range. You can see this when you index past the end of a list.
o   ValueError is raised when the object is of the right type, but contains an inappropriate value.
o   KeyError is raised when a look-up in a mapping fails.
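
A quick sketch that triggers each of the three exception types listed above. The helper name_of_error is hypothetical and just reports which exception an operation raised.

```python
def name_of_error(operation):
    """Call operation() and return the name of the exception it raised."""
    try:
        operation()
    except (IndexError, ValueError, KeyError) as e:
        return type(e).__name__
    return None


print(name_of_error(lambda: [1, 2, 3][5]))    # index past the end of a list
print(name_of_error(lambda: int("abc")))      # a str, but not a valid integer
print(name_of_error(lambda: {"a": 1}["b"]))   # key missing from the mapping
```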

Ø Do not guard against TypeErrors: doing so runs against the grain of dynamic typing in Python and limits the reuse potential of the code that we write.
If a function works with a type, even one you couldn't have known about when you designed your function, then that's all to the good. If not, execution will probably result in a TypeError anyway.

Ø EAFP vs LBYL: There are only two approaches to dealing with a program operation that might fail.
o   The first approach is to check that all the preconditions for a failure-prone operation are met in advance of attempting the operation.
o   The second approach is to perform the operation but be prepared to deal with the consequences if it doesn't work out.

In Python culture, these two philosophies are known as
o   Look Before You Leap, LBYL, and
o   It's Easier to Ask Forgiveness than Permission, EAFP

Python is strongly in favor of EAFP because it puts primary logic for the happy path in its most readable form with deviations from the normal flow handled separately rather than interspersed with the main flow.

The problem with LBYL is that we need to think of all the preemptive checks before performing the risky operation. Also, there is a chance of a race condition (atomicity issue): things might change between the check and the actual risky operation.
Ex: we may check for file existence with a pre-emptive test, but the file may get deleted by another process between the check and the actual use of the file in our code.

With the Pythonic EAFP approach, we simply attempt the operation without checks in advance, but we have an exception handler in place to deal with any problems. We don't even need to know in a lot of detail exactly what might go wrong.
EAFP is standard in Python, and that philosophy is enabled by exceptions. 
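
The two styles can be sketched side by side for the file example. The function names read_lbyl and read_eafp are made up here, and the missing path lives in a throwaway temporary directory so the sketch does not depend on any real file.

```python
import os
import tempfile


def read_lbyl(path):
    """LBYL: check first, then act."""
    if os.path.exists(path):
        # The file could still disappear between the check and open().
        with open(path, mode="rt", encoding="utf-8") as f:
            return f.read()
    return None


def read_eafp(path):
    """EAFP: just try it and handle the failure."""
    try:
        with open(path, mode="rt", encoding="utf-8") as f:
            return f.read()
    except OSError:
        # Covers a missing file as well as other I/O problems.
        return None


with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, "no_such_file.txt")
    lbyl_result = read_lbyl(missing)
    eafp_result = read_eafp(missing)

print(lbyl_result, eafp_result)
```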

Without exceptions, that is, when using error codes instead, you are forced to include error handling directly in the main flow of the logic. Since exceptions interrupt the main flow, they allow you to handle exceptional cases non-locally. Exceptions coupled with EAFP are also superior because, unlike error codes, exceptions cannot be easily ignored. By default, exceptions have a big effect whereas error codes are silent by default.

Ø Try-finally: Code in the finally-block is executed whether execution leaves the try-block normally by reaching the end of the block or exceptionally by an exception being raised. So a finally block can be used to perform a cleanup action irrespective of whether an operation succeeds.
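
A minimal sketch showing that the finally-block runs on both the normal and the exceptional path. The function risky and the events list are illustrative names.

```python
events = []


def risky(fail):
    """Record what happens, optionally failing part-way through."""
    events.append("start")
    try:
        if fail:
            raise ValueError("something went wrong")
        events.append("work done")
    finally:
        # Runs whether or not the try-block raised.
        events.append("cleanup")


risky(False)
try:
    risky(True)
except ValueError:
    events.append("handled")

print(events)
```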

v Summary:

Ø The raising of an exception interrupts normal program flow and transfers control to an exception handler.

Ø Exception handlers are defined using the try…except construct. Try blocks define a context in which exceptions can be detected. Corresponding except blocks define handlers for specific types of exceptions.

Ø Except blocks can capture an exception object, which is often of a standard type such as a ValueError, KeyError, or IndexError.

Ø Programmer errors such as IndentationError and SyntaxError should not normally be handled.

Ø Exceptional conditions can be signaled using the raise keyword, which accepts a single parameter of an exception object. Raise without an argument within an except block re-raises the exception which is currently being processed.

Ø We tend not to routinely check for TypeErrors. To do so would negate the flexibility afforded to us by Python's dynamic type system.

Ø Exception objects can be converted to strings using the str() constructor for the purposes of printing message payloads.

Ø The exceptions raised by a function form part of its API and should be appropriately documented.

Ø When raising exceptions, prefer to use the most appropriate built-in exception type.

Ø Cleanup and restorative actions can be performed using the try…finally construct, which may optionally be used in conjunction with except blocks.

Ø Output of the print() function can be directed to standard error using the optional file argument

Saturday, March 7, 2020

Python - Files


Files and resource Management

Ø To open a file in Python, we call the built-in open() function. 
Common arguments:
1) File, the path to the file (required)
2) Mode, which specifies read/write/append, and binary or text mode. This is optional, but we always recommend specifying it for clarity. Explicit is better than implicit.
3) Encoding. If the file contains encoded text data, this is the text encoding to use. It's often a good idea to specify this. If you don't specify it, Python will choose a default encoding for you.

The exact type of the object returned by open depends on how the file was opened, dynamic typing in action. However, know that the object returned is a file-like object.

Ø At the file system level, files contain only a series of bytes. Python distinguishes between files opened in binary and text modes even when the underlying operating system doesn't.

o   Files opened in binary mode return and manipulate their contents as bytes objects without any decoding. Binary mode files reflect the raw data in the file.

o   A file opened in text mode treats its contents as if it contains text strings of the str type, the raw bytes having first been decoded using a platform-dependent encoding or using the specified encoding if given.
By default, text mode also engages support for Python's universal newlines. This causes translation between a single portable newline character in our program strings, \n, and a platform-dependent newline representation in the raw bytes stored in the file system, for example carriage return newline \r\n on Windows.

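Ø Default Encoding: Getting the encoding right is crucial for correctly interpreting the contents of a text file. If you don't specify an encoding, open() falls back to a platform-dependent default taken from the locale (see locale.getpreferredencoding), not sys.getdefaultencoding, which only governs str.encode and bytes.decode.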
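
As far as I know, the default that open() uses for text files comes from the locale rather than from sys.getdefaultencoding. A quick way to inspect both values on your own machine (the variable names are just for illustration):

```python
import locale
import sys

# Platform-dependent: the locale's preferred text encoding.
locale_default = locale.getpreferredencoding(False)

# The default for str.encode() / bytes.decode(); utf-8 on Python 3.
interpreter_default = sys.getdefaultencoding()

print(locale_default, interpreter_default)
```

The first value can differ from machine to machine, which is why passing encoding explicitly is the safer habit.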

Ø File open Modes: The mode argument of the built-in open function is a string containing letters with different meanings. All mode strings should consist of a read, write, or append mode: one of r, w, or a, with the optional plus modifier, combined with a selector for text or binary mode, t or b.

f = open('wasteland.txt', mode='wt', encoding='utf-8')

Both parts of the mode code support defaults, but being explicit is recommended for the sake of readability.

Ø The write method: used to write to a file. The write call returns the number of codepoints or characters written to the file. It is the caller's responsibility to provide newline characters where they are needed. There is no writeline method.
When we finish writing, we should remember to close the file by calling the close method.

Ø The size of the files written on Windows and Linux may be different. The difference arises because Python's universal newline behavior for files translates the line endings to your platform's native endings (on Windows, \n is translated by Python to \r\n).
The number returned by the write method is the number of codepoints or characters in the string passed to write, not the number of bytes written to the file after encoding and universal newline translation. This means that when working with text files, you cannot sum the quantities returned by write to determine the length of the file in bytes.

In [22]: f1=open('wasteland.txt',mode='wt',encoding='utf-8')
In [28]: type(f1)
Out[28]: _io.TextIOWrapper
In [23]: f1.write("This is a crazy world\n")
Out[23]: 22
In [24]: f1.write("filled with stupid ppl")
Out[24]: 22
In [25]: f1.close()
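
The character-versus-byte difference can be reproduced on any platform with a small sketch. Passing newline="\r\n" to open() forces Windows-style line endings here (an assumption made so the result is the same everywhere), and the file is written to a temporary directory rather than the working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "wasteland.txt")
    f = open(path, mode="wt", encoding="utf-8", newline="\r\n")
    try:
        # write() returns the number of characters passed in.
        chars = f.write("This is a crazy world\n")
        chars += f.write("filled with stupid ppl")
    finally:
        f.close()
    # The size on disk counts bytes after newline translation.
    size = os.path.getsize(path)

print(chars, size)
```

The one \n became the two bytes \r\n, so the file is one byte longer than the summed return values of write.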

Ø The Read Function:
o   If we know how many bytes to read or if we want to read the whole file, we can use the read function. In text mode the read method accepts the number of characters to read from the file, not the number of bytes.
o   The call returns the text and advances the file pointer to the end of what was read. Subsequent read call will read next piece of data.
o   In text mode, the return type is str. In binary mode, the return type is bytes (i.e., no decoding is performed).
o   To read all the remaining data in the file, we can call read without an argument. This gives us multiple lines in one string with newline characters embedded in middle.
o   At the end of the file, further calls to read return an empty string.

In [45]: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
In [46]: type(f2)
Out[46]: _io.TextIOWrapper
In [47]: f3=open('wasteland.txt',mode='rb')
In [48]: s1=f2.read(5)
In [49]: print(s1)
This
In [50]: type(s1)
Out[50]: str
In [51]: b1=f3.read(5)
In [52]: print(b1)
b'This '
In [53]: type(b1)
Out[53]: bytes
In [54]: s2=f2.read()
In [55]: print(s2)
is a crazy world
filled with stupid ppl
In [56]: print(f2.read())

Ø The seek method can be used to move the file pointer to any location. Use a 0 offset to move it to the start of the file. We can use this to go over the file repeatedly without having to close and reopen it.

Ø Use readline() function to read file line by line. The returned lines are terminated by a single newline character if there is one present in the file. The last line does not terminate with a newline because there is no newline sequence at the end of the file.

Again, the universal newline support will have translated to \n from whatever the platform native newline sequence is. This means on windows \r\n will be translated by python to \n.

Once we reach the end of the file, further calls to readline return an empty string.(Similar to read() method)

Ø Use the readlines() method to read all lines into a list. This is particularly useful if parsing the file involves hopping backwards and forwards between lines. Note that memory may be an issue for large files, since the whole file is held in memory at once.

In [57]: f2.seek(0)
Out[57]: 0
In [58]: f2.readline()
Out[58]: 'This is a crazy world\n'
In [59]: f2.readline()
Out[59]: 'filled with stupid ppl'
In [60]: f2.readline()
Out[60]: ''
In [61]: f2.seek(0)
Out[61]: 0
In [62]: f2.readlines()
Out[62]: ['This is a crazy world\n', 'filled with stupid ppl']

Ø To append to an existing file, we can open the file with mode a, which opens the file for writing, appending to the end of the file if it already exists.

Although there is no writeline method in Python, there is a writelines method, which writes an iterable series of strings to a stream. If you want line endings on your strings, you must provide them yourself.

In [66]: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
    ...: f2.readlines()
    ...: f2.close()
In [67]: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
    ...: print(f2.readlines())
    ...: f2.close()
['This is a crazy world\n', 'filled with stupid ppl']
In [68]: f3=open('wasteland.txt',mode='at',encoding='utf-8')
In [69]: f3.writelines(['most of which want to\n','watch world burn'])
In [70]: f3.close()
In [71]: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
    ...: print(f2.readlines())
    ...: f2.close()
['This is a crazy world\n', 'filled with stupid pplmost of which want to\n', 'watch world burn']

Ø File objects support the iterator protocol with each iteration yielding the next line in the file. This means they can be used in for loops and any other place where an iterator can be used.

In [74]: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
In [75]: for i in f2:
    ...:     print(i)
This is a crazy world

filled with stupid pplmost of which want to

watch world burn

The double line spacing occurs because each line of the file is terminated by a newline, and then print adds its own. To fix that we could use the strip method to remove the whitespace from the end of each line prior to printing.
Instead we can use the write method of the standard out stream. Files and streams are closely related, and this works because the standard out stream is a file-like object. We can get hold of a reference to the standard out stream from the sys module.

In [76]: import sys
    ...: f2=open('wasteland.txt',mode='rt',encoding='utf-8')
    ...: for i in f2:
    ...:     sys.stdout.write(i)
This is a crazy world
filled with stupid pplmost of which want to
watch world burn
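
The strip-based alternative mentioned above could look roughly like this. io.StringIO stands in for an open text file (an assumption so that the sketch runs without touching the disk), and print(line, end="") is shown as a further option.

```python
import io

# A file-like object holding two lines of text.
f2 = io.StringIO("This is a crazy world\nfilled with stupid ppl\n")

# Option 1: strip the trailing newline before printing.
stripped = [line.rstrip("\n") for line in f2]
for line in stripped:
    print(line)

# Option 2: keep the newline and tell print not to add another.
f2.seek(0)
for line in f2:
    print(line, end="")
```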

Ø Context Managers: When working with files, the close method call is important. It informs the underlying OS that we are done working with a file. If we don't close a file, it's possible to lose data: there may be pending writes buffered up, which might not get written completely.

Often, when an exception occurs, the close call is never executed.

Furthermore, if you're opening lots of files, your system may run out of resources.

One option to make sure that files are closed no matter what is to make use of a try-finally clause. The finally block will make sure the close call is executed every time, irrespective of how execution exits the try block.

To ease the need for resource cleanup, Python implements a control flow structure called with-block to support it. With-blocks can be used with any object which supports the context-manager protocol, and that includes the file objects returned by open().

We no longer need to call close explicitly because the with construct will call it for us when execution exits the block, by whatever means that happens.

The with-block syntax is so-called syntactic sugar for a much more complex arrangement of try/except and try/finally blocks.
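
A small sketch comparing the two cleanup styles. The file is created in a temporary directory so nothing is left behind, and the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "wasteland.txt")

    # Manual cleanup with try/finally.
    f = open(path, mode="wt", encoding="utf-8")
    try:
        f.write("This is a crazy world\n")
    finally:
        f.close()

    # The same guarantee with a with-block: close() is called for us.
    with open(path, mode="rt", encoding="utf-8") as g:
        text = g.read()
    closed_after_with = g.closed

print(repr(text), f.closed, closed_after_with)
```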

Ø Working with Binary Files: We open the file for writing in binary mode using the 'wb' mode string. With binary files we don't specify an encoding, as that makes no sense for raw binary data. Because the file is opened in binary mode, we must pass a bytes object to the write method. To convert things to bytes, use the bytes constructor, and use b'' for byte literals. Ex: b'\x01'
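
A minimal sketch of writing and reading raw bytes. The file name data.bin and the payload are made up, and a temporary directory keeps the example self-contained.

```python
import os
import tempfile

# Three ways to obtain bytes: the constructor, a literal, and str.encode().
payload = bytes([1, 2, 255]) + b"\x01" + "hi".encode("utf-8")

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.bin")
    with open(path, mode="wb") as f:
        written = f.write(payload)   # number of bytes written
    with open(path, mode="rb") as f:
        read_back = f.read()         # a bytes object, no decoding

print(written, read_back)
```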

Ø Bitwise operators to work on bytes:
& - bitwise AND (remember that Python uses 'and' for logical and)
| - bitwise OR
>> right-shift
<< left-shift
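
One place these operators show up is when packing an integer into bytes by hand. The sketch below is an illustration under the assumption of an unsigned 32-bit value in little-endian order; both helper names are hypothetical.

```python
def int32_to_bytes(i):
    """Split an unsigned 32-bit integer into 4 little-endian bytes."""
    return bytes((
        i & 0xFF,           # lowest 8 bits
        (i >> 8) & 0xFF,    # shift right, then mask off one byte
        (i >> 16) & 0xFF,
        (i >> 24) & 0xFF,
    ))


def bytes_to_int32(b):
    """Rebuild the integer by shifting each byte back and OR-ing them."""
    return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24)


packed = int32_to_bytes(0x12345678)
print(packed, hex(bytes_to_int32(packed)))
```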


v Summary:

Ø Files are opened using the built-in open() function, which accepts a file mode. This controls read/write/append behavior and also whether the file is treated as binary or encoded text data.

Ø For text data, it's good practice to always specify an encoding.

Ø Text files differ from binary files by dealing with string objects and performing universal newline translation and string encoding. Binary files deal with bytes objects with no newline translation or encoding.

Ø When we write text files, it's up to us to provide newline characters for line breaks.

Ø Files should always be closed after use to prevent resource leaks and to ensure that all data has been committed to the file system.
 
Ø Files provide various convenient methods for working with lines, but are also iterators, which yield values line-by-line.

Ø Files are also context managers and can be used with the with-statement. This ensures that cleanup operations such as closing the files are performed.

Ø Context managers aren't restricted to file-like objects. We can use the tools in the contextlib standard library module such as the closing() wrapper to create our own context managers.

Ø Python supports the bitwise operators AND (&) and OR (|), as well as left and right bitwise shifts (<< and >>).