===========
Programming
===========

Python functions are Python modules that can be created using the
standard CREATE FUNCTION syntax::

 CREATE FUNCTION funcname(args)
 RETURNS return_type LANGUAGE python AS
 $python$
 import Postgres

 def main(*args):
 	return value
 $python$;

When a Python function is called using SQL, the function's module is imported,
and the entry point is called--normally ``main``.

.. note::
   The term "Python function" is often used in reference to an SQL function
   object whose language is ``python``.


.. _programming_builtins:

Builtins
========

The interface module, `Postgres`, provides all of the
Python types and interface functions created by pg-python. In order to
provide convenient access to often used objects, some
additional built-ins are provided by pg-python.

 ``@pytypes``
  Alias to :ref:`pg_pytypes`.

 ``prepare(sql_statement)``
  Alias to :ref:`pg_statement`

 ``proc(procedure_id)``
  Function that takes ``pg_catalog.regprocedure`` text or a
  function ``Oid`` and returns the corresponding :ref:`pg_function`.

 ``sqleval(sql_expr, *args)``
  Alias to :ref:`pg_eval`.

 ``sqlexec(str_of_sql_statements)``
  Execute multiple SQL statements. Alias to :ref:`pg_execute`.

 ``xact()``
  Context manager that starts and ends an internal subtransaction.
  Alias to :ref:`pg_transaction`.

Normal Functions
================

Normal functions are executed by invoking the ``main``
entry point defined by the function module. If it does not exist, a Python
``AttributeError`` will be raised as a Postgres error.

The ``main`` entry point must be able to
take the same number of parameters as declared by the ``CREATE FUNCTION``
statement that created the Postgres function. All arguments are given as
positional arguments to the ``main`` object.

.. tip::
   ``main`` does not have to be a regular function object.
   Any callable object will suffice.

The following example shows the creation of a simple addition function::

 CREATE FUNCTION add(n1 int, n2 int)
 RETURNS int LANGUAGE python AS
 $python$
 """
 Return the sum of the two parameters.
 """

 @pytypes
 def main(n1, n2):
     return n1 + n2
 $python$;

 SELECT add(1,2);
 -- add 
 -------
 --   3
 --(1 row)

.. note::
   The above example makes use of the ``pytypes`` decorator.
   This wrapper converts a few of the common data objects into standard
   Python objects. See `Postgres.pytypes` for more details.

Set Returning Functions
=======================

Set returning functions are very much like normal functions. The difference
is that the result is expected to be an iterable of objects that can be
coerced into the result type.

The exhaustion of the iterable will signal the termination of the set
produced by the function::

 CREATE FUNCTION simple_strings()
 RETURNS SETOF text LANGUAGE python AS
 $python$
 """
 Return an iterable of strings.
 """

 def main():
     return ("Hello", "World", "!")
 $python$;

 SELECT s FROM simple_strings() AS t(s);
 --   s   
 ---------
 -- Hello
 -- World
 -- !
 --(3 rows)

Any iterable can be returned. Notably, main can be a generator yielding the
objects to be coerced.


.. _programming_trf:

Trigger Returning Functions
===========================

Trigger functions, like normal functions, are Python modules. However,
unlike normal functions, trigger functions don't use the
``main`` entry point. Rather, trigger functions have
entry points for each possible event type--with respect to orientation,
timing, and manipulation. Multiple entry points can be defined in order to
allow the trigger function to handle multiple event types.

Trigger returning functions are used by ``CREATE TRIGGER`` statements to
designate the procedure that should be executed when the described event occurs.

Trigger functions are declared in the same fashion as normal functions,
using ``CREATE FUNCTION``::

 CREATE FUNCTION funcname()
 RETURNS TRIGGER
 LANGUAGE python AS
 $python$
 """
 A trigger function that does nothing.
 """
 
 def before_insert(td, new):
     pass
 $python$;


When a trigger function's entry point is invoked, it is given
*at least* one argument, the :ref:`pg_triggerdata` describing the event.
Additional arguments given to the entry point, if any, depend on the entry
point--each entry point takes a certain number of arguments that suites the
event.

Row Triggers
------------

Row triggers can be fired for each row that is inserted, updated, or
deleted. The additional arguments given to a row trigger entry points
are records of the target table's type--a subclass of
:ref:`pg_types_record`.
These arguments provide access the row being inserted, updated, or deleted.

Row Trigger Entry Points:

 ``def before_insert(td, new):``
  The ``new`` argument is the row that will be inserted.

  Executed for each row, before the insertion is actually made. This entry
  point can provide a substitute row or suppress the manipulation
  entirely.

 ``def before_update(td, old, new):``
  The ``old`` argument is the row that is being updated
  into the ``new`` argument.
  
  Executed for each row, before the update is actually made. This entry
  point can provide a substitute row or suppress the manipulation
  entirely.

 ``def before_delete(td, old):``
  The ``old`` argument is the row being removed.

  Executed for each row, before the row is actually removed. This entry
  point can only suppress the manipulation.

 ``def after_insert(td, new):``
  Executed for each row, after the insert occurs. The
  ``new`` argument is the row that was inserted.
  ``None`` *must* be returned.

 ``def after_update(td, old, new):``
  Executed for each row, after the update occurs. The
  ``old`` argument is the row that was altered into the
  ``new`` argument. ``None`` *must* must be returned.

 ``def after_delete(td, old):``
  Executed for each row, after the delete occurs. The
  ``old`` argument is the row that was removed from the
  table. ``None`` *must* must be returned.

The AFTER entry points require that the ``None`` object is
returned. If ``None`` is not returned, the procedural
language will throw an error.

Suppressing Manipulations
~~~~~~~~~~~~~~~~~~~~~~~~~

When a manipulation should not proceed, it can be suppressed by a
trigger. Suppression can only be enacted by triggers that are timed
before the manipulation occurs. Specifically,
``before_insert``, ``before_update``, and
``before_delete`` are the only entry points that can cause
suppression.

In order to signal that suppression should occur, the called entry point
will need to raise the :ref:`pg_stopEvent` exception::

 CREATE TABLE single_column(t text);
 INSERT INTO single_column VALUES ('removed'), ('kept');

 CREATE FUNCTION suppress_removal()
 RETURNS TRIGGER LANGUAGE python AS
 $python$
 from Postgres import StopEvent

 def before_delete(td, old):
 	if old['t'] == 'kept':
 		# This does the trick.
 		raise StopEvent
 $python$;

 CREATE TRIGGER suppres_it BEFORE DELETE ON single_column
 FOR EACH ROW EXECUTE PROCEDURE suppress_removal();

 DELETE FROM single_column;
 SELECT t FROM single_column;
 --  t   
 --------
 -- kept
 --(1 row)


Substituting Rows
~~~~~~~~~~~~~~~~~

When a different row should take the place of the one being inserted or
updated, substitution can be performed by
*returning the replacement row*. Substitution can only be used with row
triggers that are timed before the manipulation. Specifically, only
``before_insert`` and ``before_update``
can perform row substitution. If ``None`` is returned,
*the manipulation will continue unaltered*::

 CREATE TABLE two_columns(i int, t text);

 CREATE FUNCTION substitute_row()
 RETURNS TRIGGER LANGUAGE python AS
 $python$
 def before_insert(td, new):
 	if new['i'] == 0:
 		# .replace() creates a *new* record with `t` set to 'newval'
 		return new.replace(t = 'newval')
 	# Otherwise, None is returned by default, which causes
 	# the manipulation to continue.
 $python$;

 CREATE TRIGGER execsub BEFORE INSERT ON two_columns
 FOR EACH ROW EXECUTE PROCEDURE substitute_row();

 INSERT INTO two_columns VALUES (0, 'text');
 SELECT i, t FROM two_columns;
 -- i |   t    
 -----+--------
 -- 0 | newval
 --(1 row)

The object that is returned by the entry point will be coerced into a
record of the composite type of the trigger's table. This means that a
simple Python tuple or dictionary could be returned in order to supply the
replacement. See `Composite Types`_ for more information.

.. tip::
   The ``replace`` and ``transform`` methods
   on records are often the most convenient mechanisms to create the
   appropriate substitute. See the :ref:`pg_types_record` reference
   documentation for more details on how to use these methods.

Statement Triggers
------------------

Like row-triggers, trigger functions called with statement orientation
are given a :ref:`pg_triggerdata` object as the first argument.
Statement triggers timed after are given one or two additional
``None`` objects--two for update manipulations. These additional
arguments are given in order to accommodate for future functionality.

The following example shows a statement trigger that can handle multiple
event types::

 CREATE TABLE monitored_table(i int, t text);

 CREATE FUNCTION watch_table()
 RETURNS TRIGGER LANGUAGE python AS
 $python$
 import Postgres

 def after_truncate_statement(td, olds):
 	Postgres.notify('was_truncated')

 def after_delete_statement(td, olds):
 	Postgres.notify('removed_data')

 def after_insert_statement(td, news):
 	Postgres.notify('new_data')
 $python$;

 CREATE TRIGGER watch_trigger BEFORE INSERT ON monitored_table
 FOR STATEMENT EXECUTE PROCEDURE watch_table();

 LISTEN was_truncated;
 LISTEN new_data;
 LISTEN removed_data;
 INSERT INTO monitored_table VALUES (0, 'text'), (1, 'more text');
 DELETE FROM monitored_table;
 TRUNCATE monitored_table;

Statement Trigger Entry Points:

 ``def before_insert_statement(td):``
  Executed before each INSERT statement.

 ``def before_update_statement(td):``
  Executed before each UPDATE statement.

 ``def before_delete_statement(td):``
  Executed before each DELETE statement.

 ``def before_truncate_statement(td):``
  Executed before each TRUNCATE statement.

 ``def after_insert_statement(td, news):``
  Executed after each INSERT statement.

  The ``news`` argument is always ``None``.

 ``def after_update_statement(td, olds, news):``
  Executed after each UPDATE statement.

  The ``olds`` and ``news`` arguments are always ``None``.

 ``def after_delete_statement(td, olds):``
  Executed after each DELETE statement.

  The ``olds`` argument is always ``None``.

 ``def after_truncate_statement(td, olds):``
  Executed after each TRUNCATE statement.

  The ``olds`` argument is always ``None``.


.. _programming_stateful:

Stateful Functions
==================

.. note::
   Stateful functions are an advanced feature of pg-python.

Stateful functions are entry points decorated with :ref:`pg_stateful`.
When an entry point is decorated with `Postgres.Stateful`, the callable
should return a state object. The state object is normally a
generator object, but it can be any object that supports the generator
protocol::

 CREATE FUNCTION counting() RETURNS int LANGUAGE python AS
 $python$
 """
 It counts.
 """
 from Postgres import Stateful
 
 @Stateful
 def main():
     i = 0
     while True:
         yield i
         i = i + 1
 $python$;
 
 SELECT counting(), counting(), col FROM (
  SELECT 'one'::text AS col UNION ALL
  SELECT 'two'::text AS col) AS t(col);
 
  counting | counting | col 
 ----------+----------+-----
         0 |        0 | one
         1 |        1 | two
 (2 rows)

The example shows that state is *not* shared by multiple calls.
Each, independent invocation of the function will have its own state object.
For state shared across all invocations, there is no need to use the
stateful feature, a module global can be referenced and updated as
needed.

Stateful functions get new arguments every time they are invoked; the call
to the ``send`` method is given a Python tuple of the new
parameters. With generators, these new parameters are provided by the
``yield`` statement::

 CREATE FUNCTION counting_and_more(more int) RETURNS int LANGUAGE python AS
 $python$
 """
 It counts, and it adds more.
 """
 from Postgres import Stateful
 
 @pytypes
 @Stateful
 def main(more):
     i = 0
     while True:
         i = i + 1 + more
         args = (yield i)
         more = int(args[0])
 $python$;
 
 SELECT counting_and_more(j), counting_and_more(k) FROM
  (VALUES (0, 1), (0, 1), (1, 10)) AS g(j,k);
 
 -- counting_and_more | counting_and_more
 ---------------------+-------------------
 --                 1 |                 2
 --                 2 |                 4
 --                 4 |                15
 --(3 rows)

Stateful can be used with entry points in `Trigger Returning Functions`_ and
`Set Returning Functions`_. However, in the case of SRFs, it's very unlikely to
offer useful functionality as state is already managed. In the case of TRFs,
it's primarily useful in generic triggers where data and routines are
specialized for the target relation can be easily re-used.

Inline Execution
================

.. note::
   DO-statements were introduced in PostgreSQL 9.0, earlier versions
   will not support this feature.

pg-python supports ``DO`` statements by executing the code in
a temporary module object. Unlike functions, no entry points are used::

 CREATE TEMP TABLE tmp (i int, t text);

 DO $python$
 
 ins = prepare("INSERT INTO tmp VALUES ($1, $2)")
 
 with xact():
 	ins(1, 'A simple sentence.')
 	ins(2, 'Another simple sentence.')
 
 $python$ LANGUAGE python;
 
 SELECT * FROM tmp;
 -- i |            t             
 -----+--------------------------
 -- 1 | A simple sentence.
 -- 2 | Another simple sentence.
 --(2 rows)

Inline execution is very useful for Python as it allows further
initialization of the language's environment without a custom function.
In particular, the `Postgres.preload` function can be used from a
DO-statement to make sure that specific procedures are ready for execution
well before they are used.

Function Modules
================

All ``python`` SQL functions are Python modules.
Once loaded, function modules exist in ``sys.modules`` as regular modules do.
The primary differences between regular Python modules and function modules
are that the function module's source is stored inside the database
and the presence of ``__func__``, a :ref:`pg_function` object, in the module's
dictionary. This object represents the Postgres function and provides the
functionality needed to create the module object itself--it is also the
module's ``__loader__``.

A function module's attributes:

 ``__name__``
  The function's Oid as a string object. Used as the ``sys.modules`` key.

 ``__loader__``
  The module's PEP-302 loader.

 ``__func__``
  The module's :ref:`pg_function` object.

 ``__file__``
  The qualified regprocedure representation of the SQL function.

When a function is loaded, all of the necessary information is cached for
re-use. The function module exists in ``sys.modules`` and it contains
``__func__``. That :ref:`pg_function` object contains all the
information needed to properly construct the arguments upon
execution and what to do with the returned object.
However, this information can change. If a function is replaced or a
dependency is modified after the function was loaded, the function needs to
be completely reloaded to reflect these changes. Notably, altering a table
whose type is referenced by a function will cause it to be
reloaded. This can be problematic *if* the function's
module body initializes temporary tables or other session objects for
subsequent use. If a function's dependencies are subject to change,
precautions may need to be taken in the module body in order to avoid errors
caused by already initialized objects.

Data Objects
============

When parameters are given to a function or rows are returned by a prepared
statement, :ref:`pg_object` instances are used to represent the data.
The only exception is that Python's ``None`` is used to represent SQL NULLs.
Some of these data objects have custom interfaces for convenient access to
parts of the data. For example, :ref:`pg_types_date`
objects have properties for getting the day, month, and year parts of the
date.

Data objects can make use of the operators that exist in the database.
Python does not support arbitrary operator methods, so only some
operators are directly accessible from Python. However, the few methods
it does support allows for date-time arithmetic, concatenation, and
comparisons.

.. note::
   Operators are greedy is the sense that they will implicitly convert a
   Python object to the type of the other operand before performing the
   operation. This only occurs when one of the operands is not
   a :ref:`pg_object`.

Data objects are local to the backend context, and most Python modules
are written to expect Python types. Data objects should be converted
to Python types before passing them to most external functions. However,
many Postgres interfaces are capable of taking data objects, so conversion
is not a necessity.
In situations where data objects are not wanted, the `Postgres.pytypes`
decorator and the `Postgres.iterpytypes` processor can be used
to handle many of the common conversion cases. These tools will convert
Postgres integers and strings to Python longs and strings.

Array Types
-----------

Arrays support the standard sequence interfaces provided by many of
the built-in Python types. The notable restriction is that only steps
of one are supported by the ``__getitem__`` interface.

Arrays implement a specialized constructor that allows multi-dimension
arrays to be built from balanced Python list objects. The constructor
is very particular about array boundaries and *requires*
that dimensions be represented using a list object. This requirement is
made to allow the distinction between an array's dimension and an element.
By using Python list objects for dimension boundaries, other types--such as
Python tuples--can be used to represent the elements of composite arrays::

 from Postgres.types import int4
 
 i4 = int4.Array([[1,2],[3,4]])
 
 assert i4[0] == (1,2)
 assert i4[1] == (3,4)

The Python object interface provided for arrays take on Python semantics.
Index errors will be raised when requested items do not exist, and
a subarray or element will be returned unless a slice is requested.

Postgres arrays are *quite* different from Python
sequences. While the Python sequence interfaces are implemented to be
consistent with the usual Python semantics, additional array interfaces are
implemented to provide access methods that are consistent with the
interfaces used in SQL::

 from Postgres.types import int4
 int4array = int4.Array
 
 i = int4array([
     [[1,2],[3,4]],
     [[5,6],[7,8]],
 ])
 
 # i[1][1][1] in SQL
 assert i.sql_get_element((1,1,1)) == 1
 
 # i[1][1][2] in SQL
 assert i.sql_get_element((1,1,2)) == 2
 
 # i[2][1][1] in SQL
 assert i.sql_get_element((2,1,1)) == 5


The ``sql_get_element`` method doesn't raise index errors
when the boundary is crossed by an index. It is consistent with how
single elements are accessed using indirection in SQL.
``None`` is returned in cases where the requested element
does not exist.

See :ref:`pg_array` for more information about the array interfaces.

Composite Types
---------------

Composite types are supported by subclasses of the
:ref:`pg_types_record` type. Composite types are fairly interesting objects
as they are both a sequence and a mapping. In the following examples, the
composite type ``composite_test`` will be used to aid examples::

 CREATE TYPE composite_test AS (i int, t text);

It can be bound in Python using ``regtype``; subsequent examples will assume
``composite`` to have been bound::

 import Postgres
 from Postgres.types import regtype
 
 composite = Postgres.Type(regtype('composite_test'))

Being a sequence and a mapping, records can be created from a sequence or a
dictionary object. When a dictionary is used to provide the fields, absent
attributes will be initialized to NULL. However, when a sequence is given,
all attributes must be present or an exception will be raised::

 from_dict = composite({'t': 't specified, but no i'})
 from_seq = composite((1234, 't'))

The Postgres string representation is also acceptable::

 from_str = composite('(1234,"t")')

Attributes of the records can be accessed by index or by name using
``Postgres.types.record.__getitem__``::

 r = composite((1234, 'text'))
 assert r["i"] == r[0] == 1234
 assert r["t"] == r[1] == 'text'

Like Python tuples, records are immutable. So, in order to get a record
with different fields, a new record must be created. This can be achieved
by instantiating the type with the new fields, or by using the record's
manipulation methods. There are two manipulation methods, ``replace`` and
``transform``--both return a new record object::

 x = composite((1,'test'))
 
 # the keywords identify the attributes to replace
 y = x.replace(t = 'newval')
 
 assert y == ctype((1,'newval'))

.. tip:: These methods are particularly useful with `Trigger Returning Functions`.

See :ref:`pg_types_record` for more information.

Database Access
===============

Database access is provided by :ref:`pg_Statement` objects. Each created
instance represents a single SQL statement that can take positional
parameters and return rows.

Statement objects have multiple methods that can execute the statement. The
simplest form of execution is performed by calling a statement
object(__call__). When this execution method is used, the entire result set
is returned as a Python list::

 import Postgres
 
 # prepare is a built-in pointing at Postgres.Statement
 stmt = prepare("SELECT i FROM (VALUES (1), (2), (3)) AS g(i) WHERE i > $1")

 # positional parameters are given as arguments
 r = stmt(1)
 
 Postgres.WARNING(str(list(map(tuple, r))))
 # WARNING:  [(int4('2'),), (int4('3'),)]
 # CONTEXT:  [__inline_executor__ while executing main]

Internally, a cursor is allocated when ``__call__`` is used,
but all the rows are fetched and stored in the list. Most
execution methods will return an actual :ref:`pg_cursor` object.
The cursor that's created will be configured to suite the request. When the ``rows``
execution method is used, the cursor returned can only be used as an
iterable producing individual rows--the seek and read methods are not
usable::

 import Postgres
 
 stmt = prepare("SELECT 1")
 
 for x in stmt.rows():
     Postgres.WARNING(str(x[0]))

The ``rows`` execution method should be used when the result
set is being processed on a row-by-row basis. When a large result set is
being produced it is particularly advantageous in comparison to
``__call__`` as the result set is not allocated in a Python
list.

Scrollable cursors can be created using the ``declare(...)``
execution method. This will return a cursor whose seek and read methods are
usable. The cursor's ``__next__`` method is usable, but in
order to guarantee exact positioning, no read-ahead is performed::

 import Postgres
 
 stmt = prepare("SELECT i FROM generate_series($1, $2) AS g(i)")
 
 cur = stmt.declare(1, 100)
 
 first10 = cur.read(10)
 cur.seek(50)
 last50 = cur.read()
 
Manipulation statements without a RETURNING clause do not return rows.
Calling methods that return a cursor configured to return rows with such
statements will result in an exception. When these kinds of statements
need to be executed, ``__call__`` and ``first``
should be used. In the case of ``first``, only the row count
is returned. For ``__call__``, a tuple containing the command
tag and the row count is returned::

 import Postgres
 
 sqlexec("CREATE TEMP TABLE simple_data (t text, i int)")
 ins = prepare("INSERT INTO simple_data (t,i) VALUES ($1, $2)")
 
 # __call__
 cmdtag, rowcount = ins('text', 1)
 
 # first
 rowcount_from_first = ins.first('other text', 2)
 
 Postgres.WARNING("%s %d rows" %(cmdtag, rowcount))

When DML statements need to be executed for a sequence of parameters, the
``load_rows`` and ``load_chunks`` methods
can be used. These methods take an iterable that ultimately produces
the arguments to execute the statement with. Upon exhausting the given
iterator, these methods will return the total row count. Notably, these
methods *cannot* be used with statements that return
rows--an exception will be raised by either method if the statement
returns rows. An example of ``load_rows``::
 
 import Postgres
 
 sqlexec("CREATE TEMP TABLE load_rows_test (t text, i int)")
 ins = prepare("INSERT INTO load_rows_test (t,i) VALUES ($1, $2)")
 
 # __call__
 count = ins.load_rows([
 	('first', 1),
 	('second', 2),
 	('third', 3),
 ])
 
 Postgres.WARNING("%d rows" %(count,))
 # WARNING:  3 rows
 # CONTEXT:  [__inline_executor__ while executing main]
 
Contents of the temporary table::

 SELECT * FROM load_rows_test;
 --   t    | i
 ----------+---
 -- first  | 1
 -- second | 2
 -- third  | 3
 --(3 rows)

See :ref:`pg_statement` and :ref:`pg_cursor` for further information about
the database access interfaces.

.. _errors:

Error Control
=============

When a database error is caused by use of a Postres interface, a
:ref:`pg_exception` is raised. After a database error occurs, access to the
database is prohibited until the failed
transaction state is corrected. The transaction state cannot be
corrected within the function unless the error occurs inside a
:ref:`pg_transaction` block::

 CREATE TEMP TABLE pkviolation (s SERIAL PRIMARY KEY);
 INSERT INTO pkviolation VALUES (DEFAULT);
  
 DO $python$
 
 import Postgres
 
 insert_stmt = prepare("INSERT INTO pkviolation VALUES ($1)")

 try:
     with xact():
         insert_stmt(1)
 except Postgres.Exception as x:
     if x.code != '23505':
         raise
     Postgres.WARNING("caught unique violation")
     insert_stmt(12)
 
 $python$ LANGUAGE python;
 -- DO
 
 SELECT * FROM pkviolation;
 -- s
 ------
 --  1
 -- 12
 --(2 rows)

:ref:`pg_transaction` context managers can only be used once, so
``xact`` should normally be directly invoked by the with-statement,
``with xact():``.

.. note::
   It is important that exceptions are caught *outside* of
   the transaction block. *Attempting to interact with the database*
   *while in a failed transaction will result in another exception being raised.*

Postgres errors can be raised from Python using :ref:`pg_ereport`.
However, the :ref:`pg_error` wrapper is more convenient, so direct use
of ``ereport`` is *not* recommended. This
can be used to construct custom errors with more useful information than the
more verbose Python exception output::

 import Postgres
 
 Postgres.ERROR(
     message = 'application error message',
     code = 'AP001',
     # Don't include the usual PL context(traceback).
     inhibit_pl_context = True,
 )

By default, all exceptions include the traceback and additional context
identifying the Postgres function that raised the exception. The
``ereport`` interface allows this information to be
suppressed in order to provide the programmer with greater control over the
reported error message.
