-----------------------------------------------------------------
PySQLite: Python DB-API 2.0 Compliant Interface Module for SQLite
-----------------------------------------------------------------

This document was last updated for PySQLite version 0.4.3.

===============
0. Front Matter
===============

0.1 Copyright notice and License
--------------------------------

(c) 2002 Michael Owens
(c) 2002-2003 Gerhard Hring

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted, provided that
the above copyright notice appear in all copies and that both that copyright
notice and this permission notice appear in supporting documentation.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.

0.2 Abstract
------------

SQLite is a powerful embedded relational database management system in a
compact C library, developed by D. Richard Hipp. It offers support for a large
subset of SQL92, multiple tables and indices, transactions, not to mention ODBC
drivers. The library is self-contained and implemented in under 20,000 lines of
code, which itself is uncopyrighted, and free to use for any purpose. It runs
on a wide variety of platforms including Linux, FreeBSD, Windows, and has been
ported to the Sharp Zaurus.

The goal of this project is to develop a Python extension for SQLite that
conforms to the Python Database API Specification 2.0, exposes all SQLite
functionality to Python programmers and offers additional ease-of-use features.

===============
1. Introduction
===============

1.1 SQLite and Python
---------------------

1.1.1 Overview
~~~~~~~~~~~~~~

SQLite is a powerful embedded database implemented in a small, compact C
library. It has been adapted to command-line utilities, specialized C programs,
web servers, PDAs, and programming languages, of which Python is but one.

Because of its small size, wide platform availability, liberal copyright
license, and solid design, it lends itself to wide range of applications in
both open source and commercial software. Programmers who have traditionally
used simple BTree databases such as gdbm can now take advantage of using a
relational database without adding substantial overhead to their applications.
Users who often wrote simple reports in Microsoft Visual Basic and Access can
now easily use Python and SQLite for free, and the resulting databases and
programs are no longer limited to Windows-only platforms. The very same code
will also run just as easily on an Apple Macintosh or Sun UltraSPARC.

Python is the best scripting language in the world (I say without the least
bias). It is clean, simple, elegant, and has an extensive library and user
base. Like SQLite, it is available on everything from PDAs to mainframes. It is
used in scientific computing, system administration, graphical user
applications, search engines, web servers, movie studios, and the list goes on.
It is not only an excellent language for professional programmers, but for
beginners as well. It has been ported to a wide array of operating systems and
architectures and is free on every one of them.

1.1.2 Applications
~~~~~~~~~~~~~~~~~~

The combination of Python and SQLite makes programming with a relational
database both open and available, spanning operating systems and architectures,
and all at a surprisingly low cost [1]_. PySQLite's application ranges from both
sides of the continuum: Python can be a handy administration language with
which to make operating on SQLite databases easier, and likewise SQLite can
serve as a convenient persistent storage mechanism for use in existing Python
applications.

1.1.3 Ease of Use
~~~~~~~~~~~~~~~~~

The ease of use of both SQLite and Python cannot be understated. By itself, the
SQLite C library requires only three functions to execute an SQL query and
retrieve its results. Python makes it even easier by virtue of its being a
scripting language. A complete program to sort a taxonimic database of plants
by species and genus name could be done with the following program::

    import sqlite

    cx = sqlite.connect("db")
    cu = conn.cursor()
    cu.execute("""
        select genus, species, family, category
        from calflora
        order by species, genus
        """)
    for row in cu.fetchall():
        print "%14s, %15s, %19s, %8s, %25s, %i" % (row.genus,
                                                   row.species,
                                                   row.family,
                                                   row.category)
    cx.close()

1.2 Required Software
---------------------

- PySQLite requires Python 2.1 or later.

- PySQLite requires SQLite 2.5.6 or later. Using the most current one is
  recommended in order to enjoy the full feature set.

1.3 Thread Safety
-----------------

SQLite does not directly support threads, but at the same time it does not
impede their use either. Detailed information regarding the use of threads is
available on the website. In essence, you can use threads safely as long as
each thread maintins its own connection and associated cursors. Multiple
threads accessing the same connection or cursor is not safe, unless you take
precautions using the threading module. Thus, PySQLite is thread safe at level
1, as defined in the DB-API Specification.

1.4 Installation
----------------

The source distribution can be built on both Windows and UNIX. However,
building on Windows can be somewhat more difficult due to differences in build
tools.

All platforms require that both Python and SQLite libraries/headers are already
present on the system.

1.4.1 UN*X Installation
~~~~~~~~~~~~~~~~~~~~~~~

The source tarball is currently the only method of installation on UNIX.
Fortunately, it is easy and painless. After you untar the source, cd into the
pysqlite directory and do the following as root::

    python setup.py build
    python setup.py install

1.4.2 Windows Installation
~~~~~~~~~~~~~~~~~~~~~~~~~~

On Windows, Python's distutils defaults to the Visual C++ compiler from
Microsoft. If you want to use other compilers for compiling Python extensions
on Windows, look into chapter 3.1 "Using non-Microsoft compilers on Windows" in
the "Installing Python Modules" of your Python documentation. It's available
online at http://www.python.org/doc/current/inst/non-ms-compilers.html

The following are build instructions for the GNU C compiler, Borland C++ and
for Microsoft's Visual C++ environment.

1.4.2.1 Using the GNU C Compiler
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the most tested method, as the author uses mingw to create the win32
binaries for PySQLite.

As you have read the Python documentation for non-Microsoft compilers by now,
you have mingw or Cygwin installed and created the required import library for
the Python DLL. Fine, let's continue.

1. From http://www.hwaci.com/sw/sqlite/download.html get the
sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same
directory. Create an import library for the GNU linker:

``dlltool --def sqlite.def --dllname sqlite.dll --output-lib libsqlite.a``

2. Unpack the PySQLite sources and open setup.py in your editor. Search for
win32. Change the include_dirs and library_dirs variable to point the place
where you've unpacked the SQLite files and where you created the import
library.

3. Build PySQLite:

``python setup.py build --compiler=mingw32``

1.4.2.2 Using the Borland Compiler
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As you have read the Python documentation for non-Microsoft compilers by now,
you have installed the Borland C++ compiler and created the required import
library for the Python DLL. Fine, let's continue.

1. From http://www.hwaci.com/sw/sqlite/download.html get the
sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same
directory.

2. Create an import library for the Borland linker:

``implib -a sqlite.lib sqlite.dll``

3. Unpack the PySQLite sources and open setup.py in your editor. Search for
"win32". Change the include_dirs and library_dirs variable to point the
place where you've unpacked the SQLite files and where you created the
import library.

4. Build PySQLite:

``python setup.py build --compiler=bcpp``

1.4.2.3 Using Microsoft Visual C++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. From http://www.hwaci.com/sw/sqlite/download.html get the
sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same
directory.

2. Create an import library for the Microsoft linker:

``lib /def:sqlite.def``

3. Unpack the PySQLite sources and open setup.py in your editor. Search for
win32. Change the include_dirs and library_dirs variable to point the place
where you've unpacked the SQLite files and where you created the import
library.

4. Build PySQLite:

``python setup.py build``

1.4.2.4 After Building PySQLite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Make sure the sqlite.dll can be found. Either copy it into your system
directory, somewhere else in your PATH or in the same directory as the .pyd
file from PySQLite.

2. Run the included test suite with these or similar commands::

    copy ..\sqlite\sqlite.dll build\lib.win32-2.3
    copy test\*.py build\lib.win32-2.3
    cd build\lib.win32-2.3
    python all_tests.py

All the tests should pass. If something goes wrong, report it to the
PySQLite developers using the Sourceforge bug tracker.

3. Install PySQLite:

``python setup.py install``

Again make sure the sqlite.dll can be found

1.5 Getting Help
----------------
 

1.5.1 Mailing Lists
~~~~~~~~~~~~~~~~~~~

If an answers cannot be found in the documentation, you can post to the
PySQLite users' mailing list on SourceForge. To join the list, go to
http://lists.sourceforge.net/lists/listinfo/pysqlite-users. 

================================
2. Features Specific to PySQLite
================================

2.1 API Modifications
---------------------

2.1.1 Functions
~~~~~~~~~~~~~~~

connect()

    SQLite is a file-based embedded database. As such, you only need to give
    the path to the database file to the connect function. 

2.2 API Additions
-----------------

2.2.1 Data Types
~~~~~~~~~~~~~~~~

SQLite is typeless. It stores all data as text. However, all data returned from
PySQLite are cast into their respective Python types as declared in the
database schema, provided that said types are standard ANSI SQL type (e.g.
VARCHAR, INT, FLOAT, etc.).

In addition to using the schema, PySQLite allows you to specify the data types
of columns to be retrieved by a select statement using types in a SQL comment.
This is done using execute(). When fetchone(), fetchmany() or fetchall() is
called, PySQLite converts columns in the returned row to the types specified in
the types list. For example, say you have a table defined as such::

    create table plants
    (
      id integer primary key,
      name varchar(25),
      av_height int,
      av_lifespan float
    );

Now you want to select all records in plants and have them properly typed. You would do as follows::

    cnx = connect('db')
    c = cnx.cursor()
    c.execute('-- types int,str,int,float')
    c.execute('select * from plants')

PySQLite looks for types in SQL comments and if found, stores the list of types
internally for later use when rows are fetched from the result set. In this
particular example that even without the '-types' mechanism, data types would
have still been properly cast since they are ANSI datatypes.

2.2.1.1 Custom Types
~~~~~~~~~~~~~~~~~~~~

While PySQLite supports native Python data types, you can also apply your own
data types to result sets as well. When you do this, PySQLite will pass the
field value to your type's constructor when converting the column type. The way
to do this is to add your type the the converters list, which is done on
connect() This is perhaps best illustrated by example::

    class my_type:
        def __init__(self,value):
            self.value = value

        def __repr__(self):
            return "my_type(%.3f)" % float(self.value)

        def __str__(self):
            return "%.3f" % float(self.value)

    cnx = sqlite.connect(db="db", mode=077, converters={'my_type':my_type})
    c = conn.cursor()

    SQL = "select genus, species, family, category, author1, upper_elev_ft " \
          "  from calflora order by genus, species limit 33"

    cursor.execute('-- types str,str,str,str,str,my_type')
    cursor.execute(SQL)

The resulting rows fetched will have upper_elev_ft formatted with three decimal places.

2.2.2 Result Sets
~~~~~~~~~~~~~~~~~

PySQLite's result set class is taken from pyPgSQL, which adds many niceties not
required by the DB API Specification. As such, fetchone() returns more than
just a tuple of fields, but a flexible data structure which can behave as a
tuple, but also has attributes with the field names from the returned rows.
Thus you can do things as in the following example::

    c.execute('select last_name, street, city, zip from contacts')

    row = c.fetchone()

    # By field number (the DB-API way)
    print row[1]

    # By attribute
    print row.city
    print row.zip

    # As a slice
    print row[2:-1]

2.2.3 Aggregates and Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The SQLite Extended C API has a nice framework for allowing you to add your own
functions and aggregates which can be called from SQL. For example, while
SQLite does not support the concept of CURRENT_DATE, you could use the
framework to create an equivalent function (current_date()) so that statements
like SELECT current_date() or INSERT into orders (cust_id, price, order_date)
values (1, 100, current_date() are valid statements within SQLite.

The only problem is that the SQLite framework requires you to implement these
extensions in C. Fortunately, PySQLite provides a framework which allows you to
implement them in Python. The process of creating user-defined functions and
aggregates is very similar to creating custom types. The best way to
demonstrate these is by example.

2.2.3.1 Functions
~~~~~~~~~~~~~~~~~

The following is from functions.py which is included in the calflora example.
It creates a function times_two which takes a single argument and appropriately
return a multiple of two. The example applies the times_two argument to the
upper_elev_ft field in a simple select statement.::

    # Function def times_two(x): return float(x)*2

    conn = sqlite.connect("db", 077)

    # Add the aggregate
    conn.create_function("times_two", 1, times_two)

    cursor = conn.cursor()

    # Expect string, then two float values for columns
    cursor.execute('-- types str, float, float')

    # Execute SQL with custom aggregate
    cursor.execute("select family, upper_elev_ft, times_two(upper_elev_ft) from calflora limit 10")

    # DB-API extension that requires Python 2.2, else use cursor.fetchall():
    for row in cursor:
        print row

2.2.3.2 Aggregates
~~~~~~~~~~~~~~~~~~

The following is from aggregates.py which is included in the calflora example.
It creates an aggregate called variance which computes the variance of a set
(or population). This example determines the variance in elevation heights (the
upper_elev_ft field) of all plant species listed in the table.

::

    class variance:
        def __init__(self):
            self.X = []

        def step(self, x):
            self.X.append(float(x))

        def reset(self):
            self.X = []

        def finalize(self):
            mu = 0
            n = len(self.X)

            for x in self.X:
                mu += x
            mu = float(x)/n

            sigma = 0
            for x in self.X:
                sigma += (x-mu)**2

            sigma = sigma / (n - 1)

            return sigma

    conn = sqlite.connect("db", 077)

    # Add the aggregate
    conn.create_aggregate("variance", 1, variance)

    cursor = conn.cursor()

    # Expect float value for column
    cursor.execute('-- types float')

    # Execute SQL with custom aggregate
    cursor.execute("select variance(upper_elev_ft) from calflora")

    # Fetch
    row = cursor.fetchone()

    # Print
    print "Upper elevation variance: %.2f" % row[0]

.. [1] That is to say free. 
