Forth-SQL Interface








Introduction

This implementation of ForthQL depends on a two basic principles of operation:

Portability

This module is written in VFX Forth for Linux but it can probably be ported easily to other Forth environments as well.

The following aspects of VFX has been used:




Embedding SQL code

Examples shown are based on the paper by N. J. Nelson cited in the References section.

The main idea of this interface is to embed multiline SQL statements inside colon definitions as shown below:


: TEST1
  SQL| INSERT INTO delegates      \ example comment 1
  VALUES ('Chuck','Moore',1) |SQL
;

SQL statements are started by the word SQL| and ended either with |SQL, |SQL. or |SQL> . They span one or more lines and Forth comments starting with \ can appear in between. These comments are typically used to document the SQL code itself.

The SQL statement is compiled into a Z-string inside the current definition (TEST1 in this case) and sent to the DBMS for execution when the current definition is executed. Apart from these comments, which are stripped in the actual SQL string, the syntax of the SQL statement depends on the actual DBMS being used.

Ending word variations |SQL or |SQL> are used whenever zero or one Forth callback action are used. See glossary section below.

Lack of SQL ending word inside a file will result in "delimiter not found" exception.




Dynamic SQL code

ForthQL allows not only static SQL statements to be embedded in Forth but also allows for dynamic SQL statement generation by means of SQL parameters as in the example:


: TEST2
  SQL| SELECT cuscode,cusname FROM customers  \ Fixed part
  WHERE cusid BETWEEN
  | LOWLIMIT |  AND | HIGHLIMIT               \ parameters
  |SQL> TESTOUT                               \ Output
;

In this example, we build a template SQL statement. Words LOWLIMIT and HIGHLIMIT are template parameters and will patch the SQL string with proper values at runtime, when TEST2 is executed.

CHAR | is both used as the ending character of the static SQL string (as in the first and third appearance) and as a Forth word that restarts the SQL string compilation (as in the second appearance).

SQL parameters are specified like in the example below:



:noname    \ ca1 u1 --
  s"    12" drop -rot move ;
5 CHARS SQLParameter: LOWLIMIT

   s" 100009" drop -rot move ;
6 CHARS SQLParameter: HIGHLIMIT

At compile time, when the template SQL is being built, parameters need to specify how many space or 'width' they need in the SQL string. LOWLIMIT needs 5 CHARS and HIGHLIMIT needs 6 CHARS.

At runtime, SQL parameters run an execution token xt This token is for a word that must fill in just the part of the SQL string ca1 u1 reserved to its parameter. Of course, u1 is equal tothe parameter 'width'. Nameless definitons above will fill reserved spaces with literals 12 and 100009 respectvely.

This mechanism is totally independent from the usual SQL run time parameter binding using wildcards.

Parameter edition support

The above mechanisms are enough to edit any string needed into the parameter memory area. However, we can take advantage of VFX GenIO Architecture and use well known words like TYPE, . , EMIT and friends. The advantages are best appreciated when formatting numbers. The above example can be re-written as:



:noname                           \ ca1 u1 --
  [sqlio 12 . sqlio] ;
5 CHARS SQLParameter: LOWLIMIT

:noname                           \ ca1 u1 --
  [sqlio 100009 . sqlio] ;
8 CHARS SQLParameter: HIGHLIMIT

Words [sqlio opens a context where standard I/O is redirected to the ca1 u1 memory region using the SQL-MemDev GenIO Device. Word sqlio] closes this device and restores standard I/O to previous values. Raw access to this memory zone is still possible using IOCTL-GENs, but it should not be necessary in this approach.

Warnings:




Processing SQL output

Word |SQL just executes the SQL code for a statement that does not produce a result set, like an insertion into a table (as in TEST1). Of course, other situations will result in the retrieval of one or more rows in a result set, which can be processed one by one by a Forth word.

Word |SQL> - as shown in TEST2 - will also compile some runtime code and the execution token for the next word (TESTOUT in this case). The runtime code is responsible to invoke TESTOUT for each row being retrieved. Some DBMS APIs like SQLite3 has a C callback interface designed for this purpose. See the proper section for details.

Support for OOP

Extending the concept of |SQL>, another word named |SQL>> lets you specify both a callback and a client data pointer. This gives some support to integrate callback actions with object methods.

This feature is highly dependant on the DBMS API and the OOP library used. The following example is done using SQLite3 and GForth 'objects.fs' library ported to VFX Forth for Linux.



sql-dry-run off

: quote   ( -- ) [char] ' emit ;

: [qtype]   ( ca u -- )
   postpone quote
   postpone type
   postpone quote
; immediate

: creation
   SQL| CREATE TABLE IF NOT EXISTS Person(
      name TEXT PRIMARY KEY, surname TEXT, age INTEGER);
   |SQL.
;

: persons
   SQL| SELECT * FROM Person; |SQL.
;

object class
   cell% inst-var m-name
   cell% inst-var m-surname
   cell% inst-var m-age
end-class Person


Person methods
protected
  :noname                                 \ ca1 u1 --
    [sqlio m-name $@ [qtype] sqlio]
  ; 16 chars SQLParameter: /name/

  :noname                                 \ ca1 u1 --
    [sqlio m-surname $@ [qtype] sqlio]
  ; 16 chars SQLParameter: /surname/

  :noname                                 \ ca1 u1 --
    [sqlio  m-age ? sqlio]
  ; 8 chars SQLParameter: /age/

  :m (refresh)  ( colValue** this -- )
    0 sql3-$@ evaluate m-age !
  ;m

  :noname   ( object* nCols colValue** colName** -- ior )
    drop nip swap (refresh)
    SQLITE_OK
  ; SQLite3Callback: <<refresh>>


public
   m:   ( ca1 u1 ca2 u2 age this -- ) \ overrides construct
    0 dup m-name ! m-surname !
    m-age ! m-surname $! m-name $!
  ;m overrides construct

  :m save-person   ( this -- )
     SQL| INSERT OR REPLACE INTO Person
     VALUES( | /name/ |  , | /surname/ | , | /age/ | );
     |SQL.
  ;m

  :m refresh ( this -- )
    ." Refresing age from database for " m-name $@ type cr
    SQL| SELECT age FROM Person WHERE name = | /name/
    |SQL>> this <<refresh>>
  ;m

end-methods

: Homer   s" Homer" ;
: Maggie  s" Maggie" ;
: Bart    s" Bart" ;
: Simpson s" Simpson" ;

Homer  Simpson 43 Person heap-new constant p1
Maggie Simpson 40 Person heap-new constant p2
Bart   Simpson 14 Person heap-new constant p3

s" simpson.db" db-open db-throw creation

p1 save-person
p2 save-person
p3 save-person
p1 refresh



Debugging SQL code

VARIABLE SQL-DRY-RUN controls the behaviour of |SQL, |SQL> and |SQL. If set, it will compile code to type the contents of the SQL buffer and then EXIT. Parameters are seen with its run-time values into their places. Memory overruns in the SQL buffer are a likely cause of SQL syntax errors.

This is a compilation flag. To disable it, you must turn it off and reload your code.




Tunning the SQL buffer size

SQL statements are compiled (as Z strings) inside the current definition in a similar way as words S" or ." do. The size of this per-word SQL Buffer is is controlled by the VALUE #SQLBuffer with a default size. However, the user can change this value prior to defining a given SQL statement to tune for very large statements and to avoid waste of memory space for short statements.

A buffer too short for a given statement will result in a buffer overflow exception. Forth comments do not count against this limit.

Words SQLBuffer> and >SQLBuffer, used as a pair surronding the definition, change and restore to a previous value the SQL buffer size.

VARIABLE SQL-TUNNING controls the printing of compile-time diagnostics on the SQL buffer memory usage. By default it is turned off.



SQL-TUNNING ON
#148 CHARS SQLBuffer>
: MYDEF   ( -- )
  SQL| CREATE TABLE IF NOT EXISTS Person(
     oid INTEGER PRIMARY KEY,           \ object Id
     name TEXT,
     surname TEXT,
     age INTEGER
     );
     |SQL.
;
>SQLBuffer
\ and the output is ...
MYDEF SQL Buffer = 148 bytes , used = 148 , wasted = 0

Example above shows the exact amount of tweaking using SQLBuffer>, hence the 0 wasted bytes.




Multithreading

ForthQL depends on a simple API DB-EXECUTE or DB-PROCESS This API hides the database connections. It is up to the actual Forth DBMS driver to make this thread-safe For instance, it could declare a USER variable DB-HANDLE. to have several threads with its own connection.

Some data structures like SQL-MemDev are conditionally compiled into USER variables if SQL-MULTITHREAD is non-zero.




Limitations




Glossary

Support words

General purpose words which could be placed elsewhere and not related to ForthQL. They may be placed in my Extras project in a future.



Caddr/len strings

: STRING/ \ ca1 u1 u -- ca2 u2
Get the string-matched ca2 u2 from the string-remaining ca1 u1 and the length u of the original string. (Jenny Brien).

: /COMMENT                                          \ ca1 u1 -- ca2 u2
Strip trailing Forth backlash comments from a string.



Input Specification and Parsing

: PARSE-AREA@             \ -- ca u
Get the as yet unparsed portion of the input buffer. (Jenny Brien).

: PARSE-AREA!          \ ca u -- ;
Set the portion of the input buffer still to be parsed to ca u. Must start within the input buffer! (Jenny Brien).

: PARSE-AREA/               \ ca1 u1 -- ca2 u2
Get the already parsed string ca2 u2 in the input buffer from the yet unparsed ca1 u1 string. Similar to what STRING/ does.

: SKIP-CHAR                       \ ca1 u1 --
Skip 1 CHARS (usualy the SCANned character) from the remaining space ca1 u1 in the Forth input buffer, updating the input buffer.



Memory operations

: +MOVE                                   \ ca1 u1 ca2 u2 -- ca3 u3
Move memory region ca2 u2 to receiver memory buffer ca1 u1. Available receiver memory buffer is now ca3 u3.

Generic DB API

This simple API is needed for ForthQL. These words must be implemented by the underlying DBMS Forth driver module. You must include this driver module before ForthQL module. Connection to DBMS is out of the scope.

Usage of zero-teminated strings (Z-strings) are required to ease interfacing with foreign, C-based DBMS APIs.

: db-execute                      \ z-addr -- ior
Execute SQL statement z-addr not returning any output, like table creation or row insertion/update/delete. ior code signals operation result.

: db-process                      \ z-addr xt -- ior
Execute SQL statement z-addr. When complete, word given by xt is repetidely called, row by row, to process output. ior code signals operation result.

: db-process-with                         \ z-addr xt1 xt2 -- ior
Execute SQL statement z-addr. When complete, word given by xt2 is repetidely called, row by row, to process output. ior code signals operation result. xt1 is the execution token for a word that - when executed - return a cell with client data. Client data can be anything, but mostly will be an object handle for OOP support (the this keyword).

<clientData> VALUE clientDataPtr   \ xt1 is ' clientDataPtr.
: this ( -- handle) ... ;          \ xt1 is ' this

: db-throw                       \ ior --
Map DB specific ior into an appropiate user exception and throw it. Exceptions are DBMS specific.

: db-print                      \ z-addr --
Print the SQL statement z-addr being executed and its result set nicely formatted.



SQL Buffer management

#256 CHARS Value #SQLBuffer
SQL Memory Buffer default size in bytes.

: SQLBuffer>                              \ n1 -- n2
Set the current SQL buffer size to n1 bytes. Return the old value n2 for latter restoration. Intended to use in pair with >SQLBuffer. See example in the Tunning the SQL buffer size section.

: >SQLBuffer                              \ n1 --
Set the current SQL buffer size to n1 bytes.

Variable SQL-TUNNING
Flag. If true, prints the actual size SQL string being compiled so that you can fine tune #SQLBuffer for that word.

: ?sql-tunning                            \ ca1 u1 -- ca1
Print a summary report of bytes being used for the SQL Buffer in the word being defined.

SQL Code Compiler words

Variable SQL-DRY-RUN
Activate a mode where only SQL compiler words only compile code to print the SQL code. Do not execute anything on the DBMS. Warining: This is a compile (loadtime) flag, not runtime option

Exceptions thrown at compile time.

ErrDef SQLDelimErr "No SQL Delimiter | found"

ErrDef SQLBuffOvf  "SQL Buffer overflow"

: SQL|                                    \ -- ca1 u1
Start SQL string compilation, initializing a SQL memory buffer and concatenating verbatim until following CHAR |. Any backlashed comments are removed. Preserves newline characters.

: |                                       \ ca1 u1 - ca2 u2
Resume SQL string compilation to SQL buffer ca1 u1 until next CHAR | is found, as above. Leave remaining SQL buffer ca2 u2.

: |SQL                          \ ca1 u1 --
End current SQL statement compilation. Receives as input the remaining SQL buffer ca1 u1 to find out and compile the SQL buffer starting address. Compile internaly a call to DB-EXECUTE with all necessary parameters. At run time, DB-EXECUTE is invoked, which will send the SQL statement for the DBMS to execute.

: |SQL>                         \ ca1 u1 "action" --
End current SQL statement compilation and compiles the following "action" word. Receives as input the remaining SQL buffer ca1 u1 to find out and compile the SQL buffer starting address. Compile a call to DB-PROCESS with all necessary parameters. At run time, DB-PROCESS is invoked which will send the SQL statement for the DBMS to execute. For each row returned by the DBMS as the result, it will invoke the xt of "action".

: |SQL>>                                \ ca1 u1 "object" "action" --
End current SQL statement compilation and compiles both the following "object" and "action" words. Receives as input the remaining SQL buffer ca1 u1 to find out and compile the SQL buffer starting address. Compile a call to DB-PROCESS-WITH with all necessary parameters. At run time, DB-PROCESS-WITH is invoked which will send the SQL statement for the DBMS to execute. For each row returned by the DBMS as the result, it will invoke the xt of "action" with the xt of "object" as a parameter.

: |SQL.                         \ ca1 u1 --
End current SQL statement compilation. Receives as input the remaining SQL buffer ca1 u1 to find out and compile the SQL buffer starting address. Compile a call to DB-PRINT into the current definition. At run-time, DB-PRINT is executed which sends the SQL statement to the DBMS for execution and prints (a more or less nicely formatted) result set.

Dynamic SQL Statements

: SQLParameter:                  \ xt width -- ; [child] ca1 u1 pfa -- ca2 u2
Define a SQL parameter that will get substituted at runtime when the definition containing the SQL code is executed. width is the 'parameter width', that is the amouts of bytes in the SQL string that must be reserved to be patched later on. Children words receive the remaining SQL buffer ca1 u1 to just to compile ca1 width into the word that is defining the SQL statement; along with the xt and to calculate the remaining buffer. ca2 u2. xt is the execution token for an action word that performs this patching, receiving exactly the given memory area, like:

: parameter-filler   ( ca1 width -- ) ... ;


Support to parameter edition

The following words help parameter edition at runtime by doing I/O at the memory area to fill. They are not strictly needed but they are quite convenient.

TextBuff: SQL-MemDev
GenIO Memory Device SID. The multithreaded version is a USER variable, that must be initialized per thread with the phrase SQL-MemDev initTextBuffSid.

: +SQLMemDev                              \ ca1 u1 --
Open the SQL-MemDev GenIO memory device to edit memory zone ca1 u1. Standard I/O is redirected to this device. Words like EMIT, . or TYPE write on this region.

: -SQLMemDev       \ --
Close the SQL-MemDev GenIO Memory Device to for parameter edition. Warning: Standard I/O is not yet restored to previous value.

: [sqlio                                  \ ca1 u1 -- R: -- ip-handle op-handle
Convenience macro for the phrase [io +SQLMemDev . Open the SQL-MemDev memory device, configures it to use buffer ca1 u1 and redirects standard I/O to it. Intended to use inside a colon definiton at the start of a SQLParameter action. See the Dynamic SQL code section at the beginning of this chapter.

: sqlio]                                  \ R: -- ip-handle op-handle
Convenience macro for the phrase -SQLMemDev io] . Close the SQL-MemDev memory device and restores standard I/O to its previous settings. Intended to use inside a colon definiton at the end of a SQLParameter action. See the Dynamic SQL code section at the beginning of this chapter.