By Gregory H. Deatz
Updated by Craig Stuntz
This article is both a quick introduction to writing simple User Defined Functions (UDFs) for InterBase and a fairly complete reference guide to UDF programming techniques. Accordingly, it is divided into two sections.
In the first section, the reader will learn how to quickly create simple UDFs in Delphi. Since UDFs, in general, should be simple, this section may be all that is necessary for most InterBase users. In the second section, we’ll explore the reasons behind the suggestions in the first section.
What is a UDF? A user defined function (UDF) in
InterBase is merely a function written in any programming language that is
compiled into a shared library. Under Windows platforms, shared libraries are
commonly referred to as dynamic link libraries (DLLs). This
simple use of shared libraries provides the developer with a large amount of
power and flexibility. Virtually any function that can be exposed through a DLL
can be used by InterBase. This comment, however, should be taken with a grain
of salt — the intent of a UDF is to perform some small operation that is not
available in SQL or InterBase’s stored procedure language.
An example of a UDF is
function Modulo (Numerator, Denominator: Integer): Integer
Divide Numerator by
Denominator and return the remainder.
This function is
essential in many routines, but it is not available in InterBase’s DSQL
language.
Before
we start going through some examples of writing UDFs, let's talk about what you
should be doing, and what you should not be doing.
Once you get the hang of
writing UDFs, you will probably think that a whole world of InterBase
extensibility has opened up to you through UDFs. On the one hand, it has; the mechanisms for invoking UDFs are
quite simple, and since a UDF is simply a routine written in your favorite
programming language, you can do virtually anything, right?
Well, yes and no... One
thing you can't do with UDFs: You can't pass NULLs to them – that is, if you pass a variable with
the state of NULL, the UDF will not be able to distinguish this
from a non-NULL state. Likewise, a UDF cannot return a NULL value.
Also, a UDF does not
work within the context of a transaction. That is, transaction level
information cannot be passed to a UDF, and therefore, a UDF isn't able to
access the database. Sort of. A UDF can establish a new connection to the
database and start another transaction if it so desires, but this is where we
come to the “dos and don’ts,” not to the “can’ts.”
When you write UDFs, you
should follow these simple rules:
1. A UDF should be a simple, quick function.
2. A UDF should not attempt to access or directly
affect the state of the database.
3. A UDF should not raise unhandled exceptions.
4. All code must be thread-safe.
What does this mean? Well, functions that trim a string, perform modulo arithmetic,
perform fancy date arithmetic, or evaluate aspects of dates are all nice,
simple, quick functions. They are good candidates for UDFs.
Now, a function that
attaches to a database, and inserts, deletes or updates data is probably a bad
idea. A function that launches a program that performs a series of complex
tasks is probably a bad idea. Why? Quite simply because these types of
functions might interfere with InterBase’s transaction handling or, even
worse, they could significantly degrade the performance of your server. As soon as a UDF is called, the thread that
called that UDF blocks until the UDF returns.
Remember, of course,
that these are general guidelines. Your particular business case might dictate
a need to do something that is generally bad because in your case it is
specifically good.
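Rule 3 is the one that is easiest to illustrate in a few lines. The program below is only a sketch: ParseInt is a made-up routine, not part of any library discussed here, and it uses PAnsiChar where the rest of this article writes PChar (the two are the same type in the Delphi versions this article targets). The point is simply that the whole body sits inside try/except, so nothing can escape to the caller.

```pascal
program GuardedUdfSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical UDF-style routine: converts text to an integer and
  never lets an exception leave the function. }
function ParseInt(sz: PAnsiChar): Integer; cdecl;
begin
  try
    // StrToInt raises EConvertError when the text is not a number
    Result := StrToInt(string(AnsiString(sz)));
  except
    on Exception do
      Result := 0; // fall back to a harmless sentinel value
  end;
end;

begin
  WriteLn(ParseInt('42'));    // prints 42
  WriteLn(ParseInt('forty')); // prints 0 instead of raising
end.
```

Whether 0 is an acceptable sentinel depends entirely on your data; the shape of the guard is what matters.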
Let's get to the heart
of the discussion, now!
Choose File->New.
Choose DLL (for Delphi) or Shared Object (for Kylix).
Now, choose File->New, then choose Unit.
It might be wise to do a Save All at this point... put your project where you feel is a good spot...
Create a modulo routine; in the newly created unit, declare the routine in the interface section:
function Modulo(var i, j: Integer): Integer; cdecl;
Don’t forget the cdecl; it’s important. Implement the routine in the implementation section:
function Modulo(var i, j: Integer): Integer; cdecl;
begin
  // Avoid raising an EDivByZero exception: just check the boundary
  // condition, and return a reasonably uninteresting answer.
  if (j = 0) then
    result := -1
  else
    result := i mod j;
end;
In the newly created project source (.DPR), type the following immediately before the final “begin end.” block:
exports
Modulo;
Build the project, and you now have a working library.
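Before handing the library to InterBase, it can be worth exercising the routine from a small console program. The sketch below repeats the Modulo routine so that it stands alone; the program name and the variables a and b are ours. Note that the arguments have to be variables, because the routine takes them by reference.

```pascal
program ModuloCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Same logic as the UDF, repeated here so the program is self-contained. }
function Modulo(var i, j: Integer): Integer; cdecl;
begin
  if j = 0 then
    Result := -1      // guarded boundary case
  else
    Result := i mod j;
end;

var
  a, b: Integer;
begin
  a := 3;
  b := 2;
  WriteLn(Modulo(a, b)); // 3 mod 2, prints 1
  b := 0;
  WriteLn(Modulo(a, b)); // zero denominator, prints -1
end.
```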
The simplest way to get InterBase to find the DLL (for Windows) or .so (for Linux) is to copy it to the UDF directory under the InterBase installation. In a default Windows installation of IB 6 or higher, this may be “c:\Program Files\Borland\IntrBase\UDF,” but you might need to create the UDF directory yourself. In IB5.6 and older versions, the DLL should go into the …IntrBase\bin directory.
To use the UDF, do the following:
1. Connect to a new or existing database using ISQL.
2. Type the following:
declare external function f_Modulo
integer, integer
returns
integer by value
entry_point 'Modulo'
module_name 'filename minus extension [.so or .dll]';
3. Commit your changes.
4. Now test it...
select
f_Modulo(3, 2)
from
rdb$database
Whew! That was really easy, wasn't it?
We’ve successfully
written a Delphi DLL or Kylix .so that can be used as a UDF library in
InterBase. Wouldn’t a great function be
function Left (sz: Pchar; Len: Integer): Pchar;
Return the Len leftmost
characters of sz.
The question is, how do
we return a string? Already implied by the above declaration, InterBase doesn’t
respect Object Pascal’s String data type, so we are
forced to use PChar.
PChars versus Strings is not a big deal, except that Object Pascal
does not automatically clean up PChars, and the memory to store a PChar must be, at some point, explicitly allocated by
the developer. Our first UDF returned a
scalar value on the function’s calling stack, which meant that we did not have
to worry about cleanup issues. With any form of string, cleanup issues must be
addressed.
Fortunately, InterBase
provides a very simple solution to this issue.
We can tell InterBase to allocate a string for the result and pass it to
our UDF as a parameter, and InterBase will free the memory when it’s no longer
needed. We add an extra parameter to
the function so that InterBase can pass an allocated string, and return a
pointer to this parameter as a result.
function Left(sz: PChar; var Len: Integer; szRes: PChar): PChar; cdecl;
var
  i: Integer;
begin
  { cstring arguments arrive as plain PChar pointers; only the
    integer is passed by reference }
  { First, make Result point to same memory as szRes }
  result := szRes;
  { Initialize counter }
  i := 0;
  while (sz^ <> #0) and (i < Len) do
  begin
    { Copy one character }
    szRes^ := sz^;
    { Increment pointers to next character }
    Inc(i);
    Inc(sz);
    Inc(szRes);
  end;
  { Terminate result string }
  szRes^ := #0;
end;
Now, in InterBase, we declare it as follows:
declare external
function Left
cstring(64), integer, cstring(64)
returns parameter 3
entry_point 'Left' module_name 'UDFTest'
Obviously, you’ll need
to replace “UDFTest” with the actual name of your library file, minus the .so
or .dll extension.
And finally we can test
this example by declaring the external function and using it in a silly select
statement:
select
Left('Hello World', 5)
from
rdb$database
Note that although the
function internally takes three parameters, we only pass two in the SELECT statement – the third is for the result and will
be filled in by InterBase.
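To see the “caller supplies the result buffer” idea outside of InterBase, here is a small console sketch in which the program plays InterBase’s part. It assumes the cstring arguments arrive as plain character pointers, uses PAnsiChar where this article writes PChar, and the Buffer variable and its size are our own stand-ins for whatever InterBase would really allocate.

```pascal
program LeftCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Copies at most Len characters of sz into the caller's buffer szRes. }
function Left(sz: PAnsiChar; var Len: Integer; szRes: PAnsiChar): PAnsiChar; cdecl;
var
  i: Integer;
begin
  Result := szRes;          // return the buffer we were handed
  i := 0;
  while (sz^ <> #0) and (i < Len) do
  begin
    szRes^ := sz^;
    Inc(i);
    Inc(sz);
    Inc(szRes);
  end;
  szRes^ := #0;             // terminate the result
end;

var
  Buffer: array[0..64] of AnsiChar; // stand-in for InterBase's result buffer
  n: Integer;
begin
  n := 5;
  WriteLn(Left('Hello World', n, @Buffer[0])); // prints Hello
end.
```

The function never allocates or frees anything, which is exactly why this approach sidesteps the cleanup question.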
In InterBase 6, three different “date” types are supported, DATE, TIME, and TIMESTAMP. For those familiar with InterBase 5.6 and older, the TIMESTAMP data type is exactly the equivalent of the 5.x DATE type.
In order to “decode” and “encode” these types into something meaningful for your Delphi program, you need a little bit of information about the InterBase API. In the past, it was necessary to add the headers for these functions to your UDF library manually. Since the InterBase Express (IBX) components are open source, however, we can simply reuse the declarations there. (Note that this does not mean you’re using IBX in your UDFs – you’re just saving yourself some typing since the IB API functions are already defined there. In fact, this will work with versions of Delphi prior to version 5, which IBX doesn’t support. For Delphi/BCB 4 and earlier, you can either run the IBX installer and manually move the .pas files into your search path, or get the files directly from http://sourceforge.net/projects/ibx)
Now, let's write some date UDFs!
First, make sure the version of IBX that you have is up-to-date. If you have Delphi 5 and have never updated IBX, go to http://codecentral.borland.com/codecentral/ccweb.exe/author?authorid=102 and get the latest version. Run the installer and you’re ready to go. In the interface section of the newly created unit, add the IBIntf unit to the interface uses clause (this will allow you to use IB data types and API calls):
uses
IBExternals, IBHeader, IbIntf;
...and add the following lines to the initialization section:
initialization
IsMultiThread := TRUE; // You already did this one, right? :)
CheckIBLoaded; // This makes it possible to call IB API functions.
This is much easier than typing the function and data structure declarations yourself!
Now type the following declarations:
function Year(var ib_date: TISC_TIMESTAMP):
Integer; cdecl;
function Hour(var ib_time: TISC_TIMESTAMP): Integer; cdecl;
In the implementation section of the unit, type:
function Year(var ib_date: TISC_TIMESTAMP):
Integer;
var
tm_date: TM;
begin
isc_decode_sql_date(@ib_date, @tm_date);
result := tm_date.tm_year + 1900;
end;
function Hour(var ib_time:
TISC_TIMESTAMP): Integer;
var
tm_date: TM;
begin
isc_decode_sql_time(@ib_time, @tm_date);
result := tm_date.tm_hour;
end;
In your project source, in the "exports" section, type:
exports
Year,
Hour;
Now build the project again.
To use the routine, copy the compiled library to your IB UDF directory, then go to ISQL, reconnect to the database you used above, and type:
declare external function f_Year
date
returns integer by value
entry_point 'Year' module_name 'filename';
declare external function f_Hour
time
returns integer by value
entry_point 'Hour' module_name 'filename';
And test it...
select
f_Year(cast('7/11/00' as date))
from
rdb$database
Not quite as easy as number manipulations, but still pretty simple, huh?
Important note: These routines are for demonstration purposes only. In IB6 or higher, you can get the HOUR or YEAR from a TIMESTAMP using the EXTRACT function – no UDF is necessary.
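The “+ 1900” in the Year routine comes from the conventions of C’s struct tm, which the decode functions fill in: tm_year counts years since 1900 and tm_mon counts months from 0. The sketch below uses a record of our own, TTmSketch, laid out like the standard struct tm fields; it is not the TM type from the IBX units, and the helper names are made up.

```pascal
program TmFieldsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Illustrative record modelled on the standard fields of C's struct tm. }
  TTmSketch = record
    tm_sec, tm_min, tm_hour: Integer;
    tm_mday, tm_mon, tm_year: Integer;
    tm_wday, tm_yday, tm_isdst: Integer;
  end;

function CalendarYear(const t: TTmSketch): Integer;
begin
  Result := t.tm_year + 1900; // tm_year is "years since 1900"
end;

function CalendarMonth(const t: TTmSketch): Integer;
begin
  Result := t.tm_mon + 1;     // tm_mon runs from 0 (January) to 11
end;

var
  t: TTmSketch;
begin
  FillChar(t, SizeOf(t), 0);
  t.tm_year := 100; // the year 2000
  t.tm_mon := 6;    // July
  t.tm_mday := 11;
  WriteLn(CalendarYear(t), '-', CalendarMonth(t), '-', t.tm_mday); // prints 2000-7-11
end.
```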
The InterBase Developers Guide discusses writing UDFs which accept a Blob value as an argument or return a Blob in some detail, but the examples are in C. The code accompanying this paper has an Object Pascal example, and there are more such examples in the FreeUDF library available from http://www.mers.com in the Downloads section.
When writing a UDF which takes a Blob argument, the UDF will not receive a pointer to the Blob data. Instead, the UDF receives a data structure containing a Blob handle, basic information about the Blob, and pointers to two procedures used to get and set the Blob data itself. These procedures are simply isc_get_segment and isc_put_segment, and are discussed in the InterBase API Guide. You will need the data structure and function definitions, so the easiest thing to do is to include the same IBX units in the uses clause which we used for the date UDFs, above.
To work with Blob data passed to the UDF, it is necessary to allocate a buffer to hold the returned Blob data. The easiest way to do this is to declare a variable of type string and allocate the memory with SetLength. You will then need to cast the variable as a PChar when calling GetSegment or PutSegment. This works well even if the data in the Blob is not a string; you can just treat it as an array of bytes.
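The buffer technique can be sketched without a database by faking the Blob. Everything named here is an assumption made for illustration: TBlobSketch is a cut-down stand-in for the real Blob structure (whose exact field types come from the InterBase headers), TFakeBlob and FakeGetSegment simulate a Blob held in memory, and the return convention of GetSegment is simplified to “True while data was read.”

```pascal
program BlobReadSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TGetSegment = function(Handle: Pointer; Buffer: PAnsiChar;
    MaxLength: Integer; var ReadLength: Integer): WordBool; cdecl;

  { Hypothetical, simplified stand-in for the Blob structure. }
  TBlobSketch = record
    GetSegment: TGetSegment;
    BlobHandle: Pointer;
  end;
  PBlobSketch = ^TBlobSketch;

  { Fake Blob: in-memory data plus a read offset. }
  TFakeBlob = record
    Data: AnsiString;
    Offset: Integer;
  end;
  PFakeBlob = ^TFakeBlob;

function FakeGetSegment(Handle: Pointer; Buffer: PAnsiChar;
  MaxLength: Integer; var ReadLength: Integer): WordBool; cdecl;
var
  Src: PFakeBlob;
begin
  Src := PFakeBlob(Handle);
  ReadLength := Length(Src^.Data) - Src^.Offset;
  if ReadLength > MaxLength then
    ReadLength := MaxLength;
  if ReadLength > 0 then
    Move(Src^.Data[Src^.Offset + 1], Buffer^, ReadLength);
  Inc(Src^.Offset, ReadLength);
  Result := ReadLength > 0;
end;

{ Read the whole Blob, one chunk at a time, into a string. }
function BlobToString(Blob: PBlobSketch; ChunkSize: Integer): AnsiString;
var
  Chunk: AnsiString;
  Got: Integer;
begin
  Result := '';
  SetLength(Chunk, ChunkSize); // the buffer is an ordinary string
  while Blob^.GetSegment(Blob^.BlobHandle, PAnsiChar(Chunk), ChunkSize, Got) do
    Result := Result + Copy(Chunk, 1, Got);
end;

var
  Fake: TFakeBlob;
  Blob: TBlobSketch;
begin
  Fake.Data := 'Hello Blob';
  Fake.Offset := 0;
  Blob.GetSegment := FakeGetSegment;
  Blob.BlobHandle := @Fake;
  WriteLn(BlobToString(@Blob, 4)); // prints Hello Blob
end.
```

Because the buffer is a string, Delphi releases it automatically when BlobToString returns, so there is nothing to free by hand.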
To return Blob data from a UDF, there are two separate memory allocation issues to deal with: The Blob information structure and the Blob data itself. To allocate space for the Blob information structure, use an additional argument to the function and RETURNS PARAMETER n syntax when declaring the UDF, similar to Solution #4 for returning strings (see below). Allocating space for the returned Blob data itself is handled by the InterBase API inside of the PutSegment function.
As an example, consider the following (shortened) function from the FreeUDF library:
function StrBlob(sz: PChar; Blob: PBlob): PBlob;
begin
result := Blob;
if (not Assigned(Blob)) or
(not Assigned(Blob^.BlobHandle)) then exit;
Blob^.PutSegment(Blob^.BlobHandle, sz, StrLen(sz));
end;
The correct declaration for this function is:
DECLARE EXTERNAL FUNCTION F_STRBLOB
CSTRING(80),
BLOB
RETURNS PARAMETER 2
ENTRY_POINT 'StrBlob' MODULE_NAME 'FreeUDFLib';
...And you might use it as follows:
INSERT INTO SOME_TABLE (
ID_FIELD,
BLOB_FIELD
) VALUES (
GEN_ID(SOME_TABLE_G, 1),
F_STRBLOB('Foo'));
Notice that only one parameter is passed to the function in SQL. The second parameter is an allocated buffer for the returned Blob which is created and passed by InterBase, but is not referenced when calling the UDF from SQL.
The example accompanying this paper uses the segment sizes defined for the Blob columns, but this is not strictly necessary. You can pass a buffer large enough to hold the entire Blob (though do note that Blobs can be larger than the addressable memory limit of 2 GB on NT), and performance may be enhanced by doing so.
UDFs are so simple in principle that it can be quite frustrating when they do not seem to work properly. In this section we’ll discuss how to diagnose and correct common errors.
Whenever you have a problem getting a UDF to work, the first thing to try is disconnecting from and then reconnecting to the database. This solves a surprising number of UDF problems!
Symptom: “Module or entry point not found” error message. This is InterBase’s way of telling you either that it cannot locate the library (on Windows, this would be the DLL file) or that it found the library but cannot find the function within it. It would be nice if IB told you which item it cannot find, but it doesn’t, so you need to check both.
First, ensure that the DLL or SO (for Linux) file is in the correct place. For IB 6, the default location for all UDF libraries is in the UDF subdirectory of your main InterBase directory. You can change this location using an option in the ibconfig file, so you may wish to check this file to ensure that the default location has not been overridden. Also, double-check the filename to ensure that it was spelled correctly in your DDL when you declared the UDF. You can use IBConsole or your favorite InterBase administration tool to see what spelling you used when you declared the UDF. If you’re using Unix, remember that filenames on Unix are case-sensitive!
Second, ensure that the function name exported from the library is the same as the entry_point you specified when declaring the UDF. Since C++ and other languages can mangle function names (for example, to support overloaded functions), I recommend using a PE viewer to verify the exported name of the function. You can get a good, free PE viewer from http://www.volweb.cz/pvones/delphi . Compare this name with the function name in the DDL you used to declare the UDF and make sure they are the same, including upper/lower case. If you find you have misspelled the function name in your DDL, you need to drop the UDF (DROP EXTERNAL FUNCTION) and then declare it again. If you discover that the exported function name is mangled in a library written in C++, you need to declare your function as extern "C".
Symptom: Connection to database closes when calling the UDF. The InterBase server may have crashed. Check the interbase.log file on the server and see if there are any messages there. Make sure your function is not raising an unhandled exception.
Symptom: “…attempted to access a virtual address without permission” error message. This is a Windows message indicating that your multithreaded program attempted to do something which is not thread-safe. Although InterBase can on rare occasions cause this error itself, non-thread-safe UDFs are by far the most common cause of this message, and if you get this message when doing anything which involves a UDF, the UDF is probably at fault.
A full discussion of the issues which can make a function or library non-thread-safe is outside the scope of this paper, but if you’re allocating memory (even temporarily) inside of a Delphi function, the first thing to check is that the IsMultiThread system global variable is set to TRUE in the initialization section of your unit.
UDFs in Delphi and Kylix must be declared and exported using the cdecl calling convention – that is, parameters should be passed on the stack, not the registers, from right to left. The stack clean-up will be performed by the caller (in this case, InterBase). C programmers can use the default C calling convention (not __fastcall), but should include extern "C" to prevent the C++ compiler from mangling the exported function name.
Here are a few possible
solutions to the “returning strings” conundrum. We will explain why certain
solutions are desired, and why others will simply not work.
There is an obvious way
to accomplish returning strings: Maintain a global PChar that has some amount of space allocated to it.
Stuff this string with the desired return value, and return the global PChar. The
only problem with this is that InterBase is multi-threaded, and if a Delphi DLL
does this, InterBase will surely crash in a multi-user environment.
It seems that solution
#1 is no solution at all...
Instead of maintaining a
single global variable, maintain a thread-local
variable.
Whenever a UDF wishes to
return a string, it simply copies the string into the string referenced by the
thread-local PChar, and returns it.
This solution certainly
sounds elegant, and it certainly lends itself to being clean, but... How do we
manage thread-local variables in Delphi? Also, this solution certainly sounds
like it will work, but will it? InterBase is a multi-threaded environment,
surely, but how does it handle the scheduling of its calls to UDFs? (see
section A discussion of Solution #2)
Another obvious
solution: Every time a function returning a string is ready to return the
string, allocate a bit of memory for the string and return a PChar
to it. InterBase won’t crash — at least not
right away. It should be clear that this presents a nasty memory leak. The
Delphi function will keep allocating memory, but nobody is cleaning it up.
In InterBase 5.0, a new
keyword called free_it was introduced. This keyword is used like this:
declare external function f_Left
cstring(254),
integer
returns
cstring(254) free_it
entry_point 'Left' module_name 'FreeUDFLib.dll';
If the developer chooses
to return strings in the “memory-leaky” fashion, then the UDF should be
declared with the free_it keyword, just like above. This allows the
developer of the UDF to be sloppy, and it forces InterBase to do the
housekeeping.
This is a reasonable
solution if the developer of the UDF wants to be sloppy; however, it is the
authors’ contention that all functions, especially third-party functions, should do their own housekeeping, or they should do nothing to “dirty the
house” to begin with. It is considered bad form to write a function that is
known to be leaky, only to “pass the buck” to the calling application.
Another problem with the
free_it solution
is that it only works with InterBase 5 and up. If the developer intends to
write functions for use with InterBase 4.x or lower, this solution simply won’t
work. (see section A discussion of Solution #3)
An under-documented
feature of InterBase allows a UDF declaration to specify a particular parameter
as the assumed return value of the function. By implementing a UDF in this way,
the UDF developer forces InterBase to pass it valid space holders for strings.
In other words, the UDF developer won’t have to worry about dynamic memory
issues because this problem is dealt with entirely in the InterBase
engine.
As previously indicated,
third-party routines should either do their own housekeeping, or they should do
nothing to “dirty the house” to begin with. This method is a simple and elegant
way to avoid messing up InterBase’s house. (see section A discussion of
Solution #4).
Delphi is a derivative
product of the old days, when Borland Pascal was called Borland Pascal and
multi-threading was just plain unavailable to the DOS world. When Delphi moved
into the Win32 world with version 2.0, Borland discovered that the memory
manager wasn’t thread-safe.
To solve the
thread-safety concerns, they wrapped their memory management routines in
critical sections, thus making the memory manager thread-safe. Critical
sections are beyond the scope of this article. Suffice it to say that they
ensure orderly access to a shared resource.
For performance reasons,
however, the critical sections are used only if a not-so-well-known system
variable, IsMultiThread, is set to TRUE. (IsMultiThread
is defined in the unit “System.pas”, which is implicitly used by all Delphi
units.) IsMultiThread is
automatically set TRUE when a Delphi TThread is used, but when writing multithreaded routines
which do not subclass TThread – such as UDFs – IsMultiThread should be
explicitly set TRUE. The
easiest place to do this is in the initialization section of the unit which
contains the UDFs.
The basic gist of this
story is as follows: Delphi is thread-safe, but only when the developer tells it
to be. Whenever an application or library knows that it may be dealing with
multiple threads it should guarantee that IsMultiThread is set to TRUE; otherwise, the application or library is not thread safe. (Again: IsMultiThread
is set implicitly if the
developer uses a TThread
subclass.)
It cannot be stressed
enough that IsMultiThread
must be set to True in multi-threading environments.
In our introduction to
this solution, we asked the question, “Will thread-locals work?” The answer is
a resounding yes! InterBase is a multi-threaded architecture, and any number of
different queries can be running in a given thread. InterBase is guaranteed to
execute a UDF and process its results within a single atomic action,
thus thread-locals are perfectly safe for returning strings. (For a more in-depth
conversation, visit IB’s web site, or talk with the author after the lecture).
Thread-local variables
are extremely easy to work with, and Delphi makes it even easier through the
use of the threadvar construct. Let’s examine how to manage
thread-local variables.
The simplest way to deal
with thread-local variables is through the use of Delphi’s keyword threadvar. The developer acts as if a global variable is
being declared, but instead of using the var
keyword, the keyword threadvar
is used. For example, the following code
snippet declares a thread-local variable called szMyString:
threadvar
szMyString: PChar;
The keyword threadvar
can only be used at the unit level. In
other words, a function cannot have local variables declared as thread-local.
The reasoning behind this is clear: Local variables are intrinsically local to
the thread in which they were called. The only time a “thread-local” variable
needs to be used is when a sharable resource is being discussed, and sharable
resources are declared outside the scope of procedures and functions.
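As a minimal sketch of the idea, the program below keeps a per-thread result buffer in a threadvar and returns a pointer to it. The names tlResult, MaxResult and Upper are invented for this example, PAnsiChar is used where this article writes PChar, and the program runs on a single thread, so it only shows the mechanics. Keep in mind that the threading discussion later in this article points out problems with threadvar in libraries.

```pascal
program ThreadVarSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  MaxResult = 255; // longest result we are prepared to return

threadvar
  { Each thread gets its own copy of this buffer. }
  tlResult: array[0..MaxResult] of AnsiChar;

{ Hypothetical UDF-style routine: upper-cases sz into the
  calling thread's buffer and returns a pointer to it. }
function Upper(sz: PAnsiChar): PAnsiChar; cdecl;
var
  i: Integer;
begin
  i := 0;
  while (sz^ <> #0) and (i < MaxResult) do
  begin
    tlResult[i] := UpCase(sz^);
    Inc(sz);
    Inc(i);
  end;
  tlResult[i] := #0;
  Result := @tlResult[0];
end;

begin
  WriteLn(Upper('hello')); // prints HELLO
end.
```

A fixed-size array is used on purpose: it needs no allocation and no cleanup when a thread goes away.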
In the section on
threading (see section Thread-level initialization and finalization), we
point out some problems with using threadvar, so it is important to note how Windows handles
thread-local access.
There are four routines
involved in managing thread-local variables:
Function: DWORD TlsAlloc
Allocate a
thread-local index; this index is used to access a thread-local variable.
Allocating an index is basically equivalent to declaring a threadvar. It
returns $FFFFFFFF when an error occurs; otherwise, it returns a
valid thread-local index.
Function: BOOL TlsFree (DWORD
dwTlsIndex)
Free up a
thread-local index. This is used when the thread-local variable referenced by
the thread-local index is no longer needed.
It returns True when a thread-local index is successfully
freed.
Function: Pointer TlsGetValue(DWORD dwTlsIndex)
Return the
thread-local 32-bit value indexed by dwTlsIndex. The value dwTlsIndex must have been
previously allocated using TlsAlloc.
Function: Bool TlsSetValue(DWORD dwTlsIndex, Pointer lpvTlsValue)
Set the 32-bit
value indicated by this thread-local index to the value specified in lpvTlsValue. The code snippet below shows how these
functions work together:
var
hTLSValue: DWORD;
...
hTLSValue := TlsAlloc;
if (hTLSValue = $FFFFFFFF) then
(* raise an exception or something *)
...
TlsSetValue(hTLSValue, Pointer(100));
...
ShowMessage(IntToStr(Integer(TlsGetValue(hTLSValue))));
...
TlsFree(hTLSValue);
In the next section (see section Thread-level initialization and finalization), we will show how FreeUDFLib
uses the Windows API to manage its thread-local variables, so we will wait
until then to illustrate any examples.
In general, DLLs do not
create threads, and in the case of building UDFs, this is no exception; however, InterBase does create
threads, and it is essential that the DLL knows when a thread is created and
when a thread is closed.
Initial inspection of
Delphi indicates that the initialization
and finalization
sections of a Delphi unit are prime
candidates for thread-level initialization and finalization. Further inspection
reveals that these sections are only fired when the library is loaded and when
it is freed, respectively. Good try, but not good enough.
Delphi defines a
variable DllProc. DllProc is a procedure pointer, and by assigning a
procedure to DllProc, the DLL can perform actions whenever an
attached application creates or destroys a thread.
A DLL entry-point
procedure is declared like this:
procedure LibEntry(Reason: Integer);
The actual name of the procedure is irrelevant. It is merely important to
note that a library entry procedure gets a single argument, Reason, which indicates why the procedure is being called. In
Delphi, there are three possible reasons for a DllProc to be called:
1. Reason = DLL_THREAD_ATTACH.
Whenever a
thread is created in an attached application, DllProc
will be called with this reason. This gives
the DLL an opportunity to initialize any thread-local variables.
2. Reason = DLL_THREAD_DETACH.
Whenever a
thread is being closed in an attached application, DllProc
will be called with this reason. This gives
the DLL an opportunity to free up any resources used by thread-local variables.
Take care! Suppose that an application starts some threads; it then loads the
DLL. The DLL is never explicitly told that those threads are executing (DllProc
will never be called with the DLL_THREAD_ATTACH
argument); however, if those threads exit
gracefully, the DLL will be informed that they are closing. This means that the DLL is potentially
responsible for cleaning up uninitialized data.
3. Reason = DLL_PROCESS_DETACH.
Whenever the
calling application unloads a library, DllProc
will be called with this reason. This is
exactly equivalent to the finalization section of a Delphi unit, so it is
irrelevant to our discussions.
Let’s study some
examples:
procedure DllEntry(Reason: Integer);
begin
  case Reason of
    DLL_THREAD_ATTACH:
      begin
        tlObject := TTestObject.Create;
        DllShowMessage(tlObject.ObjectName);
      end;
    DLL_THREAD_DETACH:
      begin
        (* Uninitialized data is guaranteed to be nil. *)
        if (tlObject = nil) then
          DllShowMessage('Object is nil.')
        else
          (* and we've guaranteed that initialized data has an object *)
          DllShowMessage(tlObject.ObjectName);
        tlObject.Free;
      end;
  end;
end;
initialization
  IsMultiThread := True;
  DllProc := @DllEntry;
  tlObject := TTestObject.Create;
finalization
  (* Uninitialized data is guaranteed to be nil. *)
  if (tlObject = nil) then
    DllShowMessage('Object is nil.')
  else
    (* and we've guaranteed that initialized data has an object *)
    DllShowMessage(tlObject.ObjectName);
  tlObject.Free;
end.
As this code snippet
shows, Dll1 can easily respond to the creation of threads in a calling
application.
To further the reader’s
understanding of these entry point functions, and to demonstrate a problem with
Delphi’s threadvar
construct, the reader should study Greg
Deatz’s article, “Thread-safe DLLs” in the November 1998 issue of Delphi
Informant magazine.
As was mentioned before, InterBase 5.0 introduces the free_it keyword, thus allowing the UDF developer to use dynamically allocated memory for the return of strings and dates.
Aside from the authors’ contention that this is sloppy, and that it won’t work with versions of InterBase previous to 5.0, this is a fully supported and “sponsored” technique for returning strings to InterBase (see section Solution #3: Returning dynamically allocated strings). So, sloppy or not, we must “face the music,” and explore returning dynamically allocated strings to InterBase.
The Windows version of InterBase is compiled using Microsoft’s C compiler (MSVC). Without getting into a discussion as to why they chose this compiler, suffice it to say that InterBase expects dynamically allocated memory to be allocated using MSVC’s malloc routine.
MSVC’s malloc
routine handles memory allocation in a
manner “all its own.” That is, we can’t rightly infer how it
manages memory, but it certainly does not allocate memory in the
same fashion as Delphi or other languages. So, a Delphi function, for example,
that tries to dynamically allocate memory using GetMem
or the Windows system call GlobalAlloc
will most certainly cause
problems with InterBase if used in conjunction with the free_it
keyword.
IB 5.5 and later ships with a special library called IB_util which contains a function you
can use called ib_util_malloc. This
function is compatible with MSVC’s malloc.
To use this function, simply add IB’s “include” directory to your Delphi
search path, then copy ib_util.dll to your Windows system directory. Finally, add ib_util to the uses clause of your unit’s implementation section.
These steps allow you to use the ib_util_malloc function.
For IB 5.1 and earlier, you must use the MSVC malloc directly.
This problem is resolved by making use of the fact that MSVC
applications must be distributed with the run-time MSVC library, ‘msvcrt.dll’. If InterBase or the InterBase client is
installed on a system, then this DLL is installed on your system as well.
By making the following
declaration in your Delphi UDF library,
function malloc(Size: Integer): Pointer; cdecl; external 'msvcrt.dll';
you will allow Delphi to
make use of MSVC’s malloc routine, so that the free_it keyword can be used.
Let’s take a look at CopyString:
function CopyString(sz: PChar): PChar; cdecl;
var
  szLen: Integer;
begin
  szLen := 0;
  while (sz[szLen] <> #0) do
    Inc(szLen);
  Inc(szLen); // count the null terminator as well
  // replace ib_util_malloc with malloc for IB versions before 5.5
  result := ib_util_malloc(szLen);
  Move(sz^, result^, szLen);
end;
Quite simple, CopyString allocates enough space for the passed string (sz) plus the null terminator, and it copies the
string.
The declaration for CopyString is as follows:
declare external function CopyString
cstring(64)
returns
cstring(64) free_it
entry_point 'CopyString' module_name 'UDFTest1.dll';
And finally, we can test
this example:
select
CopyString('Hello World')
from
rdb$database
It is a bit frustrating
to think that we can write a UDF library that doesn’t support as many versions
of InterBase as desired. And, if you agree with the authors that the “sloppy”
approach just won’t do, then this section might be for you.
As was briefly alluded
to above, a UDF should do its own housekeeping, and when possible, it should
probably also try to avoid “dirtying the house” at all. InterBase’s external
function declaration syntax includes the ability for InterBase to pass the
result buffer to the UDF, so that UDF can take the “high road,” and be a
gracious guest, providing information only, but not cluttering up InterBase’s
house at all.
In addition, this method
for declaring functions makes it possible to avoid the free_it keyword, so UDF libraries built this way
can be used in versions of InterBase earlier than 5.0 (see the section
“Solution #4: Making InterBase do the work”).
How is this done?
Clearly, this is best illustrated through an example. Consider the following
implementation of CopyString:
function CopyString(sz, szRes: PChar): PChar; cdecl;
begin
  result := szRes;
  while (sz^ <> #0) do begin
    szRes^ := sz^;
    Inc(sz);
    Inc(szRes);
  end;
  szRes^ := sz^;
end;
Now, in InterBase, we declare it as follows:
declare external function CopyString
  cstring(64), cstring(64)
  returns parameter 2
  entry_point 'CopyString' module_name 'UDFTest2.dll';
And finally, we can test
this example by using it in a silly select statement:
select CopyString('Hello World')
from rdb$database
This is perhaps the best method for returning a string, as it keeps the responsibilities for allocating and freeing memory clear, and it seems to work in all cases. The only downside seems to be a bug in ISQL (which also shows up in IBConsole) that incorrectly reports the UDF declaration when extracting metadata.
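The same technique can be sketched in C. Again, this is an illustration under stated assumptions rather than code from the original library: the name copy_string_into and the RESULT_SIZE constant are invented here, and the sketch assumes the buffer InterBase passes for a cstring(64) result holds at least 64 bytes. Unlike the Delphi version, it also stops copying before it can run off the end of that buffer.

```c
#include <stddef.h>

#define RESULT_SIZE 64   /* assumed to match cstring(64) in the declaration */

/* Hypothetical C counterpart of the two-parameter CopyString.
   InterBase supplies sz_res; the function fills it and returns it,
   so the UDF itself allocates nothing. */
char *copy_string_into(const char *sz, char *sz_res)
{
    size_t i = 0;

    /* Stop one short of the buffer size to leave room for the terminator. */
    while (sz[i] != '\0' && i < RESULT_SIZE - 1) {
        sz_res[i] = sz[i];
        i++;
    }
    sz_res[i] = '\0';
    return sz_res;
}
```

Because the function never allocates anything, there is nothing for InterBase to free and no free_it in the declaration.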
As mentioned above, you can use Borland Kylix to write UDFs for Linux using the same source code you use for Delphi UDFs on Windows. But if, for some reason, you prefer to use gcc, it’s not so hard.
Whenever you compile a C file, the compiler creates an object file, which is later combined with other code during a linking phase that generally produces an executable of some form. It’s during this linking phase that we are going to ask for a “shared library” instead, which is essentially a “shared object” file that can be dynamically linked to a program at run time rather than at build time. In Windows-ese, we call these “things” DLLs.
In Unix/Linux-ese, we call these things shared libraries, which are essentially “shared object” files – libraries of code that can be dynamically linked at run-time. Thus we arrive at the conventional “so” extension for these files.
You need to remember that there is nothing inherently different between a Linux Shared Library and a Windows DLL. They are the same thing, at least in concept.
So.... how do we create a shared library under Linux?
This much is easy, right? Just open a text file with a .c extension.
int modulo(int *a, int *b);

int modulo(int *a, int *b)
{
    if (*b == 0)
        return -1; // return something suitably stupid.
    else
        return *a % *b;
}
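Before building the shared library, it can be worth checking the function from ordinary C code. The sketch below repeats modulo so that it compiles on its own and adds a small helper, modulo_of, which is a made-up name for this example. The helper calls modulo the way InterBase will, with the arguments passed by address instead of by value.

```c
/* The UDF from above, repeated so this sketch compiles on its own. */
int modulo(int *a, int *b)
{
    if (*b == 0)
        return -1;          /* same "suitably stupid" result as above */
    return *a % *b;
}

/* Hypothetical helper: takes plain ints and passes their addresses,
   mimicking how InterBase hands parameters to a UDF by reference. */
int modulo_of(int a, int b)
{
    return modulo(&a, &b);
}
```

Calling modulo_of(3, 2) from a test program should give the same answer as the select statement shown later in ISQL.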
At the command-line
gcc -c -O -fpic -fwritable-strings <your udf>.c
ld -G <your udf>.o -lm -lc -o <your udflib>.so
cp <your udflib>.so /usr/interbase/udf
In ISQL
declare external function f_Modulo
integer, integer
returns
integer by value
entry_point 'modulo' module_name 'name of shared library';
commit;
select f_Modulo(3, 2) from
rdb$database;
Holy guacamole, Batman! That was really easy.
Gregory Deatz is a senior programmer/analyst at Hoagland, Longo, Moran, Dunst & Doukas, a law firm in New Brunswick, NJ. He has been working with Delphi and InterBase for approximately two and a half years and has been developing under the Windows API for approximately five years. His current focus is in legal billing and case management applications. He is the author of FreeUDFLib, a free UDF library for InterBase written entirely in Delphi, and FreeIBComponents, a set of native InterBase components which are the foundation for the InterBase Express (IBX) component set included with Delphi, Borland C++ Builder, and InterBase.
Craig Stuntz is a senior developer at Vertex Systems Corporation, a company which produces software for non-profit, rehabilitation organizations, and a member of TeamB.