3130 - Building UDF Libraries

By Gregory H. Deatz

Updated by Craig Stuntz

This article is both a quick introduction to writing simple User Defined Functions (UDFs) for InterBase and a fairly complete reference guide to UDF programming techniques. Accordingly, it is divided into two sections.

In the first section, the reader will learn how to quickly create simple UDFs in Delphi. Since UDFs, in general, should be simple, this section may be all that is necessary for most InterBase users. In the second section, we’ll explore the reasons behind the suggestions in the first section.

Section I: UDF Overview

What is a UDF? A user defined function (UDF) in InterBase is merely a function, written in any programming language, that is compiled into a shared library. On Windows, shared libraries are commonly referred to as dynamic link libraries (DLLs). This simple use of shared libraries provides the developer with a large amount of power and flexibility. Virtually any function that can be exposed through a DLL can be used by InterBase. This comment, however, should be taken with a grain of salt: the intent of a UDF is to perform some small operation that is not available in SQL or InterBase’s stored procedure language.

An example of a UDF is

function Modulo (Numerator, Denominator: Integer): Integer

Divide Numerator by Denominator and return the remainder.

This function is essential in many routines, but it is not available in InterBase’s DSQL language.

Some dos, some don’ts

Before we start going through some examples of writing UDFs, let's talk about what you should be doing, and what you should not be doing.

Once you get the hang of writing UDFs, you will probably think that a whole world of InterBase extensibility has opened up to you through UDFs. On the one hand, it has; the mechanisms for invoking UDFs are quite simple, and since a UDF is simply a routine written in your favorite programming language, you can do virtually anything, right?

Well, yes and no... One thing you can't do with UDFs: You can't pass NULLs to them – that is, if you pass a variable with the state of NULL, the UDF will not be able to distinguish this from a non-NULL state. Likewise, a UDF cannot return a NULL value.

Also, a UDF does not work within the context of a transaction. That is, transaction level information cannot be passed to a UDF, and therefore, a UDF isn't able to access the database. Sort of. A UDF can establish a new connection to the database and start another transaction if it so desires, but this is where we come to the “dos and don’ts,” not to the “can’ts.”

When you write UDFs, you should follow these simple rules:

1. A UDF should be a simple, quick function.

2. A UDF should not attempt to access or directly affect the state of the database.

3. A UDF should not raise unhandled exceptions.

4. All code must be thread-safe.

What does this mean? Well, functions that trim a string, perform modulo arithmetic, perform fancy date arithmetic or evaluate aspects of dates are all nice, simple, quick functions. They are good candidates for UDFs.

Now, a function that attaches to a database, and inserts, deletes or updates data is probably a bad idea. A function that launches a program that performs a series of complex tasks is probably a bad idea. Why? Quite simply because these types of functions might prevent InterBase from doing transactional stuff or, even worse, they could significantly damage the performance of your server. As soon as a UDF is called, the thread that called that UDF blocks until the UDF returns.
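Rule 3 is easy to state and easy to forget, so here is a minimal sketch of one way to honor it: wrap the body of the UDF in try..except and return a fixed value when something goes wrong. SafeDivide is a hypothetical function invented for this illustration, and the choice of -1 as the "uninteresting" answer is an assumption, not an InterBase convention.

```pascal
program SafeUdfSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils; // maps the hardware fault to the EDivByZero exception

{ Hypothetical UDF: integer division that never lets an exception escape.
  The parameters arrive by reference, as InterBase passes integers. }
function SafeDivide(var Numerator, Denominator: Integer): Integer; cdecl;
begin
  try
    Result := Numerator div Denominator; // raises EDivByZero when Denominator = 0
  except
    // Swallow everything: an exception must not propagate into the server.
    Result := -1; // assumed sentinel value, chosen for this sketch only
  end;
end;

var
  A, B: Integer;
begin
  A := 7;
  B := 2;
  WriteLn(SafeDivide(A, B)); // 3
  B := 0;
  WriteLn(SafeDivide(A, B)); // -1
end.
```

Checking the boundary condition up front is cheaper than catching the exception, so treat the try..except as a safety net rather than as the primary error handling.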

Remember, of course, that these are general guidelines. Your particular business case might dictate a need to do something that is generally bad because in your case it is specifically good.

Let's get to the heart of the discussion, now!

Writing UDFs in Delphi or Kylix

Choose File->New.

Choose DLL (for Delphi) or Shared Object (for Kylix).

Now, choose File->New, then choose Unit.

It might be wise to do a Save All at this point... put your project where you feel is a good spot...

Create a modulo routine; in the newly created unit, declare the routine in the interface section:

function Modulo(var i, j: Integer): Integer; cdecl;

Don’t forget the cdecl; it’s important. Implement the routine in the implementation section:

function Modulo(var i, j: Integer): Integer;
begin
  // Avoid raising an EDivByZero exception
  if (j = 0) then
    result := -1 // just check the boundary condition, and
                 // return a reasonably uninteresting answer.
  else
    result := i mod j;
end;

In the newly created project source (.DPR), type the following immediately before the final “begin end.”:

exports
  Modulo;

Build it, use it

Build the project, and you now have a working library.

The simplest way to get InterBase to find the DLL (for Windows) or .so (for Linux) is to copy it to the UDF directory under the InterBase installation. In a default Windows installation of IB 6 or higher, this may be “c:\Program Files\Borland\IntrBase\UDF,” but you might need to create the UDF directory yourself. In IB 5.6 and older versions, the DLL should go into the …IntrBase\bin directory.

To use the UDF, do the following:

1. Connect to a new or existing database using ISQL.

2. Type the following:

declare external function f_Modulo
  integer, integer
  returns
  integer by value
  entry_point 'Modulo'
  module_name 'filename minus extension [.so or .dll]';

3. Commit your changes.

4. Now test it...

select
  f_Modulo(3, 2)
from
  rdb$database

Whew! That was really easy, wasn't it?

Returning strings

We’ve successfully written a Delphi DLL or Kylix .so that can be used as a UDF library in InterBase. Wouldn’t the following be a great function to have?

function Left (sz: PChar; Len: Integer): PChar;

Return the Len leftmost characters of sz.

The question is, how do we return a string? Already implied by the above declaration, InterBase doesn’t respect Object Pascal’s String data type, so we are forced to use PChar.

PChars versus Strings is not a big deal, except that Object Pascal does not automatically clean up PChars, and the memory to store a PChar must be, at some point, explicitly allocated by the developer. Our first UDF returned a scalar value on the function’s calling stack, which meant that we did not have to worry about cleanup issues. With any form of string, cleanup issues must be addressed.

Fortunately, InterBase provides a very simple solution to this issue. We can tell InterBase to allocate a string for the result and pass it to our UDF as a parameter, and InterBase will free the memory when it’s no longer needed. We add an extra parameter to the function so that InterBase can pass an allocated string, and return a pointer to this parameter as a result.

{ CSTRING arguments arrive as pointers to characters, so sz and szRes are
  plain PChar parameters; only the integer is passed by reference. }
function Left(sz: PChar; var Len: Integer; szRes: PChar): PChar; cdecl;
var
  i: Integer;
begin
  { First, make Result point to same memory as szRes }
  result := szRes;
  { Initialize counter }
  i := 0;
  while (sz^ <> #0) and (i < Len) do begin
    { Copy one character }
    szRes^ := sz^;
    { Increment counter and pointers to next character }
    Inc(i);
    Inc(sz);
    Inc(szRes);
  end;
  { Terminate result string }
  szRes^ := #0;
end;

Now, in InterBase, we declare it as follows:

declare external function f_Left
  cstring(64), integer, cstring(64)
  returns parameter 3
  entry_point 'Left' module_name 'UDFTest';

Obviously, you’ll need to replace “UDFTest” with the actual name of your library file, minus the .so or .dll extension. The function is declared as f_Left rather than Left because LEFT is a reserved word in InterBase SQL.

And finally we can test this example by using the newly declared function in a silly select statement:

select
  f_Left('Hello World', 5)
from
  rdb$database

Note that although the function internally takes three parameters, we only pass two in the SELECT statement – the third is for the result and will be filled in by InterBase.
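It can save a few server restarts to exercise a string UDF from a small console program before declaring it in InterBase. The sketch below plays the role of the server: it owns the result buffer and hands it to the function. LeftCopy and CallLeft are names made up for this harness, and the sketch assumes that CSTRING arguments arrive as plain pointers to null-terminated buffers while the integer arrives by reference. It uses PAnsiChar so that it compiles the same way in older and newer compilers.

```pascal
program LeftHarness;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Stand-in for the UDF under test (hypothetical name). }
function LeftCopy(sz: PAnsiChar; var Len: Integer; szRes: PAnsiChar): PAnsiChar; cdecl;
var
  i: Integer;
begin
  Result := szRes;
  i := 0;
  while (sz^ <> #0) and (i < Len) do
  begin
    szRes^ := sz^;
    Inc(i);
    Inc(sz);
    Inc(szRes);
  end;
  szRes^ := #0; // terminate the result
end;

{ Plays the role of InterBase: allocates the result buffer and passes it in. }
function CallLeft(const S: AnsiString; Len: Integer): AnsiString;
var
  ResultBuf: array[0..63] of AnsiChar; // mirrors the cstring(64) result
begin
  Result := LeftCopy(PAnsiChar(S), Len, @ResultBuf[0]);
end;

begin
  WriteLn(CallLeft('Hello World', 5));   // Hello
  WriteLn(CallLeft('Hi', 5));            // Hi
  WriteLn(Length(CallLeft('Hello', 0))); // 0
end.
```

The harness only checks the copying logic; it says nothing about how the library behaves once several server threads call it at the same time.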

UDFs with Dates

In InterBase 6, three different “date” types are supported: DATE, TIME, and TIMESTAMP. For those familiar with InterBase 5.6 and older, the TIMESTAMP data type is exactly equivalent to the 5.x DATE type.

In order to “decode” and “encode” these types into something meaningful for your Delphi program, you need a little bit of information about the InterBase API. In the past, it was necessary to add the headers for these functions to your UDF library manually. Since the InterBase Express (IBX) components are open source, however, we can simply reuse the declarations there. (Note that this does not mean you’re using IBX in your UDFs – you’re just saving yourself some typing since the IB API functions are already defined there. In fact, this will work with versions of Delphi prior to version 5, which IBX doesn’t support. For Delphi/BCB 4 and earlier, you can either run the IBX installer and manually move the .pas files into your search path, or get the files directly from http://sourceforge.net/projects/ibx)

Now, let's write some date UDFs!

First, make sure the version of IBX that you have is up-to-date. If you have Delphi 5 and have never updated IBX, go to http://codecentral.borland.com/codecentral/ccweb.exe/author?authorid=102 and get the latest version. Run the installer and you’re ready to go. In the newly created unit, add the IBX header units (IBExternals, IBHeader and IBIntf) to the interface uses clause; this will allow you to use IB data types and API calls:

uses
  IBExternals, IBHeader, IBIntf;

...and add the following lines to the initialization section:

initialization
  IsMultiThread := TRUE; // You already did this one, right? :)
  CheckIBLoaded; // This makes it possible to call IB API functions.

This is much easier than typing the function and data structure declarations yourself!

Now type the following declarations:

function Year(var ib_date: TISC_TIMESTAMP): Integer; cdecl;
function Hour(var ib_time: TISC_TIMESTAMP): Integer; cdecl;

In the implementation section of the unit, type:

function Year(var ib_date: TISC_TIMESTAMP): Integer;
var
  tm_date: TM;
begin
  isc_decode_sql_date(@ib_date, @tm_date);
  result := tm_date.tm_year + 1900; // tm_year counts years since 1900
end;

function Hour(var ib_time: TISC_TIMESTAMP): Integer;
var
  tm_date: TM;
begin
  isc_decode_sql_time(@ib_time, @tm_date);
  result := tm_date.tm_hour;
end;

In your project source, in the "exports" section, type:

exports
  Year,
  Hour;

Now build the project again.

To use the routine, copy the compiled library to your IB UDF directory, then go to ISQL, reconnect to the database you used above, and type:

declare external function f_Year
  date
  returns integer by value
  entry_point 'Year' module_name 'filename';

declare external function f_Hour
  time
  returns integer by value
  entry_point 'Hour' module_name 'filename';

And test it...

select
  f_Year(cast('7/11/00' as date))
from
  rdb$database

Not quite as easy as number manipulations, but still pretty simple, huh?

Important note: These routines are for demonstration purposes only. In IB6 or higher, you can get the HOUR or YEAR from a TIMESTAMP using the EXTRACT function – no UDF is necessary.
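For readers curious about what the decode calls are working with, here is a rough sketch that recovers the year and hour from the raw values without touching the API. It rests on two assumptions about the storage format, taken from the InterBase API documentation as we understand it: a date is a count of days since 17 November 1858, and a time is a count of 1/10000ths of a second since midnight. YearFromIscDate, HourFromIscTime and the two constants are names invented for this sketch; in real UDFs the API calls remain the supported route.

```pascal
program IscDateSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Days from 17 Nov 1858 (assumed day 0 of an InterBase date) to
  // 30 Dec 1899 (day 0 of a Delphi TDateTime).
  IBDateDelta = 15018;
  // Assumed resolution of an InterBase time value: 1/10000 of a second.
  IBTimeUnitsPerSecond = 10000;

function YearFromIscDate(IscDate: LongInt): Integer;
var
  Y, M, D: Word;
begin
  // Shift the day count onto Delphi's calendar, then let the RTL decode it.
  DecodeDate(IscDate - IBDateDelta, Y, M, D);
  Result := Y;
end;

function HourFromIscTime(IscTime: LongWord): Integer;
begin
  Result := Integer(IscTime div LongWord(IBTimeUnitsPerSecond * 3600));
end;

var
  IscDate: LongInt;
begin
  IscDate := Trunc(EncodeDate(2000, 7, 11)) + IBDateDelta;
  WriteLn(IscDate);                    // 51736
  WriteLn(YearFromIscDate(IscDate));   // 2000
  WriteLn(HourFromIscTime(486000000)); // 13 (the value for 13:30:00)
end.
```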

Section II: UDFs in Depth

UDFs with Blobs

The InterBase Developers Guide discusses writing UDFs which accept a Blob value as an argument or return a Blob in some detail, but the examples are in C. The code accompanying this paper has an Object Pascal example, and there are more such examples in the FreeUDF library available from http://www.mers.com in the Downloads section.

When writing a UDF which takes a Blob argument, the UDF will not receive a pointer to the Blob data. Instead, the UDF receives a data structure containing a Blob handle, basic information about the Blob, and pointers to two procedures used to get and set the Blob data itself. These procedures are simply isc_get_segment and isc_put_segment, and are discussed in the InterBase API Guide. You will need the data structure and function definitions, so the easiest thing to do is to include the same IBX units in the uses clause which we used for the date UDFs, above.

To work with Blob data passed to the UDF, it is necessary to allocate a buffer to hold the returned Blob data. The easiest way to do this is to declare a variable of type string and allocate the memory with SetLength. You will then need to cast the variable as a PChar when calling GetSegment or PutSegment. This works well even if the data in the Blob is not a string; you can just treat it as an array of bytes.
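The buffer technique can be sketched as follows. Everything with "Sketch" or "Fake" in its name is invented for this illustration: TBlobSketch is our assumption of the shape of the Blob control structure, modelled on the C struct in the InterBase documentation, and FakeGetSegment is a stand-in that serves an in-memory string in segments so the example can run without a server. The behavior assumed for GetSegment (it reports the number of bytes delivered through its last parameter) should be checked against the real declarations before relying on it.

```pascal
program BlobReadSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Assumed layout of the Blob control structure; use the real declarations
    from your header units in production code. }
  TGetSegment = function(Handle: Pointer; Buffer: PAnsiChar; BufferSize: LongInt;
    var ResultLength: LongInt): SmallInt; cdecl;
  TPutSegment = procedure(Handle: Pointer; Buffer: PAnsiChar;
    BufferLength: SmallInt); cdecl;
  PBlobSketch = ^TBlobSketch;
  TBlobSketch = record
    GetSegment: TGetSegment;
    BlobHandle: Pointer;
    SegmentCount: LongInt;
    MaxSegmentLength: LongInt;
    TotalSize: LongInt;
    PutSegment: TPutSegment;
  end;

  { Test double: an in-memory "Blob" handed out a few bytes at a time. }
  PFakeBlob = ^TFakeBlob;
  TFakeBlob = record
    Data: AnsiString;
    Position: Integer;    // bytes already delivered
    SegmentSize: Integer; // largest chunk delivered per call
  end;

function FakeGetSegment(Handle: Pointer; Buffer: PAnsiChar; BufferSize: LongInt;
  var ResultLength: LongInt): SmallInt; cdecl;
var
  Fake: PFakeBlob;
  Count: LongInt;
begin
  Fake := PFakeBlob(Handle);
  Count := Length(Fake^.Data) - Fake^.Position;
  if Count > Fake^.SegmentSize then Count := Fake^.SegmentSize;
  if Count > BufferSize then Count := BufferSize;
  if Count > 0 then
    Move(Fake^.Data[Fake^.Position + 1], Buffer^, Count);
  Inc(Fake^.Position, Count);
  ResultLength := Count;
  if Count > 0 then Result := 1 else Result := 0;
end;

{ Read the whole Blob into a string that is used purely as a byte buffer. }
function BlobToString(Blob: PBlobSketch): AnsiString;
var
  Offset, Got: LongInt;
begin
  Result := '';
  if (Blob = nil) or (Blob^.BlobHandle = nil) or (Blob^.TotalSize <= 0) then
    Exit;
  SetLength(Result, Blob^.TotalSize);       // allocate the buffer
  Offset := 0;
  while Offset < Blob^.TotalSize do
  begin
    Got := 0;
    Blob^.GetSegment(Blob^.BlobHandle, PAnsiChar(Result) + Offset,
      Blob^.TotalSize - Offset, Got);       // cast the string to a pointer
    if Got <= 0 then
      Break;                                // nothing more to read
    Inc(Offset, Got);
  end;
  SetLength(Result, Offset);                // trim to what was actually read
end;

var
  Fake: TFakeBlob;
  Blob: TBlobSketch;
begin
  Fake.Data := 'Hello, Blob world';
  Fake.Position := 0;
  Fake.SegmentSize := 5;
  Blob.GetSegment := FakeGetSegment;
  Blob.BlobHandle := @Fake;
  Blob.SegmentCount := 4;
  Blob.MaxSegmentLength := 5;
  Blob.TotalSize := Length(Fake.Data);
  Blob.PutSegment := nil;
  WriteLn(BlobToString(@Blob)); // Hello, Blob world
end.
```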

To return Blob data from a UDF, there are two separate memory allocation issues to deal with: The Blob information structure and the Blob data itself. To allocate space for the Blob information structure, use an additional argument to the function and RETURNS PARAMETER n syntax when declaring the UDF, similar to Solution #4 for returning strings (see below). Allocating space for the returned Blob data itself is handled by the InterBase API inside of the PutSegment function.

As an example, consider the following (shortened) function from the FreeUDF library:

function StrBlob(sz: PChar; Blob: PBlob): PBlob; cdecl;
begin
  result := Blob;
  if (not Assigned(Blob)) or
     (not Assigned(Blob^.BlobHandle)) then exit;
  Blob^.PutSegment(Blob^.BlobHandle, sz, StrLen(sz));
end;

The correct declaration for this function is:

DECLARE EXTERNAL FUNCTION F_STRBLOB
  CSTRING(80),
  BLOB
  RETURNS PARAMETER 2
  ENTRY_POINT 'StrBlob' MODULE_NAME 'FreeUDFLib';

...And you might use it as follows:

INSERT INTO SOME_TABLE (
  ID_FIELD,
  BLOB_FIELD
) VALUES (
  GEN_ID(SOME_TABLE_G, 1),
  F_STRBLOB('Foo'));

Notice that only one parameter is passed to the function in SQL. The second parameter is an allocated buffer for the returned Blob which is created and passed by InterBase, but is not referenced when calling the UDF from SQL.

The example accompanying this paper uses the segment sizes defined for the Blob columns, but this is not strictly necessary. You can pass a buffer large enough to hold the entire Blob (though do note that Blobs can be larger than the addressable memory limit of 2 GB on NT), and performance may be enhanced by doing so.

Troubleshooting UDFs

UDFs are so simple in principle that it can be quite frustrating when they do not seem to work properly. In this section we’ll discuss how to diagnose and correct common errors.

Whenever you have a problem getting a UDF to work, the first thing to try is disconnecting from and then reconnecting to the database. This solves a surprising number of UDF problems!

Symptom: “Module or entry point not found” error message. This is InterBase’s way of telling you either that it cannot locate the library (on Windows, this would be the DLL file) or that it can find the library but cannot find the function within that library. It would be nice if IB told you which item it cannot find, but it doesn’t, so you need to check both.

First, ensure that the DLL or SO (for Linux) file is in the correct place. For IB 6, the default location for all UDF libraries is in the UDF subdirectory of your main InterBase directory. You can change this location using an option in the ibconfig file, so you may wish to check this file to ensure that the default location has not been overridden. Also, double-check the filename to ensure that it was spelled correctly in your DDL when you declared the UDF. You can use IBConsole or your favorite InterBase administration tool to see what spelling you used when you declared the UDF. If you’re using Unix, remember that filenames on Unix are case-sensitive!

Second, ensure that the function name exported from the library is the same as the entry_point you specified when declaring the UDF. Since C++ and other languages can mangle function names (in order to retrofit DLLs for overloaded exports), I recommend using a PE viewer to verify the exported name of the function. You can get a good, free PE viewer from http://www.volweb.cz/pvones/delphi . Compare this name with the function name in the DDL you used to declare the UDF and make sure they are the same, including upper/lower case. If you find you have misspelled the function name in your DDL, you need to drop the UDF (DROP EXTERNAL FUNCTION) and then declare it again. If you discover that the exported function name is mangled in a library written in C++, you need to declare your function as extern "C".

Symptom: Connection to database closes when calling the UDF. The InterBase server may have crashed. Check the interbase.log file on the server and see if there are any messages there. Make sure your function is not raising an unhandled exception.

Symptom: “…attempted to access a virtual address without permission” error message. This is a Windows message indicating that your multithreaded program attempted to do something which is not thread-safe. Although InterBase can on rare occasions cause this error itself, non-thread-safe UDFs are by far the most common cause of this message, and if you get this message when doing anything which involves a UDF, the UDF is probably at fault.

A full discussion of the issues which can make a function or library non-thread-safe is outside the scope of this paper, but if you’re allocating memory (even temporarily) inside of a Delphi function, the first thing to check is that the IsMultiThread system global variable is set to TRUE in the initialization section of your unit.

Calling conventions

UDFs in Delphi and Kylix must be declared and exported using the cdecl calling convention; that is, parameters are passed on the stack, not in registers, from right to left, and the stack clean-up is performed by the caller (in this case, InterBase). C and C++ programmers can use the default C calling convention (not __fastcall); C++ programmers should also declare the function extern "C" to prevent the compiler from mangling the exported function name.

Returning strings – an in-depth discussion

Here are a few possible solutions to the “returning strings” conundrum. We will explain why certain solutions are desired, and why others will simply not work.

“Solution” #1: Global static memory

There is an obvious way to accomplish returning strings: Maintain a global PChar that has some amount of space allocated to it. Stuff this string with the desired return value, and return the global PChar. The only problem with this is that InterBase is multi-threaded, and if a Delphi DLL does this, InterBase will surely crash in a multi-user environment.

It seems that solution #1 is no solution at all...
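The hazard is visible even without threads. In the sketch below, UpperGlobal is a hypothetical UDF written the “Solution #1” way; two calls in a row hand back the same pointer, so the second answer silently replaces the first. With several server threads calling the function at once, the same overwriting happens in the middle of a copy.

```pascal
program GlobalBufferHazard;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  GlobalBuf: array[0..63] of AnsiChar; // one buffer shared by every caller

{ Hypothetical UDF that returns its result in the shared global buffer. }
function UpperGlobal(sz: PAnsiChar): PAnsiChar; cdecl;
var
  i: Integer;
begin
  i := 0;
  while (sz[i] <> #0) and (i < High(GlobalBuf)) do
  begin
    GlobalBuf[i] := UpCase(sz[i]);
    Inc(i);
  end;
  GlobalBuf[i] := #0;
  Result := @GlobalBuf[0];
end;

var
  First, Second: PAnsiChar;
begin
  First := UpperGlobal('alpha');
  Second := UpperGlobal('beta');
  WriteLn(First = Second); // TRUE: both results point at the same memory
  WriteLn(First);          // BETA: the first caller's answer has been overwritten
end.
```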

Solution #2: Thread-local static memory

Instead of maintaining a single global variable, maintain a thread-local variable.

Whenever a UDF wishes to return a string, it simply copies the string into the string referenced by the thread-local PChar, and returns it.

This solution certainly sounds elegant, and it certainly lends itself to being clean, but... How do we manage thread-local variables in Delphi? Also, this solution certainly sounds like it will work, but will it? InterBase is a multi-threaded environment, surely, but how does it handle the scheduling of its calls to UDFs? (see section A discussion of Solution #2)

Solution #3: Returning dynamically allocated strings

Another obvious solution: Every time a function returning a string is ready to return the string, allocate a bit of memory for the string and return a PChar to it. InterBase won’t crash — at least not right away. It should be clear that this presents a nasty memory leak. The Delphi function will keep allocating memory, but nobody is cleaning it up.

In InterBase 5.0, a new keyword called free_it was introduced. This keyword is used like this:

declare external function f_Left
  cstring(254),
  integer
  returns
  cstring(254) free_it
  entry_point 'Left' module_name 'FreeUDFLib.dll';

If the developer chooses to return strings in the “memory-leaky” fashion, then the UDF should be declared with the free_it keyword, just like above. This allows the developer of the UDF to be sloppy, and it forces InterBase to do the housekeeping.

This is a reasonable solution if the developer of the UDF wants to be sloppy; however, it is the authors’ contention that all functions, especially third-party functions should do their own housekeeping, or they should do nothing to “dirty the house” to begin with. It is considered bad form to write a function that is known to be leaky, only to “pass the buck” to the calling application.

Another problem with the free_it solution is that it only works with InterBase 5 and up. If the developer intends to write functions for use with InterBase 4.x or lower, this solution simply won’t work. (see section A discussion of Solution #3)

Solution #4: Making InterBase do the work

An under-documented feature of InterBase allows a UDF declaration to specify a particular parameter as the assumed return value of the function. By implementing a UDF in this way, the UDF developer forces InterBase to pass it valid space holders for strings. In other words, the UDF developer won’t have to worry about dynamic memory issues because this problem is dealt with entirely in the InterBase engine.

As previously indicated, third-party routines should either do their own housekeeping, or they should do nothing to “dirty the house” to begin with. This method is a simple and elegant way to avoid messing up InterBase’s house. (see section A discussion of Solution #4).

Notes on Delphi’s memory manager

Delphi is a derivative product of the old days, when Borland Pascal was called Borland Pascal and multi-threading was just plain unavailable to the DOS world. When Delphi moved into the Win32 world with version 2.0, Borland discovered that the memory manager wasn’t thread-safe.

To solve the thread-safety concerns, they wrapped their memory management routines in critical sections, thus making the memory manager thread-safe. Critical sections are beyond the scope of this article. Suffice it to say that they ensure orderly access to a shared resource.

For performance reasons, however, the critical sections are used only if a not-so-well-known system variable, IsMultiThread, is set to TRUE. (IsMultiThread is defined in the unit “System.pas”, which is implicitly used by all Delphi units.) IsMultiThread is automatically set TRUE when a Delphi TThread is used, but when writing multithreaded routines which do not subclass TThread (such as UDFs), IsMultiThread should be explicitly set TRUE. The easiest place to do this is in the initialization section of the unit which contains the UDFs.

The basic gist of this story is as follows: Delphi is thread-safe, but only when the developer tells it to be. Whenever an application or library knows that it may be dealing with multiple threads it should guarantee that IsMultiThread is set to TRUE; otherwise, the application or library is not thread safe. (Again: IsMultiThread is set implicitly if the developer uses a TThread subclass.)

It cannot be stressed enough that IsMultiThread must be set to True in multi-threading environments.

A discussion of Solution #2

In our introduction to this solution, we asked the question, “Will thread-locals work?” The answer is a resounding yes! InterBase is a multi-threaded architecture, and any number of different queries can be running in a given thread. InterBase is guaranteed to execute a UDF and process its results within a single atomic action, thus thread-locals are perfectly safe for returning strings. (For a more in-depth conversation, visit IB’s web site, or talk with the author after the lecture).

Thread-local variables are extremely easy to work with, and Delphi makes it even easier through the use of the threadvar construct. Let’s examine how to manage thread-local variables.

Thread-local variables the Delphi way

The simplest way to deal with thread-local variables is through the use of Delphi’s keyword threadvar. The developer acts as if a global variable is being declared, but instead of using the var keyword, the keyword threadvar is used. For example, the following code snippet declares a thread-local variable called szMyString:

threadvar
  szMyString: PChar;

The keyword threadvar can only be used at the unit level. In other words, a function cannot have local variables declared as thread-local. The reasoning behind this is clear: Local variables are intrinsically local to the thread in which they were called. The only time a “thread-local” variable needs to be used is when a sharable resource is being discussed, and sharable resources are declared outside the scope of procedures and functions.
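To see that each thread really gets its own copy, consider this small console sketch. TCountThread and tlCounter are names invented for the illustration; the program is a stand-alone executable rather than a UDF library, so it says nothing about the DLL-specific problems with threadvar mentioned elsewhere in this paper.

```pascal
program ThreadVarSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}{$IFDEF UNIX}cthreads,{$ENDIF}{$ENDIF}
  Classes;

threadvar
  tlCounter: Integer; // every thread sees its own copy of this variable

type
  TCountThread = class(TThread)
  private
    FSteps: Integer;
  protected
    procedure Execute; override;
  public
    Seen: Integer; // value of tlCounter observed by this thread when it ends
    constructor Create(Steps: Integer);
  end;

constructor TCountThread.Create(Steps: Integer);
begin
  FSteps := Steps;         // set before the thread is allowed to start
  inherited Create(False); // start running immediately
end;

procedure TCountThread.Execute;
var
  i: Integer;
begin
  tlCounter := 0;          // initialize this thread's copy
  for i := 1 to FSteps do
    Inc(tlCounter);
  Seen := tlCounter;
end;

var
  A, B: TCountThread;
begin
  tlCounter := 100;        // the main thread's copy
  A := TCountThread.Create(3);
  B := TCountThread.Create(5);
  A.WaitFor;
  B.WaitFor;
  WriteLn(A.Seen);         // 3
  WriteLn(B.Seen);         // 5
  WriteLn(tlCounter);      // 100: untouched by the worker threads
  A.Free;
  B.Free;
end.
```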

Thread local variables the API way

In the section on threading (see section Thread-level initialization and finalization), we point out some problems with using threadvar, so it is important to note how Windows handles thread-local access.

There are four routines involved in managing thread-local variables:

Function: DWORD TlsAlloc

Allocate a thread-local index; this index is used to access a thread-local variable. Allocating an index is basically equivalent to declaring a threadvar. It returns $FFFFFFFF when an error occurs; otherwise, it returns a valid thread-local index.

Function: BOOL TlsFree (DWORD dwTlsIndex)

Free up a thread-local index. This is used when the thread-local variable referenced by the thread-local index is no longer needed. It returns True when a thread-local index is successfully freed.

Function: Pointer TlsGetValue(DWORD dwTlsIndex)

Return the thread-local 32-bit value indexed by dwTlsIndex. The value dwTlsIndex must have been previously allocated using TlsAlloc.

Function: Bool TlsSetValue(DWORD dwTlsIndex, Pointer lpvTlsValue)

Set the 32-bit value indicated by this thread-local index to the value specified in lpvTlsValue. The code snippet below shows how these functions work together:

var
  hTLSValue: DWORD;
...
hTLSValue := TlsAlloc;
if (hTLSValue = $FFFFFFFF) then
  (* raise an exception or something *)
...
TlsSetValue(hTLSValue, Pointer(100));
...
ShowMessage(IntToStr(Integer(TlsGetValue(hTLSValue))));
...
TlsFree(hTLSValue);

In the next section (see section Thread-level initialization and finalization), we will show how FreeUDFLib uses the Windows API to manage its thread-local variables, so we will wait until then to illustrate any examples.

Thread-level initialization and finalization

In general, DLLs do not create threads, and in the case of building UDFs, this is no exception; however, InterBase does create threads, and it is essential that the DLL knows when a thread is created and when a thread is closed.

Initial inspection of Delphi indicates that the initialization and finalization sections of a Delphi unit are prime candidates for thread-level initialization and finalization. Further inspection reveals that these sections are only fired when the library is loaded and when it is freed, respectively. Good try, but not good enough.

Delphi defines a variable DllProc. DllProc is a procedure pointer, and by assigning a procedure to DllProc, the DLL can perform actions whenever an attached application creates or destroys a thread.

A DLL entry-point procedure is declared like this:

procedure LibEntry(Reason: Integer);

The actual name of the procedure is irrelevant. It is merely important to note that a library entry procedure gets a single argument, Reason, which indicates why the procedure is being called. In Delphi, there are three possible Reasons for a DllProc to be called:

1. Reason = DLL_THREAD_ATTACH.

Whenever a thread is created in an attached application, DllProc will be called with this reason. This gives the DLL an opportunity to initialize any thread-local variables.

2. Reason = DLL_THREAD_DETACH.

Whenever a thread is being closed in an attached application, DllProc will be called with this reason. This gives the DLL an opportunity to free up any resources used by thread-local variables. Take care! Suppose that an application starts some threads; it then loads the DLL. The DLL is never explicitly told that those threads are executing (DllProc will never be called with the DLL_THREAD_ATTACH argument); however, if those threads exit gracefully, the DLL will be informed that they are closing. This means that the DLL is potentially responsible for cleaning up uninitialized data.

3. Reason = DLL_PROCESS_DETACH.

Whenever the calling application unloads a library, DllProc will be called with this reason. This is exactly equivalent to the finalization section of a Delphi unit, so it is irrelevant to our discussions.

Let’s study some examples:

procedure DllEntry(Reason: Integer);
begin
  case Reason of
    DLL_THREAD_ATTACH: begin
      tlObject := TTestObject.Create;
      DllShowMessage(tlObject.ObjectName);
    end;
    DLL_THREAD_DETACH: begin
      (* Uninitialized data is guaranteed to be nil. *)
      if (tlObject = nil) then
        DllShowMessage('Object is nil.')
      else
        (* and we've guaranteed that initialized data has an object *)
        DllShowMessage(tlObject.ObjectName);
      tlObject.Free;
    end;
  end;
end;

initialization
  IsMultiThread := True;
  DllProc := @DllEntry;
  tlObject := TTestObject.Create;

finalization
  (* Uninitialized data is guaranteed to be nil. *)
  if (tlObject = nil) then
    DllShowMessage('Object is nil.')
  else
    (* and we've guaranteed that initialized data has an object *)
    DllShowMessage(tlObject.ObjectName);
  tlObject.Free;
end.

As this code snippet shows, a DLL can easily respond to the creation of threads in a calling application.
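Because a DLL can be told that a thread is closing without ever having been told that it started, one defensive pattern is to create per-thread data lazily, on first use, and to make the clean-up tolerate threads that never allocated anything. The following Windows-only sketch shows the idea with the TLS routines described earlier. ThreadBuffer, ReleaseThreadBuffer and BufferSize are hypothetical names, and the sketch is a console program for ease of testing; in a library, ReleaseThreadBuffer would be called from the DLL_THREAD_DETACH branch of the entry procedure.

```pascal
program LazyTlsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Windows;

const
  BufferSize = 256; // hypothetical size of each thread's result buffer

var
  hTlsIndex: DWORD;

{ Return the calling thread's buffer, creating it on first use. Windows
  initializes every thread's slot to nil, so nil means "not allocated yet". }
function ThreadBuffer: PAnsiChar;
begin
  Result := TlsGetValue(hTlsIndex);
  if Result = nil then
  begin
    GetMem(Result, BufferSize);
    Result^ := #0;
    TlsSetValue(hTlsIndex, Result);
  end;
end;

{ Release the calling thread's buffer, if it ever had one. Safe to call for
  a thread that never asked for a buffer. }
procedure ReleaseThreadBuffer;
var
  P: Pointer;
begin
  P := TlsGetValue(hTlsIndex);
  if P <> nil then
  begin
    FreeMem(P);
    TlsSetValue(hTlsIndex, nil);
  end;
end;

begin
  IsMultiThread := True;
  hTlsIndex := TlsAlloc;
  if hTlsIndex = $FFFFFFFF then
    Halt(1);                            // no thread-local index available
  WriteLn(ThreadBuffer = ThreadBuffer); // TRUE: one buffer per thread, reused
  ReleaseThreadBuffer;
  ReleaseThreadBuffer;                  // second call is a harmless no-op
  TlsFree(hTlsIndex);
end.
```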

To further the reader’s understanding of these entry point functions, and to demonstrate a problem with Delphi’s threadvar construct, the reader should study Greg Deatz’s article, “Thread-safe DLLs” in the November 1998 issue of Delphi Informant magazine.

A discussion of Solution #3

As was mentioned before, InterBase 5.0 introduces the free_it keyword, thus allowing the UDF developer to use dynamically allocated memory for the return of strings and dates.

Aside from the author’s contention that this is sloppy, and that it won’t work with versions of InterBase previous to 5.0, this is a fully supported and “sponsored” technique for returning strings to InterBase. (see section Solution #3: Returning dynamically allocated strings) So, sloppy or not, we must “face the music,” and explore returning dynamically allocated strings to InterBase.

Memory allocation issues

The Windows version of InterBase is compiled using Microsoft’s C compiler (MSVC). Without getting into a discussion as to why they chose this compiler, suffice it to say that InterBase expects dynamically allocated memory to be allocated using MSVC’s malloc routine.

MSVC’s malloc routine handles memory allocation in a manner “all its own.” That is, we can’t rightly infer how it manages memory, but it certainly does not allocate memory in the same fashion as Delphi or other languages. So, a Delphi function, for example, that tries to dynamically allocate memory using GetMem or the Windows system call GlobalAlloc will most certainly cause problems with InterBase if used in conjunction with the free_it keyword.

IB 5.5 and later ship with a special library called ib_util, which contains a function called ib_util_malloc. This function is compatible with MSVC’s malloc. To use it, add IB’s “include” directory to your Delphi search path, then copy ib_util.dll to your Windows system directory. Finally, add ib_util to the uses clause of your unit’s implementation section. These steps allow you to use the ib_util_malloc function.

For IB 5.1 and earlier, you must use the MSVC malloc directly. This problem is resolved by making use of the fact that MSVC applications must be distributed with the run-time MSVC library, ‘msvcrt.dll’. If InterBase or the InterBase client is installed on a system, then this DLL is installed on your system as well.

By making the following declaration in your Delphi UDF library,

function malloc(Size: Integer): Pointer; cdecl; external 'msvcrt.dll';

you will allow Delphi to make use of MSVC’s malloc routine, so that the free_it keyword can be used.

Working through an example

Let’s take a look at CopyString:

function CopyString(sz: PChar): PChar; cdecl;
var
  szLen: Integer;
begin
  szLen := 0;
  // count the characters in sz
  while (sz[szLen] <> #0) do
    Inc(szLen);
  // one more byte for the null terminator
  Inc(szLen);
  // replace ib_util_malloc with malloc for < IB 5.5
  result := ib_util_malloc(szLen);
  // copy the string, terminator included
  Move(sz^, result^, szLen);
end;

Quite simply, CopyString allocates enough space for the passed string (sz) plus the null terminator, and then copies the string, terminator included, into the new buffer.

The declaration for CopyString is as follows:

declare external function CopyString
  cstring(64)
  returns
  cstring(64) free_it
  entry_point 'CopyString' module_name 'UDFTest1.dll';

And finally, we can test this example:

select
  CopyString('Hello World')
from
  rdb$database
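For readers building the library in C instead of Delphi, the same idea can be sketched as follows. This is an illustrative counterpart of the Delphi CopyString above, not code from the original library. The name `copy_string` is hypothetical, and plain malloc is used on the assumption that the library shares the engine's C run-time (on IB 5.5 and later, ib_util_malloc would take its place).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a freshly allocated copy of sz. With free_it in the
   declaration, the engine takes ownership of the block and frees it. */
char *copy_string(const char *sz)
{
    size_t len = strlen(sz) + 1;         /* characters plus the null terminator */
    char *result = (char *)malloc(len);  /* assumption: same run-time as the engine */

    if (result != NULL)
        memcpy(result, sz, len);         /* copies the terminator as well */
    return result;                       /* NULL if the allocation failed */
}
```

As in the Delphi version, the length handed to the allocator includes the terminator, so the copy is a complete C string.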

A discussion of Solution #4

It is a bit frustrating to think that we can write a UDF library that doesn’t support as many versions of InterBase as desired. And, if you agree with the authors that the “sloppy” approach just won’t do, then this section might be for you.

As was briefly alluded to above, a UDF should do its own housekeeping, and when possible, it should probably also try to avoid “dirtying the house” at all. InterBase’s external function declaration syntax includes the ability for InterBase to pass the result buffer to the UDF, so that the UDF can take the “high road” and be a gracious guest, providing information only, but not cluttering up InterBase’s house at all.

In addition, this method for declaring functions makes it possible to avoid using the free_it keyword, so that UDF libraries built in this way can be used in versions of InterBase prior to version 5.0 (see the section Solution #4: Making InterBase do the work).

How is this done? Clearly, this is best illustrated through an example. Consider the following implementation of CopyString:

function CopyString(sz, szRes: PChar): PChar; cdecl;
begin
  // the result is the buffer InterBase handed to us
  result := szRes;
  while (sz^ <> #0) do begin
    szRes^ := sz^;
    Inc(sz);
    Inc(szRes);
  end;
  // copy the null terminator
  szRes^ := sz^;
end;

Now, in InterBase, we declare it as follows:

declare external function CopyString cstring(64), cstring(64)
  returns parameter 2
  entry_point 'CopyString' module_name 'UDFTest2.dll';

And finally, we can test this example by using it in a silly select statement:

select
  CopyString('Hello World')
from
  rdb$database

This is perhaps the best method for returning a string, as it keeps the responsibilities for allocating and freeing memory clear, and it seems to work in all cases. Note that the caller supplies only the first argument; InterBase provides the second parameter itself as the result buffer. The only downside seems to be a bug in ISQL (which also shows up in IBConsole) that incorrectly reports the UDF declaration when extracting metadata.
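The same caller-supplied-buffer pattern can be sketched in C. This is an illustrative counterpart of the second Delphi CopyString, with the hypothetical name `copy_string_into`. Like the Delphi code it does no bounds checking, so it relies on the assumption that the buffer InterBase passes in is large enough for the declared cstring(64).

```c
#include <assert.h>
#include <string.h>

/* Copies sz into the caller-owned buffer sz_res and returns that
   buffer. No memory is allocated, so nothing needs to be freed. */
char *copy_string_into(const char *sz, char *sz_res)
{
    char *result = sz_res;      /* remember the start of the buffer */

    while (*sz != '\0')
        *sz_res++ = *sz++;      /* copy one character at a time */
    *sz_res = '\0';             /* terminate the copy */
    return result;
}
```

In a stand-alone test, a local array such as `char buffer[65];` can play the role of the buffer InterBase would supply (the size here is only an assumption for the demo), and the returned pointer is that same array.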

Building UDFs with gcc in Linux

As mentioned above, you can use Borland Kylix to write UDFs for Linux using the same source code you use for Delphi UDFs on Windows. But if, for some reason, you prefer to use gcc, it’s not so hard.

Whenever you compile a C file, the compiler creates an object file, which is normally linked statically with other code during a linking phase to produce an executable file of some form. It’s during this linking phase that we are going to ask for a “shared library” instead, which is essentially a “shared object” file that can be dynamically linked to a program at run-time, not at compile-time. In Windows-ese, we call these “things” DLLs.

In Unix/Linux-ese, we call these things shared libraries, which are essentially “shared object” files – libraries of code that can be dynamically linked at run-time. Thus we arrive at the conventional “so” extension for these files.

You need to remember that there is nothing inherently different between a Linux Shared Library and a Windows DLL. They are the same thing, at least in concept.

So.... how do we create a shared library under Linux?

Create a C-file

This much is easy, right? Just open a text file with a .c extension.

Create the modulo routine

int modulo(int *a, int *b);

/* InterBase passes UDF arguments by reference, hence the pointers. */
int modulo(int *a, int *b)
{
  if (*b == 0)
    return -1; /* return something suitably stupid. */
  else
    return *a % *b;
}
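Before registering the function in InterBase, it can be worth exercising it from plain C. The sketch below repeats the routine so that it compiles on its own and adds `call_modulo`, a hypothetical helper that imitates the by-reference call the engine makes. Neither the helper nor this way of testing comes from the original article.

```c
#include <assert.h>

/* Same routine as above: arguments arrive as pointers. */
int modulo(int *a, int *b)
{
    if (*b == 0)
        return -1;              /* sentinel for division by zero */
    return *a % *b;
}

/* Hypothetical helper: takes plain values and passes their
   addresses, the way the engine passes UDF arguments. */
int call_modulo(int numerator, int denominator)
{
    return modulo(&numerator, &denominator);
}
```

With this in place, `call_modulo(3, 2)` evaluates to 1 and `call_modulo(7, 0)` to -1. Keep in mind that -1 is also a legitimate remainder for some negative inputs, so the sentinel is ambiguous.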

Build it, use it

At the command-line

gcc -c -O -fpic -fwritable-strings <your udf>.c
ld -G <your udf>.o -lm -lc -o <your udflib>.so
cp <your udflib>.so /usr/interbase/udf

In ISQL

declare external function f_Modulo
  integer, integer
  returns
  integer by value
  entry_point 'modulo' module_name 'name of shared library';

commit;

select f_Modulo(3, 2) from rdb$database;

Holy guacamole, Batman! That was really easy.

About the Authors

Gregory Deatz is a senior programmer/analyst at Hoagland, Longo, Moran, Dunst & Doukas, a law firm in New Brunswick, NJ. He has been working with Delphi and InterBase for approximately two and a half years and has been developing under the Windows API for approximately five years. His current focus is in legal billing and case management applications. He is the author of FreeUDFLib, a free UDF library for InterBase written entirely in Delphi, and FreeIBComponents, a set of native InterBase components which are the foundation for the InterBase Express (IBX) component set included with Delphi, Borland C++ Builder, and InterBase.

Craig Stuntz is a senior developer at Vertex Systems Corporation, a company which produces software for non-profit, rehabilitation organizations, and a member of TeamB.