The current kForth release is:
Versions: 1.5.2 (x86-linux), 1.4.1 (ppc-osx and x86-cygwin)
Last Release Date: 2011-03-05
Systems: Linux (x86), Mac OS X (ppc), Windows 98/NT/2000/XP with Cygwin (x86)
kForth is specified as a subset of the ANS Forth standard, given in DPANS94. Code written for kForth is portable to ANS-compliant Forth systems with the use of trivially defined extensions (see the Special Features section below).
Compliance with ANS Forth may be checked using John Hayes' suite of tests for the core words of an ANS Forth system: tester.4th and core.4th. Tests involving unsupported words such as HERE, the comma operator (,), and C, have been commented out, as have tests involving the BEGIN ... WHILE ... WHILE ... REPEAT ... THEN structure and some unusual variants of CREATE and DOES> usage. Compliance with the ANS Forth extension words for working with double length numbers may be checked using dbltest.4th. Tests are commented out for words which are not implemented in kForth.
kForth is an indirect threaded code (ITC) system. The kForth compiler/interpreter parses the input stream into a vector of pseudo op-codes or Forth Byte Code. Upon execution, the vector of byte codes is passed on to a virtual machine which looks up the execution address of the words and performs either a call or an indirect jump to the next execution address. The type of threading used in the virtual machine is a hybrid of indirect call threading and indirect jump threading. The kForth virtual machine is implemented as a mixture of assembly language, C, and C++ functions. Only the assembly language portion of the virtual machine utilizes indirect jump threading.
kForth versions 1.2.10 and earlier implement symmetric integer division. An alternative form of signed integer division is called floored integer division. Both symmetric and floored division yield identical results when the two operands, dividend and divisor, are either both positive integers or both negative integers. However, when the two operands differ in sign, symmetric and floored integer division can give different results. For example,
Floored Division:   -8 3 / . -3 ok
Symmetric Division: -8 3 / . -2 ok
Similarly, the word MOD yields different results on floored and symmetric division systems. Under floored division, MOD is truly a modulus operator (i.e. for positive n2, the result of n1 n2 MOD is a number in the range [0, n2)), while under symmetric division, MOD simply returns a remainder, which takes the sign of the dividend. The following paper provides a discussion of integer division in computing languages: Division and Modulus for Computer Scientists by Daan Leijen.
Floored integer division was guaranteed by the Forth-83 standard. However, the DPANS94 standard revoked this guarantee and allowed system implementors to choose either symmetric or floored integer division. The rationale for revoking a fixed standard was to allow Forth systems to implement whatever form of integer division was best supported by the microprocessor hardware. Most microprocessors which provide signed integer division implement symmetric division. In kForth, the original rationale for using symmetric division was simply to maintain consistency with the GNU C implementation, which uses symmetric integer division as mandated by the ISO C99 standard (the symmetric version of MOD corresponds to the % operator in C). In general, floored division is considered by computer scientists and mathematicians to be the more useful form of signed integer division.
A significant problem with the DPANS94 standard is that, in practice, implementors of ANS-compliant Forth systems for a single hardware platform such as Intel x86 have chosen to use different forms of division. Consider the behavior of the Forth systems below, all running under Linux on an Intel PII:
gforth:   -8 3 MOD . -2 ok
pfe:      -8 3 MOD . 1 ok
kforth:   -8 3 MOD . -2 ok
iforth:   -8 3 MOD . -2 ok
bigforth: -8 3 MOD . 1 ok
Therefore, a Forth program using signed integer division words (/ MOD /MOD */MOD) may produce different outputs under two different ANS-compliant Forth systems. The DPANS94 standard addresses the portability issue by calling for use of the explicit floored and symmetric division words FM/MOD and SM/REM whenever it is important to explicitly specify the type of division. However, it is highly likely that Forth programmers will casually use signed integer division words such as MOD without always remembering the portability issue.
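One way to avoid the portability issue is to wrap the explicit division words in small helper definitions and use those instead of MOD. The following is a minimal sketch, assuming the host system provides the standard words FM/MOD, SM/REM, and S>D; the names FLOORED-MOD and SYMMETRIC-MOD are hypothetical and chosen only for illustration.

```forth
\ Hypothetical helper names. FM/MOD and SM/REM both take a double length
\ dividend and a single length divisor, and leave ( rem quot ).
: FLOORED-MOD   ( n1 n2 -- rem )  >R S>D R> FM/MOD DROP ;
: SYMMETRIC-MOD ( n1 n2 -- rem )  >R S>D R> SM/REM DROP ;

-8 3 FLOORED-MOD .      \ prints 1, since -8 = 3 * -3 + 1
-8 3 SYMMETRIC-MOD .    \ prints -2, since -8 = 3 * -2 + -2
```

Because the type of division is stated explicitly, these two words should give the same results under either a floored or a symmetric Forth system.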
kForth supports working with signed and unsigned double length numbers, and implements nearly all of the optional double number word set specified by DPANS94, either intrinsically or in the form of Forth source definitions (see ans-words.4th for the latter). In addition to the ANS Forth tests involving double numbers given in core.4th, further tests of double number words implemented in kForth are given in dbltest.4th.
One significant departure in kForth from typical Forth systems which
provide double numbers is the method of entry of double length numbers.
Traditional Forth recognizes the decimal point as a marker for a double
number, e.g.
234.
The prohibition on standard double number entry in kForth demands that an
alternate method be provided for entry of double numbers. This may be easily
accomplished by using a string to double number conversion word. There are
two ways to accomplish this. The first method is simple, but it is specific to
kForth, while the second is more complex, but portable to other ANS systems.
In the simple method, we may make use of the non-standard word NUMBER? to convert a counted string to a signed double length number, as follows.
c" -20123456789" NUMBER? DROP
NUMBER? actually returns a flag indicating whether or not the conversion succeeded, but we drop the flag in the above example for simplicity. If the conversion did not succeed, a double length zero will result.
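Rather than dropping the flag, a program can test it and report bad input. The following is a minimal sketch for kForth only. It assumes, as the example above implies, that NUMBER? leaves the double length number with the flag on top of the stack, and that the flag is true on success. The name STR>D is hypothetical.

```forth
\ STR>D is a hypothetical name; NUMBER? is the kForth-specific word.
\ Assumed stack effect of NUMBER?: ( ^str -- d flag ), flag true on success.
: STR>D ( ^str -- d )
    NUMBER? 0= IF  ." invalid number, using 0" CR  THEN ;

c" -20123456789" STR>D D.
```

On failure the word prints a message and leaves the double length zero that NUMBER? returns, so the stack effect is the same in both cases.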
The second method should be used if it is desired to port the code to
other ANS Forth systems. ANS Forth provides >NUMBER for
converting a string to an unsigned double number. A more general string
to double number conversion word, handling both signed and unsigned
double numbers, may be written as follows.
variable dsign

: >d ( a u -- d|ud | convert string to a signed/unsigned double )
    0 0 2SWAP
    \ skip leading spaces and tabs
    BEGIN  OVER C@ DUP BL = SWAP 9 = OR  WHILE  1 /STRING  REPEAT
    ?DUP IF
      FALSE dsign !
      OVER C@ CASE
        [char] - OF  TRUE dsign !  1 /STRING  ENDOF
        [char] + OF  1 /STRING  ENDOF
      ENDCASE
      >NUMBER 2DROP
      dsign @ IF DNEGATE THEN
    ELSE
      DROP
    THEN ;

s" 20123456789" >d
s" -20123456789" >d
s" +20123456789" >d
-234 S>D
2147483647 S>D
-2147483649 S>D
The ANS Forth specification allows floating point numbers to be stored either on the data stack or on a separate floating point stack. kForth uses the data stack for holding floating point numbers. Even though many current Forth systems for PCs feature a separate floating point stack, the rationale for using the data stack for floating point operations in kForth was to allow legacy code written for earlier Forth systems (in particular the Forths from Laboratory Microsystems Inc.) to run without significant modifications under kForth. In kForth, a floating point number on the stack occupies two cells. Thus, under 32-bit Windows or Linux, floating point numbers are 64-bit double-precision numbers (equivalent to C's double).
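One visible consequence of keeping floating point numbers on the data stack is that each number changes the stack depth by two cells. A small sketch, assuming a 32-bit kForth and an initially empty data stack:

```forth
\ Assumes a 32-bit kForth with an empty data stack. On a system with a
\ separate floating point stack, DEPTH would not count the number at all.
1.5e0 DEPTH .    \ depth in cells after pushing one floating point number
F.               \ remove the number from the data stack and print it
```

Code that mixes integers and floating point numbers on the stack therefore has to account for the two cells occupied by each floating point value.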
The quality of the floating point arithmetic in kForth may be checked using the program paranoia.4th.
Special features of kForth are described in a two-part article in Forthwrite magazine, issues 116 and 117. These features are:
There is no HERE address in kForth.
There is no , (comma operator) in kForth.
There is no C, operator in kForth.
Since HERE does not exist, the word ALLOT not only allocates the requested amount of memory, but also has the non-standard behavior that it assigns the address of the new memory region to the parameter field address (PFA) of the last defined word. In kForth, the use of ALLOT must always be preceded by the use of CREATE. A variant of ALLOT, named ?ALLOT, is also provided. ?ALLOT has the same behavior as ALLOT, and in addition it returns the start address of the dynamically allocated region on the parameter stack. ?ALLOT has the following equivalent definition under ANS Forth:
: ?ALLOT ( u -- a ) HERE SWAP ALLOT ;
?ALLOT is particularly useful in writing defining words in the absence of HERE and the comma operators. For example, to write your own integer constant defining word:
: CONST ( n -- ) CREATE 4 ?ALLOT ! DOES> @ ;
: PTR ( a -- ) CREATE 4 ?ALLOT ! DOES> A@ ;
On other ANS Forth systems, A@ may be defined as follows:
: A@ @ ;
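To run a defining word written with ?ALLOT on another ANS Forth system, the equivalent ANS definition of ?ALLOT can be loaded first. The following is a minimal sketch for such a system. The first definition must be omitted under kForth, which has no HERE. The use of 1 CELLS in place of the literal 4 is an assumption made here for portability across cell sizes, and ANSWER is a hypothetical name.

```forth
\ For ANS Forth systems that provide HERE; omit this line under kForth.
: ?ALLOT ( u -- a )  HERE SWAP ALLOT ;

\ Integer constant defining word, written without the comma operator.
: CONST ( n -- )  CREATE 1 CELLS ?ALLOT !  DOES> @ ;

42 CONST ANSWER
ANSWER .    \ prints 42
```

On a system with the comma operator, the body of CONST is equivalent to the more familiar CREATE , DOES> @ form.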
0 | not IMMEDIATE | DEFERRED    |
1 | IMMEDIATE     | DEFERRED    |
2 | not IMMEDIATE | NONDEFERRED |
3 | IMMEDIATE     | NONDEFERRED |

Precedence | Interpret | Compile |
0          | E0        | E0      |
1          | E2        | E2      |
2          | E1        | E0      |
3          | E1        | E2      |
10 0 do i . loop
In kForth, do-loop, begin-while-repeat, and if-then structures may occur outside of word definitions. kForth can interpret and execute such structures as long as they are completed on a single line of input.
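As a further illustration, a begin-while-repeat structure can be typed directly at the prompt in the same way. The sketch below shows the single-line form, which relies on the kForth behavior described above, next to a portable form wrapped in a definition. COUNTDOWN is a hypothetical name.

```forth
\ kForth only: the whole structure is completed on one line of input.
5 BEGIN DUP WHILE DUP . 1- REPEAT DROP

\ Portable form: the same loop inside a word definition.
: COUNTDOWN ( n -- )  BEGIN DUP WHILE DUP . 1- REPEAT DROP ;
5 COUNTDOWN    \ prints 5 4 3 2 1
```

Code intended to run on other ANS Forth systems should use the second form, since control structures outside of definitions are not guaranteed to work there.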
NONDEFERRED words are those for which interpretation of the rest of the input line depends on the execution of the word. Thus, the following intrinsic words in kForth have the nondeferred precedence attribute:
\         | .(     | BINARY   | DECIMAL   | HEX      |
WORD      | '      | CREATE   | FORGET    | COLD     |
ALLOT     | ?ALLOT | CONSTANT | FCONSTANT | VARIABLE |
FVARIABLE | CHAR   | >FILE    | CONSOLE   |
It is usually not necessary to use the NONDEFERRED keyword to set explicitly the interpretation precedence of a word. This is due to the automatic inheritance of the nondeferred attribute: if a word definition includes a nondeferred word, then the new word is automatically nondeferred also. Thus, for example, any word which has a definition including WORD is also a nondeferred word. Another example is a defining word, i.e. one which uses CREATE. Since CREATE is nondeferred, the new defining word is also nondeferred.
One case in which the NONDEFERRED keyword should be explicitly used is in the definition of a word which changes the number base. For example,
DECIMAL
: BASE3 3 BASE ! ; NONDEFERRED
BASE3 21
If BASE3 were not declared to be a nondeferred word, then 21 in the above line would be interpreted as decimal 21 rather than as decimal 7 (which is 21 in base 3).
With -D, compiled op-codes and other debugging information are displayed. This mode is useful primarily for programmers interested in extending and debugging their own versions of kForth.
Versions of standard benchmark programs for measuring kForth execution speed may be found on the ftp site under /software/kforth/examples/benchmarks.
The following Forth source files provide tests for ANS compliance of core and standard extension words in Forth-94, for words which are specific to kForth, and for floating point arithmetic. Most of the test files require ttester.4th and tester.4th.
Non-zero return codes from the virtual machine (VM) indicate the
following conditions:
addr.
ival.
QUIT (not seen by user).
ALLOT memory for a word.
CREATE (bad word name).
DO.
BEGIN.
ELSE without matching IF.
THEN without matching IF.
ENDOF without matching OF.
ENDCASE without matching CASE.
ABORT will reset the stack pointers. This procedure should be used to recover from VM errors 5 and 7, and whenever there is a suspicion that the stacks have been corrupted.
Source code for kForth consists of the following C++, C, and assembly
language files:
kforth.cpp
ForthCompiler.cpp
ForthVM.cpp
vmc.c
vm-common.s
vm.s
vm-fast.s
fbc.h
ForthWords.h
ForthCompiler.h
ForthVM.h
kfmacros.h
The source code is made available to users under the GNU General Public License. The Linux version is provided as source code only and must be built locally on the user's machine (see installation).
Under Linux, the standard GNU assembler, GNU C and C++ compilers, and the C++ Standard Template Library (STL) are required to build the executable. The Windows 95/98/NT console application was built using the free Cygwin port of the GNU development tools.
The file kforth.cpp serves as a skeleton C++ program to illustrate how the kForth compiler and virtual machine may be embedded in a standalone program. XYPLOT for Linux is a more complex GUI program which embeds kForth to allow user extensibility. The file xyplot.cpp shows how to set up hooks for calling C++ functions in the host program from the embedded kForth interpreter and vice-versa.