---------------------------------------------------------------
QL Brouhabouha Forth 1994-05-15
---------------------------------------------------------------

Contents of this file:
-- Terms of distribution
-- Terms of usage
-- For novice users
-- Some differences to Forth 79/83
-- Implementation characteristics

---------------------------------------------------------------
---------------------------------------------------------------
*****************************************************
             Terms of distribution
*****************************************************

BBS Sysops!

** Runtime information is only preserved by QDOS archivers! **

Do NOT re-archive the executable under any operating system
other than QDOS! The archive contains OS header information
needed for execution.

Distribution is welcome, but ONLY allowed if all original text
files are included in what is distributed and the executable
file remains the original one.

---------------------------------------------------------------
---------------------------------------------------------------

*****************************************************
               Terms of usage
*****************************************************

Brouhabouha Forth for the Sinclair QL operating system QDOS
comes with permission granted for free use in noncommercial
environments, that is, where no profit is intentionally made by
its use. Otherwise, contacting the author at one of the
addresses given below for special license terms is mandatory.

This is a Forth interpreter and compiler, mainly as described by
the ANS draft proposal dpANS-6 of June 1993. That paper should
also be the best description of this Forth system.

No expectations may be made other than that it occupies some
memory and perhaps steals some CPU time.

All rights reserved @ Ewald Pfau, 1991-1994.

---------------------------------------------------------------
---------------------------------------------------------------

Comments and reports are welcome.

EMail:
2:316/9.0@fidonet
ehp@ist.tu-graz.ac.at

Surface Mail:
Hoenigtal 145 - 8301 Lassnitzhoehe - Austria/Europe

A similar version also exists for IBM/DOS machines, as well as
a reduced version for the 8051 microcontroller. Development
tools exist for implementing versions running on different
controllers and platforms.

Thanks to the QL community: to Jan Bredenbeek for helping to
keep things together via EMail, to Franz Herrmann for his port
of the LHarc archiver, to Jonathan Hudson for his "Ibmdisk" tool.

Thanks to the Forth community: to Johannes Teich for providing
news and asking questions.

---------------------------------------------------------------
---------------------------------------------------------------

*****************************************************
              For novice users
*****************************************************

Just a few hints, no tutorial:

1. The Forth interpreter knows no delimiter other than white
space or end of line (end of block, end of file). The only
exceptions arise where a word scans for a particular character,
such as '"' <double-quote> or ')' <right bracket>. So '(' <left
bracket> is a Forth word like everything else and must likewise
be followed by a blank; it then scans for a right bracket, which
therefore needs no delimiter of its own.

2. The easiest way to leave the system is to say "BYE".

3. The easiest way to look around is to use the tools provided:
WORDS, .S, SEE <name>, DUMP. While interpreting, strings may be
entered with S" <string>". Afterwards the parameters of the
string, address and length, wait on the stack to be dealt with:
consumed by TYPE or by R/W OPEN-FILE, duplicated by 2DUP, and
so on. .S shows at any time what is on the stack.

4. Most words are built with the defining word : <colon>,
followed by a new_name and a sequence of actions, the definition
being ended by ; <semi>. Words built that way (colon
definitions, "secondaries") may be inspected with
SEE <new_name>. Other defining words are CONSTANT, VARIABLE,
VALUE, CREATE. With the help of DOES>, defining words themselves
may be defined, mainly to give the same runtime code to
different data structures composed with CREATE and ALLOT. Since
this is so easy, there are no predefined arrays. Every
definition is represented by its execution token (formerly
called "cfa", "code field address"), which may be found by
' <name> ("tick" of name) and consumed by EXECUTE or by CATCH.

5. Words starting with a dot are, by convention, words which
display something. A dot on its own prints a number. Numbers are
entered by typing them in, which puts them on the stack. The
interpreter takes numbers containing a dot somewhere between the
digits to be double-cell numbers, with the high-order cell on
top of the stack. Numbers entered with a leading '$', '&' or '%'
are taken to be hex, decimal or binary respectively,
independent of the base currently held in variable BASE.

6. Leo Brodie's book "Thinking Forth" has been published again.
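
Hints 3 to 5 can be tried out in a short session. The following
is only a sketch in standard words, assuming DECIMAL as the
current base; the names 2TIMES, ARRAY and SCORES are made up for
illustration and are not part of the system.

```forth
\ Hypothetical names: 2TIMES, ARRAY, SCORES.
: 2TIMES ( n -- 2n )  DUP + ;          \ a colon definition

\ A defining word: CREATE and ALLOT lay out the data,
\ DOES> gives every child the same runtime code.
: ARRAY ( n "name" -- )
   CREATE CELLS ALLOT
   DOES> ( index -- addr )  SWAP CELLS + ;

10 ARRAY SCORES        \ room for 10 cells
42 3 SCORES !          \ store 42 in element 3
3 SCORES @ .           \ prints 42
21 ' 2TIMES EXECUTE .  \ tick gives the execution token; prints 42
S" hello" 2DUP TYPE    \ 2DUP keeps address and length on the stack
.S 2DROP               \ show the stack, then clean up
```

ARRAY does no bounds checking; an index outside 0...9 addresses
memory that does not belong to SCORES.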

---------------------------------------------------------------
---------------------------------------------------------------

*****************************************************
       Some differences to Forth 79/83
*****************************************************

1. Only data memory may be regarded as being addressable!

2. A DOES> clause is a valid runtime only for definitions which
have been created by direct or indirect use of CREATE. For a
word generated by (a child of) CREATE, the address of the
associated data memory may be calculated by applying >BODY to
the execution token of that word. The required amount of data
space should have been reserved with ALLOT or , ("comma") or
C, ("C-comma"). Take care to use ALIGN properly if the memory is
to be addressable as a storage area for cells.

3. To control recovery from ambiguous situations, an execution
token may be executed by passing it as a parameter to CATCH.
This initializes a frame for the stack and the return stack,
which is discarded if a called word executes THROW with a
parameter other than zero. That parameter is then returned by
CATCH. Otherwise, THROW merely consumes a zero, the frame is
worked through step by step, and CATCH returns zero. If CATCH
returns a non-zero value, an ambiguous situation has occurred,
which may be dealt with as the particular situation dictates.
After the special action taken in that case, the non-zero
parameter may be handed on to another THROW. Each of the
interpreting loops, EVALUATE, LOAD, INCLUDE-FILE, QUIT,
establishes such a frame. It serves as the last instance for
recovery, and discarding it leads to the action of QUIT. So
ambiguous situations may be "thrown" in any case, and control is
then given back to the interpreting loop. If interpreting is
done from the operating system command line, control is given
back to the operating system, to ensure a workable condition
when interpreting in unattended situations. Likewise, if the
return stack frame is out of bounds, a BYE is executed. Throw
values in the range -4095...0 are reserved for the system's own
use.
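
As an illustration of the CATCH and THROW pairing described
above, here is a minimal sketch. The names RISKY and GUARDED and
the throw code 1 are assumptions chosen for the example; 1 lies
outside the reserved range.

```forth
\ Hypothetical names: RISKY, GUARDED. Throw code 1 is arbitrary.
: RISKY ( n -- n )
   DUP 0< IF  1 THROW  THEN ;   \ non-zero THROW unwinds to CATCH

: GUARDED ( n -- n' )
   ['] RISKY CATCH              \ CATCH consumes the execution token
   IF  DROP 0  THEN ;           \ on a throw, n is stale: replace by 0
```

After a throw the stack depth is restored to what it was before
CATCH, but the cell in the place of n no longer holds a defined
value, which is why GUARDED drops it.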

4. Instead of vocabularies, switchable by names, there are now
wordlists, switchable by parameters. WORDLIST installs a new
wordlist and returns a wordlist identifier ("wid"). This
parameter is consumed by SET-CURRENT to switch the compiling
wordlist. GET-CURRENT returns the identifier of the active
compiling wordlist. GET-ORDER returns a set of parameters with
the count on top of the stack. SET-ORDER switches the active
search order, consuming a set of parameters with its count on
top of the stack and the identifier of the first searched
wordlist next on the stack. FIND searches the active search
order. SEARCH-WORDLIST searches only the one wordlist given as
a parameter. FIND consumes the address of a counted string; if
the name contained in that string is not found, it returns that
address and zero on top of the stack. SEARCH-WORDLIST consumes
the parameters of a string, address and length, plus a wordlist
identifier; if the name is not found, it returns only zero. If
the searched string is found to be the name of a definition in
the wordlists scanned, the execution token of that definition
is returned together with a flag, which is greater than zero if
the word is immediate and less than zero otherwise. This is the
only way the immediacy of a definition may be determined.
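
The wordlist words above might be combined as follows. This is a
sketch only; TOOLS, HELLO and TOOLS-HELLO are hypothetical names
introduced for the example.

```forth
\ Hypothetical names: TOOLS, HELLO, TOOLS-HELLO.
WORDLIST CONSTANT TOOLS        \ new wordlist, wid kept in a constant

GET-CURRENT                    \ remember the compiling wordlist
TOOLS SET-CURRENT
: HELLO ( -- n )  123 ;        \ compiled into TOOLS
SET-CURRENT                    \ back to the former wordlist

\ Look HELLO up in TOOLS only and execute it.
: TOOLS-HELLO ( -- n )
   S" HELLO" TOOLS SEARCH-WORDLIST   \ -- xt flag | 0
   DUP 0= ABORT" HELLO not found"
   DROP EXECUTE ;

\ Put TOOLS in front of the search order, then take it out again.
GET-ORDER  TOOLS SWAP 1+  SET-ORDER
HELLO DROP                     \ found via the search order now
GET-ORDER  SWAP DROP 1-  SET-ORDER
```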

5. To pass a compiling action on to the runtime of a
definition, the formerly used COMPILE <name> and
[COMPILE] <name> are replaced by the "immediacy smart"
POSTPONE <name>, together with COMPILE, ("compile-comma"). With
POSTPONE, an immediate "name" is itself compiled, so it is
executed at runtime instead of at compile time; otherwise "name"
is compiled at runtime. COMPILE, takes an execution token as
parameter and appends it at runtime to the definition whose
compilation is in progress.
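
Both cases of POSTPONE, and COMPILE, as well, can be seen in a
small sketch. All names defined here (ENDIF, NEGATIVE?, TWICE,
and so on) are hypothetical and serve only as illustration.

```forth
\ POSTPONE of an immediate word: ENDIF behaves like THEN.
: ENDIF ( -- )  POSTPONE THEN ; IMMEDIATE
: NEGATIVE? ( n -- flag )  0< IF -1 ELSE 0 ENDIF ;

\ POSTPONE of non-immediate words: TWICE, compiles DUP and +
\ into the definition that is being built when it runs.
: TWICE, ( -- )  POSTPONE DUP  POSTPONE + ; IMMEDIATE
: DOUBLED ( n -- 2n )  TWICE, ;

\ COMPILE, appends an execution token to the current definition.
: DUP, ( -- )  ['] DUP COMPILE, ; IMMEDIATE
: PAIR ( x -- x x )  DUP, ;
```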

6. There are now four defined input streams: strings in memory,
via EVALUATE (given the parameters of the string); files
containing text lines, via INCLUDE-FILE (given a file handle)
or INCLUDED (given the parameters of a file name string); block
files, via THRU or LOAD (given two block numbers or one; and in
this implementation, LOAD-FILE, given a block number and a file
handle); and the console, via QUIT (no parameters; the state of
the machine is reset to interpreting, debugging is switched
off, return stack and catch frames are reset).

7. SAVE-INPUT returns a set of parameters with the count on top
of the stack, being an internal description of the interpreter's
current position. RESTORE-INPUT tries to set the interpreter's
position to the one described by the given set of parameters,
which should come from an earlier execution of SAVE-INPUT. It
returns a true flag if this was possible, a false flag
otherwise. No switching across different input streams or files
should be done this way (even though this implementation is
capable of it).

8. SOURCE returns the parameters of the whole string which is
currently the input for the interpreter: one line for console
or ASCII file input, one block for block files, or the string
given to EVALUATE. Within the input string, the position to be
interpreted next may be set by reading the value held in
variable >IN and writing it back again. An offset obtained from
scanning in that string may be added or subtracted. The rest of
the string still waiting to be interpreted may be calculated
by: "SOURCE >IN @ /STRING". This leaves a length of zero if the
input stream is empty for this line or block.
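
The phrase quoted above can be wrapped into small helper words.
This is a sketch; REST-OF-INPUT, SKIP-REST and COUNT-REST are
hypothetical names, not words of the system.

```forth
\ Hypothetical names: REST-OF-INPUT, SKIP-REST, COUNT-REST.
: REST-OF-INPUT ( -- c-addr u )  SOURCE >IN @ /STRING ;

\ Setting >IN to the length of the input leaves nothing to
\ interpret in this line, block or EVALUATE string.
: SKIP-REST ( -- )  SOURCE SWAP DROP >IN ! ;

\ Length of what follows in the input, which is then skipped.
: COUNT-REST ( -- u )  REST-OF-INPUT SWAP DROP  SKIP-REST ;
```

Note that for block input SKIP-REST skips the rest of the whole
block, since one block is one input string.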

9. The rest of this string is discarded by REFILL, which then
tries to obtain the next line or block from the input stream. A
flag is returned; it is zero if the input stream was a string
in memory for EVALUATE or if the end of file has been reached.

10. A counted string returned by WORD no longer depends on
being delimited by a blank (or zero in F79). A delimiting blank
is appended anyway after the isolated string has been moved to
the position in memory returned by HERE. The former number
conversion, which needed that delimiting blank, has been
rebuilt upon the behaviour of >NUMBER, which takes as
parameters a double-cell accumulator and the parameters of a
string, and returns these values updated. If the string is
empty afterwards, this shows as a length parameter of zero, so
no delimiter is needed.
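
A number conversion built on >NUMBER, as described above, might
look like this sketch. PARSE-UNSIGNED is a hypothetical name; the
word handles unsigned digits in the current BASE only.

```forth
\ Hypothetical name: PARSE-UNSIGNED.
: PARSE-UNSIGNED ( c-addr u -- ud flag )
   0 0 2SWAP >NUMBER      \ -- ud c-addr' u'
   SWAP DROP 0= ;         \ true if the whole string was converted
```

A remaining length of zero signals success; no delimiting blank
is involved at any point.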

11. The formerly used immediate word ASCII, which obtains a
character from the input stream, has been replaced by
non-immediate CHAR and immediate [CHAR]. EXPECT, which obtains
a string as typed in from the keyboard, has been replaced by
ACCEPT; instead of being held in variable SPAN, the resulting
length of the input string is now returned on the stack. Instead
of the formerly used FORGET <name>, applied after having made
allocations, markers should now be set with MARKER <name>
before making allocations. Executing <name> then has the same
effect as the now obsolete FORGET <name>.
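
The order of steps with MARKER can be sketched as follows.
SCRATCH and TEMP-WORD are hypothetical names for the example.

```forth
MARKER SCRATCH                  \ set the marker first
: TEMP-WORD ( -- )  ." temporary" ;
TEMP-WORD                       \ use the definitions as needed
SCRATCH                         \ removes TEMP-WORD and SCRATCH
```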

---------------------------------------------------------------
---------------------------------------------------------------

*****************************************************
Implementation characteristics and special behaviour:
*****************************************************

By use of ADJUST-SIZES and SAVE-FILE (described elsewhere in
one of these texts), the amount of RAM which the executable
claims from QDOS may be set to individual needs. It is set to
192 kB in this distributed version.

For the File wordset words RENAME-FILE and RESIZE-FILE, the QDOS
extension traps provided by TK2 should be available (Trap 3,
functions $4A and $4B). No other extensions are required, apart
from enough memory, that is at least a 256 kB expansion card on
an original QL.

Floating Point wordset is missing.

Only two internal throw codes are implemented, these are -2 for
ABORT" and -1 for ABORT. "-254 THROW" will give the behaviour of
"BYE" (this may change in future versions).

Environmental queries with ENVIRONMENT? deliver only a dummy
argument of zero.

The executable claims 192 kB from the operating system at
startup: 112 kB for code, 48 kB for headers and 32 kB for data.
With the block editor loaded as-is, about 42 kB for code, 22 kB
for headers and 19 kB for data remain free for use. After
execution of the marker NOEDIT, which removes the editor, this
increases to 59 kB, 27 kB and 23 kB. The sizes may be adjusted
to individual needs by use of ADJUST-SIZES and SAVE-FILE, taking
effect at the next start-up of the saved executable.

QDOS channel IDs are held in an internal table keeping a maximum
of 32 values. Forth I/O-handle parameters for I/O-access are offsets
into this table. Positions in this table having become free are
re-used.

Memory in data area is addressable as cells only with aligned
addresses.

EMIT of non-graphical characters is handled by QDOS, which
prints a filled rectangle.

Editing of keyboard input via ACCEPT is done via the QDOS trap
IO.FLINE, so command line history works if implemented, as do
deleting, cursor positioning and inserting of characters.
IO.FLINE is started as a secondary job from QDOS. The Forth job
is named "ZQF" in the QDOS job table; the secondary job is
named "ZQF/con_".

The call terminates after the given string length for ACCEPT
has been input or the ENTER key is pressed. The code for the
ENTER key is the code for LF.

KEY will act upon character input in the range of character
codes 0 ... 255. Keypresses giving other values are discarded if
waiting or scanning for input with KEY or KEY?. The set of all
keycodes may be obtained by use of EKEY or scanned for by use of
EKEY?.

A character storage cell has the width of one address unit. At
any time the machine may be regarded as character aligned. The
only memory operator acting upon address units is MOVE. All
other memory operators will act upon character or cell units.

No provision is made to store definition names containing
non-printable characters via keyboard or ASCII file input. If
such names have been defined, using input from block files or
via EVALUATE, the corresponding execution tokens may be found
the same way.

If the input stream is switched to an ASCII file and
interpreting is done via INCLUDE-FILE or INCLUDED, characters
with codes smaller than 32 are treated as white space, except
for the codes for CR and LF, which are taken as EOL markers.
End of line is reached with CR, LF, or the sequence CR LF.

While compiling, the control flow stack for compiling
conditionals is the parameter stack. The non-immediate words
CS-PICK and CS-ROLL copy or exchange control flow values.

A value greater than 36 in variable BASE gives undefined
behaviour for digit conversion. A value of up to 36 provides
conversion to digits 0...9,A...Z. DECIMAL resets conversion to
digits 0...9.
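
Since BASE governs digit conversion, a word that needs a fixed
base should save and restore it. The following is a sketch using
pictured numeric output; >HEXSTR and .HEX are hypothetical names.

```forth
\ Hypothetical names: >HEXSTR, .HEX.
: >HEXSTR ( u -- c-addr len )
   BASE @ >R  HEX           \ remember the caller's base
   0 <# #S #>               \ convert u as an unsigned double
   R> BASE ! ;              \ restore the caller's base

: .HEX ( u -- )  >HEXSTR TYPE ;

DECIMAL 255 .HEX            \ prints FF
```

The string returned by >HEXSTR lives in the pictured numeric
output buffer and is only valid until that buffer is used again,
so it should be typed or copied right away.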

An ABORT" <text>" sequence is compiled as a sequence starting
with a conditional branch and an inline string containing
<text>.

Maximum sizes are: 255 characters for counted strings, 30 for
definition names and the output of WORD. Parsed strings may be
as long as the input field they are taken from. If 'S"' is
executed while interpreting, the parsed string is kept in a
circular buffer, and at least the last two input strings are
available, each with a maximum size of 255 characters.

User input and output devices are opened at program start. The
first input device is the command line as given to QDOS as a
parameter with the program call. This line is evaluated. After
that, all further input is taken via the QDOS call IO.FLINE. The
user output device is a console window set to no border,
CSIZE 0,0, INK 7, PAPER 0 at position 0, 0, sized 512 * 256,
with the cursor switched on and the window cleared. First
IO.FLINE and the output window are set up, then the command
line is evaluated, then the banner is shown. This behaviour may
change or be made configurable in future versions.

The accessible dictionary space is data space only; it is
separate from code space and name space.

One address unit is 8 bits wide.

Numbers are stored in cells sized 4 address units, in two's
complement representation for signed values.

Single cell signed numbers may take values from -2^31 to 2^31-1.
Single cell unsigned numbers may take values from 0 to 2^32-1.
Positive single cell numbers may take values from 0 to 2^31-1.

Double cell signed numbers may take values from -2^63 to 2^63-1.
Double cell unsigned numbers may take values from 0 to 2^64-1.
Positive double cell numbers may take values from 0 to 2^63-1.

The data space of the dictionary is contiguous. Stack and return
stack are kept inside this area and should be regarded as
non-addressable; their positions may change with heap
allocations. Strings returned by S", C", WORD, PARSE, SOURCE
should be regarded as read-only and be copied before they are
written to.

Buffer for WORD is shared with buffer for pictured number
conversion and kept at the address returned by HERE. WORD will
not return a string with a size greater than 31 characters.

Storage area for one cell is sized 4 address units.
Storage area for one character is sized 1 address unit.

Keyboard input terminal buffer is sized 128 characters.

Storage area for pictured numeric output string is sized 64
characters, 32 of which are shared with the buffer for WORD.

The scratch area starting at the address returned by PAD
comprises all unused data memory. So " PAD UNUSED " gives the
maximum temporary storage area.
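
As the scratch area at PAD is writable, it is a natural place to
copy a read-only string before changing it. A sketch, with the
hypothetical names >PAD and MASK-FIRST; the string must have at
least one character and must fit into UNUSED:

```forth
\ Hypothetical names: >PAD, MASK-FIRST.
: >PAD ( c-addr u -- pad u )
   DUP >R  PAD SWAP MOVE     \ copy u characters to PAD
   PAD R> ;

\ Overwrite the first character of the copy, not of the source.
: MASK-FIRST ( c-addr u -- pad u )
   >PAD  OVER [CHAR] * SWAP C! ;
```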

Finding definition names internally is done by uppercasing the
result of WORD from keyboard input and uppercasing the names
searched for. Names of new definitions are stored uppercase.
This case-independent behaviour may be switched off by writing 0
to variable CAPS. After that, names of new definitions are
stored as given, and finding is case-dependent, with input taken
as given.

The system prompt, after a line from the keyboard has been
successfully interpreted, is "ok" followed by as many dots as
there are items on the stack, but not more than 32. If the
machine is in the compiling state, "]" is output instead of
"ok". The cursor is then set to a new line.

Operators * / MOD /MOD */ */MOD are not provided and have to be
defined using the preferred rounding method, given either by
SM/REM, symmetric, or by FM/MOD, towards negative infinity.

Definitions by using SM/REM are:

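: /MOD   ( n1 n2 -- rem quot )      over 0< swap sm/rem ;
: /      ( n1 n2 -- quot )          /mod swap drop ;
: MOD    ( n1 n2 -- rem )           /mod drop ;

: *      ( n1 n2 -- n3 )            m* drop ;
: */MOD  ( n1 n2 n3 -- rem quot )   >R m* R> sm/rem ;
: */     ( n1 n2 n3 -- quot )       */mod swap drop ;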
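
For floored division the same pattern can be sketched with
FM/MOD. The names below are hypothetical and chosen so as not to
clash with the symmetric definitions; in practice the standard
names would be used with whichever method is preferred.

```forth
\ Hypothetical names: FLOOR/MOD, FLOOR/, FLOORMOD.
: FLOOR/MOD ( n1 n2 -- rem quot )  OVER 0< SWAP FM/MOD ;
: FLOOR/    ( n1 n2 -- quot )      FLOOR/MOD SWAP DROP ;
: FLOORMOD  ( n1 n2 -- rem )       FLOOR/MOD DROP ;
```

The two methods differ only when the operands have different
signs: -7 divided by 2 gives quotient -3 and remainder -1 with
SM/REM, but quotient -4 and remainder 1 with FM/MOD.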

In compiling state, variable STATE contains -1, else 0.

Arithmetic overflow or underflow gives results modulo 2^32,
with the operands regarded as unsigned numbers. Division by zero
gives results of 0 for quotient and remainder.

While the runtime part of a definition is being compiled in a
DOES> clause, the name of the defining word whose compilation
is in progress will not be found. It is revealed only after the
definition is finished with SEMI, ';'.

No system words will use the scratch area starting at PAD.

Return stack size is 256 cells.
Parameter stack size is all of unused memory, growing downwards,
as far as that memory is not used as scratch area starting at
PAD, growing upwards.
UNUSED takes a stack size of 256 cells into account.

---------------------------------------------------------------
---------------------------------------------------------------
