RHYBAG usage and file format:
============================

Here are basic notes on the usage and file format of `rhybag'.


BUILDING:
--------

Run the `mk' script to build the optimised executable.  This uses the
Perl-script `strip_warnings' to remove warnings that should be
ignored.  If you don't have Perl, edit out this part of the `mk'
script, and put up with lots of warnings about multi-character
constants.


USAGE:
-----

rhybag: Rhythmic Binaural Generator, version 0.1.0
Copyright (c) 1999 Jim Peters, released under the GNU GPL

Usage: rhybag [options] sequence-file [sequence]
   Defaults to playing `main' sequence if none specified
Options:
 -o file      Output raw data to a file instead of to /dev/dsp
 -o file.wav  Output WAV-format data to a file
 -O           Output raw data to standard output

 -r rate      Manually select an output rate (Hz, default is 44100Hz)
 -S fade_ms   Set the time over which to fade sudden volume changes (def 5ms)
 -A vol       Set master volume to given %age of maximum possible, calculated
                automatically (default is 95%)
 -M vol       Manually set master volume level (%age, 100 == as written)
 -D           Generate debugging dump of notes and sequences
 -DD          Also generate debugging dump of rendering events

By default `rhybag' looks for a sequence called `main', and tries to
play that, but it can be set to play another sequence by specifying
this on the command line.

Wave-format headers are automatically written to any filename ending
in .WAV (case-insensitive).  Otherwise the output is raw data.  Only
16-bit stereo data is output at present.

The output rate defaults to 44100Hz, but this may be altered using the
-r option.

By default all sudden changes in volume (except those at the start of
a note) are smoothed by converting them into a volume slide of 5ms
duration.  This helps to avoid clicking.  To adjust or remove this
slide, use the -S option.

The utility automatically calculates the maximum possible amplitude
that could be generated by a sequence (assuming that all samples could
reach the maximum of their 16-bit range, and that wave-peaks of
binaurals may coincide).  It then adjusts the amplitude so that volume
levels could peak at 95% of the maximum possible.

To automatically adjust to another level rather than 95%, use the -A
option.  Levels above 100 may give interesting distortion effects -
overflow values wrap around.
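
As a rough illustration of what `wrap around' means for 16-bit output,
here is a minimal Python sketch.  It is not taken from the `rhybag'
source: the function names are hypothetical, and it assumes the level
is applied as a plain percentage to each sample before the result is
truncated to a signed 16-bit value.

```python
def wrap_int16(value):
    """Wrap an integer into the signed 16-bit range (-32768..32767),
    the way two's-complement overflow would."""
    return (value + 32768) % 65536 - 32768


def scale_sample(sample, level_percent):
    """Scale a sample by a percentage level, wrapping on overflow.
    (Assumed behaviour, for illustration only.)"""
    return wrap_int16(int(round(sample * level_percent / 100.0)))


# A loud sample pushed to 120% overflows and reappears as a large
# negative value, which is heard as distortion.
print(scale_sample(30000, 120))
```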

To adjust the level manually, use -M.  If you run first without -A or
-M you will see the equivalent -M values that the utility has
automatically calculated, and you can use these as a guideline for
choosing your own -M value.  -M is useful when volume levels must
agree between samples generated from separate runs of `rhybag'.
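
The relationship between an automatic level and its equivalent -M
value can be sketched as follows.  This is only an inference from the
description above, not the actual calculation in `rhybag': it assumes
the worst-case peak is expressed on the same scale as note amplitudes
(where 100 is full scale at -M 100), and the function name is
hypothetical.

```python
def equivalent_manual_level(peak_amp, auto_percent=95.0):
    """Return the -M percentage at which a worst-case peak of
    `peak_amp' (100 == full scale at -M 100) would land at
    `auto_percent' of full scale."""
    if peak_amp <= 0:
        raise ValueError("peak amplitude must be positive")
    return auto_percent * 100.0 / peak_amp


# If all notes together could reach twice full scale, the level has
# to be roughly halved to peak at 95%.
print(equivalent_manual_level(200.0))
```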

Debugging output may be generated with -D or -DD.


FILE FORMAT:
===========

The file is considered a sequence of words.  A word is any sequence of
characters between white space.  However, the characters `(', `)', `;'
and `,' always count as separate words, whether or not there is white
space around them.

Comments start with a word starting with '#', and continue to the end
of the line.  This means that '#' embedded in words is okay.
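
The word and comment rules can be sketched in a few lines of Python.
This is an illustration only, not the parser used by `rhybag'; the
names `WORD_RE' and `split_words' are made up for the example.

```python
import re

# A word is either one of the special characters ( ) ; , or a run of
# characters containing neither white space nor those four characters.
WORD_RE = re.compile(r"[();,]|[^\s();,]+")


def split_words(text):
    """Split sequence-file text into words, dropping '#' comments."""
    words = []
    for line in text.splitlines():
        for word in WORD_RE.findall(line):
            if word.startswith("#"):
                break  # a comment runs to the end of the line
            words.append(word)
    return words


print(split_words("seq main (aa aa);  # play twice"))
```

Note that a word such as `a#1' is kept whole, because only a `#' at
the start of a word begins a comment.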

At the top level, the file is structured as a sequence of directives,
each of which may span many lines and ends in a semicolon.  The
syntax is C-like in its use of semicolons, brackets and so on.

Option directives appear at the top of the file, and take the form:

       opt options ... ;

They allow command-line options to be included in the file.  If there
are both command-line options and `opt' options, then the command-line
options are processed last, taking precedence over the options in the
file.
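
The precedence rule amounts to applying the file's options first and
the command-line options second.  A minimal sketch, assuming options
are held as a simple mapping from option letter to value (this
representation and the function name are hypothetical):

```python
def effective_options(file_opts, cmdline_opts):
    """Combine `opt' options from the file with command-line options.
    Command-line options are applied last, so they win on conflict."""
    merged = dict(file_opts)
    merged.update(cmdline_opts)
    return merged


# The file asks for 22050Hz, but the command line overrides it.
print(effective_options({"-r": "22050", "-S": "10"}, {"-r": "44100"}))
```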

After these come note and sequence definitions, taking the forms:

      note <name> bin <binaural-envelope>;
      note <name> samp [<sequence-name>] <samp-options> <samp-envelope>;
      note <name> samp <file-name> <samp-options> <samp-envelope>;
      note <name> mix <mix-note-list>;
      seq <name> <sequence>;

Note that in general, there should be no forward references.  This
means that notes must be defined before they are used, and sequences
must be defined before they are called by other sequences.


Binaural notes
--------------

Binaural notes are defined using `note ... bin'.  The binaural
envelope is a list along the lines of:

      time:carr+bin/amp [, ->] ...

Time is a time relative to the start of the note.  Times may be
negative (allowing the note to start before its official starting
time).  Times are in ms, although other units are available.  For
example, `1h20m3s400' is 1 hour, 20 minutes, 3 seconds and 400 ms.  A
relative time starts with `+', and is relative to the previous
binaural spec.
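
To make the time notation concrete, here is a small Python sketch that
converts a time word to milliseconds.  The grammar it accepts (an
optional sign, then optional `h', `m' and `s' parts, then optional
trailing milliseconds, all whole numbers) is an assumption based on
the examples in this file; the real parser may accept more.

```python
import re

# Assumed grammar: [+|-][<n>h][<n>m][<n>s][<n>]
TIME_RE = re.compile(r"^([+-]?)(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?(\d+)?$")


def parse_time(word):
    """Return (milliseconds, is_relative) for a time word.
    A leading `+' marks a time relative to the previous spec."""
    match = TIME_RE.match(word)
    if not match or not any(match.groups()[1:]):
        raise ValueError("bad time: %r" % word)
    sign, hours, minutes, seconds, millis = match.groups()
    total = (int(hours or 0) * 3600000 + int(minutes or 0) * 60000
             + int(seconds or 0) * 1000 + int(millis or 0))
    if sign == "-":
        total = -total
    return total, sign == "+"


print(parse_time("1h20m3s400"))
```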

Carrier frequency `carr' is in Hz, although frequencies may also be
specified in centi-semitones relative to middle-C by following the
number with a `c' (0c for middle-C, 1200c for an octave above, etc)
(@@@ this feature not yet tested).  The binaural frequency `bin' can
only be specified in Hz.  The sign of the frequency is used to select
which channel carries the higher frequency.
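
The centi-semitone notation corresponds to the usual equal-tempered
formula, where each 1200 units doubles the frequency.  The sketch
below assumes middle-C is about 261.63Hz (equal temperament with
A = 440Hz); this file does not say which reference `rhybag' uses, so
treat that constant, and the function name, as assumptions.

```python
# Assumed reference: equal temperament with A above middle-C at 440Hz,
# which puts middle-C nine semitones lower.
MIDDLE_C_HZ = 440.0 * 2.0 ** (-9.0 / 12.0)


def parse_carrier(word):
    """Return a carrier frequency in Hz from a word such as `100'
    (plain Hz) or `1200c' (centi-semitones relative to middle-C)."""
    if word.endswith("c"):
        return MIDDLE_C_HZ * 2.0 ** (float(word[:-1]) / 1200.0)
    return float(word)


print(parse_carrier("0c"), parse_carrier("1200c"))
```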

Amplitudes are on an arbitrary scale where a single tone with
amplitude 100 would be the loudest tone that could possibly be output
without clipping if no volume adjustments are made (-M 100).

The `time:carr+bin/amp' spec must be all-one-word without spaces, and
if it is not the last, it should be connected to the next with either
`,' or `->'.  `,' indicates a sudden change at the start of the next
part, and `->' indicates a slide.

Slides are always exponential in amplitude and frequency.  This gives
linear decibel changes in sound volume, and linear semitone changes in
pitch.  The maximum slide range is 16 octaves, or a 2^16 change.  For
slides to zero (for example 0 amplitude), a true exponential slide
would never arrive, and in this case the maximum possible slide is
used (all 16 octaves), sliding to 1/65536th of the non-0 value, rather
than actually to zero.
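
An exponential slide between two values can be sketched as below.
This is an illustration of the idea rather than the code in `rhybag':
the function name is hypothetical, it handles positive values only,
and it does not clamp slides between two non-zero values that are
more than 16 octaves apart.

```python
MAX_RATIO = 65536.0  # 16 octaves, i.e. 2**16


def slide(start, end, frac):
    """Value at fraction `frac' (0.0 to 1.0) of the way through an
    exponential slide from `start' to `end'."""
    if start == 0 and end == 0:
        return 0.0
    # A zero endpoint is replaced by 1/65536th of the non-zero value.
    if start == 0:
        start = end / MAX_RATIO
    if end == 0:
        end = start / MAX_RATIO
    return start * (end / start) ** frac


# Half-way from 100Hz to 400Hz is one octave up, not the arithmetic
# mean of 250Hz.
print(slide(100.0, 400.0, 0.5))
```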

Here is an example binaural note:

      note aa bin
       -200:6400+0/1 ->
       0:100+1.5/20 ->
       3s:400+6.0/20 ->
       3s200:3200+0/1;

Note that if the binaural frequency is 0 (giving the same frequency in
both channels), then the waves are started in phase.  Otherwise the
waves are started out of phase.  This is done because when the waves
are out of phase, this gives the low-point in the binaural envelope.
@@@ Perhaps this should be changed ... ?


Sample-based notes
------------------

Sample-based notes are based on a sample, either loaded from disk or
generated from a sequence, which is then shaped in pitch and
amplitude by an envelope.  The definition takes one of the forms:

      note <name> samp [<sequence-name>] <samp-options> <samp-envelope>;
      note <name> samp <file-name> <samp-options> <samp-envelope>;

The <samp-options> for sample-files are as follows.  Only headerless
raw sample-files are handled at the moment (@@@), and they are assumed
to contain 16-bit stereo data at the same rate as the output rate (-r
option) unless otherwise specified.

      mono	Specify that the sample-file is mono rather than stereo
      loop	Specify that the sample-file may be looped
      @freq	Specify that the sample was recorded at `freq'Hz, and adjust
		  accordingly to output it correctly
      p###	Specify that the sample has a section `###' samples long that
		  should be played before the note's official start-time.

For samples generated from sequences, the following <samp-options>
apply.  The sequence is first rendered to disk, and the note is then
played directly from that file.  The volume levels are automatically
adjusted to 95% of maximum possible (@@@ change this ?).

      mono	Force the sample to be rendered in mono rather than stereo
      loop	Generate a perfectly-loopable sample from the sequence

When a loopable sample is generated, it will be exactly the same
length as the sequence itself.  It is perfectly loopable, because it
is generated so that notes playing at the end of the loop wrap around
to the start of the loop without any glitches - similarly for notes at
the start which have a warm-up part that goes before time zero - these
wrap around to the end of the sample.  Playing the loopable sample
over and over has the same effect as playing the sequence over and
over.

In both cases (loopable and non-loopable), all of the time from minus
infinity to plus infinity is accounted for.  For loopable samples, the
sample is repeated before
time 0 (the official start of the note) as many times as necessary to
fill the envelope, and similarly after time 0.  The start of the loop
will always coincide with time zero, even if pitch-shifting has been
taking place before time 0.
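
For the simple case with no pitch-shifting, repeating a loop in both
directions from time 0 comes down to a modulo on the sample offset.
A minimal sketch, with a hypothetical function name, that ignores
pitch-shifting entirely:

```python
def loop_index(offset, loop_len):
    """Map a sample offset relative to time 0 (negative offsets are
    before the note's official start) onto a position within a loop
    of `loop_len' samples.  Offset 0 is always the start of the loop."""
    return offset % loop_len


# One sample before time 0 is the last sample of the loop.
print(loop_index(-1, 1000), loop_index(2500, 1000))
```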

For non-loopable samples, there may be part of the sample which plays
before time 0, and the remainder of the sample follows on until it is
exhausted.  @@@ At the moment this is not handled correctly - only
samples which start at time 0 really work.  This will be fixed
eventually, when I can find a good way to handle pitch-shifting
effects on the start of the note (which may happen even if the user is
not pitch-shifting, for example when the sample-rate has to be
adjusted).

The pitch/amplitude envelope consists of a sequence of specifications
similar to the binaural specifications:

      time:pitch/amp [, ->] ...

This is mostly the same as the binaural specification, except that the
pitch-shift value `pitch' is measured in centi-semitones relative to
the natural pitch of the sample, for example 1200 for an octave up,
-700 for a fifth below.

Note that there are some limitations in the sample-playing mechanism
that affect very large samples playing at altered pitches.  Basically,
the longer the sample, the bigger the steps between playback rates.
It is always possible to play back a sample at its original rate, and
the maximum sample-length at 44.1kHz is 18 hours at this rate (5GB
file if mono).  For a more realistic sample-length of 1 hour, say, we
are left with 4 bits of precision, giving a next-step-up of 1
semitone, but nothing in between.  For a short sample-length of 1
minute, there are 10 bits of precision, with a next step up of 2/100
semitones.  It is possible to use very short loopable samples as
wave-generators, in which case, the precision will be large, which is
exactly what we want.
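
The figures above can be reproduced with a little arithmetic.  The
sketch below is one reading of those numbers, not a description of the
internal representation: it assumes the precision in bits is the
whole-number part of log2(18 hours / sample length), and that the
smallest step up from the original rate is a factor of 1 + 2^-bits.
Both function names are hypothetical.

```python
import math

MAX_HOURS = 18.0  # maximum sample length at the original rate (from the text)


def precision_bits(length_hours):
    """Bits of playback-rate precision left for a sample of this length."""
    return int(math.floor(math.log2(MAX_HOURS / length_hours)))


def step_cents(bits):
    """Smallest step up from the original rate, in centi-semitones."""
    return 1200.0 * math.log2(1.0 + 2.0 ** -bits)


for hours in (1.0, 1.0 / 60.0):
    bits = precision_bits(hours)
    print(hours, bits, step_cents(bits))
```

Under these assumptions a 1-hour sample gives 4 bits and a step of
about 105 centi-semitones (roughly a semitone), and a 1-minute sample
gives 10 bits and a step of about 1.7 centi-semitones.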

The sample-files reside on disk, and are mapped into memory rather
than loaded.  This means that the Linux paging/swapping mechanism
takes care of loading the data.  However, there are still delays in
this method, and I think I'll have to put in some buffering on the
output to cover this (@@@).


Triggered notes  (@@@ NYI)
---------------

A triggered note fires another note repeatedly, based on the phase of
a binaural note.

      note <name> trigger <percentage> <binaural-note> <triggered-note>;

The trigger point is specified as a percentage, where 0% and 100% are
the quiet parts and 50% is the loud part of the binaural note.  The
triggered note can be of any type.


Mix notes
---------

These are notes that are built up from a number of other notes of any
type (including other mix notes).

      note <name> mix <mix-note-list>;      

The notes making up the list are each specified as follows:

      time:name/vol

The `time' specifies the time relative to the start-time of the note
at which the note named `name' is started.  `vol' is the adjustment to
the note's volume level that should be applied - 100 indicates no
change (100%), 50 means half-amplitude, etc.


Sequences
---------

Sequences consist of a number of notes arranged in parallel and series
according to timing indications.
		  
      seq <name> <sequence>;

The sequence itself consists of the following constructions:

  <seq> | <seq>

        Play the two sequences in parallel (i.e. simultaneously)

  ( <seq> )
  ( <seq> ) x##

        Group notes together to control timing, etc, locally, or to
	limit how far the | parallel operator reaches.  If a `x##'
	sequence follows, repeat the enclosed sequence `##' times.

  [<seq-name>]
  [<seq-name>] x##

	Include the named sequence at this point, repeating it `##'
	times if the `x##' sequence follows.  The time it takes up is
	the logical length of the sequence.

  note-name{' '' '''}{. .. ...}

	Include a note at this point.  The length is the current
	beat-length, modified by quotes and dots that follow.  For
	each quote the length is halved, and for each dot the length
	steps half-way towards double the original length.  This means
	that "aa aa" takes two beats, and so does "aa. aa'", and so
	does "aa.. aa''".  This emulates dotted and tailed notes from
	normal musical notation.

	Note that the note-length is only a logical length.  In fact
	the note is just triggered at this point, and it may actually
	start playing before this time, and it may extend far beyond
	the logical length.

  {_ __ ___}{' '' '''}{. .. ...}

        Include a rest at this point.  This is just used to indicate a
        space in the timing.  The length is the number of underscores,
        which is then adjusted by quotes and dots as before.

  ##ms
  ##bpm
	
	Specify the base beat-length at this () level and deeper
	levels.  The default is 240bpm, equal to 250ms.  Once past the
	next closing bracket at this level, this new setting is
	forgotten, and whatever rate was set when the opening bracket
	was reached is restored - this means that the change is local
	to a bracket level.

  ##/##

	Adjust the beat-length relative to the base beat-length (set
	by the last ##ms or ##bpm sequence).  ##/## means `fit ##
	notes into the time of ##', so 3/2 means three notes in the
	space of two.  Useful for unusual rhythms.
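
The timing arithmetic for notes, rests and beat-lengths described in
the list above can be sketched in Python as follows.  The function
names are hypothetical, and the sketch assumes that quotes and dots
simply multiply together, which is consistent with the "aa. aa'"
example but is not confirmed against the source.

```python
def beat_ms_from_bpm(bpm):
    """Beat-length in ms for a `##bpm' setting."""
    return 60000.0 / bpm


def apply_ratio(base_ms, notes, beats):
    """Beat-length for `notes/beats': fit `notes' notes into the time
    normally taken by `beats' beats."""
    return base_ms * beats / notes


def element_ms(beat_ms, quotes=0, dots=0, beats=1):
    """Logical length of a note (beats=1) or of a rest (beats = number
    of underscores).  Each quote halves the length; each dot steps
    half-way towards double the length."""
    return beat_ms * beats * (2.0 - 0.5 ** dots) / 2 ** quotes


beat = beat_ms_from_bpm(240)  # the default beat-length, 250ms
print(element_ms(beat, dots=1) + element_ms(beat, quotes=1))
print(apply_ratio(beat, 3, 2))
```

With the default 250ms beat, "aa. aa'" comes to 375ms + 125ms, the
same 500ms as "aa aa".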


Each element in the sequence takes up a certain amount of logical
time, and each series of elements also.  When series appear in
parallel, the length of the combined sequence is the length of the
longest one.  Eventually this works its way up through the brackets to
give the total length of the whole sequence.  This is what is used
when, for example, a loopable sample is generated from the sequence,
or when the sequence is included in another sequence.
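
The way logical lengths combine can be sketched as a small recursive
function.  The tuple-based tree used here is purely hypothetical, and
two points are assumptions rather than statements from this file: that
a series is as long as the sum of its elements, and that `x##'
multiplies the length by the repeat count.

```python
def logical_length(node):
    """Logical length of a sequence tree.  A node is either a number
    (the length of a single note or rest) or one of the tuples
    ("series", [children]), ("parallel", [children]) or
    ("repeat", child, count)."""
    if isinstance(node, (int, float)):
        return node
    kind = node[0]
    if kind == "series":
        return sum(logical_length(child) for child in node[1])
    if kind == "parallel":
        # Parallel parts take as long as the longest one.
        return max(logical_length(child) for child in node[1])
    if kind == "repeat":
        return logical_length(node[1]) * node[2]
    raise ValueError("unknown node kind: %r" % (kind,))


# Three 250ms notes in parallel with two 250ms notes repeated twice.
tree = ("parallel", [("series", [250, 250, 250]),
                     ("repeat", ("series", [250, 250]), 2)])
print(logical_length(tree))
```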
  

FUTURE PLANS: (@@@)
------------

- Allow length to be fed to a note, and allow it to specify times in
  terms of its length.

- Allow triggered notes to specify start and end trigger percentages
  which gives a length to the triggered notes.  This would allow
  flashing light-glasses (assuming we could generate high-pitched
  tones to L or R channel).

- When specifying envelopes, and using mostly relative times (+100,
  etc), it seems awkward because the relative time is in fact the
  length of the previous part, which then seems to be written on the
  wrong line.  Maybe find some way to put the length on the previous
  part, and omit the time: spec from the next part.

- Add buffering of, say, 2 to 5 seconds to absorb delays caused by
  Linux loading up parts of the samples.

- Add code to do phase-shifting of samples in L+R channels to generate
  binaural-type `helicopter' effects described by users of CoolEdit.

- Allow binaural beat frequency to be omitted from spec - to say
  100/10 instead of 100+0/10, or 100/0 instead of 100+0/0

- Allow the initial phase-difference of binaurals to be specified

- Allow pitch-shift to be specified for sample instruments so that
  it's easy to use them as wave-table style loops

- Allow <pink> and <white> where the sample filename should go to
  generate noise.  Write a loopable sample file first.

- Extend this to generate all kinds of different loopable samples
  (e.g. square-wave, tri-angle, whatever).

- Keep the temporary files in a cache so that we don't have to
  regenerate them each time.

- Allow reading WAV files to pick up automatically loop points,
  frequency, mono/stereo, etc.

- Oversampling levels x2, x4

- Option to scan samples to determine max level ???

- 8-bit output ?