RHYBAG usage and file format:
============================

Here are basic notes on the usage and file format of `rhybag'.


BUILDING:
--------

Run the `mk' script to build the optimised executable.  This uses the
Perl-script `strip_warnings' to remove warnings that should be
ignored.  If you don't have Perl, edit out this part of the `mk'
script, and put up with lots of warnings about multi-character
constants.


USAGE:
-----

rhybag: Rhythmic Binaural Generator, version 0.1.0
Copyright (c) 1999 Jim Peters, released under the GNU GPL

Usage: rhybag [options] sequence-file [sequence]
  Defaults to playing `main' sequence if none specified

Options:
  -o file      Output raw data to a file instead of to /dev/dsp
  -o file.wav  Output WAV-format data to a file
  -O           Output raw data to standard output
  -r rate      Manually select an output rate (Hz, default is 44100Hz)
  -S fade_ms   Set the time over which to fade sudden volume changes
               (def 5ms)
  -A vol       Set master volume to given %age of maximum possible,
               calculated automatically (default is 95%)
  -M vol       Manually set master volume level (%age, 100 == as
               written)
  -D           Generate debugging dump of notes and sequences
  -DD          Also generate debugging dump of rendering events

By default `rhybag' looks for a sequence called `main', and tries to
play that, but it can be set to play another sequence by specifying
this on the command line.

Wave-format headers are automatically written to any filename ending
in .WAV (case-insensitive).  Otherwise the output is raw data.  Only
16-bit stereo data is output at present.

The output rate defaults to 44100Hz, but this may be altered using the
-r option.

By default all sudden changes in volume (except those at the start of
a note) are smoothed by converting them into a volume slide of 5ms
duration.  This helps to avoid clicking.  To adjust or remove this
slide, use the -S option.
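
As a rough illustration of two of the rules above (output format
chosen from the filename, and the length of the -S fade in samples),
here is a minimal Python sketch.  The function names are hypothetical
and are not taken from the `rhybag' source; rounding the fade length
down to a whole sample is also an assumption.

```python
def output_format(filename: str) -> str:
    """Return 'wav' for names ending in .wav (any case), else 'raw'."""
    return "wav" if filename.lower().endswith(".wav") else "raw"


def fade_length_samples(fade_ms: float, rate_hz: int = 44100) -> int:
    """Number of whole output samples covered by a fade of fade_ms.

    Assumption: a partial sample is simply dropped.
    """
    return int(fade_ms * rate_hz / 1000)


if __name__ == "__main__":
    print(output_format("out.WAV"))   # wav
    print(output_format("out.dat"))   # raw
    print(fade_length_samples(5))     # 220 (default fade at 44100Hz)
```

With the default settings the 5ms fade therefore covers about 220
samples per channel.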

The utility automatically calculates the maximum possible amplitude
that could be generated by a sequence (assuming that all samples could
reach the maximum of their 16-bit range, and that wave-peaks of
binaurals may coincide).  It then adjusts the amplitude so that volume
levels could peak at 95% of the maximum possible.

To automatically adjust to another level rather than 95%, use the -A
option.  Levels above 100 may give interesting distortion effects -
overflow values wrap around.

To adjust the level manually, use -M.  If you run first without -A or
-M you will see the equivalent -M values that the utility has
automatically calculated, and you can use these as a guideline for
choosing your own -M value.  -M is useful when volume levels must
agree between samples generated from separate runs of `rhybag'.

Debugging output may be generated with -D or -DD.


FILE FORMAT:
===========

The file is considered a sequence of words.  A word is any sequence of
characters between white space.  However, the characters ();, always
count as separate words, whether or not there is white space around
them.

Comments start with a word starting with '#', and continue to the end
of the line.  This means that '#' embedded in words is okay.

At the top level, the file is structured as a sequence of directives,
each of which may span many lines, and each of which ends in a
semicolon.  The syntax is C-like in its use of semicolons, brackets
and so on.

Option directives appear at the top of the file, and take the form:

    opt options ... ;

They allow command-line options to be included in the file.  If there
are both command-line options and `opt' options, then the command-line
options are processed last, taking precedence over the options in the
file.

After these come note and sequence definitions, taking the forms:

    note bin ;
    note samp [] ;
    note samp ;
    note mix ;
    seq ;

Note that in general, there should be no forward references.
This means that notes must be defined before they are used, and
sequences must be defined before they are called by other sequences.


Binaural notes
--------------

Binaural notes are defined using `note ... bin'.  The binaural
envelope is a list along the lines of:

    time:carr+bin/amp [, ->] ...

Time is a time relative to the start of the note.  Times may be
negative (allowing the note to start before its official starting
time).  Times are in ms, although other units are available.  For
example, `1h20m3s400' is 1 hour, 20 minutes, 3 seconds and 400 ms.  A
relative time starts with `+', and is relative to the previous
binaural spec.

Carrier frequency `carr' is in Hz, although frequencies may also be
specified in centi-semitones relative to middle-C by following the
number with a `c' (0c for middle-C, 1200c for an octave above, etc)
(@@@ this feature not yet tested).

The binaural frequency `bin' can only be specified in Hz.  The sign of
the frequency is used to select which channel carries the higher
frequency.

Amplitudes are on an arbitrary scale where a single tone with
amplitude 100 would be the loudest tone that could possibly be output
without clipping if no volume adjustments are made (-M 100).

The `time:carr+bin/amp' spec must be all-one-word without spaces, and
if it is not the last, it should be connected to the next with either
`,' or `->'.  `,' indicates a sudden change at the start of the next
part, and `->' indicates a slide.

Slides are always exponential in amplitude and frequency.  This gives
linear decibel changes in sound volume, and linear semitone changes in
pitch.  The maximum slide range is 16 octaves, or a 2^16 change.  For
slides to zero (for example 0 amplitude), a true exponential slide
would never arrive, and in this case the maximum possible slide is
used (all 16 octaves), sliding to 1/65536th of the non-0 value, rather
than actually to zero.
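
The slide rule just described can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the code
used by `rhybag': the name `slide_value' is hypothetical, values are
assumed to be non-negative, and the case where both ends are non-zero
but more than 16 octaves apart is not handled because the text does
not say what happens there.

```python
MAX_RATIO = 2.0 ** 16  # maximum slide range: 16 octaves


def slide_value(start: float, end: float, frac: float) -> float:
    """Value at fraction `frac' (0..1) of an exponential slide.

    A zero endpoint is replaced by 1/65536th of the non-zero endpoint,
    as described for slides to zero.
    """
    if start == 0 and end == 0:
        return 0.0            # nothing to slide between
    if start == 0:
        start = end / MAX_RATIO
    if end == 0:
        end = start / MAX_RATIO
    # Exponential interpolation: equal time steps give equal ratios,
    # i.e. linear change in decibels or semitones.
    return start * (end / start) ** frac


if __name__ == "__main__":
    # Half-way from 100Hz to 400Hz is 200Hz (one octave up, one to go).
    print(slide_value(100.0, 400.0, 0.5))
    # A slide from amplitude 20 to 0 ends at 20/65536, not at zero.
    print(slide_value(20.0, 0.0, 1.0))
```

Note that the half-way point of a 100Hz to 400Hz slide is 200Hz rather
than the linear midpoint of 250Hz.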

Here is an example binaural note:

    note aa bin
        -200:6400+0/1 ->
        0:100+1.5/20 ->
        3s:400+6.0/20 ->
        3s200:3200+0/1;

Note that if the binaural frequency is 0 (giving same frequency in
both channels), then the waves are started in phase.  Otherwise the
waves are started out of phase.  This is done because when the waves
are out of phase, this gives the low-point in the binaural envelope.
@@@ Perhaps this should be changed ... ?


Sample-based notes
------------------

Sample-based notes are based on a sample, which is either loaded from
disk, or is generated from a sequence, which is then shaped in pitch
and amplitude by an envelope.  The definition takes one of the forms:

    note samp [] ;
    note samp ;

The options for sample-files are as follows.  Only headerless raw
sample-files are handled at the moment (@@@), and they are assumed to
contain 16-bit stereo data at the same rate as the output rate (-r
option) unless otherwise specified.

    mono    Specify that the sample-file is mono rather than stereo
    loop    Specify that the sample-file may be looped
    @freq   Specify that the sample was recorded at `freq'Hz, and
            adjust accordingly to output it correctly
    p###    Specify that the sample has a section `###' samples long
            that should be played before the note's official
            start-time.

For samples generated from sequences, the following apply.  This
actually renders the sequence to disk, after which it is played
directly from the file.  The volume levels are automatically adjusted
to 95% of maximum possible (@@@ change this ?).

    mono    Force the sample to be rendered in mono rather than stereo
    loop    Generate a perfectly-loopable sample from the sequence

When a loopable sample is generated, it will be exactly the same
length as the sequence itself.
It is perfectly loopable, because it is generated so that notes
playing at the end of the loop wrap around to the start of the loop
without any glitches - similarly for notes at the start which have a
warm-up part that goes before time zero - these wrap around to the end
of the sample.  Playing the loopable sample over and over has the same
effect as playing the sequence over and over.  In the two cases, all
of the time from -infinity to plus infinity is accounted for.

For loopable samples, the sample is repeated before time 0 (the
official start of the note) as many times as necessary to fill the
envelope, and similarly after time 0.  The start of the loop will
always coincide with time zero, even if pitch-shifting has been taking
place before time 0.

For non-loopable samples, there may be part of the sample which plays
before time 0, and the remainder of the sample follows on until it is
exhausted.  @@@ At the moment this is not handled correctly - only
samples which start at time 0 really work.  This will be fixed
eventually, when I can find a good way to handle pitch-shifting
effects on the start of the note (which may happen even if the user is
not pitch-shifting, for example when the sample-rate has to be
adjusted).

The pitch/amplitude envelope consists of a sequence of specifications
similar to the binaural specifications:

    time:pitch/amp [, ->] ...

This is mostly the same as the binaural specification, except that the
pitch-shift value `pitch' is measured in centi-semitones relative to
the natural pitch of the sample, for example 1200 for an octave up,
-700 for a fifth below.

Note that there are some limitations in the sample-playing mechanism
that affect very large samples playing at altered pitches.  Basically,
the longer the sample, the bigger the steps between playback rates.
It is always possible to play back a sample at its original rate, and
the maximum sample-length at 44.1kHz is 18 hours at this rate (5Gb
file if mono).
For a more realistic sample-length of 1 hour, say, we are left with 4
bits of precision, giving a next-step-up of 1 semitone, but nothing in
between.  For a short sample-length of 1 minute, there are 10 bits of
precision, with a next step up of 2/100 semitones.  It is possible to
use very short loopable samples as wave-generators, in which case, the
precision will be large, which is exactly what we want.

The sample-files reside on disk, and are mapped into memory rather
than loaded.  This means that the Linux paging/swapping mechanism
takes care of loading the data.  However, there are still delays in
this method, and I think I'll have to put in some buffering on the
output to cover this (@@@).


Triggered notes (@@@ NYI)
---------------

This triggers a note repeatedly based on the phase of a binaural note.

    note trigger ;

The trigger point is specified as a percentage, where 0% and 100% are
the quiet parts and 50% is the loud part of the binaural note.  The
triggered note can be of any type.


Mix notes
---------

These are notes that are built up from a number of other notes of any
type (including other mix notes).

    note mix ;

The notes making up the list are each specified as follows:

    time:name/vol

The `time' specifies the time relative to the start-time of the note
at which the note named `name' is started.  `vol' is the adjustment to
the note's volume level that should be applied - 100 indicates no
change (100%), 50 means half-amplitude, etc.


Sequences
---------

Sequences consist of a number of notes arranged in parallel and series
according to timing indications.

    seq ;

The sequence itself consists of the following constructions:

    |
        Play the two sequences in parallel (i.e. simultaneously)
    ( )
    ( ) x##
        Group notes together to control timing, etc, locally, or to
        limit how far the | parallel operator reaches.  If a `x##'
        sequence follows, repeat the enclosed sequence `##' times.
    []
    [] x##
        Include the named sequence at this point, repeating it `##'
        times if the `x##' sequence follows.  The time it takes up is
        the logical length of the sequence.
    note-name{' '' '''}{. .. ...}
        Include a note at this point.  The length is the current
        beat-length, modified by quotes and dots that follow.  For
        each quote the length is halved, and for each dot the length
        steps half-way towards double the original length.  This means
        that "aa aa" takes two beats, and so does "aa. aa'", and so
        does "aa.. aa''".  This emulates dotted and tailed notes from
        normal musical notation.  Note that the note-length is only a
        logical length.  In fact the note is just triggered at this
        point, and it may actually start playing before this time, and
        it may extend far beyond the logical length.
    {_ __ ___}{' '' '''}{. .. ...}
        Include a rest at this point.  This is just used to indicate a
        space in the timing.  The length is the number of underscores,
        which is then adjusted by quotes and dots as before.
    ##ms
    ##bpm
        Specify the base beat-length at this () level and deeper
        levels.  The default is 240bpm, equal to 250ms.  Once past the
        next closing brace at this level, this new setting is
        forgotten, and whatever rate was set when the open brace was
        reached is restored - this means that the change is local to a
        brace level.
    ##/##
        Adjust the beat-length relative to the base beat-length (set
        by the last ##ms or ##bpm sequence).  ##/## means `fit ##
        notes into the time of ##', so 3/2 means three notes in the
        space of two.  Useful for unusual rhythms.

Each element in the sequence takes up a certain amount of logical
time, and each series of elements also.  When series appear in
parallel, the length of the combined sequence is the length of the
longest one.  Eventually this works its way up through the braces to
give the total length of the whole sequence.  This is what is used
when, for example, a loopable sample is generated from the sequence,
or when the sequence is included in another sequence.
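
The beat-length arithmetic described above can be checked with a
small Python sketch.  The function names are hypothetical and not
taken from the `rhybag' source, and it is an assumption that quotes
and dots combine multiplicatively when both appear on one note.

```python
def beat_ms_from_bpm(bpm: float) -> float:
    """Convert a ##bpm setting to a beat-length in ms."""
    return 60000.0 / bpm


def note_length(beat_ms: float, quotes: int = 0, dots: int = 0) -> float:
    """Logical length of a note with the given quotes and dots.

    Each quote halves the length.  Each dot steps half-way towards
    double the length: x1, x1.5, x1.75, x1.875, ...
    """
    return beat_ms * (2.0 - 0.5 ** dots) / (2 ** quotes)


def tuplet_beat(base_beat_ms: float, notes: int, in_time_of: int) -> float:
    """Beat-length after `notes/in_time_of', e.g. 3/2 is three notes
    in the space of two."""
    return base_beat_ms * in_time_of / notes


if __name__ == "__main__":
    beat = beat_ms_from_bpm(240)                           # 250.0 ms
    print(beat)
    print(note_length(beat, dots=1) + note_length(beat, quotes=1))  # 500.0
    print(note_length(beat, dots=2) + note_length(beat, quotes=2))  # 500.0
    print(3 * tuplet_beat(beat, 3, 2))                     # three notes fill two beats
```

The two sums confirm that "aa. aa'" and "aa.. aa''" each take the same
two beats (500ms at the default rate) as "aa aa".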


FUTURE PLANS: (@@@)
------------

- Allow length to be fed to a note, and allow it to specify times in
  terms of its length.

- Allow triggered notes to specify start and end trigger percentages
  which gives a length to the triggered notes.  This would allow
  flashing light-glasses (assuming we could generate high-pitched
  tones to L or R channel).

- When specifying envelopes, and using mostly relative times (+100,
  etc), it seems awkward because the relative time is in fact the
  length of the previous part, which then seems to be written on the
  wrong line.  Maybe find some way to put the length on the previous
  part, and omit the time: spec from the next part.

- Add buffering of, say, 2 to 5 seconds to absorb delays caused by
  Linux loading up parts of the samples.

- Add code to do phase-shifting of samples in L+R channels to generate
  binaural-type `helicopter' effects described by users of CoolEdit.

- Allow binaural beat frequency to be omitted from spec - to say
  100/10 instead of 100+0/10, or 100/0 instead of 100+0/0

- Allow the initial phase-difference of binaurals to be specified

- Allow pitch-shift to be specified for sample instruments so that
  it's easy to use them as wave-table style loops

- Allow and where the sample filename should go to generate noise.
  Write a loopable sample file first.

- Extend this to generate all kinds of different loopable samples
  (e.g. square-wave, triangle, whatever).

- Keep the temporary files in a cache so that we don't have to
  regenerate them each time.

- Allow reading WAV files to pick up automatically loop points,
  frequency, mono/stereo, etc.

- Oversampling levels x2, x4

- Option to scan samples to determine max level ???

- 8-bit output ?