SoX(1)                         Sound eXchange_ng                        SoX(1)

NAME
       SoX - Sound eXchange_ng, another Swiss Army knife of audio manipulation

SYNOPSIS
       sox_ng [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options] outfile
            [effect [effect-options]] ...

       play_ng [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options]
            [effect [effect-options]] ...

       rec_ng [global-options] [format-options] outfile
            [effect [effect-options]] ...

DESCRIPTION
   Introduction
       SoX  reads  and  writes audio files in most popular formats and can op-
       tionally apply effects to them. It can combine multiple input  sources,
       synthesize  audio  and, on many systems, act as a general purpose audio
       player or as a multitrack audio recorder.  It can also split the  input
       into multiple output files.

       All  SoX  functionality is available using just the sox_ng command.  To
       simplify playing and recording audio, if SoX is invoked as play_ng, the
       output file is automatically set to be the default sound device and, if
       invoked as rec_ng, the default sound device is used as an input source.
       Additionally, the soxi_ng command provides a convenient  way  to  query
       audio file header information.

       The  heart  of  SoX is a library called libsox_ng.  Those interested in
       extending SoX or using it in other programs should refer  to  the  lib-
       sox_ng manual page.

       SoX is a command-line audio processing tool particularly suited to mak-
       ing quick, simple edits and to batch processing.  If you need an inter-
       active, graphical audio editor, use audacity(1).

                                 *        *        *

       The overall SoX processing chain can be summarized as follows:

                    Input(s) -> Combiner -> Effects -> Output(s)

       On the SoX command line, the positions of the Output(s) and the Effects
       are  swapped w.r.t. the logical flow just shown and, while options per-
       taining to files are placed before their respective file names, the op-
       posite is true for effects.  To show how this works in  practice,  here
       is a selection of examples of how SoX might be used.  The simple

          sox_ng recital.au recital.wav

       translates  an  audio  file  in  Sun AU format to a Microsoft WAV file,
       while

          sox_ng recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm

       performs the same format translation  but  also  applies  four  effects
       (down-mix  to  one  channel, sample rate change, fade in and normalize)
       and stores the result at a bit-depth of 16.

          sox_ng -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav

       converts `raw' (a.k.a. `headerless') audio to  a  self-describing  file
       format,

          sox_ng slow.aiff fixed.aiff speed 1.027

       adjusts audio speed,

          sox_ng short.wav long.wav longer.wav

       concatenates two audio files and

          sox_ng -m music.mp3 voice.wav mixed.flac

       mixes together two audio files.

          play_ng "The Moonbeams/Greatest/*.ogg" bass +3

       plays a collection of audio files applying a bass boosting effect,

          play_ng -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1

       plays a synthesized `A minor seventh' chord with a pipe organ sound,

          rec_ng -c 2 radio.aiff trim 0 30:00

       records half an hour of stereo audio and

          play_ng -q take1.aiff & rec -M take1.aiff take1-dub.aiff

       (with  a  POSIX  shell  and  where supported by hardware) records a new
       track in a multitrack recording.  Finally,

          rec_ng -r 44100 -b 16 -e signed-integer -p \
            silence 1 0.50 0.1% 1 10:00 0.1% | \
            sox_ng -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
            newfile : restart

       records a stream of audio such as an LP or cassette and splits it  into
       multiple  audio files at points where there are two seconds of silence.
       Also, it does not start recording until it detects some sound and stops
       after it sees ten minutes of silence.

       The above is just an overview of SoX's capabilities. Detailed  explana-
       tions of how to use all SoX parameters, file formats and effects can be
       found below in this manual, in soxformat_ng(7) and in soxi_ng(1).

   File Format Types
       SoX  can  work with `self-describing' and `raw' audio files.  `Self-de-
       scribing' formats (e.g. WAV, FLAC, MP3) have a header  that  completely
       describes  the  signal  and  encoding attributes of the audio data that
       follows. `raw' or `headerless' formats do not contain this information,
       so the audio characteristics of these must be described on the SoX com-
       mand line, except for a few which can be inferred from the filename ex-
       tension such as .gsm, always -c1 -r8000 -e gsm, and named  raw  formats
       like .f32, .s16 and .ul which give the encoding but not the sample rate
       or number of channels.

       The  following  four characteristics are used to describe the format of
       audio data:

       sample rate
              The sample rate in samples per second (`Hertz' or `Hz').   Digi-
              tal  telephony traditionally uses a sample rate of 8000Hz (8kHz)
              though 16 and even 32kHz are becoming more common. Audio Compact
              Discs use 44100Hz (44.1kHz), Digital Audio Tape  and  many  com-
              puter systems use 48kHz and professional audio systems often use
              96kHz.

       sample size
              The  number of bits used to store each sample.  Today, 16-bit is
              commonly used, 8-bit was popular in the early days  of  computer
              audio and 24-bit is used in the professional audio arena.

       data encoding
              The  way  in  which  each  audio  sample is represented (or `en-
              coded').  Some encodings have variants with  different  byte-or-
              derings  or  bit-orderings, some compress the audio data so that
              it takes up less disk space or transmission bandwidth  than  un-
              compressed formats.  Commonly-used encoding types include float-
              ing point, <mu>-law, ADPCM, signed-integer PCM, MP3 and FLAC.

       channels
              The  number  of  audio  channels  contained  in  the  file.  One
              (`mono') and two (`stereo') are widely used and `surround sound'
              audio typically contains six or more channels.

       The term `bit rate' is a measure of the amount of storage  occupied  by
       an  encoded audio signal per unit of time.  It can depend on all of the
       above and is typically denoted as  a  number  of  kilobits  per  second
       (kbps).   An  A-law telephony signal has a bit rate of 64 kbps, MP3-en-
       coded stereo music typically has a bit rate of 128-196 kbps  and  FLAC-
       encoded stereo music typically has a bit rate of 550-760 kbps.

       Most self-describing formats also allow textual `comments' to be embed-
       ded  in  the  file  that can be used to describe the audio in some way,
       e.g. for music, the title, the author, etc.

       One important use of audio file comments is to convey `Replay Gain' in-
       formation.  SoX can apply Replay Gain automatically  for  formats  that
       contain comments but does not generate it.  By default, SoX copies com-
       ments  from the first input file to output files that support comments,
       so output files may contain Replay Gain information which is incorrect.
       This can be fixed, when converting input files with  --replay-gain  en-
       abled, by removing all comments using --comment "" or removing just the
       REPLAYGAIN comment with

       soxi_ng -a in.au | grep -v REPLAYGAIN > comments
       sox_ng --replay-gain=track in.au --comment-file comments out.au


   Determining and setting the File Format
       SoX  uses several mechanisms to determine or set the format of an audio
       file.  Depending on the circumstances, individual  characteristics  may
       be determined or set using different mechanisms.

       To  determine the format of an input file, SoX uses, in order of prece-
       dence and as given or available:

       1.  Command-line format options,

       2.  The contents of the file header,

       3.  The filename extension.

       To set the output file format, SoX uses, in order of precedence and  as
       given or available:

       1.  Command-line format options,

       2.  The filename extension,

       3.  The  input  file format characteristics or the closest that is sup-
           ported by the output file type.

       For all files, SoX exits with an error if the file type cannot  be  de-
       termined.  Command-line  format options may need to be added or changed
       to resolve the problem.

   Playing & Recording Audio
       The play_ng and rec_ng commands are provided so that basic playing  and
       recording is as simple as

          play_ng existing-file.wav

       and

          rec_ng new-file.wav

       These two commands are functionally equivalent to

          sox_ng existing-file.wav -d

       and

          sox_ng -d new-file.wav

       Further  options  and  effects (as described below) can be added to the
       commands in either form.

                                 *        *        *

       When playing a file with a sample rate that is not supported by the au-
       dio output device, SoX automatically invokes the rate effect to perform
       the necessary sample rate conversion.  For compatibility with old hard-
       ware, the default rate quality level is  set  to  `low'.  This  can  be
       changed by explicitly specifying the rate effect with a different qual-
       ity level, e.g.

          play_ng ... rate -m

       or by using the --play-rate-arg option (see below).

                                 *        *        *

       On  some  systems,  SoX allows the audio playback volume to be adjusted
       while using play_ng.  Where supported, this is achieved by tapping  the
       `v'  &  `V'  keys  during playback. If there is a softvol effect in the
       chain, these keys will adjust that instead of the hardware mixer.

       To help with setting a suitable recording level, SoX  includes  a  peak
       level  meter  which can be invoked (before making the actual recording)
       as follows:

          rec_ng -n

       The recording level should be adjusted (using the system-provided mixer
       program, not SoX) so that the meter  is,  at  most,  occasionally  full
       scale  and  never `in the red' (an exclamation mark is shown).  See the
       -S (--show-progress) option below.

   Accuracy
       Many file formats that compress audio discard some of the audio  signal
       information.  Converting  to such a format and back again will not pro-
       duce an exact copy of the original audio.  This is the  case  for  many
       formats  used in telephony (e.g. A-law, GSM) where low signal bandwidth
       is more important than high audio fidelity  and  for  formats  used  in
       portable  music  players (e.g. MP3, Ogg Vorbis) where adequate fidelity
       can be retained even with the large compression ratios that are  needed
       to make portable players practical.

       Formats that discard audio signal information are called `lossy'.  For-
       mats  that do not are called `lossless'.  The term `quality' is used as
       a measure of how closely the original audio signal is  reproduced  when
       using a lossy format.

       Audio  file  conversion  with SoX is lossless when it can be, i.e. when
       not using lossy compression, when not reducing  the  sampling  rate  or
       number  of channels and when the number of bits used in the destination
       format is not less than in the source format.  For example,  converting
       from  an  8-bit  PCM format to a 16-bit PCM format is lossless but con-
       verting from an 8-bit PCM format to (8-bit) A-law isn't.

       SoX converts all audio files to an internal, uncompressed 32-bit format
       before performing any audio processing. This means that manipulating  a
       file that is stored in a lossy format can cause further losses in audio
       fidelity.  E.g. with

          sox_ng long.mp3 short.mp3 trim 10

       SoX first decompresses the input MP3 file, then applies the trim effect
       and finally creates the output MP3 file by recompressing the audio with
       a possible reduction in fidelity above that which occurred when the in-
       put  file was created.  Hence, if what is ultimately desired is lossily
       compressed audio, it is best to  perform  all  audio  processing  using
       lossless  file formats and then convert to the lossy format only at the
       final stage.  Applying multiple effects with a  single  SoX  invocation
       will, in general, produce more accurate results than those produced us-
       ing multiple SoX invocations.

   Dithering
       Dithering  is  a  technique used to maximize the dynamic range of audio
       stored at a particular bit-depth. Any distortion introduced by  quanti-
       zation  is  decorrelated by adding a small amount of white noise to the
       signal.  In most cases, SoX can determine whether the selected process-
       ing requires dither and will add it during output formatting if  appro-
       priate.

       By  default,  SoX  automatically  adds TPDF dither when the output bit-
       depth is less than 24 and any of the following are true:

       o   bit-depth reduction has been specified explicitly using a  command-
           line option

       o   the  output file format supports only bit-depths lower than that of
           the input file format

       o   an effect has increased the effective bit-depth within the internal
           processing chain

       For example, adjusting the volume with vol 0.25 requires two additional
       bits in which to losslessly  store  its  results  (since  0.25  decimal
       equals 0.01 binary) so, if the input file bit-depth is 16, SoX's inter-
       nal  representation  will  use  18  bits  after  processing this volume
       change.  In order to store the output at the same depth as  the  input,
       dithering is used to remove the additional bits.

       Use  the  -V option to see what processing SoX has automatically added.
       The -D (--no-dither) option may be given to override automatic  dither-
       ing.   To  invoke  dithering  manually  (e.g. to select a noise-shaping
       curve) use the dither effect.

   Clipping
       Clipping is distortion that occurs when an audio signal level (or `vol-
       ume') exceeds the range of the chosen representation.  In  most  cases,
       clipping  is  undesirable  and  so should be corrected by adjusting the
       level prior to the point in the processing chain at which it occurs.

       In SoX, clipping can happen when using the vol or gain effects  to  in-
       crease  the  audio  volume. Clipping can also occur with many other ef-
       fects, when converting one format to another and even when simply play-
       ing the audio.

       Playing an audio file often involves resampling and processing by  ana-
       log components and that can introduce a DC offset or amplification, all
       of which can produce distortion if the audio signal level was initially
       too close to the clipping point.

       For these reasons, it is usual to make sure that an audio file's signal
       level  has  some `headroom', i.e. it does not exceed a particular level
       below the maximum possible level of  the  given  representation.   Some
       standards  bodies recommend as much as 9dB headroom, but in most cases,
       3dB (~~ 70% linear) is enough.  Note that this  wisdom  seems  to  have
       been  lost  in  modern  music production; in fact, many CDs, MP3s, etc.
       are now mastered at levels above 0dBFS and the audio is clipped as  de-
       livered.

       SoX's stat and stats effects can assist in determining the signal level
       of  an  audio file. The gain or vol effect can be used to prevent clip-
       ping, e.g.

          sox_ng dull.wav bright.wav gain -6 treble +6

       guarantees that the treble boost will not clip.

       If clipping occurs at any point during processing, SoX displays a warn-
       ing message to that effect.

       See the global -G (--guard) option and the gain and norm effects.

   Input File Combining
       SoX's input combiner can be configured with the --combine global option
       to combine multiple files using one of the following methods:  concate-
       nate,  sequence, mix, mix-power, merge and multiply, with shorthands -m
       for --combine mix, -M for merge and -T for multiply.

       The default is sequence for play_ng, and concatenate for sox_ng.

       For methods other than sequence, multiple input  files  must  have  the
       same  sampling rate. If necessary, separate SoX invocations can be used
       to make sampling rate adjustments prior to combining them.   and,  with
       concatenate,  the  input  files must also have the same number of chan-
       nels.

       The sequence combining method is similar to concatenate in that the au-
       dio from each input file is sent serially to the output file but  here,
       the  output  file  may be closed and reopened at the transition between
       input files.  This may be just what is needed  when  sending  different
       types of audio to an output device but is not generally useful when the
       output is a normal file.

       With  the mix or mix-power combining methods, the number of channels in
       each input file need not be the same but SoX issues a warning  if  they
       are  not  and  some  channels in the output file will not contain audio
       from every input file.  A mixed audio file cannot  be  unmixed  without
       reference to the original input files.

       If  the  merge  combining  method is selected the number of channels in
       each input file need not be the same and a merged audio file  comprises
       all  channels  from all the input files and unmerging is possible using
       multiple invocations of SoX with the remix effect.   For  example,  two
       mono  files  could  be merged to form one stereo file and the first and
       second mono files would become the  left  and  right  channels  of  the
       stereo file.

       The  multiply  combining  method multiplies the sample values of corre-
       sponding channels treated as numbers in the interval -1 to +1.  If  the
       number  of  channels  in  the  input files is not the same, the missing
       channels will contain silence.

       When combining input files, SoX applies any specified effects  (includ-
       ing, for example, the vol volume adjustment effect) after the audio has
       been combined. However, it is often useful to be able to set the volume
       of the inputs individually (i.e. `balance' them) before combining takes
       place.  For all combining methods, input file volume adjustments can be
       made  manually  using the -v option, which can be given for one or more
       input files. If it is given for only some of the input files, the  oth-
       ers  receive  no  volume  adjustment.  In some circumstances, automatic
       volume adjustments may be applied.  The global -V option can be used to
       show the input file volume adjustments that have been selected manually
       or automatically.

       Some special considerations need to made when mixing input files:

       Unlike the other methods, mix combining can cause clipping in the  com-
       biner if no balancing is performed.  In this case, if manual volume ad-
       justments are not given, SoX tries to ensure that clipping does not oc-
       cur  by  automatically  adjusting  the volume (amplitude) of each input
       signal by a factor of ^1/n, where n is the number of input  files.   If
       this  results  in  audio that is too quiet or otherwise unbalanced, the
       input file volumes can be set manually with -v. Using the  norm  effect
       on the mix is another alternative.

       If  mixed  audio seems loud enough at some points but too quiet in oth-
       ers, dynamic range compression can be applied to correct this - see the
       compand effect.

       With the mix-power combine method, the mixed  volume  is  approximately
       equal to that of one of the input signals.  This is achieved by balanc-
       ing  using a factor of ^1/<sqrt>n instead of ^1/n.  Note that this bal-
       ancing factor does not guarantee that clipping will not occur  but  the
       number  of  clips  will  usually be low and the resulting distortion is
       usually imperceptible.

   Output Files
       SoX's default behaviour is to take one or more input  files  and  write
       them to a single output file.

       This  behaviour  can  be  changed  by placing the newfile pseudo-effect
       within the effects list and SoX will enter multiple output mode.

       In multiple output mode, a new file is created when the  effects  prior
       to  the  newfile indicate that they are done.  The effects chain listed
       after newfile is then started up and its output is  saved  to  the  new
       file.

       In  multiple  output mode, a unique number is appended automatically to
       all filenames and, if the filename has an extension, the number is  in-
       serted before the extension.  This behaviour can be customized by plac-
       ing  %n in the filename where the number should be substituted.  An op-
       tional number can be placed after the % to indicate a minimum width for
       the number with leading zeroes.  %n defaults to two digits or, if no %n
       is included, to three digits before the filename extension.

       Multiple output mode is not very useful unless an effect that stops the
       effects chain is specified before newfile. If the end of  the  file  is
       reached  before  the  effects chain stops, no new file is created as it
       would be empty.

       The following is an example of splitting the first 60 seconds of an in-
       put file into two 30 second files and ignoring the rest.

          sox_ng song.wav ringtone%1n.wav trim 0 30 : newfile : trim 0 30


   Stopping SoX
       Usually, SoX completes its processing and exits automatically  once  it
       has read all audio data from the input files.

       It  can  also  be terminated earlier by sending it an interrupt signal,
       usually by pressing the keyboard interrupt key which is normally  Ctrl-
       C.   This  is  required in some circumstances such as when using SoX to
       make a recording.  When SoX is playing multiple files,  Ctrl-C  behaves
       slightly differently: pressing it once skips to the next file; pressing
       it twice in quick succession causes SoX to exit.

       Another  way  to  stop  processing early is to use an effect that has a
       time period or sample count; the trim effect is  an  example  of  this.
       Once all effects chains have stopped, SoX stops.

FILENAMES
       Filenames  can  be  simple file names, relative or absolute path names,
       URLs (for input files only) or special filenames.  URL support requires
       one of the wget, wget2 or curl programs to be installed.

       Giving SoX an input or output filename that is the same as the name  of
       a  SoX effect does not work since SoX will treat it as an effect speci-
       fication.  You can work around this by calling  the  file  ./chorus  on
       Unix  or  .\chorus on MS/DOS but it is not usually a problem since most
       audio filenames have a filename extension after  a  dot,  which  effect
       names do not.

       Using  the same file name as an input and an output is unlikely to work
       as intended because it is likely to truncate the  file  before  reading
       all of it.

   Special Filenames
       The  following  filenames may be used in certain circumstances in place
       of a normal filename:

       -      SoX can be used in simple pipeline operations by using the  spe-
              cial  filename  `-'  which, if used as an input filename, causes
              SoX to read audio data from the `standard input' (stdin) and, if
              used as the output filename, will cause SoX to send  audio  data
              to  the  `standard output' (stdout).  When using this option for
              the output file, and sometimes when using it for an input  file,
              the file type (see -t below) must also be given.

       "|program [options] ..."
              An  initial  `pipe' character specifies that the given command's
              standard output (stdout) should be used as an input  file.   Un-
              like the special filename -, this can be used for several inputs
              to  one SoX command.  For example, if a program genw generates a
              mono WAV signal on its standard output,  the  following  command
              makes a stereo file from two generated signals:

                 sox_ng -M "|genw --imd -" "|genw --thd -" out.wav

              For  headerless  (raw) audio and some other formats, -t needs to
              be given before the input command.

       "wildcard-filename"
              Specifies that filename `globbing' (wildcard matching) should be
              performed by SoX instead of by the shell if  the  wildcard-file-
              name  contains  the characters *, ? or characters ranges such as
              [A-Z].  This allows a single set of file options to  be  applied
              to a group of files.  For example, if the current directory con-
              tains three files, file1.vox, file2.vox and file3.vox,

                 play_ng --rate 6k *.vox

              is expanded by the shell (in most environments) to

                 play_ng --rate 6k file1.vox file2.vox file3.vox

              which  only  treats the first `vox' file as having a sample rate
              of 6k.  With

                 play_ng --rate 6k "*.vox"

              the given sample rate option is applied to all the files.

              If you do not want SoX to glob the filenames, you  can  use  the
              option  --no-glob  before  each  filename  that  should  not  be
              globbed, which is necessary if you need to process  files  whose
              names contain wildcard characters.

       -p, --sox-pipe
              This  can be used in place of an output filename to specify that
              its output will be used as the input  to  another  SoX  command.
              For example, in the command:

                 play_ng "|sox_ng -n -p synth 2" "|sox_ng -n -p synth 2 tremolo 10"

              play_ng  thinks  it's  playing two files in succession that come
              from pipes, but in fact they both come from other invocations of
              SoX, each with different effects.

              -p is in fact an alias for `-t sox -'.

       -d, --default-device
              This can be used in place of an  input  or  output  filename  to
              specify  that  the  default  audio device (if one has been built
              into SoX) is to be used.  This is akin  to  invoking  rec_ng  or
              play_ng as described above.

       -n, --null
              This  can  be  used  in  place of an input or output filename to
              specify that a `null file' is to be  used.   Here,  `null  file'
              refers to a SoX-specific mechanism and is not related to any op-
              erating system mechanism with some special name.

              Using  a  null  file as an input is equivalent to using a normal
              audio file that contains an infinite amount of silence  and,  as
              such,  is  not  generally useful unless used with an effect that
              specifies a finite time length such as trim or synth.

              Using a null file as an output amounts to discarding  the  audio
              and is mainly useful with effects that produce information about
              the  audio  instead  of affecting it such as noiseprof, stat and
              spectrogram.

              The sampling rate associated with a  null  file  is  by  default
              48kHz  but,  as with a normal file, this can be overridden using
              format options such as -r (see below).

   Supported File and Audio Device Types
       See soxformat_ng(7) for a list and description of  the  supported  file
       formats and audio device drivers.

OPTIONS
   Global Options
       These  options can be specified on the command line at any point before
       the first effect name.

       The SOX_OPTS environment variable can be used  to  provide  alternative
       default values for SoX's global options.  See ENVIRONMENT (below).

       --buffer bytes, --input-buffer bytes
              Set  the  size in bytes of the buffers used for processing audio
              (default 8192).  --buffer applies to input, effects  and  output
              processing; --input-buffer applies only to input processing, for
              which it overrides --buffer if both are given.

              Large values for --buffer may cause SoX to be become slow to re-
              spond  to  requests  to  terminate  or to skip to the next input
              file.

       --clobber
              Don't prompt before overwriting an existing file  that  has  the
              same  name  as an output file. This is the default behaviour; to
              override it, use --no-clobber.

       --combine concatenate|merge|mix|mix-power|multiply|sequence
              Select the input file combining method.  See Input File  Combin-
              ing above for a description of them.

       -D, --no-dither
              Disable  automatic  dither  -  see Dithering above.  This may be
              useful to ensure that SoX produces the same output in successive
              runs or if a file has been converted from 16 to 24 bit with  the
              intention  of  doing  some processing on it, but in fact no pro-
              cessing is needed after all and the original  16  bit  file  has
              been lost, in which case no dither is needed when converting the
              file back to 16 bits.  See the stats effect for how to determine
              the actual bit-depth of the audio within a file.

       --effects-file filename
              Read a file to obtain all effects and their arguments.  The file
              is  parsed  as if the values were specified on the command line.
              A new line can be used in place of the special : marker to sepa-
              rate effect chains.  For convenience, such markers at the end of
              the file are normally ignored; if you want to specify  an  empty
              last effects chain, use an explicit : by itself on the last line
              of  the  file.   This option causes any effects specified on the
              command line to be discarded.

       -G, --guard
              Automatically invoke the gain effect to guard against  clipping.
              E.g.

                 sox_ng -G in.au -b 16 out.au rate 44100 dither -s

              is shorthand for

                 sox_ng in.au -b 16 out.au gain -h rate 44100 gain -rh dither -s

              See -V, --norm, and the gain effect.

       -h, --help
              Show SoX's version number and usage information.

       --help-effect name
              Show  usage  information for the specified effect.  The name all
              can be used to show it for all available effects.

       --help-format name
              Show information about the specified file format.  The name  all
              can be used to show information for all supported formats.

       --i, --info
              If given as the first parameter to sox_ng, behave as soxi_ng.

       -m|-M  Equivalent to --combine mix and --combine merge respectively.

       --magic
              If SoX has been built with the optional `libmagic' library, this
              option enables its use in helping to detect audio file types.

       --multi-threaded|--single-threaded
              By default, SoX is `single threaded' but if the --multi-threaded
              option  is  given,  SoX processes audio channels for most multi-
              channel effects in  parallel  on  hyperthreading  and  multicore
              processors.

              A  larger  buffer size than the default may be needed to benefit
              more from multithreaded processing (e.g.  131072;  see  --buffer
              above)  and  setting  OMP_WAIT_POLICY=PASSIVE in the environment
              before launching SoX avoids a defect in OpenMP that can make  it
              incredibly slow if several invocations of SoX are running simul-
              taneously.

       --no-clobber
              Prompt before overwriting an existing file with the same name as
              that given for the output file.

              Unintentionally  overwriting  a  file  is  easier than you might
              think, for example, if you accidentally enter

                 sox_ng file1 file2 effect1 effect2 ...

              when what you really meant was

                 play_ng file1 file2 effect1 effect2 ...

              then, without this option, file2 will  be  overwritten.   Hence,
              using  this option is recommended and can be set in the SOX_OPTS
              environment variable (see ENVIRONMENT below).

       --norm[=dB-level]
              Automatically invoke the gain effect to guard  against  clipping
              and to normalize the audio. E.g.

                 sox_ng --norm in.au -b 16 out.au rate 44100 dither -s

              is shorthand for

                 sox_ng in.au -b 16 out.au gain -h rate 44100 gain -nh dither -s

              Optionally,  the  audio can be normalized to a given level, usu-
              ally below 0 dBFS:

                 sox_ng --norm=-3 in.au out.au

              See -V, -G and the gain effect.

       --play-rate-arg arg
              Selects a quality option to be used when the rate effect is  in-
              voked  automatically  when  playing audio.  This option is typi-
              cally set via the SOX_OPTS environment variable (see ENVIRONMENT
              below) and its default value, when playing, is -l  (low  quality
              but fast).  See the rate effect for other alternatives.

       --plot gnuplot|octave|off
              If not set to off (the default if --plot is not given), run in a
              mode that can be used in conjunction with the gnuplot program or
              the GNU Octave program to assist with the selection and configu-
              ration  of many of the transfer function-based effects.  For the
              first given effect that supports the selected plotting  program,
              SoX  outputs commands to plot the effect's transfer function and
              stops without actually processing any audio.  E.g.

                 sox_ng --plot octave input-file -n highpass 1320 > highpass.plt
                 octave highpass.plt


       -q, --no-show-progress
              Run in quiet mode when SoX wouldn't otherwise do  so.   This  is
              the  opposite  of  the -S option.  To suppress error and warning
              messages, see -V below.

       -R     Run in `repeatable' mode.  When this option is given, SoX embeds
              a time stamp in the output file if its format supports  comments
              and  will  seed  pseudo  random  number  generators,  as used by
              dither, with that number, ensuring that successive  SoX  invoca-
              tions  with  the  same  inputs and the same parameters yield the
              same output.

       --replay-gain track|album|off
              Select whether or not to apply replay gain adjustment  to  input
              files.   The  default  is  off  for sox_ng and rec_ng, album for
              play_ng when (at least) the first two  input  files  are  tagged
              with  the same Artist and Album names and track for play_ng oth-
              erwise.

       -S, --show-progress
              Display  input  file   format/header   information,   processing
              progress as a percentage of the input file(s), elapsed time, re-
              maining  time  (if known, in brackets) and the number of samples
              written to the output file.  Also shown is a peak  level  meter,
              and  an  indication  of whether clipping has occurred.  The peak
              level meter shows up to two channels and is calibrated for digi-
              tal audio as follows:

                          dB FSD   Display       dB FSD   Display
                           -25     -              -11     ====
                           -23     =               -9     ====-
                           -21     =-              -7     =====
                           -19     ==              -5     =====-
                           -17     ==-             -3     ======
                           -15     ===             -1     =====!
                           -13     ===-

              A three-second peak-held value of the headroom in dBs  is  shown
              to the right of the meter if the headroom is less than 6dB.

              This  option  is  enabled  by  default when using SoX to play or
              record audio but can be disabled with -q.

       -T     Equivalent to --combine multiply

       --temp directory
              Specify that any temporary files should be created in the  given
              directory.   This  can be useful if there are permission or free
              space problems with the default location. In  this  case,  using
              `--temp  .' (to use the current directory) is often a good solu-
              tion.

       --version
              Show SoX's version number and exit.

       -V[level]
              Set verbosity. This is particularly useful for  seeing  how  any
              automatic effects have been invoked by SoX.

              SoX  displays  messages on the console (stderr) according to the
              following verbosity levels:

              0      No messages are shown at all; use the exit status to  de-
                     termine if an error has occurred.

              1      Only  error  messages  are shown.  These are generated if
                     SoX cannot complete the requested commands.

              2      Warning messages are also shown.  These are generated  if
                     SoX  can  complete the requested commands but not exactly
                     according to the  requested  command  parameters,  or  if
                     clipping occurs.  This is the default.

              3      Descriptions  of  SoX's processing phases are also shown,
                     to see exactly how SoX is processing your audio.

              4 to 6 Messages to help with debugging SoX are also shown.

              Each occurrence of the -V option increases the  verbosity  level
              by  1.  Alternatively,  the verbosity level can be set to an ab-
              solute number by specifying it immediately after  the  -V,  e.g.
              -V0 shuts it up.

   Input File Options
       These  options  apply  to the first input filename that follows them on
       the command line.

       --ignore-length
              Override the audio length given in an audio  file's  header.  If
              this  option  is given, SoX keeps reading audio until it reaches
              the end of the input file.

       -v, --volume factor
              Intended for use when combining multiple input files,  this  op-
              tion  adjusts the volume of the file that follows it on the com-
              mand line by a factor of factor. This allows it to be `balanced'
              w.r.t. the other input files.  This is a linear (amplitude)  ad-
              justment,  so  a  number  less than 1 decreases the volume and a
              number greater than 1 increases it.  If  a  negative  number  is
              given then, in addition to the volume adjustment, the audio sig-
              nal will be inverted.

              See  the  norm, vol and gain effects, Input File Balancing above
              and the Special Filenames' section on wildcard-filenames.

   Input & Output File Format Options
       These options apply to the input or output file whose name they immedi-
       ately precede on the command line and are used mainly when working with
       headerless file formats or when specifying a format for the output file
       that is different from that of the input file.

       -b bits, --bits bits
              Set the number of bits (a.k.a. bit-depth or word length) in each
              encoded sample.  It is not applicable to complex encodings  such
              as MP3 or GSM and not necessary with encodings that have a fixed
              number of bits such as A-law, <mu>-law and ADPCM.

              For an input file, the most common use for this option is to in-
              form  SoX  of the number of bits per sample in a `raw' (`header-
              less') audio file.  For example

                 sox_ng -r 16k -e signed -b 8 input.raw output.wav

              converts a particular `raw'  file  to  a  self-describing  `WAV'
              file.

              For  an  output  file, this option can be used to set the output
              encoding size.  By default, the output encoding size is  be  set
              to the input encoding size, provided it is supported by the out-
              put file type.  For example:

                 sox_ng input.cdda -b 24 output.wav

              converts  raw  CD  digital  audio  (16-bit, signed-integer) to a
              24-bit (signed-integer) `WAV' file.

       -c CHANNELS, --channels CHANNELS
              Sets the number of audio channels in the audio file. This can be
              any number greater than zero.

              For an input file, the most common use for this option is to in-
              form SoX of the number of channels in a `raw' (headerless) audio
              file.  Occasionally, it may be useful to use this option with  a
              headered  file  to  override the (presumably incorrect) value in
              the header but this is only supported with certain  file  types.
              For example:

                 sox_ng -r 48k -e float -b 32 -c 2 input.raw output.wav

              converts a `raw' file to a self-describing `WAV' file.

                 play_ng -c 1 music.wav

              interprets  the  file  data as belonging to a single channel re-
              gardless of what is indicated in the file header and if the file
              does in fact have two channels, it is played at half speed.

              For an output file, this option provides a shorthand for  speci-
              fying  that  the  channels  effect should be invoked in order to
              change (if necessary) the number of channels in the audio signal
              to the number given.  For example, the  following  two  commands
              are equivalent:

                 sox_ng input.wav -c 1 output.wav bass -b 24
                 sox_ng input.wav      output.wav bass -b 24 channels 1

              though  the second form is more flexible as it allows effects to
              be ordered arbitrarily.

       -e ENCODING, --encoding ENCODING
              Set the audio encoding type, sometimes needed  with  file  types
              that  support  more than one encoding scheme such as raw, WAV or
              AU but not with MP3 or FLAC.  The available encoding  types  are
              as follows:

              signed-integer
                     PCM  data stored as signed (`two's complement') integers.
                     Commonly used with a 16 or  24  -bit  encoding  size.   A
                     value of 0 represents minimum signal power.

              unsigned-integer
                     PCM data stored as unsigned integers.  Commonly used with
                     an  8-bit encoding size.  A value of 0 represents maximum
                     signal power.

              floating-point
                     PCM data stored as IEEE 753 single precision (32-bit)  or
                     double  precision  (64-bit)  floating point (`real') num-
                     bers.  A value of 0 represents minimum signal power.

              a-law  International telephony standard for logarithmic encoding
                     to 8 bits per sample.  It has a precision  equivalent  to
                     roughly 13-bit PCM and is sometimes encoded with reversed
                     bit-ordering (see the -X option).

              u-law/mu-law
                     The North American telephony standard for logarithmic en-
                     coding to 8 bits per sample, a.k.a. <mu>-law has a preci-
                     sion  equivalent  to  roughly 14-bit PCM and is sometimes
                     encoded with reversed bit-ordering (see the -X option).

              oki-adpcm
                     OKI (a.k.a. VOX, Dialogic, or Intel) 4-bit  ADPCM  has  a
                     precision  equivalent  to roughly 12-bit PCM.  ADPCM is a
                     form of audio compression that makes  a  good  compromise
                     between audio quality and encoding/decoding speed.

              ima-adpcm
                     IMA  (a.k.a.  DVI) 4-bit ADPCM has a precision equivalent
                     to roughly 13-bit PCM.

              ms-adpcm
                     Microsoft 4-bit  ADPCM  has  a  precision  equivalent  to
                     roughly 14-bit PCM.

              gsm-full-rate
                     GSM  is  currently  used  for the majority of the world's
                     digital wireless telephone calls.   It  utilizes  several
                     audio  formats  with  different  bit rates and associated
                     speech quality.   SoX  has  support  for  GSM's  original
                     13kbps `Full Rate' audio format.

              Encoding  names  can  be abbreviated where this would not be am-
              biguous; e.g. unsigned-integer can be given as  un,  but  not  u
              (ambiguous with u-law).

              For an input file, the most common use for this option is to in-
              form  SoX  of  the encoding of a `raw' (`headerless') audio file
              (see the examples in -b and -c above).

              For an output file, this option can be used (perhaps with -b) to
              set the output encoding. For example:

                 sox_ng input.cdda -e float output1.wav
                 sox_ng input.cdda -b 64 -e float output2.wav

              converts a raw CD  digital  audio  (16-bit  signed  integer)  to
              floating  point  `WAV'  files of single and double precision re-
              spectively.

              If this option is not given, the output  encoding  will  be  the
              same as the input encoding, provided it is supported by the out-
              put file type.

       --no-glob
              Specifies  that  filename  `globbing' (wildcard matching) should
              not be performed by SoX on the following filename.  For example,
              if the current  directory  contains  the  two  files  `five-sec-
              onds.wav' and `five*.wav', then

                 play_ng --no-glob "five*.wav"

              can be used to play just the single file `five*.wav'.

       -r, --rate rate[k]
              Gives  the  sample  rate  in Hz (or kHz if followed by k) of the
              file.

              For an input file, the most common use for this option is to in-
              form SoX of the sample rate of a `raw' (`headerless') audio file
              (see the examples in -b and -c above).  Occasionally it  may  be
              useful to use this option with a `headered' file to override the
              value  in the header, though this is only supported with certain
              file types.  For example, if audio was recorded  with  a  sample
              rate  of 48k from a source that played back a little too slowly,
              say 1.5%,

                 sox_ng -r 48720 input.wav output.wav

              would correct the speed by changing only the  file  header  (but
              see  the  speed effect for the more usual solution to this prob-
              lem).

              For an output file, this option provides a shorthand for  speci-
              fying  that the rate effect should be invoked in order to change
              (if necessary) the sample rate of the audio signal to the  given
              value.  For example, the following two commands are equivalent:

                 sox_ng input.wav -r 48k output.wav bass -b 24
                 sox_ng input.wav        output.wav bass -b 24 rate 48k

              though  the  second  form is more flexible as it allows rate op-
              tions to be given and allows the effects to be ordered arbitrar-
              ily.

       -t, --type FILE-TYPE
              Give the type of an audio  file.   For  both  input  and  output
              files,  this option is commonly used to inform SoX of the type a
              `headerless' audio file where the actual/desired type cannot  be
              determined from the filename extension.  For example:

                 another-command | sox_ng -t mp3 - output.wav
                 sox_ng input.wav -t raw output.bin

              It  can  also  be  used to override the type implied by an input
              filename extension but, if overriding with a  type  that  has  a
              header,  SoX  exit with an error message if such a header is not
              actually present.

              There are also pseudo filetypes that tell SoX to use a specified
              format module that handles more than one type of audio file such
              as -t sndfile and -t ffmpeg or when a type of file can  be  han-
              dled by several different format modules, such as WAV files con-
              taining MP3-encoded data.

              Furthermore,  the  file  type can be used to select a particular
              audio device driver for recording and playing.

              See soxformat_ng(7) for a list of supported file types.

       -L, --endian little
       -B, --endian big
       -x, --endian swap
              These options specify whether the byte order of the  audio  data
              is,  respectively, `little endian', `big endian' or the opposite
              to that of the system on which SoX is  being  used.   Endianness
              applies  only  to data encoded as floating point or as signed or
              unsigned integers of 16 or more bits.  It is often necessary  to
              specify  one of these options for headerless files and sometimes
              necessary for (otherwise) self-describing files.   A  given  en-
              dian-setting  option  may  be  ignored  for  an input file whose
              header contains a specific endianness identifier or for an  out-
              put file that is actually an audio device.

              N.B.  Unlike other format characteristics, the endianness (byte,
              nibble,  &  bit ordering) of the input file is not automatically
              used for the output file. For example, when the following is run
              on a little-endian system:

                 sox_ng -B audio.s16 trimmed.s16 trim 2

              trimmed.s16 will be created as little-endian;

                 sox_ng -B audio.s16 -B trimmed.s16 trim 2

              must be used to preserve big-endianness in the output file.

              The -V option can be used to check the selected orderings.

       -N, --reverse-nibbles
              Specifies that the nibble ordering of the samples  (i.e.  the  2
              halves  of a byte) should be reversed, which is sometimes useful
              with ADPCM-based formats.

              See the N.B. in the section on -x above.

       -X, --reverse-bits
              Specifies that the bit ordering of the  samples  should  be  re-
              versed, which is sometimes useful with a few (mostly headerless)
              formats.

              See the N.B. in the section on -x above.

   Output File Format Options
       These options only apply to output files and may only precede an output
       filename on the command line.

       --add-comment TEXT
              Append a comment to the output file header (where applicable).

       --comment TEXT
              Specify  the  comment  text  to  store in the output file header
              (where applicable).

              SoX provides a default comment `Processed by SoX' if this option
              (or --comment-file) is not given. To  specify  that  no  comment
              should be stored in the output file, use --comment "" .

       --comment-file FILENAME
              Specify  a file containing the comment text to store in the out-
              put file header (where applicable).

       -C, --compression FACTOR
              Set the compression factor for variably-compressed  output  file
              formats.  If this option is not given then a default compression
              factor  applies.   The compression factor is interpreted differ-
              ently for different compressed file formats; see the description
              of the file formats that use this option in soxformat_ng(7)  for
              more information.

EFFECTS
       In  addition  to converting, playing and recording audio files, SoX can
       be used to invoke a number of audio effects.  Multiple effects  may  be
       applied  by  specifying  them one after the other at the end of the SoX
       command line, forming an `effects chain'.  Note that applying  multiple
       effects  in real time (i.e. when playing audio) may require a high per-
       formance computer.

       Some of the SoX effects are primarily intended to be applied to a  sin-
       gle instrument or `voice'. To facilitate this, the remix effect and the
       global  SoX option -M can be used to isolate then recombine tracks from
       a multitrack recording.

   Multiple Effects Chains
       A single effects chain is made up of one or more effects.   Audio  from
       the input runs through the chain until either the end of the input file
       is reached or an effect terminates the chain.

       SoX  supports running multiple effects chains over the input audio.  In
       this case, when one chain indicates that it is done  processing  audio,
       the  audio data is sent through the next effects chain.  This continues
       until either no more effects chains exist or the input has reached  the
       end of the file.

       Effects chains can be separated by placing a : (colon) after an effect;
       any following effects are a part of a new effects chain.

       It  is  important to place the effect that stops the chain as the first
       effect in the chain because any samples that are buffered by effects to
       the left of the terminating effect will be discarded.   The  amount  of
       samples  discarded  is  related to the --buffer option and it should be
       kept small, relative to the sample rate, if the terminating effect can-
       not be first.  Further information on stopping effects can be found  in
       the Stopping SoX section.

       There  are  a  few pseudo-effects that can help when using multiple ef-
       fects chains.  These include newfile, which starts  writing  to  a  new
       output file before moving to the next effects chain, and restart, which
       moves  back  to the first effects chain.  Pseudo-effects must be speci-
       fied as the first effect in a chain and as the only effect in  a  chain
       (i.e. they must have a : before and after them).

       Here  is  an  example  of multiple effects chains.  It splits the input
       file into multiple files, each of 30 seconds in length and each  output
       filename will have unique number in its name, as documented in the Out-
       put Files section.

          sox_ng in.au out.au trim 0 30 : newfile : restart


   Common Notation And Parameters
       In  the  descriptions that follow, [square brackets] are used to denote
       parameters that are optional, {braces} to denote those  that  are  both
       optional  and repeatable, <angle brackets> to denote those that are re-
       peatable but not optional and pipe characters `|' separate options from
       which to choose one of several alternatives.  Where applicable, default
       values for optional parameters are shown (in parentheses).

       The following parameters are used with, and have the same meaning  for,
       several effects:

       frequency
              A  frequency  in Hz or, if followed by k, in kHz or, if preceded
              by %, in semitones relative to A (440Hz); alternatively,  scien-
              tific note names (e.g. E2) may be used.

       gain   A power gain in dB.  Zero gives no gain, less than zero gives an
              attenuation and greater than zero amplifies.

       duration
              See Time Specifications below.

       position
              A  position  within the audio stream; the syntax is [=|-|+]time-
              spec, where timespec is a time specification  (see  below).  The
              optional first character indicates whether the timespec is to be
              interpreted relative to the start (=) or end (-) of the audio or
              relative to the previous position (+) if the effect accepts mul-
              tiple  positional arguments.  The audio length must be known for
              end-relative locations to work, though some effects do accept -0
              for the end of the audio even if the length is  unknown.   Which
              of  =, - and + is the default depends on the effect and is shown
              in its syntax as, e.g., position(+).

              Examples: `=2:00' is two minutes into the audio stream,  `-100s'
              is  one hundred samples before the end of the audio, `+0:12+10s'
              is twelve seconds and ten samples after  the  previous  position
              and  `-0.5+1s'  is one sample less than half a second before the
              end of the audio.

       width[h|k|o|q]
              Used to specify the bandwidth of a filter.  A number of  differ-
              ent  methods  to specify the width are available (though not all
              for every effect).  One of the characters shown may be  appended
              to select the desired method as follows:

                           Method    Notes
                      h      Hz
                      k     kHz
                      b      Hz      Old non-frequency-warped response
                      o   octaves
                      q   Q-factor   See [2]
                      s    slope

              For each effect that uses this parameter, the default method (if
              no character is appended) is the one that is listed first in the
              first line of the effect's description.

   Time Specifications
       A timespec can be given in one the following two forms:

       [[hours:]minutes:]seconds[.frac][t]
              For example, a time specification of `1:30.5' corresponds to one
              minute,  thirty  and  1/2  seconds.  The component values do not
              have to be  normalized;  e.g.   `1:23:45',  `83:45',  `79:0285',
              `1:0:1425', `1::1425' and `5025' are all equivalent.

       sampless
              Specifies  the  number  of samples directly, as in `8000s'.  For
              large sample counts, e notation is supported:  `1.7e6s'  is  the
              same as `1700000s'.

       Time  specifications  can  also  be chained with + or - into a new time
       specification where the right part is added to or subtracted  from  the
       total  so far.  For example, `3:00-200s' means two hundred samples less
       than three minutes.

       If a time specification is a plain whole number with no t or s  suffix,
       whether  it  is taken as a number of seconds or a number of samples de-
       pends on the effect in question. At present, it  always  means  seconds
       except for the duration parameters of the silence effect.

   Supported Effects
       To  see whether SoX has support for an optional effect, enter sox_ng -h
       and look for its name in the EFFECTS list; a categorized  list  of  the
       effects can be found in the accompanying README file.

       allpass [-1|-2] frequency [width[h|k|o|q]]
              Apply  a  two-pole  all-pass  filter with central frequency fre-
              quency and filter width width.  An all-pass filter  changes  the
              audio's  frequency  to  phase  relationship without changing its
              frequency to amplitude relationship.  The filter is described in
              detail in [1].

              -1 or -2 use an experimental 1-pole or 2-pole filter,  in  which
              case width does not apply.

              This effect supports the --plot global option.

       band [-n] frequency [width[h|k|o|q]]
              Apply  a  band-pass  filter.  The frequency response drops loga-
              rithmically around the center frequency.   The  width  parameter
              gives  the  slope  of  the  drop: the frequencies at frequency +
              width and frequency - width will have half their original ampli-
              tudes.  Its default value is half of the center frequency.

              band defaults to a mode oriented to pitched audio,  i.e.  voice,
              singing  or  instrumental music.  The -n (for noise) option uses
              the alternate mode for unpitched audio (e.g. percussion), though
              -n introduces a power gain of about 11dB in the filter,  so  be-
              ware  of output clipping.  band introduces noise in the shape of
              the filter, peaking at the center frequency and settling  around
              it.

              This effect supports the --plot global option.

              See sinc for a band-pass filter with steeper shoulders.

       bandpass|bandreject [-c] frequency width[h|k|o|q|b]
              Apply  a  two-pole  Butterworth  band-pass or band-reject filter
              with central  frequency  frequency,  and  (3dB-point)  bandwidth
              width.   The  -c  option  applies only to bandpass and selects a
              constant skirt gain (peak gain = Q) instead of  the  default,  a
              constant  0dB peak gain.  The filters roll off at 6dB per octave
              (20dB per decade) and are described in detail in [1].

              These effects support the --plot global option.

              See sinc for a band-pass filter with steeper shoulders.

       bass|treble gain [frequency [width[s|h|k|o|q]]]
              Boost or cut the bass (lower) or treble (upper)  frequencies  of
              the audio using a two-pole shelving filter with a response simi-
              lar  to  that  of a standard hifi's tone controls.  This is also
              known as shelving equalization.

              gain gives the gain at 0Hz for bass or, for treble, whichever is
              the lower of ~22kHz and the Nyquist frequency.  Its useful range
              is about -20 (for a large cut) to +20 (for a large boost).   Be-
              ware of Clipping when using a positive gain.

              The  filter can be fine-tuned using the following optional para-
              meters:

              frequency sets the filter's central frequency and so can be used
              to extend or reduce the frequency range to be  boosted  or  cut.
              The default values are 100Hz for bass and 3kHz for treble.

              width determines how steep the filter's shelf transition is.  In
              addition to the common width specification methods, `slope' (the
              default)  may be used.  Its useful range is about 0.3 for a gen-
              tle slope to 1 (the maximum) for a steep slope; and its  default
              value is 0.5.

              The filters are described in detail in [1].

              These effects support the --plot global option.

              See equalizer for a peaking equalization effect.

       bend [-f frame-rate(25)] [-o oversampling(16)]
              {start-position(+),cents,end-position(+)}

              Changes  the pitch by specified amounts at specified times with-
              out changing  the  duration.   Each  given  triple:  start-posi-
              tion,cents,end-position specifies one bend.  cents is the number
              of  cents  (100  cents = 1 semitone) by which to bend the pitch.
              The other values specify the points in time at  which  to  start
              and  end  bending  the  pitch.   During each bend, the frequency
              changes logarithmically, i.e. by the same number  of  cents  per
              second.

              The  pitch bending algorithm uses the Discrete Fourier Transform
              (DFT) at a particular frame rate and oversampling rate.  The  -f
              (from  10 to 80) and -o (from 4 to 32) parameters may be used to
              adjust these parameters and thus control the smoothness  of  the
              changes in pitch.

              For  example,  an  initial  tone  is  generated, then bent three
              times, yielding four different notes in total:

                 play_ng -n synth 2.5 sin 667 gain 1 \
                   bend .35,180,.25  .15,740,.53  0,-520,.3

              Here, the first bend runs from 0.35 to 0.6 seconds and the  sec-
              ond  one from 0.75 to 1.28 seconds.  Note that the clipping that
              is produced in this example is deliberate;  to  remove  it,  use
              gain -5 in place of gain 1.

              See pitch.

       biquad b0 b1 b2 a0 a1 a2
              Apply  a  biquad Infinite Impulse Response filter with the given
              coefficients, where b* and a* are the numerator and  denominator
              coefficients respectively.

              See http://en.wikipedia.org/wiki/Digital_biquad_filter (where a0
              = 1).

              This effect supports the --plot global option.

       channels channels
              Invoke  a  simple  algorithm to change the number of channels in
              the audio signal to the given number: mixing if  decreasing  the
              number  of  channels  or duplicating if increasing the number of
              channels.

              The channels effect is invoked automatically if SoX's -c  option
              specifies  a number of channels that is different to that of the
              input file(s).  Alternatively, if this effect is  given  explic-
              itly,  SoX's -c option need not be given.  For example, the fol-
              lowing two commands are equivalent:

                 sox_ng input.wav -c 1 output.wav bass -b 24
                 sox_ng input.wav      output.wav bass -b 24 channels 1

              though the second form is more flexible as it allows the effects
              to be ordered arbitrarily.

              For example, when making a stereo file  quadraphonic,  the  left
              and right channels are copied into the third and fourth and when
              mixing  a  four-channel file down to stereo, the left channel is
              the mix of the first and third and the right of the  second  and
              fourth.

              See remix for an effect that allows channels to be mixed and se-
              lected arbitrarily.

       chorus [-n|-l|-q] [-s|-t] [gain-in [gain-out {delay [decay [speed
       [depth [-s|-t]]]]}]]
              Add  a chorus effect to the audio.  This can make a single voice
              sound like a chorus but can also be applied to instrumentation.

              Chorus resembles an echo effect with a short  delay  but,  while
              echo's  delay  is constant, chorus' delay varies by a sinusoidal
              or triangular modulation.

              See [3] for further discussion of the chorus effect.

              The -l flag makes chorus do linear interpolation between samples
              when the offset into the delay line is not a whole number, which
              is about 15% slower but makes it considerably less noisy and  -q
              asks  for  quadratic interpolation which is about 40% slower but
              makes it even less noisy.  -n explicitly asks for no  interpola-
              tion, the default, fast and fuzzy.

              -s  or -t before the stages change the default wave type for all
              of them.

              All parameters are all optional and, if missing, assume the fol-
              lowing values:

              Parameter   Range    Default   Description
              gain-in      -1-1      0.5     Proportion of input
                                             delivered clean to the adder
              gain-out     -1-1       1      Final volume adjustment
              delay       0-1000    40-60    Fixed delay in milliseconds
              decay        -1-1      0.5     Volume of delayed output
              speed       0-192k    0.25     Modulation frequency
              depth       0-1000      2      Extra delay in milliseconds
              wave        -s|-t      -s      Sinusoidal/triangular modulation

              Each delay ranges from the fixed delay to delay + depth.

              Gain-out is then applied to the sum of the input scaled by gain-
              in and the outputs from the delays scaled by their decays.

              For internal reasons regarding the speed of chorus, there  is  a
              limit of 256 chorus stages.

              A  typical delay is around 40ms to 60ms; the modulation speed is
              best near 0.25Hz and the modulation depth around 2ms.  For exam-
              ple, a single delay:

                 play_ng guitar1.wav chorus 0.7 0.9 55 0.4 0.25 2 -t

              Two delays of the original samples:

                 play_ng guitar1.wav chorus 0.6 0.9 50 0.4 0.25 2 -t \
                    60 0.32 0.4 1.3 -s

              A fuller-sounding chorus (with three additional delays):

                 play_ng guitar1.wav chorus 0.5 0.9 50 0.4 0.25 2 -t \
                    60 0.32 0.4 2.3 -t 40 0.3 0.3 1.3 -s

              flanger can do  everything  that  chorus  does  except  multiple
              stages  but  works in floating point internally instead of inte-
              gers so is slower without a floating point processor:

                 chorus -l gain-in gain-out delay decay speed depth -wave

              is equivalent to

                 flanger delay depth 0 100xdecay/gain-in speed wave 0 \
                    vol gain-out/(gain-in+decay)

              For a flow diagram of how chorus works, say sox_ng --help-effect
              chorus.

       compand attack1,decay1{,attack,decay}
              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB,out-dB}
              [gain [initial-volume-dB [delay]]]

              Compand (compress or expand) the dynamic range of the audio.

              The attack and decay parameters (in seconds) determine the  time
              over  which the instantaneous level of the input signal is aver-
              aged to determine its volume; attacks refer to increases in vol-
              ume and decays refer to decreases.  For most situations, the at-
              tack time (its response to the music getting louder)  should  be
              shorter than the decay time because the human ear is more sensi-
              tive  to  sudden  loud  music than sudden soft music.  When more
              than one pair of attack/decay parameters is specified, each  in-
              put channel is companded separately and the number of pairs must
              agree  with  the  number  of input channels.  Typical values are
              0.3,0.8 seconds.

              The second parameter is a list  of  points  on  the  compander's
              transfer function specified in dB relative to the maximum possi-
              ble  signal  amplitude.   The input values must be in a strictly
              increasing order but the transfer function does not have  to  be
              monotonically rising.  If omitted, the value of out-dB1 defaults
              to  the  same  value as in-dB1; levels below in-dB1 are not com-
              panded but may have gain applied to them.  The  point  `0,0'  is
              assumed  but  may  be overridden by `0,out-dBn'.  If the list is
              preceded by a soft-knee-dB value, then the points at where adja-
              cent line segments on the transfer function meet are rounded  by
              the  amount given.  Typical values for the transfer function are
              `6:-70,-60,-20'.

              The third (optional) parameter is an additional gain in dB to be
              applied at all points on the transfer function and  allows  easy
              adjustment of the overall gain.

              The  fourth  (optional)  parameter is an initial level to be as-
              sumed for each channel when companding starts.   This  lets  you
              supply  a  nominal  level initially so that, for example, a very
              large gain is not applied to initial signal  levels  before  the
              companding  action  has  begun  to operate: it is quite probable
              that in such an event, the  output  would  be  severely  clipped
              while  the  compander gain adjusts itself.  A typical value (for
              audio which is initially quiet) is -90 dB.

              The fifth (optional) parameter is a delay in seconds.  The input
              signal is analyzed immediately to control the compander, but  it
              is  delayed before being fed to the volume adjuster.  Specifying
              a delay approximately equal to the attack/decay times allows the
              compander to operate in a  predictive  rather  than  a  reactive
              mode.  A typical value is 0.2 seconds.

                                    *        *        *

              The  following  example  might  be used to make a piece of music
              with both quiet and loud passages suitable for listening to in a
              noisy environment such as a moving vehicle:

                 sox_ng asz.wav asz-car.wav compand 0.3,1 6:-70,-60,-20 -5 -90 0.2

              The transfer function (`6:-70,...') says that very  soft  sounds
              (below  -70dB)  remain unchanged.  This stops the compander from
              boosting the volume on `silent' passages such as  between  move-
              ments.   However, sounds in the range -60dB to 0dB (maximum vol-
              ume) are boosted so that the 60dB dynamic range of the  original
              music  is  compressed  3-to-1  into  a 20dB range, which is wide
              enough to enjoy the music but narrow enough to  get  around  the
              road  noise.   The  `6:'  selects 6dB soft-knee companding.  The
              -5 dB output gain is needed to avoid clipping (the number is in-
              exact and was derived by experimentation).  The -90 dB  for  the
              initial  volume  will work fine for a clip that starts with near
              silence and the delay of 0.2 seconds makes the  compander  react
              more quickly to sudden volume changes.

              In  the  next  example, compand is used as a noise-gate for when
              the noise is at a lower level than the signal:

                 play_ng in.au compand .1,.2 -inf,-50.1,-inf,-50,-50 0 -90 .1

              Here is another noise-gate, this time for when the noise is at a
              higher level than the signal (making it, in some  ways,  similar
              to a squelch effect):

                 play_ng in.au compand .1,.1 -45.1,-45,-inf,0,-inf 45 -90 .1

              This  effect supports the --plot global option (for the transfer
              function).

              For a flow diagram of how compand works, say  sox_ng  --help-ef-
              fect compand.

              See mcompand for a multiple-band companding effect.

       contrast [enhancement-amount(75)]
              Comparable  with compression, this effect modifies an audio sig-
              nal to make it sound louder.   enhancement-amount  controls  the
              amount  of  the  enhancement and is a number in the range 0-100.
              Note that enhancement-amount = 0 still gives a significant  con-
              trast enhancement.

              See the compand and mcompand effects.

       dcshift shift [limiter-gain]
              Apply  a  DC shift to the audio.  This can be useful to remove a
              known DC offset (caused perhaps by a  hardware  problem  in  the
              recording  chain)  from the audio.  The effect of a DC offset is
              reduced headroom and hence volume.  The stat or stats effect can
              be used to determine if a signal has a DC offset.

              The given dcshift value is a floating point number in the  range
              of +-2 that indicates the amount to shift the audio (which is in
              the range of +-1).

              An  optional  limiter-gain  can be specified as well.  It should
              have a value much less than 1 (e.g. 0.05 or 0.02)  and  is  used
              only on peaks to prevent clipping.

              An  alternative  approach to removing a DC offset (albeit with a
              short delay) is to use the highpass filter effect at a frequency
              of say 10Hz, as illustrated in the following example:

                 sox_ng -n dc.wav synth 5 sin %0 50
                 sox_ng dc.wav fixed.wav highpass 10


       deemph Apply Compact Disc (IEC 60908) de-emphasis with a treble attenu-
              ation shelving filter.

              Pre-emphasis was applied in the mastering of some CDs issued  in
              the early 1980s.  These included many classical music albums, as
              well  as  now sought-after issues of albums by The Beatles, Pink
              Floyd and others.  Pre-emphasis should be  removed  at  playback
              time  by  a de-emphasis filter in the playback device.  However,
              not all modern CD players have this filter and very  few  PC  CD
              drives have it; playing pre-emphasized audio without the correct
              de-emphasis filter results in audio that sounds harsh and is far
              from what its creators intended.

              With  the  deemph  effect, it is possible to apply the necessary
              de-emphasis to audio that has been extracted from  a  pre-empha-
              sized  CD  and then either burn the de-emphasized audio to a new
              CD (which will then play correctly on any CD player)  or  simply
              play the correctly de-emphasized audio files on the PC.  For ex-
              ample:

                 sox_ng track1.wav track1-deemph.wav deemph

              and then burn track1-deemph.wav to CD, or

                 play_ng track1-deemph.wav

              or simply

                 play_ng track1.wav deemph

              The  de-emphasis  filter is implemented as a biquad and requires
              the input audio sample rate to be either 44.1kHz or 48kHz.   Its
              maximum  deviation from the ideal response is only 0.06dB (up to
              20kHz).

              This effect supports the --plot global option.

       delay {position(=)}
              Delay zero or more audio channels such that they  start  at  the
              given position.

              For  example, delay 1.5 +1 3000s delays the first channel by 1.5
              seconds, the second channel by 2.5 seconds (one second more than
              the previous channel), the third channel  by  3000  samples  and
              leaves  other channels undelayed.  The following (one long) com-
              mand plays a chime sound:

                 play_ng -n synth -j 3 sin %3 sin %-2 sin %-5 sin %-9 \
                   sin %-14 sin %-21 fade h .01 2 1.5 delay \
                   1.3 1 .76 .54 .27 remix - fade h 0 2.7 2.5 norm -1

              and this an arpeggiated guitar chord:

                 play_ng -n synth pl G2 pl B2 pl D3 pl G3 pl D4 pl G4 \
                   delay 0 .05 .1 .15 .2 .25 remix - fade 0 4 .1 norm -1

              With no parameters it does nothing.  To delay  all  channels  by
              the same amount, use the pad effect.

       dither [-S|-s|-f filter] [-a] [-p precision]
              Apply  dithering  to  the  audio.  Dithering deliberately adds a
              small amount of noise to the signal in  order  to  mask  audible
              quantization effects that can occur if the output sample size is
              less than 24 bits.  With no options, this effect adds TPDF white
              noise.

              The  -S  option selects a slightly `sloped' TPDF, biased towards
              higher frequencies.  It can be used at any  sampling  rate  but,
              below  ~~22kHz,  plain  TPDF  is  probably  better and, above ~~
              37kHz, noise-shaping (if available) is probably better.

              The -s option enables noise-shaping with the shibata filter (the
              same as -f shibata) and with the -f option it is possible to se-
              lect a particular noise-shaping filter from the following  list:
              lipshitz,  f-weighted, modified-e-weighted, improved-e-weighted,
              gesemann,       shibata,       low-shibata,        high-shibata,
              shibata-(A|B)(0|1|2|3|4|5|6)  and shibata-A-saturated.  The lat-
              ter shibata- ones use the new  shaper  coefficients  from  Naoki
              Shibata's SSRC package, described at https://shibatch.org/ssrc

              The  filter types are distinguished by the following properties:
              audibility of noise, level of (inaudible, but  in  some  circum-
              stances  problematic) shaped high frequency noise and processing
              speed and they are available for the following sample rates:

                  Filter                Sample rates
                  lipshitz              44100
                  e- and f-weighted     48000
                  gesemann              44100, 48000
                  shibata               8000, 11025, 16000, 22050
                                        32000, 37800, 44100, 48000
                  low-shibata           44100, 48000
                  high-shibata          44100
                  shibaba-A0 and A1     8000, 11025, 22050, 44100,
                                        48000, 88200, 96000, 192000
                  shibata-A2            44100, 48000, 88200, 96000, 192000
                  shibata-A3 to A6      44100, 48000
                  shibata-B0 to B6      44100, 48000
                  shibata-A-saturated   8000, 11025, 22050

              The -a option enables a mode where dithering (and  noise-shaping
              if  applicable) are automatically enabled only when needed.  The
              most likely use for this is when applying fade in or out  to  an
              already  dithered  file, so that the redithering applies only to
              the faded portions.  However, auto dithering is  not  foolproof,
              so  the  fades should be checked carefully for any noise modula-
              tion; if this occurs, then either redither the whole file or use
              trim and fade and concatenate the results.

              The -p option overrides the target precision in bits and can  be
              from 1 to 24.

              If the SoX global option -R option is not given, the pseudo-ran-
              dom number generator used to generate the white noise is reseed-
              ed,  i.e. the generated noise will be different on every invoca-
              tion.

              If the target precision is 1-bit, the sdm effect is applied  au-
              tomatically with default settings. Invoke it manually to control
              its options.

              See the above section on Dithering.

       dolbyb [-e|-d] [-u upsamp] [-h] [-t gain(1.0)] [-a prec(-5.0)]
       [-f {1|2|3|4}]
              dolbyb  is a Dolby B decoder/encoder based on dolbybcsoftwarede-
              code which simulates the operation  of  a  Dolby B  en/decoder's
              electronic circuit.

              By default, dolbyb applies Dolby B decoding to its input signal;
              with  -e  it does Dolby B encoding. -d is also accepted but only
              for symmetry, as it is the default mode of operation.

              -u sets the upsampling ratio to use in the sliding filter.  Dig-
              ital filtering only works well if the sample rate is well  above
              the  cutoff  frequency of the filter. For Dolby B's sliding fil-
              ter, that frequency can be as high as 34kHz and  this  does  not
              work  well  if  the  sample  rate is only 44.1Khz. To get around
              this, it upsamples the audio to a higher  rate  when  it  passes
              through  this  filter.   By default, -u0, the upsampling rate is
              set automatically so that the upper  sample  rate  is  at  least
              200Khz; upsampling can be switched off with -u1.

              If  -h  is  given, upsampling is used throughout the effect from
              when the audio enters to when it leaves, not just in the sliding
              filter.  As dolbyb's up/downsampling algorithm  is  simple  (re-
              peating and averaging samples) you may obtain higher quality re-
              sults by upsampling with rate before dolbyb -u1 and downsampling
              it afterwards.

              -t  ("threshold")  adjusts the gain when the audio is fed to the
              Dolby gain control circuits.  When a tape deck  is  encoding  or
              decoding  a magnetic tape, it knows the signal level at the tape
              heads but with audio files the maximum signal level may not  ac-
              curately represent the tape's maximum flux density (200nWb/m for
              cassette  tapes),  giving  erroneous results.  The -t option ad-
              justs the volume level at which the  sliding  filter  reacts  to
              overcome this.  Its default value is 1.0, which assumes that the
              maximum amplitude of the signal represents the maximum recording
              level  on  tape;  higher  values assume that it was recorded too
              quietly and values below 1.0 are for when it  was  recorded  too
              loud.

              To  begin  with, when you have little idea of what level to use,
              try a wide range of levels like 5, 10, 15 and 20.  If the result
              sounds muffled, the threshold is too low and if it seems to have
              too much treble, the threshold is too high.  Once you  know  the
              approximate  level,  you  can try more closely-spaced levels and
              listen carefully to find the best level possible.   Logic  would
              suggest listening to where tracks fade out, to see if the treble
              increases,  but  this  method  doesn't seem to work well and the
              best way seems to be to see how low the level can be set  before
              the  results  sound  dull and muffled, then choose a level a bit
              higher than this; you can just about hear the difference between
              results that differ in threshold setting by about 2.

              In decode mode, the program has to use trial and  error  to  get
              the right output sample values. -a sets how accurate it needs to
              be  before it is considered OK. A figure of 0.0 dB would mean an
              accuracy of about 1 sample value. The default is -5.0 dB,  which
              is accurate to less than one sample value.

              -f  selects  one  of  four  types of filter to use.  The program
              originally simulated an analog circuit for a Dolby B  noise  re-
              ducer. However, too much filtering in the side path was altering
              the phase of the side path audio, which caused problems when the
              side path was recombined with the main signal. Basically signals
              don't  add together very well if there is too much difference in
              the phase.  To fix this, there are now 4 filter modes with hope-
              fully less of a phase change:

              -f1    is the original method.

              -f2    is a newer method that seems to work better than 1.

              -f3    is another rearrangement which in practice  doesn't  seem
                     to be any better than 1.

              -f4    seems to work best, hence it is the default mode.

              For  further detail on these parameters and advice on digitizing
              and processing Dolby B-encoded tapes, consult the wiki pages  at
              https://codeberg.org/sox_ng/libdolbyb

       dop    DSD  over  PCM. 1-bit DSD data is packed into 24-bit samples for
              transport over non-DSD-aware links.

       downsample [factor(2)]
              Downsample the signal by an integer factor: Only  the  first  of
              each factor samples is retained, the others are discarded.

              No  decimation filter is applied. If the input is not a properly
              band-limited baseband signal, aliasing will occur. This  may  be
              desirable, e.g., for frequency translation.

              The  new  lower  sample  rate  propagates forward in the effects
              chain but, unless you specify the new sample rate with -r before
              the output filename or with a final (no-op) rate effect, it will
              be resampled back up to the original sample rate.

              For a general resampling effect  with  antialiasing,  see  rate.
              See upsample.

       earwax This  effect  takes  a 44.1kHz stereo signal and adds audio cues
              that, when listened to on headphones, move the sound stage  from
              inside your head to outside and in front of you, as if listening
              to loudspeakers.

              To see how earwax works, say sox_ng --help-effect earwax.

       echo gain-in gain-out <delay decay>
              Add  echoes to the audio.  In nature, echoes are reflected sound
              and digital echo effects emulate this and are often used to help
              fill out the sound of a single instrument or vocal.

              Gain-in controls how much of the input signal is delivered clean
              to the output, delay is the time difference in milliseconds  be-
              tween the original signal and its reflection, decay is the loud-
              ness  of the reflected signal and gain-out is a final volume ad-
              justment of the result.

              There is no limit to the number of delay/decay pairs you can use
              and gains and decays can be negative or greater than  1  if  you
              wish.

              echo extends the length of the signal by the maximum delay time.

              For  example,  this makes it sound as if there are twice as many
              instruments as are actually playing:

                 play_ng lead.aiff echo 0.8 0.88 60 0.4

              If the delay is very short, it sound like a metallic robot:  mu-
              sic:

                 play_ng lead.aiff echo 0.8 0.88 6 0.4

              A longer delay sounds like an open air concert in the mountains:

                 play_ng lead.aiff echo 0.8 0.9 1000 0.3

              One mountain more, and:

                 play_ng lead.aiff echo 0.8 0.9 1000 0.3 1800 0.25

              For  a  flow diagram of how echo works, say sox_ng --help-effect
              echo.

       echos gain-in gain-out <delay decay>

              Echos stands for `Echo in Sequel' and adds a sequence of  echoes
              to the audio.  That is, the first echo takes the input, the sec-
              ond  the  input  and the first echo, the third the input and the
              output of the second echo and so on.  A  single  echos  has  the
              same  effect  as a single echo.  Each delay decay pair gives the
              delay in milliseconds (with a minimum of one sample) and the de-
              cay of that echo.  Gain-out is a final volume multiplier applied
              to the sum of the input x gain-in  and  the  delays'  outputs  x
              their respective decays.

              echos  extends  the  length  of  the signal by the maximum delay
              time.

              For example:

              The sample is bounced twice in symmetric echos:

                 play_ng lead.aiff echos 0.8 0.7 700 0.25 700 0.3

              The sample is bounced twice in asymmetric echos:

                 play_ng lead.aiff echos 0.8 0.7 700 0.25 900 0.3

              The sample sounds as if it were played in a garage:

                 play_ng lead.aiff echos 0.8 0.7 40 0.25 63 0.3

              For a flow diagram of how echos works, say sox_ng  --help-effect
              echos.

       equalizer frequency width[q|o|h|k] gain
              Apply a two-pole peaking equalization filter.  With this filter,
              the  signal  level at and around a selected frequency can be in-
              creased or decreased while,  unlike  band-pass  and  band-reject
              filters, the level at all other frequencies is unchanged.

              frequency  gives  the  filter's  central  frequency in Hz, width
              gives its bandwidth and gain the required gain or attenuation in
              dB.  Beware of Clipping when using a positive gain.

              In order to produce complex equalization curves, this effect can
              be given several times, each with a different central frequency.

              The filter is described in detail in [1].

              This effect supports the --plot global option.

       fade [type] fade-in-length [stop-position(=) [fade-out-length]]
              Apply a fade effect to the beginning, end, or both of the audio.

              An optional type can be specified to select  the  shape  of  the
              fade  curve:  q  for  quarter  of a sine wave, h for half a sine
              wave, t for linear (`triangular') slope, l for logarithmic,  and
              p for inverted parabola. The default is logarithmic.

              A fade-in starts from the first sample and ramps the signal lev-
              el  from 0 to full volume over the time given as fade-in-length.
              Specify 0 if no fade-in is wanted.

              For a fade-out, the audio is truncated at stop-position and  the
              signal level is ramped from full volume down to 0 over an inter-
              val  of  fade-out-length  before the stop-position. If fade-out-
              length is not specified, it defaults to the same value as  fade-
              in-length.   No  fade-out  is  performed if stop-position is not
              specified.  If the audio length can be determined from the input
              file header and any previous effects, then `-0' (or, for histor-
              ical reasons, `0') may be specified for stop-position  to  indi-
              cate  the  usual  case of a fade out that ends at the end of the
              input audio stream.

              See the splice effect.

       fir [coefs-file|coef <coef>]
              Use SoX's FFT convolution engine with given Finite  Impulse  Re-
              sponse  filter  coefficients.  If a single argument is given, it
              is the name of a file containing the filter coefficients  (white
              space  separated;  may contain `#' comments). If the filename is
              `-' or if no argument is given, the coefficients are  read  from
              the  `standard  input'  (stdin);  otherwise, coefficients may be
              given on the command line.  Examples:

                 sox_ng in.au out.au fir .0195 -.082 .234 .891 -.145 .043

                 sox_ng in.au out.au fir coefs.txt

              with coefs.txt containing

                 # HP filter: freq=10000
                   1.2311233052619888e-01
                  -4.4777096106211783e-01
                   5.1031563346705155e-01
                  -6.6502926320995331e-02

              This effect supports the --plot global option.

       firfit [knots-file|<freq gain>]
              Use SoX's FFT convolution engine to make a filter whose frequen-
              cy response approximates a spline passing through  a  series  of
              frequency/gain  pairs.  If a single argument is given, it is the
              name of a file containing the knots (white space separated;  may
              contain  `#'  comments).   If the given filename is `-' or if no
              argument is given, the knots are read from the `standard  input'
              (stdin); otherwise, knots may be given on the command line.

              Gains  are  in dB and the knot frequencies must be in increasing
              order.

              Examples:

                 sox_ng in.au out.au firfit 20 0 10000 -3

              gives a gentle low-pass filter and

                 sox_ng in.au out.au firfit knots.txt

              with knots.txt containing

                 # Approximate telephone response
                 300  -100
                 400   -10
                 480     0
                 2800    0
                 3000  -10
                 3400 -100

              approximates the response of a carbon microphone telephone.

              This effect supports the --plot global option.

       flanger [-n|-l|-q] [-s|-t] [delay(0) [depth(2) [regen(0) [width(71)
       [speed(0.5) [shape(sine)] [phase(25) [interp(linear)]]]]]]]
              Apply a flanging effect to the audio.  See [3]  for  a  detailed
              description of flanging.

              The  parameters give the base delay and the added swept delay in
              milliseconds, the percentage of regeneration (the delayed signal
              feedback), width the percentage of delayed signal that is  mixed
              with  the  original,  speed the number of sweeps per second, the
              shape of the swept wave (sine or triangle),  the  percentage  of
              phase shift of the swept wave in multichannel flanges (0 = 100 =
              the  same  phase  on each channel) and the type of digital delay
              line interpolation (none, linear or quadratic).

              sine, triangle, none, linear and quadratic can be abbreviated.

              The input and the delay's output are mixed and balanced so  they
              don't clip, so a width of 100 gives 50:50 mixing; to obtain only
              the delayed output and none of the input, specify width as inf.

              Despite  containing  a delay, flanger does not extend the length
              of the signal so, if you also want the last dregs of the delayed
              output and feedback, pad the signal beforehand.

              sine, triangle, none, linear and quadratic  can  be  abbreviated
              and  -s, -t, -n, -l and -q are alternative ways to set the wave-
              shape and the interpolation type without having to  specify  the
              rest of the parameters.

              For  a  flow diagram of how flanger works, say sox_ng --help-ef-
              fect flanger.

       gain [-e|-B|-b|-r] [-n] [-l|-h] [gain-dB(0)]
              Apply amplification or attenuation to the audio  signal  or,  in
              some  cases,  to  some of its channels.  Note that use of any of
              -e, -B, -b, -r and -n requires temporary file space to store the
              audio to be  processed,  so  may  be  unsuitable  for  use  with
              streamed audio.

              Without other options, gain-dB adjusts the signal power level by
              the given number of dB: positive amplifies (beware of clipping),
              negative attenuates.  With other options, the gain-dB amplifica-
              tion or attenuation is applied after the processing due to those
              options.

              With the -e option, the levels of the audio channels of a multi-
              channel file are equalized, i.e. gain is applied to all channels
              other than that with the highest peak level so that all channels
              attain the same peak level (but, without also giving -n, the au-
              dio is not normalized).

              The  -B  (balance) option is similar to -e, but with -B, the RMS
              level is used instead of the peak level.  -B might  be  used  to
              correct stereo imbalance caused by an imperfect record turntable
              cartridge.  Note that, unlike -e, -B might cause some clipping.

              -b  is similar to -B but has clipping protection, i.e. if neces-
              sary to prevent clipping whilst balancing,  attenuation  is  ap-
              plied  to  all  channels.  In conjunction with -n, -B and -b are
              synonymous.

              The -r option is used in conjunction with a prior invocation  of
              gain with the -h option - see below for details.

              The -n option normalizes the audio to 0dB FSD.  It is often used
              in conjunction with a negative gain-dB so that the audio is nor-
              malized to a given level below 0dB.  For example,

                 sox_ng in.au out.au gain -n

              normalizes to 0dB, and

                 sox_ng in.au out.au gain -n -3

              normalizes to -3dB.

              The -l option invokes a simple limiter. For example,

                 sox_ng in.au out.au gain -l 6

              applies  6dB  of  gain but never clips.  Note that limiting more
              than a few dBs more than occasionally in a piece of audio is not
              recommended as it can cause audible distortion.  See the compand
              effect for a more capable limiter.

              The -h option is used to apply gain to provide headroom for sub-
              sequent processing.  For example, with

                 sox_ng in.au out.au gain -h bass +6

              6dB of attenuation is applied prior to the bass boosting effect,
              ensuring that it does not clip.  Of course, with bass, it is ob-
              vious how much headroom is needed but, with other effects  (e.g.
              rate,  dither), it is not always as clear.  Another advantage of
              using gain -h rather than an explicit attenuation  is  that,  if
              the  headroom  is  not used by subsequent effects, it can be re-
              claimed with gain -r, for example:

                 sox_ng in.au out.au gain -h bass +6 rate 44100 gain -r

              The above effects chain guarantees never to clip nor amplify; it
              attenuates if necessary to prevent clipping, but by only as much
              as is needed to do so.

              Output formatting (dithering and bit-depth reduction)  also  re-
              quires headroom which cannot be reclaimed, e.g.

                 sox_ng in.au out.au gain -h bass +6 rate 44100 gain -rh dither

              Here,  the  second gain invocation reclaims as much of the head-
              room as it can from the preceding effects but  retains  as  much
              headroom as is needed for subsequent processing.  The SoX global
              option  -G can be given to automatically invoke gain -h and gain
              -r.

              See the norm and vol effects.

       highpass|lowpass [-1|-2] frequency [width[q|o|h|k]]
              Apply a high-pass or low-pass filter with 3dB  point  frequency.
              The  filter  can be either single-pole (with -1), or double-pole
              (the default, or with -2).  width applies  only  to  double-pole
              filters;  the  default  is Q = 0.707 and gives a Butterworth re-
              sponse.  The filters roll off at 6dB per pole per  octave  (20dB
              per  pole per decade).  The double-pole filters are described in
              detail in [1].

              These effects support the --plot global option.

              See sinc for filters with a steeper roll-off.

       hilbert [-n taps]
              Apply an odd-tap Hilbert transform filter,  phase  shifting  the
              signal by 90 degrees.

              This is used in many matrix coding schemes and for analytic sig-
              nal  generation.   The process is often written as a multiplica-
              tion by i (or j), the imaginary unit.

              An odd-tap Hilbert transform filter has a band-pass characteris-
              tic, attenuating the lowest and highest frequencies.  Its  band-
              width  can be controlled by the number of filter taps, which can
              be specified with -n.  By default, the number of taps is  chosen
              for a cutoff frequency of about 75 Hz.

              This effect supports the --plot global option.

       ladspa [-l] [-r] module [plugin] {argument}
              Apply  a  LADSPA [5] (Linux Audio Developer's Simple Plugin API)
              plugin.  Despite the name, LADSPA is not  Linux-specific  and  a
              wide  range  of  effects is available as LADSPA plugins, such as
              CMT [6] (the Computer Music Toolkit) and Steve  Harris's  plugin
              collection  [7].  The  first  argument is the plugin module, the
              second the name of the plugin (a module can  contain  more  than
              one plugin) and any other arguments are for the control ports of
              the  plugin. Missing arguments are supplied by default values if
              possible.

              Normally, the number of input ports of the plugin must match the
              number of input channels and the number of output  ports  deter-
              mines the output channel count.  However, the -r (replicate) op-
              tion allows cloning a mono plugin to handle multichannel input.

              Some  plugins introduce latency which SoX may optionally compen-
              sate for.  The -l (latency  compensation)  option  automatically
              compensates  for latency as reported by the plugin via an output
              control port named "latency".

              If it is set, the environment variable LADSPA_PATH  is  used  as
              the search path for plugins.  See LADSPA_PATH in the section EN-
              VIRONMENT.

       loudness [gain [reference]]
              Loudness  control  is  similar  to  the gain effect but provides
              equalization   for   the    human    auditory    system.     See
              http://en.wikipedia.org/wiki/Loudness for a detailed description
              of  loudness.   The gain is adjusted by the given gain parameter
              (usually negative) and the signal equalized according to ISO 226
              w.r.t. a reference level of 65dB, though an  alternative  refer-
              ence level may be given if the original audio has been equalized
              at  some other level.  A default gain of -10dB is used if a gain
              value is not given.

              See the gain effect.

       lowpass [-1|-2] frequency [width[q|o|h|k]]
              Apply a low-pass filter.  See the description  of  the  highpass
              effect for details.

       mcompand "compand-args" {frequency "compand-args"}

              The quoted compand-args are as for the compand effect:
              attack1,decay1{,attack,decay}
              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB,out-dB}
              [gain [initial-volume-dB [delay]]]

              The multi-band compander is similar to the single-band compander
              but  the  audio is first divided into bands using Linkwitz-Riley
              crossover filters and a separately specifiable compander is  run
              on  each band.  See the compand effect for the definition of its
              parameters.  Compand parameters  are  specified  between  double
              quotes  and  the  crossover  frequency for that band is given by
              crossover-freq; these can be repeated to create multiple bands.

              The following examples approximate Dolby A compression  and  de-
              compression,  as  used  for tape noise reduction in professional
              recording studios:

                 # Dolby A compressor
                 sox_ng in.au dolbyA.au mcompand \
                    ".1,.1 4:-56,-46,-36,-26,-26,-20,-17,-15,-9,-9" 80 \
                    ".1,.1 4:-56,-46,-36,-26,-26,-20,-17,-15,-9,-9" 3k \
                    ".1,.1 4:-56,-46,-36,-26,-26,-20,-17,-15,-9,-9" 9k \
                    ".1,.1 4:-56,-42,-36,-23,-26,-18,-17,-14,-9,-9"

                 # Dolby A decompressor
                 sox_ng dolbyA.au out.au mcompand \
                    ".1,.1 4:-46,-56,-26,-36,-20,-26,-15,-17,-9,-9" 80 \
                    ".1,.1 4:-46,-56,-26,-36,-20,-26,-15,-17,-9,-9" 3k \
                    ".1,.1 4:-46,-56,-26,-36,-20,-26,-15,-17,-9,-9" 9k \
                    ".1,.1 4:-42,-56,-23,-36,-18,-26,-14,-17,-9,-9"

              Real Dolby A probably compands each channel separately but  that
              is left as an exercise to interested readers.

              See compand for a single-band companding effect.

       noiseprof [profile-file]
              Calculate  a  profile  of  the audio for use in noise reduction.
              See the description of the noisered effect for details.

       noisered [profile-file [amount]]
              Reduce noise in the audio signal  by  profiling  and  filtering.
              This effect is moderately effective at removing consistent back-
              ground noise such as hiss or hum.  To use it, first run SoX with
              the  noiseprof  effect  on a section of audio that ideally would
              contain silence but in fact contains noise - such  sections  are
              typically  found  at  the  beginning  or the end of a recording.
              noiseprof writes a noise profile to profile-file or to stdout if
              no profile-file or if `-' is given.  E.g.

                 sox_ng speech.wav -n trim 0 1.5 noiseprof speech.noise-profile

              To actually remove the noise, run SoX again, this time with  the
              noisered  effect;  noisered  reduces  noise according to a noise
              profile generated by noiseprof, from profile-file if it is given
              or from stdin if no profile-file or if `-' is given.  E.g.

                 sox_ng speech.wav cleaned.wav noisered speech.noise-profile 0.3

              How much noise should be removed is specified by amount-a number
              between 0 and 1 with a default of 0.5.   Higher  numbers  remove
              more  noise  but present a greater likelihood of removing wanted
              components of the audio signal.  Before  replacing  an  original
              recording  with a noise-reduced version, experiment with differ-
              ent amount values to find the optimal one for  your  audio;  use
              headphones  to check that you are happy with the results, paying
              particular attention to quieter sections of the audio.

              On most systems, the two stages - profiling and reduction -  can
              be combined using a pipe, e.g.

                 sox_ng noisy.wav -n trim 0 1 noiseprof | \
                    play_ng noisy.wav noisered


       norm [dB-level(0)]
              Normalize the audio.  norm is just an alias for gain -n; see the
              gain effect for details.

       oops   Out  Of  Phase  Stereo  effect.  Mixes stereo to twin mono where
              each mono channel contains the difference between the  left  and
              right stereo channels.  This is sometimes known as the `karaoke'
              effect as it often has the effect of removing most or all of the
              vocals from a recording.  It is equivalent to remix 1,2i 1,2i.

       overdrive [gain(20) [color(20)]]
              Non-linear  distortion.  The color parameter controls the amount
              of even harmonic content in the overdriven output. Both  parame-
              ters range from 0 to 100.

       pad { [%]length[@position(=)] }
              Pad  the  audio  with silence at the beginning, at the end or at
              any specified points throughout the audio.  length is the amount
              of silence to insert and position the position in the input  au-
              dio stream at which to insert it.  Any number of lengths and po-
              sitions  may be specified, provided that a specified position is
              not less that the previous one.  Position is  optional  for  the
              first  and  last  lengths specified and if omitted correspond to
              the beginning and the end of the audio respectively.  For  exam-
              ple,  pad 1.5 1.5 adds 1.5 seconds of silence at each end of the
              audio, whilst pad 4000s@3:00 inserts 4000 samples of  silence  3
              minutes into the audio.  If silence is wanted only at the end of
              the  audio,  either  specify the end position or specify a zero-
              length pad at the start.

              If a pad specification starts with with a % sign, the output  is
              padded  to  a  multiple of length at the specified position. For
              example, pad 0 %10 adds silence at the end of the  audio  up  to
              the next multiple of 10 seconds.

              See delay for an effect that can add silence at the beginning of
              the audio on a channel-by-channel basis.

       phaser [-n|l|q] [-s|t] [gain-in(.4) gain-out(.74) delay(3) regen(.4)
       speed(.5) [-s|-t(-s)]
              Add  a  phasing effect to the audio.  See [3] for a detailed de-
              scription of phasing.

              delay gives the maximum delay in milliseconds from  0  to  1000,
              regen  the  amount  of feedback from the delay from -1 to +1 and
              speed the frequency of delay-time modulation wave in Hz.

              The modulation is either sinusoidal (-s),  which  is  preferable
              for  multiple instruments, or triangular (-t) which gives single
              instruments a sharper phasing effect.  regen can be from  -1  to
              +1  but  should  usually  be less than 0.5 to avoid clipping and
              gain-out is the final volume adjustment from -1 to +1.

              The -l flag makes phaser do linear interpolation between samples
              when the offset into the delay line is not a whole number, which
              is about 15% slower but much less noisy and  -q  does  quadratic
              interpolation,  which  is  about 50% slower but even less noisy.
              -n explicitly asks for no interpolation, the default,  fast  and
              fuzzy.

              In  sox_ng, -s or -t can be given at the start or the end; to be
              compatible with earlier versions of SoX, supply all the  parame-
              ters with one of these at the end, use gain-in and gain-out from
              0 to 1, delay from 0 to 5, speed from 0.1 to 2 and don't use in-
              terpolation.

              Technically, the SoX phaser is not a phaser; it is a flanger.  A
              flanger  does  comb  filtering  with  equidistant  spacing (e.g.
              100Hz, 200Hz, 300Hz, 400Hz, ...), while a real phaser does  comb
              filtering  with  factored  spacing  (e.g.  100Hz,  200Hz, 400Hz,
              800Hz, ...) that sounds more harmonic.

              For example:

                 play_ng snare.flac phaser 0.8 0.74 3 0.4 0.5 -t

              Gentler:

                 play_ng snare.flac phaser 0.9 0.85 4 0.23 1.3 -s

              A popular sound:

                 play_ng snare.flac phaser 0.89 0.85 1 0.24 2 -t

              More severe:

                 play_ng snare.flac phaser 0.6 0.66 3 0.6 2 -t

              For a flow diagram of how phaser works, say sox_ng --help-effect
              phaser.

       pitch [-q] shift [segment [search [overlap]]]
              Change the audio pitch but not the tempo.

              shift gives the pitch shift  as  positive  or  negative  `cents'
              (i.e. 100ths of a semitone).

              Pitch  and  tempo  share the same fundamental algorithm; see the
              tempo effect for a description of the other parameters.

              See the bend, speed and tempo effects.

       rate [-q|-l|-m|-g|-h|-e|-v|-u] [override-options] [frequency]
              Change the audio sampling rate (i.e. resample the audio) to  any
              given  frequency  (even  non-integer if this is supported by the
              output file format) using a quality level defined as follows:

                    Quality    B/W    Rej dB     Typical Use
              -q     quick     n/a   ~=30@Fs/4   playback on ancient hardware
              -l      low      80%      100      playback on old hardware
              -m    medium     95%      100      audio playback
              -g    generic    95%      100      16-bit
              -h     high      95%      125      20-bit for 16-bit mastering
              -e    extreme    95%      150      24-bit
              -v   very high   95%      175      28-bit for 24-bit mastering
              -u     ultra     95%      200      32-bit

              These can also be selected with -Q n with n from 0 to 7.

              B/W (bandwidth) is the percentage of the  audio  frequency  band
              that  is  preserved  and Rej dB is the level of noise rejection.
              Increasing levels of resampling quality come at the  expense  of
              increasing  amounts of time to process the audio.  If no quality
              option is given, the quality level used is `high' when  process-
              ing  audio  and  `low' when playing it.  See Playing & Recording
              Audio above.

              The `quick' algorithm uses cubic interpolation; all  others  use
              band-limited  interpolation.   By default, all algorithms have a
              linear phase response; for `medium' and  above,  the  phase  re-
              sponse is configurable (see below).

              The  rate  effect  is  invoked  automatically if SoX's -r option
              specifies a rate that is different to that of the input file(s).
              Alternatively, if this effect is given explicitly, then SoX's -r
              option need not be given.  For example, the following  two  com-
              mands are equivalent:

                 sox_ng input.wav -r 48k output.wav bass -b 24
                 sox_ng input.wav        output.wav bass -b 24 rate 48k

              though the second command is more flexible as it allows rate op-
              tions  to  be  given, and allows the effects to be ordered arbi-
              trarily.

              A user notes that resampling tracks and then concatenating  them
              is  more likely to create clicks at the joints than joining them
              first and resampling the result, due to edge effects.

              Override Options

              The simple quality selection described above  provides  settings
              that satisfy the needs of the vast majority of resampling tasks.
              Occasionally,  however, it may be desirable to fine-tune the re-
              sampler's filter response; for  qualities  `medium'  and  above,
              this can be achieved using the override options in the following
              table:

              -M/-I/-L   Phase response=minimum/intermediate/linear
              -s         Steep filter (bandwidth=99%)
              -a         Allow aliasing/imaging above the pass band
              -b width   Any bandwidth % (74-99.7 or 85-99.7 with 0
              -p phase   Any phase response (0=minimum, 25=intermediate,
                         50=linear, 100=maximum)

              All  resamplers  use  filters  that  can sometimes create `echo'
              (a.k.a.  `ringing') artefacts with  transient  signals  such  as
              those  that occur with `finger snaps' or other highly percussive
              sounds.  Such artefacts are much more noticeable  to  the  human
              ear if they occur before the transient (`pre-echo') than if they
              occur  after  it  (`post-echo').  Note that the frequency of any
              such artefacts is related to the smaller of the original and new
              sampling rates but if this is at least  44.1kHz,  the  artefacts
              will lie outside the range of human hearing.

              A phase response setting may be used to control the distribution
              of  any  transient  echo  between `pre' and `post': with minimum
              phase, there is no pre-echo but the longest post-echo; with lin-
              ear phase, pre- and post-echo are in equal  amounts  (in  signal
              terms,  but  not  in audibility); the intermediate phase setting
              attempts to find the best compromise by selecting a small length
              (and level) of pre-echo and a medium-length of post-echo.

              A minimum, intermediate or linear phase response is selected us-
              ing the -M, -I and -L options; a custom phase  response  can  be
              created  with  the -p option.  Note that phase responses between
              `linear' and `maximum' (greater than 50) are rarely useful.

              A resampler's bandwidth setting determines how much of the  fre-
              quency  content of the original signal (w.r.t. the original sam-
              ple rate when upsampling or the new sample  rate  when  downsam-
              pling)  is preserved during conversion.  The term `pass band' is
              used to refer to all frequencies up to the bandwidth point (e.g.
              for a 44.1kHz sampling rate and a resampling bandwidth  of  95%,
              the  pass  band  represents  frequencies  from 0Hz (DC) to circa
              21kHz).  Increasing the resampler's bandwidth results in a slow-
              er conversion and can increase  transient  echo  artefacts  (and
              vice versa).

              The  -s  `steep  filter' option changes the resampling bandwidth
              from the default of 95% (based on the 3dB point) to 99%.  The -b
              option allows the bandwidth to be set to any value in the  range
              74-99.7%  but  bandwidth  values greater than 99% are not recom-
              mended for normal use as  they  can  cause  excessive  transient
              echo.

              If  the -a option is given, aliasing/imaging above the pass band
              is allowed.  For example, with 44.1kHz sampling rate and  a  re-
              sampling  bandwidth  of  95%,  this means that frequency content
              above 21kHz can be distorted. However, since this is  above  the
              pass  band  (i.e.  above the highest frequency of interest/audi-
              bility), this may not be a problem.  The  benefits  of  allowing
              aliasing/imaging are reduced processing time and reduced (by al-
              most half) transient echo artefacts.

              The -d option sets the bit-accuracy in the range 15 to 33, or -R
              sets  the bit-accuracy to obtain rejection of a specified number
              of dB.

              Examples:

                 sox_ng input.wav -b 16 output.wav rate -s -a 44100 dither -s

              is default (high) quality resampling with overrides for a  steep
              filter,  to  allow aliasing, at a 44.1kHz sample rate and noise-
              shaped dithering to a 16-bit WAV file.

                 sox_ng input.wav -b 24 output.aiff rate -v -I -b 90 48k

              is very high quality resampling with overrides for an intermedi-
              ate phase, a bandwidth of 90%, at a 48k sampling rate and  stor-
              ing the output to a 24-bit AIFF file.

              Advanced Options

              The  -i option forces the use of a particular interpolator coef-
              ficient from -1 to 2.

              The -c option tries to limit the number  of  coefficients  to  a
              number of kilobytes; its argument can be from 100 up.

              The  -B option sets the percentage of the pass-band to preserve,
              from 53 to 95.

              The -A option sets  the  percentage  of  the  bandwidth  without
              aliasing, from 85 to 100.

              -f sets zero pass-band roll-off instead of 0.01dB for -Q 0-2.

              -n disables internal small-integer optimizations and

              -t increases the irrational ratio accuracy.

       remix [-a|-m] [-p] <out-spec>
              out-spec  = 0 | in-spec{,in-spec}
              in-spec   = [in-chan][-[in-chan2]][vol-spec]
              vol-spec  = p|i|v[volume]

              Select  and mix input audio channels into output audio channels.
              Each output channel is specified in turn  by  a  given  out-spec
              which is a list of contributing input channels and volume speci-
              fications.

              Note  that this effect operates on the audio channels within the
              SoX effects processing chain; it should not be confused with the
              -m global option, where multiple files are  mix-combined  before
              entering the effects chain.

              An  out-spec  contains comma-separated input channel numbers and
              hyphen-delimited channel number ranges; alternatively, 0 may  be
              given to create a silent output channel.  For example,

                 sox_ng input.wav output.wav remix 6 7 8 0

              creates  an output file with four channels, where channels 1, 2,
              and 3 are copies of channels 6, 7, and 8 in the input file,  and
              channel 4 is silent.  Whereas

                 sox_ng input.wav output.wav remix 1-3,7 3

              creates  a  (somewhat bizarre) stereo output file where the left
              channel is a mix-down of input channels 1, 2, 3 and  7  and  the
              right channel is a copy of input channel 3.

              Where  a  range of channels is specified, the channel numbers to
              the left and right of the hyphen are optional and default  to  1
              and to the number of input channels respectively. Thus

                 sox_ng input.wav output.wav remix -

              performs a mix-down of all input channels to mono.

              By default, where an output channel is mixed from multiple input
              channels,  each  input  channel  is  scaled by a factor of ^1/n.
              Custom mixing volumes can be set  by  following  a  given  input
              channel or range of input channels with a vol-spec (volume spec-
              ification) which is one of the letters p, i, or v, followed by a
              volume number, the meaning of which depends on the given letter:

              Letter   Volume number        Notes
                p      power adjust in dB   0 = no change
                i      power adjust in dB   As for p but invert the audio
                v      voltage multiplier   1 = no change; 0.5 ~= 6dB attenua-
                                            tion; 2 ~= 6dB gain; -1 = invert

              If  an out-spec includes at least one vol-spec then, by default,
              ^1/n scaling is not applied to any other channels  in  the  same
              out-spec  (though maybe in other out-specs) though the -a (auto-
              matic) option can be given to retain the  automatic  scaling  in
              this case.  For example,

                 sox_ng input.wav output.wav remix 1,2 3,4v0.8

              results  in  channel  level  multipliers  of  0.5,0.5 and 1,0.8,
              whereas

                 sox_ng input.wav output.wav remix -a 1,2 3,4v0.8

              results in channel level multipliers of 0.5,0.5 and 0.5,0.8.

              The -m (manual) option disables  all  automatic  volume  adjust-
              ments, so

                 sox_ng input.wav output.wav remix -m 1,2 3,4v0.8

              results in channel level multipliers of 1,1 and 1,0.8.

              The  volume number is optional and omitting it corresponds to no
              volume change; however, the only case in which this is useful is
              in conjunction with i.  For example,  if  input.wav  is  stereo,
              then

                 sox_ng input.wav output.wav remix 1,2i

              is a mono equivalent of the oops effect.

              If  the  -p  option  is given, any automatic ^1/n scaling is re-
              placed by ^1/<sqrt>n (`power') scaling; this gives a louder  mix
              but one that may occasionally clip.

              One use of the remix effect is to split an audio file into a set
              of files, each containing one of the constituent channels in or-
              der  to  perform subsequent processing on individual audio chan-
              nels.  When more than a few channels are involved, a script such
              as the following is useful:

              #! /bin/sh
              chans=`soxi_ng -c "$1"`
              while [ $chans -ge 1 ]; do
                 chans0=`printf %02i $chans`   # 2 digits hence up to 99 chans
                 out=`echo "$1"|sed "s/\(.*\)\.\(.*\)/\1-$chans0.\2/"`
                 sox_ng "$1" "$out" remix $chans
                 chans=`expr $chans - 1`
              done

              If a file input.wav containing six audio  channels  were  given,
              the  script  would  produce  six output files: input-01.wav, in-
              put-02.wav, ..., input-06.wav.

              See the swap effect.

       repeat [count(1)|-]
              Repeat the entire audio count times, or once  if  count  is  not
              given.   The  special  value - requests infinite repetition.  It
              requires temporary file space to store the audio to be repeated.
              Note that repeating once yields two copies: the  original  audio
              and the repeated audio.

       reverb [-w] [reverberance(50%) [HF-damping(50%) [room-scale(100%)
              [stereo-depth(100%) [pre-delay(0ms) [wet-gain(0dB)]]]]]]

              Add  reverberation  to the audio using the `freeverb' algorithm.
              A reverberation effect is sometimes desirable for concert  halls
              that  are  too  small  or contain so many people that the hall's
              natural reverberance is diminished.  Applying a small amount  of
              stereo  reverb  to a dry mono signal usually makes it sound more
              natural.  See [3] for a detailed description of reverberation.

              This effect increases the volume of the audio and  continues  to
              reverberate after the input finishes so, to prevent clipping and
              keep  the audible part of the final reverberation, a typical in-
              vocation might be:

                 play_ng dry.au gain -3 pad 0 1 reverb

              The -w option can be given to select only the `wet' signal, thus
              allowing it to be processed further, independently of the  `dry'
              signal.  E.g.

                 play_ng -m in.au "|sox_ng in.au -p reverse reverb -w reverse"

              for a reverse reverb effect.

       reverse
              Reverse  the audio completely.  Requires temporary file space to
              store the audio to be reversed.

       riaa   Apply RIAA vinyl playback equalization.  The sampling rate  must
              be 44.1, 48, 88.2, 96 or 192kHz.

              This effect supports the --plot global option.

       saturation [type [blend [offset [drive|color|threshold]]]]
              Add  saturation,  which  can produce effects ranging from subtle
              warmth to crunchy fuzz. The type parameter selects  the  satura-
              tion type: tanh (the default), sqrt or diode.

              For all types, the blend parameter (default 1) controls the mix-
              ture  of  wet  and dry signals in the output, with 1 being fully
              wet. The offset parameter (default 0) adds a DC  offset  to  the
              input  to  produce  asymmetric distortion. The offset is removed
              from the output, so that a zero input level produces a zero out-
              put level, but when the input is non-zero the output waveform is
              likely to be asymmetric.

              The tanh saturation type uses the hyperbolic tangent function to
              apply soft clipping. The drive parameter  (default  1)  controls
              the input gain and thus the amount of distortion.

              The  sqrt  saturation  type  uses  a  mixture  of two functions:
              x*sqrt(|x|) and sgn(x)*sqrt(|x|),  which  give  different  tonal
              qualities  to the output. The color parameter (default 0.5) con-
              trols the mixture  of  these  functions,  with  0  being  purely
              x*sqrt(|x|) and 1 being purely sgn(x)*sqrt(|x|).

              The  diode  saturation type models the effect of using a pair of
              diodes to clip the signal when it exceeds a  threshold  (default
              0.5).  The  blend  parameter, by mixing the wet and dry signals,
              effectively controls the amount of attenuation that occurs above
              the threshold, from no attenuation when blend is 0  to  complete
              attenuation (hard clipping) when blend is 1.

              See  the overdrive effect for another kind of non-linear distor-
              tion.  When the offset parameter is used to  produce  asymmetric
              distortion,  the  highpass  effect  can be used to rebalance the
              waveform's positive and negative amplitude.

       sdm [-f filter] [-t order] [-n num] [-l latency]
              Apply a 1-bit sigma-delta modulator producing DSD  output.   The
              input should be previously upsampled, e.g. with the rate effect,
              to  a high rate, 2.8224MHz for DSD64.  The -f option selects the
              noise-shaping filter from the following list  where  the  number
              indicates the order of the filter:

                                      clans-4   sdm-4
                                      clans-5   sdm-5
                                      clans-6   sdm-6
                                      clans-7   sdm-7
                                      clans-8   sdm-8

              The  noise filter may be combined with a partial trellis/viterbi
              search by supplying the following options:

              -t order
                     Trellis order, max 32.

              -n num Number of paths to consider, max 32.

              -l latency
                     Output latency, max 2048.

              The result of using these parameters is hard to predict and  can
              include high noise levels or instability.  Caution is advised.

       silence [-l] above-periods [duration threshold[d|%]]
              [below-periods duration threshold[d|%]]

              Removes  silence from the beginning, middle or end of the audio,
              where `silence' is determined by a specified threshold.

              The above-periods value is used to indicate whether audio should
              be trimmed at the beginning of the audio. A value of zero  indi-
              cates  that  no  silence should be trimmed from the beginning in
              which case duration and threshold are omitted.  When a  non-zero
              above-periods is specified, you must also specify a duration and
              threshold  and  it  trims  audio until it finds non-silence.  It
              will normally be 1 when trimming silence from the  beginning  of
              the  audio, but it can be increased to higher values to trim all
              audio up to the Nth non-silence period.   For  example,  if  you
              have  an  audio file with two songs that each contains 2 seconds
              of silence before the song, you could specify an above-period of
              2 to strip out both silences and the first song.

              duration indicates the amount of time for which non-silence must
              be detected before it stops trimming the silence before it.   By
              increasing  duration, short bursts of quiet noise can be treated
              as silence and trimmed off.  duration has the peculiarity that a
              bare number is interpreted as a sample count, not as a number of
              seconds.  To specify seconds, either use the  t  suffix  (as  in
              2t),  a  decimal point (as in 2.0) or specify minutes too (as in
              0:02).

              threshold indicates the maximum sample value in any  channel  is
              considered  silence. For digital audio, a value of 0 may be fine
              but for audio recorded from analog you may wish to increase  the
              value  to  include  background  noise.  threshold numbers may be
              suffixed with d to indicate that the value is in decibels  or  %
              to  indicate  a percentage of the maximum possible sample value.
              By default, it is in percent.

              To trim silence from the end of the audio, specify a below-peri-
              ods count, which means to remove all audio after the last  onset
              of  silence is detected.  Normally, this will be 1 but it can be
              increased to leave shorter periods of silence and the audio that
              follows them intact.  For example, if you have a  track  with  1
              second  of  silence  in  the middle and 1 second at the end, you
              could set below-period to 2 to leave the middle silence and what
              follows it and remove from the final silence on.

              When below-periods is given, its duration specifies  the  length
              of  silence that must exist before audio is not copied any more.
              By specifying a higher duration, shorter silences that are want-
              ed can be left in the audio.  For example, if you  have  a  song
              with  1 second of silence in the middle and 2 seconds of silence
              at the end, a duration of 2 could be used to skip over the  mid-
              dle  silence  and trim the end instead of starting trimming from
              half way through.

              Unfortunately, the length of the silence at the end  has  to  be
              longer  than  any preceding silence for this to work so you must
              know the length of the silence at the end.

              A more reliable way to trim silence from the end is to  use  the
              silence effect in combination with the reverse effect.  By first
              reversing  the audio, you can use the above-periods to trim from
              what looks like the front of the file, then reverse it again  to
              get back to normal.

              To remove silence from the middle of a file, give a negative be-
              low-periods.   This  value is then treated as positive value and
              is also used to indicate that the effect should restart process-
              ing as specified by the above-periods, making  it  suitable  for
              removing periods of silence in the middle of the audio.

              The -l option indicates that below-periods' duration of `silent'
              audio  should  be left intact at the beginning of each period of
              silence, for example, if you want to remove long pauses  between
              words but do not want to remove the pauses completely.

              The  following example shows how this effect can be used to make
              a recording that does not contain the silence that  usually  oc-
              curs  between  pressing  the  record button and the start of the
              performance:

                 rec_ng parameters filename other-effects silence 1 5 2%

              This example should remove the  start  of  the  recording  until
              there's a period of non-silence longer than 0.2s and louder than
              0.1%,  then  start searching for a silence that's longer than 1s
              and quieter than 3% and remove it if found, leaving the first 1s
              of it in place, then start copying  again  until  a  silence  is
              found that's longer than 1s and quieter than 3%, trim that to 1s
              and so on.

                 sox_ng in.au out.au silence -l 1 0.2 0.1% -1 1.0 3%


       sinc [-a att|-b beta] [-p phase|-M|-I|-L] [-t tbw|-n taps]
              [freqHP][-freqLP [-t tbw|-n taps]] [-r]] [-d]]

              Apply  a kaiser-windowed low-pass, high-pass, band-pass or band-
              reject filter to the signal.  The freqHP and  freqLP  parameters
              give  the  frequencies of the 6dB points of a high-pass and low-
              pass filter that may be invoked individually  or  together.   If
              both are given, freqHP less than freqLP creates a band-pass fil-
              ter and freqHP greater than freqLP creates a band-reject filter.
              For example, the invocations

                 sinc 3k
                 sinc -4k
                 sinc 3k-4k
                 sinc 4k-3k

              create  a  high-pass, low-pass, band-pass and band-reject filter
              respectively.

              The default stop band attenuation of  120dB  can  be  overridden
              with -a; alternatively, the kaiser window's `beta' parameter can
              be given directly with -b.

              The  default transition bandwidth of 5% of the total band can be
              overridden with -t (and tbw in Hertz); alternatively, the number
              of filter taps can be given directly with -n and is  limited  to
              the range of 11-32767.

              If  both freqHP and freqLP are given, a -t or -n option given to
              the left of the frequencies applies to both frequencies; one  of
              these options given to the right of the frequencies applies only
              to freqLP.

              The  -p,  -M,  -I  and -L options control the filter's phase re-
              sponse; see the rate effect for details.

              The -r option controls whether the filter should round the  num-
              ber of taps to the closest integer instead of truncating it.

              The -d option specifies that, if a low-pass filter is being cre-
              ated  and  the  cutoff frequency is at or above the Nyquist fre-
              quency, the sinc effect should be deleted from the effects chain
              instead of failing.

              This effect supports the --plot global option.

       softvol [volume(1.0) [double-time(0) [headroom(0)]]]
              The soft volume effect applies a simple multiplier to the  audio
              ensuring that it does not clip. When a sample would have clipped
              the volume multiplier is automatically reduced to compensate.

              It  is  a  simple compander with the advantages of running fast,
              having no pre- or post-echo and reacting on the  crests  of  the
              wave, so its volume-reduction glitches don't add audible noise.

              volume  sets  the  initial volume multiplier; the default of 1.0
              means no change.

              double-time says that the volume should  slowly  increase  at  a
              rate  that  makes  it  double every double-time seconds.  A good
              value for usual music is 10 and the default value of 0 says that
              the volume should not increase automatically.

              headroom is in dB and limits the loudest amplitude to less  than
              the  32-bit  maximum.  This may be necessary when the final bit-
              depth reduction and/or dithering make it clip.  A value  of  0.1
              is  sufficient  to protect down to a bit-depth of 8 with dither-
              ing.

              When playing sound in interactive mode, the `v' and `V' keys re-
              duce and increase the volume if there is a softvol  in  the  ef-
              fects chain. If there are more than one, which one it adjusts is
              probably random.

       spectrogram [options]
              Create  a  spectrogram of the audio. The audio is passed unmodi-
              fied through the SoX processing chain.  This effect is  optional
              -  type sox_ng --help and check the list of supported effects to
              see if it has been included.

              The spectrogram is rendered in a Portable Network Graphic  (PNG)
              file  and  shows time in the X axis, frequency in the Y axis and
              audio signal magnitude in the Z axis, represented by  the  color
              (or  optionally  the  intensity) of the pixels in the X-Y plane.
              If the audio signal contains multiple channels, these are  shown
              from  top  to  bottom starting from channel 1, which is the left
              channel for stereo audio.

              For example, if `my.wav' is a stereo file, then

                 sox_ng my.wav -n spectrogram

              creates a spectrogram of the entire file in the  file  `spectro-
              gram.png'.   More often though, analysis of a smaller portion of
              the audio is required; e.g. with

                 sox_ng my.wav -n remix 2 trim 20 30 spectrogram

              the spectrogram shows information only from the  second  (right)
              channel  of thirty seconds of audio starting from twenty seconds
              in.  To analyze a small portion of  the  frequency  domain,  the
              rate effect may be used, e.g.

                 sox_ng my.wav -n rate 6k spectrogram

              allows  detailed  analysis  of  frequencies up to 3kHz (half the
              sampling rate) i.e. where the human auditory system is most sen-
              sitive.  See also the -R option below. With

                 sox_ng my.wav -n trim 0 10 spectrogram -x 600 -y 200 -z 100

              the given options control the size of the spectrogram's X, Y & Z
              axes (in this case, the spectrogram area of the  produced  image
              will  be  600 by 200 pixels in size and the Z axis range will be
              100 dB).  Note that the produced image  includes  axes,  legends
              etc.  and will be larger than the specified spectrogram size un-
              less the -r option is given: if each spectrogram  is  x x y  and
              there are c channels, the image will be x + 144 by (y x c) + 78,
              plus  c - 1  if -a was not given, and 20 pixels higher than this
              if you gave -t Title.  A raw spectrogram will be x by y x c.

              In this example

                 sox_ng -n -n synth 6 tri 10k:14k spectrogram -z 100 -w kaiser

              an analysis window with high dynamic range is selected  to  best
              display the spectrogram of a swept triangular wave.  For a simi-
              lar  example, append the following to the `chime' command in the
              description of the delay effect (above):

                 rate 2k spectrogram -X 200 -Z -10 -w kaiser

              Options are also available to control the appearance (color set,
              brightness, contrast etc.) and filename of the spectrogram; e.g.
              with

                 sox_ng my.wav -n spectrogram -m -l -o print.png

              a spectrogram is created suitable for printing on  a  black  and
              white printer.

              Options

              -x num Change  the  (maximum)  width (X axis) of the spectrogram
                     from its default value of 800 pixels to  a  given  number
                     between 100 and a million.  See -X and -d.

              -X num X  axis pixels per second; the default is auto-calculated
                     to fit the audio to the X axis size if  its  duration  is
                     known or given with -d, or 100 otherwise.  If given with-
                     out  a  -x  option when the length of the audio is known,
                     this option determines the width of the spectrogram; oth-
                     erwise, it affects the duration of the spectrogram.   num
                     can  be  from  1 (low time resolution) to 5000 (high time
                     resolution) and need not be an integer.  SoX may  make  a
                     slight  adjustment  to  the  given  number for processing
                     quantization reasons; if so, SoX reports the actual  num-
                     ber  used  (viewable  when the SoX global option -V is in
                     effect).

              -y num Sets the size of the Y axis per channel in  pixels;  this
                     is  the  number  of  frequency `bins' used in the Fourier
                     analysis that produces the spectrogram.  By default the Y
                     axis size is chosen automatically, depending  on  the  -Y
                     height and the number of channels, with a minimum of 64.

              -Y num Sets the total height of the spectrogram(s).  The default
                     value is 550 pixels and the maximum is a million.  If num
                     is  not  an exact multiple of the number of channels, the
                     actual total height will be a few pixel rows less.

              -z num Z axis (color) range in dB, default 120.  This  sets  the
                     dynamic  range  of  the  spectrogram  to  be -num dBFS to
                     0 dBFS.  Num may range from 20 to 180.  Decreasing dynam-
                     ic range effectively increases the contrast of the  spec-
                     trogram display and vice versa.

              -Z num Sets  the  upper limit of the Z axis in dBFS.  A negative
                     num effectively increases the brightness of the  spectro-
                     gram display and vice versa.

              -n     Normalizes  the  upper  limit  of  the Z axis so that the
                     loudest pixels are shown using the brightest color in the
                     palette - a kind of automatic -Z flag.

              -q num Sets the Z axis quantization, i.e. the number of  differ-
                     ent  colors  (or  intensities)  in which to render Z axis
                     values.  A small number (e.g. 4) gives a poster-like  ef-
                     fect making it easier to discern magnitude bands of simi-
                     lar  level and results in a smaller PNG file.  The number
                     given specifies the number of colors to use in the Z axis
                     range; two colors are reserved to represent  out-of-range
                     values.

              -w name
                     Select  a  window  function: Hann (the default), Hamming,
                     Bartlett, Rectangular, Kaiser or Dolph.  The  spectrogram
                     is  produced  using  the Discrete Fourier Transform (DFT)
                     algorithm and a significant parameter of  this  algorithm
                     is  the  choice of window function.  By default, SoX uses
                     the Hann window, which has good all-round properties  for
                     frequency  resolution and dynamic range.  For better fre-
                     quency resolution but lower dynamic range, select a  Ham-
                     ming window; for higher dynamic range but poorer frequen-
                     cy resolution, select a Dolph window.

              -W num Window  adjustment  parameter.   This can be used to make
                     small adjustments to the Kaiser  and  Dolph  windows.   A
                     positive  number (up to ten) increases its dynamic range,
                     a negative number decreases it.

              -s     Allow slack overlapping of DFT  windows.   This  can,  in
                     some cases, increase image sharpness and give greater ad-
                     herence  to  the  -x value but at the expense of a little
                     spectral loss.

              -m     Creates a monochrome spectrogram (the default is color).

              -h     Selects a  high-color  palette  which  is  less  visually
                     pleasing  than  the default color palette but it may make
                     it easier to differentiate different levels.  If this op-
                     tion is used in conjunction with -m, the result is hybrid
                     monochrome/color palette.

              -p num Permute the colors in a color or hybrid palette.  The num
                     parameter, from 1 (the default) to 6, selects the  permu-
                     tation.

              -l     Creates  a  `printer-friendly'  spectrogram  with a light
                     background (the default has a dark background).

              -a     Suppress the display of the axis lines.   This  is  some-
                     times useful in helping to discern artefacts at the spec-
                     trogram edges.

              -r     Raw  spectrogram:  suppress  the display of axes and leg-
                     ends.

              -A     Selects an alternative, fixed color set. This is provided
                     only for compatibility with spectrograms produced by  an-
                     other  package.  It should not normally be used as it has
                     some problems, not least, a lack  of  differentiation  at
                     the  bottom  end  which  results  in masking of low-level
                     artefacts.

              -t text
                     Set the image title, the text to display above the  spec-
                     trogram.  If you need it to be `chorus' or some other ef-
                     fect's name, surround it by spaces inside double quotes.

              -c text
                     Set (or clear) the image comment, the text to display be-
                     low and to the left of the spectrogram.

              -o file
                     The  name  of  the  spectrogram  output PNG file, default
                     `spectrogram.png'.  If `-' is given, the  spectrogram  is
                     sent to the `standard output' (stdout).

              -L     Plot the frequency on a logarithmic axis.

              -R L:H Specify  the frequency range (from L to H).  The frequen-
                     cies can have an optional suffix

                        sox_ng mymusic.mp3 -n spectrogram -L -R 100:8k

                     By default, the lowest frequency  is  0Hz  for  a  linear
                     graph  or  1Hz for a logarithmic graph and the highest is
                     the Nyquist frequency.

              Advanced Options
              In order to process a smaller section of audio without affecting
              other effects or the output signal (unlike when the trim  effect
              is used), the following options may be used:

              -d duration
                     This  option  sets  the X axis resolution such that audio
                     with the given duration (a time specification)  fits  the
                     selected  (or default) X axis width.  It defaults, if the
                     audio length is known, to  the  audio  length  minus  the
                     start time.  For example,

                        sox_ng input.mp3 output.wav -n spectrogram -d 1:00 stats

                     creates a spectrogram showing the first minute of the au-
                     dio,  while the stats effect is applied to the entire au-
                     dio signal.

                     See -X for an alternative way of setting the X axis reso-
                     lution.

              -S position(=)
                     Start the spectrogram at the given  point  in  the  audio
                     stream.  For example

                        sox_ng input.aiff output.wav spectrogram -S 1:00

                     creates a spectrogram showing all but the first minute of
                     the  audio (the output file, however, receives the entire
                     audio stream).

              For the ability to perform off-line processing of spectral data,
              see stat -freq.

       speed factor[c]
              Adjust the audio speed (pitch and tempo  together).   factor  is
              either the ratio of the new speed to the old speed (greater than
              1  speeds  it up, less than 1 slows it down) or, if the letter c
              is appended, it's the number of cents (100ths of a semitone)  by
              which  the  pitch (and tempo) should be adjusted: greater than 0
              increases, less than 0 decreases.

              Technically, the speed effect only changes the sample  rate  in-
              formation,  leaving  the  samples themselves untouched. The rate
              effect is invoked automatically to resample to the output sample
              rate, using its default quality/speed.  For  higher  quality  or
              higher speed resampling, in addition to the speed effect, speci-
              fy the rate effect with the desired quality option.

              See the bend, pitch and tempo effects.

       speexdsp [-agc [target_level(100)]] [-denoise [max_db(15)]] [-dereverb]
              [-fps frames_per_second(20)] [-spf samples_per_frame]

              Use the Speex DSP library to improve perceived sound quality.

              If  no options are specified, the -agc and -denoise features are
              enabled.

              -agc [target_level]
                     Enable automatic gain control and  optionally  specify  a
                     target volume level from 1 to 100.

              -denoise [max_db]
                     Enable noise reduction and optionally specify the maximum
                     attenuation from 1 to 100.

              -dereverb
                     Enable reverb reduction.

              -fps frames_per_second
                     Specify the number of frames per second from 1-100.

              -spf samples_per_frame
                     Specify  the number of samples per frame.  The default is
                     derived from the -fps setting so that frames abut but  do
                     not overlap.

       splice  [-h|-t|-q] {position(=)[,excess[,leeway]]}
              Splice audio sections together.  This effect provides two things
              over simple audio concatenation: a (usually short) cross-fade is
              applied  at the join and a wave similarity comparison is made to
              help determine the best place at which to make the join.

              One of the options -h, -t, or -q may be given to select the fade
              envelope as half cosine wave (the default),  triangular  (a.k.a.
              linear), or quarter cosine wave (e.g. for a cross-fade of corre-
              lated audio).

                           Audio          Fade level       Transitions
                      -h   correlated     constant gain    smooth
                      -t   correlated     constant gain    abrupt
                      -q   uncorrelated   constant power   smooth

              To perform a splice, first use the trim effect to select the au-
              dio  sections  to be joined together.  As when performing a tape
              splice, the end of the section to  be  spliced  onto  should  be
              trimmed  with  a  small excess (default 0.005 seconds) after the
              ideal joining point.  The beginning  of  the  audio  section  to
              splice on should be trimmed with the same excess before the ide-
              al  joining  point plus an additional leeway (default 0.005 sec-
              onds).  SoX should then be invoked with the two  audio  sections
              as  input files and the splice effect given with the position at
              which to perform the splice - this is length of the first  audio
              section (including the excess).

              The  following  diagram  uses the tape analogy to illustrate the
              splice operation.  The effect simulates the  diagonal  cuts  and
              joins the two pieces:

                   length1   excess
                 -----------><--->
                 _________   :   :  _________________
                          \  :   : :\     `
                           \ :   : : \     `
                            \:   : :  \     `
                             *   : :   * - - *
                              \  : :   :\     `
                               \ : :   : \     `
                 _______________\: :   :  \_____`____
                                   :   :   :     :
                                   <--->   <----->
                                   excess  leeway

              where * indicates the joining points.

              For  example, a long song begins with two verses which start (as
              determined e.g. by using  the  play_ng  command  with  the  trim
              (start)  effect)  at times 0:30.125 and 1:03.432.  The following
              commands cut out the first verse:

                 sox_ng too-long.wav part1.wav trim 0 30.130

              (5 ms excess, after the first verse starts)

                 sox_ng too-long.wav part2.wav trim 1:03.422

              (5 ms excess plus 5 ms leeway, before the second verse starts)

                 sox_ng part1.wav part2.wav just-right.wav splice 30.130

              For another example, the SoX command

                 play_ng "|sox_ng -n -p synth 1 sin %1" "|sox_ng -n -p synth 1 sin %3"

              generates and plays two notes, but there is a nasty click at the
              transition; the click can be removed by splicing instead of con-
              catenating the audio, i.e. by appending splice 1 to the command.
              Clicks at the beginning and end of the audio can be  removed  by
              preceding the splice effect with fade q .01 2 .01.

              Provided your arithmetic is good enough, multiple splices can be
              performed  with a single splice invocation.  For example, with a
              Bourne shell script `acpo':

                 #! /bin/sh
                 # Audio Copy and Paste Over
                 # acpo infile copy-start copy-stop paste-over-start outfile
                 # No chained time specifications allowed for the parameters
                 # (i.e. such that contain +/-).
                 e=0.005                      # Using default excess
                 l=$e                         # and leeway.
                 sox_ng "$1" piece.wav trim $2-$e-$l =$3+$e
                 sox_ng "$1" part1.wav trim 0 $4+$e
                 sox_ng "$1" part2.wav trim $4+$3-$2-$e-$l
                 sox_ng part1.wav piece.wav part2.wav "$5" \
                    splice $4+$e +$3-$2+$e+$l+$e

              two splices are used to `copy and paste' audio.

              It is also possible to use this effect to perform general cross-
              fades, e.g. to join two songs.  In this case, excess would typi-
              cally be a number of seconds, the -q option would  typically  be
              given to select an `equal power' cross-fade and leeway should be
              zero  (which  is  the  default if -q is given).  For example, if
              f1.wav and f2.wav are audio files to be cross-faded, then

                 sox_ng f1.wav f2.wav out.wav splice -q $(soxi_ng -D f1.wav),3

              cross-fades the files where the point of  equal  loudness  is  3
              seconds  before  the end of f1.wav, i.e. the total length of the
              cross-fade is 2 x 3 = 6 seconds ($(...) is POSIX shell  notation
              that is replaced by the output of the enclosed command).

       stat [-s scale] [-rms] [-freq] [-v] [-d] [-a] [-h]
              Display  time and frequency domain statistical information about
              the audio.  Audio is passed unmodified through the SoX  process-
              ing chain.

              The  information  is  output  to  the  `standard error' (stderr)
              stream and is calculated (where n is the duration of  the  audio
              in  samples,  c  is the number of audio channels, r is the audio
              sample rate and xk represents the value (in the range -1 to  +1)
              of each successive sample in the audio), as follows:

              Samples read         nxc
              Length (seconds)     n/r
              Scaled by            See -s below.
              Maximum amplitude    max(xk) The maximum sample value in the au-
                                   dio;  usually  this will be a positive num-
                                   ber.
              Minimum amplitude    min(xk) The minimum sample value in the au-
                                   dio; usually this will be a  negative  num-
                                   ber.
              Midline amplitude    1/2min(xk)+1/2max(xk)
              Mean norm            ^1/n<Sigma>|xk| The average of the absolute
                                   value of each sample in the audio.
              Mean amplitude       ^1/n<Sigma>xk The average of each sample in
                                   the  audio.   If  this  figure is non-zero,
                                   then it indicates the presence of a DC off-
                                   set which could be removed  using  the  dc-
                                   shift effect.
              RMS amplitude        <sqrt>(^1/n<Sigma>xk^2)  The  level of a DC
                                   signal that would have the  same  power  as
                                   the audio's average power.
              Maximum delta        max(|xk-xk-1|)
              Minimum delta        min(|xk-xk-1|)
              Mean delta           ^1/n-1<Sigma>|xk-xk-1|
              RMS delta            <sqrt>(^1/n-1<Sigma>(xk-xk-1)^2)
              EBUR128 Momentary    The maximum momentary loudness over 400ms
              EBUR128 Short Term   The maximum short term loudness over 3 sec-
                                   onds
              EBUR128 Integrated   The integrated loudness over the whole file
              EBUR128 True Peak    The  maximum of the True Peak of each chan-
                                   nel
              Rough frequency      In Hz.
              Volume Adjustment    The parameter to the vol effect which would
                                   make the audio as loud as possible  without
                                   clipping.   See  the discussion on Clipping
                                   above for reasons why it is rarely  a  good
                                   idea actually to do this.

              Note  that  the  delta measurements are not applicable to multi-
              channel audio and EBU R 128 (=ITU-R BS.1770) measurements are in
              Loudness Units referenced to Full Scale (LUFS),

              The -s option can be used to scale the input  data  by  a  given
              factor.   The  default value of scale is 2147483647 (the maximum
              value of a 32-bit signed integer)  as  internal  effects  always
              work  with  those.   A lower value means that a different sample
              value should be considered as the full-scale amplitude.

              The -rms option  converts  all  average  values  to  `root  mean
              square' format.

              The   -freq   option  outputs  the  input's  power  spectrum  (a
              4096-point DFT) instead of the statistics  listed  above.   This
              should only be used with a single-channel audio file.

              The -v option displays only the `Volume Adjustment' value.

              The  -d option displays a hex dump of the 32-bit signed PCM data
              audio in SoX's internal buffer.  This is  mainly  used  to  help
              track  down  endian problems that sometimes occur in cross-plat-
              form versions of SoX.

              The -a option outputs the average power spectrum instead of  the
              power spectrum for each 4096-point DFT.

              The  -h  option  uses the "histogram algorithm" to calculate the
              integrated EBU R-128 loudness, which requires less memory but is
              less accurate.

              The -j option outputs the statistics in JSON format, e.g.:

              {
                "samples_read": 22699008,
                "length": 236.448,
                "scaled_by": 2.14748e+09,
                "maximum_amplitude": 0.818604,
                "minimum_amplitude": -0.532471,
                "midline_amplitude": 0.143066,
                "mean_norm": 0.0352694,
                "mean_amplitude": 0.00180676,
                "rms_amplitude": 0.056726,
                "maximum_delta": 0.367126,
                "minimum_delta": 0,
                "mean_delta": 0.0177341,
                "rms_delta": 0.0268538,
                "rough_frequency": 3616,
                "volume_adjustment": 1.22159
              }

              If -rms was given, "scaled_by" will be "scaled_by_rms" and if -e
              was given, you also get

                "ebur128_momentary": -30.3408,
                "ebur128_short_term": -35.4501,
                "ebur128_integrated": -21.3583,

              Some fields may be  absent  if  their  values  are  incalculable
              (EBUR128  figures)  or  would  be  infinite (like the RMS of si-
              lence).

              As JSON uses scientific notation, it can  shows  the  values  of
              very small numbers that the usual output shows as zero.

              The most common use of stat is to measure the characteristics of
              a single audio file, for which the syntax is:

                 sox_ng file.wav -n stat

              where -n means "No audio output is required."

       stats [-b bits|-x bits|-s scale] [-w time] [-j]
              Display  time  domain  statistical  information  about the audio
              channels; audio is passed unmodified through the SoX  processing
              chain.   Statistics  are calculated and displayed for each audio
              channel and, where applicable, an overall figure is also given.

              For example, for a typical well-mastered stereo music file:

                              Overall     Left      Right
                 DC offset   0.000803 -0.000391  0.000803
                 Min level  -0.750977 -0.750977 -0.653412
                 Max level   0.708801  0.708801  0.653534
                 Pk lev dB      -2.49     -2.49     -3.69
                 RMS lev dB    -19.41    -19.13    -19.71
                 RMS Pk dB     -13.82    -13.82    -14.38
                 RMS Tr dB     -85.25    -85.25    -82.66
                 Crest factor       -      6.79      6.32
                 Flat factor     0.00      0.00      0.00
                 Pk count           2         2         2
                 Bit-depth      16/16     16/16     16/16
                 Num samples    7.72M
                 Length s     174.973
                 Scale max   1.000000
                 Window s       0.050

              DC offset, Min level, and Max level are shown,  by  default,  in
              the  range  +-1.  If the -b (bits) options is given, these three
              measurements are scaled to a signed integer with the given  num-
              ber  of  bits from 2 to 32.  For example, for 16 bits, the scale
              would be -32768 to +32767.  The -x option behaves the  same  way
              as  -b  except  that  the signed integer values are displayed in
              hexadecimal.  The -s option scales the three measurements  by  a
              given floating point number.

              Pk lev dB  and  RMS lev dB  are the standard peak and RMS levels
              measured in dBFS.  RMS Pk dB and RMS Tr dB are peak  and  trough
              values  of  the RMS level measured over a short window (default:
              50ms).  That can be changed with the -w option in  seconds  from
              0.01 to 10.

              Crest factor  is  the  ratio  of peak to RMS level (note: not in
              dB).

              Flat factor is a measure of the flatness (i.e. consecutive  sam-
              ples with the same value) of the signal at its peak levels (i.e.
              either Min level or Max level).

              Pk count  is the number of occasions (not the number of samples)
              that the signal attained either Min level,  or  Max level.   The
              primary  goal  of the Peak Count value is to answer the question
              "has this audio been clipped?", quite possibly as  a  result  of
              the frowned-upon-by-some but common practice of 'brick wall lim-
              iting' in modern mastering. The closer the "Peak Count" is to 1,
              the higher the confidence that the audio has not been clipped.

              The  right-hand  Bit-depth  figure is the standard definition of
              bit-depth, i.e. that all bits other than this number of the most
              significant bits are always zero.  The left-hand figure  is  the
              number  of  bits at the least significant end of those most sig-
              nificant bits that would be sufficient to represent  all  sample
              values accurately (including the sign bit).

              In  mathematical  terms,  the  right-hand figure is the ordinal,
              counting from the most significant bit, of the least significant
              bit that is set to one in at least one  sample.   The  left-hand
              figure  is  the ordinal, counting from the least significant re-
              peated sign bit across all samples, of the least significant bit
              that is set to one in at least one sample.

              Bit-depths are not intended to be properties of the  signal  per
              se but properties of its 2's-complement PCM encoding.

              The primary use case of bit-depth measurement concerns manipula-
              tion  of  PCM  audio by simple bit shifting, to answer questions
              such as: "Is it likely that this 24-bit PCM file was created  by
              simply  converting a 16-bit PCM file to 24-bit?" or "Can I loss-
              lessly shift all the samples in this PCM audio file m-bits  left
              or n-bits right?"

              For  multichannel audio, an overall figure for each of the above
              measurements is given and derived from the  channel  figures  as
              follows:  DC offset:  maximum  magnitude;  Max level, Pk lev dB,
              RMS Pk dB, Bit-depth: maximum;  Min level,  RMS Tr dB:  minimum;
              RMS lev dB,  Flat factor,  Pk count:  average; Crest factor: not
              applicable.

              Length s is the duration in seconds of  the  audio  and,  unlike
              stat,  Num samples  is  equal  to  the sample rate multiplied by
              Length.  Scale max is the scaling applied  to  the  first  three
              measurements;  specifically,  it is the maximum value that could
              apply to Max level.  Window s is the length of the  window  used
              for the peak and trough RMS measurements.

              The -j option outputs JSON with three fields:

              "channel_count"
                     An integer.

              "overall"
                     An  object with a member for each row of the first column
                     of the usual output, which are  all  numbers  except  for
                     "bit_depth", which is an array of two numbers.

              "channels"
                     An array of objects with the per-channel values.
       To  know the overall and the channels' member names, have a look at the
       output.

       Like stat, the usual way to measure the characteristics of a single au-
       dio file is:

          sox_ng file.wav -n stats


       stretch [factor [window [fade [shift [fading]]]]]
              Change the audio duration but not its pitch by cross-fading  be-
              tween  short windows of samples.  This effect is broadly equiva-
              lent to the tempo effect with factor inverted and search set  to
              zero  so,  in general, its results are comparatively poor; it is
              retained as it can sometimes outperform tempo for small factors.

              factor determines the change in  length:  >1  lengthens  and  <1
              shortens.  By default, it is 1 (no change)

              window  is the length of the cross-fading window in milliseconds
              with a default of 20.

              The fade option chooses the  type  of  crossfading:  linear  and
              half-cosine  give  equal-gain  crossfading and cannot clip; sqrt
              and quarter-cosine give two kinds of equal-power crossfading.

              The shift ratio can be from 0 to 1 and its  default  depends  on
              the stretch factor: 1 when speeding up, 0.8 when slowing down.

              The  fading  ratio,  from 0 to 0.5, seems to be how much of each
              window is cross-faded with the adjacent ones.  The default value
              depends on factor and shift: 1.0 - (factor x shift) if  speeding
              up, 1.0 - shift if slowing down, with a maximum of 0.5.

              The duration of stretch's output is slightly longer than the du-
              ration  of the input multiplied by factor as it has to empty the
              delay line it uses; tempo is more precise.

       swap   Swap stereo channels.  If the input  is  not  stereo,  pairs  of
              channels  are  swapped and a possible odd last channel is passed
              through.  E.g., for seven channels, the output order will be  2,
              1, 4, 3, 6, 5, 7.

              See remix for an effect that allows arbitrary channel selection,
              ordering and mixing.

       synth [-j key] [-n] [length [offset [phase [p1 [p2 [p3]]]]]]
              {type [combine [fixed[,extra[,mix]]]]
              [freq[:|+|/|-freq2] [offset [phase [p1 [p2 [p3]]]]]]}

              synth  generates fixed or swept frequency audio tones with vari-
              ous wave shapes and wide-band noise of various colors.  Multiple
              synth effects can be cascaded to produce more complex  waveforms
              and at each stage it is possible to choose whether the generated
              waveform  is mixed with or modulated onto the output of the pre-
              vious stage, and the audio for each channel  in  a  multichannel
              audio file can be synthesized independently.

              It  generates  audio at maximum volume (0dBFS), which means that
              there is a high chance of clipping so, in many cases,  you  will
              want to follow it with the gain effect to prevent this from hap-
              pening. (See Clipping above.)

              Though this effect is used to generate audio, an input file must
              still be given, the characteristics of which are used to set the
              synthesized  audio  length,  the number of channels and the sam-
              pling rate.  However, since the input file's audio is  not  nor-
              mally needed, a `null file' (with the special input filename -n)
              is  often  given instead and the length specified as a parameter
              to synth or by some other effect that has an associated length.

              By default, the tuning used with note notations  is  equal  tem-
              perament;  the  -j key option selects just intonation, where key
              is a whole number of semitones relative to A (so for example, -9
              or 3 selects the key of C) or a  note  in  scientific  notation.
              ,SP  By default, the synth effect incorporates the functionality
              of gain -h (see the gain effect for details); synth's -n  option
              may be given to disable this behaviour.

              length is the length of audio to synthesize.  A value of 0 indi-
              cated  to use the input length, which is also the default.  Note
              that, if the input is -n and the length is 0 or absent, it  con-
              tinues generating audio until it is stopped in some other way.

              type is one of

              sine   A sinusoidal wave is the default type and ignores all the
                     p parameters.

              square A square wave.  p1 sets the percentage of each cycle that
                     is `on' with a default of 50.

                       |_______        | +1
                       |       |       |
                       |_______|_______|  0
                       |       |       |
                       |       |_______| -1
                       |               |
                       0       p1      1


              triangle
                     p1  sets  the  percentage  of each cycle that is `rising'
                     with a default of 50.

                       |    .    | +1
                       |   / \   |
                       |__/___\__|  0
                       | /     \ |
                       |/       \| -1
                       |         |
                       0    p1   1


              sawtooth
                     A sawtooth wave. With a phase of 0 it starts  at  -1  and
                     rises  to  1,  and  of  10 it starts at -0.9.  The offset
                     makes no difference.

                       |    /| +1
                       |   / |
                       |__/__|  0
                       | /   |
                       |/    | -1
                       0     1


              trapezium
                     The trapezoidal wave starts at -1, rises linearly  to  1,
                     stays  there,  falls  linearly to -1, stays there and re-
                     peats.  p1 sets the percentage of the cycle in which  the
                     wave is rising with a default of 10, p2 sets the percent-
                     age through each cycle at which falling begins with a de-
                     fault of 50 and p3 sets the percentage through each cycle
                     at which falling ends with a default of 60.

                       |    ______             |+1
                       |   /      \            |
                       |__/________\___________| 0
                       | /          \          |
                       |/            \_________|-1
                       |                       |
                       0   p1    p2   p3       1


              exp    The  exponential  wave  rises from -1 to 1 where it peaks
                     and immediately begins an exponential fall.  p1 sets  the
                     position  of  the  maximum with a default of 50.  p2 sets
                     the minimum amplitude in multiples of 2dB down  from  the
                     maximum  with  a  default  of 50 (100dB); values below 50
                     raise the shoulders of the wave and values above 50 lower
                     the shoulders, increasing the pointedness of the spike.

                       |                           | +1
                       |            /\             |
                       |          _'  `_           | 0
                       |        _-      -_         |
                       |____---'          `---____ | f(p2)
                       |                           |
                       0             p1             1


              whitenoise
                     Random noise with equal power at  every  frequency.   All
                     noise  generators  ignore the frequency and phase parame-
                     ters but if a DC offset is given, the signal's  amplitude
                     is  automatically  adjusted  to  prevent clipping so, for
                     noise in the range 0 to 1, an offset of 0.5 would give  a
                     signal ranging from 0.0 to 1.0 and -0.9 from -1.0 to -0.8

                     noise is a handy alias for whitenoise

              tpdfnoise
                     Noise with a Triangular Probability Density Function.

              pinknoise
                     Random  noise  with the power at each frequency inversely
                     proportional to the frequency.

              brownnoise
                     Random noise with the power at each  frequency  inversely
                     proportional to the frequency squared.

              pluck  A  plucked  string simulation in which an array of sample
                     values representing a taut string is set in motion with a
                     burst of noise and decayed over time.

                     A plucked note's frequency can be from 27.5 to 4220Hz and
                     the sampling rate must be between 44100 and 48000Hz.

                     If a DC offset is used, the  amplitude  is  automatically
                     adjusted to prevent clipping.

                     p1 affects the sustain with a default of 40 (2dB per sec-
                     ond);  higher values give a slower decay and lower values
                     a faster one.

                     p2 and p3 are tone controls for the  initial  excitation,
                     with  default values of 20 and 90 and a special case when
                     p3 is exactly 100.  If the phase is non-zero, it  uses  a
                     different kind of random numbers.

              If the offset, phase and p parameters are given before the first
              type, they set the default values for all the following stages.

              combine is one of

              create Puts  each  stage's output in a new output channel and is
                     the default:

              mix    Mixes the generated audio 50:50 with the input signal.

              amod   Amplitude-modulates (multiplies) the input signal by  the
                     synthesized  one  considered  as  a value from 0 (for the
                     most negative value) to 1 (for the most positive value).

              fmod   Multiplies the input  signal  with  the  synthesized  one
                     (ring modulation).

              vdelay Mixes the input signal with a delayed version of it using
                     the  synthesized  signal to modulate the depth of the de-
                     lay.  The following three-part option fixed[,extra[,mix]]
                     specifies the fixed and additional parts of the delay  in
                     milliseconds  and  what percentage of the output consists
                     of the delayed signal from 0 for all input signal to  100
                     for  all  delayed  signal  with a default of 50 (half and
                     half).

                     The synthesized signal's value from -1 to +1  varies  the
                     delay from fixed seconds to fixed + (0 to extra) seconds.

                     It  interpolates  linearly  between the input samples and
                     can be used to make precision phaser, flanger and chorus-
                     like effects, vibrato and frequency modulation (FM)  syn-
                     thesis  (actually phase modulation, as used in the Yamaha
                     DX7).

                     A chorus-like effect:

                        sox_ng solo.au -d synth sine vdelay 50,2,50 .25 0 75

                     A flanger:

                        sox_ng solo.au -d synth triangle vdelay 0,2,41.52 0.5 0 0


              freq and freq2 are the frequencies at the beginning and  end  of
              the synthesis and the default frequency is 440Hz.

              If freq2 is given, length must also have been given and the gen-
              erated  tone  is  swept  between the given frequencies.  The two
              given frequencies must be separated by  one  of  the  characters
              `:',  `+', `/' and `-', which specify the sweep function as fol-
              lows:

              :      Linear: the tone changes by a fixed number of  hertz  per
                     second.

              +      Square:  a  second-order  function  is used to change the
                     tone.

              /      Exponential: the tone changes by a fixed number of  semi-
                     tones per second.

              -      Exponential:  as `/', but the initial phase is always ze-
                     ro, and with stepped (less smooth) frequency changes.

              The frequency or frequency range  is  not  used  for  the  noise
              types.

              offset  is  the  bias  (DC offset) of the signal in percent; de-
              fault=0.

              phase is the phase shift as a percentage of 1 cycle with  a  de-
              fault of 0 (not used for noise).

              For  example, the following produces a 3-second 48kHz audio file
              containing a sine wave swept from 300 to 3300Hz:

                 sox_ng -n output.wav synth 3 sine 300-3300

              Multiple channels can be synthesized by specifying  the  set  of
              parameters  shown  between curly braces multiple times; the fol-
              lowing puts the swept tone in the left channel and  brown  noise
              in the right:

                 sox_ng -n output.wav synth 3 sine 300-3300 brownnoise

              The following example shows how two synth effects can be cascad-
              ed to create a more complex waveform:

                 play_ng -n synth 0.5 sine 200-500 synth 0.5 sine fmod 700-100

              The following could be used to help tune a guitar:

                 for n in E2 A2 D3 G3 B3 E4; do
                   play_ng -n synth 4 pluck $n repeat 2; done


       tempo [-q] [-m|-s|-l] factor [segment(82) [search(14.68) [over-
       lap(12)]]]
              Change  the  audio playback speed but not its pitch. This effect
              uses the WSOLA (Waveform Similarity OverLap and Add)  algorithm.
              The  audio is chopped up into segments which are then shifted in
              the time domain and overlapped  (cross-faded)  at  points  where
              their  waveforms  are most similar as determined by the measure-
              ment of `least squares'.

              By default, linear searches are used to find the  best  overlap-
              ping points. If the optional -q parameter is given, tree search-
              es  are  used  instead. This makes the effect work more quickly,
              but the result may not sound as good. However, if you  must  im-
              prove  the  processing  speed,  this generally reduces the sound
              quality less than reducing the search or overlap values.

              The -m option is used to optimize the default values of segment,
              search and overlap for music processing.

              The -s option is used to optimize  default  values  of  segment,
              search and overlap for speech processing.

              The  -l  option  is  used to optimize default values of segment,
              search and overlap for `linear' processing that tends  to  cause
              more  noticeable  distortion  but  may  be useful when factor is
              close to 1.

              If -m, -s or -l is specified, the default value  of  segment  is
              based  on  factor,  while  default search and overlap values are
              based on segment.  Any values you provide override these default
              values.

              factor gives the ratio of new tempo to the  old  tempo,  so  1.1
              speeds the tempo up by 10% and 0.9 slows it down by 10%.

              The  optional  segment parameter selects the algorithm's segment
              size in milliseconds.  If no other flags are specified, the  de-
              fault value is 82, which is suited to small changes in the tempo
              of music. For larger changes (e.g. a factor of 2), 41 may give a
              better result.  The -m, -s, and -l flags cause segment's default
              value to be adjusted automatically based on factor.

              The  optional  search  parameter  gives the audio length in mil-
              liseconds over which  the  algorithm  searches  for  overlapping
              points.   If  no other flags are specified, the default value is
              14.68.  Larger values use more processing time and  may  or  may
              not  produce  better  results.   A practical maximum is half the
              value of segment. Search can be reduced to cut  processing  time
              at the risk of degrading output quality. The -m, -s and -l flags
              cause  the  search default to be adjusted automatically based on
              segment.

              The optional overlap parameter gives the segment overlap  length
              in  milliseconds.  Its default value is 12 but the -m, -s and -l
              flags automatically adjust it based on the  segment  size.   In-
              creasing  overlap  increases  processing  time  but may increase
              quality.  A practical maximum for overlap is a little less  then
              search.

              See  speed  for an effect that changes tempo and pitch together,
              pitch and bend for effects that change pitch  only  and  stretch
              for  an  effect  that  changes the tempo using a different algo-
              rithm.

       treble gain [frequency [width[s|h|k|o|q]]]
              Apply a treble tone control effect.  See the description of  the
              bass effect for details.

       tremolo speed [depth]
              Apply  a tremolo (low frequency sinusoidal amplitude modulation)
              effect to the audio.  The frequency of the tremolo in Hz is giv-
              en by speed and its depth is a percentage with a default of 40.

       trim {position(+)}
              Cuts out portions of the audio.  Any number of positions may  be
              given;  audio is not sent to the output until the first position
              is reached.  The effect then alternates between copying and dis-
              carding audio at each position.  Using a  value  of  0  for  the
              first  position  parameter  allows copying from the beginning of
              the audio.

              For example,

                 sox_ng in.au out.au trim 0 10

              copies the first ten seconds, while

                 play_ng in.au trim 12:34 =15:00 -2:00

              and

                 play_ng in.au trim 12:34 2:26 -2:00

              both play from 12 minutes 34 seconds into the  audio  up  to  15
              minutes  in  (i.e.  2  minutes  and 26 seconds long) then resume
              playing two minutes before the end.

       upsample [factor(2)]
              Upsample the signal by an integer factor:  factor-1  zero-valued
              samples  are  inserted between each pair of input samples.  As a
              result, the original spectrum is replicated into  the  new  fre-
              quency space and attenuated.  This attenuation can be compensat-
              ed  for  by adding vol factor.  The upsample effect is typically
              used in combination with filtering effects.

              For a general resampling effect  with  antialiasing,  see  rate.
              See downsample.

       vad [options]
              The  Voice  Activity Detector attempts to trim silence and quiet
              background sounds from the ends of (fairly high resolution  i.e.
              16-bit, 44-48kHz) recordings of speech.  The algorithm currently
              uses a simple cepstral power measurement to detect voice, so may
              be  fooled  by  other  things, especially music.  The effect can
              trim only from the front of the audio, so in order to trim  from
              the back, the reverse effect must also be used.  E.g.

                 play_ng speech.wav norm vad

              to trim from the front,

                 play_ng speech.wav norm reverse vad reverse

              to trim from the back and

                 play_ng speech.wav norm vad reverse vad reverse

              to  trim  from  both ends.  The use of the norm effect is recom-
              mended, but remember that neither reverse nor norm  is  suitable
              for use with streamed audio.

              Options
              Default  values  are  shown in parentheses, the allowed range in
              square brackets.

              -t num (7) [0 - 20]
                     The measurement level used to trigger activity detection.
                     This might need to be changed depending on the noise lev-
                     el, signal level and other characteristics of  the  input
                     audio.

              -T num (0.25) [0.01 - 1]
                     The  time constant (in seconds) used to help ignore short
                     bursts of sound.

              -s num (1) [0.1 - 4]
                     The amount of audio  (in  seconds)  to  search  for  qui-
                     eter/shorter  bursts of audio to include prior to the de-
                     tected trigger point.

              -g num (0.25) [0.1 - 1]
                     Allowed gap (in seconds) between  quieter/shorter  bursts
                     of audio to include prior to the detected trigger point.

              -p num (0) [0 - 4]
                     The  amount  of audio (in seconds) to preserve before the
                     trigger point and any found quieter/shorter bursts.

              Advanced Options
              These allow fine tuning of the algorithm's internal parameters.

              -b num (0.35) [0.1 - 10]
                     The algorithm uses adaptive noise estimation/reduction in
                     order to detect the start of the wanted audio.  This  op-
                     tion sets the time in seconds for the initial noise esti-
                     mate.

              -N num (0.1) [0.1 - 10]
                     Time  constant  used by the adaptive noise estimator when
                     the noise level is increasing.

              -n num (0.01) [0.001 - 0.1]
                     Time constant used by the adaptive noise  estimator  when
                     the noise level is decreasing.

              -r num (1.35) [0 - 2]
                     Amount  of  noise reduction to use in the detection algo-
                     rithm.

              -f num (20) [5 - 50]
                     Frequency of the algorithm's processing/measurements.

              -m num (0.1) [0.01 - 1]
                     Measurement duration. By default, it is  twice  the  mea-
                     surement  period;  i.e.  with 50% overlap, but if you set
                     -f, you also need to change -m to 2 divided by its  value
                     to keep a 50% overlap.

              -M num (0.4) [0-1 - 1]
                     Time constant used to smooth spectral measurements.

              -h freq (50) [10 -]
                     `Brick-wall' frequency of the high-pass filter applied at
                     the detector algorithm's input.

              -l freq (6000) [1000 -]
                     `Brick-wall'  frequency of the low-pass filter applied at
                     the detector algorithm's input.

              -H freq (150) [10 -]
                     `Brick-wall' frequency of the high-pass  lifter  used  in
                     the detector algorithm.

              -L freq (2000) [1000 -]
                     `Brick-wall' frequency of the low-pass lifter used in the
                     detector algorithm.

              See the silence effect.

       vol gain [type [limiter-gain]]
              Apply  amplification or attenuation to the audio signal.  Unlike
              -v, which is used for balancing multiple input files as they en-
              ter the SoX effects processing chain, vol is an effect like  any
              other  so  can  be  applied anywhere in the processing chain and
              several times if necessary.

              The amount to change the volume is given by gain which is inter-
              preted, according to the given type, as follows: if type is  am-
              plitude  (or is omitted), gain is an amplitude ratio (voltage or
              linear), if power, a power ratio (wattage  or  voltage  squared)
              and if dB, a power change in dB.

              When  type  is amplitude or power, a gain of 1 leaves the volume
              unchanged, less than 1 decreases it, and greater than 1 increas-
              es it; a negative gain inverts the audio signal in  addition  to
              adjusting its volume.

              When  type  is dB, a gain of 0 leaves the volume unchanged, less
              than 0 decreases it and greater than 0 increases it.

              See [4] for a detailed discussion on electrical (and hence audio
              signal) voltage and power ratios.

              Beware of Clipping when the increasing the volume.

              The gain and the type parameters can be concatenated if desired,
              e.g.  vol 10dB.

              An optional limiter-gain value can be specified and should be  a
              value  much  less than 1 (e.g. 0.05 or 0.02) and is used only on
              peaks to prevent clipping.  Not specifying this parameter causes
              no limiter to be used.  In verbose mode,  this  effect  displays
              the percentage of the audio that needed to be limited.

              See  gain  for a volume-changing effect with different capabili-
              ties and compand for a dynamic range  compression/expansion/lim-
              iting effect.

ENVIRONMENT
       SoX reacts to the following environment variables.  To set them on Unix
       with most shells, use, for example:

          AUDIODRIVER=oss
          export AUDIODRIVER
          play_ng ...

       with Unix csh:

          setenv AUDIODRIVER oss

       or, on Microsoft Windows:

          set AUDIODRIVER=waveaudio

       MS-Windows  GUI:  via  Control  Panel : System : Advanced : Environment
       Variables

       Mac OS X GUI: Refer to Apple's Technical Q&A QA1067 document.

       AUDIODRIVER
              On some systems, SoX may have more than one type of  audio  dri-
              ver,  e.g.  ALSA  and OSS or SUNAU and AO and they can have more
              than one audio device (a.k.a. `sound card').  If more  than  one
              audio driver has been built into SoX and the default selected by
              SoX when recording or playing is not the one that is wanted, the
              AUDIODRIVER environment variable can be used to override the de-
              fault.  For example, on Unix systems:

                 AUDIODRIVER=oss
                 export AUDIODRIVER
                 play_ng ...

              If  it is unset, SoX tries to use, in order, coreaudio, pulseau-
              dio, alsa, waveaudio, sndio, oss, sunau and ao.  For further de-
              tails on these, see their entries in soxformat_ng(7).

       AUDIODEV
              Override the default audio device, e.g.

                 AUDIODEV=/dev/dsp2
                 export AUDIODEV
                 play_ng ...
                 sox_ng ... -t oss

              or

                 AUDIODEV=hw:soundwave,1,2
                 export AUDIODEV
                 play_ng ...
                 sox_ng ... -t alsa

              If AUDIODEV is unset and the audio driver is oss, SoX  also  re-
              sponds to the standard environment variable OSS_AUDIODEV.

       LADSPA_PATH
              A  colon-separated  list  of  directories in which to search for
              LADSPA plugins.  The default depends on how SoX was built but on
              Unix it defaults to /usr/lib/ladspa, on MacOS/X to  /Library/Au-
              dio/Plug-Ins/LADSPA.   Windows  doesn't have a "usual place" for
              LADSPA plugins, but Ardour puts  them  in  C:\Program  Files\Ar-
              dour6\lib\ardour6\ladspa or similar.

       LD_LIBRARY_PATH
              When  searching  for the dynamic libraries in which most effects
              and format handlers may be stored, according to how your SoX was
              built, look in this colon-separated list of  directories  before
              the default location.

       MIXERDEV
              When  playing a file, use the specified mixer device for the 'v'
              and 'V' volume control keys.

       SOX_OPTS
              Provide alternative default values  for  SoX's  global  options.
              For example:

                 SOX_OPTS="--buffer 20000 --play-rate-arg -hs --temp /mnt/temp"
                 export SOX_OPTS

              Note  that  setting  SOX_OPTS can create unwanted changes in the
              behaviour  of  scripts  or  other  programs  that  invoke   SoX.
              SOX_OPTS might best be used for things that reflect the environ-
              ment  in  which  SoX  is  being run and enabling options such as
              --no-clobber by default might be handled better  using  a  shell
              alias  since  that will not affect SoX's operation in scripts or
              when it is used by other programs.

              One way to ensure that scripts and programs cannot  be  affected
              by SOX_OPTS is to clear SOX_OPTS at the start of the script, but
              this  loses  the  benefit  of  SOX_OPTS carrying system-wide de-
              faults.

       TEMP and TMP
              On Windows, tmpfile() is broken  -  it creates the file  in  the
              root directory of the current drive instead of in a valid tempo-
              rary  directory   -  but if TEMP or TMP are set, it creates them
              in the directory indicated, otherwise in the current  directory.
              To force use of tmpfile(), use --temp .

              Alternatively,  and  on  Unix,  use  --temp  (see Global Options
              above).

EXIT STATUS
       SoX exits 0 when there is no error, 1 if there is a  problem  with  the
       command-line  parameters  or 2 if an error occurs during audio process-
       ing.

BUGS
       Please report any bugs found in this version of SoX to the mailing list
       <sox-ng@groups.io>.

CITATION
       To cite SoX in publications please use:

       Lance Norskog, Chris Bagwell et al. (2015).
       SoX: Sound eXchange, the Swiss Army knife of audio manipulation.
       URL http://sox.sourceforge.net

       A BibTeX entry for SoX users is

       @manual{SoX2015,
         title = "SoX: Sound eXchange, the Swiss Army knife of audio manipulation",
         author = "Norskog, Lance and Bagwell, Chris and others",
         edition = "14.4.2",
         year = 2015,
         url = "http://sox.sourceforge.net",
       }


SEE ALSO
       soxi_ng(1),  soxformat_ng(7),  libsox_ng(3)  audacity(1),  ecasound(1),
       gnuplot(1), octave(1).
       The sox_ng web site at https://codeberg.org/sox_ng/sox_ng
       SoX         scripting         examples         at         https://code-
       berg.org/sox_ng/sox_ng/src/branch/main/scripts

   References
       [1]    R. Bristow-Johnson, Cookbook formulae for audio EQ biquad filter
              coefficients,
              https://www.w3.org/TR/audio-eq-cookbook

       [2]    Wikipedia, Q-factor,
              http://en.wikipedia.org/wiki/Q_factor

       [3]    Scott Lehman, Effects Explained,
              https://codeberg.org/sox_ng/Effects-Explained

       [4]    Wikipedia, Decibel,
              http://en.wikipedia.org/wiki/Decibel

       [5]    Richard Furse, Linux Audio Developer's Simple Plugin API,
              http://www.ladspa.org

       [6]    Richard Furse, Computer Music Toolkit,
              http://www.ladspa.org/cmt/overview.html

       [7]    Steve Harris, LADSPA plugins,
              http://plugin.org.uk

LICENSE
       Copyright 1998-2013 Chris Bagwell and SoX Contributors.
       Copyright 1991 Lance Norskog and Sundry Contributors.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published  by  the
       Free Software Foundation, version 2.

       This  program  is  distributed  in the hope that it will be useful, but
       WITHOUT ANY  WARRANTY;  without  even  the  implied  warranty  of  MER-
       CHANTABILITY  or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
       Public License for more details.

AUTHORS
       Lance Norskog, Chris Bagwell and many others listed in the AUTHORS file
       that is distributed with the source code.

sox_ng                         December 05, 2024                        SoX(1)
