-----------------------------------------------------------------
The Pitch-Scaled Harmonic Filter (PSHF)
developed by Philip Jackson and David Moreno
-----------------------------------------------------------------

Copyright (c) 2001-2010 Philip Jackson
              2002-2004 David Moreno
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

- Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
- Neither the name of the PSHF working group nor the names of its
  contributors may be used to endorse or promote products derived
  from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-----------------------------------------------------------------

Version 3.13 of the PSHF, a public release, has been prepared and is
ready for you to use for both research and commercial purposes.
Executables run under Linux and Windows, and the source code is
available, but they come with no warranty or support. The PSHF may be
used, modified and copied without charge, while the copyright is
retained by the authors.

The Linux edition is available at
  http://www.ee.surrey.ac.uk/Personal/P.Jackson/PSHF/code/pshf_3.13_linux.tar.gz
and the Windows edition at
  http://www.ee.surrey.ac.uk/Personal/P.Jackson/PSHF/code/pshf_3.13_win32.zip

Once downloaded, it should simply be a matter of unpacking it, e.g.:
> gunzip pshf_3.13_linux.tar.gz
> tar -xvf pshf_3.13_linux.tar
It can then be run from the command line (with the appropriate
arguments):
> ./pshf
to return the usage info, or from the directory,
> ../pshf -E 8 -e 4 -i 1 -d 1 -S ./scriptfile.scp
This call is already in the shell file there. The script file lists
all the files you wish to process, each of which requires:
(1) a set of raw pitch estimates as input, along with
(2) the waveform.
If you have Matlab, it's a good idea to run the Matlab script to
inspect the f0 tracks and speech waveforms and check that the PSHF is
behaving as expected!

In Windows, you should be able to run the PSHF from the command line,
e.g., in an MS-DOS window, or use a tool that provides the
functionality of Unix scripts from within the Windows environment.

FREQUENTLY ASKED QUESTIONS
===========================================

---------------------------------------------------------------------
Q.1: The supplied waveform is "1473533"; if I want to supply another
     waveform, what should I do?
---------------------------------------------------------------------
A.1: If you want to try other input files you can either use the
command directly with the appropriate arguments, e.g.,
> ./pshf -E 8 -e 4 -i 1 -d 1 raw_pitch.f0 waveform.wav output
or change the script file. Each line of the script file should
contain three entries: the path and filename of the input pitch file,
the path and filename of the input waveform, and the output path with
the stem of the output filename (e.g., "result1", in order to produce
"result1_v.wav" and "result1_u.wav"). The current contents of the
file look like this:
  in/pitch.f0 in/waveform.wav out/result1
Therefore, if you want to supply a different waveform, you just have
to change "in/waveform.wav" to your new path and filename. You will
also have to provide a new input pitch file, containing initial
estimates of the fundamental frequency, f0, over the course of the
utterance.

Also, if you just type "./pshf" at the command line (or "pshf.exe" in
Windows) with no arguments, you should get back this help about usage:

-- PSHF v3.13 by Philip J.B. Jackson & David M. Moreno, (c) 2010 --
USAGE: PSHF [options] pitch_file wave_file [output_path]
 Flag  Default  Explanation
 ----  -------  -----------
  -b   [4]      Number of periods used in algorithm.
  -d   [2]      Initial step size (as a power of 2).
  -e   [10.0]   External pitch sampling period (ms).
  -i   [10.0]   Internal pitch sampling period (ms).
  -m   [40.0]   Minimum fundamental frequency (Hz).
  -t   False    Whether fast optimisation pitch is performed.
  -E   [20.0]   External pitch offset (ms).
  -H   [8]      Number of harmonics in cost function.
  -M   [500.0]  Maximum fundamental frequency (Hz).
  -P   False    Whether power-based pair are produced.
  -T   [0]      Different levels of reporting.
  -S   None     Script of pitch & wave files, and output path.
---------------- PSHF terminated without operation ----------------

---------------------------------------------------------------------
Q.2: When I want to process other sound/speech samples, do I have to
     use the same filenames as in the supplied example?
---------------------------------------------------------------------
A.2: No, you are free to change them as you wish. There could be
problems if you use filenames that contain spaces or non-alphanumeric
characters (like "&" or "%", perhaps), but otherwise you can just put
the new names in at the command line or edit the script file. Also,
the script file can contain as many lines as you want (to process
more than one file at a time), e.g.,
  dir1/file1.f0 in/file1.wav out/file1
  dir2/file2.f0 in/file2.wav out/file2
  dir3/name3.f0 in/wav3.wav out/output...

---------------------------------------------------------------------
Q.3: What do you normally use to view the results?
---------------------------------------------------------------------
A.3: For plotting the waveforms, I usually use Matlab, but once you
have produced the files it's entirely up to you. Alternatives include
CoolEdit, Audacity, Praat and SFS.

---------------------------------------------------------------------
Q.4: How should I set the internal pitch rate?
---------------------------------------------------------------------
A.4: Common default values are 5 ms and 10 ms, given the typical rate
of pitch pulses, but I have also used 1 ms to reduce the effects of
sudden changes in f0. There are four main cases for the internal rate
(-i), in relation to the external pitch rate (-e, defined by your
input raw pitch file):
(a) if -e <= -i then -i = -e, and no upsampling is performed;
(b) if -e > -i then -i is rounded according to the resolution set by
    1/FS, and upsampling is performed (e.g., FS=8000 Hz gives a
    resolution of 0.125 ms);
(c) if -i = 0 then -i = 1000/FS ms, and upsampling is performed;
(d) if -i is omitted then -i = -e.
=======================================================================
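
As a rough illustration of the four cases in A.4, the following Python
sketch works out which internal pitch period would result from a given
pair of -e and -i values. It is not taken from the PSHF source: the
function name and arguments are hypothetical, and case (b) is assumed
to round to the nearest whole number of samples (at least one), which
may differ from what the PSHF actually does.

```python
def resolve_internal_period(external_ms, fs_hz, internal_ms=None):
    """Sketch of the A.4 rules (hypothetical helper, not PSHF code).

    Returns (internal period in ms, whether upsampling is performed).
    """
    resolution_ms = 1000.0 / fs_hz  # duration of one sample in ms

    # (d) -i omitted: the internal period follows the external one.
    if internal_ms is None:
        return external_ms, False

    # (c) -i = 0: use the finest available period, one sample.
    if internal_ms == 0:
        return resolution_ms, True

    # (a) -e <= -i: fall back to the external period, no upsampling.
    if external_ms <= internal_ms:
        return external_ms, False

    # (b) -e > -i: snap -i to a whole number of samples.
    # ASSUMPTION: nearest-sample rounding, never fewer than one sample.
    n_samples = max(1, round(internal_ms / resolution_ms))
    return n_samples * resolution_ms, True


if __name__ == "__main__":
    # The example call in this README uses -e 4 -i 1; FS=8000 Hz is the
    # sampling rate quoted in case (b).
    print(resolve_internal_period(4.0, 8000, 1.0))
    print(resolve_internal_period(10.0, 8000, 0))
    print(resolve_internal_period(5.0, 8000, 10.0))
    print(resolve_internal_period(10.0, 8000))
```

With FS=8000 Hz, a requested -i of 1 ms corresponds to exactly 8
samples, so it is left unchanged under this rounding assumption.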