Factors to be considered when
recording for Real Singer
If a passing noise interferes with your recording, simply re-record it. If
you have companions, suggest that they go elsewhere, since their small
movements may escape your attention but still be heard in the
recording. Yes, your friend sitting behind you is giggling while you
are trying to record...
Depending on your country, alternating current
(AC) hum has a frequency of 60Hz or 50Hz, with overtones throughout the
vocal range, particularly at 180Hz (150Hz). At right is the spectrum of
AC hum from a noisy setup. It is very important to reduce AC hum, since
it is difficult to remove this noise without distorting your voice.
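One way to put a number on the hum described above is to measure the signal amplitude at the mains frequency and its first few multiples in a recording of "silence". The following is only a minimal sketch, assuming the recording is already available as a list of floating-point samples; the function names `goertzel_amplitude` and `hum_profile` are made up for this illustration and are not part of Real Singer.

```python
import math

def goertzel_amplitude(samples, sample_rate, freq_hz):
    """Estimate the amplitude of the sinusoid near freq_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq_hz / sample_rate)      # nearest DFT bin to freq_hz
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of DFT bin k; guard against tiny negative rounding error.
    mag2 = max(s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2, 0.0)
    return 2.0 * math.sqrt(mag2) / n

def hum_profile(samples, sample_rate, mains_hz, harmonics=(1, 2, 3)):
    """Return {frequency: amplitude} at the mains frequency and its multiples."""
    return {mains_hz * h: goertzel_amplitude(samples, sample_rate, mains_hz * h)
            for h in harmonics}

# Synthetic example: one second of 60Hz hum plus a weaker 180Hz overtone.
rate = 8000
hum = [0.1 * math.sin(2 * math.pi * 60 * t / rate)
       + 0.05 * math.sin(2 * math.pi * 180 * t / rate) for t in range(rate)]
profile = hum_profile(hum, rate, 60)
```

Use 50 instead of 60 for `mains_hz` where the mains frequency is 50Hz. Comparing the profile before and after a change to the setup gives a rough indication of whether the change helped.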
If you use a laptop computer, the simplest way to reduce AC hum is to use
battery power, and have no peripheral devices connected. If you do use
peripherals, the cables should be disconnected at the computer when
power is off, not left dangling from the computer.
When a preamplifier (or tape deck) and a computer are connected, and when
both of them use AC power, the amount of hum depends on how the power
cords are plugged in. The loudest hum is produced when the
two devices are plugged into different wall sockets. If one device has
an AC auxiliary outlet, plug the second device into it, rather than to
the wall. Or, plug both devices into a single extension cord. If the
power plugs are not polarized (that is, if they can be inserted into
the outlet with prongs reversed), try reversing the prongs.
Some microphones will pick up a lot of AC hum
when you touch them. If that happens, mount the mike on an insulating
stand, instead of holding it. If you do not have a mike stand, try
taping the mike to a wooden stick, held vertical by taping it to the
back of a chair. Pay careful attention to this. Just because a
microphone can be held in the hand, does not mean that it should be
held. If you use a headset microphone, see if AC hum is reduced when
you remove the headset.
Be sure that your microphone cable does not run near any power cords.
There may be power lines underneath your floor, so try moving the
microphone cable. The same applies to the cord between computer and
preamp or tape deck, if you are using one. It is especially important
to stay away from motor-driven devices, including ceiling fans.
If you see a noise spectrum like the one above, but the fundamental
frequency of your AC is not the first peak, then the source of noise is
probably a motor-driven appliance. Machine sounds are common.
You have learned to ignore your refrigerator, heating, ventilation,
computer fans, and ticking clocks. But if they are present, they will
be included in your voice recording. Consider turning off machines -
but don't forget to turn them back on again! If you have a lot of noise
distributed evenly across the spectrum, it may be caused by air rushing
through a ventilation system.
System noise is caused by the electrical properties of your system. If this
noise is small, Real Singer can analyze it and reduce its effect. But
if the system noise is too large, you will have to try a different
If your computer's sound card is poor, it will detect
electrical noise from the surrounding circuitry and include it in your
recording. This is especially true if you are using a microphone
connected directly to the computer's microphone input. If you have
eliminated all other possible noise sources and still have too much
unexplained noise, this may be the culprit. Try recording your voice to
a tape deck, or using a pre-amplifier, so that you can feed the preamp
line-out to your computer's sound card line-in, instead of to the
microphone jack. Remember that external recording equipment usually
requires a different kind of microphone than the kind used directly by
If you are using a tape deck, it is better to use high-bias or metal tapes
and noise reduction. Do not use automatic gain control. Do not use a
microphone "built in" to the recorder.
Sound quality factors
The human voice contains important frequency components across a broad
range. The fundamental pitch of sung notes is below 500Hz (even lower
for the male voice), with important overtones at higher frequencies. The range
around 2-6kHz contains frequencies that add color and definition to the
voice, especially during some consonants and transitions.
Be sure that your microphone has a smooth frequency response across this
spectrum. If the mike is normally sensitive only to low frequencies,
but has an artificial boost for the highest frequencies, then your
recorded voice will sound too bright. Some computer mikes intended for
speech recognition (conversion of words to text) may have such an
artificial frequency response. But as long as the mike responds
across the vocal range, it is not necessary to have a level (flat)
frequency response, since Real Singer includes an equalizer.
Saturation and clipping
Saturation and clipping occur when an input signal is too large. This can occur at
the microphone, or at any stage of signal processing.
If your voice is too loud, the microphone will distort the sound, even if
the electrical output from the mike is within the acceptable range.
Computer microphones often have a low dynamic range, meaning that there
is not much difference between the softest sounds they can detect above
the noise, and the loudest sounds they can accept without distortion.
When recording to Real Singer, it is important to keep your voice at
uniform loudness. This is especially true if you are using a computer
audio microphones have a much greater range of loudness that they can
accept without distortion. But the range of electrical signals produced
is also large. This kind of microphone is used with a preamplifier (or
tape deck, acting as preamp). Be sure to pay attention to the VU or
other signal amplitude meter. It is OK to briefly exceed a limit if the
sound is in an unimportant part of a word, far from the phoneme that
you are trying to validate.
Do not use automatic gain control (AGC) for
recording to Real Singer. The distortions introduced by AGC are likely
to be greater than the distortions removed. It is better to move away
from the microphone, or manually adjust volume controls. Portable tape
recorders, and office-style voice recorders, usually use AGC. Avoid
using these devices, if you can.
If you are transferring a signal into your computer from a preamp or tape
deck, be sure to use the correct jacks. Never take a signal from a jack
intended to directly drive loudspeakers. The best connection is
line-out to line-in.
If you are using an audio editor to apply digital filters to a
pre-recorded waveform, be sure that the filter does not clip your sound.
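A simple way to check for this is to look at the peak level of the filtered waveform before exporting it. The sketch below assumes floating-point samples in which 1.0 represents full scale; the names `peak_dbfs` and `max_safe_gain_db`, and the 1 dB safety margin, are illustrative choices rather than anything prescribed by Real Singer or by a particular editor.

```python
import math

def peak_dbfs(samples, full_scale=1.0):
    """Peak level in dB relative to full scale (0 dB means the peak just reaches the limit)."""
    peak = max(abs(x) for x in samples)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak / full_scale)

def max_safe_gain_db(samples, full_scale=1.0, margin_db=1.0):
    """Largest gain (in dB) that keeps the peak margin_db below full scale."""
    return -peak_dbfs(samples, full_scale) - margin_db

filtered = [0.5, -0.25, 0.1]          # stand-in for a filtered waveform
level = peak_dbfs(filtered)           # about -6 dB
headroom = max_safe_gain_db(filtered) # about 5 dB of gain is still safe
```

A peak above 0 dB means the waveform will be clipped when it is written to a fixed-point audio file, so the gain should be reduced first.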
Some consonants are difficult to record, because they are soft and
create a lot of breath wind. In English, these are f, h, s, and th.
You will need to place your mouth close to the microphone, but not
allow the breath wind to touch it. It helps to feel the air stream
coming from your mouth when you make these sounds, to ensure that the
mike is correctly placed.
Some other consonants are difficult to record, because
they are abrupt. In English, these are b, d, hard g (go), k, and p.
These sounds have a moment of high intensity that quickly tapers to a short
sound. If spoken too loudly, the intense part will saturate or clip. If
spoken too softly, the tapered part will not be detected. Or, if you
naturally speak these consonants softly, Real Singer may decide that
your voice is "too loud" based on the part of the recorded word leading
up to the consonant. Resist the temptation to speak these in an
un-natural manner, to "help" Real Singer find them. If you do that,
Real Singer will find an un-natural sound!
If you are having difficulty producing a satisfactory recording of these
consonants, or if you would generally like to change what Real Singer
hears from you, then pre-record your voice and use an audio editor. You
can reduce the amplitude of an unnecessary part of a word that is "too
loud," so that a necessary, softer part can be accepted. But it is
usually not advisable to edit the volume in the portion of sound that
contains the desired phoneme, because that will interfere with
Using an audio editor
An audio editor is a program that will open an audio file, change its
contents, and export the result to a new audio file. One such program
is the free Audacity (Windows or Macintosh) available from
sourceforge.net. In addition to opening and exporting WAV files, it can
open and export Vorbis OGG files. These files can be used by Real
Singer in place of a live voice.
With an audio editor, you can: (1) Import a lengthy recording of several
words, and slice it into individual words. (2) Adjust the volume or
equalization. (3) Inspect sounds for the presence of sudden noise
events. (4) Apply special effects (not recommended for Real Singer).
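Slicing a long recording into words, item (1) above, is normally done by hand in the editor, but the idea can be sketched in a few lines. This is a rough illustration only, assuming floating-point samples and reasonably quiet pauses between words; `split_on_silence`, its threshold and its gap length are hypothetical and would need tuning for a real recording.

```python
def split_on_silence(samples, threshold=0.02, min_gap=2000):
    """Return (start, end) index pairs for stretches of sound that are
    separated by at least min_gap consecutive quiet samples."""
    segments = []
    start = None       # index where the current word began
    last_loud = None   # index of the most recent sample above threshold
    for i, x in enumerate(samples):
        if abs(x) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_gap:
            # The pause is long enough: close the current word.
            segments.append((start, last_loud + 1))
            start = None
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments

# Two short bursts of sound separated by a pause.
demo = [0.0] * 10 + [0.5] * 5 + [0.0] * 10 + [0.5] * 3 + [0.0] * 2
words = split_on_silence(demo, threshold=0.1, min_gap=4)
```

Because a waveform passes through zero many times within a word, `min_gap` has to be much longer than one cycle of the lowest pitch, otherwise a single word would be cut into pieces.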
With an audio editor, you can help find sources
of noise by looking at noise amplitude and spectra. The most valuable
use is to inspect the recording waveforms for the presence of
saturation and clipping. For this reason, it is a good idea to pre-test
your method of recording, inspect its results with an audio editor, and
make any necessary changes to your setup. Then Real Singer will have
quality sound to use for your voice.
Saturation occurs when an increase in sound power produces less than the
proportional increase in recorded signal power. Saturation is often
desirable; it is certainly better than clipping. But in Real Singer, it
is better to avoid saturation, because the recorded tone will
be used in soft and loud passages. If you look at sample recordings of
your voice with an audio editor, and see that the recorded amplitude
is always about the same during both loud and soft parts of your
then you may have saturation. (Or, you may be a master at keeping your
voice at an even level!) Try recording at lower volume, or move the
microphone slightly farther from your mouth. Be sure that automatic
gain control is not in use. Avoid saturation in, or near, any part of
the word that will be used for the phoneme.
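The visual check described above (does the recorded amplitude stay about the same in loud and soft parts?) can be approximated numerically by comparing the RMS level of successive short windows. The sketch below is an assumption-laden illustration, not a feature of Real Singer: the helper names, the window length and the silence floor are all arbitrary choices.

```python
import math

def window_rms(samples, window):
    """RMS level of each consecutive, non-overlapping window."""
    levels = []
    for i in range(0, len(samples) - window + 1, window):
        chunk = samples[i:i + window]
        levels.append(math.sqrt(sum(x * x for x in chunk) / window))
    return levels

def envelope_spread_db(samples, window, floor=1e-4):
    """dB ratio between the loudest and softest windows, ignoring
    windows quieter than floor (treated as silence)."""
    levels = [v for v in window_rms(samples, window) if v > floor]
    if len(levels) < 2:
        return 0.0
    return 20.0 * math.log10(max(levels) / min(levels))

# A loud half followed by a soft half: the spread is about 6 dB.
varied = [0.5, -0.5] * 50 + [0.25, -0.25] * 50
spread = envelope_spread_db(varied, window=100)

# The same level throughout: the spread is 0 dB.
flat = envelope_spread_db([0.5, -0.5] * 100, window=100)
```

A spread close to 0 dB for a passage that was deliberately sung both loudly and softly is a hint of saturation, not proof; as noted above, a very even voice produces the same result.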
At right are some images (at reduced size) from an
audio editor. The top image shows a waveform that has been properly
recorded. Even though Virtual Singer plays a sample word with very
uniform amplitude, the live human voice varies in amplitude. These
irregularities can be seen in the envelope of the waveform.
The second image is the same sound, recorded with saturation. Notice how the
irregularities of the envelope have been smoothed. Examining the
spectra would show that certain frequencies are more prominent in the
waveform with saturation than in the unsaturated wave.
The third image shows
clipping, in this case caused by too large an electrical signal at the
sound card input. Notice how the envelope has been flattened
(flattening may be symmetrical or asymmetrical).
The fourth image also
shows clipping, even though the recorded waveform has lower amplitude
than before. In this case, the clipping occurred at the microphone,
because the sound was too loud. The electrical signal was reduced by
the sound card volume control. However, once a wave is clipped, it
cannot be un-clipped.
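The flattening shown in the third and fourth images can also be searched for automatically, by looking for runs of consecutive samples that sit at the same value away from zero. The following is a minimal sketch under that assumption; `find_flat_tops` and its parameters are invented for this example. With `tol=0.0` it finds only exactly constant runs, which suits clipping at the sound card; clipping at the microphone is rarely perfectly flat, so a small tolerance is likely to be needed there.

```python
def find_flat_tops(samples, min_run=4, min_level=0.1, tol=0.0):
    """Return (start, length, value) for each run of at least min_run
    consecutive samples within tol of the run's first sample, where
    that first sample has magnitude of at least min_level."""
    runs = []
    i, n = 0, len(samples)
    while i < n:
        j = i + 1
        while j < n and abs(samples[j] - samples[i]) <= tol:
            j += 1
        if j - i >= min_run and abs(samples[i]) >= min_level:
            runs.append((i, j - i, samples[i]))
        i = j
    return runs

# A peak that has been flattened at 0.8, well below full scale.
wave = [0.0, 0.3, 0.6, 0.8, 0.8, 0.8, 0.8, 0.5, 0.0, -0.4, -0.4, 0.0]
flat_tops = find_flat_tops(wave)
```

Note that the check does not compare against full scale: as the fourth image shows, a clipped wave can be recorded at a lower amplitude and still be clipped.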
At left is a composite image of two spectra, for
the same word recorded by two different microphones. Areas of concern
are marked with an asterisk. One of the microphones (purple spectrum) shows
excessive response in the second overtone (third harmonic), which is
one characteristic of saturation. Also, that microphone shows excessive
response in the high frequency range - probably due to artificial
enhancement - which makes the sound bright and harsh. This microphone
was intended for computer speech recognition.
The other microphone was
a pre-amplified dynamic type, normally used for audio recording. It had
a more satisfactory sound quality (green spectrum).