Network Audio Synthesis Technologies
The future of Loop Based Music
by Anthony Matarazzo

Many facets of music production are assisted by computer technology. In fact, many studios would be out of business if they had to rely on older technology alone. Recently, music composition, that is, the actual rendering of the audio for musical notes, has become available on the personal computer. Programs like Acid, Fruity Loops, Adobe Audition, Reason, and Cakewalk all facilitate this new technology of audio rendering on a client machine. However, limitations still exist when using these programs: reduced polyphony and reduced sound quality.

Ever heard the Korg Karma play an electric rendering or a TRITON EXTREME render a symphony? The depth and quality are in many ways far superior to real-time synthesis on modern computing platforms. A keyboard is a highly specialized embedded device, and many would go further in that description. Specialized keyboard synthesis technology is just that advanced. The newer forms of software rendering that use VST, a plug-in format for software synthesizers and effects, seem to deliver only so much quality. But they are extremely flexible in implementation, so perhaps they would perform well as a pooled resource on a server. One of the main purposes of the INET vision is to offload computing power and resources to external devices using network technology. The highest quality sound settings cost expensive clock cycles on your CPU. If you know anything about musicians, they are either rich or extremely poor. Either way, both listen for high quality sounds. I propose that by offloading the rendering of the audio signal to a remote network resource, higher quality and better scalability can be achieved while controlling costs, for the manufacturer and the consumer at the same time.

Modifying modern audio synthesis components to be a networked resource will not be an easy job. Playing audio is an extremely time-sensitive task; even a small timing error can change what the listener hears. Most modern keyboards respond to and send MIDI data. MIDI stands for Musical Instrument Digital Interface. MIDI is a compact binary format that describes several characteristics of a played note and also contains a plethora of commands used to modify musical equipment parameters. By sending this information through the internet, using the User Datagram Protocol (UDP) for example, the receiving computer could route MIDI-enhanced (internet) commands to a rack of sound bank synthesizers, which in turn will output a raw sound wave. This raw waveform must be routed through the host’s internal high speed network for further processing.
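As a rough illustration of the idea, the sketch below builds a standard 3-byte MIDI Note On message and wraps it in a small datagram suitable for UDP. The 12-byte header (sequence number plus millisecond timestamp) and the function names are assumptions made for this example; they are not part of MIDI or of any INET specification.

```python
import socket
import struct

def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a MIDI Note On message: status byte 0x90 | channel, then note and velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])

def make_packet(sequence: int, timestamp_ms: int, midi: bytes) -> bytes:
    """Hypothetical wrapper: 4-byte sequence number and 8-byte timestamp, big-endian."""
    return struct.pack("!IQ", sequence, timestamp_ms) + midi

def parse_packet(packet: bytes):
    """Inverse of make_packet, as the receiving server might apply it."""
    sequence, timestamp_ms = struct.unpack("!IQ", packet[:12])
    return sequence, timestamp_ms, packet[12:]

def send_packet(host: str, port: int, packet: bytes) -> None:
    """Send one datagram over UDP; delivery and ordering are not guaranteed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

Because UDP may drop or reorder datagrams, the sequence number and timestamp give the receiver a way to detect gaps and to schedule each note at the intended time.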

The sound in raw form has too much data in it to be transferred over the web in real time on limited consumer bandwidth. Just one second of CD-quality audio in raw PCM form takes two channels * 16 bits * 44100 samples, which equals 1,411,200 bits, or 176,400 bytes (about 172 KB); at 24 bits per sample it is 264,600 bytes. A 1.44 MB floppy disk would hold only about eight seconds of such audio!
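The raw PCM data rate can be checked with a short calculation. The function name and defaults below are chosen for this example only.

```python
def pcm_bytes_per_second(sample_rate: int = 44100,
                         bits_per_sample: int = 16,
                         channels: int = 2) -> int:
    """Bytes needed for one second of uncompressed PCM audio."""
    return sample_rate * bits_per_sample * channels // 8

print(pcm_bytes_per_second())                    # 176400
print(pcm_bytes_per_second(bits_per_sample=24))  # 264600
```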

Compression must be used for near real-time playback over the internet, which is currently an electrically based networking solution. By routing the sound signal to a network audio compression unit, the information can be reduced to fit the available real-time internet bandwidth. This is not to say that near-CD quality cannot be achieved. Modern streaming technologies are a testament to that: high quality is possible for live broadcast.

Effects processing may also be part of the instrument setup or a channel layer, depending on the composition requirements. It will happen before signal compression, which gives the highest logical quality. The network flow control of the audio data on the server side should reflect this. A network audio effects processing unit will perform its operations on the raw PCM (pulse code modulation) waveform. The potential exists for a limitless network-based audio and video studio, delivered to an INET VLISM compatible device. Because the device's main purpose is music production, it may have sounds preloaded and include a small music keyboard as an input device.
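To make "operating on the raw PCM waveform" concrete, here is a minimal sketch of one possible effect, a single-tap echo applied to signed 16-bit samples before any compression. The effect choice, the function name, and the parameters are assumptions for illustration; a real effects unit would be considerably more elaborate.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def apply_echo(samples, delay_samples, decay=0.5):
    """Add a delayed, attenuated copy of the input to itself (single-tap echo).

    samples: list of signed 16-bit PCM values for one channel.
    delay_samples: echo delay expressed in samples.
    decay: gain applied to the delayed copy.
    """
    out = list(samples)
    for i in range(delay_samples, len(samples)):
        mixed = samples[i] + int(decay * samples[i - delay_samples])
        # Clamp to the 16-bit range so the sum does not wrap around.
        out[i] = max(INT16_MIN, min(INT16_MAX, mixed))
    return out
```

For example, `apply_echo([1000, 0, 0, 0], delay_samples=2)` returns `[1000, 0, 500, 0]`.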

Sound in its natural form is an analog resource, so the server could also convert the digital audio with a high speed DAC (digital to analog converter) for analog audio operations such as tube amps. This may also be needed for some analog synthesis technologies, and as a hybrid step for integrating current methods into the process. The signal will then need to be sampled from this analog resource back to digital PCM data. It is important to note that signal interference may occur along this analog path, introducing noise artifacts.

Finally, the stream will be reduced for the requested transport medium. An audio compression network unit must be designed for the task. Once the audio stream has been reduced for internet transfer, it can be routed back to the client computer. The client computer should store each sample along with its associated time stamp and mix the audio in real time during local playback. The end result will be high quality sounds at a low CPU cost. Most central processing units above the 1 GHz range can mix several audio channels together without any problem.
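The client-side mixing step might look something like the following sketch, which assumes the received audio has already been decompressed to signed 16-bit PCM and that each clip carries a millisecond time stamp. The data layout and names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def ms_to_sample(timestamp_ms: int, sample_rate: int = 44100) -> int:
    """Convert a millisecond time stamp to a sample index."""
    return timestamp_ms * sample_rate // 1000

def mix_clips(clips, total_samples):
    """Sum time-stamped clips into one output buffer.

    clips: iterable of (start_sample, samples) pairs for a single channel.
    total_samples: length of the output buffer.
    """
    buffer = [0] * total_samples
    for start, samples in clips:
        for offset, value in enumerate(samples):
            index = start + offset
            if 0 <= index < total_samples:
                buffer[index] += value
    # Clamp after summing so overlapping loud clips saturate instead of wrapping.
    return [max(INT16_MIN, min(INT16_MAX, v)) for v in buffer]
```

Mixing is only addition and clamping per sample, which is why it is cheap compared with synthesis.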

All of these audio operations will be controlled using preset values, which can be overridden by values contained within the connection stream. Applying pooling technologies with a rack bank architecture will be the mid-grade goal. This new form of connectivity should not prevent plug and play of synthesis technologies at the local domain, for example a network plug device attached to an INET VLISM smart computing terminal. In addition, building a specialized hardware server computing platform (mainframe) that manages user MIDI state banks, provides audio synthesis, and manages storage for the samples will enable high quality creativity at the consumer level using the INET VLISM smart computing terminal.
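One simple way to picture presets being overridden by the connection stream is a dictionary merge, sketched below. The parameter names and default values are invented for this example and do not come from any actual device.

```python
# Hypothetical server-side preset; keys and values are illustrative only.
DEFAULT_PRESET = {"bank": 0, "program": 0, "reverb": 0.2, "volume": 0.8}

def effective_settings(preset: dict, stream_overrides: dict) -> dict:
    """Return the preset with any values from the connection stream applied on top."""
    unknown = set(stream_overrides) - set(preset)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Later entries win, so stream values replace preset values.
    return {**preset, **stream_overrides}
```

Here `effective_settings(DEFAULT_PRESET, {"reverb": 0.5})` keeps the preset bank, program, and volume while changing only the reverb amount.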

One interesting side effect of this vision is that audio synthesis banks become reusable as a network resource. This adds a new level of resource allotment and procurement, and could possibly change how music equipment is accounted for. Can I rent the Mars network for two hours while I render my tracks? It also presents the opportunity for synthesis resources to be located globally. So, for example, I could play a Russian, a Japanese, and an American synthesizer all in the same song. As far as I know, nothing like this project has ever been completed.

In conclusion, I believe the future of loop-based music production is a network-based solution, one that is versatile enough, and built on open source architectures, to allow worldwide distribution. This permits reuse of many resources, which offsets costs for the user and the manufacturer. Imagine the Krell network lighting up all the way in a 1024-note polyphonic compilation on a device that costs sixty-five bucks and is disposable. Network rental and software distribution: public domain computing.
