H. Alvestrand
UNINETT
May 1999
The Audio/L16 MIME content type
Status of this Memo
-
This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
Copyright Notice
-
Copyright © The Internet Society (1999). All Rights Reserved.
1. Introduction
-
This document defines the audio/L16 MIME type, a reasonable quality audio format for use in Internet applications.
Possible application areas include E-mail, Web served content, file upload in Web forms, and more.
2. The need for the Audio/L16 MIME type
-
The set of IETF standard MIME types for audio is small; it consists of only the audio/basic and audio/32kadpcm types, which have a sampling rate of 8000 samples/second.
Rates below 11025 may obscure consonant information, even for single-voice speech. Common compressions, such as LPC, are known to be microphone-dependant and lossy. Thus far all IETF MIME Audio types either default to 8000 samples per second or use LPC.
In order for advanced speech recognition and related educational applications to make use of internet transports (such as RFC 1867 file uploading) which use MIME typing, higher standards are required.
This type repairs that lack by registering a very simple MIME type that allows higher rate, linear-encoded audio with multiple channels.
This is an IESG approved MIME type, and its definition is therefore published as an RFC.
Please note that there are many other Audio types described in RFC 1890 [1] which IANA may wish to formally register; this one, of all of them, seems to be most immediately needed. This document may also serve as a template for further registrations of these audio types.
3. The definition of Audio/L16
-
Audio/L16 is based on the well know audio format "L16" described in RFC 1890 section 4.4.8 for use with RTP transport. L16 denotes uncompressed audio data, using 16-bit signed representation in twos- complement notation and network byte order. (From section 4.4.8 of RFC 1890)
It may be parametrized by varying the sample rate and the number of channels; the parameters are given on the MIME type header.
In order to promote interoperability, only a few rate values are standardized here. Other values may NOT be used except by bilateral agreement.
If multiple audio channels are used, channels are numbered left-to- right, starting at one. Samples are put into the data stream from each channel in succession; information from lower-numbered channels precedes that from higher-numbered channels.
For more than two channels, the convention followed by the AIFF-C audio interchange format should be followed [1], using the following notation:
l left r right c center S surround F front R rear channels description channel 1 2 3 4 5 6 ___________________________________________________________ 2 stereo l r 3 l r c 4 quadrophonic Fl Fr Rl Rr 4 l c r S 5 Fl Fr Fc Sl Sr 6 l lc c r rc S
(From RFC 1890 section 4.1)
4. IANA registration form for Audio/L16
-
MIME media type name : Audio
MIME subtype name : L16Required parameters
rate: number of samples per second -- Permissible values for
rate are 8000, 11025, 16000, 22050, 24000, 32000, 44100, and
48000 samples per second.
-
Optional parameters
channels: how many audio streams are interleaved -- defaults
to 1; stereo would be 2, etc. Interleaving takes place
between individual two-byte samples.
-
Encoding considerations
-
Audio data is binary data, and must be encoded for non-binary transport; the Base64 encoding is suitable for Email. Note that audio data does not compress easily using lossless compression.
Security considerations
Audio data is believed to offer no security risks.
-
Interoperability considerations
-
This type is compatible with the encoding used in the WAV (Microsoft Windows RIFF) and Apple AIFF union types, and with the public domain "sox" and "rateconv" programs.
Published specification
RFC 2586
Applications which use this media
-
The public domain "sox" and "rateconv" programs accept this type.
1. Magic number(s) : None
2. File extension(s) : WAV L16
3. Macintosh file type code : AIFF
Person to contact for further information
1. Name : James Salsman 2. E-mail : jps-L16@bovik.org
Intended usage
Common
-
It is expected that many audio and speech applications will use this type. Already the most popular platforms provide this type with the rate=11025 parameter referred to as "radio quality speech."
Author/Change controller
James Salsman
5. Security considerations
-
The audio data is believed to offer no security risks.
Note that RFC 1890 permits an application to choose to play a single channel from a multichannel tranmission; an attacker who knows that two different users will pick different channels could concievably construct some confusing messages; this, however, is ridiculous.
This type is perfect for hiding data using steganography.
6. References
-
[1] Schulzrinne, H., "RTP Profile for Audio and Video Conferences with Minimal Control", RFC 1890, January 1996.
7. Authors' Addresses
-
James Salsman 575 S. Rengstorff Avenue Mountain View, CA 94040-1982 US
EMail:
James@bovik.org
Harald Tveit Alvestrand
UNINETT
N-7034 TRONDHEIM
NORWAYPhone: +47 73 59 70 94 EMail: Harald.T.Alvestrand@uninett.no
8. Full Copyright Statement
-
Copyright © The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
-
Funding for the RFC Editor function is currently provided by the Internet Society.