The challenge is scheduling an AAC buffer on an AVAudioPlayerNode, which only consumes uncompressed linear PCM buffers. The function playAudio takes a Data object containing a chunk of AAC being streamed to the client, normally a few kilobytes in size. The process goes like this:
- Write the AAC chunk into an AVAudioCompressedBuffer.
- Create a PCM buffer and convert the AVAudioCompressedBuffer into it.
- Schedule the PCM buffer on the player node.
This is my attempt. However, it does not work: no audio plays through the speakers at all. I added a bunch of debugging statements across the functions to figure out what is going on.
func preparePlayer() {
    guard let engine = audioEngine else {
        print("Audio engine is not initialized")
        return
    }

    // Initialize a player node and attach it to the engine
    playerNode = AVAudioPlayerNode()
    engine.attach(playerNode!)
    print("Output Node Format: \(engine.outputNode.outputFormat(forBus: 0))")

    pcmFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 48000, channels: 1, interleaved: false)
    engine.connect(playerNode!, to: engine.outputNode, format: pcmFormat)

    // Describe the incoming AAC stream
    var asbd = AudioStreamBasicDescription()
    asbd.mSampleRate = 24000              // 24 kHz
    asbd.mFormatID = kAudioFormatMPEG4AAC // AAC
    asbd.mChannelsPerFrame = 1            // Mono
    asbd.mBytesPerPacket = 0              // Varies (compressed format)
    asbd.mFramesPerPacket = 1024
    asbd.mBytesPerFrame = 0               // Varies (compressed format)
    asbd.mBitsPerChannel = 0              // Varies (compressed format)

    // Create an AVAudioFormat with the AudioStreamBasicDescription
    sourceFormat = AVAudioFormat(streamDescription: &asbd)

    // Initialize the audio converter with the source (AAC) and destination (PCM) formats
    audioConverter = AVAudioConverter(from: sourceFormat!, to: pcmFormat!)
}
func playAudio(audioData: Data) {
    guard let playerNode = self.playerNode else {
        print("Player node is not initialized")
        return
    }
    guard let engine = audioEngine else {
        print("Audio engine is not initialized")
        return
    }

    let compressedBuffer = AVAudioCompressedBuffer(format: sourceFormat!, packetCapacity: 1024, maximumPacketSize: audioConverter!.maximumOutputPacketSize)
    compressedBuffer.byteLength = UInt32(audioData.count)

    print("audioData contains \(audioData.count) counts")
    let middleIndex = audioData.count / 2
    let middleRangeAudioData = middleIndex..<(middleIndex + 10)
    print("Middle bytes of audioData: \(Array(audioData[middleRangeAudioData]))")

    // Copy the raw AAC bytes into the compressed buffer
    audioData.withUnsafeBytes {
        compressedBuffer.data.copyMemory(from: $0.baseAddress!, byteCount: audioData.count)
    }

    print("compressedBuffer contains \(compressedBuffer.packetCount) packet counts")
    print("compressedBuffer contains \(compressedBuffer.byteCapacity) byte capacity")
    print("compressedBuffer contains \(compressedBuffer.byteLength) valid bytes")
    print("compressedBuffer contains \(compressedBuffer.packetCapacity) packet capacity")

    let bufferPointer = compressedBuffer.data.bindMemory(to: UInt8.self, capacity: audioData.count)
    let bufferBytes = Array(UnsafeBufferPointer(start: bufferPointer, count: audioData.count))
    let middleRangeCompressedBuffer = middleIndex..<(middleIndex + 10)
    print("Middle bytes of compressedBuffer: \(Array(bufferBytes[middleRangeCompressedBuffer]))")

    // Create a PCM buffer to receive the decoded audio
    let pcmBuffer = AVAudioPCMBuffer(pcmFormat: pcmFormat!, frameCapacity: 1024)

    let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
        outStatus.pointee = AVAudioConverterInputStatus.haveData
        return compressedBuffer
    }

    var error: NSError?
    let conversionResult = audioConverter!.convert(to: pcmBuffer!, error: &error, withInputFrom: inputBlock)
    if conversionResult == .error {
        print("Conversion failed with error: \(String(describing: error))")
    } else {
        print("Conversion successful")
        print("buffer contains \(pcmBuffer?.frameLength ?? 123456) frames")
    }

    // Inspect the first few decoded samples per channel
    if let frameLength = pcmBuffer?.frameLength, frameLength > 0 {
        let channelCount = pcmFormat?.channelCount ?? 0
        for channel in 0..<channelCount {
            if let channelData = pcmBuffer?.floatChannelData?[Int(channel)] {
                let channelDataPointer = UnsafeBufferPointer(start: channelData, count: Int(frameLength))
                let firstFewSamples = Array(channelDataPointer.prefix(10))
                print("First few samples of pcmBuffer in channel \(channel): \(firstFewSamples)")
            }
        }
    }

    if !engine.isRunning {
        do {
            try engine.start()
        } catch {
            print("Error starting audio engine: \(error)")
        }
    }

    playerNode.scheduleBuffer(pcmBuffer!, completionHandler: nil)
}
This is an example debugging log:
audioData contains 1369 counts
Middle bytes of audioData: [250, 206, 86, 76, 254, 10, 221, 187, 190, 243]
compressedBuffer contains 0 packet counts
compressedBuffer contains 4096 byte capacity
compressedBuffer contains 1369 valid bytes
compressedBuffer contains 1024 packet capacity
Middle bytes of compressedBuffer: [250, 206, 86, 76, 254, 10, 221, 187, 190, 243]
Conversion successful
buffer contains 0 frames
The converter does not report any errors, but the PCM buffer contains 0 frames. Working backwards, this is probably because the compressedBuffer also reports a packetCount of 0. The audio bytes do get written into the compressedBuffer, but that is where things stop making sense. I have almost always worked with PCM audio before, and since it is linear I can compute how many frames a buffer holds from the audio settings, no big deal. With compressed audio that is not possible, because the bitrate is generally variable. Maybe the issue is related to the packet count being 0, or maybe it is something else.