After importing a multitrack MP4 file - or even single audio files that you have extracted from some footage - you might have noticed something like this: Some audio tracks seem to be delayed by several milliseconds and appear to be out of sync with each other.
At first sight this looks completely wrong, as if the encoder were faulty. Well, it actually fully complies with the AAC specification and is completely valid.
If you take a closer look, you will notice that tracks encoded with AAC are delayed by a multiple of 1024 samples. Usually AAC encoders produce an offset of 1024 samples (21.33 ms at 48 kHz), and the FDK-AAC encoder adds an offset of 1024 samples per channel (for example 42.67 ms for stereo at 48 kHz).
Each audio sample/packet can be identified by a timestamp. In a 48 kHz audio file you have 48000 samples per second (so one sample is 20.83 µs long).
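To make the numbers above easier to check, here is a small Python sketch that converts a delay in samples to milliseconds. The function name `delay_ms` and the constant `AAC_FRAME` are just illustrative names chosen here, not part of Voukoder or FFmpeg.

```python
def delay_ms(samples: int, sample_rate: int = 48000) -> float:
    """Convert a delay given in samples to milliseconds."""
    return samples / sample_rate * 1000.0


AAC_FRAME = 1024  # delay unit in samples, as described in the text

one_frame = delay_ms(AAC_FRAME)       # about 21.33 ms at 48 kHz
two_frames = delay_ms(2 * AAC_FRAME)  # about 42.67 ms at 48 kHz
sample_us = 1_000_000 / 48000         # about 20.83 µs per sample

print(f"{one_frame:.2f} ms, {two_frames:.2f} ms, {sample_us:.2f} µs")
```

At a different sample rate the same offsets give different times, for example `delay_ms(1024, 44100)` is roughly 23.22 ms.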
When you play the file in a software or hardware player, the player syncs all streams based on their timestamps and everything should be perfectly in sync. The offsets are ignored.
Lots of people have asked me to add an option to Voukoder that corrects the audio track shift by a certain number of milliseconds (usually 43 ms), so that it looks more like this:
Well, it is possible to do this, and it can be done using FFmpeg's "atrim" audio filter (which will be added to Voukoder soon), but the best solution is not to use AAC for intermediate clips. Consider using raw PCM instead. Only use a lossy codec (like AAC) for the final file - if necessary.
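As a rough idea of what such a correction could look like outside of Voukoder, the Python sketch below builds an FFmpeg command line that drops a given number of leading samples with "atrim" and writes the audio as raw PCM. This is only an assumption about one possible approach, not how Voukoder will implement it. The helper names `atrim_filter` and `ffmpeg_args` and the file names are made up for this example, and whether trimming is needed at all depends on how your editor handles the timestamps.

```python
def atrim_filter(delay_samples: int) -> str:
    """Build an FFmpeg audio filter string that drops the first samples."""
    if delay_samples < 0:
        raise ValueError("delay_samples must not be negative")
    # atrim cuts the leading samples, asetpts resets the timestamps to zero
    return f"atrim=start_sample={delay_samples},asetpts=PTS-STARTPTS"


def ffmpeg_args(src: str, dst: str, delay_samples: int) -> list[str]:
    """Assemble an FFmpeg command (hypothetical file names, not executed here)."""
    return [
        "ffmpeg", "-i", src,
        "-af", atrim_filter(delay_samples),
        "-c:v", "copy",        # leave the video stream untouched
        "-c:a", "pcm_s16le",   # raw PCM instead of AAC for intermediates
        dst,
    ]


# Example: remove 2048 samples (about 43 ms at 48 kHz)
print(" ".join(ffmpeg_args("clip.mp4", "clip_fixed.mov", 2048)))
```

The resulting list can be passed to `subprocess.run` if FFmpeg is installed. A MOV container is used for the output here because PCM audio in MP4 is not widely supported.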