Faxing over VoIP is a challenge. There are a number of reasons for this, and no universal cure. VoIP networks are designed to do a good job with speech. Carrying any sound other than a single speaking voice is generally not a system requirement.

The most common problem with sending a FAX over VoIP networks is also the easiest to deal with. A low bit rate voice codec is unable to carry a fast modem signal without severe distortion. Would you really expect an 8kbps G.729 codec to convey a 9.6kbps FAX modem signal correctly? The only common codecs capable of adequately preserving FAX modem signals up to 14,400bps (V.17) are G.711 u-law and A-law (aka g711u/a).

Lower bit rate codecs have zero chance of working for any standard FAX image modem. Many will convey the 300bps (V.21) FAX control messages OK. They will not convey the fast modem signals, used for the actual image data.
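The codec argument above can be put as a rough sketch in Python. The codec table, the FAX_SAFE set and both function names are illustrative assumptions, not part of any real gateway API. Note that raw bit rate is only a first check: the text above says that, of the common codecs, only G.711 u-law and A-law preserve fast FAX modem signals, so the "safe" set is listed explicitly rather than derived from the bit rate.

```python
# Nominal bit rates, in bits per second, of a few common VoIP codecs.
CODEC_BPS = {
    "G.711 u-law": 64000,
    "G.711 A-law": 64000,
    "G.726-32": 32000,
    "G.729": 8000,
}

# Assumption taken from the text: only G.711 carries fast FAX modems adequately.
FAX_SAFE = {"G.711 u-law", "G.711 A-law"}


def codec_to_modem_ratio(codec: str, modem_bps: int) -> float:
    """Return codec bit rate divided by the FAX modem bit rate."""
    return CODEC_BPS[codec] / modem_bps


def fax_passthrough_ok(codec: str) -> bool:
    """True only for codecs the text names as adequate for FAX passthrough."""
    return codec in FAX_SAFE


if __name__ == "__main__":
    for name in CODEC_BPS:
        ratio = codec_to_modem_ratio(name, 14400)  # V.17 top speed
        print(f"{name}: ratio {ratio:.2f}, passthrough ok: {fax_passthrough_ok(name)}")
```

For G.729 against a 9,600bps modem the ratio is below 1, which is the point of the rhetorical question above: the codec has fewer bits available than the modem signal is trying to carry.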

Faxing through the PSTN

In the PSTN world, the network provides a constant delay for any particular call. The speed at which data enters the network is always the same as the speed at which it leaves. The end-to-end delay does not jitter, or make step changes, in anything but exceptional circumstances (e.g. on automatic fail-over, if a fiber link fails). Modems require this.

In an IP network, jitter is a fact of life. It can be kept to a modest level through the use of the QoS (quality of service) features available in a lot of IP equipment, but only if you control the network end-to-end. If the call passes across the open Internet, there is no QoS control. It is hard to see a business model that would ever encourage QoS to be introduced across the open Internet. So, in the long term the timing of a voice signal entering a VoIP network is the same as its timing when it leaves, but in the short term the two can be very different.
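To make "jitter" concrete, here is a minimal sketch of the running interarrival jitter estimate used for RTP streams (the smoothing by 1/16 follows RFC 3550). The function name and the use of millisecond timestamps in plain lists are assumptions made for illustration only.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550.

    Both arguments are lists of timestamps in milliseconds, one entry
    per packet, in the order the packets were sent.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Difference in transit time between this packet and the previous one.
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        # Exponentially smoothed mean of |d|, gain 1/16.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    sent = [0, 20, 40, 60]
    constant_delay = [50, 70, 90, 110]  # PSTN-like: same delay for every packet
    variable_delay = [50, 75, 90, 110]  # IP-like: one packet arrives 5 ms late
    print(interarrival_jitter(sent, constant_delay))
    print(interarrival_jitter(sent, variable_delay))
```

With a constant delay the estimate stays at zero, which is the PSTN behaviour modems rely on. Any variation in delay gives a non-zero estimate.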

Depending on its implementation in particular equipment, silence suppression can destroy a fax call. If silence suppression is enabled, a voice detector continuously monitors the call, looking for the presence of a real voice. Some of these are designed to focus purely on voice, and tend to reject other kinds of sound - e.g. modem tones. They may not switch the audio path on and off cleanly when the modem signal starts and stops. Even if they do switch cleanly, the suppression algorithms usually modify the audio around the switching points.

During silent periods, comfort noise is usually introduced, to simulate the background noise you normally hear in a conversation. This might mean that a period which should be silent is actually significantly noisy. The receiving modem might not see a good enough "silence" for its signal detector to correctly declare the boundaries of the modem signal.

Modems need a continuous audio path. If there is packet loss the consequences are severe, but the actual effect depends a lot on the equipment in use. Let's say a 20ms packet of audio is lost in the middle of a page of fax. Obviously this is going to lose a bit of the image, but will it affect just a small stripe, or the rest of the page? If the receiving end emits 20ms of silence, the receiving modem will probably declare the end of the page. If the receiving end emits 20ms of fill-in sound, the receiving modem might be able to ride over the gap, depending on its design. If more or less than 20ms of fill-in sound is emitted, the remainder of the page will definitely not be received correctly. The receiving modem will not tolerate a jump in timing of that sort.
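The timing jump described above can be worked through with a little arithmetic. This sketch assumes the usual 8000 samples per second of narrowband telephony audio and a 2400 baud symbol rate (the rate used by V.17 and V.29). The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # narrowband telephony audio (assumed)
SYMBOL_RATE_BAUD = 2400  # V.17 / V.29 symbol rate (assumed for illustration)


def samples_in(ms: float) -> int:
    """Number of audio samples in the given number of milliseconds."""
    return round(SAMPLE_RATE_HZ * ms / 1000)


def timing_slip_samples(lost_ms: float, filled_ms: float) -> int:
    """Samples by which the stream shifts when a gap of lost_ms
    is replaced with filled_ms of fill-in sound."""
    return samples_in(filled_ms) - samples_in(lost_ms)


def slip_in_symbols(slip_samples: int) -> float:
    """Express a slip in samples as modem symbol periods."""
    return slip_samples * SYMBOL_RATE_BAUD / SAMPLE_RATE_HZ


if __name__ == "__main__":
    for filled in (20, 30, 10):
        slip = timing_slip_samples(20, filled)
        print(f"lost 20 ms, filled {filled} ms: slip {slip} samples, "
              f"{slip_in_symbols(slip)} symbols")
```

Under these assumptions a 20ms packet is 160 samples. Filling the gap with exactly 20ms leaves the timing intact, while filling it with 30ms shifts everything after the gap by 80 samples, or 24 symbol periods, which is the kind of jump the receiving modem will not tolerate.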

Fax and other modem applications operating over VoIP channels are quirky and unreliable. This will not get better over time. It will get worse. In general, the more sophisticated the equipment gets in trying to make speech work smoothly, the worse it behaves for modems. In the short term (i.e. until all data applications are native IP applications), store-and-forward protocols, and protocols tailored to conveying modem data across an IP channel, are the only way to achieve consistent results.

T.38 (FoIP)

T.38 is the real-time FAX over IP protocol (FoIP). This means it is designed to work like traditional FAXing. You call another FAX machine, and send the FAX as you wait. Either FAX machine could be a traditional FAX machine connected to the PSTN, an ATA box, or similar; it could be a FAX machine with an RJ-45 connector plugged straight into an IP network; it could be a computer pretending to be a FAX machine. Currently, Teliax only supports T.38 on its 2.0 (Dashboard) platform. We do so by enabling Inbound FAX-to-email. While we have found that some devices, like the T.38-enabled PAT2T, are successful at sending FAXes from a Teliax DID, we do not configure or troubleshoot them, or any other stand-alone FAX devices.

Losing a packet in a T.38 stream does not cause the modems to lose sync. This means two successive lost packets should only corrupt a section of an image. If the optional FAX error correction mode (ECM) is used, there is a good chance that with a retry or two, a perfect image will be transferred. Not ideal, but functional.

Much of the robustness of T.38 comes not from what the specification says, but from the potential it offers for smart implementation. The trick is to work out the smartest implementation that will not cause trouble with the many buggy implementations of T.30 which exist in commercial FAX products.

A T.38 gateway can start sending a page as soon as it gets some data, without performing any jitter buffering. When there is little jitter, transmission delay is minimized. When jitter is bad, things will be delayed only as much as necessary. If packets are lost, and FEC is in use, the outgoing gateway can simply wait a while, to try to reconstruct the stream from the redundant information available when further packets arrive. If the required data is irretrievably lost, due to a burst of lost packets, transmission can continue with only the minimum possible page corruption.
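The idea of rebuilding the stream from redundant information can be sketched as follows. This is a simplified illustration of the redundancy style of error recovery, where each packet carries copies of the messages sent just before it, rather than parity-based FEC. The packet layout (a tuple of sequence number, primary message and list of earlier messages) and the function names are assumptions for the example, not the T.38 wire format.

```python
def build_packets(messages, redundancy=2):
    """Wrap each message with copies of up to `redundancy` earlier messages."""
    packets = []
    for seq, primary in enumerate(messages):
        start = max(0, seq - redundancy)
        # Earlier messages, newest first: secondaries[0] belongs to seq - 1.
        secondaries = list(reversed(messages[start:seq]))
        packets.append((seq, primary, secondaries))
    return packets


def reassemble(received, total):
    """Rebuild the message list from whichever packets arrived.

    Positions that cannot be recovered are returned as None.
    """
    recovered = {}
    for seq, primary, secondaries in received:
        recovered[seq] = primary
        for offset, msg in enumerate(secondaries, start=1):
            # Fill a gap only if that message has not been seen yet.
            recovered.setdefault(seq - offset, msg)
    return [recovered.get(i) for i in range(total)]


if __name__ == "__main__":
    msgs = [f"m{i}" for i in range(6)]
    sent = build_packets(msgs, redundancy=2)
    # Drop packets 2 and 3 in transit.
    arrived = [p for p in sent if p[0] not in (2, 3)]
    print(reassemble(arrived, len(msgs)))
```

In this sketch, a burst of two lost packets is fully repaired once the next packet arrives, because that packet carries copies of both missing messages. A longer burst than the redundancy depth leaves a hole, which corresponds to the "irretrievably lost" case above, where transmission continues with some page corruption.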

The weaknesses of T.38

T.38 cannot avoid the basic problem that it needs to deal with old FAX machines made before the idea of FoIP was ever considered. These machines expect certain timing constraints to be met. For these machines T.38 eliminates some problems, and reduces the scale of others. However, it is nothing like an FTP or HTTP transfer of an image in its ability to deal with poor network performance.