T1: A Survival Guide
By Matthew Gast
0-596-00127-4, Order Number: 1274
304 pages, $29.95
Timing, Clocking, and Synchronization in the T-carrier System
Time is the extension of motion.
Faster networks depend on accurate timing. As the number of bits per second increases, the time in which to look for any particular bit decreases. Getting both sides to agree on timing becomes more difficult at higher speeds. Synchronous networking is largely about distribution of accurate timing relationships.
A Timing Taxonomy
Synchronous communications do not depend on start and stop flags to mark the beginning and end of meaningful data. Instead, the network constantly transmits data and uses a separate clock signal to determine when to examine the incoming stream to extract a bit. Distributing clock information to network nodes is one of the major challenges for synchronous network designers. Three major types of timing are used on networks: asynchronous, synchronous, and plesiochronous. All three terms derive from the Greek word chronos, meaning time. The three differ in how they distribute timing information through the network.
Asynchronous systems do not share or exchange timing information. Each network element is timed from its own free-running clock. Analog modems are asynchronous because timing is derived from start and stop bits in the data stream. Free-running clocks are adequate for dial-up communications because the time slots are much longer than on higher-speed digital networks.
Synchronous systems distribute timing information from an extremely accurate primary system clock. Each network element inherits its timing from the primary clock and can trace its lineage to the common shared clock. When AT&T operated the U.S. telephone network, the system derived its timing from the primary reference source (PRS), a cluster of cesium clocks located in Hillsboro, Missouri.
Synchronous networks may have several layers of accuracy, but the important feature is that each clock can trace timing to a single reference source. In the case of the Bell system, the primary source was labeled Stratum 1. Less-accurate devices were in higher-numbered strata. Tandem offices, also called toll offices, serviced the long-haul portions of the telephone network and were located in Stratum 2. Local switching offices were located in Stratum 3, with end-user devices such as CSU/DSUs in Stratum 4.
Figure 5-1 sketches the basic system, along with its end goal: distribution of accurate timing to peers at each stratum level. The master PRS at the top of the picture is the source of all timing goodness throughout the network. The network distributes timing information from the primary reference to the toll offices and from the toll offices to local switching offices. Customers attach to the local switching offices. Because timing information even at the lowly customer-equipment stratum is derived from the master clock, two pieces of customer equipment at the end of a T1 link between different local offices can operate within the strict timing tolerances required with 648-nanosecond bit times.
Figure 5-1. Timing in the Bell system
Maintaining a single network timing source is extremely expensive, and the timing distribution must be carefully engineered. Having only one timing source did not fit into the model of the post-divestiture telecommunications landscape in the U.S.
Plesiochronous networks are networks in which the elements are timed by separate clocks, which are very precise and operate within narrow tolerances. Within one telco's network, there may be multiple "primary" (Stratum 1) clocks; each telco maintains its own set of timing information. For this reason, the U.S. telephone network is really a plesiochronous network. For contrast with a synchronous network, Figure 5-2 illustrates a plesiochronous network.
Figure 5-2. A plesiochronous network
Figure 5-2 divides the network into its timing components on top and the data-transport facilities on the bottom. It also shows the facilities of two different carriers, one on the left and one on the right. A T1 transports data between the customer CSU/DSU on the left and the CSU/DSU on the right. Both telcos maintain their own primary reference sources to feed timing information to switching offices and to the devices making up the transport facilities. Even though the T1 is provided by two different carriers, precise timing tolerances allow them to cooperate in providing the T1 without needing a single shared (and trusted) source of timing information.
T-carrier systems are technically known as the plesiochronous digital hierarchy (PDH). In practice, though, the distinction between plesiochronous and synchronous is a hair-splitting one. "Synchronous" has acquired a connotation of describing any system that depends on extremely accurate timing, and the T-carrier system is occasionally referred to as the synchronous digital hierarchy (SDH). In many cases, the two are combined into one acronym: PDH/SDH.
T1 Circuit Timing
CSU/DSUs are like bridges. They have one interface in telco territory and one interface in data-communications territory. Both are serial interfaces that make use of tight timing tolerances. Appropriate configuration of the CSU/DSU to work within the timing straitjacket is essential.
Receive Clock Inference on the Network Interface
In the T1 world, clock signals are not transmitted separately from the data stream. Instead, receivers must extract the clock from the data signal based on the stream itself. Each bit time slot is 648 nanoseconds. Pulses are transmitted with a 50% duty cycle, meaning that for the middle half of the time slot, the voltage is at its peak. Based on these characteristics, the receiving CSU/DSU infers time slot boundaries from incoming pulses. Ideally, each pulse comes in the middle of a time slot, so finding time-slot boundaries is simply a matter of going 324 ns in each direction. Figure 5-3 illustrates clock inference from pulse reception.
Figure 5-3. T1 clock inference from pulse reception
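The arithmetic behind these figures can be checked with a short sketch. It assumes the nominal T1 line rate of 1.544 Mbps, which this section does not state explicitly, and the function names are purely illustrative.

```python
# Sketch: derive the T1 bit time and infer slot boundaries from a pulse center.
# Assumption: the nominal T1 line rate of 1.544 Mbps (not stated in the text above).
T1_LINE_RATE_BPS = 1_544_000


def bit_time_ns(rate_bps: int = T1_LINE_RATE_BPS) -> float:
    """Duration of one bit time slot in nanoseconds."""
    return 1e9 / rate_bps


def slot_boundaries_ns(pulse_center_ns: float, rate_bps: int = T1_LINE_RATE_BPS):
    """Return (start, end) of the slot: half a bit time either side of the pulse center."""
    half_slot = bit_time_ns(rate_bps) / 2
    return pulse_center_ns - half_slot, pulse_center_ns + half_slot


if __name__ == "__main__":
    print(f"bit time: {bit_time_ns():.2f} ns")
    start, end = slot_boundaries_ns(10_000.0)
    print(f"slot runs from {start:.2f} ns to {end:.2f} ns")
```

The computed bit time is slightly under the 648 ns quoted above, which is the rounded value; half of it rounds to the 324 ns used for locating the slot boundaries.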
In practice, of course, things are never quite as simple, and CSU/DSUs must compensate for a variety of non-ideal conditions. Clock signals may exhibit both short-term and long-term irregularities in their timing intervals. Short-term deviation is called jitter, and long-term deviation is referred to as wander.
TIP: Timing on the T1 network interface from the telco is implicit and based on the content of the pulse stream. On the other hand, the serial circuit that connects the CSU/DSU to the router makes use of explicit timing. V.35, for example, includes two pairs (four leads) for sending timing signals and one pair (two leads) for receiving timing.
Transmit Clocking at the Network Interface
At the interface to the telco network, clocking on the received data is based on inferring where the bit times fall. The CSU/DSU does not send an explicit clock for use with transmitted data, but uses internal circuitry to determine when to send a pulse. The internal clocking circuitry can typically operate in one of three modes, which go by different names for different vendors. Descriptively speaking, the typical modes of operation are to derive the transmit clock from the telco, to use an internally generated timing signal, or to take the transmit clock from the attached DTE. Of the three options, the first two are by far the most common in data-transmission applications.
Master/slave timing (also called network or loop timing)
In master/slave timing, the CSU/DSU takes its timing from the telco network. The telco network maintains an extremely accurate timing source and uses that to send pulses to customer locations. At the customer CSU/DSU, the receive clock is extracted from the incoming pulses. In master/slave timing, the extracted receive clock is used for the transmit clock on the network interface, as shown in Figure 5-4.
Figure 5-4. Master/slave (loop) timing
Master/slave timing ensures that the less-accurate clock in the customer premises equipment does not drift significantly, relative to the telco's accurate timing system. Several sources may drive the telco transmit clock. One common source is the building-integrated timing supply (BITS), which ensures that all the equipment in the CO is running from the same signal. BITS can be linked to an external clock, illustrated in Figure 5-4 as the PRS. At the customer side of the link, the CSU/DSU extracts the receive clock, rather than relying on an internal oscillator in the CSU/DSU. Master/slave timing is also called loop timing because the clock is extracted from signals on the digital loop, or network timing because the clock source is from the telco network interface.
Internal timing
Internal timing uses an internal oscillator in the CSU/DSU as the transmit clock source. No special measures are taken to ensure that the timing of transmitted pulses matches the timing of received pulses because the two operations are logically independent, as Figure 5-5 illustrates.
Figure 5-5. Internal timing
All circuits must have one timing source. In most cases, the telco will supply timing because the entire telco network must operate with unified timing to deliver the T1 circuit. In some cases, however, a simple copper wire pair can be leased from the telco. For spans with less than 30 dB attenuation, an unrepeatered copper pair can cost much less than a full-service line. In private-line applications, one end of the line must provide the clock, as Figure 5-6 shows. The remote end is set to loop timing, so the remote transmit clock is derived from the local transmit clock.
Figure 5-6. Clocking on private lines
Clocking at the Data Port
Data ports on CSU/DSUs are synchronous serial ports. CSU/DSUs transmit data as a varying voltage on the line, with a high voltage representing one and a zero voltage used for zero. A second signal, the clock signal, triggers a voltage measurement and extracts a bit from the voltage stream. Figure 5-7 illustrates the use of the explicit clock signal on a synchronous serial port.
Figure 5-7. Clocking on a synchronous serial port
In Figure 5-7, the external clock signal triggers a measurement when the clock signal goes from a low voltage to a high voltage. Aligning the clock signal with the voltage plateaus is important. Ideally, the clock signal should trigger a voltage measurement at the middle of the bit time. If the clock signal falls too close to a voltage transition, the reading will be unreliable.
Receive clock timing
The serial circuit from the CSU/DSU to the router is a synchronous serial line that uses explicit timing. Clock signals for the received data are extracted from the incoming pulse train at the carrier network interface. The extracted data is then transmitted out the data port along with the extracted clock signal. No configuration is needed on T1 equipment to set up the clock signal for received data. Figure 5-8 demonstrates receive clock timing.
Figure 5-8. Receive clock timing for the data port
Internal data port clocking
Forwarding the received data out the data port and on to the router is only half of what a CSU/DSU does. Transmitting data from the data port out to the telco network successfully requires that the data be correctly received from the data port and processed accurately. The simplest clocking method is to allow the CSU/DSU to control the clocking of transmitted data, too. V.35 interfaces allow the CSU/DSU to supply timing to the DTE. Data arrives at the CSU/DSU and is then extracted using the transmit clock supplied to the DTE, as Figure 5-9 illustrates.
Figure 5-9. Internal data port clocking
In most applications, internal data port clocking provides acceptable performance. If the clock signal drifts out of phase with the data transmission, however, the clock will trigger measurements too close to the transition between the high voltage and the low voltage. Measurements to extract bits still take place, but those measurements may not reflect the data that was supposed to be transmitted. Figure 5-10 illustrates the problem.
Figure 5-10. Out-of-phase clock signal
As in Figure 5-7, the clock signal triggers a measurement at the rising edge. The lower clock signal is delayed, or phase shifted, so measurements occur too late and extract incorrect data. One common reason for the delay is a long cable between the CSU/DSU and the router. The router synchronizes its transmissions with the transmit clock, but the data must travel from the router to the CSU/DSU. In the time it takes for the data to travel along the cable to the CSU/DSU, the clock signal has moved on and the measurements it takes are tardy. Problems may also occur at high transmission rates, because the bit times are shorter, or when the router has a significant processing latency.
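The effect of a late data signal can be sketched with a toy model. It is a simplification under stated assumptions: time is measured in units of one bit time, the CSU/DSU samples in the middle of each slot of its own transmit clock, and the whole delay (cable plus router latency) is lumped into a single hypothetical parameter, `delay_in_bit_times`.

```python
import math


def sample_stream(bits, delay_in_bit_times):
    """Return what the CSU/DSU reads when the router's data arrives late.

    Assumed model: the clock fires at k + 0.5 bit times for slot k, while
    bit k of the data occupies [k + delay, k + 1 + delay) at the CSU/DSU.
    None marks a sample taken before any data has arrived.
    """
    readings = []
    for k in range(len(bits)):
        sample_instant = k + 0.5
        index = math.floor(sample_instant - delay_in_bit_times)
        readings.append(bits[index] if 0 <= index < len(bits) else None)
    return readings


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    print(sample_stream(data, 0.25))  # small delay: samples still land inside the right bit
    print(sample_stream(data, 0.75))  # large delay: every sample reads the previous bit
```

In this model, a delay below half a bit time still reads the intended bits, although with less margin. Once the delay exceeds half a bit time, each measurement falls in the neighboring bit, which is the failure Figure 5-10 depicts.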
External data port clocking
One way to address the problem of a phase-shifted clock signal is to change the source of the transmit clock at the data port. Routers can be designed to accept the transmit clock from the CSU/DSU and use it to drive the external clock line. The external clock line is often labeled XCLK or SCTE (an abbreviation for serial clock transmit external). The router assumes the responsibility of synchronizing the transmitted data with the external clock signal. With external clocking, the clock and data must take the same path and are subject to the same delays, so the data and its clock signal stay in phase over the cable. This is illustrated in Figure 5-11.
Figure 5-11. External data port clocking
Many routers, however, do not support looping the received clock signal back to the CSU/DSU. When these routers are used with a CSU/DSU that expects a transmit clock, nothing will be transmitted because no clock signal is transmitted to the CSU/DSU.
WARNING: Using a router that does not supply a transmit clock with a CSU/DSU can be a particularly difficult problem to pinpoint without the right equipment. Protocol analyzers that tap into the V.35 connection between the CSU/DSU and the router may not see a problem because the V.35 connection is fine.
This often manifests itself as a router that insists that it is transmitting data even though nothing is received by the remote end. Because nothing can be transmitted, the link layer protocol cannot initialize the link. If you see this symptom, check with the router's vendor to ensure that they support sending a timing signal to the V.35 DCE.
Inverting the internal clock signal
A second method of addressing a phase shift between the transmitted data and its clock signal is to invert the internal clock, which has the effect of shifting the clock signal by a half-cycle. The goal of clocking on the serial port to the router is to make sure that the clock signals trigger bit extractions in the middle of the bit time. Problems occur when the clock fires at the edge of a bit time. Moving the clock trigger half a cycle returns the clock signal to the middle of the bit time.
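A rough way to see why inversion helps is to compute how far the sampling instant sits from the nearest bit edge, with and without the half-cycle shift. The sketch below assumes a simplified model in which the ideal sampling point is mid-bit and the data is late by a fraction of a bit time; the function and parameter names are hypothetical.

```python
def sampling_margin(delay_in_bit_times: float, inverted: bool = False) -> float:
    """Distance from the sampling instant to the nearest bit edge, in bit times.

    0.5 means the sample lands mid-bit (ideal); 0.0 means it lands on an edge.
    Inverting the clock is modeled as moving the sampling instant by half a bit time.
    """
    shift = 0.5 if inverted else 0.0
    position = (0.5 - delay_in_bit_times + shift) % 1.0  # position within the bit
    return min(position, 1.0 - position)


def better_polarity(delay_in_bit_times: float) -> str:
    """Pick the clock polarity that leaves the larger margin under this model."""
    normal = sampling_margin(delay_in_bit_times)
    inverted = sampling_margin(delay_in_bit_times, inverted=True)
    return "inverted" if inverted > normal else "normal"


if __name__ == "__main__":
    for delay in (0.05, 0.45):
        print(delay, sampling_margin(delay), sampling_margin(delay, True),
              better_polarity(delay))
```

When the data is nearly half a bit time late, the normal clock samples almost on the edge, and the inverted clock samples close to mid-bit. When the delay is small, inversion makes matters worse, so it is a fix for a specific symptom and not a default setting.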
Slips: When Timing Goes Bad
T1 equipment employs a variety of techniques to compensate for variations in timing signals. Intermediate network equipment may buffer the 192-bit frames to ensure that frames are complete before forwarding them on to their destinations. CSU/DSUs are equipped with phase lock loop (PLL) circuitry to track with the more accurate clocks at the local exchange office. Occasionally, though, these measures are not enough, and timing problems occur.
Imperfect timing conditions may force network equipment to replicate or delete data in a process called a frame slip. Slips are divided into two categories. Controlled slips replicate or delete a complete 192-bit frame of data, but do not cause any problems with the T1 path. Uncontrolled slips, which are also called change of frame alignment (COFA) events, are much more severe because they disrupt the framing pattern. Controlled slips are the more benign of the two because the path remains available. Uncontrolled slips indicate more severe problems with the circuit.
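To get a feel for how quickly mismatched clocks force a slip, the following back-of-the-envelope sketch may help. It assumes the standard T1 frame rate of 8,000 frames per second (a frame time of 125 microseconds) and a buffer that slips once the accumulated timing difference reaches one full frame time. The offset values are arbitrary examples, not figures from any standard.

```python
import math

# Assumption: the nominal T1 frame rate of 8,000 frames per second.
FRAMES_PER_SECOND = 8000
FRAME_TIME_S = 1 / FRAMES_PER_SECOND  # 125 microseconds


def seconds_between_slips(fractional_offset: float) -> float:
    """Time for two clocks to drift apart by one frame time.

    fractional_offset is the relative frequency difference between the
    clocks, e.g. 1e-6 for one part per million (a hypothetical value).
    """
    if fractional_offset == 0:
        return math.inf  # perfectly matched clocks never slip
    return FRAME_TIME_S / abs(fractional_offset)


if __name__ == "__main__":
    for offset in (1e-6, 1e-9, 1e-11):  # illustrative offsets only
        print(f"offset {offset:g}: one slip every {seconds_between_slips(offset):,.0f} s")
```

Under these assumptions the slip interval scales inversely with the frequency offset, which is why the accuracy of the clocks at each stratum matters so much.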
Controlled slips always involve complete frames, and can be the result of either a buffer overflow or underflow. Both conditions are illustrated in Figure 5-12. In the overflow case, the second frame is lost in time unit 1 because the buffer overflows and replaces it with the third frame. Both the second frame and its framing bit are lost. Receivers use the disruption in the framing bit sequence to detect controlled slips. Controlled slips may also occur because of a buffer underflow, which causes frames to be repeated. In Figure 5-12, the buffer underflow in time unit 1 means that no fresh data is available for transmission in the second time unit.
Figure 5-12. Controlled slip operations
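Both cases can be reproduced with a toy single-frame buffer. This is only a sketch: the clock periods are deliberately exaggerated integers in arbitrary time units, and the model simply lets the reader take whichever complete frame was written most recently.

```python
def read_frames(n_reads: int, write_period: int, read_period: int):
    """Frame numbers seen by a reader draining a one-frame buffer.

    Frame i is written at time i * write_period. The reader starts half a
    write period later and reads the latest complete frame every read_period.
    """
    first_read = write_period // 2
    return [(first_read + j * read_period) // write_period for j in range(n_reads)]


def classify_slips(frames_read):
    """Return (deleted, repeated) frame numbers found in the read sequence."""
    deleted, repeated = [], []
    for previous, current in zip(frames_read, frames_read[1:]):
        if current == previous:
            repeated.append(current)                      # underflow: same frame sent again
        elif current > previous + 1:
            deleted.extend(range(previous + 1, current))  # overflow: frame overwritten unread
    return deleted, repeated


if __name__ == "__main__":
    slow_reader = read_frames(6, write_period=10, read_period=11)
    fast_reader = read_frames(7, write_period=10, read_period=9)
    print(slow_reader, classify_slips(slow_reader))
    print(fast_reader, classify_slips(fast_reader))
```

A reader that runs slower than the writer eventually finds a frame overwritten before it was read (overflow, a deleted frame). A reader that runs faster eventually finds no fresh frame and sends the old one again (underflow, a repeated frame). In both cases whole frames are affected, so the frame alignment survives.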
Uncontrolled slips are far more severe. If a buffer overflow or underflow causes a partial frame to be lost, an uncontrolled slip will occur. Framing bits shift within the bit stream, as illustrated in Figure 5-13.
Figure 5-13. Uncontrolled slip operations (a.k.a. COFA events)
In the overflow case, part of the second frame is lost. Following the partial frame 2, the third frame begins with its framing bit. The underflow case is similar in that the first frame is only partially replicated in the second frame slot. Immediately following the partial frame in the second frame slot is the framing bit for the third frame. Partial frames lead to a change in the frame alignment: instead of being separated by 192 data bits, the framing bits on either side of the partial frame are separated by fewer than 192. Because the framing bit is in an unexpected location, the receiving framing unit will need to examine the incoming bit pattern carefully to determine the new frame alignment and reframe appropriately.
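The change in alignment can be made concrete with a small sketch. As a simplification, framing bits are represented by the marker "F" and data bits by "d"; real framing bits follow a defined bit pattern, not a distinct symbol. The number of bits dropped is an arbitrary example.

```python
FRAME_DATA_BITS = 192  # data bits per frame, as described in the text


def build_stream(n_frames: int):
    """A stream of frames, each one framing bit followed by 192 data bits."""
    stream = []
    for _ in range(n_frames):
        stream.append("F")
        stream.extend("d" * FRAME_DATA_BITS)
    return stream


def drop_bits(stream, start: int, count: int):
    """Model a partial-frame loss by removing count bits beginning at start."""
    return stream[:start] + stream[start + count:]


def framing_gaps(stream):
    """Number of data bits between each pair of consecutive framing bits."""
    positions = [i for i, symbol in enumerate(stream) if symbol == "F"]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    clean = build_stream(4)
    print(framing_gaps(clean))
    # Lose 50 data bits from the middle of the second frame (hypothetical amount).
    damaged = drop_bits(clean, start=294, count=50)
    print(framing_gaps(damaged))
```

After the loss, one gap between framing bits is shorter than 192 data bits, so every later framing bit arrives earlier than the receiver expects. This is the condition that forces the receiver to hunt for the new alignment.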
Reframing can be a time-consuming process. Instead of a simple higher-level retransmission, the equipment at both ends of the T1 must resynchronize, which takes a significantly longer time to accomplish. Uncontrolled slips are usually the result of equipment that does not buffer full 192-bit frames. Full frame-size buffers are expensive because they require more memory and more complicated handling. Additionally, handling and checking full frames increases the latency of the frames as they pass through the device, which most equipment vendors prefer to avoid.
Underlying timing problems on the circuit are what cause slips to occur. Two rules can prevent configuration errors with regard to the clock source:
- There must be no more than one clock source. Synchronous networks depend on having one source of clocking truth, and free-running clocks at both ends of a circuit will corrupt data as the two clocks drift in and out of synchronization.
- There must be no fewer than one clock source. If two CSU/DSUs are both operating in slave mode, each will look to the other as the source of timing information. Any changes in the timing of one CSU/DSU will affect the partner because both CSU/DSUs are attempting to lock on to the timing information supplied by the other.
Combining these two rules leads to the obvious conclusion that there must be only one clock source. Typically, the clock source is the telco and the CSU/DSUs at both ends are set for loop timing. On untimed circuits, one of the CSU/DSUs should generate the transmit clock, and the other should be set for loop timing. Network administrators should choose one clock source and configure other devices to accept that clock.
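The two rules lend themselves to a simple sanity check on a planned configuration. The sketch below is illustrative only: the mode names "loop" and "internal" are hypothetical labels rather than any vendor's configuration keywords, and the DTE-supplied clock option is left out for brevity.

```python
VALID_MODES = {"loop", "internal"}  # hypothetical labels, not vendor keywords


def count_clock_sources(telco_supplies_timing: bool, csu_modes) -> int:
    """Count the independent clock sources on a circuit."""
    sources = 1 if telco_supplies_timing else 0
    sources += sum(1 for mode in csu_modes if mode == "internal")
    return sources


def check_clock_plan(telco_supplies_timing: bool, csu_modes) -> str:
    """Apply the 'exactly one clock source' rule to a two-ended circuit."""
    for mode in csu_modes:
        if mode not in VALID_MODES:
            raise ValueError(f"unknown timing mode: {mode!r}")
    sources = count_clock_sources(telco_supplies_timing, csu_modes)
    if sources == 0:
        return "no clock source: both ends are waiting for the other to supply timing"
    if sources > 1:
        return "multiple clock sources: free-running clocks will drift and cause slips"
    return "ok"


if __name__ == "__main__":
    print(check_clock_plan(True, ["loop", "loop"]))       # telco-timed circuit
    print(check_clock_plan(False, ["internal", "loop"]))  # untimed private line
    print(check_clock_plan(False, ["loop", "loop"]))      # nobody supplies timing
    print(check_clock_plan(True, ["internal", "loop"]))   # two competing clocks
```

The first two plans correspond to the typical arrangements described above; the last two break one rule each.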
1. Telcos may now use multiple primary reference sources for their networks. Navigational systems like the Global Positioning System (GPS) and the Long Range Navigation (LORAN) network depend on having extremely accurate timing, and it is now quite inexpensive to build specialized receivers to extract timing information from the radio signals at each office, instead of building a transnational timing network.
© 2001, O'Reilly & Associates, Inc.