Asynchronous Domain Crossings
Background
Moving data between different asynchronous clock domains is a common issue in chip design. Trying to keep clocks balanced across an entire chip is impractical for several reasons:
- A large clock network would consume a great deal of power and area.
- A large clock network would likely have a huge skew, increasing clock uncertainty and cutting into your cycle time.
- Peripherals interfacing to off-chip devices often need clock frequencies that are not integer multiples of the core clock (e.g. HDMI may need to run at 250 MHz, DDR at 533 MHz, etc.).
Asynchronous clock domain crossings (CDC) must be handled with proper design techniques because:
- CDC issues can cause data errors which are very difficult to detect.
- CDC errors cannot be detected with standard RTL simulations.
- CDC errors cannot be detected with gate-level simulations. Most of the gate-level models I've used did not model metastability: if setup or hold was violated, an 'X' was generated for the whole cycle, which propagates and corrupts the simulation.
In other words, if asynchronous clock domain crossings are not handled properly, the result will be an unreliable design that is nearly impossible to debug.
Experience
I've dealt with this situation at several points in my career:
- I designed the HPI interface on the TMS320C6201 which allowed data access to our main processor.
- In doing gate-level simulation, I often had to suppress SDF timing between clock domains to keep our simulations from erroring out. However, this meant that any asynchronous signal was modeled as being captured at the first possible instant, which often wasn't the case. It also meant that transient issues (such as a test signal too short to be captured) would not be properly detected.
- In developing test patterns for TMS320 processors, we had to have the tester (running asynchronously to our device) drive patterns to check the pin-to-pin parameters. Our testers required cycle-accurate behavior and were slower than our parts. Usually we handled this by:
- Putting the PLL in bypass. Since we were more interested in parameter-to-parameter timing than raw high-speed performance, this was an acceptable tradeoff.
- Programming the chip to evaluate if it received the appropriate data and report out a self-diagnostic result.
Discussion
There is a very good treatment of this topic at: SNUG-2008 Paper.
So how would I handle asynchronous domain crossings? If I need to get data from clock domain A to clock domain B, I would look at:
- If it's a single signal (for example, an interrupt or a UART input), I would use a dual-FF synchronizer (see the behavioral sketch after this list). As data came in, I would sample it and count high/low cycles.
- Of course, you'd need the signal to hold its level long enough to be sampled (at least 1.5 domain-B cycles).
- If you sampled a sequential signal (like an 8-bit UART or a CAN transceiver), you'd need domain B to toggle at least (2 * bits/frame) times the incoming clock rate, i.e. for an 8-bit UART you'd need to sample at 16 times the UART baud rate (essentially Nyquist's theorem).
- If it's an enable and a bus, then I would look at the clock speed difference between the two domains:
- If domain A is much slower than domain B (1/3 the frequency or less), I would capture an enable edge with an edge detector.
- If domain A's speed is close to or faster than domain B's, I would likely use an asynchronous FIFO with Gray-code counters (see the Gray-code sketch after this list). The tradeoff is increased area and some added latency.
- If there is a risk that domain A could produce data faster than domain B can consume it, you would need some mechanism on the domain-A side to fan data out to multiple instances of domain B's logic; otherwise you would lose data.
- If domain A requires an 'acknowledge' of received data before transmitting new data, that handshake slows the transfer down considerably (the rough throughput estimate after this list shows how much).
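To make the single-signal case concrete, here is a minimal cycle-based sketch in Python (illustrative only, not RTL) of what a dual-FF synchronizer buys you: the first flop may resolve either way when the input changes near a clock edge, but downstream logic only ever sees the settled output of the second flop. The `majority` helper is a hypothetical stand-in for the "count high/low cycles" step.

```python
import random

def two_ff_sync(async_samples):
    """Behavioral sketch of a dual-FF synchronizer in the destination domain.

    async_samples holds the value of the asynchronous input at each
    domain-B clock edge. When the input changes right at an edge, the first
    flop may capture either the old or the new value (a crude stand-in for
    metastability resolving either way); the second flop keeps that
    uncertainty away from downstream logic.
    """
    ff1, ff2, prev = 0, 0, 0
    synced = []
    for d in async_samples:
        ff2 = ff1                                        # second stage takes the settled value
        ff1 = random.choice([prev, d]) if d != prev else d
        prev = d
        synced.append(ff2)
    return synced

def majority(bit_samples):
    """Resolve one oversampled bit period by counting high vs. low samples."""
    return int(sum(bit_samples) > len(bit_samples) // 2)

# A level held for at least two domain-B edges always gets through,
# just one or two cycles late.
print(two_ff_sync([0, 0, 1, 1, 1, 0, 0, 0]))
```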
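For the asynchronous-FIFO case, the whole point of the Gray-code pointers is that only one bit changes per increment, so a pointer synchronized into the other domain is, at worst, one count stale rather than garbled. A quick Python sketch of that property (the function names are mine, not from the SNUG paper):

```python
def bin_to_gray(b):
    """Gray-encode a binary count: adjacent counts differ in exactly one bit."""
    return b ^ (b >> 1)

def gray_to_bin(g):
    """Recover the binary count from its Gray code."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# Check the property an async FIFO relies on, including the wrap-around:
# a synchronizer can never capture a half-updated, wildly wrong pointer.
DEPTH = 16
for i in range(DEPTH):
    diff = bin_to_gray(i) ^ bin_to_gray((i + 1) % DEPTH)
    assert bin(diff).count("1") == 1
    assert gray_to_bin(bin_to_gray(i)) == i
```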
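And to put a number on how much a full request/acknowledge exchange costs, here's a back-of-the-envelope estimate (it assumes two synchronizer stages in each direction; the figures are illustrative, not measured):

```python
def handshake_rate(f_a, f_b, sync_stages=2):
    """Rough upper bound on transfers/second for a req/ack handshake.

    Each transfer waits for the request to be synchronized into domain B
    (sync_stages cycles of clock B) and the acknowledge to be synchronized
    back into domain A (sync_stages cycles of clock A), plus roughly one
    cycle on each side to launch the req/ack edges.
    """
    round_trip = (sync_stages + 1) / f_b + (sync_stages + 1) / f_a
    return 1.0 / round_trip

# e.g. a 100 MHz producer handshaking with a 200 MHz consumer tops out
# around 22 Mtransfers/s, far below either clock rate.
print(f"{handshake_rate(100e6, 200e6) / 1e6:.1f} Mtransfers/s")
```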
Obviously, much more complicated cases can arise. The primary thing to remember is that if you violate the setup or hold time of a flop, you are risking that its output will not have settled to a valid level at the next stage's input by the next cycle.
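If you want to put a number on that risk, the commonly quoted metastability MTBF estimate is

MTBF = e^(t_r / tau) / (T0 * f_clk * f_data)

where t_r is the settling time you leave before the next capturing edge, tau and T0 are characterization constants of the flop, f_clk is the capturing clock frequency, and f_data is the toggle rate of the asynchronous input. Each extra synchronizer flop adds a full cycle to t_r, and that goes straight into the exponent, which is why dual-FF (or triple-FF) synchronizers work as well as they do.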