1 Introduction

Delta-Sigma ADCs have increased in popularity in recent years. These converters typically use higher-order structures with multi-bit quantizers, which achieve a higher SNDR while maintaining a lower clock speed. Multi-bit feedback DACs are therefore required. The linearity of the multi-bit DACs is crucial to the performance of the converter. In particular, non-linearity of the outermost DAC in the feedback loop impairs the SNDR the most, because mismatches in this DAC are fed directly into the system in parallel with the input signal.

A possible technique to mitigate the non-linearity is dynamic element matching (DEM). However, its effectiveness is limited in a system with a very low OSR. In this work, a digital calibration circuit is implemented to compensate for mismatches in the outermost DAC. These mismatches are estimated using digital cross-correlation between the digital output and an externally injected pseudo-random test signal. The estimated errors are then subtracted from the converter’s output in the digital domain to regain the target SNDR. In [1], the digital block is implemented in an external FPGA to calibrate the modulator. However, the effectiveness of the background calibration is limited by imperfect test signal cancellation. In this work, the digital algorithm is synthesised, placed and routed, and integrated with the modulator on a single chip.

Figure 1 shows the system architecture of the proposed converter. The design of the analog part is briefly described in Sect. 2. The shaded digital part is generated using a digital design flow. It provides the calibration function for the outermost feedback DAC (\({DAC_\mathrm {1}}\) in Fig. 1) and is discussed in detail in Sect. 3.

Fig. 1 System architecture of CT \({\Delta \Sigma }\) ADC with digital calibration

2 Analog modulator architecture

2.1 CT loop filter

The initial phase in \({\Delta \Sigma }\) ADC design is choosing a suitable loop filter. The design specification is an SNDR of more than 60 dB in a 9 MHz bandwidth. Continuous-time (CT) loop filters have speed advantages over their discrete-time (DT) counterparts, enabling a higher clock rate or a lower power consumption [3]. A noise transfer function (NTF) is then chosen to determine the noise shaping of the spectrum. The NTF was synthesised and simulated using the Delta-Sigma toolbox [3] presented in [4], which resulted in a 3rd order, 4-bit feedback modulator with an OSR of 8. The NTF is given in (1).

$$\begin{aligned} NTF=\frac{(z-1)(z^\mathrm {2}-1.908z+1)}{(z-0.1808)(z^\mathrm {2}-0.02173z+0.08153)} \end{aligned}$$
(1)
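As a quick sanity check on (1), the zeros, poles and out-of-band gain of the NTF can be evaluated numerically. The following is a minimal sketch using NumPy, not part of the original design flow; the variable names are illustrative, and only the coefficients and the OSR are taken from the text.

```python
import numpy as np

# NTF coefficients copied from Eq. (1), highest power of z first.
num = np.polymul([1.0, -1.0], [1.0, -1.908, 1.0])
den = np.polymul([1.0, -0.1808], [1.0, -0.02173, 0.08153])

zeros = np.roots(num)
poles = np.roots(den)

OSR = 8                      # oversampling ratio stated in the text
band_edge = np.pi / OSR      # signal band edge in rad/sample

# Position of the complex zero pair relative to the band edge.
zero_angle = np.max(np.abs(np.angle(zeros)))
zero_position = zero_angle / band_edge

# NTF gain at half the sampling rate (z = -1).
gain_at_nyquist = abs(np.polyval(num, -1.0) / np.polyval(den, -1.0))

# All poles must lie inside the unit circle for a stable NTF.
max_pole_radius = np.max(np.abs(poles))

print(zero_position, gain_at_nyquist, max_pole_radius)
```

With these coefficients, the complex zero pair lies on the unit circle at roughly 0.78 of the band edge, the gain at z = -1 is close to 6, and all poles lie well inside the unit circle.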

This NTF is then mapped to a DT cascade of integrators with distributed feedback (CIFB) architecture, as shown in Fig. 2(a). Three feedback paths exist in the modulator loop. The equivalent CT architecture is derived from the DT model. The corresponding CT loop filter coefficients are calculated using the impulse invariant transformation [5], which requires the CT and DT impulse responses to be identical at the sampling instants. Table 1 lists the impulse responses of 1st, 2nd and 3rd order DT integration. To simplify the calculation, the resonator (\({g_\mathrm {1}}\) in Fig. 2) is temporarily removed.

Non-return-to-zero (NRZ) feedback pulses are generated by the DACs. The impulse response of the CT integration with direct path excess loop delay (ELD) compensation is listed in Table 2. The parameters \({\alpha }\) and \({\beta }\) represent the start and stop times of the feedback pulse respectively, normalized to the clock period \(T_s\). For example, if there is no ELD, \({\alpha }=0\) and \({\beta }=1\); with half a clock cycle of ELD, \({\alpha }=0.5\) and \({\beta }=1.5\). For the first sample (n = 1), however, an additional parameter \({\gamma _i=min(\beta _i, 1)}\) is necessary to guarantee a correct impulse response calculation.
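The role of \({\gamma }\) can be illustrated with the simplest case, a unit-height rectangular pulse fed through an ideal 1st order integrator. The sketch below is an assumption-based illustration and not a reproduction of Table 2; the function name `integrated_pulse` is hypothetical and all times are normalized to \(T_s\).

```python
def integrated_pulse(n, alpha, beta):
    """Output at t = n of an ideal 1/s integrator driven by a unit pulse on [alpha, beta]."""
    stop = min(beta, n)            # for n = 1 this is gamma = min(beta, 1)
    return max(0.0, stop - alpha)  # nothing is integrated before the pulse starts

# No ELD: the complete pulse is integrated within the first clock period.
no_delay = [integrated_pulse(n, 0.0, 1.0) for n in (1, 2, 3)]

# Half-cycle ELD: only half of the pulse has arrived at the first sampling instant.
half_delay = [integrated_pulse(n, 0.5, 1.5) for n in (1, 2, 3)]

print(no_delay, half_delay)
```

For the delayed pulse the first sample sees only the part of the pulse up to t = 1, which is why the upper integration limit must be clipped for n = 1.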

Table 1 1st, 2nd and 3rd order DT integrator transfer functions and the impulse responses

Fig. 2 \({\Delta \Sigma }\) modulator architecture. a 3rd order DT CIFB architecture; b Optimized 3rd order CT CIFB/FF architecture

Table 2 The transfer functions of rectangular pulse DAC feedback through direct path, and through 1st, 2nd and 3rd order CT integrators in Laplace domain, and the impulse responses

In the design optimization, the second feedback path, which should be placed in the middle, is omitted. Instead, a direct feed-forward path from the first integrator’s output to the summation point before the third integrator (\(K_\mathrm {2}\) path in Fig. 2(b)) is added [6]. In consequence, this operation yields a lower signal swing at the output of the first integrator, due to the lack of subtraction with the feedback signal.

2.2 Excess loop delay

Due to the non-zero response time of the transistors in the quantizer, the ELD between the quantizer clock and the DAC feedback current affects the equivalence between the CT modulator loop filter and its DT counterpart. ELD degrades the performance of the modulator and can even make it unstable [7]. Therefore, in this design, a fixed half clock cycle delay is allocated between the quantizer and the feedback DACs. To realise the original NTF, the basic idea is to insert a fast feedback loop around the quantizer [8]. However, such an implementation requires one additional high-speed DAC, which consumes extra chip area and power. Instead, a feed-forward proportional path (PI path) bypassing the third integrator (\(K_\mathrm {PI}\) path in Fig. 2(b)) is implemented, as proposed in [9]. The feedback signal propagates through \(DAC_3\) and the PI path to act as a direct feedback around the quantizer. Simulations show identical results for both approaches. Only additional resistors are required instead of a DAC, which saves both area and power. The optimized CT CIFB/FF architecture with ELD compensation is shown in Fig. 2(b).

2.3 Switch logic

The switch logic (SL) block routes an external single-bit test signal to a selected unit cell in the DAC and controls the calibration in the digital domain. This enables the converter to sequentially determine the gain of each unit cell. One SL block is placed between the quantizer and the outermost DAC. A similar one is implemented in the digital domain to match the calibration coefficients with the corresponding bit lines. The 4-bit selection signal, issued by a digital control unit, controls both SL blocks, which guarantees synchronised signal routing and correct mismatch correction.

3 Digital calibration

Error shaping and error correction are the two major techniques for linearizing \({\Delta \Sigma }\) modulators. The error shaping technique, also known as DEM, works very reliably, but its performance is limited in low OSR designs [10]. Error correction techniques can be implemented in either the analog or the digital domain. When errors are compensated in the digital domain, the mismatch error to be corrected must be determined in advance. In this work, the mismatch of each individual unit element is determined with an approach similar to that presented in [1]. It works as follows.

3.1 Algorithm

As shown in Fig. 1, \(DAC_1\) is extended by one additional unit element \(u_t\). This extension allows a single-bit test signal \(E_t\) to be inserted into the modulator through one selected DAC unit element, routed by the SL block, without interfering with the loop behaviour. The additional unit element becomes a part of the system and is used on the fly for background calibration. One drawback, however, is that the input dynamic range is reduced by 0.5 dB, because the test bit is processed in parallel with the input.

The feedback DACs can be expressed as a combination of unit gain elements. Each unit element amplifies the corresponding bit line. Therefore, the gain mismatches of these unit elements must be estimated. A single bit pseudo random test signal \(E_t\) is injected into the modulator along with the actual input U(t). \(E_t\) is limited to low frequencies (deep in-band) of the modulator bandwidth, where the signal transfer function (STF) is flat. After it is inserted through a selected unit element, it appears at the output of the \({\Delta \Sigma }\) modulator. The test signal is uncorrelated with the input signal. Therefore, a characteristic value of individual gain of the unit element under test is calculated through the cross-correlation between test signal \(E_t\) and the digital output signal.

3.1.1 Calibration factors

The calibration relies on cross-correlation as presented in [11]. The cross-correlation between \(E_t\) and \(Y_t\) (see Fig. 1) is:

$$\begin{aligned} CCF_i = \sum _{n=1}^LE_t(n)Y_t(n)\approx k_{i}s_{E_t}^2, \ \ i=t,1,2,\ldots ,15 \end{aligned}$$
(2)

where \(CCF_i\) is the cross-correlation factor for \(DAC_1\) unit element i, \(k_{i}\) is the individual gain of the unit element under test, and \(s_{E_t}^2\) is the variance of the inserted test signal. In the design, L = \(2^{18}\) samples are grouped as one set for computing the cross-correlation for one unit element. 16 sets are regarded as one frame. Test signal patterns are identical for every set in one frame. Therefore, in (2), the \(CCF_i\) is directly proportional to the gain (\(k_{i}\)) of each DAC unit element. The \(CCF_t\) is chosen as a reference, and (3) relates \(CCF_i\) to \(CCF_{t}\) to reveal the mismatch which is also the calibration coefficient \(CC_i\) for DAC unit element i:
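The principle behind (2) can be sketched numerically. The example below is a simplified illustration under stated assumptions: the test sequence is a white ±1 sequence rather than the band-limited signal used in the design, the modulator is replaced by a plain sum of an input tone, the scaled test signal and Gaussian noise, and the gain value `k_true` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 2**18                        # samples per set, as in the text
k_true = 1.02                    # hypothetical unit-element gain (2 % mismatch)

# Simplification: white +/-1 test sequence (the design band-limits E_t).
e_t = rng.choice([-1.0, 1.0], size=L)

# Hypothetical uncorrelated input tone and noise standing in for the modulator output.
n = np.arange(L)
u = 0.5 * np.sin(2 * np.pi * 0.01 * n)
noise = rng.normal(0.0, 0.3, size=L)
y_t = u + k_true * e_t + noise

ccf = np.sum(e_t * y_t)              # accumulation as in Eq. (2)
k_est = ccf / np.sum(e_t * e_t)      # normalize by the test signal energy

print(k_est)
```

Because the input and the noise are uncorrelated with the test sequence, their contributions average out over the set, and the accumulated value is dominated by the gain of the element under test. The common scale factor cancels when two such results are divided, as in the ratio of \(CCF_i\) to \(CCF_t\).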

$$\begin{aligned} CC_i =k \left( \frac{CCF_i}{CCF_{t}} -1 \right)+CC_i', \ \ i=1,2,\ldots ,15 \end{aligned}$$
(3)

where \(CC_i'\) is the calibration coefficient calculated from previous iteration, and k is the ideal unit gain factor. The \(CC_i\) are stored in the registers.
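The update in (3) can be written as a one-line function. This is a minimal sketch with hypothetical numbers; the function name `update_coefficient` and the example correlation values are illustrative only.

```python
def update_coefficient(ccf_i, ccf_t, k, cc_prev):
    """Eq. (3): gain error of element i relative to the reference, added to the previous value."""
    return k * (ccf_i / ccf_t - 1.0) + cc_prev

# Hypothetical first pass: element i correlates 2 % more strongly than the reference element.
cc_first = update_coefficient(ccf_i=1.02e5, ccf_t=1.00e5, k=1.0, cc_prev=0.0)

# Hypothetical later pass: a small residual adds a small correction to the stored value.
cc_second = update_coefficient(ccf_i=1.001e5, ccf_t=1.00e5, k=1.0, cc_prev=cc_first)

print(cc_first, cc_second)
```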

3.1.2 Digital calibration

In Fig. 1, each \(CC_i\) is multiplied with a corresponding bit in \(Y_d\), which is routed by the SL block. The calibration value \(Y_c\) is then the sum of these products:

$$\begin{aligned} Y_c = \sum _{i=1}^{15}Y_d[i]\times CC_i \end{aligned}$$
(4)

where \(Y_d[i]\) represents the ith bit of the 15-bit thermometer-coded \(Y_d\). The calibration value \(Y_c\) is a 17-bit binary signal, in which the first 5 bits represent the signed integer part and the following 12 bits contain the fractional part.
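A possible software model of (4) with the 17-bit word format is sketched below. The word widths come from the text; the rounding of the sum, the saturation behaviour and the coefficient values are assumptions made for illustration, and the actual hardware may quantize differently.

```python
FRAC_BITS = 12      # fractional bits of Y_c stated in the text
WORD_BITS = 17      # total width: 5 signed integer bits plus 12 fractional bits

def calibration_value(y_d, cc):
    """Eq. (4): sum the coefficients whose thermometer bit is set, as a signed fixed-point integer."""
    assert len(y_d) == len(cc) == 15
    total = sum(c for bit, c in zip(y_d, cc) if bit)
    raw = round(total * (1 << FRAC_BITS))              # quantize to 12 fractional bits
    lo = -(1 << (WORD_BITS - 1))
    hi = (1 << (WORD_BITS - 1)) - 1
    return max(lo, min(hi, raw))                       # assumed saturation to 17 bits

# Hypothetical coefficients: every element 1/256 too strong; thermometer code with 8 ones.
cc = [1.0 / 256] * 15
y_d = [1] * 8 + [0] * 7

y_c_raw = calibration_value(y_d, cc)
y_c = y_c_raw / (1 << FRAC_BITS)

print(y_c_raw, y_c)
```

Since each \(Y_d[i]\) is a single bit, the multiplications in (4) reduce to selecting which coefficients enter the sum.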

After the calculation, \(Y_c\) passes through an error transfer function (ETF), which contains unit delays. This is because the in-band mismatches are injected into the system in parallel with the input U and are shaped by a flat STF. The test input can be regarded as an additional deep in-band input, so the same principle applies to the test transfer function (TTF) as well.

The digital output is obtained by subtracting the filtered calibration value and the filtered test signal from the quantizer output:

$$\begin{aligned} V(n)= Y_d(n)-Y_c(n) \times ETF -E_t \times TTF \end{aligned}$$
(5)

As the calibration runs in background, each iteration increases the accuracy of \(CC_i\) and V continuously approaches the mismatch-free result.

Fig. 3 Digital part block diagram

Fig. 4 Fully differential operational amplifier schematic, including bias and CMFB circuit

3.2 Implementation

Figure 3 is the block diagram of the digital calibration system implementation. The major task is to implement two equations, cross-correlation (2) and calibration coefficient calculation (3).

The cross-correlation is implemented in two steps: multiplication and accumulation. In the multiplication, the single-bit multiplicand \(E_{t}(n)\) is either ‘0’ or ‘1’, which encodes the two signs of the test signal. Therefore, the multiplication is done simply by changing the sign of \(Y_t(n)\) accordingly using a multiplexer. The accumulation is performed sequentially as:

$$\begin{aligned} CCF_{i}(n) = \left\{ \begin{array}{lr} CCF_{i}(n-1)+Y_t(n), E_{t}(n)=1\\ CCF_{i}(n-1)-Y_t(n), E_{t}(n) = 0 \end{array} \right. \end{aligned}$$
(6)

where \(CCF_{i}(n-1)\) and \(CCF_{i}(n)\) represent the previous and current accumulation value respectively.
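The multiplier-free accumulation of (6) can be modelled in a few lines. This is a behavioural sketch with hypothetical sample values, not the synthesised logic.

```python
def accumulate_ccf(e_bits, y_samples):
    """Eq. (6): add Y_t(n) when E_t(n) = 1, subtract it when E_t(n) = 0."""
    ccf = 0
    for e, y in zip(e_bits, y_samples):
        ccf = ccf + y if e == 1 else ccf - y   # the multiplexer selects +Y_t or -Y_t
    return ccf

e_bits = [1, 0, 0, 1, 1, 0]
y_samples = [7, 8, 6, 9, 10, 5]                # hypothetical output codes

ccf = accumulate_ccf(e_bits, y_samples)

# Reference: explicit multiplication by the +/-1 value that each bit encodes.
reference = sum((2 * e - 1) * y for e, y in zip(e_bits, y_samples))

print(ccf, reference)
```

Both computations give the same result, which shows that the sign selection replaces a full multiplier.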

To implement (3), a division block is necessary. The division module is based on the Newton-Raphson algorithm [12, 13], which is briefly described here. Newton-Raphson division is a division method using functional iteration. The division is written as the product of the dividend and the reciprocal of the divisor. In the algorithm, a priming function is chosen that has a root at the reciprocal. Once the reciprocal of the divisor has been computed efficiently, the quotient Q follows from (7).

$$\begin{aligned} Q = \frac{N}{D}=N\times {\frac{1}{D}} \end{aligned}$$
(7)

Consider the priming function:

$$\begin{aligned} f(X) = \frac{1}{X}-D \end{aligned}$$
(8)

where the root of f(X) is the divisor reciprocal \(\frac{1}{D}\). The Newton-Raphson equation is given by:

$$\begin{aligned} X_{i+1} =X_i - \frac{f(X_i)}{f'(X_i)} \end{aligned}$$
(9)

Since f(X) is continuously differentiable around the root and its derivative at the root is not zero, the Newton-Raphson iteration converges quadratically. Applying (8) to (9), this iteration can be used to find an approximation to the reciprocal:

$$\begin{aligned} X_{i+1} = X_i + \frac{\frac{1}{X_i}-D}{\frac{1}{X_i^2}} = X_i \times (2-D\times X_i) \end{aligned}$$
(10)

The error in the reciprocal decreases quadratically after each iteration.
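The iteration in (10) can be tried out in floating point as follows. This is a sketch only: the pre-scaling of the divisor to [0.5, 1) and the linear starting guess are common choices assumed here and are not taken from the described hardware, which works with fixed-point words.

```python
import math

def reciprocal(d, iterations=4):
    """Approximate 1/d for d > 0 with the iteration of Eq. (10)."""
    if d <= 0:
        raise ValueError("this sketch handles positive divisors only")
    m, e = math.frexp(d)                   # d = m * 2**e with 0.5 <= m < 1
    x = 48.0 / 17.0 - 32.0 / 17.0 * m      # assumed linear starting guess on [0.5, 1)
    for _ in range(iterations):
        x = x * (2.0 - m * x)              # Eq. (10)
    return math.ldexp(x, -e)               # undo the scaling: 1/d = (1/m) * 2**(-e)

def divide(n, d):
    """Eq. (7): quotient as the dividend times the reciprocal of the divisor."""
    return n * reciprocal(d)

print(reciprocal(3.0), divide(7.0, 4.0))
```

With this starting guess the relative error is at most 1/17 before the first iteration and is squared in every step, so four iterations are sufficient for double precision.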

In (3), the reference value \(CCF_t\) used for calculating \(CC_i\) is the cross-correlation result of the first set of samples with the test pattern. It is the constant denominator D in (7). Therefore, the division is only calculated once per frame of data.

4 Circuit level implementation

4.1 Amplifiers

Various amplifier architectures have been studied and evaluated. The first integrator in the loop filter handles the input and feedback signals, and any non-linearity in this stage directly degrades the system’s SNDR, so high linearity is required. The last integrator must drive large input capacitors in the multi-bit quantizer, which demands an amplifier with high driving capability. To meet these requirements, all three amplifiers in the integrators share the same two-stage, Miller-compensated, class-AB output architecture [14], while the bias current is set individually for each amplifier to balance speed, area and power consumption. The schematic is shown in Fig. 4. The tail current sources are cascoded to provide a stable bias current. The common-mode feedback (CMFB) circuit senses the common-mode voltage of the outputs and controls a part of the input differential stage’s active load. This reduces the CMFB gain, which makes the CMFB loop easier to stabilize [15].

In the simulation, the gain-bandwidth products (GBW) of the three amplifiers are individually swept to study their impact on SNDR. The first amplifier has the highest linearity requirements. The last amplifier should be fast because it handles the PI path. The second amplifier has a lower GBW requirement: it does not have to drive a low impedance load, and any nonlinearity is suppressed by the loop filter. The optimized GBWs of the three amplifiers are 600, 350 and 600 MHz, respectively.

Fig. 5 Four-bit flash quantizer. a Quantizer architecture; b Comparator schematic

4.2 Flash quantizer

The four-bit mid-rise flash quantizer consists of 15 comparators, and its block diagram is shown in Fig. 5(a). A resistor ladder provides the reference levels to the comparators, and an external control voltage \(V_{ref}\) controls the reference range. Each comparator is formed by a pre-amplifier, a clocked core and an SR latch, as seen in Fig. 5(b) [16]. The pre-amplifier senses the input and the reference voltage. When the clock is low, \(M_{15}\) and \(M_{16}\) drive the \(R_{int}\) and \(S_{int}\) nodes to VDD. At the rising clock edge, either \(M_{7}\) or \(M_{8}\) conducts, depending on the voltages of nodes DiffP and DiffN. The decision is therefore made at this instant, and the cross-coupled pair \(M_{11}\)\(M_{14}\) keeps forcing \(S_{int}\) and \(R_{int}\) to ‘1’ or ‘0’. The SR latch then holds this value for one clock cycle.
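The behaviour of an ideal 15-comparator flash quantizer can be summarized in a short model. The sketch below assumes a symmetric reference range of ±`v_ref`, equally spaced ladder taps and offset-free comparators; these are illustrative assumptions, and the function name `flash_quantize` is hypothetical.

```python
def flash_quantize(v_in, v_ref=1.0, n_comparators=15):
    """Thermometer code of an ideal flash quantizer with equally spaced thresholds."""
    n_levels = n_comparators + 1
    step = 2.0 * v_ref / n_levels
    # For 15 comparators the thresholds are -7/8, ..., 0, ..., +7/8 of v_ref,
    # so one threshold sits at zero (mid-rise characteristic).
    thresholds = [-v_ref + step * (k + 1) for k in range(n_comparators)]
    return [1 if v_in > t else 0 for t in thresholds]

code_small_positive = flash_quantize(0.01)
code_negative_full = flash_quantize(-1.0)
code_positive_full = flash_quantize(1.0)

print(sum(code_small_positive), sum(code_negative_full), sum(code_positive_full))
```

Each entry of the returned list corresponds to one comparator output, which matches the 15-bit thermometer code that drives the unit elements of the feedback DACs.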

4.3 DAC

In the CIFB architecture, the outermost feedback DAC is a critical component, since its linearity dominates the modulator’s performance. A widely used high-speed DAC type is the current-steering DAC, which contains unit elements consisting of MOS transistors operating as current sources.

Fig. 6 Resistive DAC unit element schematic. Signal paths are represented by dash and solid lines based on two different clock levels

In this work, a resistive current-mode DAC is chosen. Resistors in the unit elements generate a current, corresponding to the digital value on the input bit line, which is fed into the integrators. Compared to the current-steering DAC, the resistive DAC has less thermal noise [5]. Each unit element is formed by two parts: a D-flip-flop that updates one sample at the clock edge, and a resistor that converts the DAC reference voltage into the unit current [17]. The outputs of all the unit elements are connected together, which sums all the unit currents. This output current is injected directly into the virtual ground node of the integrator. The DAC unit element schematic is shown in Fig. 6. Two latches form a D-flip-flop, which stores a sample at each rising clock edge. The \(I_{out}\) node is connected to the integrator’s input, which can be regarded as a virtual ground and has a constant common-mode voltage \(V_{CM}\). Thus, depending on the voltage on node Q, a unit current flows into or out of the integrator, as (11) shows.

$$\begin{aligned} I_{out} = \left\{ \begin{array}{lr} (V_{REF+}-V_{CM})/{R_1}, V(Q)=0\\ (V_{REF-}-V_{CM})/{R_1}, V(Q)=1 \end{array} \right. \end{aligned}$$
(11)
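To make (11) concrete, the following sketch sums the unit currents of a 15-element resistive DAC and shows how a resistor mismatch changes the total. All voltages and resistor values are hypothetical and chosen only for illustration; they are not the values used in the design.

```python
def unit_current(q, v_ref_p, v_ref_n, v_cm, r1):
    """Eq. (11): current of one unit element for the stored bit q."""
    return (v_ref_p - v_cm) / r1 if q == 0 else (v_ref_n - v_cm) / r1

def dac_current(bits, resistors, v_ref_p=1.2, v_ref_n=0.0, v_cm=0.6):
    """Sum of the unit currents at the shared output node (hypothetical voltages)."""
    return sum(unit_current(q, v_ref_p, v_ref_n, v_cm, r) for q, r in zip(bits, resistors))

ideal = [10e3] * 15                     # hypothetical 10 kOhm unit resistors
mismatched = [10e3] * 14 + [10.2e3]     # last resistor 2 % too large

i_ideal = dac_current([0] * 15, ideal)
i_balanced = dac_current([0] * 8 + [1] * 7, ideal)
i_mismatch = dac_current([0] * 15, mismatched)

print(i_ideal, i_balanced, i_mismatch)
```

With references placed symmetrically around \(V_{CM}\), each element contributes a current of equal magnitude and opposite sign for the two states of Q, so a deviation in a single resistor directly changes the weight of that element.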

4.4 Digital synthesis and place and route

In the digital synthesis, standard cells are used. Standard threshold voltage (SVT) libraries are chosen for their balance between speed and power consumption. To meet timing, pipeline stages are inserted between the building blocks on the long critical paths.

Furthermore, clock gating is utilized. Clock gating is a technique generally used to reduce dynamic power dissipation in flip-flops by gating off their input clocks [18]. It is implemented with a combination of an AND gate and a latch to avoid glitches on the clocks. The architecture is illustrated in Fig. 7, where TE is the test enable input, EN is the enable input, CLK is the clock input and CLK_out is the gated clock output.

In this implementation, clock gating is done by inserting a clock gating cell into the \(CC_i\) calculation module to disable the clock for certain blocks. For example, the division module is not always in use: as discussed in Sect. 3.2, the division is invoked once per frame of data. The multiplication of \(\frac{CCF_i}{CCF_{t}}\) and k (as in (3)) is also used only occasionally, once for every set of samples. Therefore, the clock for these modules is disabled when they are not in use.
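A latch-based clock gate can be modelled behaviourally as shown below. The sketch assumes a common arrangement in which the latch is transparent while CLK is low and its input is the OR of EN and TE; this arrangement is an assumption and may differ in detail from the cell used in the design.

```python
def simulate_clock_gate(samples):
    """samples: sequence of (clk, en, te) tuples; returns the gated clock for each sample."""
    latch_q = 0
    gated = []
    for clk, en, te in samples:
        if clk == 0:                  # latch is transparent while the clock is low
            latch_q = en | te
        gated.append(clk & latch_q)   # AND gate passes the clock only if the latched enable is high
    return gated

stimulus = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0),  # EN rises during the high phase: no partial pulse appears
    (0, 1, 0), (1, 1, 0), (1, 0, 0),  # EN falls during the high phase: the pulse is not cut short
    (0, 0, 0), (1, 0, 0),             # gate is closed again for the next pulse
]
clk_out = simulate_clock_gate(stimulus)

print(clk_out)
```

Because the enable is only sampled while the clock is low, changes of EN during the high phase cannot create or truncate a clock pulse, which is the glitch protection that the latch provides.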

Fig. 7 Clock gating architecture

Fig. 8 System level simulation in Matlab with 2 % unit element mismatch in the outermost DAC. Original spectrum has SNDR of 55.7 dB (red line), spectrum of DAC calibration after 3 iterations has SNDR of 69.2 dB (yellow line) and the ideal spectrum has SNDR of 70 dB (black line) (Color figure online)

Fig. 9 Layout of the system, with an enlarged box showing the analog modulator core

5 Simulation

5.1 Simulation in Matlab

The system is verified using Simulink in Matlab. Random mismatches are added to the unit elements in \({DAC_1}\), which produces a varying feedback gain. The unit element mismatch is set to 2 %, a pessimistic estimate of the real mismatch after chip fabrication. The input signal has an amplitude of −3 dB full scale (FS). Without calibration, large distortion is clearly visible in the spectrum plotted in Fig. 8. After 3 iterations of digital calibration with limited calculation accuracy (the same accuracy as in the digital implementation in VHDL), the simulated SNDR is improved by 13.5 dB and the spurious-free dynamic range (SFDR) by 20.9 dB. The SNDR differs by less than 1 dB from that of an ideal mismatch-free spectrum.

5.2 Digital block verification

In this design, the digital block can be regarded as a stand-alone part. It is not inside the modulator loop, and only slowly varying control signals drive the SL block in the analog part. The digital block is verified by comparing the output of the digital behavioural model with the Matlab output for the same accuracy and the same input data. The results are identical, as expected.

6 Layout

The modulator with the digital calibration circuit has been implemented in 65 nm CMOS technology, as illustrated in Fig. 9. The modulator circuit is enlarged in the figure. The gaps between the active region and the pad frame are filled with decoupling capacitors, which stabilize the supply and bias voltage. In total 53 pads are placed in the pad ring, where 16 pads are used for the calibrated digital output signal.

The post-layout simulation spectrum is plotted in Fig. 10, with an input sine wave at 2 MHz and transient noise activated. The NTF peaking visible at high frequencies is caused by the finite GBW of the amplifiers and the additional delay in the feedback DACs. The SNDR is 67.1 dB within the 9 MHz signal bandwidth.

Figure 11 illustrates the power and area distribution of the presented design. Total power is 6.2 mW and total active area is 0.16 mm\(^2\). Analog circuits, which include the loop filter and the bias circuit, consume the most power. One fifth of the total power is contributed by the proposed digital circuit. The digital calibration block occupies a relatively large area, which will be optimized to a denser design in the future.

Fig. 10 Post layout simulation with transient noise

Fig. 11 Pie chart of power and area distribution of the presented design

Based on the result from the post layout simulation, the effective number of bits (ENOB) is

$$\begin{aligned} ENOB=\frac{SNDR-1.76}{6.02}=10.9\, bits \end{aligned}$$
(12)

and the figure of merit (FOM) is

$$\begin{aligned} FOM=\frac{P}{(2^\mathrm {ENOB} \times 2f_\mathrm {B})}=180.3\, fJ/conversion \end{aligned}$$
(13)
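The arithmetic of (12) and (13) can be reproduced with the reported SNDR, power and bandwidth. The short check below uses only numbers stated in the text; the helper name `fom_fj` is illustrative.

```python
sndr_db = 67.1      # post-layout SNDR from the text
power_w = 6.2e-3    # total power from the text
f_b = 9e6           # signal bandwidth from the text

enob = (sndr_db - 1.76) / 6.02                 # Eq. (12)

def fom_fj(enob_bits):
    """Eq. (13), returned in fJ per conversion."""
    return power_w / (2.0 ** enob_bits * 2.0 * f_b) * 1e15

fom_rounded = fom_fj(round(enob, 1))   # ENOB rounded to one decimal place (10.9 bits)
fom_exact = fom_fj(enob)               # unrounded ENOB

print(enob, fom_rounded, fom_exact)
```

The quoted 180.3 fJ/conversion is reproduced when the ENOB is first rounded to 10.9 bits. Carrying the unrounded ENOB of about 10.85 bits through (13) gives a slightly higher value of roughly 186 fJ/conversion.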

The performance of the modulator is summarized in Table 3.

Table 3 Performance comparison with other CT \({\Delta \Sigma }\) modulators

7 Conclusion

This work integrates both analog and digital circuits in a single layout in 65 nm CMOS technology, providing background digital calibration to correct for non-linearity in the DAC. A mismatch of 2 % is successfully calibrated, improving the SNDR by 13.5 dB compared with the same system with calibration deactivated. In post-layout simulation, the presented \({\Delta \Sigma }\) converter achieves an SNDR of 67.1 dB over a 9 MHz bandwidth. The total active area is 0.16 mm\(^2\) and the system consumes 6.2 mW from a 1.2 V supply, including a 1.2 mW digital calibration core.