NB-IoT devices in reverberation chambers: a comprehensive uncertainty analysis

Abstract
New protocols related to Internet-of-things applications may introduce previously unnoticed measurement effects in reverberation chambers (RCs) because of their narrowband nature. Such technologies also require less loading to meet the coherence-bandwidth conditions, which may lead to higher variations, and hence higher uncertainties, across the channel. In this work, we extend a previous study of uncertainty in NB-IoT and CAT-M1 device measurements in RCs by providing, for the first time, a comprehensive uncertainty analysis of the components related to the reference and device-under-test (DUT) measurements. By use of a significance test, we show that certain components of uncertainty become more dominant for such narrowband protocols and cannot be considered negligible, as they are in current standardized test methods. We show that the uncertainty, if not computed with the extended formulation, will be greatly overestimated, which could lead to non-compliance with standards.


Introduction
The use of Internet-of-things (IoT) or machine-to-machine (M2M) applications is gaining popularity to meet demands such as improved indoor coverage, increased reconfigurability, and mobility, that are required for 5G and beyond [1,2]. Many of these devices will work in the FR1, or sub-6 GHz, bands using protocols such as narrowband IoT (NB-IoT) and CAT-M1 (or LTE-M) [1][2][3].
The performance of these cellular devices is often studied with over-the-air (OTA) tests by metrics such as Total Isotropic Sensitivity (TIS) and Total Radiated Power (TRP) [4][5][6][7][8][9]. These tests can be carried out either in an anechoic chamber (AC) or a reverberation chamber (RC). An RC is a large metal cavity, with one or more mode-stirring mechanisms to produce, on average, a uniform distribution of the fields, and can often produce faster, lower-cost, or more flexibly configurable measurements than an AC [4]. This makes an RC an excellent candidate for testing IoT devices when directional information is not required.
RCs have been researched extensively and were shown to be suitable for TIS measurements on earlier-generation protocols, such as W-CDMA (4 MHz channel bandwidth) [4][5][6][7][8][9]. However, for NB-IoT, we expect additional challenges due to the narrowband nature of this protocol (180 kHz channel bandwidth). Traditionally, to provide accurate results, a wideband RC reference measurement is averaged over frequency in post processing to match the bandwidth of the modulated signal. Such frequency averaging has the added benefit of resulting in a low-uncertainty estimate of the chamber loss. When the frequency response is averaged over a narrow bandwidth, however, the chamber-loss estimate is more sensitive to peaks and nulls in the RC's frequency response for the individual mode-stirring samples, which may increase uncertainty.
Multiple works have studied uncertainty effects in loaded RCs for wireless-device testing for wideband protocols [10][11][12][13][14], but little research has been published on uncertainty in loaded RCs for narrowband protocols [15]. As [10,13] show, for the wideband (4 MHz channel bandwidth) protocols, the uncertainty budget contains many contributing components, but generally, the biggest contributor is the chamber lack of spatial uniformity due to loading. This component can be estimated by measuring the standard deviation between independent realizations of the stepped mode-stirring sequence [10]. This method, as is advised in current standardized test methods, deems differences within an independent realization as negligible. In a previous work, we showed with preliminary results that larger variations occur within such an independent realization due to the low averaging bandwidths of narrowband protocols, as compared to wideband protocols, and that they cannot be considered as negligible [16].
In this paper, we extend the previous work by providing a comprehensive uncertainty analysis that includes all components of uncertainty discussed in [17,18] to obtain the total expanded uncertainty. We also present a more extensive chamber characterization and more extensive results for the uncertainty and the significance test over multiple bands. These results show that a formulation that accounts for both the uncertainty between independent realizations of a given mode-stirring sequence and the uncertainty within an independent realization should be used, rather than a formulation that accounts only for the between uncertainty. Using the latter, the user may greatly overestimate the uncertainty of the measurement system, as we will show. We base the majority of our analyses on the Test Plan for Wireless Large-Form-Factor Device Over-The-Air Performance [17] by the CTIA, an organization that provides test plans for wireless-device OTA testing and is planning to provide such a test plan for NB-IoT. This work aims to aid in that goal.
In section "TIS measurement procedure", we introduce the current standardized procedure for performing TIS measurements in RCs. In section "Significance test", we describe the theory of the significance test, where we show with measurement results that the formulation used in current standardized wideband test methods should be changed for NB-IoT. In section "Uncertainty analysis", we show the expanded uncertainties using that formulation, which are 1.26 and 1.14 dB for an NB-IoT and CAT-M1 device, respectively, operating in the Cellular Band 2. The work is concluded in section "Conclusion".

TIS measurement procedure
TIS is a measure of the minimum received power that a device can accept without incurring an unacceptably low throughput or an unacceptably high error rate for a certain protocol. An illustration of a typical RC setup for a TIS measurement is shown in Fig. 1. The measurement procedure is as follows: A wireless link is established between a base-station emulator (BSE) and a device under test (DUT), where the BSE transmits a signal at decreasing power levels at the downlink frequency, and measures the DUT's reported throughput or error rate at the uplink frequency. Per the CTIA test plan [17], TIS measurements are performed using data throughput as the measurement metric. The TIS for the NB-IoT and CAT-M1 protocols corresponds to the minimum downlink power required to provide a data throughput rate greater than or equal to 95% of the maximum throughput of the reference measurement channel. We start the BSE at a high power level and, as long as the throughput stays at or above this threshold, step the power down until the throughput drops below the threshold, which gives the minimum power for each mode-stirring sample. This process is repeated for every sample in the stepped mode-stirring sequence, and the results are averaged over all mode-stirring samples to obtain TIS [17].
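The step-down search described above can be sketched as follows. This is a minimal illustration, not the CTIA procedure itself: the function names, the starting power, the step size, and the toy link model are all hypothetical, and the final averaging over mode-stirring samples is left out because it depends on the TIS expression in [17].

```python
from typing import Callable, List


def search_sensitivity_dbm(
    throughput_fraction: Callable[[float], float],
    start_dbm: float = -60.0,   # hypothetical starting power
    step_db: float = 0.5,       # hypothetical search step
    floor_dbm: float = -140.0,  # hypothetical lower limit of the search
    threshold: float = 0.95,    # 95% of maximum throughput, as in the text
) -> float:
    """Return the lowest power (dBm) at which throughput is still >= threshold."""
    power = start_dbm
    if throughput_fraction(power) < threshold:
        raise ValueError("link does not meet the threshold at the starting power")
    # Keep stepping down while the next lower level still meets the threshold.
    while (power - step_db >= floor_dbm
           and throughput_fraction(power - step_db) >= threshold):
        power -= step_db
    return power


def sweep_sequence(links: List[Callable[[float], float]]) -> List[float]:
    """Run the search once per mode-stirring sample."""
    return [search_sensitivity_dbm(link) for link in links]


def make_toy_link(cutoff_dbm: float) -> Callable[[float], float]:
    """Toy stand-in for a DUT: full throughput at or above a cutoff power."""
    return lambda p_dbm: 1.0 if p_dbm >= cutoff_dbm else 0.0


# Three toy mode-stirring samples with different cutoffs.
per_sample_dbm = sweep_sequence(
    [make_toy_link(c) for c in (-110.0, -112.5, -108.0)]
)
```

In a real measurement the callable would wrap a BSE query for the DUT's reported throughput at the requested downlink power.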
Usually, we need to load the chamber by adding RF absorbers to flatten the RC's frequency response allowing us to keep the communication link between the BSE and the DUT while measuring TIS. This is due to the fact that, in an unloaded chamber, the frequency selectivity is usually too high for the DUT's equalizers. Loading increases frequency correlation and reduces spatial uniformity, which may increase uncertainty if not compensated for using position stirring with, for example, a turntable as shown in Fig. 1 [19]. The amount of loading necessary can be determined from the coherence bandwidth (CBW), defined as the average bandwidth over which the frequency samples have a minimum specified level of correlation [19]. In general, the CBW needs to be wider than the channel bandwidth to maintain the link [17].
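A CBW estimate along these lines can be sketched as below. The sketch assumes that the CBW is taken as the two-sided width of the frequency autocorrelation of the complex transmission coefficient at the chosen threshold, and that the autocorrelation is averaged over mode-stirring samples without removing any unstirred component; both choices, and all function names, are assumptions rather than the definition used in [17,19].

```python
from typing import List, Optional, Sequence


def frequency_autocorrelation(
    s21: Sequence[Sequence[complex]], max_lag: int
) -> List[float]:
    """Normalized autocorrelation magnitude versus frequency lag.

    s21[m][f] is the complex transmission coefficient of mode-stirring
    sample m at frequency index f. The result is averaged over samples.
    """
    n_samples = len(s21)
    n_freq = len(s21[0])
    acf = []
    for lag in range(max_lag + 1):
        total = 0j
        for row in s21:
            for f in range(n_freq - lag):
                total += row[f + lag] * row[f].conjugate()
        acf.append(total / (n_samples * (n_freq - lag)))
    return [abs(v) / abs(acf[0]) for v in acf]


def coherence_bandwidth_hz(
    s21: Sequence[Sequence[complex]],
    df_hz: float,
    threshold: float = 0.5,
    max_lag: Optional[int] = None,
) -> float:
    """Two-sided width (Hz) over which the correlation stays >= threshold."""
    if max_lag is None:
        max_lag = len(s21[0]) - 1
    rho = frequency_autocorrelation(s21, max_lag)
    k = 0
    while k + 1 <= max_lag and rho[k + 1] >= threshold:
        k += 1
    return 2 * k * df_hz


# A perfectly flat response stays correlated over the whole span ...
flat = [[1 + 0j] * 11 for _ in range(3)]
# ... while these two samples decorrelate after a single frequency step.
selective = [[1 + 0j] * 4, [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]]
cbw_flat = coherence_bandwidth_hz(flat, df_hz=1e3)
cbw_selective = coherence_bandwidth_hz(selective, df_hz=1e3)
```

The estimate is limited by the measured span: a response that never drops below the threshold returns twice the largest lag, not the true CBW.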
In the CTIA Test Plan for Wireless Large-Form-Factor Device Over-the-Air Performance [17], TIS is calculated from (1), where $P_{\mathrm{TIS}}$ is the total isotropic sensitivity in W and $\eta^{\mathrm{tot}}_{\mathrm{meas}}$ is the total efficiency of the measurement antenna (see Fig. 1). $G_{\mathrm{cable}}$ is the cable loss between the measurement antenna and the BSE, $P_{\mathrm{BSE}}(m)$ is the minimum received power measured by the BSE at the threshold throughput in W for mode-stirring sample $m$, and $\langle \cdot \rangle_M$ is an ensemble average over the total number of mode-stirring samples $M$. $G_{\mathrm{ref}}$ is the chamber transfer function given by (2) [17,19], where $\eta^{\mathrm{tot}}_{\mathrm{ref}}$ is the total efficiency of the reference antenna (not shown in Fig. 1) and $\langle \cdot \rangle_F$ is an ensemble average over $F$ frequencies across the channel bandwidth. $G_{\mathrm{ref}}$ is frequency averaged over the same bandwidth as the DUT channel being measured. The uncertainty in all metrics introduced in (1) and (2) should be taken into account in a comprehensive uncertainty analysis, as we will discuss in section "Uncertainty analysis".

Significance test
In this section, we perform a significance test to determine what formulation should be used to estimate uncertainty in both the reference and the DUT measurements.

Theory
The current CTIA formulation for RC-induced uncertainty is based on the concept that the lack of spatial uniformity is the dominant component of uncertainty, since chambers are typically loaded for the widest channel bandwidth to be tested, which is often 4 MHz [5,17,19]. This type of uncertainty is calculated from the variation between different independent realizations of the stepped mode-stirring sequence, which assumes uncertainty due to variations within an independent realization to be negligible. However, for a narrow channel bandwidth, such as that of NB-IoT, this is not the case. To illustrate this, we perform a significance test as described in detail in [10] to determine which uncertainty should be used:
(1) Only the variation between independent realizations of the mode-stirring sequence (lack of spatial uniformity) is dominant. This is the current CTIA formulation.
(2) Both the variation due to the number of samples within the mode-stirring sequence and the variation between independent realizations of the mode-stirring sequence are included. This is the formulation we propose for NB-IoT and CAT-M1 measurements.
The "significance" is determined in an F-test [10,20] and is defined as the ratio of the variance between independent realizations to the variance within them. The significance is compared to a threshold derived from the 95th percentile of an F-distribution with $N_B - 1$ and $N_B(N_W - 1)$ degrees of freedom, respectively, where $N_B$ is the number of independent realizations and $N_W$ is the number of mode-stirring samples within each realization. The 95th percentile corresponds to a 95% confidence level. If only the between differences dominate (Formulation 1), the uncertainty is given by (3). If within and between differences are both significant (Formulation 2), the uncertainty is given by (4) [10]. Note that Formulation 2 yields a lower uncertainty estimate due to the number of samples and degrees of freedom in the denominator [20]. Physically, this means that when the lack of spatial uniformity dominates the uncertainty, the uncertainty can be significantly higher unless the stirring sequence includes a large number of spatial-stirring samples [10]. In the comprehensive uncertainty analysis, both the uncertainty in the reference measurement and the uncertainty in the DUT measurement are taken into account, where $u^2_{\mathrm{DUT}} = N_B u^2_{\mathrm{ref}}$, with $N_B$ the number of independent realizations in the reference measurement, since typically $N_B = 1$ for the DUT [17]. This is because a test lab typically performs a single measurement of each device. Next, we discuss the measurement setup for estimating $G_{\mathrm{ref}}$, such that we can calculate the significance and the uncertainty using both formulations. We applied the significance test to these measurements, with the results described in section "Results".
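The test and the two candidate formulations can be sketched as follows. The F-statistic is the usual one-way analysis-of-variance ratio with the degrees of freedom stated above. The two uncertainty functions are assumptions: they are standard-error-of-the-mean expressions chosen to match the verbal description, and the exact expressions are those of [10], which are not reproduced here. All function names are hypothetical, and the threshold is passed in rather than computed (for example, it could be obtained from `scipy.stats.f.ppf(0.95, n_b - 1, n_b * (n_w - 1))`).

```python
from math import sqrt
from statistics import fmean
from typing import Sequence

Matrix = Sequence[Sequence[float]]  # g[i][j]: realization i, stirring sample j


def significance(g: Matrix) -> float:
    """Between-realization variance divided by within-realization variance."""
    n_b, n_w = len(g), len(g[0])
    means = [fmean(row) for row in g]
    grand = fmean(means)
    ms_between = n_w * sum((m - grand) ** 2 for m in means) / (n_b - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(g, means) for x in row
    ) / (n_b * (n_w - 1))
    return ms_between / ms_within


def u_between_only(g: Matrix) -> float:
    """Assumed form of Formulation 1: spread of the realization means."""
    n_b = len(g)
    means = [fmean(row) for row in g]
    grand = fmean(means)
    return sqrt(sum((m - grand) ** 2 for m in means) / (n_b * (n_b - 1)))


def u_within_and_between(g: Matrix) -> float:
    """Assumed form of Formulation 2: spread of all samples pooled together."""
    samples = [x for row in g for x in row]
    n = len(samples)
    grand = fmean(samples)
    return sqrt(sum((x - grand) ** 2 for x in samples) / (n * (n - 1)))


def choose_formulation(f_stat: float, f_threshold: float) -> int:
    """Formulation 1 if the between variation is significant, else 2."""
    return 1 if f_stat > f_threshold else 2


# Toy data: two realizations of three samples each (not measured values).
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
f_toy = significance(toy)
u1_toy = u_between_only(toy)
u2_toy = u_within_and_between(toy)
```

For this toy data the pooled estimate is smaller than the between-only estimate, which is the behavior the text attributes to the larger number of samples in the denominator.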

Measurement setup and mode-stirring sequence
Measurements were carried out in a 4.6 m × 3.1 m × 2.8 m RC at the National Institute of Standards and Technology (NIST), as shown in Fig. 2, which has one paddle as a mode-stirring mechanism and a turntable and height translation for position stirring. A vector network analyzer (VNA) was used in all measurements, with an IF bandwidth setting of 1 kHz, a source power of −8 dBm, and a 1 kHz frequency spacing. We focus on three different sub-bands of the Cellular NB-IoT Band 2, each with a 10 MHz bandwidth, centered at 1930, 1960, and 1990 MHz. All results are shown for the frequency-averaging bandwidths of both narrowband protocols, NB-IoT (180 kHz) and CAT-M1 (1.4 MHz), and for one of 2 MHz, to study a more wideband protocol. We averaged all transmission-coefficient results over these three bandwidths, from which we computed $G_{\mathrm{ref}}$ and the significance. To investigate the effects of loading, we used one measurement setup with "light loading" (two absorbers) and one with "heavy loading" (eight absorbers), resulting in CBW values of 1.5 MHz and 3.3 MHz, respectively. We calculated the CBW with a threshold of 0.5 [14,17]. The measurement setup with eight absorbers is shown in Fig. 2. The unloaded CBW is on the order of 500 kHz, which is larger than the channel bandwidth of NB-IoT. However, for this large chamber, we always introduce some loading to minimize the potential for a large amount of constructive interference damaging the DUTs. Even a small amount of RF absorber dampens the modes sufficiently to prevent such damage. We used two low-loss broadband antennas for the reference measurement, where the calibration reference plane was specified at the connectors of the antennas using an N-type electronic calibration module. We obtained the $G_{\mathrm{ref}}$ estimate from a transmission-coefficient measurement between two antennas (as discussed in [19]), where the second antenna was replaced with the DUT for the DUT measurement.
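The frequency averaging can be illustrated with a simple running (boxcar) average that keeps only full windows. The window length in points is an assumption here: with 1 kHz spacing, a 180 kHz window is taken to span 181 points, counting both end points. The function names are hypothetical.

```python
from typing import List, Sequence


def window_points(bandwidth_hz: float, spacing_hz: float) -> int:
    """Number of frequency points spanning the bandwidth, end points included."""
    return int(round(bandwidth_hz / spacing_hz)) + 1


def running_average(values: Sequence[float], window: int) -> List[float]:
    """Boxcar average over full windows only.

    The output has len(values) - window + 1 points, so wider windows lose
    more points at the band edges.
    """
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    acc = sum(values[:window])
    out = [acc / window]
    for i in range(window, len(values)):
        acc += values[i] - values[i - window]
        out.append(acc / window)
    return out


# Window lengths for the three averaging bandwidths at 1 kHz spacing.
nb_iot_pts = window_points(180e3, 1e3)
cat_m1_pts = window_points(1.4e6, 1e3)
wide_pts = window_points(2e6, 1e3)
# Toy example on five samples of |S21|^2 (not measured data).
smoothed = running_average([1, 2, 3, 4, 5], 3)
```

Because only full windows are kept, a 2 MHz window discards roughly ten times as many edge points as a 180 kHz window on the same sweep.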
By subsetting all of the mode-stirring samples, we acquired six independent realizations (IRs) ($N_B = 6$), each containing 120 stepped mode-stirring samples ($N_W = 120$) within the mode-stirring sequence, obtained from eight paddle angles and 15 turntable angles with 45° and 24° angle spacing, respectively, as shown in Table 1. IR1-3 and IR4-6 were measured at antenna heights of 0.3 and 1.3 m, respectively, where IRs with the same height have different paddle-angle offsets, as shown in Table 1. To confirm low correlation between samples, we performed a linear autocorrelation test of the data within each of the independent realizations and a Pearson cross-correlation test of the data between all independent realizations. For both cases, we show the worst-case scenario, which is, according to the data, a heavily loaded case. The within correlation is shown in Fig. 3, which shows the normalized correlation value versus lag for lag-shifted copies of the entire sequence correlated with itself. The peak correlation value of 1 at 0 sample lag occurs because the exact same two arrays are being compared. Shifting the sequence by one sample relative to itself, in either direction, drops the correlation value below the 0.3 threshold [17,21], as shown in Fig. 3, verifying that the samples are independent. The correlation between independent realizations for Band 1 is shown in Table 2. A few cases slightly exceed the 0.3 threshold. These cases are underlined in Table 2. However, since the loading is much higher than required for NB-IoT and CAT-M1, this is not expected to influence the final results significantly. In the CBW = 1.5 MHz case, which we use in the final uncertainty budget, all correlations between independent realizations are below the threshold.
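The stepped sequence and the two correlation checks can be sketched as below. The grid of eight paddle and 15 turntable angles follows the text; the paddle offset is a free parameter because the offsets of Table 1 are not reproduced here, and the function names and toy data are hypothetical.

```python
from itertools import product
from math import sqrt
from typing import List, Sequence, Tuple


def stirring_sequence(
    n_paddle: int = 8, n_turntable: int = 15, paddle_offset_deg: float = 0.0
) -> List[Tuple[float, float]]:
    """All (paddle angle, turntable angle) pairs of one stepped sequence."""
    paddle_step = 360.0 / n_paddle      # 45 degrees for eight angles
    table_step = 360.0 / n_turntable    # 24 degrees for 15 angles
    paddle = [paddle_offset_deg + i * paddle_step for i in range(n_paddle)]
    table = [j * table_step for j in range(n_turntable)]
    return list(product(paddle, table))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lag_autocorrelation(x: Sequence[float], lag: int) -> float:
    """Linear autocorrelation at a given lag, normalized to 1 at lag 0."""
    mx = sum(x) / len(x)
    d = [a - mx for a in x]
    return sum(d[i] * d[i + lag] for i in range(len(d) - lag)) / sum(
        v * v for v in d
    )


def is_uncorrelated(r: float, threshold: float = 0.3) -> bool:
    """Compare a correlation value against the 0.3 threshold used in the text."""
    return abs(r) < threshold


sequence = stirring_sequence()
# Toy data only, chosen so that the cross-correlation is exactly zero.
r_cross = pearson([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
r_lag0 = lag_autocorrelation([1.0, 2.0, 3.0, 4.0], 0)
```

In practice `lag_autocorrelation` would be evaluated on the 120 samples of one realization, and `pearson` on each pair of realizations.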

Results
Using the significance test, we calculated the percentage of the band over which Formulation 2, (4), should be used. These results are shown in Table 3 for the three bands, the two absorber cases, and the three averaging bandwidths. The results show that Formulation 2 holds for the majority of the band for both the NB-IoT (180 kHz) and CAT-M1 (1.4 MHz) bandwidths, in contrast to the current standardized methods, which use Formulation 1, (3) [17]. For the majority of the results, the between significance increases for a higher CBW, as loading reduces spatial uniformity. This yields larger differences between independent realizations of the mode-stirring sequence and increases the between uncertainty. There are three exceptions in the band centered at 1960 MHz, which are marked in italics in Table 3. These can be attributed to high variations in the significance in combination with a narrow bandwidth, as we will show. In most cases, we also observe an increase in the between significance for higher averaging bandwidths. This is because averaging reduces the peaks and nulls in the frequency response, which significantly reduces the within differences. Four exceptions, which are due to the reduced number of points after averaging, as we will show, are underlined in Table 3.
As the band centered at 1960 MHz has the most exceptions, we show this case in Fig. 4. This figure shows the significance for this band for the two absorber cases and the three averaging bandwidths, together with the threshold given by the 95th percentile of the F-distribution, as discussed earlier in this section. Note that the trend of the significance changes for each averaging bandwidth, as it is based on the ratio between the variances of the between and within samples, which both use $G_{\mathrm{ref}}$ averaged over the channel bandwidth. Since both variances change, the ratio between them, and hence the F-statistic, changes too. Figure 4 shows that a high variation of the significance metric can occur over frequency. Due to the narrow bandwidth used, an exception may occur where the significance metric lies below the threshold for a higher percentage of the band in a higher-loading case, as compared to a lower one, since they were different setups. The underlined exceptions in Table 3 can be attributed to a loss of samples at the edges of the band for increasing averaging bandwidths due to the running-average technique used, as shown in Fig. 4. If peaks in significance that are above the threshold occur at the edge of the band, these will be averaged out, resulting in a lower percentage of between significance in Table 3. This is specifically the case in the band centered at 1960 MHz, where a peak at the lower edge of the band exceeding the threshold is averaged out for the 1.4 and 2 MHz averaging bandwidths, resulting in two exceptions in Table 3. In all cases, these exceptions appeared only when peaks in significance occurred close to the edges of the band. Note that these exceptions do not influence the outcome of which formulation should be used. In general, for all NB-IoT and CAT-M1 cases, between differences do not dominate, and the within uncertainty should be taken into account by using Formulation 2, (4).
Next, we use this formulation for a comprehensive uncertainty analysis and we show the effects of this choice on the measurement uncertainty.

Uncertainty analysis
In this section, we first analyze the uncertainty in the $G_{\mathrm{ref}}$ measurement, using the previously defined formulation. Then, we include the other uncertainty components and present a comprehensive uncertainty analysis.

Combined uncertainty
Using Formulation 2, (4), we can calculate $u^2_{\mathrm{ref}}$ and $u^2_{\mathrm{DUT}}$. Using the root-sum-of-squares (RSS) technique, we can calculate their combined uncertainty, $u_{\mathrm{Combined}}$, normalized to $G_{\mathrm{ref}}$ and expressed in dB through a $10\log_{10}$ conversion [10,17]. The results for all bands are shown in Fig. 5. In the current standard, the user selects the highest value of uncertainty, computed over all frequencies within the band of interest, since, as shown in Fig. 5, uncertainty estimates can change over frequency. Table 4 shows this value for all cases, calculated using both Formulation 1 and 2. It can be clearly seen that Formulation 1, (3), overestimates the uncertainty in all cases. Several other effects related to the loading and averaging bandwidth can be observed in the combined uncertainty results. First, the uncertainty reduces for higher averaging bandwidths, as expected, since averaging reduces the within uncertainty. Second, the maximum uncertainty for the NB-IoT averaging bandwidth is very similar for both loading cases (note that the black curves in Fig. 5 overlay), which is generally not the case in wideband measurements. In Figs 5(b) and 5(c), the maximum uncertainty for the NB-IoT bandwidth is even higher for the lower-loading case. Even with Formulation 1 (see Table 4), the uncertainty does not always increase for increased loading. This all implies that the within uncertainty is more dominant than the between uncertainty, and that loading has relatively little effect on the uncertainty for the NB-IoT bandwidth. A third effect is that the uncertainty is higher for the CAT-M1 and 2 MHz averaging bandwidths in the eight-absorber case, as compared to the two-absorber case. This is as expected, since the within differences become less significant for higher averaging bandwidths, while the between differences become more significant for higher-loading cases (see Table 3).
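A minimal sketch of the combination step is given below. The RSS and the relation between the DUT and reference uncertainties follow the text. The dB conversion is an assumption: the form 10 log10(1 + u / G_ref) is one common convention for a relative uncertainty, and the expression actually used is the one in [10,17]. Function names and numerical inputs are hypothetical.

```python
from math import log10, sqrt


def u_dut_from_ref(u_ref: float, n_b: int) -> float:
    """DUT uncertainty from the reference one, using u_DUT^2 = N_B * u_ref^2."""
    return sqrt(n_b) * u_ref


def rss(*components: float) -> float:
    """Root-sum-of-squares of uncorrelated uncertainty components."""
    return sqrt(sum(c * c for c in components))


def relative_to_db(u: float, g_ref: float) -> float:
    """ASSUMED dB convention: 10*log10(1 + u / G_ref); see [10,17] for the real one."""
    return 10.0 * log10(1.0 + u / g_ref)


def combined_uncertainty_db(u_ref: float, n_b: int, g_ref: float) -> float:
    """Combine reference and DUT uncertainty and normalize to G_ref."""
    return relative_to_db(rss(u_ref, u_dut_from_ref(u_ref, n_b)), g_ref)


# Illustrative linear-scale inputs only (not values from Table 4).
example_db = combined_uncertainty_db(u_ref=0.03, n_b=6, g_ref=1.0)
```

With N_B = 6 the DUT term is six times the reference term in variance, so it dominates the combined value.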

Comprehensive uncertainty analysis
In this subsection, we estimate the uncertainty in the whole measurement, taking into account all metrics in (1). Table 5 shows a summary of all the components of uncertainty related to the measurement, split into two groups. The groups contain contributions to the uncertainty in the reference measurement and the DUT measurement. In this analysis, we used the NB-IoT and CAT-M1 bandwidth results, where CBW = 1.5 MHz, as this is wider than the channel bandwidth of interest, while it does not excessively load the chamber. We based our analysis on the components discussed in [17].

International Journal of Microwave and Wireless Technologies 565
In the contributions to the DUT measurement, we calculated the mismatch between the BSE and the measurement antenna, and the temperature variation in the system using equations provided in [22]. In the calculation for temperature variation, we used a variation of ±3 K, assuming worst-case values as presented in [22,24]. Fixed worst-case standard uncertainties were used for the cable factor, insertion loss, the sensitivity search step size, miscellaneous uncertainty, and the frequency resolution for the TIS measurement. The BSE output level (stability) was extracted from the manufacturer data sheet. In the reference measurement, we extracted the VNA absolute level and level stability from an earlier work that used a similar setup. The uncertainties in the impedance mismatch and cable measurements were calculated using [22], which were, in this case, negligible. We therefore state them as being < 0.01 dB. It should be noted that these are not always negligible. In this measurement setup, the uncertainty due to the cable movements is not considered in the reference measurement, since they are calibrated out. The uncertainty from moving cables due to the movement of the turntable was found negligible due to the use of a rotary joint. We calculated the uncertainty of the radiation efficiency of the reference antenna using [25].
In [17], one component of uncertainty is the chamber "lack of spatial uniformity", which is calculated using Formulation 1 and therefore uses only the between uncertainty. We used Formulation 2, which also includes the within uncertainty, so this component does not contain only uncertainty due to a lack of spatial uniformity. Since we measured multiple bands, we used the maximum uncertainty derived from all bands. It should be noted that the maximum uncertainty value did not vary by more than 0.03 dB between the bands. The same holds for the chamber "lack of spatial uniformity" component of uncertainty in the contributions to the reference measurement. The uncertainty is lower there, since $N_B = 6$ for the reference measurement, while $N_B = 1$ for the DUT measurement.
We estimated the total expanded uncertainty by using an RSS technique on the uncertainties in dB from both groups, according to [17]. We assume all of the components of uncertainty to be uncorrelated and Gaussian distributed here. To cover the uncertainty due to a limited number of samples, we multiplied the result with a coverage factor of 1.96 to obtain a 95% confidence interval [20]. The total expanded uncertainties for NB-IoT and CAT-M1 are 1.26 and 1.14 dB, respectively. For both protocols, the uncertainty lies below the maximum allowed uncertainty for TIS, which is 2.3 dB (2.0 dB for TRP) [22]. It should be noted that, if the formulation taking only between uncertainty into account was used, these values would be 1.75 and 1.45 dB, respectively, which overestimates the uncertainty significantly. If another uncertainty component turns out to be higher than anticipated, this could lead to non-compliance with the standard. This shows the importance of taking both the within and between uncertainty into account.
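The final step can be written as a short calculation: an RSS over the dB-valued components of both groups, followed by multiplication with the coverage factor of 1.96. The component values in the example are placeholders rather than the entries of Table 5, and the function names are hypothetical; the 2.3 dB limit and the reported totals are taken from the text.

```python
from math import sqrt
from typing import Sequence

TIS_LIMIT_DB = 2.3  # maximum allowed expanded uncertainty for TIS, per the text


def expanded_uncertainty_db(
    ref_components_db: Sequence[float],
    dut_components_db: Sequence[float],
    coverage_factor: float = 1.96,
) -> float:
    """RSS of all (assumed uncorrelated) components times the coverage factor."""
    combined = sqrt(
        sum(c * c for c in ref_components_db)
        + sum(c * c for c in dut_components_db)
    )
    return coverage_factor * combined


def margin_db(expanded_db: float, limit_db: float = TIS_LIMIT_DB) -> float:
    """Headroom below the allowed limit; negative means non-compliance."""
    return limit_db - expanded_db


# Placeholder components (dB), NOT the values of Table 5.
toy_expanded = expanded_uncertainty_db([0.3], [0.4])

# Headroom for the NB-IoT totals reported in the text.
margin_both = margin_db(1.26)          # within and between uncertainty
margin_between_only = margin_db(1.75)  # between uncertainty only
```

The headroom roughly halves when only the between uncertainty is used, which is why an unexpectedly large additional component could then push the budget past the limit.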

Conclusion
In this paper, we presented for the first time a comprehensive uncertainty analysis of NB-IoT and CAT-M1 measurements of TIS in an RC. We performed a significance test and analyzed the results using three different NB-IoT bands and multiple CBW cases. Using the outcome of the test, we showed that a formulation that takes both within and between uncertainty into account should be used to calculate the uncertainty in the reference and DUT measurements, in contrast to current standardized test methods, which only use the between uncertainty. This is due to the narrowband nature of these protocols, which greatly increases the uncertainty within an independent realization. This type of uncertainty has been considered negligible up until now. For the results shown here, use of the between formulation overestimates the total expanded uncertainty by approximately 0.5 dB for NB-IoT (1.75 dB versus 1.26 dB) and 0.3 dB for CAT-M1 (1.45 dB versus 1.14 dB). This could lead to non-compliance with the standard, so it is critical that engineers take it into account.