Live processing of momentum-resolved STEM data for first moment imaging and ptychography

A reformulated implementation of single-sideband ptychography enables analysis and display of live detector data streams in 4D scanning transmission electron microscopy (STEM) using the LiberTEM open-source platform. This is combined with live first moment and further virtual STEM detector analysis. Processing of both real experimental and simulated data shows the characteristics of this method when data is processed progressively, as opposed to the usual offline processing of a complete dataset. In particular, the single-sideband method is compared to other techniques such as the extended ptychographical iterative engine (ePIE) in order to ascertain its capability for structural imaging at increased specimen thickness. Qualitatively interpretable live results are also obtained if the sample is moved or the magnification is changed during the analysis. This allows live optimization of instrument as well as specimen parameters during the analysis. The methodology is especially expected to improve contrast- and dose-efficient in-situ imaging of weakly scattering specimens, where fast live feedback during the experiment is required.


Introduction
The development of ultrafast cameras for transmission electron microscopy (TEM) such as the pnCCD (Müller et al., 2012; Ryll et al., 2016), the Medipix3 chip (Placke et al., 2013), delay-line detectors (Müller-Caspary et al., 2015; Oelsner et al., 2001) or the EMPAD (Tate et al., 2016) enabled the collection of the full diffraction space up to a flexible cut-off spatial frequency at each scan point in scanning TEM (STEM). This technology paved the way for momentum-resolved STEM techniques with high sampling in both real and diffraction space, sometimes referred to as 4D-STEM. Acquisitions with a detector frame rate of several kHz are currently achieved by employing these cameras. In particular, the mapping of electric fields and charge densities down to the atomic scale (Müller et al., 2014), meso-scale strain and electric field measurements by nano-beam electron diffraction (Müller-Caspary et al., 2015), and, furthermore, electron ptychography (Hoppe, 1969; Hegerl & Hoppe, 1970) have been enabled by this dramatic detector speed enhancement (Nellist et al., 1995; Rodenburg & Bates, 1992; Rodenburg et al., 1993; Jiang et al., 2018; Humphry et al., 2012).
Due to its excellent dose efficiency (Zhou et al., 2020), STEM ptychography as a method to retrieve the complex object transmission function has gained increasing interest. Four-dimensional data sets combining real and diffraction space information have been shown to provide enormous flexibility in post-acquisition processing. For example, ptychography has been demonstrated to be capable of both resolution improvement and aberration correction after the acquisition using computational methods (Nellist et al., 1995; Gao et al., 2017). It achieves a better signal-to-noise ratio for weak phase objects than annular bright field or differential phase contrast (Seki et al., 2018) and allows reconstruction at extremely low dose (O'Leary et al., 2020).
The high data rates require an efficient implementation of advanced methods for imaging contrast via post-processing in order to minimise the duration of numerical processing. Ultimately, computing and software implementation capabilities are desirable that allow for the live reconstruction of the ptychographic phase and amplitude, first moments, electric fields, and charge densities, for example.
During experiments a region of interest is usually selected via imaging employing conventional STEM detectors. Weakly scattering and beam-sensitive specimens, where ptychography and first moment imaging (Waddell & Chapman, 1979; Müller et al., 2014) can be most advantageous, generate poor contrast in conventional imaging modes and quickly degrade using typical beam currents in conventional STEM (Peet et al., 2019). 4D-STEM analyses are normally applied after data acquisition and transfer to a data processing workstation. It is therefore not possible to be certain that the selected region and microscope settings were appropriate until after successful reconstruction. For that reason, one often acquires a larger number of data sets, which takes considerable storage space in the case of 4D-STEM. In contrast, a fast implementation of the considered computational methods would allow this data evaluation to be performed live during the experiment.
In this study, we demonstrate 4D-STEM continuous live scanning with simultaneous ptychographic single-sideband (SSB) reconstruction combined with bright field, annular dark field, and first moment imaging, including its divergence, which is proportional to the charge density in thin specimens. To this end, the ptychographic algorithm was first reformulated mathematically. This allows navigation on the sample and change of microscope parameters with a live view of the reconstruction, alongside signals from other 4D-STEM techniques. Secondly, we use SrTiO₃ and In₂Se₃ to demonstrate the live evaluation capability of our approach in experiments by in-situ processing of the data stream of an ultrafast camera. Thirdly, the results obtained live are validated by conventional post-processing. Particular attention is drawn to the capability of SSB to provide reliable structural images, to reconstruction artefacts arising from processing partial scans, and to reconstructing non-periodic objects. Moreover, the performances of SSB ptychography and the extended ptychographical iterative engine (ePIE) are compared, interestingly pointing towards SSB being significantly more robust against dynamical scattering in terms of qualitative structural imaging. This article closes with a detailed discussion and a summary.

Live imaging
A continuous live scanning display for large-scale 4D-STEM data benefits greatly from a data processing method where smaller portions of input data are processed independently and merged progressively into the complete result. For virtual detectors and first moments, often referred to as "centre-of-mass (COM)", this is trivial to achieve since each detector frame, i.e., diffraction pattern, produces an independent entry in the final data set, which allows frames to be processed individually, accumulating results in a buffer. Displaying the contents of this buffer at regular intervals provides a live-updating view.
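The frame-by-frame accumulation described above can be illustrated with a minimal NumPy sketch. It is not the LiberTEM implementation; the function name `frame_signals`, the synthetic Poisson data, the disc-shaped mask and the chunk size of 10 frames are all assumptions made for illustration.

```python
import numpy as np

def frame_signals(frames, mask):
    """Per-frame virtual detector signal and first moment (COM) in pixel units."""
    qy, qx = np.indices(frames.shape[1:])
    total = frames.sum(axis=(1, 2))
    virtual = (frames * mask).sum(axis=(1, 2))
    com_y = (frames * qy).sum(axis=(1, 2)) / total
    com_x = (frames * qx).sum(axis=(1, 2)) / total
    return virtual, com_y, com_x

rng = np.random.default_rng(0)
scan_shape, det_shape = (8, 8), (16, 16)
# Synthetic stand-in for a detector stream: one frame per scan point
data = rng.poisson(5.0, size=(scan_shape[0] * scan_shape[1], *det_shape)).astype(float)
# Hypothetical bright-field disc of radius 5 px around the detector centre
mask = np.hypot(*(np.indices(det_shape) - 7.5)) < 5

# Result buffers, one entry per scan point; NaN marks points not yet scanned
buffers = {name: np.full(data.shape[0], np.nan) for name in ("bf", "com_y", "com_x")}
for start in range(0, data.shape[0], 10):
    chunk = slice(start, start + 10)
    bf, com_y, com_x = frame_signals(data[chunk], mask)
    buffers["bf"][chunk] = bf
    buffers["com_y"][chunk] = com_y
    buffers["com_x"][chunk] = com_x
    # A live display would redraw buffers["bf"].reshape(scan_shape) at this point
```

Because every frame only writes its own buffer entry, the chunk size and the order of the chunks do not affect the final result.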
In contrast, ptychography generates results by putting detector frames from the entire data set, or at least a local environment, in relation to each other. Consequently, it is necessary to adapt the methodology so as to circumvent the processing of the entire data set at once, gradually merging partial results extracted from portions of the input data into the complete result. As previously demonstrated by Rodenburg & Bates (1992), Rodenburg et al. (1993) and Pennycook et al. (2015), both phase and amplitude information can be extracted from a set of diffraction patterns (usually restricted to the Ronchigram region) by performing the Fourier transform of the four-dimensional data set with respect to the scan raster and reordering the dimensions. Depending on the model presumed for the interaction between specimen and incident STEM probe, the direct inversion of the data can either be done by Wigner Distribution Deconvolution (WDD) or the SSB ptychography scheme (Rodenburg et al., 1993). Whereas WDD is based on a single interaction with an arbitrary complex object transmission function and is capable of separating specimen and probe, the weak phase approximation governs the SSB approach. Here, the specimen exit wave at a given scan point can be expressed by a multiplication of the probe wave function $\psi_\mathrm{probe}(\vec{r})$ with the first-order Taylor expansion of a phase object with phase distribution $\Phi(\vec{r})$:

$$\psi_\mathrm{exit}(\vec{r}) = \psi_\mathrm{probe}(\vec{r}) \, \left[ 1 + i \Phi(\vec{r}) \right]. \tag{1}$$

It is hence limited to weakly scattering ultrathin objects, e.g. thin light matter investigated with relativistic electrons. Any quantitative interpretation of SSB reconstructions of real data should, therefore, be examined critically since eq. (1) breaks down quickly with increasing thickness. Its capability of direct, dose-efficient phase recovery makes it nevertheless attractive for the in situ qualitative assessment of specimen and imaging conditions. A more advanced ptychography scheme can still be applied after recording.
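How quickly the first-order Taylor expansion of a phase object departs from the full phase factor can be checked numerically. The following sketch uses hypothetical phase values chosen only for illustration and compares $\exp(i\Phi)$ with $1 + i\Phi$; the deviation grows roughly as $\Phi^2/2$ for small phases.

```python
import numpy as np

phases = np.array([0.05, 0.2, 0.5, 1.0])   # hypothetical phase shifts in rad
exact = np.exp(1j * phases)                 # full phase object exp(i Phi)
weak = 1 + 1j * phases                      # first-order Taylor expansion
error = np.abs(exact - weak)                # modulus of the neglected terms

for phi, err in zip(phases, error):
    print(f"Phi = {phi:4.2f} rad: |exp(i Phi) - (1 + i Phi)| = {err:.4f}")
```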
Because a data point in the final reconstruction depends on all recorded scan points, the reconstruction is only accurate if it is applied to the full 4D data.
We refer to original work for a derivation of the conventional SSB methodology (Rodenburg & Bates, 1992; Pennycook et al., 2015) and give a concise summary here. The first processing step consists of Fourier transforming the 4D data cube with respect to the scan coordinate, translating the scan coordinate in real space to spatial frequencies $\vec{Q}$ sampled by the scanning probe. This new 4D data cube can be ordered such that each scan spatial frequency defines one plane. By employing the weak phase object approximation in eq. (1), Rodenburg et al. have shown analytically that the data in each plane is described by three discs of the size of the probe-forming aperture, positioned at the origin and as Friedel pairs at the positions defined by the spatial frequency vector $\vec{Q}$. Importantly, double overlaps as depicted schematically in Fig. 1 contain the complex Fourier coefficients of $i\Phi$, potentially affected by aberrations of the probe-forming system. The double overlap regions are often referred to colloquially as trotters.

Fig. 1. Double overlap regions in the planes of the Fourier transformed 4D data cube are defined by three circles of the size of the probe-forming aperture. The distance $Q$ between the circles is determined by the currently considered spatial frequency of the scan raster and defines the spatial frequency of $\Phi$ reconstructed in this plane of the data cube. Potential triple overlaps for small $Q$ need to be excluded.
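The geometry of the double overlap regions can be sketched with boolean masks on a pixel grid. This is an illustrative construction under simplifying assumptions (ideal circular aperture, shift given directly in detector pixels); the function name `double_overlap_masks` and all numerical values are hypothetical and not taken from the LiberTEM code.

```python
import numpy as np

def double_overlap_masks(shape, center, radius, q):
    """Boolean masks of both double-overlap regions for a disc shift q = (qy, qx) in pixels."""
    yy, xx = np.indices(shape)

    def disc(cy, cx):
        return (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

    zero = disc(*center)                                 # disc at the origin
    plus = disc(center[0] + q[0], center[1] + q[1])      # disc shifted by +Q
    minus = disc(center[0] - q[0], center[1] - q[1])     # disc shifted by -Q
    # Overlap of the central disc with exactly one shifted disc; triple overlaps are excluded
    return zero & plus & ~minus, zero & minus & ~plus

# Intermediate frequency: two separate double-overlap regions
pos, neg = double_overlap_masks((64, 64), (32, 32), 20, (0, 30))
# Small frequency: all three discs overlap, and the triple overlap is removed
tri_pos, tri_neg = double_overlap_masks((64, 64), (32, 32), 20, (0, 10))
# Shift larger than twice the radius: no overlap, nothing to reconstruct
far_pos, far_neg = double_overlap_masks((64, 64), (32, 32), 20, (0, 45))
```

Counting the `True` entries of a mask, e.g. `pos.sum()`, gives the region area in pixels, which could serve to skip regions that are too small to be useful.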
The ptychographic SSB reconstruction is a linear function of the input data since the result is obtained with a sequence of linear transformations, such as Fourier transforms, element-wise multiplication, and summation (Pennycook et al., 2015). Such linear functions are particularly suitable for incremental processing since they are additive. Mathematically, the complete input data can be understood as the sum of smaller individual data portions that are padded with zeros to fill the shape of the complete data set. Additive functions allow the complete result to be calculated by accumulating the processing results of zero-padded portions in any subdivision and order. Furthermore, intermediate results can be extracted at any desired stage from the yet incomplete sum of results.
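The additivity argument can be verified with a few lines of NumPy. In this sketch a 2D FFT stands in for the full linear reconstruction, and the array `signal`, its shape, and the chosen subdivision are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=(6, 8))   # stand-in for one detector pixel over the scan raster

def linear_op(a):
    # Any linear map would do; a 2D FFT over the scan axes is the one relevant for SSB
    return np.fft.fft2(a)

accumulated = np.zeros(signal.shape, dtype=complex)
for rows in (slice(4, 6), slice(0, 1), slice(1, 4)):   # arbitrary subdivision and order
    padded = np.zeros_like(signal)
    padded[rows] = signal[rows]         # zero-padded portion of the input
    accumulated += linear_op(padded)    # accumulate the partial results

# The accumulated partial results agree with processing the complete input at once
complete = linear_op(signal)
```

Each pass of the loop transforms a full-size, mostly empty array, which is the inefficiency that motivates a reformulation without zero-padding.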
However, directly processing zero-padded data this way is very inefficient for computationally demanding algorithms such as ptychography employing large 4D-STEM data, because the processing effort and memory consumption are amplified by the number of subdivisions. For that reason, the algorithm should be reformulated to process smaller input data portions without zero-padding. The additivity and homogeneity of linear functions give ample freedom to restructure the underlying data processing flow towards this goal, allowing development of mathematically equivalent implementations that are optimized for live imaging.
In the particular case of SSB ptychography, individual spatial frequencies of the result $\Phi(\vec{r})$ are extracted from spatial frequencies of signals at specific scattering angle ranges within the double overlaps (Rodenburg et al., 1993; Pennycook et al., 2015) introduced in Fig. 1.
Suppose we have the intensity of diffraction patterns present as four-dimensional data, which is written as $D_{xy} \in \mathbb{R}^{m \times n}$ where $x \in [s_x]$, $y \in [s_y]$ and the notation $[s_x]$ represents the set of natural numbers not exceeding $s_x - 1$. That is, $[s_x] := \{0, 1, \ldots, s_x - 1\}$ with the total number of elements $s_x$. The indices $x, y$ represent the scan position index, with the total number of scanning steps being $s_x$ and $s_y$ in each direction. Each matrix $D_{xy}$ is a diffraction pattern with elements $d^{xy}_{pq}$, where $p$ and $q$ represent the pixel index on the detector with dimension $m \times n$. Each pixel index $pq$ corresponds to a scattering angle, or equivalently, spatial frequency in the specimen.
We can write the Fourier transform with respect to the scan raster as

$$F_{kl} = \sum_{x \in [s_x]} \sum_{y \in [s_y]} D_{xy} \, e^{-2\pi i \left( \frac{kx}{s_x} + \frac{ly}{s_y} \right)}, \tag{2}$$

where $k$ and $l$ denote the spatial frequencies in the scan dimension, taking the places of $x$ and $y$. Therefore we obtain a four-dimensional dataset in the spatial frequency domain, i.e., $F_{kl} \in \mathbb{C}^{m \times n}$ with elements $f^{kl}_{pq}$. The next step in SSB is to apply a filter $B_{kl} \in \mathbb{C}^{m \times n}$ with elements $b^{kl}_{pq}$ for each tuple of spatial frequencies $kl$ that calculates the weighted average of the spatial frequency signal over specific ranges of pixels, i.e. scattering angles $pq$, that can have positive or negative weight as defined by the trotters. It should be noted that the structure of this filter is different for each tuple of spatial frequencies.
Using (2), we can write the reconstruction in the Fourier domain $p_{kl}$ as

$$p_{kl} = \sum_{p,q} b^{kl}_{pq} f^{kl}_{pq} = \sum_{p,q} b^{kl}_{pq} \sum_{x,y} d^{xy}_{pq} \, e^{-2\pi i \left( \frac{kx}{s_x} + \frac{ly}{s_y} \right)} = \sum_{x,y} e^{-2\pi i \left( \frac{kx}{s_x} + \frac{ly}{s_y} \right)} \sum_{p,q} b^{kl}_{pq} d^{xy}_{pq}. \tag{3}$$

The reformulation is given by interchanging the summation over the detector index with the summation over the scan points, in the frequency and real domain, respectively. This nested summation can be pruned without changing the result by skipping parts that are known to yield zero, for example calculations for empty double overlap regions. Note that $p_{kl}$ essentially represents the Fourier coefficients of $i\Phi_{xy}$ in eq. (1), which are potentially affected by aberrations of the probe-forming system.
The inner part of eq. (3) can be implemented with a matrix product between B and D. The outer sum over x and y can then be sub-divided and reordered to process the input data D incrementally in smaller portions. Numerically, the filter matrix $B_{kl}$ can be stored efficiently as a sparse matrix since the trotters for the highest frequencies are often empty (no overlap), and for many frequencies the double overlap region, where the filter is non-zero, is small.
As an intermediate summary, this formulation translates the SSB scheme to a matrix product of a partial input data matrix with a sparse matrix containing the double overlap regions. Elements of a Fourier transform are then applied to the result of this matrix product. This generates a partial reconstruction result in the frequency domain that covers the entire field of view. The partial reconstructions for all partial input data portions are accumulated in a global buffer using a sum, as described above. For a live view of the reconstruction in the spatial domain, the contents of this buffer can be inversely Fourier transformed at any desired time to show the transmission function of the specimen within the weak phase object approximation.
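The processing flow summarised above can be sketched as follows. This is a conceptual illustration and not the LiberTEM implementation: a random sparse matrix `filt` stands in for the real double overlap filter, the data is synthetic, and the names `partial_result` and `recon_f` as well as all sizes are assumptions. The sketch only demonstrates that accumulating partial results, each obtained from a sparse matrix product followed by elements of a Fourier transform, reproduces the conventional "transform first, filter afterwards" result.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(2)
sx, sy, n_pix = 6, 5, 40          # scan shape and number of (flattened) detector pixels
data = rng.poisson(3.0, size=(sx * sy, n_pix)).astype(float)

# Hypothetical stand-in for the trotter filter: one sparse row of weights per (k, l)
dense_filter = rng.normal(size=(sx * sy, n_pix)) * (rng.random((sx * sy, n_pix)) < 0.1)
filt = sp.csr_matrix(dense_filter)

k = np.repeat(np.arange(sx), sy)  # frequency index k of each flattened (k, l)
l = np.tile(np.arange(sy), sx)    # frequency index l of each flattened (k, l)

def partial_result(flat_idx):
    """Contribution of the scan points in flat_idx to all spatial frequencies."""
    x, y = np.divmod(flat_idx, sy)
    weighted = filt @ data[flat_idx].T   # inner sum over detector pixels (sparse product)
    phase = np.exp(-2j * np.pi * (np.outer(k, x) / sx + np.outer(l, y) / sy))
    return (phase * weighted).sum(axis=1)  # outer sum over the scan points of this portion

recon_f = np.zeros(sx * sy, dtype=complex)   # global buffer in the frequency domain
for part in np.array_split(rng.permutation(sx * sy), 4):
    recon_f += partial_result(part)
    live_view = np.fft.ifft2(recon_f.reshape(sx, sy))  # could be displayed at any time

# Conventional order for comparison: Fourier transform of the full data, then filtering
full_f = np.fft.fft2(data.reshape(sx, sy, n_pix), axes=(0, 1)).reshape(sx * sy, n_pix)
reference = (dense_filter * full_f).sum(axis=1)
```

The scan points are deliberately processed in random order here to emphasise that neither the subdivision nor the order of the portions matters for the final sum.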
We used LiberTEM as a data processing framework since it is optimized for MapReduce-like approaches and designed with live data processing capabilities in mind. LiberTEM user-defined functions (UDFs) provide the application programming interface (API) to efficiently implement operations that follow the described pattern: a method to process a stack of frames that is called repeatedly, task data to store constant data such as the sparse matrix for the double overlaps, result buffers of arbitrary type and shape, and user-defined merging operations to generate a result of arbitrary complexity from partial results. Furthermore, LiberTEM allows a result display to be updated each time a partial result is merged. UDFs for first moment analysis and virtual detectors had already been implemented for offline data analysis.
To run the UDFs on live data, we implemented a prototype live UDF back-end that allows a set of UDFs to be run on data from a Quantum Detectors Merlin for EM Medipix3 detector for electron microscopy (Placke et al., 2013; Quantum Detectors, 2019). It uses multiple CPU cores for decoding the raw data from the detector, and a single GPU for the main processing task, which was sufficient for this application. A production-ready version with support for multiple processing nodes and further enhanced use of multiple CPUs and multiple GPUs, similar to the offline data processing capabilities of LiberTEM, is being designed at the time of writing and will be published as open source as a part of LiberTEM.
For comparison, ePIE (Maiden & Rodenburg, 2009) based ptychographic reconstructions have been performed in post processing as it can reconstruct both the complex object transmission function and the complex illumination, still presuming single interaction of probe and specimen, but considering an arbitrary complex object transmission function. This was done to check whether the straightforward SSB approach compromises the quality of the result compared to more advanced ptychographic schemes. It has to be noted that, as an iterative method, ePIE is not suitable for live imaging in the current realisations.

Material system
We demonstrate live processing using two different specimens. First, indium selenide (In₂Se₃) was used to highlight the advantages of live ptychography and centre of mass in comparison to conventional STEM (Ye et al., 1998). Second, a strontium titanate (SrTiO₃) lamella with the electron beam incident along the [100] axis was used for a more quantitative analysis. The latter is stable and provides good contrast in conventional STEM for comparison and adjustments, is well-characterized, and at the same time can highlight the ability to also resolve the light oxygen columns that are difficult to image with reliable contrast by conventional STEM techniques (Browning et al., 1995). Double overlap regions that were too small, with an area of less than 10 px, were omitted in live processing to avoid the introduction of excessive noise. This can be understood as a band-pass filter applied to exclude the high frequencies close to the transfer limit of SSB ptychography, which corresponds to twice the radius of the probe-forming aperture.

Experimental setup
First moment imaging and ptychography require the correct alignment and calibration of the scan (specimen) coordinate system with respect to the detector coordinate system. The convergence angle of the incident probe was measured with a polycrystalline gold specimen. Employing parallel illumination first, the (111) gold diffraction ring was used to calibrate the diffraction space assuming a lattice constant of gold of 0.4083 nm (Villars & Cenzual, 2016). With the known wavelength, the convergence semi-angle was determined to be 22.1 mrad from a Ronchigram recorded in the same STEM setting as used in the actual experiment. The rotation of the detector coordinate system with respect to the scan axes was determined by minimizing the curl of the first moment vector field and making sure that the divergence of the field is negative at atom positions. Note that, in theory, the curl of purely electrostatic fields should vanish. The pixel size in the scan dimension was taken from the STEM control software during live processing and verified by comparison with the known lattice constant of SrTiO₃. The residual scan distortion, that is, the translation of the diffraction pattern as a whole during scanning, was not compensated for since it turned out to be negligible at the atomic-resolution STEM magnifications used in this analysis.
Data was acquired on a probe-corrected FEI Titan 80-300 STEM (Heggen et al., 2016) operated at 300 kV. The microscope was equipped with a Medipix Merlin for EM detector operated at an acquisition rate for a single diffraction pattern of 1 kHz in continuous mode. The scan size was 128 × 128 scan points and the recorded diffraction patterns had a dimension of 256 × 256 pixels. In addition, the high-angle annular dark field (HAADF) signal was recorded with a Fischione Model 3000 detector covering an angular range of 121.7 mrad to approximately 200 mrad. The upper limit is defined by apertures of the microscope rather than by the outer radius of the detector. The run time for each step of the data processing flow was measured using the line-profiler Python package.
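The arithmetic behind the diffraction-space calibration can be written out as a short sketch. It uses the lattice constant and acceleration voltage quoted in the text together with CODATA constants; the function names and the small-angle approximation $2\theta \approx \lambda / d$ are choices made for this illustration, and no measured ring radius from the experiment is assumed.

```python
import math

H = 6.62607015e-34       # Planck constant in J s
M0 = 9.1093837015e-31    # electron rest mass in kg
E = 1.602176634e-19      # elementary charge in C
C = 299792458.0          # speed of light in m/s

def electron_wavelength(voltage):
    """Relativistic electron wavelength in metres for an acceleration voltage in volts."""
    energy = E * voltage
    return H / math.sqrt(2 * M0 * energy * (1 + energy / (2 * M0 * C ** 2)))

def mrad_per_pixel(ring_radius_px, lattice_constant=0.4083e-9, voltage=300e3):
    """Diffraction-space scale from the measured radius (in pixels) of the gold (111) ring."""
    d_111 = lattice_constant / math.sqrt(3)            # cubic lattice: d = a / sqrt(h^2 + k^2 + l^2)
    ring_angle = electron_wavelength(voltage) / d_111  # small-angle Bragg condition
    return 1e3 * ring_angle / ring_radius_px

wavelength_pm = 1e12 * electron_wavelength(300e3)
ring_angle_mrad = mrad_per_pixel(1.0)   # the ring angle itself, i.e. the scale for a 1 px radius
```

With such a scale, the convergence semi-angle follows from multiplying the Ronchigram radius in pixels by `mrad_per_pixel(...)` evaluated for the measured ring radius.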

Live processing
The performance of our approach is demonstrated in Fig. 2, which depicts (a) the structure of In₂Se₃ together with the projected potential and (b) snapshots of the reconstructed phase in nanosheets of this material using SSB ptychography at different stages of the scan. The reconstruction is performed progressively such that only the diffraction patterns of incoming scan pixels are processed and added to the final reconstruction according to eq. (3), in which all pixels are affected. Whereas current research in the field of materials science explores ultrathin In₂Se₃ as a candidate for 2D ferroelectrics, this specimen was chosen for the methodological development due to its robustness against electron dose. The data in Fig. 2 b was taken from a live video recorded during the STEM session, which is available as supplementary material online.
In general, the quality of the ptychographic reconstruction strongly depends on matching the position and radius of the (usually circular) aperture function on the detector with the double-overlap regions. Even small deviations lead to a mismatch between the mask borders and the edges of the zero-order disc, which adds noise and reconstruction errors from those erroneous border pixels. A precise alignment can be done, e.g. using a position-averaged diffraction pattern or by recording the bright-field disc with the specimen removed from the field of view, as in our analyses.
It must be pointed out that a partial SSB reconstruction only approximates the result because not all spatial frequencies have been completely sampled at this point. However, it is already possible to see the atom columns in the partial reconstructions. This allows time- and dose-efficient live focusing and navigation to regions of interest on specimens. In Fig. 3, complete reconstructions of In₂Se₃ are shown. All atom columns can be seen clearly in the bright field images, in the divergence of the first moment and in the phase of the object transmission function obtained by SSB ptychography. All signals were calculated live from the 4D-STEM data stream. Figure 4 depicts the phase of ptychographic SSB reconstructions of SrTiO₃ performed live while changing microscope parameters such as the STEM magnification (scan pixel size) in Fig. 4 a-c, and the probe focus in Fig. 4 d. The full video showing live updating during continuous scanning while microscope parameters are changed is available in the supplementary material online.
Although the inherent reconstruction parameters are naturally robust against a change of the probe focus for direct ptychography schemes that do not intend to correct aberrations, a magnification change influences the relation between a spatial frequency in the reconstruction in pixel coordinates and the pixel distance in sample coordinates. This means that the geometry of a double overlap region that contains the signal to reconstruct a certain spatial frequency in the result changes with magnification. In the strict sense, changing the magnification without updating the scan step size (and double overlap masks) in the reconstruction is, therefore, mathematically inaccurate. In practice, however, it is desirable to be able to lower the magnification so as to navigate coarsely to a specimen region of interest without a time-consuming new calibration or pre-calculation of the masks $B_{kl}$ in eq. (3), and then observe or record details after switching back to the correct magnification again. Therefore, we studied in Fig. 4 a-c to what extent the reconstruction is robust against a magnification change despite keeping the internal SSB parameters unchanged. It shows that the structural contrast is in general maintained. A possible future development with only moderate effort could be pre-defining double overlap masks for common magnifications in order to build a look-up table that resolves these inaccuracies. We assume, without loss of generality regarding the algorithmic structure, that live processing is required at a dedicated magnification equal to that of the final recording where data is actually stored, and mention a certain robustness against changes of the scan pixel size as a valuable side note.

Computational details
In our implementation the sparse matrix product was the throughput-limiting step. By using an efficient GPU implementation from cupyx.scipy.sparse (CuPy), this could be accelerated sufficiently to allow processing of over 1000 frames per second at suitable parameters, enabling live imaging using a Medipix3 sensor.
The memory consumption of the current SSB ptychography implementation scales as $O(N)$ with the number of scan points $N$ at constant aspect ratio since the number of spatial frequencies to reconstruct scales linearly with the number of scan points, and each reconstructed frequency adds a double overlap region. These regions sample the diffraction data more densely when extracting more spatial frequencies, meaning the average number of non-zero entries per trotter is roughly constant for a given pixel size and beam parameters. As the processing time for dense-sparse matrix products roughly scales linearly with the number of non-zero entries in the sparse matrix and the number of vectors in the dense matrix, SSB scales poorly, as $O(N^2)$, in computation time with the number of scan points.
The size of the matrix containing the double overlap regions is highly dependent upon the acquisition parameters in the present implementation. A larger camera length increases the Ronchigram size, consequently increasing the number of non-zero entries. The relationship between scan pixel size and the convergence angle changes the spatial frequency limit above which no double overlaps occur. The fewer spatial frequencies create double overlaps, the smaller the matrix B becomes. In this study the scan area for live processing was limited to a size of 128 × 128 and microscope parameters were chosen to keep the matrix size low enough to fit into the GPU RAM.
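The scaling argument can be written out as a rough estimate. The symbol $\bar{z}$ for the average number of non-zero entries per trotter and the notation $\mathrm{nnz}(B)$ for the number of non-zero entries of the sparse matrix are introduced here for illustration only.

```latex
\begin{align}
  \mathrm{nnz}(B) &\approx N \, \bar{z}
    && \Rightarrow \quad \text{memory} = O(N), \\
  t_\mathrm{proc} &\propto \mathrm{nnz}(B) \cdot N \approx \bar{z} \, N^2
    && \Rightarrow \quad \text{time} = O(N^2).
\end{align}
```

Under these assumptions, going from a 128 × 128 to a 256 × 256 scan at otherwise unchanged parameters quadruples $N$, so the memory for the sparse matrix grows by roughly a factor of 4 and the computation time by roughly a factor of 16.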

Post processing
Up to this point, we concentrated solely on the live evaluation in order to facilitate optimisation of experimental parameters during the session. This avoided the need for saving vast amounts of 4D-STEM data that would have required significant disk space due to the continuous nature of the experiments. During the experiments only a few reliable data sets were recorded. To verify the reliability of our approach, we compared the post-processing of the recorded data against the live experiment to determine whether the inherent parameters used in post-processing, such as the Ronchigram position and radius, are in sufficient agreement with those of the live results.
In Fig. 5, the post-processing of In₂Se₃ data is shown. Here, the two different atom columns highlighted in Fig. 2 a can be seen in the first moment vector field (Fig. 5 a), in its divergence (Fig. 5 b), and in the HAADF image (Fig. 5 c). The phase of the SSB reconstruction (Fig. 5 d) also yields site-specific contrast, which is more pronounced than in the live imaging result in Fig. 3 d. Recalling that SSB ptychography relies on the weak phase object approximation in eq. (1), which breaks down already for the thinnest specimens as far as quantitative interpretability is concerned, conclusions from relative phases among different atomic sites need to be drawn with great care.
In Fig. 6, a comparison between different signals as well as between live imaging and post-processing of SrTiO₃ is shown. Strontium titanate enables evaluation of the different imaging modes regarding their capability for simultaneous imaging of light oxygen columns and comparably heavy Sr and Ti oxide columns, adding to the results obtained for In₂Se₃ in Fig. 5. The first moment vector field (Fig. 6 a) predominantly shows the heavy atom columns as sinks. The fact that this vector field also contains sinks at the oxygen sites becomes visible in the divergence map in Fig. 6 b, whereas the HAADF signal recorded separately with the conventional annular detector in Fig. 6 c visualises only the heavy-atom sites of Sr and Ti oxide. In the post-processed SSB (Fig. 6 d) the oxygen columns can be easily determined simultaneously with the heavy sites. The live SSB (Fig. 6 g) has some noise, but the oxygen columns are still visible. Moreover, a comparison of the conventional HAADF in Fig. 6 c with the annular dark field signal generated by a virtual annular detector applied to the 4D-STEM data in Fig. 6 f demonstrates that practically all main contrast mechanisms exploiting low- as well as high-angle scattering can be captured by the 4D-STEM imaging mode.

Fig. 6 (caption fragment). (h) The amplitude of the probe from the ePIE reconstruction shows some lattice information. The ePIE reconstruction has a 4° rotation compared to the other post-processed results due to the rotation angle. This is owing to how the implementation deals with the scan rotation. The semi-convergence angle was 22.1 mrad, the sample thickness was approximately 25 nm, determined by comparing the PACBED with simulation as shown in Fig. 7.
Earlier we stated that the live ptychographic reconstruction exploits the SSB algorithm because it is a non-iterative, linear and direct scheme that allows for in-situ processing. However, the weak phase object interaction model given by eq. (1) is a seemingly drastic limitation of this approach. In the analysis of post-processing data, it is beneficial to explore whether a ptychographic scheme based on a more advanced interaction model, neglecting computational hardware constraints for the moment, could have been the better choice for live ptychography. To this end, we used the ePIE algorithm to reconstruct the SrTiO₃ data as shown in Fig. 6 e. The standard ePIE implementation clearly does not give a reasonable result, at least for the usual reconstruction settings reported in literature that we used here. To explore this in more detail, the sample thickness was determined to be approximately 25 nm by comparing the experimental PACBED with a thickness-dependent simulation as in Fig. 7. Consequently, the specimen thickness was far beyond the validity of both the weak phase object approximation and the single-slice model used in ePIE. At first sight it is nevertheless surprising that the latter approach performs worse, since one must consider it more advanced than the weak phase model from the viewpoint of scattering theory. In fact, ePIE tries to iteratively find both the probe and the object transmission function in such a way that the modulus of the Fourier transform of their product agrees best with all details of the experimental diffraction data. When dynamical scattering sets in, this multiplicative interaction scheme is incapable of reproducing the details of the diffraction pattern, so that the algorithm does not converge to a reliable solution. In this particular case ePIE has put some lattice information in the probe (Fig. 6 h). This will be studied in the next subsection by means of simulations.
To summarise the experimental results, performing live ptychography using the SSB method had originally been motivated by computational aspects, but contrary to expectations it also turns out to be more robust against the violation of the weak phase object approximation and dynamical scattering. Of course, this only holds for qualitative imaging, but it is a significant advantage in practice, where suitable structural contrast is also obtained at elevated specimen thickness for both light and heavy atomic columns.

Simulation studies
Partial reconstruction. As the implementation of live ptychography based on eq. (3) maps the result of single scan points successively to the final reconstruction, a simulation study has been conducted in which we investigated the accuracy of the reconstructed phase in already scanned regions in dependence of the scan progress. A synthetic dataset has been simulated, based on an SrTiO 3 unit cell as a starting point. Then, a five by five super cell was created by repetition and the phase grating (Fig. 8) has been calculated. Two artificial spatial frequencies were added to the phase grating, one with a wavelength of a single unit cell and one with a wavelength of the super cell. To eliminate dynamical sca ering in this conceptual study, a 4D-STEM simulation with 20 × 20 scan Fig. 9. Visualization of accumulation of partial reconstructions using a synthetic dataset of simulated SrTiO 3 combined with long-range potential modulation. The le column shows the reconstruction of disjoint subsets of the input data, and the right one the accumulated result until all data is processed and the complete result is obtained. An animation with smaller subdivisions is available in the supplementary material. points per unit cell was performed using only one slice with a thickness of one unit cell along electron beam direction [001]. Finally, partial SSB reconstructions have been performed using the full range of scan pixels in vertical directions, but only portions of 20 scan pixels horizontally, mimicking a reconstruction during progressive scanning as depicted in Fig. 9. An animation calculated from 100 single pixel columns is available in the supplementary material. Figure 9 contains the individual blocks of the 20 scan pixel wide reconstructions in the le hand column, and the accumulated result on the right hand side with the scan progressing from top to bo om. 
Consequently, the phase at the bottom right is the final reconstruction for the full scan, obtained by our cumulative approach, which we found to be identical to a reconstruction using the conventional treatment employing the whole 4D scan. Only here are the low, medium and atomic-scale spatial frequencies reconstructed correctly without artefacts. In that respect, it is instructive to explore the partial reconstructions in Fig. 9. Since we used subsets that equal the size of a single unit cell, spatial frequencies down to the synthetic one with a period of one unit cell appear at least qualitatively in all partial reconstructions. The sharp edges between the available data and the yet-missing region result in a ringing effect near the edges, known as the Gibbs phenomenon. A closer look at the left column of Fig. 9 shows that these artefacts largely interfere destructively during accumulation, as seen, e.g., by a maximum at horizontal scan pixel 20 in the top row and a minimum at this position in the row below. Consequently, ringing artefacts become less obvious in the accumulated reconstruction on the right within the region that has already been scanned. Therefore, it is already possible to visualise the atomic structure and, in part, meso-scale phase variations for partial scans, making it possible to navigate on the sample and visually interpret results in a real experiment.
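The cancellation of edge artefacts during accumulation is what one would expect from a linear reconstruction. The following minimal sketch is not the SSB implementation: it assumes a generic linear stand-in (an ideal Fourier low-pass filter, the hypothetical function `linear_recon`), a hypothetical band limit, and a synthetic object with periods of 20 and 100 pixels. It only illustrates that reconstructions of disjoint blocks of 20 pixels along one scan axis sum to the reconstruction of the full scan, while each block on its own leaks ringing into the region that has not been scanned.

```python
import numpy as np

def linear_recon(data, cutoff):
    """Stand-in linear reconstruction: ideal low-pass filter in Fourier space.

    This is an assumption for illustration, not the SSB algorithm."""
    fy = np.fft.fftfreq(data.shape[0])[:, None]
    fx = np.fft.fftfreq(data.shape[1])[None, :]
    mask = (fx**2 + fy**2) <= cutoff**2
    return np.fft.ifft2(np.fft.fft2(data) * mask).real

# Synthetic object on 100 x 100 scan pixels with periods of 20 and 100 pixels
y, x = np.mgrid[0:100, 0:100]
obj = np.cos(2 * np.pi * x / 20) + 0.5 * np.cos(2 * np.pi * y / 100)

cutoff = 0.2  # cycles per pixel, hypothetical band limit
full = linear_recon(obj, cutoff)

accumulated = np.zeros_like(obj)
partials = []
for start in range(0, 100, 20):
    block = np.zeros_like(obj)
    block[start:start + 20] = obj[start:start + 20]  # data scanned in this block only
    partial = linear_recon(block, cutoff)
    partials.append(partial)
    accumulated += partial

# Accumulated partial results agree with the full reconstruction
max_deviation = np.abs(accumulated - full).max()
# Ringing: the first partial result is non-zero outside its own block
ringing = np.abs(partials[0][20:]).max()
print(max_deviation, ringing)
```

Because the blocks are disjoint and cover the whole scan, the agreement holds to floating-point precision for any linear operator; the ringing of the individual blocks must therefore cancel in the sum.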
Real specimens and scan regions do not usually fulfil the periodic boundary conditions that still apply to the full scan of Fig. 9. Therefore, Fig. 10 shows the impact of selecting different reconstruction areas by simulating the reconstruction of a smaller scan area that is not aligned with the underlying lattice. SSB reconstructs the specimen under the assumption of a periodic boundary condition and cannot reconstruct spatial frequencies above a certain threshold, as previously discussed. Trying to reconstruct a field of view where wrapping around the edges creates a discontinuity, i.e. frequencies higher than SSB can reconstruct, leads to reconstruction artefacts as seen in Fig. 10.
Quantitatively, the difference between the ground truth taken from the marked rectangle of the full reconstruction in Fig. 10 a and a reconstruction that solely employs the scan pixels therein, as seen in Fig. 10 b, is mapped in Fig. 10 c and can take significant values of 5-10 % of the phase itself in the present example.

Fig. 10. Reconstruction limited to a cutout from the data in both reconstructed area and input data, as opposed to partial reconstruction of the area of the full dataset in Figure 9. (a) shows the selected cutout area from the full reconstruction, (b) the reconstruction limited to this area, and (c) the difference between the two. This demonstrates how a discontinuity from wrapping around at the edges for a periodic boundary condition creates reconstruction artefacts due to the high frequency cutoff of SSB.
As a solution, the field of view where an accurate reconstruction is required can be surrounded by a smooth transition to a zero-valued buffer area, so that the presence of spatial frequencies above the resolution limit is minimized when the reconstruction area is wrapped around at the edges. In future studies, a Lanczos filtering scheme (Duchon, 1979) could be added to our live ptychography approach.
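A smooth transition to a zero-valued buffer can be sketched as follows. This is an illustration under stated assumptions, not part of the presented implementation: the raised-cosine ramp, the function name `taper_to_zero_buffer` and the parameters `ramp` and `buffer` are hypothetical, and the example operates on a plain 2D array rather than on the 4D input data. The region for which an accurate result is required would be the interior that the ramp leaves untouched.

```python
import numpy as np

def taper_to_zero_buffer(field, ramp, buffer):
    """Apply a raised-cosine ramp of `ramp` pixels at each edge of a 2D field
    and embed the result in a zero-valued border of `buffer` pixels."""
    def window(n):
        w = np.ones(n)
        # Ramp rises smoothly from close to 0 to close to 1
        edge = 0.5 * (1 - np.cos(np.pi * (np.arange(ramp) + 0.5) / ramp))
        w[:ramp] = edge
        w[-ramp:] = edge[::-1]
        return w
    wy = window(field.shape[0])
    wx = window(field.shape[1])
    tapered = field * wy[:, None] * wx[None, :]
    return np.pad(tapered, buffer, mode="constant", constant_values=0.0)

# Hypothetical example: constant field of 40 x 40 pixels
field = np.ones((40, 40))
out = taper_to_zero_buffer(field, ramp=8, buffer=4)
print(out.shape)
```

With this construction the first and last rows and columns are zero, so wrapping the area around at its edges no longer introduces a step.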
Thickness effects. A comprehensive algorithmic review is not our focus, but, touching briefly on the findings in conjunction with Fig. 6, we elucidate the impact of dynamical scattering, or, equivalently, specimen thickness, on different signals. A multislice simulation (Rosenauer & Schowalter, 2007) was performed for SrTiO3 in [001] projection employing the experimental parameters and using 20 × 20 scan pixels. The data were evaluated for thicknesses of 1 nm and 30 nm, addressing both the kinematic case and the situation of elevated thickness in our experiment. The results are compiled in Fig. 11. Figure 11 a shows the phase grating used in the multislice simulation for structural reference. In Fig. 11 b we added the theoretical result that would be obtained for SSB ptychography if all methodological premises were fulfilled in practice. That is, we generated a 4D-STEM data set by means of the weak phase approximation in eq. (1) for a single slice with the thickness of one SrTiO3 unit cell and performed the SSB reconstruction. Note that this is identical to the phase Φ in eq. (1), low-pass filtered with a circular aperture that has twice the radius of the probe-forming aperture.
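Taking the last statement literally, the reference in Fig. 11 b corresponds to a Fourier filter with a flat pass band. The sketch below illustrates such a filter; it is not the code used for the figure. The function name, the sampling of 0.1 Å per pixel and the wavelength of 0.0197 Å (corresponding to 300 kV) are assumptions chosen for the example; only the semiconvergence angle of 22.1 mrad is taken from the simulation parameters.

```python
import numpy as np

def lowpass_double_aperture(phase, pixel_size, semiconv, wavelength):
    """Low-pass filter a 2D phase with a circular aperture of radius
    2 * semiconv / wavelength in spatial frequency (flat pass band assumed)."""
    cutoff = 2.0 * semiconv / wavelength  # in 1/length
    qy = np.fft.fftfreq(phase.shape[0], d=pixel_size)[:, None]
    qx = np.fft.fftfreq(phase.shape[1], d=pixel_size)[None, :]
    aperture = (qx**2 + qy**2) <= cutoff**2
    return np.fft.ifft2(np.fft.fft2(phase) * aperture).real

n = 128
pixel_size = 0.1  # Å per pixel, hypothetical sampling
x = np.arange(n)[None, :] * np.ones((n, 1))
low = np.cos(2 * np.pi * 10 * x / n)   # 0.78 1/Å, inside the aperture
high = np.cos(2 * np.pi * 40 * x / n)  # 3.13 1/Å, outside the aperture

filtered = lowpass_double_aperture(
    low + high, pixel_size,
    semiconv=22.1e-3,   # rad, from the simulation parameters
    wavelength=0.0197,  # Å, assumed 300 kV
)
print(np.abs(filtered - low).max())
```

With these values the cut-off lies at about 2.24 1/Å, so the component at 0.78 1/Å is retained and the component at 3.13 1/Å is removed.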
Figures 11 c-j show the results of evaluating the 1 nm (left column) and the 30 nm data (right column) using different methods, arranged row-wise. As can be expected from former studies employing first moment based imaging (Müller-Caspary et al., 2017; Müller et al., 2014) of centrosymmetric structures, Figs. 11 c-f reflect the atomic structure in terms of momentum transfer vector maps and their divergences, with the atoms being sites of central fields. At elevated thickness, this is preserved only qualitatively. Note that Figs. 11 c and e are proportional to the probe-convoluted distributions of the projected electric field and charge density, respectively.
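The first moment and its divergence can be computed from a 4D-STEM array in a few lines. The sketch below is a generic NumPy illustration, not the LiberTEM implementation; the array layout (scan_y, scan_x, det_y, det_x), the function names and the synthetic test data are assumptions. Units are detector pixels and a scan step of one pixel is assumed for the finite differences.

```python
import numpy as np

def first_moment(data, ky, kx):
    """Intensity-weighted mean detector coordinate at each scan point.

    data: array of shape (scan_y, scan_x, det_y, det_x)
    ky, kx: detector coordinate grids of shape (det_y, det_x)"""
    total = data.sum(axis=(2, 3))
    com_y = np.tensordot(data, ky, axes=([2, 3], [0, 1])) / total
    com_x = np.tensordot(data, kx, axes=([2, 3], [0, 1])) / total
    return com_y, com_x

def divergence(com_y, com_x):
    """Finite-difference divergence of the first moment vector field."""
    return np.gradient(com_y, axis=0) + np.gradient(com_x, axis=1)

# Synthetic central field: the beam is deflected proportionally to the
# distance from the centre of a 5 x 5 scan (hypothetical test data)
ny = nx = 5
det = 9
ky, kx = np.mgrid[0:det, 0:det] - det // 2  # coordinates relative to centre
data = np.zeros((ny, nx, det, det))
for y in range(ny):
    for x in range(nx):
        data[y, x, det // 2 + (y - 2), det // 2 + (x - 2)] = 1.0

com_y, com_x = first_moment(data, ky, kx)
div = divergence(com_y, com_x)
print(div)
```

For this linear deflection field the divergence is constant across the scan, which makes the example easy to check by hand.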
Similarly, the SSB reconstructions in Figs. 11 g and h yield reliable structural contrast at both low and elevated thickness despite the violation of the weak phase approximation in eq. (1), which confirms our interpretation in the post processing section. However, already the result in Fig. 11 g should be considered as qualitative except for the oxygen sites. Please note that the probe had been focused on the specimen surface in the simulation. Because the SSB reconstruction considers the 30 nm thick specimen as a single slice here, the optimum focus would have been at some depth inside the specimen which is one reason why atomic sites appear slightly broader (Fig. 11 h).
The situation is different for the ePIE results in Figs. 11 i,j. Whereas the reconstruction for the thin specimen in Fig. 11 i represents the phase excellently and can be considered quantitative within the general framework of validity of ePIE, the algorithm has severe difficulties in reconstructing the object transmission function at 30 nm thickness in Fig. 11 j. In Fig. 11 k the reconstructed probe resembles an Airy disc, as expected for the aberration-free probe, limited only by an aperture in diffraction space, that was used in the simulation. In Fig. 11 l, sample information is transferred to the probe. This further confirms, by simulation, our observations concerning ePIE in the post processing section. However, Fig. 6 e,h gives even less information from the specimen than Fig. 11 j,l, which can be attributed to Poisson noise neglected in the simulation, residual aberrations, and, importantly, different manners of separating probe and object, which starts to fail when dynamical scattering sets in. To conclude, selecting the SSB scheme for live imaging of structural contrast can also be supported from the simulation point of view.
Low dose. The performance of the SSB reconstruction and of the divergence of the first moment was checked in a low-dose simulation (Fig. 12). At 1000 electrons per Å² (Fig. 12a,b) the heavy atom columns are visible. At this dose, the SSB reconstruction gives stronger contrast than the divergence of the first moment. At 10000 electrons per Å² (Fig. 12c,d) the oxygen atom columns are also visible. The SSB reconstruction and the divergence of the first moment show similar performance at this dose.
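How a finite dose can be imposed on noise-free simulated data is sketched below. The procedure used for Fig. 12 is not described here, so this is an assumption: each diffraction pattern is normalised, scaled to the expected number of electrons per scan point (dose times the area of one scan pixel), and replaced by Poisson-distributed counts. The function name `apply_dose` and the flat test patterns are hypothetical; the scan step assumes an SrTiO3 lattice constant of 3.905 Å sampled with 20 scan points per unit cell.

```python
import numpy as np

def apply_dose(data, dose, scan_step, rng):
    """Draw Poisson counts for noise-free patterns at a given dose.

    data: non-negative array, the last two axes are the detector axes
    dose: electrons per unit area
    scan_step: scan pixel size in the matching length unit"""
    electrons_per_point = dose * scan_step**2
    norm = data.sum(axis=(-2, -1), keepdims=True)
    expected = data / norm * electrons_per_point
    return rng.poisson(expected)

rng = np.random.default_rng(0)
scan_step = 3.905 / 20  # Å, assumed lattice constant / 20 scan points
data = np.ones((4, 4, 8, 8))  # hypothetical flat patterns, 4 x 4 scan
noisy = apply_dose(data, dose=1000.0, scan_step=scan_step, rng=rng)

electrons_per_point = 1000.0 * scan_step**2
print(electrons_per_point, noisy.sum(axis=(-2, -1)).mean())
```

Under these assumptions, 1000 electrons per Å² correspond to only about 38 electrons per diffraction pattern, which gives a sense of how sparse the individual frames are at this dose.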

Discussion
The digitisation that took place decades ago, with key developments such as charge-coupled device (CCD) cameras and computer control, processing and visualisation in STEM, marks one of the most drastic paradigm changes in electron microscopy. It enabled the live assessment of recorded data and transformed the optimisation of experimental parameters from multiple sessions into a quick feedback loop taking only several minutes within a single session. Surprisingly, innovative hardware associated with an increase of the dimensionality of the recorded data has, to some extent, put us back to ancient workflows for advanced methodologies. A major challenge for contemporary imaging in the era of Big Data is thus to make current ex-situ multidimensional evaluations capable of live imaging. Within this context, the present work shall be seen as a first step that demonstrates the feasibility of such a workflow using a rather straightforward example. Our work highlights the ongoing push towards high-performance computational methods in electron microscopy that are driven by increasing camera performance. That includes suitable software frameworks, connections, storage, processing hardware, and know-how in computer science and engineering to be used in synergy with established and future imaging methodologies.
Several important general conclusions can be drawn from the present study. First, adequate computational hardware is already available for this purpose, provided that the mathematical formulation can be adapted to make use of it efficiently. Second, and most importantly, open and well-defined software interfaces, which were available for the hardware used here, are key prerequisites to implement nonstandard imaging concepts developed in science into established infrastructures. Third, it can be beneficial to exploit partly simplistic models to achieve live imaging capabilities, exemplified here by the use of the SSB algorithm for a materials science case. On the one hand, it violates inherent assumptions significantly; on the other hand, the qualitative nature of the results is better than one might expect from the weak phase approximation. In particular, the present setup can be considered highly beneficial for low-dose ptychographic live imaging of challenging specimens in the fields of structural biology and soft matter in cryo electron microscopy. A suitable experimental setup with open software interfaces to tap the data stream of a 4D cryo-STEM experiment was unfortunately not available to the authors, so respective examples could not be included in the present report. However, this would not change the methodological setup worked out in this paper using solid-state examples.

Fig. 11. Simulations for SrTiO3: (a) Phase grating used for multislice simulation, (b) SSB from a single slice simulation using also the weak phase approximation in the simulation, (c, d) First moment, (e, f) Divergence of first moment, (g, h) SSB without using weak phase approximation in the simulation of the 4D-STEM data, (i, j) phase object from ePIE, (k, l) probe amplitude from ePIE, (c, e, g, i, k) 1 nm sample thickness and (d, f, h, j, l) 30 nm sample thickness. The simulation parameters were chosen to match those used in the experiments. Additionally, the following parameters were used: 22.1 mrad semiconvergence angle and 20 by 20 scan points per unit cell.
The current implementation mainly served to investigate the fundamental feasibility and characteristics of ptychography for live imaging. Integrating the parameter selection and results display into existing instrument control software could be the next step to improve usability, making the method more practical to apply routinely in microscopy. In that respect, live focusing, stigmation, and, prospectively, correction of further aberrations based on ptychography are possible. Furthermore, the implementation can be extended to include mitigation of the edge artefacts demonstrated in Figure 10.
Existing LiberTEM UDFs that were previously only used for offline processing were applied to live data without modification, demonstrating a long-standing design goal of LiberTEM. In particular, the UDF interface made it possible to disentangle details of the data logistics from the numerical and scientific aspects of the algorithms used. The prototype data decoder and UDF runner could easily keep up with the data rate of the Merlin detector. UDFs with low computational load, such as virtual detectors or first moments, remained at single-digit CPU load percentages. That means much higher data rates are likely to be possible with a suitable distributed UDF runner implementation. This will be required to support multi-chip cameras such as the Gatan K2 or K3 IS or X-Spectrum Lambda. Computationally intensive operations like ptychography are more challenging to scale to such data rates.
The poor scaling behaviour of the current SSB implementation has proven to be a limiting factor. For illustration, doubling the scan resolution at constant aspect ratio quadruples the number of scan points and results in a 16-fold increase in computation time. In future, an implementation that significantly reduces the computational load would be highly desirable to make this technique useful for mainstream data analysis. Ideally, there would be a constant memory consumption independent of scan area and O(N) or O(N log N) scaling of the computational effort as a function of the number of scan points N.
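The quoted factor of 16 for a fourfold increase in scan points corresponds to a computation time that grows quadratically with the number of scan points N. The short calculation below compares this with the desired scaling laws; the helper functions and the example scan sizes of 128 × 128 and 256 × 256 points are hypothetical and serve only as worked arithmetic.

```python
import math

def cost_ratio(resolution_factor, exponent):
    """Relative computation time for a 2D scan whose linear resolution is
    multiplied by `resolution_factor`, assuming cost proportional to
    N**exponent with N the number of scan points."""
    points_ratio = resolution_factor ** 2
    return points_ratio ** exponent

def nlogn_ratio(n_before, n_after):
    """Relative computation time assuming cost proportional to N log N."""
    return (n_after * math.log(n_after)) / (n_before * math.log(n_before))

quadratic = cost_ratio(2, 2)                # scaling consistent with the 16x
linear = cost_ratio(2, 1)                   # O(N)
nlogn = nlogn_ratio(128 ** 2, 256 ** 2)     # O(N log N), hypothetical sizes
print(quadratic, linear, round(nlogn, 2))
```

For the example sizes, O(N) or O(N log N) scaling would limit the increase to a factor of roughly 4 to 4.6 instead of 16.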

Summary
Live imaging of central 4D-STEM signals such as the ptychographic phase, first moments, their divergence and rotation, as well as flexible virtual detectors has been demonstrated. A direct processing of the data stream of a Medipix3 chip mounted in an aberration-corrected STEM was implemented. An enhanced version is available open-source under https://github.com/LiberTEM/LiberTEM-live. A prototype was used to generate the live results and is available upon request. The live imaging capability could be demonstrated for two materials science cases, In2Se3 and SrTiO3, where single-sideband ptychography proved surprisingly robust against imaging at elevated specimen thicknesses around 20 nm. It is anticipated that the live imaging approach presented here, and the transfer of such direct workflows to further imaging methods, can also enhance imaging in life sciences where, e.g., ptychography is a promising candidate for high-contrast, low-dose imaging of weakly scattering objects without compromising spatial resolution.
innovation programme under grant agreements No. 823717 - ESTEEM3 and No. 780487 - VIDEO.

Supplementary material