Three-dimensional alignment of density maps in cryo-electron microscopy

A common task in cryo-electron microscopy data processing is to compare three-dimensional density maps of macromolecules. In this paper, we propose an algorithm for aligning three-dimensional density maps, which exploits common lines between projection images of the maps. The algorithm is fully automatic and handles rotations, reflections (handedness), and translations between the maps. In addition, the algorithm is applicable to any type of molecular symmetry without requiring any information regarding the symmetry of the maps. We evaluate our alignment algorithm on publicly available density maps, demonstrating its accuracy and efficiency. The algorithm is available at https://github.com/ShkolniskyLab/emalign.


Introduction
Single particle cryo-electron microscopy (cryo-EM) is a method to determine the three-dimensional structure of biological macromolecules from their two-dimensional projection images acquired by an electron microscope [1].In this method, a sample of identical copies of the investigated molecule is quickly frozen in a thin layer of ice, where each copy is frozen at an unknown random orientation.The frozen sample is imaged by an electron microscope, resulting in two-dimensional images, where each image is a tomographic projection of one of the randomly oriented copies in the ice layer.The goal of single particle cryo-EM is to determine the three-dimensional structure of the molecule from the acquired two-dimensional images.A common task in cryo-EM data processing is to compare two density maps of the same molecule.This is required, for example, for estimating the resolution of the maps, evaluating their Fourier shell correlation curve [2], or to analyze their different conformations.All these tasks require to first align two density maps, that is, to orient them in the same way in a common coordinate system.Due to the nature of the cryo-EM imaging process, the two density maps may differ not only in their three-dimensional orientation (that is, their "rotation"), but may also have different handedness (namely, reflected relative to each other), and may be centered differently with respect to a common coordinate system.
In this paper, we propose an algorithm for aligning two density maps, which is fully automatic and can handle rotations, translations, and reflections between the maps.The algorithm requires as an input only the two density maps.In particular, it does not assume knowledge of any other information such as the symmetry of the maps.
Formally, let φ 1 : R 3 → R and φ 2 : R 3 → R be two volumes such that where r = (x, y, z) T ∈ R 3 , O ∈ O(3) and t = (∆x, ∆y, ∆z) T ∈ R 3 (O(3) is the group of all orthogonal transformations of the three-dimensional space, namely, rotations and reflections).The alignment problem is to estimate O and t given φ 1 and φ 2 .The matrix O is known as the orientation parameter, and the vector t as the translation parameter.In practice, we only get samples of φ 1 and φ 2 , arranged as three-dimensional arrays of size n × n × n, where n is the resolution of sampling.In cryo-EM, φ 1 and φ 2 represent two reconstructions of the same underlying molecule that we would like to compare (such as two half maps from a refinement process).In principle, it is possible to approximate the solution to the alignment problem using exhaustive search, by generating a set of candidate pairs (O i , t i ), where

Existing methods
There exist several methods for three-dimensional alignment of molecular volumes.The Chimera software [3] offers a semi-automatic alignment method which requires the user to approximately align the volumes manually, and then refines this alignment using an optimization procedure.This means that a sufficiently accurate initial approximation for the alignment is required.Achieving this initial approximate alignment manually naturally takes time and effort, yet it is crucial for the success of Chimera's alignment algorithm.The alignment procedure implemented by Chimera maximizes the correlation or overlap function between the two volumes by using a steepest descent optimization.The iterations of this optimization stop after reaching convergence or after 2,000 steps.Another alignment method is the projection based volume alignment algorithm (PBVA) [4].This method aligns a target volume to a reference volume by aligning multiple projections of the reference volume to the target volume whose orientation is unknown.The PBVA algorithm is based on finding two identical projections, a projection P 1 from the reference volume and a projection P 2 from the target volume as follows.The reference volume is projected at some known Euler angles, resulting in a projection P 1 , and the matching projection P 2 is found by maximizing the cross-correlation function between P 1 and a set of projections representing the possible projections of the target volume.The cross-correlation function is of five parameters -three Euler angles and two translation parameters in the plane of the projection P 2 [5].Finally, the rotation between the volumes is estimated from the relation between the Euler angles corresponding to the projections P 1 and P 2 .After estimating the rotation between the volumes, the translation between them is found using projection images from the target volume.The translation between the volumes is estimated by least-squares regression using the two translation parameters of each projection from the target volume, where a minimum of two projections is required for calculating the three-dimensional translation vector.Using multiple projection images to estimate the translation between the volumes makes the alignment more robust.
The Xmipp software package [6] also offers a three-dimensional alignment algorithm.It is based on expanding the two volumes using spherical harmonics followed by computing the cross-correlation function between the two spherical harmonics expansions representing the volumes [7].The process of expanding a volume into spherical harmonics is called the Spherical Fourier Transform (SFT) of the volume, where like the FFT algorithm, there exists an efficient algorithm for calculating the SFT [7].The process of calculating the cross-correlation function between the two spherical harmonics expansions of the volumes and estimating the rotation between the two volumes is implemented by a fast rotational matching (FRM) algorithm [8].After estimating the rotation between the two volumes, the translation between them is found by using the phase correlation algorithm [9].
Finally, the EMAN2 software package [10] offers two three-dimensional alignment algorithms.In the first algorithm (implemented by the program e2align3d, now mostly obsolete), the rotation between the volumes is estimated using an exhaustive search for the three Euler angles of the rotation.First, the algorithm generates a set of candidate Euler angles with large angular increments.Then, the algorithm iteratively decreases the angular increments in the set of candidates in order to refine the resolution of the angular search [11].A much faster tree-based algorithm is implemented in the program e2proc3d.This method performs three-dimensional rotational and translational alignment using a hierarchical method with gradually decreasing downsampling in Fourier space.In Section 8, we compare our algorithm to this latter algorithm, as well as to the fast rotational matching algorithm implemented by Xmipp.In practice, it may be that R is not determined uniquely by P , as for example, a volume may have two very similar views even if it is not symmetric.Moreover, the volumes to align are discretized and sometimes noisy, which introduces inaccuracies into the estimation of O. Thus, to estimate O more robustly, instead of using a single image P , we generate from φ 2 multiple images P 1 , . . ., P N with orientations R 1 , . . ., R N , align each P i to φ 1 as above, resulting in estimates for O given by O i = Ri R T i , and then estimate O from all O i simultaneously by solving where • F is the Frobenius matrix norm.In Section 4, we give an explicit solution for the latter optimization problem.
The key of the above procedure is estimating the orientation of a projection image P of φ 2 in the coordinate system of φ 1 .This is done by inspecting a large enough set of candidate rotations, and finding the rotation R for which the induced common lines between P (when assuming its orientation is R) and a set of projections generated from φ 1 best agree.As inspecting each candidate rotation involves only one-dimensional operations (even if the input volumes are centered differently), it is very fast and highly parallelizable.Thus, this somewhat brute-force approach is applicable to very large sets of candidate rotations (several thousands, for accurate alignment) and still results in a fast algorithm.We discuss below the complexity and advantages of this approach.We summarize the outline of our approach in Fig. 1, and describe it in detail in Sections 4 and 5.
In the above approach, we assume that O is a rotation.However, φ 1 and φ 2 may have a different handedness, and so O may include a reflection.The above approach can obviously be used to resolve the handedness by aligning φ 2 to φ 1 and to a reflected copy of φ 1 , and determining whether a reflection is needed using some quality score of the alignment parameters (e.g., the correlation between the aligned volumes).However, as we show below, in our method, there is no need to actually align φ 2 to a reflected copy of φ 1 , saving roughly half of the computations (those required to actually align φ 2 to a reflected copy of φ 1 ), as explained in Section 4.
We next explain in detail the various steps of our algorithm, including handling translations, reflections, and symmetry in the volumes.

Estimating the alignment parameters
Consider two volumes φ 1 and φ 2 , where one volume is a rotated copy of the other (assuming for now no reflection nor translation between the volumes), namely (see (1)) where O is an unknown rotation matrix.Our goal is to find an estimate for O.
In case where φ 1 and φ 2 exhibit symmetry, the solution for O is not unique.To be concrete, we denote by SO(3) the group of all 3 × 3 rotation matrices.A group G ⊆ SO(3) is a symmetry group of a volume φ, if for all g ∈ G it holds that φ(r) = φ(gr), r = (x, y, z) In other words, a symmetry group of a volume is a group of rotations under which the volume is invariant (see [12] for more details).If we denote the symmetry group of φ 1 by G 1 ⊆ SO(3) and define r = Or, then, from ( 2) and (3), we get for any symmetry element g ∈ G 1 Comparing the latter with (2), we conclude that the solution for O is not unique, and we thus replace the goal of finding O by finding any gO for some arbitrary element g ∈ G 1 of the symmetry group.Note that we assume that O is a rotation, namely that φ 1 and φ 2 are related by rotation without reflection.The case where O is a reflection will be considered below.Let P be a projection image generated from φ 2 using a rotation R, that is where R (1) , R (2) , R (3) are the columns of the matrix R and r = (x, y, z) T .From (2), we have that Thus, using (5), we have Equation ( 7) implies that if P has orientation R with respect to φ 2 , then it has orientation OR with respect to φ 1 .In Section 5, we describe how to estimate OR given P and φ 1 , namely, how to estimate a rotation R that satisfies R = OR.
If the volume φ 1 is symmetric with symmetry group G 1 , then (as shown above) the rotation OR is equivalent to the rotation gOR for any g ∈ G 1 , and moreover, the two rotations cannot be distinguished.Thus, we conclude that R = gOR for some unknown g ∈ G 1 .Using the latter equation, we can estimate O as Note that in the latter equation R is known, R can estimated using the algorithm in Section 5 below, and g can be arbitrary.Thus, (8) provides a means for estimating O.However, to estimate O more robustly, we use multiple projections generated from φ 2 .Let R 1 , . . ., R N be random rotations, and let P 1 , . . ., P N be the corresponding projections generated from φ 2 according to (5).Using the procedure described above, we estimate for each P i a rotation Ri that satisfies Ri = g i OR i for some unknown g i ∈ G 1 .Thus, as in (8), we can estimate O using any i ∈ {1, . . ., N } by Contrary to (8), if we want the right hand side of (9) to result in the same O for all i = 1, . . ., N , then g i cannot be arbitrary.In order to estimate O, we therefore need to find g i , i = 1, . . ., N , and combine all estimates for O given in ( 9) into a single estimate.
To that end, define and look at the matrix H of size 3N × 3N whose (i, j) block of size 3 × 3 is given by (see ( 9) and ( 10)) By a direct calculation, we get that the matrix of size where W is an arbitrary 3 × 3 orthogonal matrix (i.e., W W T = W T W = I) satisfies Equation ( 11) also shows that the matrix H is of rank 3, which together with (13) implies that g can be calculated by arranging the three leading eigenvectors whose 3×3 blocks V 1 , . . ., V N are g 1 W, . . ., g N W , for some unknown arbitrary W (see [13] for a detailed derivation).
In practice, at this point, we replace each g i W by its closest orthogonal transformation, as described in [14], to improve its accuracy in the presence of noise and discretization errors.Next, in order to extract an estimate for g 1 , . . ., g N from ( 14) (that is, to eliminate W from the estimates in g given by ( 12)), we multiply each g i W by (g 1 W ) T , resulting in Thus, each g esti is a rotation, even if W is not.We define O i = g T esti Ri R T i , and using (9), we get for i = 1, . . ., N Thus, we have N estimates for g 1 O. Equation ( 4) states that φ 2 (r) = φ 1 (Or) = φ 1 (gOr) for any symmetry element g ∈ G 1 .Therefore, estimating g 1 O is equivalent to estimating O.In order to estimate g 1 O simultaneously from all O i , i = 1, . . ., N , we search for the rotation O est (the superscript will be explained shortly) that satisfies In other words, O est is the "closest" to all the estimated rotations O i in the least squares sense.To solve (17), let Õ be the 3 × 3 matrix In [15], it is proven that the solution to the optimization problem in (17) is Eq. ( 5) 3: Apply Algorithm 2 to each P i and φ 1 .Denote the resulting rotations by Eq. ( 14) 9: Compute Eq. ( 15) 10: for i = 1 to N do 11: Eq. ( 16)

Estimated rotation
where Õ = Ũ Σ Ṽ T is the singular value decomposition (SVD) of Õ.The algorithm for estimating O (1) est given φ 1 and φ 2 , as described above, is presented in Algorithm 1.
To handle the case where φ 1 and φ 2 have a different handedness (namely, related by reflection), we can of course apply Algorithm 1 to φ 2 and a reflected copy of φ 1 .However, this would roughly double the runtime of the estimation process, as the most time consuming step in Algorithm 1 is step 3, whose complexity is O(n 3 log n) operations for a volume of size n × n × n voxels (see Section 5).
Alternatively, it is possible to augment the above algorithm to handle reflections without doubling its runtime.In the case where there is a reflection between φ 1 and φ 2 , we need to replace the relation in ( 2) by the relation Note that J in ( 20) is a reflection and that O is a rotation.Repeating the above derivation starting from (20) shows that to estimate O in this case, we can use the same R i used above and the same estimates Ri obtained above (steps 1 and 3 of Algorithm 1), but this time we get that O i = g T i Ri JR T i J (compare with ( 9)).Then, we set X i = JR i J RT i (compare with (10)) and proceed as above, resulting in an estimate O est (compare with ( 17)), which corresponds to the optimal alignment parameters if φ 1 and φ 2 have opposite handedness.Once we have the two estimates O est using phase correlation [9] (see Appendix C for details).This results in two sets of alignment parameters (rotation+translation).We then apply both sets of parameters to φ 2 to align it with φ 1 , and pick the parameters for which φ 2 after alignment has higher correlation with φ 1 .We denote the estimated parameters by (O est , t est ).

Projection alignment
It remains to show how to implement step 3 of Algorithm 1, that is, how to find the orientation of a projection P of φ 2 with respect to the coordinate system of φ 1 .Mathematically, we would like to solve the equation for the unknown rotation R. A brute-force approach of testing many candidate rotations in search for the R that (best) satisfies ( 21) is prohibitively expensive, as it requires to compute a projection of φ 1 for each candidate rotation (this is essentially projection matching).We therefore take a different approach, whose cost for inspecting each candidate rotation is much lower (in fact requires O(n) operations to test each candidate rotation for a volume φ 1 discretized into an array of size n × n × n).
The idea is to generate several projection images from φ 1 , and then, for each candidate rotation, to check the agreement of the common lines between P and the projections of φ 1 , assuming the orientation of P is given by the candidate rotation.We estimate the rotation corresponding to P as the candidate rotation that results in the best agreement.We next formalize this method, and then analyze its complexity.
We start by considering the case where there is no translation between P and φ 1 , namely, P and φ 1 satisfy (21), and our goal is to estimate R given P and φ 1 .We generate N projection images from φ 1 (N is typically small, see Section 8), denoted P N chosen uniformly at random (note that we deliberately reuse the notation N used in Section 4, as explained below) .We generate a set of candidate rotations S, over which we will search for the solution R of (21).The set S consists of a large number of approximately equally spaced rotations.See Appendix B for a detailed description of the construction of S.
We will assume for each candidate rotation Q ∈ S that P was generated using the rotation Q (that is, we assume that R in ( 21) is equal to Q), compute the mean correlation of the common lines between P and P i , given by the angles α i in P and β i in P (a) i , as explained in Appendix A. The common line property [16]  i , respectively (see Appendix A for a review of common lines and their properties).We thus define and the cost function where fi denotes the complex conjugate of f i .In other words, ρ(Q) measures how well the common lines induced by Q between P and P (a) 1 , . . ., P N agree.We then set our estimate for R to be We explore the appropriate value for N in Section 8. We now extend the above scheme to the case where P is not centered with respect to φ 1 , namely, P is given by for an unknown rotation R and an unknown translation (∆x, ∆y).The idea for estimating R is the same as before, except that the calculation of the common lines should take into account the unknown translation, as we describe next.We denote the unshifted version of P by P , which is given by (this is exactly (21), but we repeat it to clearly set up the notation).Then, P (x, y) = P (x + ∆x, y + ∆y).
Suppose that the common line between P and P (a) i is given by the angles α i in P and β i in P (a) i (see Appendix A).By definition of the common line, it holds that Using (25), we get that P (ξ cos α i , ξ sin α i )e −ıξ∆ξ = P (a) i (ξ cos β i , ξ sin β i ), where ∆ξ = ∆x cos α i + ∆y sin α i is the one-dimensional shift between the projections along their common line.We assume that this one-dimensional shift is bounded by some number d.
Thus, we need to modify our cost function (22) to take into account also the unknown (one-dimensional) phase e −ıξ∆ξ .We therefore define (with a slight abuse of notation in reusing the previous notation for the cost function) and the cost function and set our estimate for the solution R of ( 23) to be The formula for the angles α i and β i of the common line between P and P (a) i induced by the rotations Q ∈ S and R i is given in Appendix A. Note that at this point we are only interested in R est and not in the translation (∆x, ∆y) in P , as the relative translation between φ 1 and φ 2 is efficiently determined using phase correlation (see [9] and Appendix C) once we have determined their relative rotation.The algorithm for solving equation (23)  As mentioned above, we use the same N in Sections 4 and 5.While in principle, the number of projections generated from φ 2 in Section 4 can be different from the number of projections generated from φ 1 in Section 5, due to the symmetric role of φ 1 and φ 2 in the alignment problem, there is no reason to consider different values.

Implementation and complexity analysis
Algorithms 1 and 2 are formulated in the continuous domain.Obviously, to implement them, we must explain how to apply them to volumes φ 1 and φ 2 given as three-dimensional arrays of size n × n × n.We now explain how to discretize each of the steps of Algorithms 1 and 2, and analyze their complexity.For simplicity, we use for the discrete quantities the same notation we have used for the continuous ones.
The only step in Algorithm 1 that needs to be discretized is step 2. This step is accurately discretized based on the Fourier projection slice theorem (32) using a non-equally spaced fast Fourier transform [17,18], whose complexity is O(n 3 log n) (for a fixed prescribed accuracy).The result of this step is a discrete projection image P given as a two-dimensional array of size n × n pixels.The remaining steps of Algorithm 1 are already discrete, and since the value of N is small compared to n, their complexity is negligible.
We next analyze Algorithm 2. The input to this algorithm is a projection image P of size n×n pixels, and a volume φ 1 of size n×n×n voxels.The algorithm also uses the parameter N , but since it is a small constant, we ignore it in our complexity analysis.Step 1 of Algorithm 2 requires a constant number of operations.Step 2 is accurately implemented using a non-equally spaced fast Fourier transform [17,18], whose complexity is O(n 3 log n) (for a fixed prescribed accuracy).Step 3 is independent of the input volume, and moreover, the set S can be precomputed and stored.To implement step 4, we first discretize the interval of one-dimensional shifts [−d, d] in fixed steps of ∆d pixels (say, 1 pixel).Specifically, we use the following shift candidates for the optimization in step 4 Then, for each Q ∈ S, we compute the angles α i and β i (see Appendix A), and evaluate (26) for the pair (Q, ∆ξ) by replacing the integral with a sum.If we store the polar Fourier transforms of all involved projection images P and P (a) 1 , . . ., P (a) N (computed using the non-equally spaced fast Fourier transform [17,18]), each such evaluation amounts to accessing the rays in the polar Fourier transform corresponding to the angles α i and β i , namely O(n) operations.Thus, the total number of operations required to implement step 4 of Algorithm 2 is |S| × ( 2d/∆d + 1) × n (|S| is the number of elements in the set S).Of course, all |S|×( 2d/∆d +1) evaluations are independent, and can be computed in parallel.Thus, the total complexity of Algorithm 2 is O(n 3 log n) operations for step 2 and O(n) operations for testing each pair (Q, ∆ξ) in step 4. Therefore, since the optimization in step 4 is very fast, it is practical to test even a very large set of candidate rotations S.
Finally, we note that in practice, to further speed up the algorithm, we first downsample the input volumes to size n ds , align the two downsampled volumes, and apply the estimated alignment parameters to the original volumes.We demonstrate in Section 8 that this approach still results in a highly accurate alignment.
To understand the theoretical advantage of the above approach, we compare it to a brute force approach.In the brute force approach, we 1) scan over a large set of rotations and three-dimensional translations, 2) for each pair of a rotation and a translation, we transform one of the volumes according to this pair of parameters, and 3) choose the pair for which the correlation between the volumes after the transformation is maximal.Testing each pair of candidate parameters requires O(n 3 ) operations (for rotating and translating one of the volumes, and for computing correlation), which amounts to a total of O(n 3 × |S| × (2d/∆d) 3 ) operations.In other words, testing each candidate rotation and translation is way more expensive than in our proposed method.In our approach, the expensive operation of complexity O(n 3 log n) needs to be executed only once per each pair of inputs P and φ 1 .Moreover, in our approach, the search over shifts is one-dimensional as opposed to the three-dimensional search required in the brute-force approach.

Parameters' refinement
In this section, we describe an optional refinement procedure for improving the accuracy of the estimated parameters O est and t est obtained using the algorithm of Section 4.
We define the vector Θ = (ψ, ϑ, ϕ, ∆ x , ∆ y , ∆ z ) consisting of the 6 parameters required to describe the transformation between two volumes -3 Euler angles (ψ, ϑ, ϕ) describing their relative rotation, and 3 parameters (∆x, ∆y, ∆z) describing their relative translation.We define the operator T Θ (φ), which applies the transformation parameters Θ to the volume φ (that is, T Θ first rotates the volume and then translates it, according to the parameters in Θ).Next, for given volumes φ 1 and φ 2 , we denote their correlation by ρ(φ 1 , φ 2 ).We are reusing the notation ρ from Section 5, since all occurrences of ρ in this paper correspond to a correlation coefficient whose evaluation formula is clear from its arguments.Finally, we define the objective function which vanishes for the parameters Θ that align φ 1 with φ 2 .
To refine O est and t est of Section 4, we simply apply the BFGS algorithm [19] to the objective function (28), with an initialization given by O est and t est .

Results
The alignment algorithm (with and without the optional refinement described in Section 7) was implemented in Python and is available online1 , including the code that generates the figures of this section.A Matlab version of the algorithm is available as part of the ASPIRE software package [20].
As the algorithm uses two parameters -the downsampling n ds (see Section 6) and the number of reference projections N (see Section 4) -we first examine how to appropriately set their values.Then, we examine the advantage of the refinement procedure proposed in Section 7. To show the benefits of our algorithm in practice, we then compare its performance to that of two other alignment algorithms -the alignment algorithm from the EMAN2 software package [10] (implemented in the program e2proc3d) and the fast rotational matching algorithm implemented in the Xmipp software package [6].Finally, we examine the performance of the three algorithms using noisy input volumes.
We tested our algorithm on volumes from the electron microscopy data bank (EMDB) [21] with different types of symmetries, whose properties are described in Table 1.All tests were executed on a dual Intel Xeon E5-2683 CPU (32 cores in total), with 768GB of RAM running Linux.The memory required by the algorithm is of the order of the size of the input volumes.We used 15, 236 candidate rotations in Algorithm 2 (the size of the set S), generated as described in Appendix B. This set of candidates is roughly equally spaced in the set of rotations SO(3).While it is difficult to characterize the resolution of this set in terms of the resolution of each of the Euler angles, a rough calculation suggests that the resolution in each of the Euler angles is smaller than 5 degrees.We do not use rotations generated by a regular grid of Euler angles, as such a grid is less efficient than our grid, due to the nonuniform rotations generated by a regular grid of Euler angles.For example, discretizing each of the Euler angles to 5 degrees would result in 186,624 rotations, more than an order of magnitude larger than the number of rotations we use.
For each test, we generate a pair of volumes φ 1 and φ 2 related by a rotation matrix O and a translation vector t ∈ R 3 .The translation is chosen at random with magnitude up to 10% of the size of the volume.We denote the alignment parameters estimated by our algorithm by O est and t est .We evaluate the accuracy of our algorithm by calculating the difference between the rotations O and O est .To that end, we first note that following (4), O est is an estimate of gO for some arbitrary g ∈ G 1 , where G 1 ⊆ SO(3) is the symmetry group of φ 1 .In order to calculate the difference between O and O est , we have to find the symmetry element g.In our tests, the symmetry group G 1 is known (see Table 1), and so we find g by solving followed by defining O est = g T O est .Next, the error in the estimated rotation O est is calculated using the axis-angle representation of rotations as follows.The axis of the rotation O is defined to be the unit vector v ∈ R 3 that satisfies Ov = v, that is, v is an eigenvector of O corresponding to eigenvalue 1.Similarly, we define the unit vector v ∈ R 3 to be the axis of the rotation O est .Then, we calculate the angle between the axes of the rotations as The angle of rotation of the matrix O around its axis v is given by θ 1 = cos −1 (u • Ou), where u ∈ R 3 is a unit vector perpendicular to v. Similarly, we define θ 2 to be the angle of rotation of the matrix O est around its axis v .The error in the rotation angle is then defined as We start by investigating the appropriate value for the downsampling parameter n ds (see Section 6).To that end, for each of the volumes in Table 1, we create its rotated and shifted copy, and apply our algorithm with the downsampling parameter equal to 16, 32, 64, and 128 (namely, we actually align downsampled copies of the volumes and then apply the estimated parameters to the original volumes).The results are shown in Fig. 2. For each value of downsampling, we show a bar plot that summarizes the results for all test volumes.Note that these results are without the refinement procedure of Section 7. To provide a more detailed information on the chosen downsampling value, we show in Fig. 3 only the results for downsampling to sizes 64 and 128.Based on these results, we use a downsampling value of 64 in all subsequent tests.In particular, this value of downsampling results in an accurate initialization of the refinement procedure of Section 7, as shown in Fig. 4. As of timing, we show in Fig. 5 the timing, without and with refinement, for downsampling to sizes 64 and 128.Next, we wish to determine the number of reference projections N to use in Algorithms 1 and 2. We set the downsampling parameter to 64, and measure the estimation error for different numbers of reference projections.The results are summarized in Fig. 6.We also show the timing for different numbers of reference projections, without and with refinement, in Fig. 7. Based on these results, we choose the number of reference projections to be 30, as a good compromise between accuracy and speed.
Next, we compare the performance of our algorithm with that of EMAN2's and Xmipp's alignment algorithms.The accuracy and timing results are summarized in Tables 2 and 3, respectively.Finally, we demonstrate the performance of the different algorithms for noisy input volumes.To that end, we use as a reference volume EMD 2660 [22] from EMDB (of size 360 × 360 × 360 voxels), and create its rotated and translated copy.We add to the reference volume and its rotate/translated copy additive Gaussian noise with SNR ranging from 1 to 1/256.A central slice from the noisy reference volume at different levels of SNR is shown in Fig. 8.The accuracy results of all algorithms for the various SNRs are shown in Table 4.The timings of the different algorithms are shown in Table 5.

Discussion and conclusions
In this paper, we proposed a fully automatic method for aligning three-dimensional volumes with respect to rotation, translation, and reflection.While the parameters of the algorithm can be tuned whenever needed, we showed that the default parameters work very well for a wide range of volumes of various symmetries.We also developed an auxiliary   algorithm which finds the orientation of a volume giving rise to a given projection image (Section 5).This algorithm may serve as a fast and highly accurate substitute to projection matching.The core difference between our approach and other existing approaches is that our approach is based on commons line between projection images generated from the volumes.The advantage of this approach is that inspecting each candidate rotation is very fast, as it is based on one-dimensional operations on the common lines (O(n) operations for volumes of size n × n × n).We also note that our cost function (26) for identifying the optimal alignment is different than in other algorithms.While the typical cost function used by alignment algorithms is the correlation between the volumes, our cost function is the average correlation of the common lines between projection images of the volumes.These two cost functions are not equivalent, and while in our experiments we have not identified a scenario where one cost function is superior over the other, having tools that are based on different principles may prove beneficial in the future.
From the comparison of our algorithm with the alignment algorithms in EMAN2 and Xmipp, we conclude that our algorithm can be used in one of two modes.If we are interested in fast alignment with good accuracy (average error of 1.9 degrees of the rotation axis, and average error of 1.86 degrees of the in-plane rotation angle, with standard deviations of 1.25 degrees and 1.3 degrees, respectively), we can use our algorithm without the refinement procedure of Section 7.This is appropriate, for example, for visualization, as such an initial alignment is sufficient as an input for high resolution optimization-based alignment algorithms, such as the one in Chimera [3].In such a case, our algorithm is more than 3 times faster than EMAN2's algorithm (even though our algorithm is implemented entirely in Python), and almost 40 times faster than Xmipp's algorithm.If we are interested in very low alignment errors, the refinement procedure of Section 7 brings the average errors down to 0.25 degrees for the rotation axis and 0.28 degrees for the in-plane rotation angle (with standard deviations of 0.66 degrees and 0.63 degrees, respectively).In such a case, our algorithm is 80% faster than EMAN2's and 15 times faster than Xmipp's.

Figure 1 :
Figure 1: Outline of the algorithm est for the alignment parameters between φ 1 and φ 2 (without and with reflection), we estimate the translation corresponding to each of O (1) est and O (2) choose as an estimate for R the rotation Q for which the mean correlation is highest.Specifically, for each Q ∈ S and R (a) i , i = 1, . . ., N , we compute the direction of the common line between P and P (a)

Figure 2 :
Figure 2: Downsampling parameter vs. accuracy of the algorithm.The left figure corresponds to the error e 1 in the rotation axis (see (30)).The right figure corresponds to the error e 2 in the rotation angle (see (31)).

Figure 3 :
Figure 3: Downsampling parameter vs. accuracy of the algorithm, focused on 64 and 128.See Fig. 2 for more details.

Figure 4 :
Figure 4: Error without (left figure) and with (right figure) refinement for downsampling to size 64 × 64 × 64.The error reported in the figure is either e 1 (30) or e 2 (31), as shown on the x-axis.

Figure 5 :
Figure 5: Timing of the alignment algorithm with downsampling to sizes 64 and 128.NR stands for "without refinement"; R stands for "with refinement".

Figure 6 :
Figure 6: Error vs. the number of reference projections N .The left and right figures show the error without and with the refinement procedure of Section 7, respectively.The error reported in this figure is the sum e 1 + e 2 given in (30) and (31).

Figure 7 :
Figure 7: Time vs. the number of reference projections.

Figure 8 :
Figure 8: Central slice of the noisy reference volume at different SNRs.
is summarized in Algorithm 2.

Table 1 :
Test volumes.Each volume is a three-dimensional array of size n × n × n, with n specified on the third column.The symmetry of each volume is given by the second column.

Table 2 :
Accuracy comparison with EMAN2 and Xmipp.The errors reported in this table are the sum e 1 + e 2 given in (30) and (31).Errors are given in degrees.For EMalign, (NR) corresponds to "without refinement" and (R) to "with refinement".The two bottom rows show the mean and standard deviation of the error (in degrees) over all experiments.

Table 3 :
Timing comparison with EMAN2 and Xmipp (in seconds).For EMalign, (NR) corresponds to "without refinement" and (R) to "with refinement".The column "size" is the side length of the input volumes.

Table 4 :
Accuracy comparison for noisy input volumes at different SNRs.See Table2for more details.

Table 5 :
Timing comparison for noisy input volumes at different SNRs.All timings are given in seconds.