Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates

Liu, Kun; Skibbe, Henrik; Schmidt, Thorsten; Blein, Thomas; Palme, Klaus; Brox, Thomas; Ronneberger, Olaf

doi:10.1007/s11263-013-0634-z

Published: 07 June 2013

Volume 106, pages 342–364 (2014)

International Journal of Computer Vision

Abstract

The histogram of oriented gradients (HOG) is widely used for image description and proves to be very effective. In many vision problems, rotation-invariant analysis is necessary or preferred. Popular solutions are mainly based on pose normalization or learning, neglecting some intrinsic properties of rotations. This paper presents a method to build rotation-invariant HOG descriptors using Fourier analysis in polar/spherical coordinates, which are closely related to the irreducible representation of the 2D/3D rotation groups. This is achieved by considering a gradient histogram as a continuous angular signal which can be well represented by the Fourier basis (2D) or spherical harmonics (3D). As rotation-invariance is established in an analytical way, we can avoid discretization artifacts and create a continuous mapping from the image to the feature space. In the experiments, we first show that our method outperforms the state-of-the-art in a public dataset for a car detection task in aerial images. We further use the Princeton Shape Benchmark and the SHREC 2009 Generic Shape Benchmark to demonstrate the high performance of our method for similarity measures of 3D shapes. Finally, we show an application on microscopic volumetric data.

Notes

In this paper, a quantity that describes certain image content is generally called a feature; a single gradient histogram computed in a local patch is referred to as a HOG cell; an assembled feature vector that describes a region of multiple cells is referred to as a HOG descriptor.
The property in Eq.(6) has also been referred to as equivariance in some works (Reisert and Burkhardt 2008; Vedaldi et al. 2011).
In this paper, we do not rely on this polar tensor concept, because we do not need any special mathematical tools for the related analysis of 2D images.
We purposely define the expansion coefficients with a conjugation, which makes it a standard inner product between the coefficients and SH basis. The same convention is used in Reisert and Burkhardt (2009). The advantage is that this linear expansion can be understood as a coupling between two spherical tensors, which will be explained later.
This operator is written as $\circ _\ell $ in Reisert and Burkhardt (2009), since $\ell _1, \ell _2$ can be inferred from the two coupled tensors. In this paper we use the more explicit notation ${\otimes }_{(\ell |\ell _1,\ell _2)}$.
http://lmb.informatik.uni-freiburg.de/resources/opensource/FourierHOG/
The coupling used here is only a portion of all possible combinations. We prefer these simple choices since we only want to demonstrate the descriptive power of the proposed method. We believe that the optimal feature selection is application-dependent. Using a classifier such as a linear SVM or a Random Forest, which has built-in feature selection ability, makes it possible to increase the dimensionality of the feature vector by adding more coupled features.
Patrick Min, https://www.google.com/search?q=binvox
We created the ground-truth by editing a watershed segmentation result manually. Some very badly segmented regions were discarded and were not used for training.

References

Ahonen, T., Matas, J., He, C., Pietikäinen, M. (2009). Rotation invariant image description with local binary pattern histogram Fourier features. In Scandinavian Conference on Image Analysis, pp. 61–70.
Akgül, C., Axenopoulos, A., Bustos, B., Chaouch, M., Daras, P., Dutagaci, H., Furuya, T., Godil, A., Kreft, S., Lian, Z., et al. (2009). SHREC 2009-Generic Shape Retrieval contest. In Eurographics workshop on 3D object retrieval.
Allaire, S., Kim, J., Breen, S., Jaffray, D., & Pekar, V. (2008). Full orientation invariance and improved feature selectivity of 3D SIFT with application to medical image analysis. In CVPR Workshops.
Arsenault, H., & Sheng, Y. (1986). Properties of the circular harmonic expansion for rotation-invariant pattern recognition. Applied Optics, 25(18), 3225–3229.
Bendale, P., Triggs, B., & Kingsbury, N. (2010). Multiscale keypoint analysis based on complex wavelets. In British Machine Vision Conference, pp. 49.1–49.10.
Bourdev, L., Malik, J. (2009). Poselets: Body part detectors trained using 3D human pose annotations. In International Conference on Computer Vision, pp. 1365–1372.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Brink, D., & Satchler, G. (1968). Angular momentum. Oxford: Clarendon Press.
Bülow, T. (2004). Spherical diffusion for 3D surface smoothing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(12), 1650–1654.
Burkhardt, H., & Siggelkow, S. (2001). Invariant features in pattern recognition—fundamentals and applications. In C. Kotropoulos & I. Pitas (Eds.), Nonlinear model-based image/video processing and analysis (pp. 269–307). New York: Wiley.
Chang, C.-C., Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Dalal, N., Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893.
Driscoll, J., & Healy, D. (1994). Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics, 15(2), 202–250.
Fan, R., Chang, K., Hsieh, C., Wang, X., & Lin, C. (2008). LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research, 9, 1871–1874.
Fehr, J. (2010). Local rotation invariant patch descriptors for 3D vector fields. In International Conference on Pattern Recognition, pp. 1381–1384.
Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645.
Flitton, G., Breckon, T., & Megherbi, N. (2010). Object recognition using 3D SIFT in complex CT volumes. In British Machine Vision Conference, pp. 11.1–11.12.
Fornasier, M., & Toniolo, D. (2005). Fast, robust and efficient 2D pattern recognition for re-assembling fragmented images. Pattern Recognition, 38(11), 2074–2087.
Förstner, W., Gülch, E. (1987). A fast operator for detection and precise location of distinct points, corners and centres of circular features. In ISPRS intercommission conference on fast processing of photogrammetric data, pp. 281–305.
Freeman, W., & Adelson, E. (1991). The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(9), 891–906.
Gauglitz, S. (2011). Improving keypoint orientation assignment. In British Machine Vision Conference, pp. 93.1–93.11.
Giannakis, G. (1989). Signal reconstruction from multiple correlations: frequency- and time-domain approaches. Journal of the Optical Society of America A, 6(5), 682–697.
Golub, G., & Van Loan, C. (1996). Matrix computations. Baltimore: Johns Hopkins Univ Press.
Green, R. (2003). Spherical harmonic lighting: The gritty details. In Game Developers Conference, 2, 2–3.
Haasdonk, B., & Burkhardt, H. (2007). Invariant kernel functions for pattern analysis and machine learning. Machine Learning, 68(1), 35–61.
Heitz, G., Koller, D. (2008). Learning spatial context: Using stuff to find things. In European Conference on Computer Vision, pp. 30–43.
Jacovitti, G., & Neri, A. (2000). Multiresolution circular harmonic decomposition. IEEE Transactions on Signal Processing, 48(11), 3242–3247.
Kavukcuoglu, K., Ranzato, M., Fergus, R., LeCun, Y. (2009). Learning invariant features through topographic filter maps. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1605–1612.
Kazhdan, M., Funkhouser, T., Rusinkiewicz, S. (2003). Rotation invariant spherical harmonic representation of 3D shape descriptors. In Eurographics/ACM SIGGRAPH symposium on Geometry processing, pp. 156–164.
Kläser, A., Marszałek, M., Schmid, C. (2008). A spatio-temporal descriptor based on 3D-gradients. In British Machine Vision Conference, pp. 995–1004.
Knopp, J., Prasad, M., Van Gool, L. (2010a). Orientation invariant 3D object classification using Hough transform based methods. In ACM Multimedia Workshop, pp. 15–20.
Knopp, J., Prasad, M., Willems, G., Timofte, R., Van Gool, L. (2010b). Hough transform and 3D SURF for robust three dimensional classification. In European Conference on Computer Vision, pp. 589–602.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
Lenz, R. (1990). Group theoretical methods in image processing. Berlin: Springer.
Lin, W., Liu, L., Matsushita, Y., Low, K., Liu, S. (2012). Aligning images in the wild. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8.
Liu, K., Skibbe, H., Schmidt, T., Blein, T., Palme, K., & Ronneberger, O. (2011). 3D rotation-invariant description from tensor operation on spherical HOG field. In British Machine Vision Conference, pp. 33.1–33.12.
Liu, K., Wang, Q., Driever, W., Ronneberger, O. (2012). 2D/3D Rotation-invariant Detection using Equivariant Filters and Kernel Weighted Mapping. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 917–924.
Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
Makadia, A., & Daniilidis, K. (2010). Spherical correlation of visual representations for 3D model retrieval. International Journal of Computer Vision, 89(2), 193–210.
Memisevic, R., & Hinton, G. (2010). Learning to represent spatial transformations with factored higher-order boltzmann machines. Neural Computation, 22(6), 1473–1492.
Özuysal, M., Calonder, M., Lepetit, V., & Fua, P. (2010). Fast keypoint recognition using random ferns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 448–461.
Ponce, C., & Singer, A. (2011). Computing steerable principal components of a large set of images and their rotations. IEEE Transactions on Image Processing, 20(11), 3051–3062.
Reisert, M., & Burkhardt, H. (2008a). Efficient tensor voting with 3D tensorial harmonics. In CVPR Workshops.
Reisert, M., & Burkhardt, H. (2008b). Equivariant holomorphic filters for contour denoising and rapid object detection. IEEE Transactions on Image Processing, 17(2), 190–203.
Reisert, M., & Burkhardt, H. (2009). Spherical tensor calculus for local adaptive filtering. In S. Aja-Fernández, R. de Luis García, D. Tao, & X. Li (Eds.), Tensors in image processing and computer vision, Advances in Pattern Recognition (pp. 153–178). USA: Springer.
Ronneberger, O., Burkhardt, H., & Schultz, E. (2002). General-purpose Object Recognition in 3D Volume Data Sets using Gray-Scale Invariants—Classification of Airborne Pollen-Grains Recorded with a Confocal Laser Scanning Microscope. In International Conference on Pattern Recognition, 2, 290–295.
Ronneberger, O., Liu, K., Rath, M., Ruess, D., Mueller, T., Skibbe, H., et al. (2012). ViBE-Z: a framework for 3D virtual colocalization analysis in zebrafish larval brains. Nature Methods, 9(7), 735–742.
Ronneberger, O., Wang, Q., & Burkhardt, H. (2007). 3D Invariants with High Robustness to Local Deformations for Automated Pollen Recognition. In DAGM conference on Pattern recognition, pp. 455–435.
Rose, M. (1957). Elementary theory of angular momentum. New York: Wiley.
Scherer, M., Walter, M., & Schreck, T. (2010). Histograms of Oriented Gradients for 3D Model Retrieval. In International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, pp. 41–48.
Schmidt, T., Keuper, M., Pasternak, T., Palme, K., & Ronneberger, O. (2012). Modeling of Sparsely Sampled Tubular Surfaces Using Coupled Curves. In DAGM conference on Pattern recognition, pp. 83–92.
Schmidt, U., Roth, S. (2012). Learning rotation-aware features: From invariant priors to equivariant descriptors. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 2050–2057.
Schultz, T., Weickert, J., & Seidel, H. (2009). A higher-order structure tensor. In D. Laidlaw & J. Weickert (Eds.), Visualization and processing of tensor fields (pp. 263–279). Berlin: Springer.
Sheng, Y., & Arsenault, H. (1986). Experiments on pattern recognition using invariant Fourier-Mellin descriptors. Journal of the Optical Society of America A, 3(6), 771–776.
Shilane, P., Min, P., Kazhdan, M., Funkhouser, T. (2004). The Princeton Shape Benchmark. In International Conference on Shape Modeling and Applications, pp. 167–178.
Skibbe, H., & Reisert, M. (2012). Circular Fourier-HOG features for rotation invariant object detection in biomedical images. In IEEE International Symposium on Biomedical Imaging, pp. 450–453.
Skibbe, H., Reisert, M., & Burkhardt, H. (2011). SHOG-spherical HOG descriptors for rotation invariant 3D object detection. In DAGM conference on Pattern recognition, pp. 142–151.
Skibbe, H., Reisert, M., Ronneberger, O., & Burkhardt, H. (2009). Increasing the dimension of creativity in rotation invariant feature design using 3D tensorial harmonics. In DAGM conference on Pattern recognition, pp. 141–150.
Skibbe, H., Reisert, M., Schmidt, T., Brox, T., Ronneberger, O., Burkhardt, H. (2012). Fast rotation invariant 3D feature computation utilizing efficient local neighborhood operators. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(8), 1563–1575. Software available at https://bitbucket.org/skibbe/sta-imagetoolbox
Takacs, G., Chandrasekhar, V., Tsai, S., Chen, D., Grzeszczuk, R., Girod, B. (2010). Unified real-time tracking and recognition with rotation-invariant fast features. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 934–941.
Vedaldi, A., Blaschko, M., Zisserman, A. (2011). Learning equivariant structured output SVM regressors. In International Conference on Computer Vision, pp. 959–966.
Villamizar, M., Moreno-Noguer, F., Andrade-Cetto, J., Sanfeliu, A. (2010). Efficient rotation invariant object detection using boosted random ferns. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1038–1045.
Wang, Q., Ronneberger, O., & Burkhardt, H. (2009). Rotational invariance based on fourier analysis in polar and spherical coordinates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 1715–1722.
Wolberg, G., Zokai, S. (2000). Robust image registration using log-polar transform. In IEEE International Conference on Image Processing, pp. 493–496.

Acknowledgments

This study was supported by the Excellence Initiative of the German Federal and State Governments: BIOSS Centre for Biological Signalling Studies (EXC 294) and the Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research) Project: New Methods in Systems Biology (SYSTEC, 0101-31P5914) – Quantitative 3D and 4D cell analysis in living organisms.

Henrik Skibbe is indebted to the Baden-Württemberg Stiftung for the financial support by the Elite Program for Post-docs. Dr. Thomas Blein was supported by a long-term post-doctoral fellowship from the European Molecular Biology Organization (EMBO, ALTF250-2009). Dr. Thomas Blein and Prof. Klaus Palme are also supported by the Deutsches Zentrum für Luft- und Raumfahrt (DLR 50WB1022) and the European Union Framework 6 Program (AUTOSCREEN, LSHG-CT-2007-037897).

Author information

Henrik Skibbe
Present address: Integrated Systems Biology Lab., Department of Systems Science, Kyoto University, Kyoto, 611-0011, Japan
Thomas Blein
Present address: Institut Jean-Pierre Bourgin, INRA Centre de Versailles-Grignon, 78026 Versailles, France

Authors and Affiliations

Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
Kun Liu, Henrik Skibbe, Thorsten Schmidt, Thomas Brox & Olaf Ronneberger
Institute of Biology II (Botany), University of Freiburg, 79104 Freiburg, Germany
Thomas Blein & Klaus Palme

Corresponding author

Correspondence to Kun Liu.

Additional information

All authors are part of the BIOSS Centre for Biological Signalling Studies, University of Freiburg.

Appendix

1.1 Computation of the Tensorial Harmonic Expansion

Given a spherical tensor field $\mathbf{F} \in \mathcal{T}^{\ell}$, the tensorial harmonic expansion in Eq. (34) can be computed more efficiently than by direct projection.

First we compute the scalar (SH) expansion on each individual tensor component ${F}_m:\mathbb{R }^3 \rightarrow \mathbb{C }$ as

$$\begin{aligned} F_m(r,\theta,\varphi) = \sum_{j=0}^{\infty}\sum_{n=-j}^{j} \overline{\hat{b}_{m,n}^{j}}(r)\, Y^j_n(\theta,\varphi), \end{aligned}$$

(42)

then the tensorial expansion coefficients $\mathbf{a}^{j,k}(r)$ can be computed from the above component-wise expansions by a derived relation as

$$\begin{aligned} a_{m'}^{j,k}(r) = \frac{2(j+k)+1}{2\ell+1} \sum_{m,n} \hat{b}_{m,n}^{j}(r)\, C(\ell,m|j+k,m',j,n), \end{aligned}$$

(43)

where $-(j+k) \le m' \le j+k$. See Reisert and Burkhardt (2009) for proofs. We need to compute the Clebsch-Gordan coefficients $C$ in this circumstance. An easy way is to use their relation to the Wigner 3-j symbols $\begin{pmatrix} j_1 & j_2 & j_3\\ m_1 & m_2 & m_3 \end{pmatrix}$ (Brink and Satchler 1968), which is written as

$$\begin{aligned} C(j_3,m_3|j_1,m_1,j_2,m_2) = (-1)^{j_1-j_2+m_3} \sqrt{2j_3+1} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}. \end{aligned}$$

(44)

One can use the function “gsl_sf_coupling_3j” in the GNU Scientific Library to compute the Wigner 3-j symbol. Note that this function takes its arguments in half-integer units, i.e. as $2j$ and $2m$.
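The relations in Eqs. (43) and (44) can be sketched in a few lines of Python. The block below is an illustrative sketch under stated assumptions, not the authors' implementation. It evaluates the Wigner 3-j symbol for integer arguments with Racah's formula (standing in for a library routine), converts it to a Clebsch-Gordan coefficient using the standard relation in which $m_3$ enters the bottom row of the 3-j symbol with a negative sign, and then forms the sum of Eq. (43). The function names `wigner_3j`, `clebsch_gordan` and `tensorial_coefficients`, and the dictionary layout of `b_hat`, are hypothetical choices made for this example.

```python
from math import factorial as f, sqrt


def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3-j symbol for integer arguments (Racah's formula)."""
    # Selection rules: projections sum to zero, |m| <= j, triangle inequality.
    if m1 + m2 + m3 != 0 or abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    if j3 < abs(j1 - j2) or j3 > j1 + j2:
        return 0.0
    delta = f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3) / f(j1 + j2 + j3 + 1)
    norm = (f(j1 + m1) * f(j1 - m1) * f(j2 + m2) * f(j2 - m2)
            * f(j3 + m3) * f(j3 - m3))
    # Sum only over k for which every factorial argument is non-negative.
    k_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    k_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = 0.0
    for k in range(k_min, k_max + 1):
        denom = (f(k) * f(j1 + j2 - j3 - k) * f(j1 - m1 - k) * f(j2 + m2 - k)
                 * f(j3 - j2 + m1 + k) * f(j3 - j1 - m2 + k))
        total += (-1) ** k / denom
    return (-1) ** (j1 - j2 - m3) * sqrt(delta * norm) * total


def clebsch_gordan(j3, m3, j1, m1, j2, m2):
    """C(j3, m3 | j1, m1, j2, m2) in the argument order used in the text."""
    return (-1) ** (j1 - j2 + m3) * sqrt(2 * j3 + 1) * wigner_3j(j1, j2, j3, m1, m2, -m3)


def tensorial_coefficients(b_hat, ell, j, k):
    """Eq. (43) at one radius: b_hat maps (m, n) -> b^j_{m,n}; returns {m': a^{j,k}_{m'}}."""
    J = j + k
    scale = (2 * J + 1) / (2 * ell + 1)
    coeffs = {}
    for m_prime in range(-J, J + 1):
        total = 0.0
        for (m, n), value in b_hat.items():
            total += value * clebsch_gordan(ell, m, J, m_prime, j, n)
        coeffs[m_prime] = scale * total
    return coeffs


# Toy input (assumption): rank ell = 1, SH order j = 1, b_hat[m, n] = delta_{mn}.
b_hat = {(m, n): (1.0 if m == n else 0.0) for m in (-1, 0, 1) for n in (-1, 0, 1)}
a = tensorial_coefficients(b_hat, ell=1, j=1, k=-1)
```

For this toy input the only coefficient is $a_0$, which evaluates to 1 up to rounding, because $C(1,m|0,0,1,m) = 1$ for every $m$. Half-integer arguments are deliberately not handled, since only integer ranks occur here.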

1.2 Spherical Gaussian Derivatives

For $\mathbf{F} \in \mathcal{T}_\ell$, the spherical up-derivative $\boldsymbol{\nabla}^{1}:\mathcal{T}_\ell \rightarrow \mathcal{T}_{\ell+1}$ and the down-derivative $\boldsymbol{\nabla}_{1}:\mathcal{T}_\ell \rightarrow \mathcal{T}_{\ell-1}$ (Reisert and Burkhardt 2009) are defined as

$$\begin{aligned} \boldsymbol{\nabla}^{1}\mathbf{F} &:= \boldsymbol{\nabla} \bullet_{(\ell+1|1,\ell)} \mathbf{F}, \end{aligned}$$

(45)

$$\begin{aligned} \boldsymbol{\nabla}_{1}\mathbf{F} &:= \boldsymbol{\nabla} \bullet_{(\ell-1|1,\ell)} \mathbf{F}, \end{aligned}$$

(46)

where $\boldsymbol{\nabla} = (\frac{1}{\sqrt{2}}(\partial_x - \mathrm{i}\partial_y), \partial_z, -\frac{1}{\sqrt{2}}(\partial_x + \mathrm{i}\partial_y))$ is the spherical gradient operator with $\partial_x,\partial_y,\partial_z$ being the standard partial derivatives. It is further defined that $\boldsymbol{\nabla}^{j_u}_{j_d}\mathbf{V} = \underbrace{\boldsymbol{\nabla}_{1}\ldots \boldsymbol{\nabla}_{1}}_{j_d \text{ times}}\underbrace{\boldsymbol{\nabla}^{1}\ldots \boldsymbol{\nabla}^{1}}_{j_u \text{ times}}\mathbf{V}$. One important property of this operation is that it maps a spherical tensor field to a higher or lower rank spherical tensor field. This is analogous to the fact that computing derivatives on a scalar field produces a gradient field, which is a rank-$1$ tensor, and a subsequent derivative can either produce the Hessian (rank-$2$ tensor) or the divergence (rank-$0$ tensor).

For $\mathbf{V} = \boldsymbol{\nabla}^{1} \mathbf{V}'$, where $\mathbf{V}':\mathbb{R}^3 \rightarrow \mathbb{C}^{2(\ell-1)+1}$ and $\mathbf{V}:\mathbb{R}^3 \rightarrow \mathbb{C}^{2\ell+1}$, by indexing the elements of $\mathbf{V}$ and $\mathbf{V}'$ as $\{V_{-\ell},\ldots,V_{\ell}\}$ and $\{V'_{-\ell+1},\ldots,V'_{\ell-1}\}$, the computation rule of $\boldsymbol{\nabla}^{1}$ is:

$$\begin{aligned} V_{m} &= w(\ell,m,-1)\,\frac{1}{\sqrt{2}}(\partial_x - \mathrm{i}\partial_y)V'_{m+1} \\ &\quad + w(\ell,m,0)\,\partial_z V'_m \\ &\quad - w(\ell,m,1)\,\frac{1}{\sqrt{2}}(\partial_x + \mathrm{i}\partial_y)V'_{m-1}, \end{aligned}$$

(47)

where $w$ denotes the weighting coefficients, which can be pre-computed from two Clebsch-Gordan coefficients as $w(\ell,m,a) = \frac{C(\ell,m|\ell-1,m-a,1,a)}{C(\ell,0|\ell-1,0,1,0)}$. Thus the computation of the spherical tensor derivatives is just a group of weighted combinations of ordinary Cartesian derivatives.

Equation (47) also applies to the spherical down-derivative $\mathbf{V} = \boldsymbol{\nabla}_{1} \mathbf{V}'$, where $\mathbf{V}:\mathbb{R}^3 \rightarrow \mathbb{C}^{2\ell+1}$ and $\mathbf{V}':\mathbb{R}^3 \rightarrow \mathbb{C}^{2(\ell+1)+1}$. The only difference is in the coefficients: $w(\ell,m,a) = \frac{C(\ell,m|\ell+1,m-a,1,a)}{C(\ell,0|\ell+1,0,1,0)}$.
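The pre-computation of the weights $w(\ell,m,a)$ for both derivative types can be sketched as follows. This is a minimal, self-contained illustration rather than the authors' code. It assumes integer ranks, uses Racah's formula for the Wigner 3-j symbol, and uses the standard Clebsch-Gordan relation with $-m_3$ in the bottom row of the 3-j symbol. The helper names `up_weight` and `down_weight` are hypothetical.

```python
from math import factorial as f, sqrt


def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3-j symbol for integer arguments (Racah's formula)."""
    if m1 + m2 + m3 != 0 or abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    if j3 < abs(j1 - j2) or j3 > j1 + j2:
        return 0.0
    delta = f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3) / f(j1 + j2 + j3 + 1)
    norm = (f(j1 + m1) * f(j1 - m1) * f(j2 + m2) * f(j2 - m2)
            * f(j3 + m3) * f(j3 - m3))
    k_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    k_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = 0.0
    for k in range(k_min, k_max + 1):
        denom = (f(k) * f(j1 + j2 - j3 - k) * f(j1 - m1 - k) * f(j2 + m2 - k)
                 * f(j3 - j2 + m1 + k) * f(j3 - j1 - m2 + k))
        total += (-1) ** k / denom
    return (-1) ** (j1 - j2 - m3) * sqrt(delta * norm) * total


def clebsch_gordan(j3, m3, j1, m1, j2, m2):
    """C(j3, m3 | j1, m1, j2, m2) in the argument order used in the text."""
    return (-1) ** (j1 - j2 + m3) * sqrt(2 * j3 + 1) * wigner_3j(j1, j2, j3, m1, m2, -m3)


def up_weight(ell, m, a):
    """w(ell, m, a) for the up-derivative (input rank ell - 1, output rank ell)."""
    return (clebsch_gordan(ell, m, ell - 1, m - a, 1, a)
            / clebsch_gordan(ell, 0, ell - 1, 0, 1, 0))


def down_weight(ell, m, a):
    """w(ell, m, a) for the down-derivative (input rank ell + 1, output rank ell)."""
    return (clebsch_gordan(ell, m, ell + 1, m - a, 1, a)
            / clebsch_gordan(ell, 0, ell + 1, 0, 1, 0))


# Weight tables for the two lowest-rank cases, indexed by (m, a).
up_rank1 = {(m, a): up_weight(1, m, a) for m in (-1, 0, 1) for a in (-1, 0, 1)}
down_rank0 = {(0, a): down_weight(0, 0, a) for a in (-1, 0, 1)}
```

For $\ell = 1$ the up-derivative weights with $a = m$ equal 1 and all others vanish, so Eq. (47) reduces to the spherical gradient of a scalar field. For the down-derivative to rank 0 the weights are $w(0,0,0) = 1$ and $w(0,0,\pm 1) = -1$. Substituting these into Eq. (47) for the spherical gradient of a scalar field $f$ gives $\partial_x^2 f + \partial_y^2 f + \partial_z^2 f$, consistent with the divergence remark above.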

A fast filtering tool is derived by computing the derivatives of an isotropic Gaussian function, which creates a series of basis functions of different tensor ranks, $\boldsymbol{\nabla}^{j_u}_{j_d} G \in \mathcal{T}_{j_u-j_d}$ (where $j_u \ge j_d$ and $G$ is a Gaussian function). The convolution with the spherical Gaussian derivatives can be computed efficiently, like the standard Gaussian derivatives, because convolution and differentiation commute. For example, for a spherical tensor field $\mathbf{F} \in \mathcal{T}_\ell$ we have

$$\begin{aligned} \boldsymbol{\nabla}^{j_u}_{j_d} G \;\widetilde{\bullet}_{(\ell+j_u-j_d|j_u-j_d,\ell)}\; \mathbf{F} = \boldsymbol{\nabla}^{j_u}_{j_d}(G \;\widetilde{\bullet}_{(\ell|0,\ell)}\; \mathbf{F}). \end{aligned}$$

(48)

We can therefore compute multiple filtering outputs (for different $\{j_u,j_d\}$) with a single tensorial convolution followed by differentiations. Note that a convolution such as $G \;\widetilde{\bullet}_{(\ell|0,\ell)}\; \mathbf{F}$ is equivalent to ordinary Gaussian convolution applied to each component, $[G \;\widetilde{\bullet}_{(\ell|0,\ell)} \mathbf{F}]_m = G * F_m$ (because $C(\ell,m|\ell,m,0,0) = 1$). The output is a tensor field of rank $\ell + j_u - j_d$. In the context of this paper, the spherical Gaussian derivatives (SGD) can be regarded as derivatives taken after a scale-space selection by Gaussian convolution. The only property that matters for rotation invariance is that the introduced basis functions are spherical tensor fields.
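The commutativity that Eq. (48) relies on can be illustrated in a deliberately simplified setting: a 1D scalar signal, a sampled Gaussian, and a finite-difference derivative written as a convolution kernel. The sketch below is an assumption-laden toy, not the tensorial filtering of the paper. The names `convolve`, `gaussian_kernel` and `DIFF` are hypothetical. It only shows that filtering with the differentiated Gaussian gives the same result as differentiating the Gaussian-smoothed signal, so that one smoothing pass can be shared between several derivative outputs.

```python
from math import exp


def convolve(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def gaussian_kernel(sigma, radius):
    """Sampled, normalised Gaussian on the integer grid [-radius, radius]."""
    g = [exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    s = sum(g)
    return [v / s for v in g]


# Finite difference as a convolution kernel: out[n] = 0.5 * (f[n] - f[n - 2]),
# i.e. a central difference delayed by one sample.
DIFF = [0.5, 0.0, -0.5]

signal = [0.0, 0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 0.0]
G = gaussian_kernel(sigma=1.5, radius=4)

# Left-hand side: convolve the signal with the differentiated Gaussian.
lhs = convolve(convolve(DIFF, G), signal)

# Right-hand side: smooth once, then differentiate the smoothed signal.
smoothed = convolve(G, signal)
rhs = convolve(DIFF, smoothed)

# The same smoothed signal can be reused for a second-order derivative.
second = convolve(DIFF, rhs)

max_diff = max(abs(x - y) for x, y in zip(lhs, rhs))
```

In this discrete setting the two sides agree up to floating-point rounding, because the difference operator is itself a convolution and convolution is associative.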

About this article

Cite this article

Liu, K., Skibbe, H., Schmidt, T. et al. Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates. Int J Comput Vis 106, 342–364 (2014). https://doi.org/10.1007/s11263-013-0634-z

Received: 30 September 2012
Accepted: 21 May 2013
Published: 07 June 2013
Issue Date: February 2014
DOI: https://doi.org/10.1007/s11263-013-0634-z
