Abstract
Identifying the source of a forged printed document from digital evidence is a challenging problem. Over the past several years, digital forensics for printed-document source identification has become increasingly important, as it bears on the investigation and prosecution of many types of crime. Unlike invasive forensic approaches, which require a fragment of the printed document as a specimen for verification, noninvasive techniques use an optical mechanism to explore the relationship between scanned images and the source printer. To characterize this relationship, the proposed decision-theoretic approach applies image processing techniques and data exploration methods to compute a large set of statistical features, including: Local Binary Patterns (LBP), the Gray Level Co-occurrence Matrix (GLCM), the Discrete Wavelet Transform (DWT), spatial filters, the Wiener filter, the Gabor filter, Haralick features, and SFTA features. The proposed aggregation method then combines the extracted features with a decision-fusion model of feature selection for classification. In addition, the impact of different paper textures and paper colors on printed-source identification is investigated. For comparison, an up-to-date deep learning system based on Convolutional Neural Networks (CNNs), which learn discriminative features automatically for complex image classification problems, is also developed. Experimental results comparing the two systems indicate that the proposed system achieves the best overall prediction accuracy for both image and text input and is superior to existing approaches. In brief, the proposed decision-theoretic model can be implemented very efficiently for real-world digital forensic applications.
Acknowledgments
This work was partially supported by the National Science Council in Taiwan, Republic of China, under NSC104-2410-H-009-020-MY2 and NSC106-2410-H-009-022-.
The authors would like to thank the anonymous reviewers for their valuable comments, which improved the quality of this manuscript. Special thanks to Jin-Sheng Yin and Goang-Jiun Wang at National Chiao Tung University, who helped with the revision and the software experiments.
Appendix: Formulas of feature filters
A brief description of the formulas for the ten feature filter sets is given below:
Feature Filter | Image Quality Measure | Formula |
GLCM | Region of interest (ROI) GLCM | \( R=\sum_{(i,j)\in ROI} 1 \), \( GLCM(i,j)=\frac{Img(i,j)}{\sum_{(i,j)} Img(i,j)} \), where \( (i,j) \) indicates a spatial location in the image and \( GLCM(i,j) \) is the normalized co-occurrence probability at \( (i,j) \). |
DWT | Three dilation wavelet functions \( \Psi^{(H)}(x,y) \), \( \Psi^{(V)}(x,y) \), and \( \Psi^{(D)}(x,y) \) | When the wavelet function is separable, i.e. \( f(x,y)=f_1(x)\,f_2(y) \), these functions can be written as \( \phi(x,y)=\phi(x)\,\phi(y) \), \( \Psi^{(H)}(x,y)=\Psi(x)\,\phi(y) \), \( \Psi^{(V)}(x,y)=\phi(x)\,\Psi(y) \), \( \Psi^{(D)}(x,y)=\Psi(x)\,\Psi(y) \), where \( \Psi^{(H)} \), \( \Psi^{(V)} \), and \( \Psi^{(D)} \) are the horizontal, vertical, and diagonal wavelets. |
Gaussian | \( G(x,y) \) is the Gaussian matrix element at position \( (x,y) \) | \( G(x,y)=\frac{1}{2\pi\sigma^2}\,e^{-\frac{x^2+y^2}{2\sigma^2}} \), where \( \sigma \) is the standard deviation. |
LoG | \( Log(x,y) \) is the high-frequency Laplacian-of-Gaussian filter | \( Log(x,y)=-\frac{1}{\pi\sigma^4}\left[1-\frac{x^2+y^2}{2\sigma^2}\right]e^{-\frac{x^2+y^2}{2\sigma^2}} \) |
Unsharp | \( f_s(x,y) \) is the sharpened image from the unsharp mask | \( f_s(x,y)=f(x,y)-\overline{f}(x,y) \), where \( \overline{f}(x,y) \) is a blurred version of \( f(x,y) \). |
Wiener | \( H(u,v) \) is the Wiener filter transfer function | \( g(x,y)=f(x,y)+n(x,y) \), \( H(u,v)=\frac{P_f(u,v)}{P_f(u,v)+\sigma^2} \), where \( \sigma^2 \) is the variance of the noise \( n(x,y) \) and \( P_f(u,v) \) is the signal power spectrum. |
Gabor | \( f \) is the frequency of the sinusoidal function, \( \theta \) is the orientation of the Gabor function, and \( \gamma \) and \( \eta \) are the sharpness parameters along the x and y axes | \( G(x,y)=\frac{f^2}{\pi\gamma\eta}\exp\left(-\frac{f^2}{\gamma^2}{x^{\prime}}^2-\frac{f^2}{\eta^2}{y^{\prime}}^2\right)\exp\left(j2\pi f x^{\prime}+\phi\right) \), \( x^{\prime}=x\cos\theta+y\sin\theta \), \( y^{\prime}=-x\sin\theta+y\cos\theta \) |
Haralick | Fourteen textural features computed from the normalized co-occurrence matrix \( p(i,j) \) | Angular second moment: \( \sum_i\sum_j p(i,j)^2 \); Contrast: \( \sum_{n=0}^{N_g-1} n^2\left\{\sum_{i=1}^{N_g}\sum_{j=1}^{N_g} p(i,j)\right\},\ \left|i-j\right|=n \); Correlation: \( \frac{\sum_i\sum_j (ij)\,p(i,j)-\mu_x\mu_y}{\sigma_x\sigma_y} \); Sum of squares (variance): \( \sum_i\sum_j (i-\mu)^2\,p(i,j) \); Inverse difference moment: \( \sum_i\sum_j \frac{p(i,j)}{1+(i-j)^2} \); Sum average: \( \sum_{i=2}^{2N_g} i\,p_{x+y}(i) \); Sum variance: \( \sum_{i=2}^{2N_g} (i-f_s)^2\,p_{x+y}(i) \), where \( f_s \) is the sum entropy; Sum entropy: \( f_s=-\sum_{i=2}^{2N_g} p_{x+y}(i)\log\left\{p_{x+y}(i)\right\} \); Entropy: \( -\sum_i\sum_j p(i,j)\log\left(p(i,j)\right) \); Difference variance: variance of \( p_{x-y}(i) \); Difference entropy: \( -\sum_{i=0}^{N_g-1} p_{x-y}(i)\log\left\{p_{x-y}(i)\right\} \); Information measure of correlation 1: \( \frac{HXY-HXY1}{\max\left\{HX,HY\right\}} \); Information measure of correlation 2: \( \left(1-\exp\left[-2.0\left(HXY2-HXY\right)\right]\right)^{1/2} \); Maximal correlation coefficient: (second largest eigenvalue of \( Q \))\( ^{1/2} \), where \( Q(i,j)=\sum_k \frac{p(i,k)\,p(j,k)}{p_x(i)\,p_y(k)} \). Here \( HX \) and \( HY \) are the entropies of \( p_x \) and \( p_y \), \( HXY=-\sum_i\sum_j p(i,j)\log\left(p(i,j)\right) \), \( HXY1=-\sum_i\sum_j p(i,j)\log\left\{p_x(i)\,p_y(j)\right\} \), and \( HXY2=-\sum_i\sum_j p_x(i)\,p_y(j)\log\left\{p_x(i)\,p_y(j)\right\} \). |
Fractal | \( \Delta(x,y) \): fractal (border) feature vector | \( \Delta(x,y)=\begin{cases}1, & \text{if } \exists\,(x^{\prime},y^{\prime})\in N_4[x,y]:\ I_b(x,y)=1 \wedge I_b(x^{\prime},y^{\prime})=0,\\ 0, & \text{otherwise,}\end{cases} \) where \( N_4[x,y] \) is the set of pixels that are 4-connected to \( (x,y) \) in the image. \( \Delta(x,y) \) takes the value 1 if the pixel at position \( (x,y) \) in the binary image \( I_b \) has the value 1 and has at least one 4-connected neighbor with the value 0; otherwise \( \Delta(x,y) \) takes the value 0. |
LBP | \( LBP_{P,R}(x_c,y_c) \): LBP features with \( P \) sampling points on a circle of radius \( R \) | \( LBP_{P,R}(x_c,y_c)=\sum_{p=0}^{P-1} s(g_p-g_c)\,2^p \), \( s(x)=\begin{cases}1, & \text{if } x\ge 0,\\ 0, & \text{otherwise,}\end{cases} \) where \( g_c \) is the gray value of the center pixel \( (x_c,y_c) \) and \( g_p \) are the gray values of the \( P \) sampling points. |
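As an illustration of the GLCM and Haralick rows above, the following sketch computes a normalized co-occurrence matrix for a single pixel offset and four of the fourteen Haralick features in pure NumPy. The function names (`glcm`, `haralick_subset`) and the single-offset formulation are ours, not the paper's implementation.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Normalized GLCM for one non-negative pixel offset (dx, dy).

    `img` must hold integer gray levels in [0, levels).
    """
    h, w = img.shape
    P = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1.0
    return P / P.sum()  # divide by total pair count -> probabilities

def haralick_subset(P):
    """Four of the fourteen Haralick features from a normalized GLCM P."""
    n = P.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    asm = np.sum(P ** 2)                      # angular second moment
    contrast = np.sum((i - j) ** 2 * P)       # contrast
    idm = np.sum(P / (1.0 + (i - j) ** 2))    # inverse difference moment
    nz = P[P > 0]
    entropy = -np.sum(nz * np.log(nz))        # entropy (natural log)
    return asm, contrast, idm, entropy
```

In practice a library routine (e.g. scikit-image's co-occurrence utilities) would replace the explicit loops; they are kept here to mirror the formulas term by term.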
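The LBP formula in the last row can likewise be sketched for the common P = 8, R = 1 case, where the sampling circle reduces to the 8-neighborhood. This vectorized helper (`lbp8`, a name introduced here for illustration) thresholds each neighbor against the center pixel with \( s(\cdot) \) and packs the resulting bits into a code:

```python
import numpy as np

def lbp8(img):
    """LBP with P=8, R=1 over the 8-neighborhood (interior pixels only)."""
    # Neighbor offsets (dy, dx), ordered so bit p corresponds to point g_p.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    gc = img[1:h - 1, 1:w - 1]                      # center pixels g_c
    for p, (dy, dx) in enumerate(offsets):
        gp = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]  # shifted neighbors g_p
        out |= (gp >= gc).astype(np.uint8) << p     # s(g_p - g_c) * 2^p
    return out
```

On a constant image every neighbor equals the center, so \( s(g_p-g_c)=1 \) for all eight bits and every interior pixel receives the code 255.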
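The Gaussian and LoG rows translate directly into discrete filter kernels. A minimal sketch, assuming odd kernel sizes and helper names of our own choosing:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Discrete Gaussian kernel G(x, y) = exp(-(x^2+y^2)/(2*sigma^2)) / (2*pi*sigma^2)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()  # renormalize so the truncated filter preserves mean intensity

def log_kernel(size=7, sigma=1.0):
    """Discrete Laplacian-of-Gaussian kernel from the LoG formula above."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    return (-(1.0 / (np.pi * sigma**4))
            * (1 - r2 / (2 * sigma**2))
            * np.exp(-r2 / (2 * sigma**2)))
```

Convolving an image with `gaussian_kernel` gives the smoothing used before feature extraction; `log_kernel` is negative at the center and positive in the surround, the usual band-pass shape of the LoG.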
Tsai, MJ., Yuadi, I. & Tao, YH. Decision-theoretic model to identify printed sources. Multimed Tools Appl 77, 27543–27587 (2018). https://doi.org/10.1007/s11042-018-5938-0