Mesh-based piecewise planar motion compensation and optical flow clustering for ROI coding

  • Holger Meuel, Marco Munderloh, Matthias Reso and Jörn Ostermann
Abstract

For the transmission of aerial surveillance videos taken from unmanned aerial vehicles (UAVs), region-of-interest (ROI)-based coding systems are of growing interest as a means of coping with the limited channel capacities available. We present a fully automatic detection and coding system capable of transmitting high-resolution aerial surveillance video at very low bit rates. Our coding system transmits ROI areas only and distinguishes two kinds of ROI. First, to limit the transmission bit rate while retaining a high-quality view of the ground, we transmit only the newly emerging areas (ROI-NA) of each frame instead of the entire frame; at the decoder side, the surface of the earth is reconstructed from the transmitted ROI-NA by means of global motion compensation (GMC). Second, to retain the movement of objects whose motion does not conform to that of the ground (such as moving cars and the ground they previously occluded), we additionally treat regions containing such objects as interesting (ROI-MO). Both ROIs are then used as input to an externally controlled video encoder. While GMC reconstructs the ground from ROI-NA, we use mesh-based motion compensation to compute the pel-wise luminance difference (difference image) between the mesh-based motion-compensated image and the current input image in order to detect ROI-MO. High-energy spots within this difference image serve as seeds to select the corresponding superpixels from an independent, temporally consistent superpixel segmentation of the input image, yielding accurate shape information for each ROI-MO. For a false-positive detection rate (regions falsely classified as containing local motion) of less than 2%, we detect more than 97% true positives (correctly detected ROI-MOs) in challenging scenarios. Furthermore, we propose a modified high-efficiency video coding (HEVC) encoder.
Retaining full HDTV video resolution at 30 fps and subjectively high quality, we achieve bit rates of about 0.6–0.9 Mbit/s, a bit rate saving of about 90% compared with an unmodified HEVC encoder.
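The seed-based ROI-MO selection described in the abstract (high-energy spots in the difference image pick out superpixels from an independent segmentation) can be sketched as follows. This is a minimal illustration of the idea only, not the authors' implementation; the function name, the threshold value, and the toy inputs are assumptions for demonstration:

```python
import numpy as np

def select_roi_mo_superpixels(diff_image, superpixel_labels, energy_threshold):
    """Sketch of seed-based ROI-MO selection: pels of the luminance
    difference image whose energy exceeds a threshold act as seeds, and
    every superpixel containing at least one seed is returned as ROI-MO."""
    seeds = np.abs(diff_image) > energy_threshold      # high spots of energy
    seeded = np.unique(superpixel_labels[seeds])       # superpixels hit by a seed
    return np.isin(superpixel_labels, seeded)          # pel-accurate ROI-MO mask

# Toy example: a 4x4 frame split into four 2x2 superpixels; one strong
# difference pel in the top-left superpixel marks that whole superpixel as ROI-MO.
diff = np.zeros((4, 4))
diff[0, 0] = 100.0
labels = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0), 2, axis=1)
mask = select_roi_mo_superpixels(diff, labels, energy_threshold=50.0)
```

Because the mask follows superpixel boundaries rather than the raw seed pels, the detected ROI-MO carries the accurate object shape provided by the segmentation.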

Copyright
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Corresponding author
H. Meuel, email: meuel@tnt.uni-hannover.de
APSIPA Transactions on Signal and Information Processing
  • ISSN: 2048-7703
  • EISSN: 2048-7703