Image stitching

Two images stitched together. The photo on the right is distorted slightly so that it matches up with the one on the left.

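Image stitching or photo stitching is the process of combining multiple photographic images with overlapping fields of view to produce a segmented panorama or high-resolution image. Most approaches require nearly exact overlaps between images and identical exposures to produce seamless results, although some stitching algorithms benefit from differently exposed images by doing high-dynamic-range imaging in regions of overlap.^[3]^[4] Some digital cameras can stitch their photos internally.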

Applications

Image stitching is widely used in modern applications, such as the following:

Document mosaicing^[5]
Image stabilization feature in camcorders that use frame-rate image alignment
High-resolution image mosaics in digital maps and satellite imagery
Medical imaging
Multiple-image super-resolution imaging
Video stitching^[6]
Object insertion

Alcatraz Island, shown in a panorama created by image stitching

Process

The image stitching process can be divided into three main components: image registration, calibration, and blending.

Image stitching algorithms

In order to estimate image alignment, algorithms are needed to determine the appropriate mathematical model relating pixel coordinates in one image to pixel coordinates in another. Algorithms that combine direct pixel-to-pixel comparisons with gradient descent (and other optimization techniques) can be used to estimate these parameters.

Distinctive features can be found in each image and then efficiently matched to rapidly establish correspondences between pairs of images. When multiple images exist in a panorama, techniques have been developed to compute a globally consistent set of alignments and to efficiently discover which images overlap one another.

A final compositing surface onto which to warp or projectively transform and place all of the aligned images is needed, as are algorithms to seamlessly blend the overlapping images, even in the presence of parallax, lens distortion, scene motion, and exposure differences.

Image stitching issues

Since the illumination in two views cannot be guaranteed to be identical, stitching two images could create a visible seam. Seams can also arise when the background changes between two images that share the same continuous foreground. Other major issues to deal with are the presence of parallax, lens distortion, scene motion, and exposure differences. In a non-ideal real-life case, the intensity varies across the whole scene, and so do the contrast and intensity across frames. Additionally, the aspect ratio of a panorama image needs to be taken into account to create a visually pleasing composite.

For panoramic stitching, the ideal set of images will have a reasonable amount of overlap (at least 15–30%) to overcome lens distortion and provide enough detectable features. The images should also have consistent exposure between frames to minimize the probability of seams occurring.

Keypoint detection

Harris corners and differences of Gaussians of Harris corners are good features since they are repeatable and distinct.

One of the first operators for interest point detection was developed by Hans Moravec in 1977 for his research involving the automatic navigation of a robot through a cluttered environment. Moravec also defined the concept of "points of interest" in an image and concluded these interest points could be used to find matching regions in different images. The Moravec operator is considered to be a corner detector because it defines interest points as points where there are large intensity variations in all directions. This is often the case at corners. However, Moravec was not specifically interested in finding corners, just distinct regions in an image that could be used to register consecutive image frames.

Harris and Stephens improved upon Moravec's corner detector by considering the differential of the corner score with respect to direction directly. They needed it as a processing step to build interpretations of a robot's environment based on image sequences. Like Moravec, they needed a method to match corresponding points in consecutive image frames, but were interested in tracking both corners and edges between frames.
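The corner score described above can be illustrated with a minimal sketch. It assumes the commonly quoted form of the Harris and Stephens response, R = det(M) − k·trace(M)², where M is the local sum of gradient products; the box window, the value k = 0.05 and the helper names are illustrative choices, not taken from the text.

```python
import numpy as np

def box_filter(img, radius=1):
    """Sum values over a (2r+1)x(2r+1) window, using edge padding."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def harris_response(img, k=0.05, radius=1):
    """Per-pixel corner score R = det(M) - k * trace(M)^2."""
    img = img.astype(float)
    iy, ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    sxx = box_filter(ix * ix, radius)  # entries of the 2x2 matrix M, summed locally
    syy = box_filter(iy * iy, radius)
    sxy = box_filter(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic image (an assumption for the demo): a bright square on a dark background.
image = np.zeros((20, 20))
image[5:15, 5:15] = 1.0
response = harris_response(image)
corner_score = response[5, 5]   # at a corner of the square
edge_score = response[10, 5]    # midway along an edge
flat_score = response[0, 0]     # in the flat background
```

In this sketch the score is positive at the corner, negative along the edge and zero in the flat region, which is the behaviour that lets a single threshold separate corners from edges.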

SIFT and SURF are more recent keypoint (interest point) detector algorithms; note that SURF is patented and its commercial usage is restricted. Once a feature has been detected, a descriptor method such as the SIFT descriptor can be applied so that features can later be matched.

Registration

Image registration involves matching features^[7] in a set of images or using direct alignment methods to search for image alignments that minimize the sum of absolute differences between overlapping pixels.^[8]
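A direct alignment method can be sketched as a brute-force search over integer translations. The example below is only an illustration under simplifying assumptions: two equally sized grayscale arrays, translation as the only motion model, and a cost that is the absolute difference averaged over the overlap (rather than summed) so that small overlaps are not favoured. The function names are hypothetical.

```python
import numpy as np

def sad_for_shift(ref, moving, dx, dy):
    """Mean absolute difference over the overlap when `moving` is shifted by (dx, dy)."""
    h, w = ref.shape
    # Overlap region expressed in reference coordinates.
    x0, x1 = max(0, dx), min(w, w + dx)
    y0, y1 = max(0, dy), min(h, h + dy)
    if x1 <= x0 or y1 <= y0:
        return np.inf
    ref_patch = ref[y0:y1, x0:x1]
    mov_patch = moving[y0 - dy:y1 - dy, x0 - dx:x1 - dx]
    return np.abs(ref_patch - mov_patch).mean()

def best_translation(ref, moving, max_shift=5):
    """Exhaustively try integer shifts and return the (dx, dy) with the lowest cost."""
    candidates = [(dx, dy)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: sad_for_shift(ref, moving, *s))

# Two crops of the same random scene, offset by a known amount.
rng = np.random.default_rng(0)
scene = rng.random((50, 50))
ref = scene[10:40, 10:40]
moving = scene[8:38, 13:43]
shift = best_translation(ref, moving)   # pixel (x, y) of `moving` lands at (x + dx, y + dy) in `ref`
```

Practical implementations usually replace the exhaustive search with coarse-to-fine or gradient-based optimization, but the cost being minimized is the same idea.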

When using direct alignment methods, one might first calibrate the images to get better results. Additionally, users may input a rough model of the panorama to help the feature matching stage, so that, for example, only neighboring images are searched for matching features. Since there is a smaller group of features to match, the search is more accurate and the comparison executes faster.

To estimate a robust model from the data, a common method is RANSAC. The name RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative method for robust parameter estimation that fits mathematical models to sets of observed data points which may contain outliers. The algorithm is non-deterministic in the sense that it produces a reasonable result only with a certain probability, with this probability increasing as more iterations are performed. Because it is a probabilistic method, different results may be obtained each time the algorithm is run.

The RANSAC algorithm has found many applications in computer vision, including the simultaneous solving of the correspondence problem and the estimation of the fundamental matrix related to a pair of stereo cameras. The basic assumption of the method is that the data consists of "inliers", i.e., data whose distribution can be explained by some mathematical model, and "outliers" which are data that do not fit the model. Outliers are considered points which come from noise, erroneous measurements, or simply incorrect data.

For the problem of homography estimation, RANSAC works by fitting several candidate models, each from a small subset of the point pairs, and then checking how many of the remaining pairs each model relates correctly. The best model, the homography that produces the highest number of correct matches, is then chosen as the answer; thus, if the ratio of outliers to data points is low, RANSAC outputs a decent model fitting the data.
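The sample-and-score loop can be sketched as follows. To keep the example short, it swaps the homography for a much simpler model, a pure 2D translation between matched points, so that one correspondence is a minimal sample; the threshold, iteration count and synthetic data are assumptions made for the demonstration.

```python
import numpy as np

def ransac_translation(src, dst, iterations=100, threshold=1.0, seed=0):
    """Estimate a 2D translation t with dst ≈ src + t, robust to outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        i = rng.integers(len(src))          # minimal sample: one correspondence
        t = dst[i] - src[i]                 # candidate model from that sample
        residuals = np.linalg.norm(src + t - dst, axis=1)
        inliers = residuals < threshold     # consensus set for this candidate
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit the model on every inlier of the best candidate.
    best_t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return best_t, best_inliers

# Synthetic matches: 20 correct pairs with small noise, 10 gross outliers.
rng = np.random.default_rng(1)
true_t = np.array([12.0, -4.0])
src = rng.uniform(0, 100, size=(30, 2))
dst = src + true_t + rng.normal(0, 0.05, size=(30, 2))
dst[20:] = rng.uniform(0, 100, size=(10, 2))
t_est, inlier_mask = ransac_translation(src, dst)
```

For a homography the structure is unchanged: each iteration would draw four correspondences, fit a homography to them, and count the pairs whose reprojection error falls below the threshold.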

Calibration

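Image calibration aims to minimize differences between an ideal lens model and the camera-lens combination that was used, such as optical defects like distortions, exposure differences between images, vignetting,^[9] camera response and chromatic aberrations. If feature detection methods were used to register images and the absolute positions of the features were recorded and saved, stitching software may use the data for geometric optimization of the images in addition to placing the images on the panosphere. Panotools and its various derivative programs use this method.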

Alignment

Alignment may be necessary to transform an image to match the viewpoint of the image it is being composited with. Alignment, in simple terms, is a change of coordinate system so that the image adopts a new coordinate system matching the required viewpoint. The types of transformation an image may go through are pure translation, pure rotation, a similarity transform (which combines translation, rotation and scaling of the image being transformed), and an affine or projective transform.

Projective transformation is the most general of the two-dimensional planar transformations: the only visible features preserved in the transformed image are straight lines, whereas an affine transform also maintains parallelism.
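The difference between the affine and projective cases can be checked numerically. The sketch below applies one example matrix of each kind (the matrix values are arbitrary assumptions) to two parallel line segments and tests whether they are still parallel afterwards.

```python
import numpy as np

def transform(matrix, pts):
    """Apply a 3x3 planar transformation to Nx2 points via homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ matrix.T
    return out[:, :2] / out[:, 2:3]

def are_parallel(seg_a, seg_b, tol=1e-9):
    """True if the unit directions of two segments have a (near) zero cross product."""
    da = (seg_a[1] - seg_a[0]) / np.linalg.norm(seg_a[1] - seg_a[0])
    db = (seg_b[1] - seg_b[0]) / np.linalg.norm(seg_b[1] - seg_b[0])
    return abs(da[0] * db[1] - da[1] * db[0]) < tol

# The last row (0, 0, 1) makes the first matrix affine; a non-trivial last row makes it projective.
affine = np.array([[1.5, 0.4, 10.0], [0.2, 0.8, -5.0], [0.0, 0.0, 1.0]])
projective = np.array([[1.5, 0.4, 10.0], [0.2, 0.8, -5.0], [0.002, 0.001, 1.0]])

# Two parallel horizontal segments.
seg1 = np.array([[0.0, 0.0], [100.0, 0.0]])
seg2 = np.array([[0.0, 50.0], [100.0, 50.0]])

parallel_after_affine = are_parallel(transform(affine, seg1), transform(affine, seg2))
parallel_after_projective = are_parallel(transform(projective, seg1), transform(projective, seg2))
```

With these values the segments stay parallel under the affine matrix and converge under the projective one, while each segment remains a straight line in both cases.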

Projective transformation can be mathematically described as

$x' = H \cdot x$,

where x is a point in the old coordinate system, x’ is the corresponding point in the transformed image, and H is the homography matrix.

Expressing the points x and x’ using the camera intrinsics (K and K’) and the rotation and translation $[R t]$ relating them to the real-world coordinates X and X’, we get

$x = K \cdot [R t] \cdot X$

and

$x' = K' \cdot [R' t'] \cdot X'$.

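Using the above two equations and the homography relation between x’ and x, and assuming both views observe the same scene point (X = X’) with negligible translation between them, we can derive

$H = K' \cdot R' \cdot R^{-1} \cdot K^{-1}$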

The homography matrix H has 8 parameters or degrees of freedom. The homography can be computed using the direct linear transform (DLT) and singular value decomposition (SVD) with

$A \cdot h = 0$,

where A is the matrix constructed from the coordinates of the correspondences and h is the one-dimensional vector of the 9 elements of the reshaped homography matrix. To obtain h we can simply apply SVD, $A = U \cdot S \cdot V^T$, and take h as the column of V corresponding to the smallest singular value. This works because h lies in the null space of A. Since there are 8 degrees of freedom, the algorithm requires at least four point correspondences. When RANSAC is used to estimate the homography and multiple correspondences are available, the homography matrix chosen is the one with the maximum number of inliers.
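The DLT step can be sketched with NumPy's SVD. This is a minimal illustration rather than a production routine: it omits the coordinate normalization that is normally applied for numerical stability, and it fixes the scale by assuming the bottom-right entry of H is non-zero. The function names and the test matrix are hypothetical.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H . src from four or more point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of A.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    a = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(a)
    h = vt[-1]                         # right singular vector for the smallest singular value
    return (h / h[-1]).reshape(3, 3)   # remove the arbitrary scale (assumes H[2, 2] != 0)

def apply_homography(h, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Recover a known homography from exactly four correspondences.
h_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [0.001, 0.002, 1.0]])
src_pts = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
dst_pts = apply_homography(h_true, src_pts)
h_est = homography_dlt(src_pts, dst_pts)
```

Because H is only defined up to scale, the nine entries of h carry eight degrees of freedom, which is why four correspondences (two equations each) are enough.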

Compositing

Compositing is the process where the rectified images are aligned in such a way that they appear as a single shot of a scene. Compositing can be automatically done since the algorithm now knows which correspondences overlap.

Blending

Image blending involves executing the adjustments figured out in the calibration stage, combined with remapping of the images to an output projection. Colors are adjusted between images to compensate for exposure differences. If applicable, high dynamic range merging is done along with motion compensation and deghosting. Images are blended together and seam line adjustment is done to minimize the visibility of seams between images.

The seam can be reduced by a simple gain adjustment, which minimizes the intensity difference between overlapping pixels. The image blending algorithm allots more weight to pixels near the center of the image. Images that are both gain compensated and multi-band blended compare the best (Brown and Lowe, IJCV 2007).
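Both ideas, gain adjustment and centre-weighted blending, can be shown on a toy pair of images. The sketch below is a simplified pairwise version under stated assumptions: two grayscale strips of equal height that overlap by a known number of columns, a single least-squares gain for the second image, and linear weights that peak at each image's centre. It is not the joint gain optimization over all images described in the literature.

```python
import numpy as np

def gain_for_overlap(a_overlap, b_overlap):
    """Least-squares gain g minimizing sum((a - g * b)^2) over the overlapping pixels."""
    return float((a_overlap * b_overlap).sum() / (b_overlap ** 2).sum())

def center_weights(width):
    """1D weights that peak at the image centre and fall off linearly (never reaching zero)."""
    x = np.arange(width)
    half = (width - 1) / 2
    return 1.0 - np.abs(x - half) / (half + 1)

def blend_rows(left, right, overlap):
    """Weighted average of two equally tall images that share `overlap` columns."""
    h, wl = left.shape
    out_w = wl + right.shape[1] - overlap
    acc = np.zeros((h, out_w))
    wsum = np.zeros((h, out_w))
    for img, x0 in ((left, 0), (right, wl - overlap)):
        w = center_weights(img.shape[1])
        acc[:, x0:x0 + img.shape[1]] += img * w
        wsum[:, x0:x0 + img.shape[1]] += w
    return acc / wsum

# A smooth scene photographed twice; the second shot is underexposed by a factor of two.
scene = np.tile(np.linspace(0.2, 0.8, 30), (4, 1))
left = scene[:, :20]
right = 0.5 * scene[:, 10:]
overlap = 10
gain = gain_for_overlap(left[:, -overlap:], right[:, :overlap])
panorama = blend_rows(left, gain * right, overlap)
```

In this noise-free example the estimated gain is 2 and the blended result matches the scene; with real images the weighting mainly hides residual differences that the gain alone cannot remove.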

Straightening is another method to rectify the image. Matthew Brown and David G. Lowe in their paper ‘Automatic Panoramic Image Stitching using Invariant Features’ describe methods of straightening which apply a global rotation such that vector u is vertical (in the rendering frame) which effectively removes the wavy effect from output panoramas. This process is similar to image rectification, and more generally software correction of optical distortions in single photographs.

Even after gain compensation, some image edges are still visible due to a number of unmodelled effects, such as vignetting (intensity decreases towards the edge of the image), parallax effects due to unwanted motion of the optical centre, mis-registration errors due to mismodelling of the camera, radial distortion and so on. For these reasons, Brown and Lowe propose a blending strategy called multi-band blending.

Projective layouts

Comparison of Mercator and rectilinear projections

For image segments that have been taken from the same point in space, stitched images can be arranged using one of various map projections.

Rectilinear

Rectilinear projection, where the stitched image is viewed on a two-dimensional plane intersecting the panosphere in a single point. Lines that are straight in reality are shown as straight regardless of their directions on the image. Wide views – around 120° or so – start to exhibit severe distortion near the image borders. One case of rectilinear projection is the use of cube faces with cubic mapping for panorama viewing. The panorama is mapped to six squares, each cube face showing a 90 by 90 degree area of the panorama.
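The distortion at wide angles follows from the geometry of a flat image plane. As a rough sketch, assuming the standard pinhole relation x = f·tan(θ) for a ray at horizontal angle θ from the optical axis (and, for comparison, x = f·θ for a cylindrical image), the horizontal stretch relative to the image centre is 1/cos²(θ):

```python
import math

def rectilinear_x(theta_deg, focal=1.0):
    """Horizontal image position of a ray at angle theta in a rectilinear image."""
    return focal * math.tan(math.radians(theta_deg))

def cylindrical_x(theta_deg, focal=1.0):
    """Horizontal image position of the same ray in a cylindrical image."""
    return focal * math.radians(theta_deg)

def rectilinear_stretch(theta_deg):
    """Derivative of rectilinear_x with respect to theta, relative to its value at the centre."""
    return 1.0 / math.cos(math.radians(theta_deg)) ** 2

# Edge of a 120-degree-wide view is 60 degrees off-axis.
stretch_at_edge = rectilinear_stretch(60.0)
stretch_at_cube_face_edge = rectilinear_stretch(45.0)
```

At 60° off-axis the stretch is about four times that at the centre, while at the 45° edge of a 90° cube face it is only about two, which is one way to see why cube faces remain usable.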

Cylindrical

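Cylindrical projection, where the stitched image shows a 360° horizontal field of view and a limited vertical field of view. When viewed on a 2D plane, horizontal lines appear curved while vertical lines remain straight. There are various other cylindrical formats, such as Mercator and Miller cylindrical, which have less distortion near the poles of the panosphere.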

Spherical
2D plane of a 360° sphere panorama

Spherical projection or equirectangular projection – which is strictly speaking another cylindrical projection – where the stitched image shows a 360° horizontal by 180° vertical field of view i.e. the whole sphere. Panoramas in this projection are meant to be viewed as though the image is wrapped into a sphere and viewed from within. When viewed on a 2D plane, horizontal lines appear curved as in a cylindrical projection, while vertical lines remain vertical.^[10]
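The mapping from a viewing direction to a pixel in an equirectangular image can be sketched in a few lines. The axis convention (+z forward, +x right, +y up, longitude 0 at the image centre) and the function name are assumptions chosen for the example.

```python
import math

def direction_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction to (u, v) pixel coordinates in an equirectangular image."""
    lon = math.atan2(x, z)                  # longitude in [-pi, pi]
    lat = math.atan2(y, math.hypot(x, z))   # latitude in [-pi/2, pi/2]
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

width, height = 4000, 2000   # 360 by 180 degrees gives a 2:1 image
forward = direction_to_equirect(0, 0, 1, width, height)
right = direction_to_equirect(1, 0, 0, width, height)
up = direction_to_equirect(0, 1, 0, width, height)

# Sample a horizontal and a vertical straight line on the plane z = 1.
horizontal_line_v = [direction_to_equirect(x, 0.5, 1.0, width, height)[1] for x in (-1.0, 0.0, 1.0)]
vertical_line_u = [direction_to_equirect(0.5, y, 1.0, width, height)[0] for y in (-1.0, 0.0, 1.0)]
```

The sampled horizontal line changes its v coordinate along its length, so it appears curved, whereas every sample of the vertical line shares the same u coordinate and the line stays vertical.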

Panini

Since a panorama is basically a map of a sphere, various other mapping projections from cartography can also be used if so desired. Additionally, there are specialized projections which may be more aesthetically pleasing than normal cartographic projections, such as Hugin's Panini projection^[11] – named after Italian vedutismo painter Giovanni Paolo Panini^[12] – or PTGui's Vedutismo projection.^[13] Different projections may be combined in the same image to fine-tune the final look of the output image.^[14]

Stereographic

The conformality of the stereographic projection may produce a more visually pleasing result than an equal-area fisheye projection, as discussed in the article on the stereographic projection.

Artifacts

Artifacts due to parallax error

Artifacts due to subject movement

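The use of images not taken from the same place (on a pivot about the entrance pupil of the camera)^[15] can lead to parallax errors in the final product. When the captured scene features rapid movement or dynamic motion, artifacts may occur as a result of time differences between the image segments. "Blind stitching" through feature-based alignment methods (see autostitch), as opposed to manual selection and stitching, can cause imperfections in the assembly of the panorama.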

Software

Dedicated programs include Autostitch, Hugin, PTGui and Panorama Tools. Many other programs can also stitch multiple images; a popular example is Adobe Photoshop, which includes a tool known as Photomerge and, in the latest versions, the new Auto-Blend. Other programs such as VideoStitch make it possible to stitch videos, and Vahana VR enables real-time video stitching. The Image Stitching module for QuickPHOTO microscope software makes it possible to interactively stitch together multiple fields of view from a microscope using the camera's live view. It can also be used for manual stitching of whole microscopy samples.

See also

ActionShot panoramic photography

Anaglyph 3D

Derivative work

Digital image mosaic

Document mosaicing

Panography

Panoramic photography

VR photography (interactive panoramas)

References

[1] S2CID 16153752.

[2] ISBN 1-59593-429-4.

[3] Mann, Steve (May 9–14, 1993). Compositing Multiple Pictures of the Same Scene. Proceedings of the 46th Annual Imaging Science & Technology Conference.

[4] S. Mann, C. Manders, and J. Fung, "The Lightspace Change Constraint Equation (LCCE) with practical application to estimation of the projectivity+gain transformation between multiple pictures of the same subject matter". IEEE International Conference on Acoustics, Speech, and Signal Processing, 6–10 April 2003, pp. III-481-4, vol. 3.

[5] ISBN 978-0-7695-2877-9.

[6] doi:10.1049/joe.2015.0016.

[7] Szeliski, Richard (2005). "Image Alignment and Stitching" (PDF). Retrieved 2008-06-01.

[8] PMID 19547097.

[9] d'Angelo, Pablo (2007). "Radiometric alignment and vignetting calibration" (PDF).

[10] Wells, Sarah; Gross, Barry; Gross, Michael; Frischer, Bernard; Donavan, Brian; Johnson, Eugene; Martin, Worthy; Reilly, Lisa; Rourke, Will; Stuart, Ken; Tuite, Michael; Watson, Tom; Wassel, Madelyn (2007). "Panorama Creation (Part 1): Methods And Techniques for Capturing Images". IATH Best Practices Guide to Digital Panoramic Photography. Archived from the original on 2008-10-06. Retrieved 2008-06-01.

[11] "The General Panini Projection". PanoTools.org Wiki. 2019-08-21.

[12] German, Daniel M. (2008-12-29). "new panini projection". Google Groups.

[13] New House Internet Services BV. "Projections". PTGui.

[14] Lyons, Max. "PTAssembler Projections". TawbaWare. section "Hybrid Projection".

[15] Littlefield, Rik (2006-02-06). "Theory of the "No-Parallax" Point in Panorama Photography" (PDF). ver. 1.0. Retrieved 2008-06-01.
