
Univerza v Ljubljani

Fakulteta za računalništvo in informatiko

Peter Peer

Gradnja globinskih panoramskih slik

v realnem času z uporabo standardnih kamer

Doktorska disertacija

Mentor: prof. dr. Franc Solina

Ljubljana, 2003


University of Ljubljana

Faculty of Computer and Information Science

Peter Peer

Real Time Panoramic Depth Imaging Using Standard Cameras

Doctoral Dissertation

Supervisor: Prof. Dr. Franc Solina

Ljubljana, 2003


Real Time Panoramic Depth Imaging Using Standard Cameras

Peter Peer

Supervisor: Prof. Dr. Franc Solina

Abstract

Computer vision is a special kind of scientific challenge, as we are all users of our own vision systems. Our vision is definitely the source of the major part of the information we acquire and process each second. Stereo vision is perhaps an even greater challenge, since our own vision system is a stereo one and it performs a complex task, supplying us with 3D information about our surroundings in a very effective way.

Making machines see is a difficult problem. On one side we have the psychological aspects of human visual perception, which try to explain how visual information is processed in the human brain. On the other side we have technical solutions, which try to imitate human vision. Normally, it all starts with capturing digital images that store the basic information about the scene in a similar way to how humans see it.

But this information represents only the beginning of a difficult process. By itself it does not reveal to the machine the information about the objects on the scene, their color, distances etc. For humans, visual recognition is an easy task, but the processing methods of the human brain are still a mystery to us.

One part of human visual perception is estimating the distances to the objects on the scene. This information is also needed by robots if we want them to be completely autonomous.

In this dissertation we present a stereo panoramic depth imaging system.

The basic system is mosaic-based, which means that we use a single standard rotating camera and assemble the captured images into a multiperspective panoramic image. Due to the offset of the camera's optical center from the rotational center of the system we are able to capture the motion parallax effect, which enables the stereo reconstruction. The camera rotates on a circular path with the step defined by an angle equivalent to one pixel column of the captured image. To find the corresponding points on a stereo pair of panoramic images, the epipolar geometry needs to be determined. It can be shown that the epipolar geometry is very simple if we perform the reconstruction based on a symmetric pair of stereo panoramic images. We get a symmetric pair of stereo panoramic images when we take symmetric columns on the left and on the right side of the captured image center column.


This system, however, cannot generate a panoramic stereo pair in real time. That is why we have suggested a real time extension of the system, based on simultaneously using many standard cameras. We have not physically built the real time sensor, but we have performed simulations to establish the quality of its results.

Both systems have been comprehensively analysed and compared. The analyses revealed a number of interesting properties of the systems. Given the accuracy of the basic system, we can definitely use it for autonomous robot localization and navigation tasks. The assumptions made in the real time extension of the basic system have proved to be correct, but the accuracy of the new sensor generally deteriorates in comparison to the basic sensor.

Generally speaking, the dissertation can serve as a guide to the design of panoramic depth imaging sensors and related issues.

Key words

computer vision, stereo vision, reconstruction, depth image, multiperspective panoramic image, mosaicing, motion parallax effect, standard camera, real time, depth sensor


Contents

Abstract v

List of Figures x

List of Tables xii

1 Introduction 1

1.1 Description of the narrow scientific area . . . 2

1.2 Description of the problem . . . 3

1.3 Structure of the dissertation . . . 4

2 Basic System 5

2.1 Introduction . . . 6

2.1.1 Motivation . . . 6

2.1.2 Basics about the system . . . 6

2.1.3 Structure of the chapter . . . 7

2.2 Panoramic cameras . . . 7

2.3 Related work . . . 9

2.4 System geometry . . . 11

2.5 Epipolar geometry . . . 15

2.6 Stereo reconstruction . . . 17

2.7 Analysis of the system’s capabilities . . . 20

2.7.1 Time complexity of panoramic image creation . . . 20

2.7.2 Influence of parameters r, ϕ and θ0 on the reconstruction accuracy . . . 21

2.7.3 Constraining the search space on the epipolar line . . . 22

2.7.4 Meaning of the one-pixel error in estimation of the angle θ . . . 25

2.7.5 Definition of the maximal reliable depth value . . . 28

2.7.6 Contribution of the vertical reconstruction . . . 29

2.7.7 Influence of using different cameras . . . 30

2.8 Experimental results . . . 33


2.8.1 Influence of different ϕ values on the reconstruction accuracy — The quantitative evaluation . . . 36

2.8.2 Time analysis of the stereo reconstruction process . . . 37

2.8.3 Influence of different ϕ values on the reconstruction accuracy — The qualitative evaluation . . . 40

2.8.4 Influence of addressing the vertical reconstruction . . . 48

2.8.5 Influence of different θ0 values on the reconstruction accuracy . . . 49

2.8.6 Linear versus non-linear model for estimation of angle ϕ . . . 50

2.8.7 Repeatability of results — Different room . . . 51

2.8.8 Repeatability of results — Different cameras . . . 52

2.8.9 Possibility of systematic error presence in the estimation of r . . . 53

2.8.10 Influence of lens distortion presence on the reconstruction accuracy . . . 55

2.9 Summary . . . 59

3 Real Time Extension 60

3.1 Introduction . . . 61

3.1.1 Motivation . . . 61

3.1.2 Structure of the chapter . . . 62

3.2 Building panoramic images from wider stripes . . . 62

3.2.1 Property of using stripes . . . 63

3.3 Achieving real time . . . 72

3.4 Stereo reconstruction from stripes . . . 74

3.5 Epipolar constraint . . . 76

3.6 Experimental results . . . 77

3.6.1 Reconstruction from non-symmetric pairs of panoramas . . . 78

3.6.2 Reconstruction from stripe panoramas . . . 80

3.6.3 Reconstruction from stripe panoramas — Different room . . 82

3.6.4 Reconstruction from stripe panoramas — Different cameras . . . 84

3.7 Summary . . . 86

4 Conclusion 87

4.1 Dissertation summary . . . 88

4.2 Conclusions . . . 88

4.3 Contributions to science . . . 91

4.4 Future work . . . 92


Appendix 94

A Extended Abstract in Slovenian Language 94

A.1 Uvod . . . 95

A.1.1 Opis ožjega znanstvenega področja . . . 95

A.1.2 Opis problema . . . 95

A.1.3 Zgradba disertacije . . . 95

A.2 Osnovni sistem . . . 95

A.2.1 Uvod . . . 95

A.2.2 Sorodna dela . . . 96

A.2.3 Geometrija sistema . . . 97

A.2.4 Epipolarna geometrija . . . 97

A.2.5 Stereo rekonstrukcija . . . 97

A.2.6 Analiza zmogljivosti sistema . . . 98

A.2.7 Eksperimentalni rezultati . . . 100

A.2.8 Zaključek . . . 100

A.3 Delovanje v realnem času . . . 100

A.3.1 Uvod . . . 100

A.3.2 Gradnja panoramskih slik iz širših trakov . . . 102

A.3.3 Doseganje realnega časa . . . 102

A.3.4 Stereo rekonstrukcija iz trakov . . . 103

A.3.5 Epipolarna omejitev . . . 104

A.3.6 Eksperimentalni rezultati . . . 104

A.3.7 Zaključek . . . 104

A.4 Sklep . . . 105

A.4.1 Prispevki k znanosti . . . 105

Bibliography 107

Acknowledgement 111

Statement 112

List of Figures

2.1 Hardware part of our system. . . 7

2.2 Geometry of our system. . . 12

2.3 The viewing cylinder. . . 13

2.4 Two symmetric pairs of panoramic images. . . 14

2.5 The vertical reconstruction. . . 18

2.6 The relation for determining the radius r. . . 21

2.7 We can effectively constrain the search space on the epipolar line. . . 23

2.8 Constraining the search space on the epipolar line in case of 2ϕ = 29.9625°. . . 24

2.9 The dependence of depth l on the angle θ. . . 25

2.10 The number of possible depth estimates is proportional to the angle ϕ. . . 26

2.11 The contribution of the vertical reconstruction. . . 30

2.12 Different cameras characterized by the horizontal view angle α give panoramic images with different horizontal resolution Wpan. . . 31

2.13 The plan of the reconstructed room. . . 40

2.14 Some stereo reconstruction results. . . 41

2.15 A ground-plan of the reconstructed scene (#1.1). . . 43

2.16 A ground-plan of the reconstructed scene (#1.2). . . 44

2.17 A ground-plan of the reconstructed scene (#2.1). . . 45

2.18 A ground-plan of the reconstructed scene (#2.2). . . 46

2.19 The influence of the lens distortion. . . 55

2.20 The camera model gained after the calibration process. . . 56

3.1 Panoramic images gained using different stripe widths. . . 63

3.2 Property of using stripes: not all the scene points are captured. . . . 64

3.3 The formation of the panoramic image from 14-pixel-column stripes with respect to the light rays. . . 66

3.4 The detail of the drawing presented in Fig. 3.3. . . 67

3.5 The formation of the panoramic image from one-pixel-column stripes with respect to the light rays. . . 68

3.6 The formation of the panoramic image from 14-pixel-column stripes taken from the center of the captured image. . . 69


3.7 The formation of the panoramic image from 14-pixel-column stripes at decreased r. . . 70

3.8 The formation of the panoramic image from 14-pixel-column stripes without the motion parallax effect. . . 71

3.9 The drawing of a real time sensor. . . 73

3.10 The geometric relations for stereo reconstruction from stripe panoramas. . . 75

3.11 The linear relation between AVG% and Ws. . . 81

3.12 The influence of the lens distortion when the panoramas are built from stripes. . . 85

A.1 Tloris geometrije sistema in strojni deli sistema. . . 96

A.2 Lastnost uporabe trakov: določene točke scene niso zajete. . . 102

A.3 Geometrijske relacije za stereo rekonstrukcijo iz trakov. . . 103

List of Tables

2.1 Comparison of different types of panoramic cameras. . . 8

2.2 The one-pixel error ∆l in estimation of the angle θ. . . 27

2.3 The one-pixel error ∆l in estimation of the angle θ for the minimal and maximal possible depth estimation. . . 29

2.4 The one-pixel error ∆l in estimation of the angle θ for different cameras (α). . . 32

2.5 The comparison of results for two different values of ϕ. . . 36

2.6 The comparison of the stereo reconstruction times. . . 37

2.7 The comparison of results without and with addressing the vertical reconstruction. . . 48

2.8 The comparison of results for two different values of θ0. . . 49

2.9 The comparison of results for two different values of ϕ — Linear versus non-linear model for estimation of angle ϕ. . . 50

2.10 Repeatability of results — Different room. . . 51

2.11 Repeatability of results — Different cameras. . . 52

2.12 The comparison of results before and after the optimization of parameter r. . . 54

2.13 The comparison of results obtained without and with the lens distortion correction. . . 58

3.1 The results obtained by processing symmetric and non-symmetric pairs of panoramas. . . 79

3.2 The results obtained with four different widths of the stripes (Ws). . . 80

3.3 Reconstruction from stripe panoramas — Different room. . . 82

3.4 Reconstruction from stripe panoramas — Different cameras. . . 84

A.1 Ilustracija napake (∆l) za en slikovni element pri oceni kota θ. . . 99

A.2 Rezultati eksperimentov (#1). . . 101

A.3 Rezultati eksperimentov (#2). . . 105


To my family


Chapter 1

Introduction


1.1 Description of the narrow scientific area

Over the last 30 years, one of the most interesting areas of research has been building machines that would complement human life with the help of artificial intelligence. This area is full of different challenges, and one among them is to imitate human vision.

By analogy, this discipline is called computer vision. The basic idea is to discover the properties of the 3D world by using only 2D information from an image. A lot of effort has been put into this area of research, which eventually led to progress in areas such as object recognition, picture understanding and 3D reconstruction.

3D reconstruction is an important area, since it enables tasks like modeling, visualization, CAD (Computer Aided Design) model construction, localization, navigation etc. Generally speaking, the 3D shape of objects and environments can be captured in three different ways: using a CMM (Coordinate Measuring Machine), using the Time-of-Flight method, or with the help of optical devices. The latter approach is the most widely used, partly because of favorable price and safety conditions. On the other hand, such a system is the only real computer vision system for 3D reconstruction, since it is based merely on input images. With the help of optical scanners (range finders) we gather 3D data about the object surface by processing 2D images captured with standard cameras. Optical scanners can be divided into two main groups.

Active scanners project light onto the object, which assures effective and reliable 3D information. In many cases active scanners use structured light for reconstruction purposes, i.e. a light pattern is projected onto the object. A disadvantage of such a scanner is that the images have to be taken under strict laboratory conditions, like scanning in complete darkness. On the other side we have passive scanners, which estimate the distance to the object based only on the textural information present in the images, captured completely without contact (physical contact, contact of a laser beam or structured light). Traditional approaches for acquiring depth from such images are based on stereo methods. Under the term stereo reconstruction we understand the generation of depth images from two or more captured images. In this case the reconstruction suffers if the reconstructed object is not well textured. The result of the reconstruction is a depth image. Each depth image stores the estimates of the distances to the objects from one viewpoint.

Currently, we are living in an era of vision research when some shape-from-X problems, for example stereo, have been almost completely solved, and furthermore are being used in industry. Other shape-from-X problems in their original formulations, like shape from motion, have proved to be very difficult, therefore some special cases are being tackled. The remaining shape-from-X problems, like shape from shading and shape from texture, have become less interesting and, what is even more important, less applicable [50].

We wish the input images to have the property that the same points and lines are visible in all images of the scene, which facilitates stereo reconstruction. This is the property of panoramic cameras. Standard cameras have a limited field of view, which is usually smaller than the human field of view. Because of that, people have always tried to generate images with a wider field of view, up to full 360 degree panoramas.

As presented in the next chapter, one way to build panoramic images is by taking one column out of each captured image and mosaicing the columns together. Such panoramic images are called multiperspective panoramic images. The crucial property of two or more multiperspective panoramic images is that they capture the information about the motion parallax effect, since the columns forming the panoramic images are captured from different perspectives.

The main problem we would like to solve in this dissertation is to analyze and determine the properties and the efficiency of a panoramic depth imaging system based on multiperspective panoramic images, and to see whether the results can be used for robot localization and navigation. Only standard equipment is applied in the system construction process. A real time extension of the basic system is simulated to determine the efficiency of the new system in comparison to the basic one.

When we talk about real time in this dissertation, we do not mean it so much from the processing power point of view (to increase the speed or reduce the time), but more from the accuracy point of view. Namely, nowadays there are many practical solutions for increasing processing power, but before we invest in a real system, we have to know whether the accuracy of the system is satisfactory. Nevertheless, the processing power is also briefly addressed in the dissertation.

1.2 Description of the problem

For effective depth reconstruction we need high resolution images. As described in the next chapter, only mosaic-based procedures give high resolution results. Thus these procedures represent a good starting point for the development of our system.

First of all, we are interested in where the efficiency borders of a stereo panoramic depth imaging system based on multiperspective panoramic images lie.

The basic system consists of only one standard camera, which is offset from the system's rotational center. It is rotated around the rotational center in angular steps corresponding to one vertical pixel column of the captured standard image. In this way the best possible accuracy of the depth reconstruction process is achieved, and consequently the results of this system serve as a ground truth for subsequent comparisons.

Therefore the focus of the first part of the dissertation (Chapter 2) is on the exposed research issues. In it we also prove that we can effectively constrain the search space on the epipolar line, that the confidence in the estimated depth is variable and that the system can be used for depth reconstruction of small rooms (having in mind an application to autonomous mobile robot navigation). The relationship between different system parameters is also presented.


The disadvantage of the mosaicing procedures lies in the time needed to capture many images. Therefore we suggest a real time extension of the system based on using many standard cameras simultaneously. Only real time execution ensures the possibility to reconstruct dynamic scenes, which is in many cases of great importance for autonomous mobile robot navigation.

The second part of the dissertation (Chapter 3) explains this solution in detail. It reveals a new panoramic depth sensor and its properties, analyzes the number of standard cameras needed to ensure a good compromise between the speed and the accuracy of the new sensor (the panoramic images are now built from wider stripes and not from only one column of the captured image), and compares the new results to the results of the basic sensor, described in the first part of the dissertation.

As mentioned, the suggested real time panoramic depth sensor can only be built if we use an adequate number of cameras simultaneously. We geometrically prove that such a sensor can be built out of standard cameras, which are available on the market. We have not physically built it, but we have performed simulations to establish the quality of its results.

The in-depth analysis of such a mosaicing approach should reveal whether the system could be used for real time panoramic depth imaging and consequently for autonomous mobile robot navigation.

Since our final goal is to determine the usability of our system for mobile robot navigation, we perform all the tests on real world images, so that the results reflect the applicability of the implemented algorithms in the real world.

1.3 Structure of the dissertation

A basic introduction to the field of computer vision, related to the title of the dissertation, and the problem statement are given in this chapter. In the next chapter (part one of the dissertation) the basic system, based on a camera mounted on a rotational arm so that the optical center of the camera is offset from the vertical axis of rotation, is introduced and evaluated. The descriptions of different panoramic cameras and of the related work are also part of that chapter. In Chapter 3 (the second part of the dissertation) we suggest a real time extension of the basic system, reveal its properties and evaluate its effectiveness, also in comparison to the basic system. The summary of the dissertation is given in the last chapter, along with the conclusions, the contributions to science and the ideas for future work. We end the dissertation with the extended summary in the Slovenian language, which is given in the appendix.


Chapter 2

Basic System


2.1 Introduction

2.1.1 Motivation

Standard cameras have a limited field of view, which is usually smaller than the human field of view. Because of that, people have always tried to generate images with a wider field of view, up to a full 360 degree panorama [16].

Under the term stereo reconstruction we understand the generation of depth images from two or more captured images. A depth image is an image that stores distances to points on the scene. The stereo reconstruction procedure is based on relations between points and lines on the scene and in images of the scene. If we want to get a linear solution of the reconstruction procedure, then the images can interact with the procedure in pairs, triplets or quadruplets, and the relations are named according to the number of images as the epipolar constraint, the trifocal constraint or the quadrifocal constraint [22]. We want the images to have the property that the same points and lines are visible in all images of the scene, which facilitates stereo reconstruction. This is the property of panoramic cameras and it presents our fundamental motivation. The stereo reconstruction in this dissertation is done from two symmetric multiperspective panoramic images.

In this work we address only the issue of how to enlarge the horizontal field of view of images. The vertical field of view of panoramic images can be enlarged by using wide angle camera lenses [44], by using mirrors [25,32] or by moving the camera in the vertical direction as well as in the horizontal direction [16].

If we tried to build two panoramic images simultaneously by using two standard cameras mounted on two rotational robotic arms, we would have problems with non-static scenes. Clearly, one camera would capture the motion of the other camera. So we have decided to use only one camera. In the first part of our work we develop a mosaic-based panoramic depth imaging system using only one standard camera and analyze its performance to see if it can be used for robot localization and navigation in a room.

2.1.2 Basics about the system

In Fig. 2.1 the hardware part of our system can be seen: a color camera is mounted on a rotational robotic arm so that the optical center of the camera is offset from the vertical axis of rotation. The camera is looking outward from the system's rotational center. Panoramic images are generated by repeatedly shifting the rotational arm by an angle which corresponds to a single pixel column of the captured image. By assembling the center columns of these images, we get a mosaic panoramic image. One of the drawbacks of mosaic-based panoramic imaging is that dynamic scenes are not well captured.

Figure 2.1: Hardware part of our system.

It can be shown that the epipolar geometry is very simple if we perform the reconstruction based on a symmetric pair of stereo panoramic images. We get a symmetric pair of stereo panoramic images when we take symmetric columns on the left and on the right hand side of the captured image center column. These columns are assembled into a mosaic stereo pair. The column from the left hand side of the captured image is mosaiced into the right eye panoramic image and the column from the right hand side of the captured image is mosaiced into the left eye panoramic image.
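
To make the mosaicing step concrete, the following minimal Python sketch (our own illustration, not code from the dissertation) assembles a symmetric stereo pair from the sequence of captured images:

    import numpy as np

    def build_symmetric_pair(images, offset):
        # images: one H x W (or H x W x 3) array per rotational step of the arm
        # offset: distance in columns from the image center column; the two
        #         columns at center +/- offset define the angle 2*phi
        center = images[0].shape[1] // 2
        # right-hand column -> left eye panorama, left-hand column -> right eye
        left_eye = np.stack([im[:, center + offset] for im in images], axis=1)
        right_eye = np.stack([im[:, center - offset] for im in images], axis=1)
        return left_eye, right_eye

Each rotational step thus contributes exactly one pixel column to each panorama, so a full panorama requires roughly as many captured images as the panorama is pixels wide.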

2.1.3 Structure of the chapter

In the next section we compare different panoramic cameras with emphasis on mosaicing. In Sec. 2.3 we give an overview of the related work and briefly present the contribution of our work to the discussed subject. Sec. 2.4 describes the geometry of our system, Sec. 2.5 is devoted to the epipolar geometry and Sec. 2.6 describes the procedure of stereo reconstruction. The focus of this chapter is on the analysis of the system's capabilities, given in Sec. 2.7. In Sec. 2.8 we present experimental results. At the very end of this chapter we summarize the main conclusions of the first part of the dissertation.

2.2 Panoramic cameras

Every panoramic camera belongs to one of three main groups of panoramic cameras: catadioptric cameras, dioptric cameras and cameras with moving parts. The basic property of a catadioptric camera is that it consists of a mirror (or mirrors [18]) and a camera. The camera captures the image which is reflected from the mirror. A dioptric camera uses a special type of lens, e.g. a fish-eye lens, which increases the size of the camera's field of view. A panoramic image can also be generated by moving the camera along some path and mosaicing together the images captured at different locations on the path.

Type of panoramic camera | Number of images | Resolution of panoramic images | Real time | References
catadioptric camera      | 1                | low                            | yes       | [15,18,25,28,29,33,52]
dioptric camera          | 1                | low                            | yes       | [3,7]
moving parts             | a lot            | high                           | no        | [1,8,9,10,12,13,14,16,17,19,20,21,23,25,26,27,32,33,35,36,39,43,44]

Table 2.1: Comparison of different types of panoramic cameras with respect to the number of standard images needed to build a panoramic image, the resolution of the panoramic images and the capability of building a panoramic image in real time.

The comparison of different types of panoramic cameras is shown in Tab. 2.1.

All types of panoramic cameras enable 3D reconstruction. A camera has a single viewpoint or projection center if all light rays forming the image intersect in a single point. Cameras with this property are also called central cameras. Rays forming a non-central image do not pass through a single point, but rather intersect a line [10] or a conic [25,39,40,49], do not intersect at all [46], or are bound by other constraints suiting practical or theoretical demands [13,17].

Mosaic-based procedures can be marked as non-central (we do not deal with a single center of projection); they do not execute in real time, but they give high resolution results. High resolution images enable effective depth reconstruction, since by increasing the resolution the number of possible depth estimates also increases. Thus mosaicing is not appropriate for capturing dynamic scenes and consequently not for the reconstruction of dynamic scenes. The systems described in [1,16] are exceptions, because the light rays forming the mosaic panoramic image intersect in the rotational center of the system; these two systems are central systems. The system presented in [30,41,42] could also be treated as a mosaic-based procedure, though its concept for generating panoramic depth images is very different from ours. Because that system is more related to the topic of the second part of the dissertation (Chapter 3), it is presented in Sec. 3.1.1.

Dioptric panoramic cameras with wide angle lenses can be marked as non-central [29]; they build a panoramic image in real time and they give low resolution results. Cameras with wide angle lenses are appropriate for fast capturing of panoramic images and processing of the captured images, e.g. for detection of obstacles or for localization of a mobile robot, but are less appropriate for reconstruction. Please note that we are talking about panoramic cameras here. Generally speaking, dioptric cameras can be central.

Only some of the catadioptric cameras have a single viewpoint. Cameras with a mirror (or mirrors) work in real time and give low resolution results. Only two mirror shapes, namely hyperbolic and parabolic mirrors, can be used to construct a central catadioptric panoramic camera [29,52]. Such panoramic cameras are appropriate for low resolution reconstruction of dynamic scenes and for motion estimation. It is also true that only for panoramic systems with hyperbolic and parabolic mirrors can the epipolar geometry be simply generalized [29,52].

Since dioptric and catadioptric cameras give low resolution results, they are more appropriate for use with view-based systems [59] and less for use with reconstruction systems.

Of course, combinations of different cameras exist: e.g. a combination of the mosaicing camera and the catadioptric camera [25,32] or a combination of the mosaicing camera and the wide angle camera [44]. Their main purpose is to enlarge the camera's vertical field of view.

2.3 Related work

We can generate panoramic images either with the help of special panoramic cameras or with the help of a standard camera and mosaicing of standard images into panoramic images. If we want to generate mosaic 360 degree panoramic images, we have to move the camera on a closed path, which is in most cases a circle.

One of the best known commercial packages for creating mosaic panoramic images is QTVR (QuickTime Virtual Reality). It works on the principle of sewing together a number of standard images captured while rotating the camera [8]. Peleg et al. [27] introduced a method for the creation of mosaiced panoramic images from standard images captured with a handheld video camera. A similar method was suggested by Szeliski and Shum [12], which also does not strictly constrain the camera path but assumes that a great motion parallax effect is not present. All methods mentioned so far are used only for visualization purposes, since the authors did not try to reconstruct the scene.

The crossed-slits (X-slits) projection [53,56,61] uses a similar mosaicing technique with one important difference: the mosaiced strips are sampled from varying positions in the captured images. This makes the generation of virtual walkthroughs possible, i.e. we are again dealing with visualization with the help of image-based rendering or new view synthesis.

Ishiguro et al. [1] suggested a method which enables scene reconstruction. They used a standard camera rotating on a circular path. The scene is reconstructed by mosaicing a panoramic image together from the central columns of the captured images and moving the system to another location, where the task of mosaicing is repeated. The two created panoramic images are then used as the input to a stereo reconstruction procedure. The depth of an object was first estimated using projections in two images captured at different locations of the camera on the camera path. But since their primary goal was to create a global map of the room, they preferred to move the system, attached to the robot, about the room. Clearly, by moving the robot to another location and producing the second panoramic image of a stereo pair in this location, rather than producing a stereo pair in a single location, they enlarged the disparity of the system. But this decision also has a few drawbacks: we cannot estimate the depth for all points on the scene, the time of capturing a stereo pair is longer and we have to search for the corresponding points on sinusoidal epipolar curves. The depth was then estimated from two panoramic images taken at two different locations of the robot in the room.

Peleg and Ben-Ezra [19,26] introduced a method for the creation of stereo panoramic images without actually computing the 3D structure — the depth effect is created in the viewer's brain.

In [20], Shum and Szeliski described two methods used for the creation of panoramic depth images, which use standard procedures for stereo reconstruction. Both methods are based on moving the camera on a circular path. Panoramic images are built by taking one column out of each captured image and mosaicing the columns together. The authors call such panoramic images multiperspective panoramic images. The crucial property of two or more multiperspective panoramic images is that they capture the information about the motion parallax effect, since the columns forming the panoramic images are captured from different perspectives. The authors use such panoramic images as the input to a stereo reconstruction procedure. In [21], Shum et al. proposed a non-central camera called an omnivergent sensor in order to reconstruct scenes with minimal reconstruction error. This sensor is equivalent to the sensor presented in this chapter.

However, multiperspective panoramic images are not something new to the vision community [20]: they are a special case of the multiperspective panoramic images for cel animation [13], a special case of the crossed-slits (X-slits) projection [53,56,61], and they are very similar to the images generated by the multiple-center-of-projection procedure [17], by the manifold projection procedure [27] and by the circular projection procedure [19,26]. The principle of constructing multiperspective panoramic images is also very similar to the linear pushbroom camera principle for creating panoramic images [10].

The papers closest to our work [1,20,21] seem to lack two things: a comprehensive analysis of 1) the system's capabilities and 2) the corresponding points search using the epipolar constraint. Therefore, the focus of this chapter is on these two issues. While in [1] the authors searched for corresponding points by tracking the feature from the column building the first panorama to the column building the second panorama, the authors in [20] used an upgraded plane sweep stereo procedure. A key idea behind the approach in [21] is that it enables optimizing the input to traditional computer vision algorithms for searching the correspondences in order to produce superior results.

Further details about the related work are given in the following sections, where we discuss the specifics of our system.

2.4 System geometry

Let us begin this section with a description of how the stereo panoramic pair is generated. From the captured images on the camera's circular path we always take only two columns, which are equally distant from the middle column. We assume that the middle column we are referring to in this work is the middle column of the captured image, if not mentioned otherwise. The column on the right hand side of the captured image is then mosaiced into the left eye panoramic image and the column on the left hand side of the captured image is mosaiced into the right eye panoramic image. So, we are building each panoramic image from just a single pixel column of each captured image. Thus, we get a symmetric pair of stereo panoramic images, which yields a reconstruction with optimal characteristics (simple epipolar geometry and minimal reconstruction error) [21].

The geometry of our system for creating multiperspective panoramic images is shown in Fig. 2.2. The panoramic images are then used as the input to create panoramic depth images. Point C denotes the system's rotational center, around which the camera is rotated. The offset of the camera's optical center from the rotational center C is denoted by r, describing the radius of the circular path of the camera. The camera is looking outward from the rotational center. The optical center of the camera is marked with O. The column of pixels that is sewn into the panoramic image contains the projection of point P on the scene. The distance from point P to point C is the depth l, while the distance from point P to point O is denoted by d. Further, θ is the angle between the line defined by points C and O and the line defined by points C and P. In the panoramic image the horizontal axis represents the path of the camera. The axis is spanned by µ and defined by point C, a starting point O0, where we start capturing the panoramic image, and the current point O. ϕ denotes the angle between the line defined by point O and the middle column of pixels of the image captured by the physical camera looking outward from the rotational center (the latter column contains the projection of the point Q), and the line defined by point O and the column of pixels that will be mosaiced into the panoramic image (the latter column contains the projection of the point P). The angle ϕ can be thought of as a reduction of the camera's horizontal view angle α.

The geometry of capturing multiperspective panoramic images can be described with a pair of parameters (r, ϕ). By increasing (decreasing) each of them, we increase (decrease) the baseline (2r0 [39], r0 = r · sin ϕ, Fig. 2.2) of our stereo system.

Wei et al. [43] proposed an approach to solving the parameter (r, ϕ) determination problem for a symmetric stereo panoramic camera.



Figure 2.2: Geometry of our system for constructing multiperspective panoramic images. Note that a ground-plan is presented. The optical axis of the camera is kept horizontal.

The image acquisition parameters (r, ϕ) are calculated based on (subjectively) given parameters: the nearest and the furthest distances of the region of interest, the height of the region of interest and the width of the angular disparity interval. They conclude that neither the parameter r nor ϕ can satisfactorily match the application requirements on its own, and they report that a general study of the relations among the parameters is in progress, as they have discovered certain exceptions in experiments that require further research.

The system in Fig. 2.2 is obviously non-central, since the light rays forming the panoramic image do not intersect in one point called the viewpoint, but instead are tangent (for ϕ ≠ 0) to a cylinder with radius r0, called the viewing cylinder (Fig. 2.3). Thus, we are dealing with panoramic images formed by a projection from a number of viewpoints. This means that a captured point on the scene is seen in the panoramic image from one viewpoint only. This is why the panoramic images captured in this way are called multiperspective panoramic images.

For stereo reconstruction we need two images. If we look at only one circle on the viewing cylinder (Fig. 2.2), then we can conclude that our system is equivalent to a system with two cameras.


Figure 2.3: All the light rays forming the panoramic image are tangent to the viewing cylinder.

In our case, two virtual cameras are rotating on a circular path, i.e. the viewing circle, with radius r0. The optical axis of a virtual camera is always tangent to the viewing circle. The panoramic image is generated from only one pixel from the middle column of each image captured by a virtual camera. This pixel is determined by the light ray which describes the projection of a scene point onto the physical camera image plane. If we observe a point P on the scene, we see that the two virtual cameras which see this point form a traditional stereo system of converging cameras.

Obviously, a symmetric pair of panoramic images used in the stereo reconstruction process could also be captured with a bunch of cameras rotating on a circular path with radius r0, where the optical axis of each camera is tangent to the circular path (Fig. 2.3).

Two images differing in the angle of rotation of the physical camera setup (for example, the two image planes marked in Fig. 2.2) are used to simulate a bunch of virtual cameras on the viewing cylinder. Each column of the panoramic image is obtained from a different position of the physical camera on the circular path. In Fig. 2.4 we present two symmetric pairs of panoramic images.

To automatically register captured images directly from the knowledge of the camera's viewing direction, the camera lens' horizontal view angle α and vertical view angle β are required. If we know this information, we can calculate the resolution of one angular degree, i.e. we can calculate how many columns and rows fall within an angle of one degree. The horizontal view angle is especially important in our case, since we move the rotational arm only around its vertical axis.

2ϕ = 29.9625°

2ϕ = 3.6125°

Figure 2.4: Two symmetric pairs of panoramic images generated using different values of the angle ϕ. In Sec. 2.7.1 we explain where these values of the angle ϕ come from. Each symmetric pair of panoramic images comprises the motion parallax effect. This fact enables the stereo reconstruction.

To calculate these two parameters, we use an algorithm described in [16]. It is designed to work with cameras whose zoom settings and other internal camera parameters are unknown. The algorithm is based on the mechanical accuracy of the rotational arm. The basic step of our rotational arm corresponds to an angle of 0.0514285°. In general, this means that if we tried to turn the rotational arm through 360 degrees, we would perform 7000 steps. Unfortunately, the rotational arm that we use cannot turn a full 360 degrees around its vertical axis. The basic idea of the algorithm is to calculate the translation dx (in pixels) between two images captured while the camera is rotated by a known angle dγ in the horizontal direction. Since we know the exact angle by which we move the camera, we can calculate the horizontal view angle of the camera:

α = (W / dx) · dγ , (2.1)

where W is the width of the captured image in pixels and dγ is the known rotation angle.

The major drawback of this method is that it relies on the accuracy of the rotational arm. Because of that, we rechecked the values of the view angles by calibrating the camera using a static camera and a checkerboard pattern [11,31,54]. The input into the calibration procedure is a set of images with a varying position of the pattern in each image. The results obtained were very similar, though the second method should be more reliable, as it reveals more information about the camera model and also uses a sub-pixel accuracy procedure. The latter calibration estimates the focal length, the principal point, the skew coefficient and distortions, to name just the most important parameters for us. It also reveals the errors of all estimated parameters. If we assume that the principal point is in the middle of the captured image, we can calculate the horizontal view angle of the camera from the estimated parameters:

α = 2 arctan( (W/2) / f ) , (2.2)

where f is the estimated focal length.
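
As a numeric cross-check of the two relations, a minimal sketch (our own illustration; angles in degrees, focal length in pixels):

    import math

    def alpha_from_rotation(W, dx, d_gamma):
        # Eq. (2.1): dx is the pixel translation observed for a rotation d_gamma
        return (W / dx) * d_gamma

    def alpha_from_calibration(W, f):
        # Eq. (2.2): f is the calibrated focal length; the principal point is
        # assumed to be in the middle of the captured image
        return 2.0 * math.degrees(math.atan((W / 2.0) / f))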

Distortion parameters are also important, because we also investigate the influence of distortion on the system's results.

In any case, now that we know the value of α, we can calculate the resolution of one angular degree, x0:

x0 = W / α .

This equation enables us to calculate the width of the stripe Ws that will be mosaiced into the panoramic image when the rotational arm moves by an angle θ0:

Ws = x0 · θ0 . (2.3)

From the above equation we can also calculate the angle by which the rotational arm has to move if the stripe is only one pixel column wide.
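
For the first of the cameras listed below (α = 34°, W = 160 pixels) these relations give, as a worked numeric check (our own illustration, no new measurements):

    W, alpha = 160, 34.0            # image width [pixels], horizontal view angle [deg]
    x0 = W / alpha                  # ~4.71 pixel columns per angular degree
    theta0 = alpha / W              # 0.2125 deg per one-pixel column (inverse of Eq. (2.3))
    Ws = x0 * 0.205714              # ~0.97 columns for the arm's actual step angle
    two_phi = (alpha / W) * 141     # Eq. (2.6): 29.9625 deg for columns 141 pixels apart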

We used three different cameras in the experiments:

a camera with the horizontal view angle α = 34° and the vertical view angle β = 25°,

a camera with the horizontal view angle α = 39.72° and the vertical view angle β = 30.54°,

a camera with the horizontal view angle α = 16.53° and the vertical view angle β = 12.55°.

In the process of the panoramic image construction we did not vary these two parameters. From here on, the first camera is used in the calculations and the experiments, if not stated differently.

2.5 Epipolar geometry

Searching for the corresponding points in two images is a difficult problem. Generally speaking, the corresponding point can be anywhere in the second image. That is why we would like to constrain the search space as much as possible. Using the epipolar constraint we reduce the search space from 2D to 1D, i.e. to an epipolar line [4]. In Sec. 2.7.3 we prove that in our system we can effectively reduce the search space even on the epipolar line.

In this section we only illustrate the procedure of the proof that the epipolar lines of the symmetric pair of panoramic images are image rows. This statement is true for our system geometry. For the proof see [20,23,35,51].

The proof in [23] is based on the radius r0 of the viewing cylinder (Figs. 2.2 and 2.3). We can express r0 in terms of the known parameters r and ϕ as:

r0 = r · sin ϕ .

We carry out the proof in three steps: first, we have to derive the projection equation for the line camera, then we have to write the projection equation for a multiperspective panoramic image and, in the final step, we prove the property of the epipolar lines for the case of a symmetric pair of panoramic images. In the first step, we are interested in how a point on the scene is projected onto the camera's image plane [4], which in our case is only one pixel column wide, since we are dealing with a line camera. In the second step, we have to write the relations between the different notations of a point on the scene and of the projection of this point on the panoramic image: the notation of the scene point in Euclidean coordinates of the world coordinate system and in cylindrical coordinates of the world coordinate system, and the notation of the projected point in angular coordinates of the (2D) panoramic image coordinate system and in pixel coordinates of the (2D) panoramic image coordinate system.

When we know the relations between the above-mentioned coordinate systems, we can write the equation for the projection of scene points onto the cylindrical image plane of the panorama. Based on the properties of the angular coordinates of the panoramic image coordinate system, we can in the third step show that the epipolar lines of the symmetric pair of panoramic images are actually rows of the panoramic images. The basic idea for the last step of the proof is as follows: if we are given an image point in one panoramic image, we can express the optical ray defined by this point and the optical center of the camera in the 3D world coordinate system. If we project this optical ray, described in the world coordinate system, onto the second panoramic image, we get the epipolar line corresponding to the given image point in the first panoramic image. After introducing the proper relations valid for the symmetric case into the obtained equation, our hypothesis is confirmed.

The same result can be found in [20], where the authors proved the property of the symmetric pair of panoramic images by directly investigating the presence of the vertical motion parallax effect in panoramic images captured from the same rotational center. The generalization to the non-symmetric case, for the camera looking inward and outward, can be found in [51]. An even more general case, in some respects, where the panoramic images can be captured from different rotational centers, is discussed in [35].

It was shown that the notion of the epipolar geometry, well known for both central perspective cameras [4,22,34] and central catadioptric cameras [28,29,52], can be generalized to some non-central cameras [37,40,46,49]. The epipolar surfaces then extend from planes to double-ruled quadrics: planes, rotational hyperboloids and hyperbolic paraboloids.

2.6 Stereo reconstruction

Let us go back to Fig. 2.2. Using the trigonometric relations evident from the sketch, we can write the equation for the depth estimate l of a point P on the scene. By the basic law of sines for triangles, we have:

r / sin(ϕ − θ) = d / sin θ = l / sin(180° − ϕ) . (2.4)

From this equation we can express the depth estimate l as:

l = r · sin(180° − ϕ) / sin(ϕ − θ) = r · sin ϕ / sin(ϕ − θ) . (2.5)
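
Spelled out in LaTeX, the step from Eq. (2.4) to Eq. (2.5) is a single rearrangement (our own explicit restatement):

    \[
      \frac{r}{\sin(\varphi-\theta)} \;=\; \frac{l}{\sin(180^\circ-\varphi)}
      \quad\Longrightarrow\quad
      l \;=\; r\,\frac{\sin(180^\circ-\varphi)}{\sin(\varphi-\theta)}
        \;=\; r\,\frac{\sin\varphi}{\sin(\varphi-\theta)},
    \]
    since $\sin(180^\circ-\varphi)=\sin\varphi$.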

Eq. (2.5) implies that we can estimate the depth l only if we know three parameters: r, ϕ and θ. r is given. The angle ϕ can be calculated on the basis of the camera's horizontal view angle α (Eq. (2.1)) as:

2ϕ = (α / W) · W′ , (2.6)

where W is the width of the captured image in pixels and W′ is the width of the captured image between the columns forming the symmetric pair of panoramic images, also given in pixels. To calculate the angle θ, we have to find the corresponding points in the panoramic images. Our system works by moving the camera by the angle corresponding to one pixel column of the captured image. If we denote this angle by θ0, we can express the angle θ as:

θ = dx · θ0 / 2 , (2.7)

where dx is the absolute value of the difference between the image coordinates of the corresponding points along the horizontal axis x of the panoramic images.
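
Putting Eqs. (2.5)–(2.7) together, the whole depth computation fits in a few lines; this is our own illustrative sketch, using parameter values quoted later in the chapter (r = 300 mm, θ0 = 0.205714°):

    import math

    def depth(r, two_phi_deg, dx, theta0_deg):
        # Eqs. (2.5)-(2.7): depth l of a scene point from its disparity dx
        phi = math.radians(two_phi_deg / 2.0)
        theta = math.radians(dx * theta0_deg / 2.0)        # Eq. (2.7)
        return r * math.sin(phi) / math.sin(phi - theta)   # Eq. (2.5)

    # depth(300, 29.9625, 1, 0.205714) ~ 302 mm and
    # depth(300, 3.6125, 1, 0.205714) ~ 318 mm, matching the minimal
    # depth estimates l_min quoted in Sec. 2.7.3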

Note that Eq. (2.5) does not contain the focal length f explicitly, but since the relationships between α and f on one side (Eq. (2.2)) and between α and ϕ on the other side (Eq. (2.6)) exist, ϕ also depends upon f (the two models for estimating the angle ϕ, Eqs. (2.6) and (2.8), are discussed in Sec. 2.7.2):

ϕ = arctan( (W′/2) / f ) . (2.8)

Eq. (2.5) estimates the distance l to the perpendicular projection of the scene point P onto the plane defined by the camera's circular (planar) path.


Figure 2.5: Important relations between the system parameters for addressing the vertical reconstruction.

The projection of the scene point P is marked with P′ in Fig. 2.5. Since this estimate is an approximation of the real l, we have to improve it by addressing the vertical reconstruction, i.e. by incorporating the vertical view angle β into Eq. (2.5).

Let us adopt the following notation to introduce the influence of β on the estimation of l: if a variable l or d depends on α only, we mark it as l(α) or d(α) (until now, these variables were marked simply l and d), but if a variable l or d depends on both α and β, we mark it as l(α, β) or d(α, β). According to Fig. 2.5, the distance to the point P on the scene can be calculated as:

l(α, β) = √( l(α)² + Y² ) = √( l(α)² + ( l(α) · tan ω2 )² ) .

Because the value of ω2 is unknown, we have to express it in terms of known parameters. We can do that, since Y can also be written as:

Y = d(α) · tan ω1 .

We can calculate ω1 similarly to the way we calculated ϕ (Eqs. (2.6) and (2.8)):

2ω1 = (β / H) · H1 or ω1 = arctan( (H1/2) / f ) ,


where H is the height of the captured image in pixels and H1 is the height of the captured image between the image row containing the projection of the scene point P and the symmetric row on the other side of the middle row, also given in pixels. d(α) follows from Eq. (2.4):

d(α) = l(α) · sin θ / sin ϕ .

Now we can write the equation for l(α, β) as:

d(α) = l(α)·sinθ sinϕ . Now,we can write the equation forl(α, β) as:

l(α, β) = √( l(α)² + ( l(α) · (sin θ / sin ϕ) · tan ω1 )² ) . (2.9)

From now on, l = l(α), and when l(α, β) is used, this is explicitly stated.

The influence of addressing the vertical reconstruction on the reconstruction accuracy is discussed in Secs. 2.7.6 and 2.8.4.
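
A direct transcription of Eq. (2.9), using the same illustrative Python conventions as the depth sketch above, would be:

    import math

    def depth_vertical(l_alpha, two_phi_deg, dx, theta0_deg, omega1_deg):
        # Eq. (2.9): refine l(alpha) with the vertical angle omega_1
        phi = math.radians(two_phi_deg / 2.0)
        theta = math.radians(dx * theta0_deg / 2.0)
        y = (l_alpha * math.sin(theta) / math.sin(phi)
             * math.tan(math.radians(omega1_deg)))
        return math.hypot(l_alpha, y)   # sqrt(l(alpha)^2 + Y^2)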


2.7 Analysis of the system’s capabilities

2.7.1 Time complexity of panoramic image creation

The biggest disadvantage of our system is that it cannot produce panoramic images in real time, since we create them stepwise by rotating the camera by a very small angle. Because of the mechanical vibrations of the system, we also have to ensure that an image is captured only when the system is completely still. The time that the system needs to create a panoramic image is much too long to allow it to work in real time.

In a single circle around the system's vertical axis our system constructs 11 panoramic images: 5 symmetric pairs and a panoramic image from the middle columns of the captured images. It captures and saves 1501 images with a resolution of 160×120 pixels, where the radius is r = 30 cm and the shift angle is θ0 = 0.205714°. We have chosen the resolution of 160×120 pixels because it represents a good compromise between the overall time complexity of the system and its accuracy, as is shown in the following sections. We cannot capture 360°/θ0 images because of the limitation of the rotational arm. Namely, the rotational arm cannot turn a full 360 degrees around its vertical axis.

The middle column of the captured image was in our case the 80th column. The distances between the columns building up the symmetric pairs of panoramic images were 141, 125, 89, 53 and 17 columns. These numbers include the two columns building up each pair. Consequently, the values of the angle 2ϕ (Eq. (2.6)) are 29.9625° (141 columns), 26.5625° (125 columns), 18.9125° (89 columns), 11.2625° (53 columns) and 3.6125° (17 columns), respectively. (Here we used the camera with the horizontal view angle α = 34°.)

The acquisition process takes a little over 15 minutes on a 350 MHz Intel PII PC.

The steps of the acquisition process are as follows:

1. Move the rotational arm to its initial position.

2. Capture and save the image.

3. Contribute image parts to the panoramic images.

4. Move the arm to the new position.

5. Check in a loop whether the arm is already in the new position. The communication between the program and the arm is written to a file for debugging purposes.

After the program exits the loop, it waits for 300 ms in order to stabilize the arm in the new position.

6. Repeat steps 2 to 5 until the last image is captured.

7. When the last image has been captured, contribute its image parts to the panoramic images and save them.


We could achieve faster execution, since our code is not optimized. For example, we did not optimize the waiting time (300 ms) after the arm reaches a new position. No computations are done in parallel.
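
As a rough consistency check of the reported acquisition time (our own back-of-the-envelope arithmetic, not a measurement):

    n_images = 1501
    total_s = 15.5 * 60                   # "a little over 15 minutes"
    per_image_s = total_s / n_images      # ~0.62 s per capture-contribute-move cycle
    outside_settle_s = per_image_s - 0.3  # ~0.32 s left for capture, save and move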

2.7.2 Influence of parameters r, ϕ and θ0 on the reconstruction accuracy

In order to estimate the depth as precisely as possible, the parameters involved in the calculation also have to be estimated precisely. In this section we present the methods used for the estimation of the parameters r, ϕ and θ0.

θ0 denotes the angle, corresponding to one pixel column of the captured image, by which we rotate the camera. It can be calculated from Eq. (2.3):

θ0 = α / W . (2.10)

For α = 34° and W = 160 pixels we get θ0 = 0.2125°. On the other hand, we know that the accuracy of our rotational arm is ε = 0.0514285°, so the best possible approximate value is θ0 = 0.205714°. Since each column in the panoramic image in reality describes the latter angle θ0, we always use in the calculations the value θ0 = n · ε, n ∈ IN, which is closest to the result obtained from Eq. (2.10). The experiment in Sec. 2.8.5 confirms that this decision is correct. To discriminate between the two values, let us mark them as θ0(α) (Eq. (2.10)) and θ0(ε) (the estimate based on the accuracy of our rotational arm). We use these marks from now on; where only θ0 is given, θ0 = θ0(ε).
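
In other words, θ0(α) is snapped to the nearest integer multiple of the arm's basic step ε; as a short check (our own illustration):

    eps = 0.0514285                  # basic step of the rotational arm [degrees]
    theta0_alpha = 34.0 / 160        # Eq. (2.10): 0.2125 degrees
    n = round(theta0_alpha / eps)    # n = 4
    theta0_eps = n * eps             # 0.205714 degrees, used in all calculations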


Figure 2.6: The relation between the parameters which are important for determining the radius r.

r represents the distance between the rotational center of the system and the optical center of the camera. Since the exact position of the optical center is normally not known (it is not given by the manufacturer), we have to estimate its position. Optical firms with their special equipment would do the best job, but since this has not been an option for us, we have used a simple method, which has proved quite useful (Fig. 2.6): first the camera's horizontal view angle α has been estimated. Then we have captured a few images of the mm grid paper from known distances di, measured from one point on the camera to the paper. The optical axis has been assumed to be perpendicular to the paper surface. From each image we have read the visible width Wi of the paper in mm and used all the now known values (α, di and Wi) to estimate the distance d from the paper to the optical center by manually drawing a geometrically precise relation between the parameters. More distances di have been used to check the consistency of all estimates. In the end, the position of the optical center has been calculated as an average over all estimated values. Because we know the distances di and d, we also know the position of the optical center with respect to the point on the camera from which we have measured the distances di. Finally, we can measure the distance r. Nevertheless, this is a rough estimate of the optical center position, but it can be optimized, as shown in the experiment in Sec. 2.8.9.

ϕ determines the column of each captured image which is mosaiced into the panoramic image. The two models for estimating the angle ϕ (Eqs. (2.6) and (2.8)) differ from one another: the first one is linear, while the second one is not. But since we use cameras with the maximal horizontal view angle α = 39.72°, the biggest possible difference between the two models is only 0.3137° (at the point where the ratio W′/W = 91/160). In the experiments we use such values of ϕ that the difference is very small, i.e. the biggest difference is lower than 0.1°. The experiment in Sec. 2.8.6 shows that we obtain slightly better results with the linear model for a given (estimated) set of parameters. This is why the linear model was used in all the other experiments.

We discuss the angle θ0 and the radius r in relation to the one-pixel error in the estimation of the angle ϕ at the end of Sec. 2.7.4.

2.7.3 Constraining the search space on the epipolar line

Knowing that the width of the panoramic image is much bigger than the width of the captured image, we would have to search for a corresponding point along a very long epipolar line (Fig. 2.7a). Therefore we would like to constrain the search space on the epipolar line as much as possible. This means that the stereo reconstruction procedure executes faster. A side effect is also an increased confidence in the estimated depth.

From Eq. (2.5) we can derive two conclusions which nicely constrain the search space:

1. Theoretically, the minimal possible depth estimate is lmin = r. This holds for θ = 0. However, it is impossible in practice, since the same point on the scene cannot be seen in the column that will be mosaiced in the panorama for the left eye and at the same time in the column that will be mosaiced in the panorama for the right eye. If we observe the horizontal axis of the panoramic image with respect to the direction of the rotation, we can see that every point on the scene that is shown in both panoramic images (Fig. 2.4) is first imaged in the panorama for the left eye and then in the panorama for the right eye. Therefore, we have to wait until the point imaged in the column building up the left eye panorama moves in time to the column building up the right eye panorama. If θ0 denotes the angle by which the camera is shifted, then 2θmin = θ0. In consequence, we have to make at least one basic shift of the camera to enable a scene point projected in the right column of the captured image forming the left eye panorama to be seen in the left column of the captured image forming the right eye panorama.

Figure 2.7: We can effectively constrain the search space on the epipolar line: a) unconstrained length of the epipolar line: 1501 pixels; b) constrained length of the epipolar line: 145 pixels (2ϕ = 29.9625°); c) constrained length of the epipolar line: 17 pixels (2ϕ = 3.6125°).

Based on this fact, we can search for the corresponding point in the right eye panorama starting from the horizontal image coordinate x + 2θmin/θ0 = x + 1 onward, where x is the horizontal image coordinate of the point in the left eye panorama for which we are searching for the corresponding point. We get the value +1 since the shift for the angle θ0 describes the shift of the camera for a single column of the captured image.

In our system, the minimal possible depth estimate lmin depends on the value of the angle ϕ:

lmin(2ϕ = 29.9625°) = 302 mm
...
lmin(2ϕ = 3.6125°) = 318 mm.
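The quoted values of lmin can be reproduced numerically. The sketch below assumes that Eq. (2.5), which is given earlier in the text and not repeated here, has the form l = r · sin ϕ / sin(ϕ − θ); this form is our assumption, but it agrees with both values above:

```python
import math

def depth(r_mm, phi_deg, theta_deg):
    # Assumed form of Eq. (2.5): l = r * sin(phi) / sin(phi - theta)
    phi, theta = math.radians(phi_deg), math.radians(theta_deg)
    return r_mm * math.sin(phi) / math.sin(phi - theta)

r = 300.0                  # radius of the camera path (mm)
theta0 = 0.205714          # one-column rotation angle (degrees)
theta_min = theta0 / 2.0   # from 2 * theta_min = theta0

for two_phi in (29.9625, 3.6125):
    l_min = depth(r, two_phi / 2.0, theta_min)
    print(two_phi, round(l_min))   # prints 302 and 318
```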


Figure 2.8: Constraining the search space on the epipolar line in the case of 2ϕ = 29.9625°. In the left eye panorama (top image) we have denoted the point for which we are searching the corresponding point with a green cross. In the right eye panorama (bottom image) we have used green color to mark the part of the epipolar line on which the corresponding point must lie. The best corresponding point is marked with a red cross. With blue crosses we have marked a number of points which represented the temporary best corresponding point before the point with the maximal correlation was actually found.

2. Theoretically, the depth estimate is not constrained upwards, but from Eq. (2.5) it is evident that the denominator must be non-zero. Practically, this means that for the maximal possible depth estimate lmax the difference ϕ − θmax must lie in the interval (0, θ0/2). We can write this fact as: θmax = n · θ0/2, where n = ϕ div (θ0/2) and ϕ mod (θ0/2) ≠ 0.

If we write the constraint for the last point which can be a corresponding point on the epipolar line, in analogy with the case of determining the starting point, we have to search for the corresponding point in the right eye panorama up to and including the horizontal image coordinate x + 2θmax/θ0 = x + n. Here x is the horizontal image coordinate of the point in the left eye panorama for which we are searching for the corresponding point.

Equivalently, as in the case of the minimal possible depth estimate lmin, the maximal possible depth estimate lmax also depends upon the value of the angle ϕ:

lmax(2ϕ = 29.9625°) = 54687 mm
...
lmax(2ϕ = 3.6125°) = 86686 mm.

In the following sections we show that we cannot trust the depth estimates near the last point of the epipolar line search space, but we have proven that we can effectively constrain the search space.

To illustrate the use of the specified constraints on real data, let us present the following example describing the working process of our system: while the width of the panorama is 1501 pixels, when searching for a corresponding point we have to check only ϕ div (θ0/2) = 145 pixels in the case of 2ϕ = 29.9625° (Figs. 2.7b and 2.8) and only 17 pixels in the case of 2ϕ = 3.6125° (Fig. 2.7c).
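The constrained search range itself is a one-line computation. A sketch with assumed integer pixel coordinates follows; it reproduces the values 145 and 17 quoted above:

```python
def epipolar_search_range(x, phi_deg, theta0_deg):
    """Inclusive pixel range [x + 1, x + n] on the epipolar line in the
    right eye panorama for the point at column x in the left eye panorama."""
    n = int(phi_deg // (theta0_deg / 2.0))   # n = phi div (theta0 / 2)
    return x + 1, x + n

print(epipolar_search_range(0, 29.9625 / 2.0, 0.205714))  # (1, 145)
print(epipolar_search_range(0, 3.6125 / 2.0, 0.205714))   # (1, 17)
```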

From the last paragraph we could conclude that the stereo reconstruction procedure is much faster for a smaller angle ϕ. However, in the next section we show that a smaller angle ϕ unfortunately also has a negative property.

2.7.4 Meaning of the one-pixel error in estimation of the angle θ

Figure 2.9: The dependence of depth l on angle θ (Eq. (2.5); r = 30 cm and two different values of ϕ are used): a) 2ϕ = 29.9625°, b) 2ϕ = 3.6125°. To visualize the one-pixel error in the estimation of the angle θ, we have marked the interval of width θ0/2 = 0.102857° between the vertical lines near the third point.

Let us first define what we mean by the term one-pixel error. As the images are discrete, we would like to know the value of the error in the depth estimation if we miss the right corresponding point by only one pixel. And we would like to have this information for various values of the angle ϕ.
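Using the same assumed form of Eq. (2.5) as in the sketch in Sec. 2.7.3, the one-pixel error can be probed numerically: a one-pixel miss shifts θ by θ0/2, and the resulting difference between the two depth estimates grows quickly as θ approaches ϕ:

```python
import math

def depth(r_mm, phi_deg, theta_deg):
    # Assumed form of Eq. (2.5): l = r * sin(phi) / sin(phi - theta)
    phi, theta = math.radians(phi_deg), math.radians(theta_deg)
    return r_mm * math.sin(phi) / math.sin(phi - theta)

r, theta0 = 300.0, 0.205714
phi = 29.9625 / 2.0                       # using 2*phi = 29.9625 degrees
for n in (10, 100, 140):                  # disparity in pixels
    theta = n * theta0 / 2.0
    one_pixel_error = depth(r, phi, theta + theta0 / 2.0) - depth(r, phi, theta)
    print(n, round(depth(r, phi, theta)), round(one_pixel_error))
```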
