This tutorial provides a guide to research on model reconstruction from surface measurements. Range sensors capture dense 3D point measurements of an object's surface. Reverse engineering accurate 3D models of real objects from 3D surface measurements is recognised as an important research goal. Potential application domains include industrial product design and computer-generated imagery for film and multimedia.
The general problem of model reconstruction from surface measurements does not assume any prior knowledge of the object shape or topology. Conventional range sensors capture 3D surface measurements with respect to a 2D image plane; this is referred to as a 2.5D range image. Multiple view range images are required to capture data for the entire surface of an object. Model reconstruction from multiple range images is addressed in three stages:
A review of each of the above areas is presented to give an overview of the problem and proposed solutions. At the end of each section references to key research papers are provided for further detailed information. A pictorial overview shows each stage in the process of model reconstruction for an arbitrary unknown object.
Recently hand-held range sensors with six degrees-of-freedom have been developed for capturing measurements of objects with complex surface shape. The hand-held sensor position and orientation are measured with respect to a global coordinate system. Model reconstruction from hand-held sensor data does not require a registration step. Viewpoint planning is also not required with a hand-held sensor as this function is performed by the operator who moves the sensor to capture data for the entire object surface. Model reconstruction from both conventional 2.5D range images and hand-held range sensor data is discussed.
The goal of registration is to transform sets of surface measurements into a common coordinate system. To capture a complete object surface, multiple 2.5D range images from different viewpoints are required, as illustrated.
Besl and McKay 1992 introduced the Iterative Closest Point (ICP) algorithm to register two sets of points on a free-form surface. ICP is 'a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces'. Extensions of this algorithm are now widely used for registration of multiple sets of surface data. The original algorithm registers two point sets provided one is a subset of the other and the transform between the sets is approximately known. The original ICP algorithm operates as follows, given point set P and surface Q, where P is a subset of Q:
This approach will converge to the nearest local minimum of the sum of squared distances between closest points. A good initial estimate of the transformation between point sets is required to ensure convergence to the correct registration. Incorrect registration may occur if the error in the initial transformation is too large (typically greater than 20 degrees) or if the surface does not contain sufficient shape information for convergence.
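The closest point iteration can be illustrated with a short sketch. The following Python code is an illustrative minimal version, not the original formulation: it uses brute-force nearest point search and the standard SVD solution for the least-squares rigid transform.

```python
import numpy as np

def best_rigid_transform(P, Q):
    """Least-squares rotation R and translation t mapping points P onto
    their corresponding points Q (SVD of the cross-covariance matrix)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)               # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cQ - R @ cP

def icp(P, Q, iterations=20):
    """Iterate: closest-point correspondences, then best rigid transform."""
    P = P.copy()
    for _ in range(iterations):
        # brute-force closest point in Q for every point of P
        d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
        matches = Q[d.argmin(axis=1)]
        R, t = best_rigid_transform(P, matches)
        P = P @ R.T + t
    return P
```

In practice the nearest point search dominates the cost and is accelerated with a spatial data structure.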
Modifications to the original ICP algorithm have been made to improve the rate of convergence and to register partially overlapping sets of points. Chen and Medioni 1992 demonstrated the registration of partially overlapping range image data. A modified cost function was used to compute the registration which minimises the squared distance in the direction of the surface normal. This cost function gives improved rates of convergence. Zhang 1994 uses a modified cost function based on robust statistics to limit the maximum distance between closest points, allowing registration of partially overlapping data. Turk and Levoy 1994 modified the original ICP algorithm to register partially overlapping triangulated meshes constructed from range image data. Nearest point correspondences are not used if either point is on the mesh boundary or the distance exceeds a constant threshold. These modifications to the original ICP are now widely used to achieve accurate registration of pairs of overlapping range images.
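The point-to-plane cost underlying the Chen and Medioni modification is easily stated: for source points p_i matched to destination points q_i with unit normals n_i, the error is the sum of ((p_i - q_i) . n_i)^2. A short sketch (the helper name is illustrative) shows why this improves convergence: motion tangential to the surface incurs no cost, so overlapping patches can slide into alignment.

```python
import numpy as np

def point_to_plane_error(P, Q, N):
    """Point-to-plane cost: squared distance from each source point to the
    tangent plane of its matched destination point.
    P: source points, Q: matched destination points, N: unit normals at Q."""
    return np.sum(np.einsum('ij,ij->i', P - Q, N) ** 2)
```

Sliding the source points parallel to a planar patch leaves the error at zero, whereas the point-to-point cost would penalise the motion.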
The cost of performing ICP registration depends on the efficiency of the nearest point evaluation. Turk and Levoy 1994 used uniform spatial subdivision to partition the set of mesh vertices to achieve efficient local search. Similarly, Zhang 1994 used a k-d tree to partition the point sets. Both approaches give a significant reduction in the computational cost of registering large range image data sets.
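Uniform spatial subdivision can be sketched as a hash of points into cubic cells; a query then examines only its own cell and the 26 neighbours. The helper names below are illustrative, and the search is only guaranteed exact when the true nearest point lies within one cell width of the query, which is acceptable for ICP where correspondences are approximately known.

```python
import numpy as np
from collections import defaultdict

def build_grid(points, cell):
    """Hash each point index into a cubic cell of side `cell`."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple((p // cell).astype(int))].append(i)
    return grid

def nearest(q, points, grid, cell):
    """Search the query's cell and its 26 neighbours; fall back to a full
    scan only if the neighbourhood is empty."""
    key = (q // cell).astype(int)
    candidates = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                candidates += grid.get((key[0] + dx, key[1] + dy, key[2] + dz), [])
    if not candidates:              # query far from all data
        candidates = list(range(len(points)))
    d = np.linalg.norm(points[candidates] - q, axis=1)
    return candidates[int(d.argmin())]
```

A k-d tree offers the same local-search benefit with exact results at a modest extra implementation cost.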
Multiple range images are often required to capture the complete surface of an object. The extended ICP algorithm computes the registration between pairs of overlapping point sets. Registration of more than two point sets into a common coordinate frame has been achieved by sequential pairwise registration of overlapping point sets, for example Chen and Medioni 1992. However, pairwise registration does not compute the optimal result. For example, if we have three overlapping point sets and we compute the pairwise registration between sets one and two followed by the pairwise registration between sets one and three, then we do not necessarily minimise the mean square distance between sets two and three.
Turk and Levoy 1994 performed pairwise registration of multiple range images against a single cylindrical scan of the object. This approach is limited as in general a cylindrical scan of the entire object is not available. Blais and Levine 1995 report simultaneous pairwise registration of multiple view range images including registration of the first and last views. Bergevin et al. 1996 registered multiple range images simultaneously using an extended ICP algorithm to minimise the sum of squared distances for all views. This approach ensures an even distribution of registration errors between overlapping views and achieves errors less than the range image measurement noise for multiple views of complex objects. Eggert et al. 1996 and Stoddart et al. 1996 both perform registration of multiple overlapping point sets using a force-based optimisation. Interconnections between nearest points are represented by springs to give stable convergence to a local minimum for all point sets.
The accuracy of registration obtained using the ICP algorithm depends on the surface shape. If insufficient shape information is available then inaccurate or incorrect registration may occur. Pennec and Thirion 1996 developed a framework for characterisation of uncertainty in point set registration. Stoddart et al. 1996 extended this framework to the problem of surface registration and introduced a registration index to indicate the quality of registration based on the surface shape. Brujic and Ristic 1996 performed an empirical study of registration accuracy using a Monte Carlo simulation. This study confirmed that the registration accuracy depends on surface shape and that the ICP registration error converges to zero as the number of points increases. Pito 1997 presented a registration aid which can be placed in the scene with the object to ensure sufficient shape information for accurate registration of multiple range image views. The surface of the aid is designed such that a range image taken of it from any viewpoint can be accurately registered on a model of the aid.
The goal of geometric fusion is to integrate registered sets of surface measurements into a single 3D surface model. The generic problem of surface reconstruction is to estimate a manifold surface that approximates the unknown object surface from a set of sampled 3D points. This makes no prior assumptions about the surface shape or topology. For a given method, we want to be able to specify conditions on the original surface and the measurement process that guarantee a correct approximation. Correct approximation of the object surface requires that the model has the same topology and is an accurate approximation of the surface geometry.
In the general problem of reconstruction of a manifold surface model from a set of unorganised points no information is available about measurement connectivity. Surface reconstruction from multiple 2.5D range images is a sub-problem which can also be addressed as a set of unorganised points. However, reconstruction from 2.5D range images is a more constrained problem as the 2D image structure can be used to estimate the local surface topology.
Reconstruction of an integrated 3D surface from multiple 2.5D range images has received considerable interest. Recently, fusion of surface measurements from hand-held sensor data has also been achieved. Two approaches have been proposed for fusion of multiple overlapping surface measurements into a single surface model:
A common step in the fusion of multiple 2.5D range images is an initial triangulation of each image to obtain a 3D mesh which approximates the local surface topology and geometry. A step discontinuity constrained triangulation is performed using the 2D image structure to estimate the local surface continuity, as illustrated. A continuous surface between adjacent measurements is assumed if their distance is less than a constant threshold. Setting the threshold should take into account two factors: the measurement sampling resolution and the measurement uncertainty. Correct reconstruction requires that measurements are only connected if there is a high probability that they originate from the same surface. This assumes that a sufficiently small threshold exists that allows the detection of discontinuities. Further information on step discontinuity constrained triangulation is presented in Soucy and Laurendeau 1995, Turk and Levoy 1994 and Hilton et al. 1996.
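The constrained triangulation rule can be sketched directly on the image lattice: each 2x2 pixel neighbourhood contributes two triangles, and a triangle is kept only if all of its edges are shorter than the continuity threshold. The function below is an illustrative sketch, not a specific published implementation.

```python
import numpy as np

def triangulate_range_image(xyz, threshold):
    """Step discontinuity constrained triangulation of a 2.5D range image.
    xyz: H x W x 3 array of 3D measurements on the 2D image lattice.
    Each 2x2 pixel quad yields up to two triangles; a triangle is kept
    only if all of its edges are shorter than `threshold`."""
    H, W, _ = xyz.shape
    def ok(a, b):
        return np.linalg.norm(xyz[a] - xyz[b]) < threshold
    tris = []
    for r in range(H - 1):
        for c in range(W - 1):
            p00, p01, p10, p11 = (r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)
            if ok(p00, p01) and ok(p00, p10) and ok(p01, p10):
                tris.append((p00, p01, p10))
            if ok(p11, p01) and ok(p11, p10) and ok(p01, p10):
                tris.append((p01, p11, p10))
    return tris
```

A depth step between adjacent pixels removes exactly the triangles that would have bridged the discontinuity, leaving the two surfaces disconnected.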
Mesh integration techniques aim to merge multiple overlapping 3D triangulated meshes into a single triangulated mesh. The use of a triangulated model allows the representation of surfaces of arbitrary geometry and topology. The triangulation based integration algorithms all share a common first step of step discontinuity constrained triangulation of individual 2.5D range images.
Boissonnat 1984 and Rutishauser et al. 1994 use a graph-based approach to determine correspondences between overlapping meshes. Retriangulation of two overlapping meshes to form a single mesh is then performed using local 2D constraints on triangle shape. This approach may fail to correctly reconstruct the surface in regions of high curvature and thin object sections.
Soucy and Laurendeau 1995 integrate triangulated meshes using canonic subsets of the Venn diagram. The canonic subsets each represent the overlap between a subset of the 2.5D range images and are associated with a 2D viewpoint reference frame. The 2D reference frames are used to eliminate redundant data and merge intersecting regions.
Turk and Levoy 1994 integrate overlapping triangulated meshes using a `zippering' approach. Overlapping meshes are eroded and the boundary correspondences found by operations in 3D space. A local 2D constrained triangulation is then used to join the overlapping mesh boundaries. This approach may fail for complex geometries if incorrect boundary correspondences are found. This can occur in regions of high surface curvature.
Each of these algorithms requires the 3D surface measurements to be projected to a 2D plane. For correct surface reconstruction it is required that the 3D to 2D projection is injective (i.e. it respects the spatial ordering of the surface measurements). This requirement results in an inherent limitation on the maximum surface curvature for correct reconstruction.
Mesh integration techniques enable fusion of multiple range images without loss of accuracy as the mesh vertices coincide with the measured data points. In general mesh integration is sensitive to erroneous measurements which may cause catastrophic failure. Pito 1996 addressed the sensitivity to erroneous measurements by incorporating visibility constraints in the mesh integration. Current mesh integration techniques are computationally expensive and do not allow reliable reconstruction from arbitrary surface measurements such as those obtained from a hand-held range sensor.
Volumetric fusion of surface measurements constructs an intermediate implicit surface which combines overlapping measurements into a single representation. The implicit surface representation is an iso-surface of a spatial field function, f(x,y,z)=constant. For example if the field function is defined as the distance to the nearest point on the object surface then the implicit surface is represented by all points where the field function is zero, f(x,y,z)=0. This representation allows modelling of unknown objects with arbitrary topology and geometry. This is distinct from implicit shape functions such as quadrics which assume a fixed topology and require a shape fitting procedure.
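As a concrete example of the field-function idea, the signed distance to a sphere is a field function whose zero-set is exactly the sphere surface: negative inside, positive outside. A minimal sketch:

```python
import numpy as np

def sphere_field(p, centre=np.zeros(3), radius=1.0):
    """Signed distance field of a sphere: f(x,y,z) < 0 inside the object,
    f(x,y,z) = 0 on the implicit surface, f(x,y,z) > 0 outside."""
    return np.linalg.norm(p - centre) - radius
```

For reconstruction from measurements the field is instead the signed distance to the nearest measured surface point, but the inside/on/outside classification works the same way.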
An example of the volumetric field function for range data of a real object is shown. The right hand side of the figure shows a rendered version of the triangulated mesh with a plane cutting through the volume. The left hand side of the figure shows a cross-section on the cutting plane through the volumetric field function for all points where the magnitude is less than a constant threshold. The colour indicates the magnitude of the field function: red corresponds to the maximum value, blue to the minimum value, and the yellow/green boundary to the implicit surface, f(x,y,z)=0.
Hoppe et al. 1992 introduced the use of an intermediate implicit surface representation for reconstruction from unstructured sets of 3D points. A Euclidean minimum spanning tree is constructed to estimate the local surface topology based on the point-to-point distance. A local planar fit to a point neighbourhood is used to estimate the surface orientation. An implicit surface representation is then constructed based on the zero-set of a signed field function which is the distance to the nearest point on the surface. Marching Cubes implicit surface polygonisation is then used to reconstruct an explicit triangulated surface model. Lorenson and Cline 1987 developed the Marching Cubes algorithm for triangulation of iso-surfaces in discrete volumetric field function representations such as 3D medical images. Implementations of Marching Cubes are widely available, such as Bloomenthal 1994. There are two major limitations of this approach: the computational cost of constructing the minimum spanning tree and restrictions on the object shape. Estimation of surface topology based on point-to-point distance limits the complexity of the surface geometry that is correctly reconstructed. The approach will fail for regions of high curvature or thin surface sections. To overcome this problem, techniques have been developed to construct an intermediate implicit surface representation from multiple 2.5D range images using the image structure to estimate the local surface topology.
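The cell-selection step at the heart of Marching Cubes can be sketched simply: a grid cell is triangulated only if its eight corner samples straddle the zero level. The sketch below performs just this sign-change test; the full algorithm then triangulates each selected cell from a case lookup table, which is omitted here.

```python
import numpy as np

def surface_cells(field):
    """Return indices of grid cells whose eight corner samples straddle
    zero; these are the cells Marching Cubes would triangulate."""
    sign = field >= 0
    cells = []
    nx, ny, nz = field.shape
    for i in range(nx - 1):
        for j in range(ny - 1):
            for k in range(nz - 1):
                corners = sign[i:i + 2, j:j + 2, k:k + 2]
                if corners.any() and not corners.all():
                    cells.append((i, j, k))
    return cells
```

Cells entirely inside or entirely outside the surface are skipped, so the work is proportional to the surface area rather than the volume.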
Hilton et al. 1996 introduced a continuous implicit surface representation constructed from step discontinuity constrained triangulations of multiple 2.5D range images. For a single triangulated mesh a scalar field function, f(x,y,z), is evaluated as the signed-distance to the nearest point on the measured surface triangulation. The resulting implicit surface, f(x,y,z)=0, is continuous and corresponds exactly to the original mesh. Fusion of multiple overlapping range images into a single implicit surface representation is performed by combining field functions evaluated for each range image triangulation. Combination of field functions ensures that overlapping measurements correspond to the same surface region using geometric constraints and statistical tests based on measurement uncertainty. Marching cubes implicit surface polygonisation is then used to generate a single triangulated mesh model of the object surface. This achieves reliable fusion of surface measurements for objects with complex geometry and topology without loss of accuracy.
Curless and Levoy 1996 introduced a discrete implicit surface representation constructed from constrained triangulations of multiple 2.5D range images by carving out the region of empty space between the sensor and the object surface. A run-length encoded volumetric data structure is used to achieve computationally efficient fusion and reduce storage costs. This resulted in an order of magnitude reduction in computational cost compared to all previous geometric fusion algorithms, enabling model reconstruction from millions of points in seconds. However, the discrete implicit surface representation does not enable reliable geometric fusion for complex geometry such as regions of high curvature or thin surface sections. A secondary advantage of this approach is that the space carving enables filling of holes in the reconstructed model due to unseen surface regions or errors in the fusion. Roth and Wibowo 1997 achieved further reductions in computational cost by developing a discrete volumetric data structure based on points. This approach is demonstrated for a wide range of data sets including unstructured points, triangulated meshes and range images.
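The cumulative weighted combination of signed distances used in this style of volumetric fusion can be written D(x) = sum_i w_i(x) d_i(x) / sum_i w_i(x), where the weights typically reflect measurement confidence and voxels seen by no view remain unknown, marking candidates for hole filling. The helper below is an illustrative sketch of the update rule, not the run-length encoded implementation.

```python
import numpy as np

def fuse(distances, weights):
    """Weighted average of per-view signed distance grids.
    distances, weights: lists of equally shaped arrays, one pair per view.
    Voxels with zero total weight (seen by no view) are returned as NaN."""
    num = sum(w * d for d, w in zip(distances, weights))
    den = sum(weights)
    out = np.full(num.shape, np.nan)
    seen = den > 0
    out[seen] = num[seen] / den[seen]
    return out
```

Because the update is cumulative, each range image can be merged into the volume and discarded, so the memory cost is independent of the number of views.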
Volumetric fusion of range data for the left and right views results in a single integrated model. The right hand side of each figure shows the 3D triangulated mesh and the left hand side a cross-section through the field function for an envelope around the surface. Fusion of the two overlapping field functions from the left and right images results in a single continuous implicit surface representation of the final model.
An inherent limitation of all geometric fusion algorithms based on an intermediate discrete volumetric data structure is a reduction in accuracy resulting in a loss of surface detail. The reconstructed model accuracy depends on the resolution of the discrete volumetric representation. However, the representation storage cost for a free-form object depends on the square (or a higher power) of the resolution, which is proportional to the surface area. Doubling the resolution doubles the accuracy of the reconstructed model but results in at least a factor of four increase in storage costs. For a uniform discrete representation the resolution is determined by the smallest surface feature to be reconstructed. Representation costs are prohibitively expensive for accurate reconstruction of small surface features on large objects. This is a limiting factor for reliable reconstruction of detailed environment models and for accurate reverse engineering.
Hilton and Illingworth 1997 introduced a multi-resolution geometric fusion algorithm to address this problem. A hierarchical discrete volumetric structure for implicit surface representation with bounded error was introduced. This approach constructs a discrete representation with low resolution in smooth surface regions and high resolution in regions of high surface detail such as crease edges. This approach achieved a factor of three reduction in representation cost without a significant reduction in the accuracy of the reconstructed model. Accurate and efficient reconstruction independent of object size remains an open research problem.
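The adaptive principle can be illustrated in two dimensions with a quadtree analogue of the hierarchical volumetric structure: a cell is kept coarse where the field is well approximated by the interpolation of its corner samples, and subdivided where the deviation exceeds a bound. This is only a sketch of the idea, not the published algorithm.

```python
import numpy as np

def adaptive_cells(f, x0, y0, size, tol, max_depth):
    """Quadtree sketch of adaptive resolution: subdivide a cell only while
    the field value at its centre deviates from the mean of its corner
    samples by more than `tol`. Returns (x0, y0, size) leaf cells."""
    cx, cy = x0 + size / 2, y0 + size / 2
    corner_mean = np.mean([f(x0, y0), f(x0 + size, y0),
                           f(x0, y0 + size), f(x0 + size, y0 + size)])
    if max_depth == 0 or abs(f(cx, cy) - corner_mean) <= tol:
        return [(x0, y0, size)]
    half = size / 2
    return (adaptive_cells(f, x0, y0, half, tol, max_depth - 1) +
            adaptive_cells(f, cx, y0, half, tol, max_depth - 1) +
            adaptive_cells(f, x0, cy, half, tol, max_depth - 1) +
            adaptive_cells(f, cx, cy, half, tol, max_depth - 1))
```

A smooth (here linear) field stays at a single coarse cell, while a crease forces subdivision only around the feature, which is the source of the storage saving.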
Reconstruction of a model for a complete object surface requires range images from multiple views. For an unknown object the number of views and their optimal positions is not known prior to data acquisition. Therefore, techniques are required to select the next best view based on measurement of part of an object's surface. The optimal set of views for capturing a complete object surface will depend on both the unknown surface shape and the known sensor geometry and degrees-of-freedom.
Connolly 1985 partitions the object space using an octree representation. The surface of the object is then scanned, and regions close to the surface are labelled as seen, regions between the sensor and the surface as empty, and all other regions as unseen. A 'planetarium' algorithm is proposed to compute the next best view by evaluating the visibility of the surface for each candidate sensor position. The next best view is selected as the scanner position which maximises the number of unseen cubes. This algorithm is computationally expensive and does not incorporate the sensor geometry. A second 'normal' algorithm counts the faces of cubes which separate empty and unseen regions. The best scanner positions are determined by the three axis directions with the highest number of unseen/empty faces. This is a fast but simple approach which only considers six candidate view directions and does not test visibility.
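The counting step of the 'normal' algorithm can be sketched on a labelled voxel grid: for each of the six axis directions, count faces where an unseen voxel is adjacent to an empty voxel on that side. The labels and helper name below are illustrative.

```python
import numpy as np

EMPTY, UNSEEN, SEEN = 0, 1, 2

def best_directions(grid):
    """For each of the six axis-aligned view directions, count faces where
    an unseen voxel borders an empty voxel on that side; directions with
    high counts expose many unseen voxels to the sensor."""
    counts = {}
    for axis in range(3):
        a = np.swapaxes(grid, 0, axis)
        near, far = a[:-1], a[1:]
        # viewing from the +axis direction: empty space beyond the unseen voxel
        counts[(axis, +1)] = int(np.sum((near == UNSEEN) & (far == EMPTY)))
        counts[(axis, -1)] = int(np.sum((near == EMPTY) & (far == UNSEEN)))
    return counts
```

As the text notes, this restricts the choice to six directions and performs no visibility test, which is what makes it fast but simple.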
Maver and Bajcsy 1993 developed a viewpoint planning algorithm for a light stripe range sensor constrained to translations and rotations in a plane above the object. Unseen regions of the object are represented as polygons. Visibility constraints for the sensor to view the unseen region are computed from the polygon boundaries. However, this solution is limited to a particular sensor configuration.
Whaite and Ferrie 1994 fit parametric (superquadric) models to the sensed data. The next best view is chosen to minimise the uncertainty in the current approximate model by scanning the region where the data fits the model worst. This algorithm enables data-driven exploration of an object to build a model. The approach is limited by the use of a parametric model, which cannot accurately represent objects with a high level of surface detail. Surface visibility constraints are also not incorporated in the viewpoint planning process, resulting in limitations for complex shapes due to occlusion.
Pito 1996 presents a general algorithm for viewpoint planning based on an intermediate position space representation of both sensor visibility constraints and unseen portions of the viewing volume. The next best view is determined as the sensor position which maximises the unseen portion of the object volume. Visibility constraints are only computed a limited number of times for each unseen surface region to avoid prohibitive computational cost. This approach is demonstrated to achieve automatic viewpoint planning for a range sensor constrained to move on a cylindrical path around the object.
All of the approaches described above address the problem of viewpoint selection when the sensor is outside the region of space occupied by the object. However, to achieve accurate measurement for complex objects it may be necessary for the sensor to enter the object space. This requires viewpoint planning algorithms which avoid collisions of the sensor with the object surface. Papadopoulos-Orfanos and Schmitt 1997 propose a path planning algorithm which avoids collisions for a sensor with three translational degrees-of-freedom. Next best view selection avoiding collisions for a sensor with six degrees-of-freedom remains an open problem.
An advantage of hand-held range sensors is that the next best view problem is solved by the operator who manually controls the sensor viewpoint. This enables reconstruction of models of complex objects where the sensor must enter the space occupied by the object.