Current & Past Projects

This page is out of date (circa 2009) and will be updated over the summer.

This collection of illustrated abstracts describes some recent projects I have worked on; many are still ongoing. For more information on a particular project, either visit the project's page (if there is one) or download the papers, which should all be online. You can also contact me by email.

 

3D Snapshots for Mobile Augmented Reality

In collaboration with Vodafone R&D, Munich and partners at Technische Universität München, we are developing a mobile application enabling the capture of 3D objects using cameraphones, and their rendering "in situ" using a video-rate augmented reality application running on the phone. Technical problems to overcome include automatic segmentation of the object, dealing with the poor optics in most mobile cameraphones, tracking highly agile motion, and the engineering challenges involved in video-rate processing on a mobile. The UK team, comprising my colleague Yi-Zhe Song and me, is involved in the 3D reconstruction aspects.

Keitler et al. Mobile augmented reality based 3D snapshots. Submitted to ISMAR 2009

 

Artistic Rendering of Consumer Video

Falling hardware costs have prompted an explosion in casual video capture by domestic consumers. However, once captured, this video is infrequently accessed and often lies dormant on users' PCs. In collaboration with the MIUL team at HP Labs Bristol, we are undertaking a programme of research to breathe life into consumer video repositories, allowing users to effortlessly visualize and 'rediscover' their video collections. We are exploring ways to create compelling displays of casually captured video content, and investigating techniques to intelligently select content for display (e.g. through visual information retrieval, and automatic video directing).

T. Wang, A. Mansfield, R. Hu, J. P. Collomosse. An Evolutionary Approach to Automatic Video Editing. Submitted to CVMP 2009

 

Reverse Storyboarding for Sketch Based Video Retrieval

We are developing a Content Based Video Retrieval (CBVR) system that harnesses storyboards as an efficient and intuitive input mechanism for querying video databases. Storyboards encapsulate high level semantics describing both objects in the scene (spatially) and their movement (dynamics). Dynamics are depicted using a rich vocabulary of motion cues borrowed from animation: streak and ghosting lines and deformations, as well as conventional indicators such as arrowheads. Fusing spatial and dynamic cues for CBVR promises advantages not only in terms of usability, but also in constraining search to yield improved precision and performance over conventional approaches.

J. P. Collomosse, G. McNeill, Y. Qian. Storyboard sketches for content based video retrieval. Intl. Conf on Computer Vision (ICCV) (Sep. 2009).
J. P. Collomosse, G. McNeill, Y. Qian. Motion sketches for video retrieval. BMVA Symposium on Machine Learning (talk/non-publication).
J. P. Collomosse, G. McNeill, L. Watts. Free-hand sketch grouping for video retrieval. Intl. Conf on Pattern Recognition (ICPR) (Dec. 2008).
G. McNeill and J. P. Collomosse. Reverse Storyboarding for Video Retrieval Proc. Conf. Visual Media Production (CVMP). 2007. Work in progress poster.

 

Video-rate Mobile Code Reading (Symbian/WM5)

Camera-equipped mobile phones are ubiquitous, but there has been limited exploitation of these devices beyond the space of casual photo-driven applications. This project explored the use of visual tags (mobile codes) to link physical objects (e.g. posters) to digital content. I was responsible for designing optimized software capable of reading tags at video rates (~10 fps) on small-footprint ARM mobile devices (Symbian, WM5/.Net CF). As well as supporting common ISO code standards, a novel form of code was developed that traded space for time to support applications demanding high data densities and improved aesthetics.

This project is described in more detail on the industrial secondment page.

J. P. Collomosse and T. Kindberg Screen Codes: Visual Hyperlinks for Displays Proc. HOTmobile 2008 (Springer LNCS), to appear (Feb 2008).
J. P. Collomosse and T. Kindberg Screen Codes: Efficient Data Transfer from Video Displays to Mobile Devices 3rd European Conf. on Visual Media Production (CVMP). (November 2007). Work in Progress paper.

 

Artistic Projection (RTCams)

RTCams (Rational Tensor Cameras) are a simple but versatile camera model that subsumes several important contemporary camera models in both Computer Graphics and Vision. They can be used alone, or compounded to produce more complicated visual effects. In this project, led by colleague Peter Hall, we developed RTCams and the frameworks to use them to generate synthetic artwork exhibiting novel perspective effects from photographs. Steps involved along the way included mosaicing, 3D dense reconstruction, matching, and development of novel warping models. Previous image-based NPR was constrained to the projection inherent in the source photograph, which is most often linear. RTCams lift this restriction and so contribute to NPR via multi-perspective projection.
Photo left: Various RTCam effects on a 3D reconstructed Rubik's cube. The UI involved in manipulating an RTCam rendering, resulting in a child's crayon rendering of my neighbour's house.

P. M. Hall, J. P. Collomosse, P. Shen, and Y. Z. Song. RTcams: A New Perspective on Non-Photorealistic Rendering from Photographs IEEE Transactions on Visualization and Computer Graphics (TVCG). 13(5) pp.966-979. September 2007.

 

Mobile Landmark Recognition for Context Discovery

The inclusion of cameras in mobile devices has great potential for novel, context-aware pervasive applications. We have developed a prototype image recognition system capable of identifying landmarks (typically city buildings and structures) from photographs captured on camera phones. Salient characteristics of the skyline are extracted and compared using a minimum edit distance metric to identify a subset of potential matches in a database. This subset is then refined using a slower but more precise feature comparison to identify the best match. Our system is envisaged as an alternative to a GPS fix, e.g. indoors or where orientation may be a complicating factor (several landmarks at a single location).
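The two-stage lookup can be sketched roughly as follows. This is a minimal illustration and not the published system: it assumes each skyline has already been quantised into a string of symbols (the 'u', 'd', 'f' alphabet for rising, falling and flat segments is hypothetical), it uses plain Levenshtein distance as the edit distance, and it leaves the slower feature comparison to a caller-supplied scoring function.

```python
from typing import Callable, Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two symbol strings (two-row dynamic programme)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def shortlist(query: str, database: Dict[str, str], k: int = 3) -> List[str]:
    """Coarse stage: the k landmark names whose skyline strings are closest to the query."""
    ranked = sorted(database,
                    key=lambda name: (edit_distance(query, database[name]), name))
    return ranked[:k]


def best_match(query: str, database: Dict[str, str],
               refine: Callable[[str], float], k: int = 3) -> str:
    """Fine stage: re-score only the shortlist with a slower comparison.

    `refine` is a placeholder (assumption) for the more precise feature
    comparison; it returns a similarity score, higher meaning a better match.
    """
    return max(shortlist(query, database, k), key=refine)
```

The point of the split is cost: the cheap string comparison touches every database entry, while the expensive comparison runs only on the few candidates that survive it.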

J. P. Collomosse, K. Al Mosawi, and E. O'Neill. Viewpoint Invariant Image Retrieval for Context in Urban Environments Proc. Conf. Visual Media Production (CVMP). 2006.
Download the CVMP poster

 

Video Paintbox (Video Stylization)

The Video Paintbox is an integrated suite of novel NPR frameworks, capable of transforming real video clips into artistically stylised animations. Our automated system is able to render video into a wide gamut of visual styles, encompassing cartoon-shading, oil painting, and sketching, and is also able to emphasise motion using traditional animation cues such as streak-lines, anticipation and deformation.
We argue that comprehensive video analysis forms a necessary first step in the artistic rendering (AR) process; salient information (such as object boundaries or trajectories) must be extracted prior to re-presentation in an artistic style. Prior to our work, AR algorithms performed a significantly lower level of analysis; for example, simply distorting small image regions to form brush strokes, and rendering frames independently. This resulted in animations of poor aesthetic quality and restricted style, often exhibiting a distracting flicker that required many hours of manual correction. We have mitigated many of these long-standing problems; for example, reducing flicker in animations by an order of magnitude. Our results enjoyed media exposure both on broadcast television (e.g. CNN, DW-TV) and in print (e.g. The Times).

Selected papers (see here for full list):

J. P. Collomosse, D. Rowntree and P. M. Hall. Stroke Surfaces: Temporally Coherent Non-photorealistic Animations from Video. IEEE Transactions on Visualization and Computer Graphics, 11(5), pp.540-549. IEEE. ISSN: 1077-2626. (September 2005)
J. P. Collomosse and P. M. Hall. Motion analysis in video: dolls, dynamic cues and Modern Art Proc. Video, Vision and Graphics (VVG). pp 109-116. Eurographics Assoc. (July 2005).
J. P. Collomosse and P. M. Hall. A Mid-level Description of Video, with Application to Non-photorealistic Animation Proceedings 15th British Machine Vision Conference (BMVC), vol.1, pp. 7-16. BMVA Press. (September 2004). Joint Winner: BMVA Industry Prize 2004.
J. P. Collomosse, D. Rowntree and P. M. Hall. Video analysis for Cartoon-style Special Effects In Proceedings 14th British Machine Vision Conference (BMVC), vol.2, pp. 749-758. BMVA Press. (September 2003). Winner: BMVA Industry Prize 2003.

 

Autonomous Underwater Vehicle (AUV) - Real-time Vision

SAUC-E AUV Entry at Pinewood Studios
Each year Bath enters the Submersible AUV (SAUC-E) competition run by the MOD/DSTL. I supervise the design of the vision system, which performs real-time pattern recognition and tracking to guide the robot through various tasks. Usually we have one or two students on the vision team who implement the AI and Vision algorithms as part of their final-year dissertations. Last year we came up with a number of interesting incremental learning approaches to locating buoys and target gates. Incredibly, this runs in real time using OpenCV on an onboard WinNT box, but this year we are moving to a lighter-weight Linux distro!
Photo left: William Meggill and Paul Riggs (with AUV), Chris Wallis and I training the vision system via a surface WiFi link to the AUV.

Visit BURST project page

 

Empathic Painting (Expression Recognition)

User interacts with the Empathic Painting
Empathic Painting example states Goal directed painterly rendering
This project was undertaken in collaboration with Boston University and funded by the VVG Network. The "Empathic Painting" is an interactive painterly rendering whose appearance adapts in real time to reflect the perceived emotional state of the viewer. The empathic painting is an experiment into the feasibility of using high level control parameters (namely, emotional state) to replace the plethora of low-level constraints users typically set to affect artistic rendering algorithms.
We developed a suite of Computer Vision algorithms capable of recognising users' facial expressions through the detection of facial action units derived from FACS. Action units are mapped to a space representing emotional state, from which we in turn derive a mapping to the style parameters of a fast segmentation-based painterly rendering algorithm. The result is a digital canvas capable of smoothly varying its painterly style at approximately 4 frames per second, providing a novel user interactive experience using only commodity hardware.
As an adjunct to this project we also explored Sims-like evolutionary algorithms to allow users to interactively specify painterly style.

J. P. Collomosse. Supervised genetic search for parameter selection in painterly rendering Lecture Notes in Computer Science (Proc. EvoMUSART), vol. 3907, pp. 599-610. Springer-Verlag. (April 2006). Winner: Best Paper 2006.
M. Shugrina, M. Betke and J. P. Collomosse. Empathic Painting: Interactive stylization using observed emotional state 4th Intl. Symposium on Non-photorealistic Animation and Rendering (NPAR). pp. 87-96. ACM Press. (June 2006)

 

Genetic Paint (Salience adaptive painting) - Paint by Search

Genetic Paint
When an artist paints a scene, he or she does not paint every detail but instead a salient abstraction created through perception of the scene. This project demonstrated the first automatic painting algorithms based on image salience. An early pilot study explored a rarity-based model of salience. This definition was later expanded to a trainable model of salience, to account for the subjectivity and task dependency of any such definition. The rendering process itself is treated as a search (optimization) for a painting in which the level of detail of a painted region corresponds to the perceptual salience of that region in the original photograph. You can read more about this process on the project page or in the publications below.
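To give a flavour of what a rarity-based salience measure can look like, here is a minimal sketch. It is an assumption made for illustration and not the model used in the pilot study: it scores each pixel by how uncommon its grey level is across the whole image (self-information of the value under the image's own histogram), whereas the published work may use quite different features and statistics.

```python
import math
from collections import Counter
from typing import List


def rarity_salience(image: List[List[int]]) -> List[List[float]]:
    """Per-pixel salience as self-information of the pixel value, scaled to [0, 1].

    Assumption: "rarity" is measured on raw grey levels over the whole image.
    """
    counts = Counter(value for row in image for value in row)
    total = sum(counts.values())
    # log2(total / count) is the self-information -log2 p(value); rarer values score higher.
    info = {value: math.log2(total / count) for value, count in counts.items()}
    peak = max(info.values()) or 1.0  # avoid dividing by zero on a uniform image
    return [[info[value] / peak for value in row] for row in image]
```

A map like this could then steer the level of detail in a painting: regions with high values would receive finer strokes, and regions with low values would be abstracted away.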

J. P. Collomosse and P. M. Hall. Salience-adaptive Painterly Rendering using Genetic Search Intl. Journal on Artificial Intelligence Tools (IJAIT), 15(4) pp.551-576. World Scientific. ISSN: 0218-2130. (August 2006)
J. P. Collomosse and P. M. Hall. Genetic Paint: A Search for Salient Paintings Lecture Notes in Computer Science (Proc. EvoMUSART), vol. 3449, pp. 437-447. Springer-Verlag. (March 2005). Runner-up: Best Paper 2005.
P. M. Hall, M. J. Owen and J. P. Collomosse. A Trainable Low-level Feature Detector Proceedings Intl. Conference on Pattern Recognition (ICPR), vol.1, pp. 708-711. IEEE. (August 2004).
J. P. Collomosse and P. M. Hall. Painterly Rendering using Image Salience In Proceedings 20th Eurographics UK Conference, pp. 122-128. Eurographics. Leicester. (June 2002). Winner: Best student paper.

 

Cubist Style Rendering

Synthetic Picasso-esque Guitar
Synthetic Picasso-esque John
We developed a technique that accepted a set of photographs of the same subject, taken from multiple points of view, then identified, distorted and composited salient features within those images to produce Cubist style paintings.
Prior image-based NPR followed Haeberli's stroke-based paradigm; small atomic primitives (strokes) are composited on a canvas to generate artwork. The attributes of strokes are point-sampled from local image regions. This approach successfully emulates traditional artistic styles, e.g. impressionism.
Our work asked: (a) whether there was any merit in using higher level features (e.g. eyes or noses) as NPR rendering primitives, with the aim of producing abstract artistic compositions; (b) whether control of the artwork could be exerted at a compositional (high) level, rather than at the low level of stroke parameters such as length, angle, etc.

Visit the Cubist Rendering Project page

Front page article in the Times Higher (THES) 3rd January 2003 - Cubist rendering of Charles Clarke MP

J. P. Collomosse and P. M. Hall. Cubist Style Rendering of Photographs. IEEE Transactions on Visualization and Computer Graphics (TVCG), 9(4), pp.443-453. (October 2003)

 

Artistic Video Conferencing (Painting for Compression)

NPR Video Conferencing
Creating artwork is a process of abstraction; paintings and drawings are compact representations of salient scene content. This project explored the possibility of exploiting the compactness of an animated sequence of drawings/paintings to facilitate low-bandwidth video conferencing. In collaboration with my M.Sc. students Chengcheng Li and Amber Pachuri, we developed efficient video-rate painterly rendering algorithms and wrapped these in a GUI and a custom application layer protocol over TCP/IP to build a prototype artistic video conferencing system. Various comparisons were made with H.264 and other photorealistic video codecs often used in video conferencing.

Artistic Rendering Video Conferencing System M.Sc. Thesis Chengcheng Li. 2005.
Performance of Non-photorealistic video conferencing M.Sc. Thesis Amber Pachuri. 2006.

 

Automated Shredded Document Recovery & Jigsaw Solution

Jigsaw Solution
Shredded Document Recovery
This project explored the application of Evolutionary Search strategies (such as Genetic Algorithms) to non-overlapping image stitching (mosaicing). Two applications were identified: shredded document recovery (1D stitching) and automated jigsaw solution (2D stitching). In collaboration with my B.Sc. project students T. Porteous and A. Skeoch, we developed a number of fast search strategies able to reconstruct documents/jigsaws of the order of fifty pieces with good accuracy.
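As a rough illustration of the 1D case, shredded document recovery can be posed as a search over strip orderings that minimises the mismatch along each seam. The sketch below is a toy, mutation-only evolutionary search written for this page; the strip representation, the sum-of-absolute-differences seam cost, the block-move mutation and all parameter values are assumptions, and the students' actual strategies are not reproduced here.

```python
import random
from typing import List, Sequence, Tuple

# Assumption: each strip is reduced to its (left edge pixels, right edge pixels).
Strip = Tuple[Sequence[int], Sequence[int]]


def seam_cost(left_strip: Strip, right_strip: Strip) -> int:
    """Sum of absolute differences across the seam between two adjacent strips."""
    return sum(abs(a - b) for a, b in zip(left_strip[1], right_strip[0]))


def order_cost(order: Sequence[int], strips: Sequence[Strip]) -> int:
    """Total seam cost of laying the strips out left to right in the given order."""
    return sum(seam_cost(strips[a], strips[b]) for a, b in zip(order, order[1:]))


def mutate(order: List[int], rng: random.Random) -> List[int]:
    """Move a random contiguous block of strips to a new position."""
    i, j = sorted(rng.sample(range(len(order) + 1), 2))
    block, rest = order[i:j], order[:i] + order[j:]
    k = rng.randrange(len(rest) + 1)
    return rest[:k] + block + rest[k:]


def evolve(strips: Sequence[Strip], generations: int = 300,
           pop_size: int = 30, seed: int = 0) -> List[int]:
    """Return the cheapest ordering found by an elitist, mutation-only search."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n = len(strips)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        # Elitist selection: parents and children compete, the cheapest survive.
        population = sorted(population + children,
                            key=lambda o: order_cost(o, strips))[:pop_size]
    return population[0]
```

Moving whole blocks rather than swapping single strips is a deliberate choice in this sketch: once a few neighbouring strips have been matched correctly, a block move can relocate them together without breaking the seams already found. The 2D jigsaw case would need a different layout encoding and is not covered here.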
 

Projector-Camera Systems (libprojCalib)

Digital Finger Painting
projCalib calibration
I have supervised a number of student projects exploring projector-camera systems to create interactive systems driven by Computer Vision. Two successful examples were the Digital Fingerpainting (M.Sc.) project by Alia Alabdoli and the Projector Driven Touch Screen (B.Sc.) by Daniel Sanders. These projects were built over my projCalib library, which can automatically determine relative geometry within the system from unknown screen, projector and camera positions and orientations. This is necessary to establish a common coordinate system between the camera and projector, required by most applications. This enables the systems built upon projCalib to be easily and quickly deployed in temporary installations.
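One common way to establish such a shared coordinate system, when the display surface is planar, is to estimate a homography between camera pixels and projector pixels from four known correspondences (for example, four projected markers located in the camera image). The sketch below illustrates that general idea only; it is an assumption and does not describe projCalib's internals or its API, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """3x3 homography H, with H[2][2] fixed to 1, mapping four src points onto four dst points."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def map_point(h: List[List[float]], p: Point) -> Point:
    """Apply H to a point, e.g. to convert a camera pixel to a projector pixel."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

With the mapping in hand, a fingertip detected at a camera pixel can be converted to the projector pixel underneath it, which is what a touch screen or finger painting application needs. A real deployment would typically use more than four correspondences and a robust least-squares fit.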
 

Licence Plate Recognition System (LPRS)

Licence Plate Location on a Mini

This project is old but is included for nostalgic reasons, as it was my B.Sc. dissertation/project. The system used unashamedly bottom-up processing approaches to find UK vehicle licence plates in VGA-resolution images. A multi-classifier "voting" approach to Optical Character Recognition was then used to read candidate plate areas. There were three classifiers in the stack, using moment-based shape descriptors and template matching. It could cope with poor illumination on the plate and, to some extent, with "vanity" plates with italic fonts.
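The voting idea can be sketched in a few lines. This is an illustrative reconstruction and not the original LPRS code: the classifier interface, the strict-majority rule and the '?' placeholder for undecided characters are all assumptions.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Sequence

# Assumption: a classifier maps one segmented character image to a label.
Classifier = Callable[[object], str]


def vote(glyph: object, classifiers: Sequence[Classifier]) -> Optional[str]:
    """Return the label a strict majority of classifiers agree on, else None."""
    ballots = Counter(clf(glyph) for clf in classifiers)
    label, count = ballots.most_common(1)[0]
    return label if count > len(classifiers) / 2 else None


def read_plate(glyphs: Iterable[object], classifiers: Sequence[Classifier],
               unknown: str = "?") -> str:
    """Read a plate character by character, marking undecided characters."""
    return "".join(vote(g, classifiers) or unknown for g in glyphs)
```

With three classifiers, a single confused classifier (say, one that reads "B" as "8") is outvoted by the other two, which is the main attraction of stacking classifiers built on different features.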

Licence Plate Recognition System (LPRS) Documentation (on request)
Licence Plate Recognition System (LPRS) Appendices (on request)