HdrMatting
GSOC2010: HDR and matting operators for GEGL
For the first period of GSOC I concentrated on HDR workflow, and for the latter period on a matting operator.
The following sections describe each major piece of work and provide examples of processed inputs. All images have been downscaled for space/bandwidth, but full scale images can be rebuilt locally using the example package.
High Dynamic Range workflow
While GEGL offers the advantage of high depth data types for processing, there were few areas where this capability was explicitly targeted in its operations. I proposed the development of a collection of native GEGL operations which target HDR processing.
To produce a useful HDR workflow, in terms of GEGL ops, we required the ability to load, store, create, and usefully process HDR images.
The bulk of this work was based on code from the pfscalibration project.
RGBE file load
To effectively use HDR images we need some method of storing and loading this data for future use. To facilitate this I developed a source and a sink operation for the RGBE file format. Choosing this format allows interoperation with common test data in the field.
The RGBE format uses a simple encoding scheme which allows extremely large ranges for component values, while using a maximum of 4 bytes per pixel.
For each pixel of the image, we store a common 8 bit exponent shared by the pixel's channels (the 'E' component). To represent the colour channels we store the corresponding mantissas for R, G, and B in 8 bits each.
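The shared-exponent scheme can be illustrated with a minimal Python sketch. It follows the commonly used RGBE conversion for non-negative channel values; the function names are hypothetical and are not taken from the code in libs/rgbe.

```python
import math

def float_to_rgbe(r, g, b):
    """Encode non-negative linear RGB floats as four bytes (R, G, B, E)."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)  # too dark to represent: all-zero pixel
    # v = mantissa * 2**exponent, with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(v)
    # Scale so that the largest channel lands in [128, 256)
    scale = mantissa * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)

def rgbe_to_float(r, g, b, e):
    """Decode four RGBE bytes back to linear RGB floats."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    factor = math.ldexp(1.0, e - (128 + 8))  # 2**(e - 128) / 256
    return (r * factor, g * factor, b * factor)
```

Because all three channels share one exponent, the brightest channel keeps roughly 8 bits of precision while much darker channels in the same pixel lose precision.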
The relevant operations are gegl:rgbe-load and gegl:rgbe-save, while the RGBE specific code is found in $(GEGL_ROOT)/libs/rgbe.
My implementation can load the RLE and uncompressed formats, and stores in the uncompressed format.
Camera response curve recovery
An effective and popular method for obtaining HDR inputs is to combine multiple exposures of a scene taken at varying brightness, storing the combined dynamic range of the input collection in one image.
This technique is particularly popular as it can be easily applied with any camera supporting manual exposures.
Note how we lack some detail in the floor of the left-most image, while the ceiling is blown out in the right-most.
To combine these images we need to know how the camera's sensor responds to various intensities of light, i.e. its response curve.
The estimated response curve from the test exposures on a Nikon D40. Note the trends for the darkest and brightest areas of the curve.
We use the algorithm of robertson03 to jointly solve for the camera's response curve and the combined scene pixels. This is an iterative algorithm, as with many others in the field, and starts with an initial candidate response curve. The algorithm allows us to specify a confidence weighting across the curve, minimising the inaccuracy often present in the extremes of the sensor's response curve (shown above).
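To make the idea of a confidence weighting concrete, here is a small Python sketch. It is a simplified illustration rather than the robertson03 solver: it assumes a Gaussian-shaped weight centred on the middle of the pixel value range, and merges one pixel from several exposures as a weighted mean of the response divided by the exposure time. The function names and the sigma default are hypothetical.

```python
import math

def gaussian_weights(levels, sigma=8.0):
    """Confidence weight per quantised pixel value, peaking mid-range.

    A larger sigma (hypothetical parameter) concentrates the weight
    more strongly on the centre of the response curve.
    """
    mid = (levels - 1) / 2.0
    return [math.exp(-sigma * ((v - mid) / mid) ** 2) for v in range(levels)]

def merge_pixel(values, times, response, weights):
    """Weighted mean of per-exposure radiance estimates for one pixel.

    values   -- quantised pixel value in each exposure
    times    -- relative exposure time of each exposure
    response -- response[z] maps a pixel value to relative exposure
    weights  -- weights[z] is the confidence in pixel value z
    """
    num = 0.0
    den = 0.0
    for z, t in zip(values, times):
        num += weights[z] * response[z] / t
        den += weights[z]
    return num / den if den > 0.0 else 0.0
```

Pixel values near 0 or the maximum contribute very little, so blown-out or underexposed samples barely influence the merged result.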
Unfortunately, it was discovered that using this algorithm directly for constructing a final HDR image can create areas of false colours in the output. To counter this, we followed the approach of Luminance HDR, where we use the curve that we obtained using robertson03 as input to the exposure combining algorithm of debevec97.
Usage
A dedicated tool for exposure combination, exp_combine, was created which will automatically extract the brightness information from the EXIF data of a list of images and process them using this operation. On the command line:
$(GEGL_ROOT)/tools/exp_combine <output> <input> [<input> ...]
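For readers who want to supply brightness values by hand, the following Python sketch shows one common way to derive an exposure value from camera settings, EV = log2(N^2 / t), and to join the results into a space separated string. This is an assumption for illustration only: the exact quantity and sign convention that exp_combine derives from EXIF is not described here, ISO is ignored, and the function names are hypothetical.

```python
import math

def exposure_value(f_number, shutter_seconds):
    """Exposure value of the camera settings: log2(N^2 / t)."""
    return math.log2(f_number ** 2 / shutter_seconds)

def ev_string(settings):
    """Format (f_number, shutter_seconds) pairs as a space separated EV list."""
    return " ".join("%.2f" % exposure_value(n, t) for n, t in settings)
```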
Alternatively the gegl:exp-combine operation may be used directly.
Parameters
A space separated string specifying the brightness of each image in EV.
The number of quantization levels in the source image (i.e. colour depth), specified in bits. For JPEGs this would be 8, however up to 14 bits is not uncommon when using RAW files. There should be little negative impact when overestimating this parameter.
How highly the centre regions of the response curve are weighted for both curve estimation and exposure combining. This avoids using the more inaccurate portions of the input images.
Tone-mapping operators (TMO)
After we have an HDR image, we need some method of restricting its high range brightness into a range which is displayable, while retaining its aesthetics.
I chose to target three 'tone-mapping operators' for this task (designated by the author and year): reinhard05, fattal02, and mantiuk06.
These operators were chosen for their diversity in usability, aesthetics, and algorithms (and the amount I had used each previously in my own work). The bulk of this work was based on code from the pfstmo project.
Counter to my initial proposal, it was discovered that the creation of a dedicated image statistics operation (for the calculation of the global min, max, avg, etc) would only have benefited the reinhard05 operation. As this operation has by far the most efficient execution of the TMOs, the image statistics operation was deemed unnecessary at this point.
reinhard05
This is a resolution independent filter which does not take small scale variations across the image into account (that is, it is not a local operator) reinhard05.
It is one of the most intuitive algorithms, using few and readily understood parameters. It attempts to model some mechanisms of the human visual system, and tends to produce more 'natural' outputs than fattal02 or mantiuk06.
The image to the left can be re-created using the following GEGL XML:
The operator works by weighting the global image brightness against the brightness of each pixel. This mixing is controlled by the light adaptation parameter (light). Intuitively it evens out differences in brightness across an image.
The chromatic adaptation parameter (chromatic) specifies how highly to weight each channel's intensity against the pixel, or global, luminance. Intuitively it can be seen as evening out the intensity of each colour channel.
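The mixing controlled by light and chromatic, together with the final normalisation step, can be sketched in Python as follows. This is a simplified illustration based on the published form of the operator, not the GEGL implementation: it omits the paper's intensity and contrast terms, assumes Rec. 709 luminance weights, and treats the image as a flat list of (r, g, b) tuples. The function names are hypothetical.

```python
def luminance(r, g, b):
    # Rec. 709 weights (an assumption made for this sketch)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def reinhard05_sketch(pixels, chromatic, light):
    """Tone-map a list of linear (r, g, b) tuples into the range [0, 1]."""
    n = len(pixels)
    lum = [luminance(*p) for p in pixels]
    lum_avg = sum(lum) / n
    chan_avg = [sum(p[c] for p in pixels) / n for c in range(3)]

    mapped = []
    for p, l in zip(pixels, lum):
        out = []
        for c in range(3):
            # chromatic: channel intensity versus luminance
            local = chromatic * p[c] + (1.0 - chromatic) * l
            glob = chromatic * chan_avg[c] + (1.0 - chromatic) * lum_avg
            # light: per-pixel adaptation versus global adaptation
            adapt = light * local + (1.0 - light) * glob
            total = p[c] + adapt
            out.append(p[c] / total if total > 0.0 else 0.0)
        mapped.append(out)

    # Normalise the result to fill the displayable range
    lo = min(min(p) for p in mapped)
    hi = max(max(p) for p in mapped)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in p] for p in mapped]
```

With light set to 0 every pixel is compressed against the same global average, while higher values let each pixel's own brightness drive its adaptation level.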
The resulting pixels are then normalised to fill the range of displayable colour intensities.
Parameters
chromatic: The level of chromatic adaptation, coupling the scaling of all colour channels across the image.
light: The level of light adaptation. Higher levels emphasise local scaling over global scaling.
Examples
Note the colours here cool as chromatic adaptation (chromatic) increases, while the light across the image equalizes as light adaptation (light) increases.
fattal02
This is a resolution dependent operator outlined in fattal02; different results may be produced with an otherwise identical, though differently scaled, image. It is a local operator, considering each pixel and its neighbours individually rather than the image as a whole.
This operation works primarily on the luminance of an image. It attempts to attenuate the size of luminance changes (gradients) across the image, with greater emphasis on larger gradients.
The image to the left can be re-created using the following GEGL XML:
A significant amount of the operation's complexity comes from the use of multi-resolution (pyramid) edge detection, allowing small and large gradients to be manipulated. To avoid introducing halos, the gradient scaling is applied as the results are propagated back up the pyramid.
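A minimal Python sketch of the gradient scaling may help here. It uses the attenuation function from the fattal02 paper, in which gradients larger than alpha are shrunk and smaller ones are boosted when beta is below 1, and accumulates the factors from the coarsest level to the finest. The 1D signals, the nearest-neighbour up-sampling (the paper uses linear interpolation), and the function names are simplifying assumptions, not the GEGL code.

```python
def attenuation(grad_mag, alpha, beta):
    """Scale factor for one gradient magnitude.

    Equals 1 at grad_mag == alpha; with beta < 1 it is below 1 for
    larger gradients and above 1 for smaller ones.
    """
    if grad_mag <= 0.0:
        return 1.0
    return (alpha / grad_mag) * (grad_mag / alpha) ** beta

def combine_levels(factors):
    """Accumulate per-level factors, ordered finest to coarsest.

    Each coarser result is up-sampled (nearest neighbour, an assumption
    of this sketch) and multiplied into the next finer level.
    """
    acc = list(factors[-1])
    for finer in reversed(factors[:-1]):
        up = [acc[min(i // 2, len(acc) - 1)] for i in range(len(finer))]
        acc = [u * f for u, f in zip(up, finer)]
    return acc
```

Multiplying the factors level by level means a large edge detected at a coarse scale attenuates every fine-scale gradient that belongs to it, which is what keeps halos from appearing around it.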
The compressed luminance is then used to scale each channel's intensity for output, with saturation calculated purely on the individual input pixel.
Parameters
The threshold at which luminance changes are considered for enhancement.
A threshold beneath which detail enhancements are attenuated to avoid introducing noise. If left at the default value of 0, this will be set proportionally to alpha.
Examples
Note the gradients are less distinct with a low alpha, while the contrast is greater with a larger beta.
mantiuk06
The mantiuk06 algorithm is a local operator, and largely resolution independent. It produces results which emphasise the underlying detail of an image, and tends to give less saturation and brightness than reinhard05 or fattal02.
The image to the left can be re-created using the following GEGL XML:
As with fattal02, mantiuk06 operates almost exclusively on the luminance channel of an image. The operator works by converting an image into a Gaussian pyramid of contrasts.
Each level is converted to a form which predicts the response the human visual system will give to the contrast. This form can be processed and then an approximately inverse procedure is followed to get the final image.
By scaling these predicted responses of the visual system, we effectively enhance low contrast areas (which we are most sensitive to), and compress high contrast areas (which we are less sensitive to).
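The pyramid of contrasts can be pictured with a short Python sketch. It is a 1D simplification under stated assumptions: contrast is taken as the difference in log10 luminance between neighbouring samples, and each coarser level is produced by averaging pairs (a box filter standing in for the Gaussian). The conversion to visual response and the inverse step are omitted, and the function names are hypothetical.

```python
import math

def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]

def contrast_pyramid(luminance):
    """Return per-level lists of log-luminance differences, finest first."""
    level = [math.log10(max(v, 1e-6)) for v in luminance]
    pyramid = []
    while len(level) >= 2:
        # Contrast between each sample and its right-hand neighbour
        pyramid.append([level[i + 1] - level[i]
                        for i in range(len(level) - 1)])
        level = downsample(level)
    return pyramid
```

Fine levels capture small scale detail and coarse levels capture broad brightness changes, so rescaling them separately is what lets the operator enhance the former while compressing the latter.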
The paper outlining the algorithm gives a selection of additional uses (such as colour-to-gray conversion and a histogram equalisation variant of the rescaling algorithm) which may be interesting future projects.
Parameters
Examples
Image Matting
Image matting is concerned with the separation of foreground elements from background elements. Many techniques have been proposed, with some of the most broadly useful implementations incorporating user input in the form of a tri-map: specification of background, foreground, and uncertain regions.
This operator uses the levin06 algorithm, and was based upon the reference MATLAB implementation.
To select a region the user supplies a tri-map: an image which specifies known foreground, known background, and unknown elements of the image. In our implementation we receive a buffer with white as foreground, black as background, and transparent as unknown.
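The tri-map convention can be sketched in Python as below. Each tri-map pixel is assumed to be a (grey, alpha) pair of floats in [0, 1]; the 0.5 cutoff and the function names are hypothetical choices for illustration, not values taken from the operation.

```python
FOREGROUND = "foreground"
BACKGROUND = "background"
UNKNOWN = "unknown"

def classify_trimap_pixel(grey, alpha, cutoff=0.5):
    """Transparent means unknown; otherwise white is foreground, black background."""
    if alpha < cutoff:
        return UNKNOWN
    return FOREGROUND if grey >= cutoff else BACKGROUND

def trimap_constraints(trimap):
    """Return (known, value): which pixels are constrained, and to what alpha."""
    known = []
    value = []
    for grey, alpha in trimap:
        label = classify_trimap_pixel(grey, alpha)
        known.append(label != UNKNOWN)
        value.append(1.0 if label == FOREGROUND else 0.0)
    return known, value
```

Only the pixels flagged as unknown need to be solved for; the known pixels act as fixed constraints on the result.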
The optimal final alpha-matte using the above tri-map.
A sparse system of cost functions, involving alpha, is created over the window surrounding each unknown pixel; the cost function makes the assumption that the contributions of the foreground and background pixels are approximately constant within each window.
By minimizing this linear system we obtain the final alpha-matte. For practical reasons we use the UMFPACK solver and parts of CBLAS, which are readily available for most current platforms.
The image to the left can be re-created using the following GEGL XML:
To make the operation less computationally and memory intensive, the parameter 'levels' allows the user to specify the number of times the image is down-sampled (halved resolution) before the operation is carried out. The result is propagated back to the original resolution.
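As a rough guide to how much work each level saves, this Python sketch lists the image size at every down-sampled level. Rounding odd dimensions up is an assumption of the sketch, and the function name is hypothetical.

```python
def pyramid_sizes(width, height, levels):
    """Image size at the original resolution and after each halving."""
    sizes = [(width, height)]
    for _ in range(levels):
        # Halve each dimension, rounding up and never dropping below 1
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        sizes.append((width, height))
    return sizes
```

Each halving cuts the pixel count, and so roughly the number of unknowns, to about a quarter.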
The parameter 'active_levels' specifies how many of these down-sampled levels use the full solver, removing some of the errors which are introduced by up-sampling the lowest solution.
Notes
unknowns x radius. This can be reduced by:
blocky artefacts in the resultant alpha-matte.
Parameters
The log value of the error term; higher values create less defined alpha-mattes.
How greatly the tri-map foreground and background marks are weighted.
The number of times the input is sub-sampled, and the results up-sampled.
The number of sub-sampled levels which the full solver uses.
The value at which to consider a sub-sampled result pixel to be part of the new tri-map.
Examples
Subsampling Effects
Increasing levels and active_levels can decrease the required resources, but also decreases quality.
Note the decrease in matte details as active_levels increases, and the appearance of dark regions as levels increases.
Effect of weighting parameters
Varying epsilon and lambda can produce finer results.
Note the region around the head as lambda increases, and the definition of the hair tuft as epsilon decreases.
Future Work
the matting operation as part of their calculations. As the matting operator was more complex than anticipated, these had to be dropped from the schedule. These are currently in development, and will be made available upon completion.
some somewhat haphazard reference code. There are still areas of both which could be cleaned up for style and performance gains.
investigated. The current implementation, while very useful, could be made less computationally and memory intensive.
and matting-levin could allow other operations to more easily incorporate this optimisation.
Thanks
A big thanks to my mentor Martin, for his patience and feedback; and to everyone in #gegl, for making me feel welcome! Hopefully I can repay you all by refining the current work, and continuing in new areas.
References

I don't understand this, but it looks very interesting! Keep up the good work!
Cool I can't wait for these features to show up in gimp :D
Great, I have seen a lot of HDR software and they cost a lot. If you can add painterly effects, saturation, and effects of that kind, that would be cool. Keep up the good work, I cannot wait to see this integrated into GIMP.