doc anchors and cleanup

author: nicholas-leonard <nick@nikopia.org> 2014-10-16 06:56:25 +0400
committer: nicholas-leonard <nick@nikopia.org> 2014-10-16 06:56:25 +0400
commit: 5967a156a7d39b0d551716c46a66c01a7d787e5d (patch)
tree: f29d6ce3037d599169db0f65d03a1928b5ccf618 /README.md
parent: 0994bad0134266d3d557eb2b5a2132932d82ddb6 (diff)
1 files changed, 93 insertions, 31 deletions
diff --git a/README.md b/README.md
index 3c07fc0..6618cc5 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,22 @@
 # image Package Reference Manual #
-Unless speficied otherwise, this package deals with images of size 
-`nChannel x height x width`.
+__image__ is the [Torch7 distribution](http://torch.ch/) package for processing 
+images. It contains a wide variety of functions divided into the following categories:
  * [Saving and loading](#image.saveload) images as JPEG, PNG, PPM and PGM;
  * [Simple transformations](#image.simpletrans) like translation, scaling and rotation;
  * [Parameterized transformations](#image.paramtrans) like convolutions and warping;
  * [Graphical user interfaces](#image.grapicalinter) like display and window;
  * [Color Space Conversions](#image.colorspace) from and to RGB, YUV, Lab, and HSL;
- * [Constant Tensors](#image.constanttensor) like Lenna, Fabio and Gaussian and Laplacian kernels;
+ * [Tensor Constructors](#image.tensorconst) for creating Lenna, Fabio and Gaussian and Laplacian kernels;
+
+Note that unless specified otherwise, this package deals with images of size 
+`nChannel x height x width`.
 
 <a name="image.saveload"/>
 ## Saving and Loading ##
+This section includes functions for saving and loading different types 
+of images to and from disk.
 
+<a name="image.load"/>
 ### [res] image.load(filename, [depth, tensortype]) ###
 Loads an image located at path `filename` having `depth` channels (1 or 3)
 into a [Tensor](https://github.com/torch/torch7/blob/master/doc/tensor.md#tensor)
@@ -27,6 +33,7 @@ The returned `res` Tensor has size `nChannel x height x width` where `nChannel`
 1 (greyscale) or 3 (usually [RGB](https://en.wikipedia.org/wiki/RGB_color_model) 
 or [YUV](https://en.wikipedia.org/wiki/YUV).
 
+<a name="image.save"/>
 ### image.save(filename, tensor) ###
 Saves Tensor `tensor` to disk at path `filename`. The format to which 
 the image is saved is extrapolated from the `filename`'s extension suffix.
@@ -34,17 +41,22 @@ The `tensor` should be of size `nChannel x height x width`.
 
 <a name="image.simpletrans"/>
 ## Simple Transformations ##
+This section includes simple but very common image transformations 
+like cropping, translation, scaling and rotation. 
 
+<a name="image.crop"/>
 ### [res] image.crop([dst,] src, x1, y1, [x2, y2]) ###
 Crops image `src` at coordinate `(x1, y1)` up to coordinate 
 `(x2, y2)`. If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.translate"/>
 ### [res] image.translate([dst,] src, x, y) ###
 Translates image `src` by `x` pixels horizontally and `y` pixels 
 vertically. If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.scale"/>
 ### [res] image.scale(src, width, height, [mode]) ###
 Rescale the height and width of image `src` to have 
 width `width` and height `height`.  Variable `mode` specifies 
@@ -64,14 +76,17 @@ width of the output, respectively.
 Rescale the height and width of image `src` to fit the dimensions of 
 Tensor `dst`. 
 
+<a name="image.rotate"/>
 ### [res] image.rotate([dst,], src, theta) ###
 Rotates image `src` by `theta` radians. 
 If `dst` is specified it is used to store the results of the rotation.
 
+<a name="image.hflip"/>
 ### [res] image.hflip([dst,] src) ###
 Flips image `src` horizontally (left<->right). If `dst` is provided, it is used to
 store the output image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.vflip"/>
 ### [res] image.vflip([dst,], src) ###
 Flips image `src` vertically (upsize<->down). If `dst` is provided, it is used to
 store the output image. Otherwise, returns a new `res` Tensor.
@@ -96,6 +111,7 @@ When `saturate=true`, the result of the compression is passed through
 When provided, Tensor `tensorOut` is used to store results. 
 Note that arguments should be provided as key-value pairs (in a table).
 
+<a name="image.gaussianpyramid"/>
 ### [res] image.gaussianpyramid([dst,] src, scales) ###
 Constructs a [Gaussian pyramid](https://en.wikipedia.org/wiki/Gaussian_pyramid)
 of scales `scales` from a 2D or 3D `src` image or size 
@@ -110,7 +126,11 @@ Internally, this function makes use of functions [image.gaussian](#image.gaussia
 
 <a name="image.paramtrans"/>
 ## Parameterized transformations ##
+This section includes functions for performing transformations on 
+images requiring parameter Tensors like a warp `field` or a convolution
+`kernel`.
 
+<a name="image.warp"/>
 ### [res] image.warp([dst,]src,field,[mode,offset,clamp]) ###
 Warps image `src` (of size`KxHxW`) 
 according to flow field `field`. The latter has size `2xHxW` where the 
@@ -124,6 +144,7 @@ Permitted values are strings *clamp* (the default) or *pad*.
 If `dst` is specified, it is used to store the result of the warp.
 Otherwise, returns a new `res` Tensor.
 
+<a name="image.convolve"/>
 ### [res] image.convolve([dst,] src, kernel, [mode]) ###
 Convolves Tensor `kernel` over image `src`. Valid string values for argument 
 `mode` are :
@@ -135,24 +156,7 @@ Note that this function internally uses
 If `dst` is provided, it is used to store the output image. 
 Otherwise, returns a new `res` Tensor.
 
-<a name="image.toDisplayTensor"/>
-### [res] image.toDisplayTensor(input, padding, nrow, scaleeach, min, max, symmetric, saturate) ###
-Returns a single `res` Tensor that contains a grid of all in the images in `input`.
-The latter can either be a table of image Tensors of size `height x width` (greyscale) or 
-`nChannel x height x width` (color), 
-or a single Tensor of size `batchSize x nChannel x height x width` or `nChannel x height x width` 
-where `nChannel=[3,1]`, `batchSize x height x width` or `height x width`.
-
-Unless `input` is a table and `scaleeach=false` (the default), all detected images 
-are compressed with successive calls to [image.minmax](#image.minmax):
-```lua
-image.minmax{tensor=input[i], min=min, max=max, symm=symmetric, saturate=saturate}
-```
-`padding` specifies the number of padding pixels between images. The default is 0.
-`nrow` specifies the number of images per row. The default is 6.
-
-Note that arguments can also be specified as key-value arguments (in a table).
-
+<a name="image.lcn"/>
 ### [res] image.lcn(src, [kernel]) ###
 Local contrast normalization (LCN) on a given `src` image using kernel `kernel`.
 If `kernel` is not given, then a default `9x9` Gaussian is used 
@@ -162,10 +166,19 @@ To prevent border effects, the image is first global contrast normalized
 (GCN) by substracting the global mean and dividing by the global 
 standard deviation.
 
+Then the image is locally contrast normalized using the following equation:
 ```lua
 res = (src - lm(src)) / sqrt( lm(src) - lm(src*src) )
 ```
+where `lm(x)` is the local mean of each pixel in the image (i.e. 
+`image.convolve(x,kernel)`) and  `sqrt(x)` is the element-wise 
+square root of `x`. In other words, LCN performs 
+local subtractive and divisive normalization. 
 
+Note that this implementation differs from the LCN layer defined on page 3 of 
+[What is the Best Multi-Stage Architecture for Object Recognition?](http://yann.lecun.com/exdb/publis/pdf/jarrett-iccv-09.pdf).
+
+<a name="image.erode"/>
 ### [res] image.erode(src, [kernel, pad]) ###
 Performs a [morphological erosion](https://en.wikipedia.org/wiki/Erosion_(morphology)) 
 on binary (zeros and ones) image `src` using odd 
@@ -174,6 +187,7 @@ The default is a kernel consisting of ones of size `3x3`. Number
 `pad` is the value to assume outside the image boundary when performing 
 the convolution. The default is 1.
 
+<a name="image.dilate"/>
 ### [res] image.dilate(src, [kernel, pad]) ###
 Performs a [morphological dilation](https://en.wikipedia.org/wiki/Dilation_(morphology)) 
 on binary (zeros and ones) image `src` using odd 
@@ -184,9 +198,34 @@ the convolution. The default is 0.
 
 <a name="image.grapicalinter"/>
 ## Graphical User Interfaces ##
-The following functions require package [qtlua](https://github.com/torch/qtlua).
+The following functions, except for [image.toDisplayTensor](#image.toDisplayTensor), 
+require package [qtlua](https://github.com/torch/qtlua) and can only be 
+accessed via the `qlua` Lua interpreter (as opposed to the 
+[th](https://github.com/torch/trepl) or luajit interpreter).
+
+<a name="image.toDisplayTensor"/>
+### [res] image.toDisplayTensor(input, [...]) ###
+Optional arguments `[...]` expand to `padding`, `nrow`, `scaleeach`, `min`, `max`, `symmetric`, `saturate`.
+Returns a single `res` Tensor that contains a grid of all the images in `input`.
+The latter can either be a table of image Tensors of size `height x width` (greyscale) or 
+`nChannel x height x width` (color), 
+or a single Tensor of size `batchSize x nChannel x height x width` or `nChannel x height x width` 
+where `nChannel=[3,1]`, `batchSize x height x width` or `height x width`.
+
+Unless `input` is a table and `scaleeach=false` (the default), all detected images 
+are compressed with successive calls to [image.minmax](#image.minmax):
+```lua
+image.minmax{tensor=input[i], min=min, max=max, symm=symmetric, saturate=saturate}
+```
+`padding` specifies the number of padding pixels between images. The default is 0.
+`nrow` specifies the number of images per row. The default is 6.
+
+Note that arguments can also be specified as key-value arguments (in a table).
 
-### [res] image.display(input, zoom, min, max, legend, w, ox, oy, scaleeach, gui, offscreen, padding, symm, nrow) ###
+<a name="image.display"/>
+### [res] image.display(input, [...]) ###
+Optional arguments `[...]` expand to `zoom`, `min`, `max`, `legend`, `win`, 
+`x`, `y`, `scaleeach`, `gui`, `offscreen`, `padding`, `symm`, `nrow`.
 Displays `input` image(s) with optional saturation and zooming. 
 The `input`, which is either a Tensor of size `HxW`, `KxHxW` or `Kx3xHxW`, or list,
 is first prepared for display by passing it through [image.toDisplayTensor](#image.toDisplayTensor):
@@ -200,67 +239,86 @@ The resulting `input` will be displayed using [qtlua](https://github.com/torch/q
 The displayed image will be zoomed by a factor of `zoom`. The default is 1.
 If `gui=true` (the default), the graphical user inteface (GUI) 
 is an interactive window that provides the user with the ability to zoom in or out. 
-This can be turned off for a faster display.
+This can be turned off for a faster display. `legend` is a legend to be displayed,
+which has a default value of `image.display`. `win` is an optional qt window descriptor.
+If `x` and `y` are given, they are used to offset the image. Both default to 0.
+When `offscreen=true`, rendering (to generate images) is performed offscreen.
 
-
-### [window, painter] image.window(resize, mousepress, mousedoublepress) ###
-Creates a window context for images.
+<a name="image.window"/>
+### [window, painter] image.window([...]) ###
+Creates a window context for images. 
+Optional arguments `[...]` expand to `hook_resize`, `hook_mousepress`, `hook_mousedoublepress`.
+These have a default value of `nil`, but may correspond to commensurate qt objects.
 
 <a name="image.colorspace"/>
 ## Color Space Conversions ##
+This section includes functions for performing conversions between 
+different color spaces.
 
+<a name="image.rgb2lab"/>
 ### [res] image.rgb2lab([dst,] src) ###
 Converts a `src` RGB image to [Lab](https://en.wikipedia.org/wiki/Lab_color_space). 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.rgb2yuv"/>
 ### [res] image.rgb2yuv([dst,] src) ###
 Converts a RGB image to YUV. If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.yuv2rgb"/>
 ### [res] image.yuv2rgb([dst,] src) ###
 Converts a YUV image to RGB. If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.rgb2y"/>
 ### [res] image.rgb2y([dst,] src) ###
 Converts a RGB image to Y (discard U and V). 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.rgb2hsl"/>
 ### [res] image.rgb2hsl([dst,] src) ###
 Converts a RGB image to [HSL](https://en.wikipedia.org/wiki/HSL_and_HSV). 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.hsl2rgb"/>
 ### [res] image.hsl2rgb([dst,] src) ###
 Converts a HSL image to RGB. 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.rgb2hsv"/>
 ### [res] image.rgb2hsv([dst,] src) ###
 Converts a RGB image to [HSV](https://en.wikipedia.org/wiki/HSL_and_HSV). 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.hsv2rgb"/>
 ### [res] image.hsv2rgb([dst,] src) ###
 Converts a HSV image to RGB. 
 If `dst` is provided, it is used to store the output
 image. Otherwise, returns a new `res` Tensor.
 
+<a name="image.rgb2nrgb"/>
 ### [res] image.rgb2nrgb([dst,] src) ###
 Converts an RGB image to normalized-RGB. 
 
-
-## Constant Tensors ##
-The following functions construct Tensor constants like Gaussian or 
+<a name="image.tensorconst"/>
+## Tensor Constructors ##
+The following functions construct Tensors like Gaussian or 
 Laplacian kernels, or images like Lenna and Fabio.
 
+<a name="image.lena"/>
 ### [res] image.lena() ###
 Returns the classic `Lenna.jpg` image as a `3 x 512 x 512` Tensor.
 
+<a name="image.fabio"/>
 ### [res] image.fabio() ###
 Returns the `fabio.jpg` image as a `257 x 271` Tensor.
 
+<a name="image.gaussian"/>
 ### [res] image.gaussian([size, sigma, amplitude, normalize, [...]]) ###
 Returns a 2D [Gaussian](https://en.wikipedia.org/wiki/Gaussian_function) 
 kernel of size `height x width`. When used as a Gaussian smoothing operator in a 2D 
@@ -285,6 +343,7 @@ of it.
 
 Note that arguments can also be specified as key-value arguments (in a table).
 
+<a name="image.gaussian1D"/>
 ### [res] image.gaussian1D([size, sigma, amplitude, normalize, mean]) ###
 Returns a 1D Gaussian kernel of size `size`, mean `mean` and standard 
 deviation `sigma`. 
@@ -299,6 +358,7 @@ while a standard deviation of 0.25 is a quarter of it.
 
 Note that arguments can also be specified as key-value arguments (in a table).
 
+<a name="image.laplacian"/>
 ### [res] image.laplacian([size, sigma, amplitude, normalize, [...]]) ###
 Returns a 2D [Laplacian](https://en.wikipedia.org/wiki/Blob_detection#The_Laplacian_of_Gaussian) 
 kernel of size `height x width`. 
@@ -322,18 +382,20 @@ where the top-left corner is the origin. In other works, a mean of 0.5 is
 the center of the kernel size, while a standard deviation of 0.25 is a quarter
 of it.
 
+<a name="image.colormap"/>
 ### [res] image.colormap(nColor) ###
 Creates an optimally-spaced RGB color mapping of `nColor` colors. 
 Note that the mapping is obtained by generating the colors around 
 the HSV wheel, varying the Hue component.
 The returned `res` Tensor has size `nColor x 3`. 
 
+<a name="image.jetColormap"/>
 ### [res] image.jetColormap(nColor) ###
 Creates a jet (blue to red) RGB color mapping of `nColor` colors.
 The returned `res` Tensor has size `nColor x 3`. 
 
 ## Dependencies:
-Torch7 (www.torch.ch)
+[Torch7](http://torch.ch/)
 
 ## Install:
 ```