
github.com/torch/image.git
author     nicholas-leonard <nick@nikopia.org>    2014-10-16 06:56:25 +0400
committer  nicholas-leonard <nick@nikopia.org>    2014-10-16 06:56:25 +0400
commit     5967a156a7d39b0d551716c46a66c01a7d787e5d (patch)
tree       f29d6ce3037d599169db0f65d03a1928b5ccf618 /README.md
parent     0994bad0134266d3d557eb2b5a2132932d82ddb6 (diff)
doc anchors and cleanup
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  124
1 file changed, 93 insertions, 31 deletions
diff --git a/README.md b/README.md
index 3c07fc0..6618cc5 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,22 @@
# image Package Reference Manual #
-Unless speficied otherwise, this package deals with images of size
-`nChannel x height x width`.
+__image__ is the [Torch7 distribution](http://torch.ch/) package for processing
+images. It contains a wide variety of functions divided into the following categories:
* [Saving and loading](#image.saveload) images as JPEG, PNG, PPM and PGM;
* [Simple transformations](#image.simpletrans) like translation, scaling and rotation;
* [Parameterized transformations](#image.paramtrans) like convolutions and warping;
* [Graphical user interfaces](#image.grapicalinter) like display and window;
* [Color Space Conversions](#image.colorspace) from and to RGB, YUV, Lab, and HSL;
- * [Constant Tensors](#image.constanttensor) like Lenna, Fabio and Gaussian and Laplacian kernels;
+ * [Tensor Constructors](#image.tensorconst) for creating the Lenna and Fabio images, and Gaussian and Laplacian kernels.
+
+Note that unless specified otherwise, this package deals with images of size
+`nChannel x height x width`.
<a name="image.saveload"/>
## Saving and Loading ##
+This section includes functions for saving and loading different types
+of images to and from disk.
+<a name="image.load"/>
### [res] image.load(filename, [depth, tensortype]) ###
Loads an image located at path `filename` having `depth` channels (1 or 3)
into a [Tensor](https://github.com/torch/torch7/blob/master/doc/tensor.md#tensor)
@@ -27,6 +33,7 @@ The returned `res` Tensor has size `nChannel x height x width` where `nChannel`
1 (greyscale) or 3 (usually [RGB](https://en.wikipedia.org/wiki/RGB_color_model)
or [YUV](https://en.wikipedia.org/wiki/YUV)).
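
For example, a minimal sketch (the file name is hypothetical; `'float'` is one of the
accepted `tensortype` values and yields pixel values scaled to `[0, 1]`):
```lua
require 'image'

-- load a 3-channel image as a FloatTensor with values in [0, 1]
local img = image.load('myimage.jpg', 3, 'float')
print(img:size())  -- nChannel x height x width, e.g. 3 x 480 x 640
```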
+<a name="image.save"/>
### image.save(filename, tensor) ###
Saves Tensor `tensor` to disk at path `filename`. The format to which
the image is saved is extrapolated from the `filename`'s extension suffix.
@@ -34,17 +41,22 @@ The `tensor` should be of size `nChannel x height x width`.
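
A short sketch using the built-in Lenna image (see [Tensor Constructors](#image.tensorconst) below):
```lua
require 'image'

local img = image.lena()          -- 3 x 512 x 512 Tensor
image.save('lena_copy.png', img)  -- PNG format inferred from the extension
```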
<a name="image.simpletrans"/>
## Simple Transformations ##
+This section includes simple but very common image transformations
+like cropping, translation, scaling and rotation.
+<a name="image.crop"/>
### [res] image.crop([dst,] src, x1, y1, [x2, y2]) ###
Crops image `src` at coordinate `(x1, y1)` up to coordinate
`(x2, y2)`. If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
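
For example, a small sketch cropping a `100 x 100` patch (assuming the crop spans
`x1 <= x < x2` and `y1 <= y < y2`):
```lua
require 'image'

local img = image.lena()                           -- 3 x 512 x 512
-- region with top-left corner (100, 150) and bottom-right corner (200, 250)
local patch = image.crop(img, 100, 150, 200, 250)  -- 3 x 100 x 100
```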
+<a name="image.translate"/>
### [res] image.translate([dst,] src, x, y) ###
Translates image `src` by `x` pixels horizontally and `y` pixels
vertically. If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.scale"/>
### [res] image.scale(src, width, height, [mode]) ###
Rescale the height and width of image `src` to have
width `width` and height `height`. Variable `mode` specifies
@@ -64,14 +76,17 @@ width of the output, respectively.
Rescale the height and width of image `src` to fit the dimensions of
Tensor `dst`.
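
For example, a quick sketch of both forms (the default interpolation mode is assumed):
```lua
require 'image'

local img = image.lena()                 -- 3 x 512 x 512
local thumb = image.scale(img, 128, 64)  -- width 128, height 64
print(thumb:size())                      -- 3 x 64 x 128

-- second form: rescale src to fit a preallocated dst
local dst = torch.Tensor(3, 256, 256)
image.scale(dst, img)
```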
+<a name="image.rotate"/>
### [res] image.rotate([dst,] src, theta) ###
Rotates image `src` by `theta` radians.
If `dst` is specified it is used to store the results of the rotation.
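
A short usage sketch:
```lua
require 'image'

local img = image.lena()
-- rotate by 45 degrees (pi/4 radians)
local rotated = image.rotate(img, math.pi / 4)
```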
+<a name="image.hflip"/>
### [res] image.hflip([dst,] src) ###
Flips image `src` horizontally (left<->right). If `dst` is provided, it is used to
store the output image. Otherwise, returns a new `res` Tensor.
+<a name="image.vflip"/>
### [res] image.vflip([dst,] src) ###
Flips image `src` vertically (up<->down). If `dst` is provided, it is used to
store the output image. Otherwise, returns a new `res` Tensor.
@@ -96,6 +111,7 @@ When `saturate=true`, the result of the compression is passed through
When provided, Tensor `tensorOut` is used to store results.
Note that arguments should be provided as key-value pairs (in a table).
+<a name="image.gaussianpyramid"/>
### [res] image.gaussianpyramid([dst,] src, scales) ###
Constructs a [Gaussian pyramid](https://en.wikipedia.org/wiki/Gaussian_pyramid)
of scales `scales` from a 2D or 3D `src` image or size
@@ -110,7 +126,11 @@ Internally, this function makes use of functions [image.gaussian](#image.gaussia
<a name="image.paramtrans"/>
## Parameterized transformations ##
+This section includes functions for performing transformations on
+images that require parameter Tensors, such as a warp `field` or a convolution
+`kernel`.
+<a name="image.warp"/>
### [res] image.warp([dst,]src,field,[mode,offset,clamp]) ###
Warps image `src` (of size `KxHxW`)
according to flow field `field`. The latter has size `2xHxW` where the
@@ -124,6 +144,7 @@ Permitted values are strings *clamp* (the default) or *pad*.
If `dst` is specified, it is used to store the result of the warp.
Otherwise, returns a new `res` Tensor.
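
As a minimal sketch, an all-zero flow field (assuming the default behaviour, where
`field` holds per-pixel offsets) reproduces the input image:
```lua
require 'image'

local img = image.lena()               -- 3 x 512 x 512 (K x H x W)
local h, w = img:size(2), img:size(3)
local field = torch.zeros(2, h, w)     -- zero flow: every pixel maps to itself
local warped = image.warp(img, field)  -- same as img, up to interpolation
```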
+<a name="image.convolve"/>
### [res] image.convolve([dst,] src, kernel, [mode]) ###
Convolves Tensor `kernel` over image `src`. Valid string values for argument
`mode` are:
@@ -135,24 +156,7 @@ Note that this function internally uses
If `dst` is provided, it is used to store the output image.
Otherwise, returns a new `res` Tensor.
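
For example, a simple Gaussian blur (assuming `'same'` is one of the accepted `mode`
strings, so that the output keeps the input size):
```lua
require 'image'

local img = image.lena()
local kernel = image.gaussian(5)                     -- 5x5 Gaussian kernel (see below)
local blurred = image.convolve(img, kernel, 'same')  -- same spatial size as img
```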
-<a name="image.toDisplayTensor"/>
-### [res] image.toDisplayTensor(input, padding, nrow, scaleeach, min, max, symmetric, saturate) ###
-Returns a single `res` Tensor that contains a grid of all in the images in `input`.
-The latter can either be a table of image Tensors of size `height x width` (greyscale) or
-`nChannel x height x width` (color),
-or a single Tensor of size `batchSize x nChannel x height x width` or `nChannel x height x width`
-where `nChannel=[3,1]`, `batchSize x height x width` or `height x width`.
-
-Unless `input` is a table and `scaleeach=false` (the default), all detected images
-are compressed with successive calls to [image.minmax](#image.minmax):
-```lua
-image.minmax{tensor=input[i], min=min, max=max, symm=symmetric, saturate=saturate}
-```
-`padding` specifies the number of padding pixels between images. The default is 0.
-`nrow` specifies the number of images per row. The default is 6.
-
-Note that arguments can also be specified as key-value arguments (in a table).
-
+<a name="image.lcn"/>
### [res] image.lcn(src, [kernel]) ###
Local contrast normalization (LCN) on a given `src` image using kernel `kernel`.
If `kernel` is not given, then a default `9x9` Gaussian is used
@@ -162,10 +166,19 @@ To prevent border effects, the image is first global contrast normalized
(GCN) by subtracting the global mean and dividing by the global
standard deviation.
+Then the image is locally contrast normalized using the following equation:
```lua
res = (src - lm(src)) / sqrt( lm(src) - lm(src*src) )
```
+where `lm(x)` is the local mean of each pixel in the image (i.e.
+`image.convolve(x,kernel)`) and `sqrt(x)` is the element-wise
+square root of `x`. In other words, LCN performs
+local subtractive and divisive normalization.
+Note that this implementation is different from the LCN Layer defined on page 3 of
+[What is the Best Multi-Stage Architecture for Object Recognition?](http://yann.lecun.com/exdb/publis/pdf/jarrett-iccv-09.pdf).
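
A usage sketch, assuming `lcn` expects a single-channel (2D) input:
```lua
require 'image'

local img = image.lena()
local gray = image.rgb2y(img):squeeze()  -- 512 x 512 luminance channel
local normalized = image.lcn(gray)       -- uses the default 9x9 Gaussian kernel
```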
+
+<a name="image.erode"/>
### [res] image.erode(src, [kernel, pad]) ###
Performs a [morphological erosion](https://en.wikipedia.org/wiki/Erosion_(morphology))
on binary (zeros and ones) image `src` using odd
@@ -174,6 +187,7 @@ The default is a kernel consisting of ones of size `3x3`. Number
`pad` is the value to assume outside the image boundary when performing
the convolution. The default is 1.
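
A small sketch on a toy binary image:
```lua
require 'image'

-- a 9x9 binary image with a 5x5 square of ones in the middle
local img = torch.zeros(9, 9)
img[{{3, 7}, {3, 7}}] = 1
-- with the default 3x3 kernel of ones, the square shrinks by one pixel per side
local eroded = image.erode(img)
```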
+<a name="image.dilate"/>
### [res] image.dilate(src, [kernel, pad]) ###
Performs a [morphological dilation](https://en.wikipedia.org/wiki/Dilation_(morphology))
on binary (zeros and ones) image `src` using odd
@@ -184,9 +198,34 @@ the convolution. The default is 0.
<a name="image.grapicalinter"/>
## Graphical User Interfaces ##
-The following functions require package [qtlua](https://github.com/torch/qtlua).
+The following functions, except for [image.toDisplayTensor](#image.toDisplayTensor),
+require package [qtlua](https://github.com/torch/qtlua) and can only be
+accessed via the `qlua` Lua interpreter (as opposed to the
+[th](https://github.com/torch/trepl) or luajit interpreter).
+
+<a name="image.toDisplayTensor"/>
+### [res] image.toDisplayTensor(input, [...]) ###
+Optional arguments `[...]` expand to `padding`, `nrow`, `scaleeach`, `min`, `max`, `symmetric`, `saturate`.
+Returns a single `res` Tensor that contains a grid of all the images in `input`.
+The latter can either be a table of image Tensors of size `height x width` (greyscale) or
+`nChannel x height x width` (color), or a single Tensor of size
+`batchSize x nChannel x height x width` or `nChannel x height x width` (where `nChannel` is 3 or 1),
+`batchSize x height x width`, or `height x width`.
+
+Unless `input` is a table and `scaleeach=false` (the default), all detected images
+are compressed with successive calls to [image.minmax](#image.minmax):
+```lua
+image.minmax{tensor=input[i], min=min, max=max, symm=symmetric, saturate=saturate}
+```
+`padding` specifies the number of padding pixels between images. The default is 0.
+`nrow` specifies the number of images per row. The default is 6.
+
+Note that arguments can also be specified as key-value arguments (in a table).
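
For example, a sketch laying out two copies of the Lenna image side by side
(the `input` key name mirrors the positional argument above):
```lua
require 'image'

local imgs = {image.lena(), image.lena()}  -- a table of 3 x 512 x 512 images
-- a single Tensor with both images in one row, 2 padding pixels apart
local grid = image.toDisplayTensor{input=imgs, padding=2, nrow=2}
```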
-### [res] image.display(input, zoom, min, max, legend, w, ox, oy, scaleeach, gui, offscreen, padding, symm, nrow) ###
+<a name="image.display"/>
+### [res] image.display(input, [...]) ###
+Optional arguments `[...]` expand to `zoom`, `min`, `max`, `legend`, `win`,
+`x`, `y`, `scaleeach`, `gui`, `offscreen`, `padding`, `symm`, `nrow`.
Displays `input` image(s) with optional saturation and zooming.
The `input`, which is either a Tensor of size `HxW`, `KxHxW` or `Kx3xHxW`, or list,
is first prepared for display by passing it through [image.toDisplayTensor](#image.toDisplayTensor):
@@ -200,67 +239,86 @@ The resulting `input` will be displayed using [qtlua](https://github.com/torch/q
The displayed image will be zoomed by a factor of `zoom`. The default is 1.
If `gui=true` (the default), the graphical user interface (GUI)
is an interactive window that provides the user with the ability to zoom in or out.
-This can be turned off for a faster display.
+This can be turned off for a faster display. `legend` is a legend to be displayed,
+which has a default value of `image.display`. `win` is an optional qt window descriptor.
+If `x` and `y` are given, they are used to offset the image. Both default to 0.
+When `offscreen=true`, rendering (to generate images) is performed offscreen.
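
A minimal sketch (remember that this must be run with `qlua`, not `th`):
```lua
-- run with: qlua example.lua
require 'image'

local img = image.lena()
image.display(img, 2)  -- display Lenna zoomed in by a factor of 2
```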
-
-### [window, painter] image.window(resize, mousepress, mousedoublepress) ###
-Creates a window context for images.
+<a name="image.window"/>
+### [window, painter] image.window([...]) ###
+Creates a window context for images.
+Optional arguments `[...]` expand to `hook_resize`, `hook_mousepress`, `hook_mousedoublepress`.
+These have a default value of `nil`, but may be set to the corresponding qt callback objects.
<a name="image.colorspace"/>
## Color Space Conversions ##
+This section includes functions for performing conversions between
+different color spaces.
+<a name="image.rgb2lab"/>
### [res] image.rgb2lab([dst,] src) ###
Converts a `src` RGB image to [Lab](https://en.wikipedia.org/wiki/Lab_color_space).
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.rgb2yuv"/>
### [res] image.rgb2yuv([dst,] src) ###
Converts an RGB image to YUV. If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.yuv2rgb"/>
### [res] image.yuv2rgb([dst,] src) ###
Converts a YUV image to RGB. If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
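
For example, a quick round-trip sketch:
```lua
require 'image'

local rgb = image.lena()
local yuv = image.rgb2yuv(rgb)
local back = image.yuv2rgb(yuv)
print((back - rgb):abs():max())  -- close to zero, up to floating-point error
```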
+<a name="image.rgb2y"/>
### [res] image.rgb2y([dst,] src) ###
Converts an RGB image to Y (discarding U and V).
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.rgb2hsl"/>
### [res] image.rgb2hsl([dst,] src) ###
Converts an RGB image to [HSL](https://en.wikipedia.org/wiki/HSL_and_HSV).
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.hsl2rgb"/>
### [res] image.hsl2rgb([dst,] src) ###
Converts an HSL image to RGB.
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.rgb2hsv"/>
### [res] image.rgb2hsv([dst,] src) ###
Converts an RGB image to [HSV](https://en.wikipedia.org/wiki/HSL_and_HSV).
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.hsv2rgb"/>
### [res] image.hsv2rgb([dst,] src) ###
Converts an HSV image to RGB.
If `dst` is provided, it is used to store the output
image. Otherwise, returns a new `res` Tensor.
+<a name="image.rgb2nrgb"/>
### [res] image.rgb2nrgb([dst,] src) ###
Converts an RGB image to normalized-RGB.
-
-## Constant Tensors ##
-The following functions construct Tensor constants like Gaussian or
+<a name="image.tensorconst"/>
+## Tensor Constructors ##
+The following functions construct Tensors like Gaussian or
Laplacian kernels, or images like Lenna and Fabio.
+<a name="image.lena"/>
### [res] image.lena() ###
Returns the classic `Lenna.jpg` image as a `3 x 512 x 512` Tensor.
+<a name="image.fabio"/>
### [res] image.fabio() ###
Returns the `fabio.jpg` image as a `257 x 271` Tensor.
+<a name="image.gaussian"/>
### [res] image.gaussian([size, sigma, amplitude, normalize, [...]]) ###
Returns a 2D [Gaussian](https://en.wikipedia.org/wiki/Gaussian_function)
kernel of size `height x width`. When used as a Gaussian smoothing operator in a 2D
@@ -285,6 +343,7 @@ of it.
Note that arguments can also be specified as key-value arguments (in a table).
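
For example, a sketch using the key-value form to build a normalized smoothing kernel
(the key names are assumed to mirror the positional arguments above):
```lua
require 'image'

-- 7x7 kernel with a standard deviation of a quarter of its size, summing to 1
local k = image.gaussian{size=7, sigma=0.25, normalize=true}
local blurred = image.convolve(image.lena(), k, 'same')
```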
+<a name="image.gaussian1D"/>
### [res] image.gaussian1D([size, sigma, amplitude, normalize, mean]) ###
Returns a 1D Gaussian kernel of size `size`, mean `mean` and standard
deviation `sigma`.
@@ -299,6 +358,7 @@ while a standard deviation of 0.25 is a quarter of it.
Note that arguments can also be specified as key-value arguments (in a table).
+<a name="image.laplacian"/>
### [res] image.laplacian([size, sigma, amplitude, normalize, [...]]) ###
Returns a 2D [Laplacian](https://en.wikipedia.org/wiki/Blob_detection#The_Laplacian_of_Gaussian)
kernel of size `height x width`.
@@ -322,18 +382,20 @@ where the top-left corner is the origin. In other words, a mean of 0.5 is
the center of the kernel size, while a standard deviation of 0.25 is a quarter
of it.
+<a name="image.colormap"/>
### [res] image.colormap(nColor) ###
Creates an optimally-spaced RGB color mapping of `nColor` colors.
Note that the mapping is obtained by generating the colors around
the HSV wheel, varying the Hue component.
The returned `res` Tensor has size `nColor x 3`.
+<a name="image.jetColormap"/>
### [res] image.jetColormap(nColor) ###
Creates a jet (blue to red) RGB color mapping of `nColor` colors.
The returned `res` Tensor has size `nColor x 3`.
## Dependencies:
-Torch7 (www.torch.ch)
+[Torch7](http://www.torch.ch)
## Install:
```